Playing with GitHub Data: The Start (Part 2)

Following up on the questions raised in “Playing with GitHub Data: The Start (Part 1)”, I decided to ask the Man himself: Ilya Grigorik, the man behind GitHub Archive.

Surely he would be able to explain the differences observed between the counts returned by GitHub search and those returned by GitHub Archive queries.

[Image: github_total_comparison_projects]
[Image: github_total_comparison_repos]

I reached out to Ilya.

[Image: 20140317_github_ilyagrigorik_email1_v2]

He responded.

[Image: 20140317_github_ilyagrigorik_email2_v2]

So let it be known. Keep it in mind when playing with GitHub Archive via BigQuery.

 
