BigQuery query sample using Github Data Challenge's data

Since BigQuery is now available to the public, here's a sample query for analysing the Github Data Challenge data. It is only a lightly edited version of a sample query from Google's official BigQuery page. I am not 100% certain that it counts the languages correctly, so feel free to comment if it is wrong.

Make sure you load the data first: go to Switch to Project->Display Project and enter "githubarchive".

The query below counts how many rows in the timeline have a repository language that matches "Perl":

SELECT repository_language, COUNT(repository_language) AS count
  FROM githubarchive:github.timeline
  WHERE REGEXP_MATCH(repository_language, r'Perl')
  GROUP BY repository_language

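Now that BigQuery has been opened to the public, I put together a simple sample query against the Github Data Challenge data. I hope it is a useful reference. (All I did was slightly modify a sample query shown on the official BigQuery site.)

First, go to Switch to Project->Display Project and type "githubarchive" to load the Github data.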


The same query, matching "Go" instead:

SELECT repository_language, COUNT(repository_language) AS count
  FROM githubarchive:github.timeline
  WHERE REGEXP_MATCH(repository_language, r'Go')
  GROUP BY repository_language


Result:

Row  repository_language  count
1    Go                   15179
2    Gosu                 109
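
The Go result includes Gosu because REGEXP_MATCH looks for the pattern anywhere in the string, so r'Go' matches any language name that contains "Go". If only the exact language name is wanted, one option is to anchor the pattern with ^ and $. The query below is a minimal sketch of that idea. The anchored pattern is my own assumption, not part of the original sample, and I have not run it against the dataset.

```sql
-- Sketch only: anchored variant of the Go query, not run against the dataset.
-- ^ and $ require the whole language name to be exactly "Go",
-- so names that merely contain it (such as "Gosu") should be excluded.
SELECT repository_language, COUNT(repository_language) AS count
  FROM githubarchive:github.timeline
  WHERE REGEXP_MATCH(repository_language, r'^Go$')
  GROUP BY repository_language
```

A plain equality test (repository_language = 'Go') should behave the same way for a single language; the regular expression is mainly useful when you want to match several names at once.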
