Set up Google BigQuery

[Image: Google BigQuery interface]

BigQuery makes it simple and fast to query data from your Google Parquet data lake via SQL. It can, for example, be used in Grafana-BigQuery dashboards or Python scripts.

In this section we explain how you can set up BigQuery.


Prerequisites

  1. Set up Google Parquet data lake [~10 min]

Note

The above step is required before proceeding


1: Deploy BigQuery and mapping function

  1. Upload the zip below to your input bucket root via the console (storage overview), or script the upload as sketched after this list
  2. Open the canedge-google-cloud-terraform repository
  3. Go through the ‘setup instructions’ to open your Cloud Shell and clone the repository
  4. Go through step 3 (BigQuery), referencing the name of the uploaded mapping function zip

Mapping function zip | changelog
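
If you prefer to script the upload rather than use the console, below is a minimal sketch using the google-cloud-storage Python client. The bucket name and zip filename are placeholders; replace them with your own input bucket and the zip you downloaded above.

```python
from google.cloud import storage

# Placeholders: replace with your input bucket name and the downloaded zip name
INPUT_BUCKET = "my-canedge-input-bucket"
ZIP_NAME = "bigquery-map-tables.zip"

# Upload the mapping function zip to the root of the input bucket
client = storage.Client()
blob = client.bucket(INPUT_BUCKET).blob(ZIP_NAME)
blob.upload_from_filename(ZIP_NAME)  # local path to the zip
print(f"Uploaded gs://{INPUT_BUCKET}/{ZIP_NAME}")
```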


2: Map your Parquet data lake to tables

  1. Verify that your output bucket contains Parquet files[1]
  2. Open your <id>-bq-map-tables function via the console (function overview)
  3. Click ‘Test’ (at the top), copy the ‘CLI test command’ and click ‘Test in cloud shell’
  4. Paste the command and run it, then verify that the script succeeds

Note

The mapping script adds metadata about your output bucket. If new devices/messages are added to your Parquet data lake, the script should be run again (manually or on a schedule)[2]
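
To check which tables the mapping script has created (e.g. before deciding whether to re-run it), you can list the tables in the mapped dataset. Below is a minimal sketch using the google-cloud-bigquery Python client; the project and dataset names are placeholders, as the actual names depend on your deployment.

```python
from google.cloud import bigquery

# Placeholders: replace with your project ID and your mapped dataset
PROJECT = "my-gcp-project"
DATASET = "my_canedge_dataset"

# List the tables currently mapped in the dataset
client = bigquery.Client(project=PROJECT)
for table in client.list_tables(DATASET):
    print(table.table_id)
```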

You are now ready to use BigQuery as a data source, e.g. in Grafana-BigQuery dashboards.
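
As a quick end-to-end test, the sketch below runs a SQL query against one of the mapped tables via the google-cloud-bigquery Python client. The project, dataset and table names are hypothetical; substitute a table listed by the mapping script.

```python
from google.cloud import bigquery

# Placeholders: replace with your project ID and one of your mapped tables
PROJECT = "my-gcp-project"
TABLE = "my_canedge_dataset.my_mapped_table"

# Fetch the first rows of the table via SQL
client = bigquery.Client(project=PROJECT)
sql = f"SELECT * FROM `{TABLE}` LIMIT 5"
for row in client.query(sql).result():
    print(dict(row))
```

The same SQL can serve as the query behind a panel in a Grafana-BigQuery dashboard.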


[1] If your output bucket is empty, you can upload a test MDF file to your input bucket to create some Parquet data
[2] You only need to re-run the script if new tables are to be created, not if you simply add more data to an existing table