Set up Azure Synapse

Azure Synapse interface

Synapse makes it simple and fast to query data from your Azure Parquet data lake via SQL. It can e.g. be used in Grafana-Synapse dashboards, PowerBI-Synapse dashboards or Python scripts.

In this section we explain how you can set up your Synapse Serverless SQL Pool.


Prerequisites

  1. Set up Azure Parquet data lake [~10 min]

Note

The above steps are required before proceeding


Deploy Synapse

  1. In the Azure console go to Azure Synapse Analytics
  2. Create a new workspace (same resource group and region as your data lake)
  3. Select your output Storage Account under ‘Account name’[1]
  4. Select your output container under ‘File system name’
  5. Click ‘Next: Security’ and specify a strong ‘SQL Password’ (note this down)
  6. Click ‘Review + create’

Map your Parquet data lake to tables

  1. Verify that your output container contains Parquet files[2]
  2. Download and unzip below script
  3. Open the synapse-map-tables.py via a text editor
  4. Add your details and save the script
  5. Run the script to map your current Parquet data lake to tables[3]

Synapse map tables script | changelog

Note

The script adds ‘meta data’ about your output container. If new devices/messages are added to your Parquet data lake, the script should be run again (manually or by schedule)[4]

You are now ready to use Synapse as a data source in e.g. Grafana-Synapse dashboards or PowerBI-Synapse dashboards.


[1]If your output storage account does not show up it may be because it was set up incorrectly. Check that it shows as a ‘StorageV2’ kind under Storage Accounts and that ‘Hierarchical namespace’ is enabled
[2]If your output bucket is empty, you can upload a test MDF file to your input bucket to create some Parquet data
[3]This assumes that you have installed Python 3.11 and the requirements.txt (we recommend using a virtual environment). You can set this up by running python -m venv env & env\Scripts\activate & pip install -r requirements.txt
[4]You only need to re-run the script if new tables are to be created, not if you simply add more data to an existing table