Local Parquet data lake
The simplest way to set up a Parquet data lake is to manually create and store it locally:
- Create a local folder (input/) next to your mdf2parquet_decode.exe and prefixed DBC files
- Copy your MDF log files into this folder (with the CANedge path structure [1])
- Drag & drop this folder onto the mdf2parquet_decode.exe to create your data lake
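If you prefer to script the copy step, the following is a minimal Python sketch of one way to do it. It copies log files from an SD card's LOG/ folder into input/ while keeping the CANedge path structure. The function name stage_log_files, the example paths and the default ".MF4" extension filter are illustrative assumptions, not part of the decoder tooling.

```python
import shutil
from pathlib import Path


def stage_log_files(sd_log_dir, input_dir, suffixes=(".MF4",)):
    """Copy log files into input_dir, keeping the three-level CANedge layout.

    The default suffix filter is an assumption; pass the extensions your
    devices actually produce.
    """
    sd_log_dir = Path(sd_log_dir)
    input_dir = Path(input_dir)
    wanted = {s.lower() for s in suffixes}
    copied = []

    # Expected layout: [DEVICE_ID]/[SESSION_NUMBER]/[SPLIT_NUMBER].[FILE_EXTENSION]
    for src in sorted(sd_log_dir.glob("*/*/*")):
        if not src.is_file() or src.suffix.lower() not in wanted:
            continue
        # Recreate the same relative path below the input folder
        dst = input_dir / src.relative_to(sd_log_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

For example, stage_log_files("E:/LOG", "input") would mirror the device and session folders from a (hypothetical) SD card drive into input/, after which the folder can be dropped onto the executable as described above.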
If you want to use the more advanced cloud automation functionality locally, clone the canedge-mdftoparquet-automation repo and follow the README to process your local input folder.
Open source interfaces like DuckDB and ClickHouse let you query data from your local Parquet data lake via SQL. They can be used in e.g. Grafana dashboards, Excel or Python.
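As a rough illustration of the SQL route, here is a minimal Python sketch using DuckDB's read_parquet function with a recursive glob. The helper names build_preview_query and preview_lake, and the idea of pointing it at a folder such as "output", are assumptions for illustration; the actual folder name and layout of your data lake depend on the decoder output. The union_by_name option is included because files from different messages may not share the same columns.

```python
from pathlib import Path


def build_preview_query(lake_dir, limit=10):
    """Return a SQL string that previews rows from all Parquet files under lake_dir."""
    # Recursive glob: every .parquet file at any depth below lake_dir
    pattern = (Path(lake_dir) / "**" / "*.parquet").as_posix()
    # Double any single quotes so the path stays a valid SQL string literal
    pattern = pattern.replace("'", "''")
    return (
        f"SELECT * FROM read_parquet('{pattern}', union_by_name=true) "
        f"LIMIT {int(limit)}"
    )


def preview_lake(lake_dir, limit=10):
    """Run the preview query in an in-memory DuckDB database and return the rows."""
    import duckdb  # third-party package: pip install duckdb

    con = duckdb.connect()
    try:
        return con.execute(build_preview_query(lake_dir, limit)).fetchall()
    finally:
        con.close()
```

Calling preview_lake("output", limit=5) would return up to five rows as tuples. In practice you would typically point the query at a single message's subfolder and select specific signals rather than using SELECT *.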
Local interfaces are beyond the scope of this intro, but we recommend the resources below:
- MF4 decoder Docs - learn how to set up Grafana-ClickHouse or use DuckDB in Python
[1] I.e. how the files are stored in the LOG/ folder on a CANedge SD card: [DEVICE_ID]/[SESSION_NUMBER]/[SPLIT_NUMBER].[FILE_EXTENSION]