Customize your cloud function
In some use cases you may need to add custom code to your cloud function automation. For example, you can use this to create custom Parquet tables with calculated signals or to write data to external endpoints (e.g. databases).
When to use custom cloud functions
Before you start customizing your cloud function, consider whether it is the best way to achieve your intended goal. Generally, a cloud function is useful in the following situations:
- If the processing has to be done immediately upon file upload
- If the processing has to be done on all of your data
- If the processing can be done file-by-file[1]
How the default cloud function works
To customize your cloud function, it is useful to understand the ‘default’ workflow:
- The function downloads the trigger MDF log file and various files (e.g. DBCs)
- It uses mdf2parquet_decode to DBC decode the data into Parquet files
- It may do message customization and event detection (if required JSON files are found)
- It runs the function process_decoded_data on the Parquet data (by default uploading files to S3)
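The last step is where most customization happens, for example when pushing data to an external endpoint instead of the output bucket. The sketch below is only an illustration of that idea and is not the code shipped in the function zip: the helper names (find_parquet_files, push_decoded_files), the send callback and the assumption that the decoded Parquet files sit in a local folder are all hypothetical. Check the actual signature of process_decoded_data in your functions.py before adapting it.

```python
from pathlib import Path
from typing import Callable, List


def find_parquet_files(decoded_dir: Path) -> List[Path]:
    """Return all Parquet files below decoded_dir, sorted for stable ordering."""
    return sorted(p for p in decoded_dir.rglob("*.parquet") if p.is_file())


def push_decoded_files(decoded_dir: Path, send: Callable[[str, bytes], None]) -> int:
    """Pass each decoded Parquet file to `send` and return the number of files sent.

    `send` is a hypothetical callback. It receives a key (the file path relative
    to decoded_dir, using forward slashes) and the raw file bytes. In practice it
    could wrap an upload to a bucket, an HTTP request or a database write.
    """
    count = 0
    for path in find_parquet_files(decoded_dir):
        key = path.relative_to(decoded_dir).as_posix()
        send(key, path.read_bytes())
        count += 1
    return count
```

Keeping the destination behind a callback means the same loop can be tested locally with a simple collector function before it is wired to a real endpoint.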
How to customize your cloud function and redeploy
You may want to update certain files in this workflow, e.g. custom_message_functions.py (to add custom messages) or functions.py (e.g. to push data to an external endpoint instead of your output bucket). To do this, follow the steps below:
- Download the cloud function zip from your input bucket and unzip it
- Customize the code as needed, then re-zip the files correctly[2]
- Redeploy your cloud function
In Amazon, you can follow our guide to update your deployed Lambda function. In Google/Azure, you can update the zip name (e.g. change the last version digit) before uploading it to your input bucket/container. After this, you can re-run the Terraform deployment script with your revised function zip name.
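The re-zip and rename steps can also be scripted, which avoids accidentally zipping the enclosing folder. The following is a minimal sketch using only the Python standard library. The helper names (bump_last_digit, zip_flat) and the example file name are hypothetical, and the sketch assumes the zip name ends in a number directly before the .zip extension.

```python
import re
import zipfile
from pathlib import Path
from typing import List


def bump_last_digit(zip_name: str) -> str:
    """Increment the last number before '.zip' in a (hypothetical) zip name."""
    match = re.search(r"(\d+)(\.zip)$", zip_name)
    if match is None:
        raise ValueError(f"no trailing version number in {zip_name!r}")
    bumped = str(int(match.group(1)) + 1)
    return zip_name[: match.start(1)] + bumped + match.group(2)


def zip_flat(source_dir: Path, zip_path: Path) -> List[str]:
    """Zip the contents of source_dir so the files sit at the archive root.

    Archive names are relative to source_dir, so the folder itself is not
    included. Returns the list of names written to the archive.
    """
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source_dir.rglob("*")):
            # Skip directories and the output zip itself (if it lives in source_dir)
            if path.is_file() and path.resolve() != zip_path.resolve():
                arcname = path.relative_to(source_dir).as_posix()
                archive.write(path, arcname)
                names.append(arcname)
    return names
```

For example, bump_last_digit("my-function-v1.0.2.zip") gives "my-function-v1.0.3.zip", and zip_flat(Path("unzipped"), Path("my-function-v1.0.3.zip")) writes the files from the unzipped folder directly into the root of the new archive.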
| [1] | For example, Lambda functions are not useful for performing analyses that need to aggregate data across devices, trips, days or similar, as this involves multiple log files (and may exceed the max Lambda function timeout). For such use cases, a periodic Glue job is better suited (see our trip summary section as an example). |
| [2] | Make sure that you include all the required files in the zip and that you do not e.g. zip a folder containing the files. If you are using Windows 11 rather than Windows 10, you may experience issues with the zip method. If so, try creating an empty zip folder and moving the files into that instead. |