Customize your cloud function

In some use cases, you may need to add custom code to your cloud function automation. This can e.g. be used to create custom Parquet tables with calculated signals or to write data to external endpoints (e.g. databases).


When to use custom cloud functions

Before you start customizing your cloud function, consider whether it is the best way to achieve your intended goal. Generally, a cloud function is useful in the situations below:

  • If the processing has to be done immediately upon file upload
  • If the processing has to be done on all of your data
  • If the processing can be done file-by-file[1]

How the default cloud function works

To customize your cloud function, it is useful to understand the ‘default’ workflow:

  1. The function downloads the triggering MDF log file and related files (e.g. DBC files)
  2. It uses mdf2parquet_decode to DBC decode the data into Parquet files
  3. It optionally performs message customization and event detection (if the required JSON files are found)
  4. It runs the function process_decoded_data on the Parquet data (by default uploading files to S3)
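
To make step 4 concrete, the sketch below shows what such a process_decoded_data step could look like on AWS: it walks the locally decoded Parquet files and uploads them to the output bucket. This is a minimal, hedged illustration; the actual implementation in functions.py may differ, and the bucket name, function signature and path handling below are assumptions.

```python
# Minimal sketch (assumptions, not the actual functions.py implementation):
# upload all locally decoded Parquet files to the output S3 bucket,
# preserving the relative folder structure as the S3 key.
import os
import boto3

OUTPUT_BUCKET = "your-output-bucket"  # placeholder, replace with your bucket name

def process_decoded_data(local_dir: str) -> None:
    s3 = boto3.client("s3")
    for root, _, files in os.walk(local_dir):
        for name in files:
            if not name.endswith(".parquet"):
                continue
            local_path = os.path.join(root, name)
            s3_key = os.path.relpath(local_path, local_dir).replace(os.sep, "/")
            s3.upload_file(local_path, OUTPUT_BUCKET, s3_key)
```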

How to customize your cloud function and redeploy

You may want to update certain files in this workflow, e.g. custom_message_functions.py (to add custom messages) or functions.py (to e.g. push data to an external endpoint instead of your output bucket; see the sketch after these steps). To do this, go through the steps below:

  1. Download the cloud function zip from your input bucket and unzip it
  2. Customize the code as needed, then re-zip the files correctly[2]
  3. Redeploy your cloud function
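
For example, if you wish to push the decoded data to an external endpoint instead of your output bucket, you could replace the upload logic in process_decoded_data with an HTTP POST per Parquet file, along the lines of the hedged sketch below. The endpoint URL, headers and function signature are illustrative assumptions; the standard library urllib is used here to avoid adding extra dependencies to the function zip.

```python
# Hypothetical customization of process_decoded_data in functions.py:
# POST each decoded Parquet file to an external HTTP endpoint instead of
# uploading it to the output bucket. Endpoint, auth header and signature
# are placeholders; adapt them to your own setup.
import os
import urllib.request

ENDPOINT_URL = "https://example.com/api/parquet-upload"  # placeholder endpoint
API_KEY = "your-api-key"  # placeholder auth scheme

def process_decoded_data(local_dir: str) -> None:
    for root, _, files in os.walk(local_dir):
        for name in files:
            if not name.endswith(".parquet"):
                continue
            local_path = os.path.join(root, name)
            with open(local_path, "rb") as f:
                req = urllib.request.Request(
                    ENDPOINT_URL,
                    data=f.read(),
                    headers={
                        "Content-Type": "application/octet-stream",
                        "X-API-Key": API_KEY,
                        "X-File-Name": name,
                    },
                    method="POST",
                )
            # urlopen raises an HTTPError for non-2xx responses
            with urllib.request.urlopen(req, timeout=30) as resp:
                resp.read()
```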

On AWS, you can follow our guide to update your deployed Lambda function. In Google Cloud/Azure, you can update the zip name (e.g. increment the last version digit) before uploading it to your input bucket/container. After this, you can re-run the Terraform deployment script with your revised function zip name.


[1]For example, Lambda functions are not useful for performing analyses that need to aggregate data across devices, trips, days or similar, as this involves multiple log files (and may exceed the max Lambda function timeout). For such use cases, a periodic Glue job is better suited (see our trip summary section for an example)
[2]Make sure that you include all the required files in the zip and that you do not e.g. zip a folder containing the files. If you are using Windows 11 rather than Windows 10, you may experience issues with the zip method. If so, try creating an empty zip folder and moving the files into that instead.
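
If you prefer to script the re-zipping step (and sidestep the OS-specific quirks mentioned in [2]), a small Python sketch along the lines below can be used; the folder and archive names are placeholders.

```python
# Illustrative helper (placeholder names): zip all files in source_dir so that
# they sit at the root of the archive, i.e. without a wrapping parent folder.
import os
import zipfile

def zip_function_package(source_dir: str, zip_path: str) -> None:
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _, files in os.walk(source_dir):
            for name in files:
                full_path = os.path.join(root, name)
                arcname = os.path.relpath(full_path, source_dir)
                zf.write(full_path, arcname)

# Example usage (placeholder names):
# zip_function_package("my-cloud-function-unzipped", "my-cloud-function.zip")
```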