Write data to external database
This section explains how to customize your cloud function to write data to an external database via REST, enabling you to integrate your data with practically any system.
To write data to an external database, follow these steps:
- Follow the steps to set up a custom cloud function
- Update functions.py to load the Parquet file(s) into a data frame with pyarrow [1]
- Add your external database writing logic, then redeploy your function and test it
Note
You can use the built-in custom message functionality to resample all your data into a single Parquet file, which simplifies your database ingestion code.
REST vs. Python clients
The example is based on the REST API for writing data, which most databases support for data ingestion. Many databases also offer a Python client that you can use instead, but note that you will then need to create an ARN layer to enable support for the library.
Alternatively, you may be able to leverage the existing functionality in the AWS SDK ARN layer, as described in its API documentation. Examples of supported databases:
- Amazon Redshift
- PostgreSQL
- MySQL
- Microsoft SQL Server
- DynamoDB
- Amazon Timestream
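A REST write can be implemented with the Python standard library alone, which avoids the need for an extra ARN layer. The endpoint URL, the Bearer-token authentication, and the payload shape below are assumptions for illustration only; adapt them to your target database's ingestion API.

```python
import json
import urllib.request

def build_ingest_request(endpoint, api_key, rows):
    """Build a POST request for a hypothetical JSON ingestion endpoint.
    The Bearer-token header and the {"records": ...} payload shape are
    assumptions; check your database's REST API documentation."""
    body = json.dumps({"records": rows}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

def write_rows(endpoint, api_key, rows):
    """Send the rows to the endpoint and return the HTTP status code."""
    request = build_ingest_request(endpoint, api_key, rows)
    with urllib.request.urlopen(request) as response:
        return response.status
```

Because only standard-library modules are used, this sketch runs inside the cloud function without any additional layers; a Python client library would instead require the ARN layer mentioned above.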
[1] See the utils.py function load_parquet_to_df for this purpose.