TXT to Parquet Decoder

The Parquet decoder stores the decoded output as Parquet files. For more information on the Parquet format itself, see: https://parquet.apache.org/

Note

All major programming languages have Parquet support. See: https://arrow.apache.org/docs/


Tool support

Examples of some specific tools/languages supporting the Parquet format:


Output

The Parquet output data schema always uses the following structure:

  • One timestamp column (t) using data type Int64 (MICROS, i.e. microsecond resolution) and Snappy compression

  • One or more signal columns using data type double (NULL allowed) and Snappy compression
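As a minimal sketch of how a raw t value can be interpreted: in the Parquet format, an Int64 timestamp with MICROS resolution counts microseconds since the Unix epoch (1970-01-01). The helper name micros_to_datetime is hypothetical, the sample value is illustrative only, and treating the result as UTC is an assumption. Most Parquet readers perform this conversion automatically.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch, the reference point for Parquet MICROS timestamps
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def micros_to_datetime(t_micros: int) -> datetime:
    """Convert a raw Int64 microsecond timestamp to a datetime (UTC assumed)."""
    return EPOCH + timedelta(microseconds=t_micros)


# Illustrative value only, not taken from a real output file
print(micros_to_datetime(1_650_636_883_000_000).isoformat())
```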

The row group size is set to 1,000,000 (1e6).
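To get a rough idea of what this means for a given file, the number of row groups can be estimated from the record count. This sketch assumes the size is counted in rows (records). The function name estimated_row_groups is hypothetical and not part of the decoder.

```python
import math

# Row group size stated above (assumed here to be counted in rows)
ROW_GROUP_SIZE = 1_000_000


def estimated_row_groups(num_rows: int) -> int:
    """Estimate how many row groups are needed to hold num_rows records."""
    return math.ceil(num_rows / ROW_GROUP_SIZE)


# Example: 2.5 million decoded records
print(estimated_row_groups(2_500_000))
```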

The signal names are constructed from the database used for decoding, as in the example below:

         t              Speed    SpeedAccuracy    SpeedValid
____________________    _____    _____________    __________

22-Apr-2022 14:14:43    0.01         2.006            1
22-Apr-2022 14:14:44    0.01         2.152            1
22-Apr-2022 14:14:45    0.01         2.290            1

If a signal value falls outside the MIN/MAX range defined in the database, it is included in the output as NULL.
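The range rule can be sketched as follows. This is an illustration of the described behavior, not the decoder's actual code: the function name apply_range and the limits in the example are hypothetical, and treating the MIN/MAX bounds as inclusive is an assumption.

```python
from typing import Optional


def apply_range(value: float, minimum: float, maximum: float) -> Optional[float]:
    """Return value if it lies within [minimum, maximum], else None (NULL)."""
    if value < minimum or value > maximum:
        return None
    return value


# Hypothetical limits for a speed signal: 0 to 250
print(apply_range(0.01, 0.0, 250.0))   # within range, value is kept
print(apply_range(300.0, 0.0, 250.0))  # out of range, becomes None (NULL)
```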

Warning

Output records are skipped if all values are NULL.
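A minimal sketch of this rule, assuming each record is represented as a mapping from column name to value, with None standing in for NULL. The names Record and drop_all_null are hypothetical and only serve the illustration.

```python
from typing import Dict, List, Optional

# One output record: column name -> value, with None standing in for NULL
Record = Dict[str, Optional[float]]


def drop_all_null(records: List[Record], time_column: str = "t") -> List[Record]:
    """Keep only records that have at least one non-NULL signal value."""
    kept = []
    for record in records:
        # The timestamp column is not a signal, so it is ignored in the check
        signals = [v for name, v in record.items() if name != time_column]
        if any(v is not None for v in signals):
            kept.append(record)
    return kept


rows = [
    {"t": 1.0, "Speed": 0.01, "SpeedValid": 1.0},
    {"t": 2.0, "Speed": None, "SpeedValid": None},  # all signals NULL: skipped
    {"t": 3.0, "Speed": None, "SpeedValid": 1.0},   # partly NULL: kept
]
print(len(drop_all_null(rows)))
```

One consequence is that timestamps in the output are not guaranteed to be evenly spaced, since skipped records leave gaps.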


Changelog

All notable changes to this project will be documented in this file.

## [24.10.17]

### Added

- Support for transport protocols (ISO-TP, J1939-21, NMEA-TP, MUX-TP)

### Changed

- Versioning scheme changed from SemVer to CalVer
- Default verbosity level changed to 2
- Type of out-of-range signal values changed from double *NaN* to *NULL* (change to Parquet schema)
- Restriction on max 5 DBC-files per interface/channel removed

### Fixed

- Fix for order of messages with same timestamp

### Known-issues

- Reduced Parquet write speed

## [2.3.2]

### Fixed

- Issue with merge of PGN source/destination addresses sometimes resulting in invalid output files

Download

Windows AMD64 / x86-64 (64-bit)

Linux AMD64 / x86-64 (64-bit)

Linux ARM64 (64-bit)