Though most hotglue-compatible taps write data following the Singer spec, storing and processing Singer data can be inefficient.

To solve this, hotglue can transform your sync output into a more efficient storage format, called an intermediate format. Your ETL script then reads the data in that format.
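
To make the idea concrete, here is a minimal sketch of turning Singer output into a tabular file. It is not hotglue's implementation: the sample messages, the `contacts` stream, and the `singer_to_csv` helper are all hypothetical, and it assumes flat records with no nested fields.

```python
import csv
import io
import json

# Hypothetical Singer output: one JSON message per line (SCHEMA, RECORD, STATE).
SINGER_LINES = """\
{"type": "SCHEMA", "stream": "contacts", "schema": {"properties": {"id": {"type": "integer"}, "email": {"type": "string"}}}, "key_properties": ["id"]}
{"type": "RECORD", "stream": "contacts", "record": {"id": 1, "email": "a@example.com"}}
{"type": "RECORD", "stream": "contacts", "record": {"id": 2, "email": "b@example.com"}}
{"type": "STATE", "value": {"bookmarks": {"contacts": {"id": 2}}}}
"""


def singer_to_csv(lines):
    """Collect RECORD messages per stream and render each stream as CSV text."""
    columns = {}  # stream name -> ordered list of column names
    rows = {}     # stream name -> list of record dicts
    for line in lines.splitlines():
        if not line.strip():
            continue
        message = json.loads(line)
        if message.get("type") != "RECORD":
            continue  # SCHEMA and STATE messages carry no row data
        stream = message["stream"]
        record = message["record"]
        rows.setdefault(stream, []).append(record)
        stream_columns = columns.setdefault(stream, [])
        for key in record:
            if key not in stream_columns:
                stream_columns.append(key)

    output = {}
    for stream, records in rows.items():
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=columns[stream], lineterminator="\n")
        writer.writeheader()
        writer.writerows(records)
        output[stream] = buffer.getvalue()
    return output


csv_by_stream = singer_to_csv(SINGER_LINES)
print(csv_by_stream["contacts"])
```

In the Singer form, every record repeats its stream name and field names. The tabular form states the column names once per stream, which is one reason it is more compact and easier to load in an ETL script.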

hotglue supports four Intermediate Formats:

| Format | Description |
| --- | --- |
| Singer | Does not transform synced Singer data. This can speed up your jobs by avoiding the intermediate transformation, but is much less storage efficient and is harder to work with in ETL |
| CSV | Comma Separated Values. Larger file sizes than Parquet but offers more type flexibility |
| Parquet | The Apache Parquet format. More storage efficient than CSV and offers strict type validation |
| Parquet With Chunking | Writes Apache Parquet format in chunks, and compiles them together before ETL. |
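
The chunking idea (write the data in pieces, then compile the pieces before ETL) can be sketched as follows. This is only an illustration, not hotglue's implementation: it uses CSV files rather than Parquet so that it runs with the standard library alone, and the `write_chunks` and `compile_chunks` helpers, the file naming, and the chunk size are all hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def write_chunks(rows, fieldnames, chunk_size, directory):
    """Write rows to numbered chunk files holding at most chunk_size rows each."""
    paths = []
    for start in range(0, len(rows), chunk_size):
        path = Path(directory) / f"chunk_{start // chunk_size:04d}.csv"
        with path.open("w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows[start:start + chunk_size])
        paths.append(path)
    return paths


def compile_chunks(paths):
    """Read the chunk files back in order and return one combined list of rows."""
    combined = []
    for path in sorted(paths):
        with path.open(newline="") as handle:
            combined.extend(csv.DictReader(handle))
    return combined


# Hypothetical sample data; values are strings because CSV stores no types.
rows = [{"id": str(i), "email": f"user{i}@example.com"} for i in range(5)]
with tempfile.TemporaryDirectory() as directory:
    chunk_paths = write_chunks(rows, ["id", "email"], chunk_size=2, directory=directory)
    compiled = compile_chunks(chunk_paths)
```

Writing in chunks keeps each write small and bounded, while the compile step means the ETL script still sees a single dataset.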