readSnapshots Function

This function reads snapshot data for a specified stream from either a Parquet or a CSV file.
Parameters:
  • stream (string): The name of the stream to read snapshots for.
  • snapshotDir (string): The directory containing the snapshot files.
  • options (object): Optional. CSV read options.
Returns:
  • A Polars.DataFrame containing the snapshot data, or null if no snapshot exists.
Example Usage:
const snapshot = readSnapshots('users', '/path/to/snapshot/dir');
if (snapshot) {
  console.log('Snapshot loaded:', snapshot);
} else {
  console.log('No snapshot available.');
}
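
The Parquet-or-CSV lookup described above can be pictured with a small sketch. It is not the library's implementation: the helper name resolveSnapshotPath and the `<stream>.parquet` / `<stream>.csv` file naming are assumptions made purely for illustration, and the sketch only locates a file rather than loading it into a DataFrame.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: the `<stream>.parquet` / `<stream>.csv` naming is an
// assumption for illustration, not the documented on-disk layout.
function resolveSnapshotPath(stream, snapshotDir) {
  // Check for a Parquet file first, then fall back to CSV.
  for (const ext of ['parquet', 'csv']) {
    const candidate = path.join(snapshotDir, `${stream}.${ext}`);
    if (fs.existsSync(candidate)) {
      return { path: candidate, format: ext };
    }
  }
  // Mirrors the "null if no snapshot exists" contract of readSnapshots.
  return null;
}

// Usage against a throwaway directory containing a single CSV snapshot.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'snapshots-'));
fs.writeFileSync(path.join(demoDir, 'users.csv'), 'id,name\n1,Ada\n');

const found = resolveSnapshotPath('users', demoDir);
const missing = resolveSnapshotPath('orders', demoDir);
console.log(found, missing);
```

Returning null for a missing snapshot, rather than throwing, keeps the caller's `if (snapshot)` check from the usage example above straightforward.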

snapshotRecords Function

This function creates or updates the snapshot records for a data stream, with options for deduplication, type coercion, datetime localization, and overwriting.
Parameters:
  • streamData (Polars.DataFrame or null): The data to snapshot.
  • stream (string): The name of the stream.
  • snapshotDir (string): The directory to store snapshot files.
  • pk (string): Primary key for deduplication. Defaults to “id”.
  • justNew (boolean): Whether to return only new data instead of the full snapshot.
  • useCsv (boolean): Whether to write snapshots as CSV instead of Parquet.
  • coerceTypes (boolean): If true, snapshot column types are coerced to match the dtypes of streamData.
  • localizeDatetimeTypes (boolean): If true, converts datetime columns to UTC.
  • overwrite (boolean): If true, overwrites the snapshot entirely instead of merging.
  • options (object): Additional options for CSV reading.
Returns:
  • The updated Polars.DataFrame of the snapshot, merged according to the primary key specified.
Example Usage:
// Column names and values are illustrative; the `id` column matches the default pk.
const df = new Polars.DataFrame({ id: [1, 2], name: ['Ada', 'Grace'] });
const dfWithSnapshottedData = snapshotRecords(df, 'users', '/path/to/snapshot/dir');
console.log('Updated Snapshot:', dfWithSnapshottedData);
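
To make the pk, justNew, and overwrite options more concrete, here is a rough sketch of primary-key merging on plain arrays of row objects instead of DataFrames. The helper mergeByPk is hypothetical, and two of its rules are assumptions rather than documented behavior: an incoming row replaces a stored row with the same key, and "new" means a key that is absent from the existing snapshot.

```javascript
// Hypothetical helper illustrating merge-by-primary-key semantics.
// Assumptions (not confirmed by the documentation): incoming rows win over
// stored rows with the same key, and "new" means "key not yet in the snapshot".
function mergeByPk(existingRows, incomingRows, { pk = 'id', justNew = false, overwrite = false } = {}) {
  // Deduplicate the incoming rows by key; the last occurrence wins.
  const incoming = new Map(incomingRows.map((row) => [row[pk], row]));

  // overwrite: discard the stored snapshot and keep only the incoming rows.
  if (overwrite) {
    return [...incoming.values()];
  }

  const existing = new Map(existingRows.map((row) => [row[pk], row]));

  // justNew: return only rows whose key was not in the snapshot before.
  if (justNew) {
    return [...incoming.values()].filter((row) => !existing.has(row[pk]));
  }

  // Default: the full merged snapshot, updated rows in place, new rows appended.
  return [...new Map([...existing, ...incoming]).values()];
}

const stored = [
  { id: 1, name: 'Ada' },
  { id: 2, name: 'Grace' },
];
const batch = [
  { id: 2, name: 'Grace Hopper' },
  { id: 3, name: 'Edsger' },
];

const fullSnapshot = mergeByPk(stored, batch);
const onlyNew = mergeByPk(stored, batch, { justNew: true });
const replaced = mergeByPk(stored, batch, { overwrite: true });
console.log(fullSnapshot, onlyNew, replaced);
```

Keying both sides in a Map keeps the merge linear in the number of rows and preserves the original row order for keys that already existed.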