
PLLazyFrameReader

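A LazyFrame is an abstraction of a dataframe that streams data from your sync output, applies the relevant transformations, and writes the result to your export format without ever loading the entire dataset into memory. The example below reads each input stream lazily, adds a constant tenant_id column, and exports the result.
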
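import gluestick as gs
import polars as pl

# Reader that exposes each synced stream as a Polars LazyFrame
reader = gs.PLLazyFrameReader()

TENANT_ID = "TENANT_123"

for stream in reader.input_files.keys():
    # Get the stream lazily, using the column types from the catalog
    lf = reader.get(stream, catalog_types=True)
    # Add a constant tenant_id column; this only extends the query plan
    lf = lf.with_columns(
        pl.lit(TENANT_ID).alias("tenant_id")
    )
    # Write the stream to the output directory, keyed on "id"
    gs.to_export(lf, stream, "./etl-output", keys=["id"])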