Data Transformation Functions
Functions for cleaning, mapping, and transforming data in pandas DataFrames
clean_convert
Recursively removes None values from lists and dictionaries, including nested structures, and converts datetime objects to ISO format strings.
Installation
Basic Usage
Parameters
input
(dict, list): Input data structure to clean. Handles:
- Dictionaries
- Lists
- Datetime objects
- Scalar values
Returns
Cleaned data structure with None values removed
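The behaviour described above can be illustrated with a minimal sketch. The function name clean_convert_sketch is hypothetical and the body is an assumption based only on this description, not the library's actual implementation; in particular, whether None is dropped from both dict values and list items is assumed here.

```python
from datetime import datetime


def clean_convert_sketch(value):
    """Hypothetical stand-in for clean_convert: drop None, ISO-format datetimes."""
    if isinstance(value, dict):
        # Drop keys whose value is None, then recurse into what remains.
        return {k: clean_convert_sketch(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        # Drop None items, then recurse into what remains.
        return [clean_convert_sketch(v) for v in value if v is not None]
    if isinstance(value, datetime):
        return value.isoformat()
    # Scalars pass through unchanged.
    return value


record = {
    "id": 7,
    "name": None,
    "tags": ["a", None, "b"],
    "created": datetime(2024, 1, 15, 9, 30),
    "meta": {"source": None, "score": 0.5},
}
cleaned = clean_convert_sketch(record)
print(cleaned)
```

In this sketch the "name" key and the None list item disappear, while falsy but meaningful values such as 0 or an empty string would be kept, because only None is filtered.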
map_fields
Maps row values according to a specified mapping dictionary, supporting nested structures and arrays.
Usage
Parameters
row
(dict): Source data row
mapping
(dict): Mapping configuration
- Keys: Target field names
- Values: Source field names or nested mappings
Returns
Dictionary with mapped values
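As a rough illustration of the mapping convention (target names as keys, source names or nested mappings as values), here is a sketch under stated assumptions. The name map_fields_sketch is hypothetical, and the treatment of arrays (a list of source field names producing a list of values) and of missing fields (mapped to None) is a guess, since the description does not spell either out.

```python
def map_fields_sketch(row, mapping):
    """Hypothetical stand-in for map_fields, based on the description only."""
    result = {}
    for target, source in mapping.items():
        if isinstance(source, dict):
            # Nested mapping: build a nested dict from the same source row.
            result[target] = map_fields_sketch(row, source)
        elif isinstance(source, list):
            # Assumed array handling: collect the listed source fields in order.
            result[target] = [row.get(name) for name in source]
        else:
            # Plain source field name; a missing field becomes None (assumption).
            result[target] = row.get(source)
    return result


row = {"first": "Ada", "last": "Lovelace", "city": "London", "zip": "N1"}
mapping = {
    "given_name": "first",
    "address": {"town": "city", "postcode": "zip"},
    "name_parts": ["first", "last"],
}
mapped = map_fields_sketch(row, mapping)
print(mapped)
```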
rename
Renames DataFrame columns using JSON format with support for type conversion.
Usage
Parameters
df
(pd.DataFrame): Input DataFrame
target_columns
(dict, list):
- dict: Mapping of old to new column names
- list: Columns to select
Returns
DataFrame with renamed columns
Notes
- Supports both renaming and column selection
- Preserves data types
- Returns original DataFrame if no mapping provided
- Only renames existing columns
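The notes above can be sketched with standard pandas calls. The name rename_sketch is hypothetical and this is an assumed reading of the documented behaviour, not the library's code; the type conversion mentioned in the description is not covered, because its interface is not documented here.

```python
import pandas as pd


def rename_sketch(df, target_columns=None):
    """Hypothetical stand-in for rename: dict renames, list selects."""
    if not target_columns:
        # No mapping provided: return the original DataFrame.
        return df
    if isinstance(target_columns, dict):
        # Only rename columns that actually exist in df.
        existing = {old: new for old, new in target_columns.items() if old in df.columns}
        return df.rename(columns=existing)
    # List: select the listed columns that exist, in the order given (assumption).
    return df[[col for col in target_columns if col in df.columns]]


df = pd.DataFrame({"id": [1, 2], "nm": ["a", "b"], "amt": [1.5, 2.5]})
renamed = rename_sketch(df, {"nm": "name", "missing": "ignored"})
selected = rename_sketch(df, ["amt", "id", "not_there"])
print(list(renamed.columns))
print(list(selected.columns))
```

Neither branch touches the values, so column dtypes carry over unchanged in this sketch.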
localize_datetime
Converts DataFrame datetime columns to UTC timezone, handling both naive and timezone-aware timestamps.
Usage
Parameters
df
(pd.DataFrame): Input DataFrame
column_name
(str): Name of datetime column to localize
Returns
Series with localized datetime values
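One plausible way to handle both naive and timezone-aware columns is shown below. The name localize_datetime_sketch is hypothetical, and the choice to treat naive timestamps as already being in UTC is an assumption; the description does not say which zone naive values are taken to be in.

```python
from datetime import timedelta, timezone

import pandas as pd


def localize_datetime_sketch(df, column_name):
    """Hypothetical stand-in for localize_datetime: return a UTC-aware Series."""
    series = pd.to_datetime(df[column_name])
    if series.dt.tz is None:
        # Naive timestamps: assumed to already represent UTC wall-clock time.
        return series.dt.tz_localize("UTC")
    # Aware timestamps: convert from their own zone to UTC.
    return series.dt.tz_convert("UTC")


plus_one = timezone(timedelta(hours=1))  # fixed UTC+01:00 offset for the demo
naive = pd.DataFrame({"ts": pd.to_datetime(["2024-01-15 09:30:00"])})
aware = pd.DataFrame({"ts": pd.to_datetime(["2024-01-15 09:30:00"]).tz_localize(plus_one)})

naive_utc = localize_datetime_sketch(naive, "ts")
aware_utc = localize_datetime_sketch(aware, "ts")
print(naive_utc.iloc[0], aware_utc.iloc[0])
```

The distinction matters because tz_localize attaches a zone without shifting the clock time, whereas tz_convert shifts the clock time to express the same instant in UTC.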
deep_convert_datetimes
Recursively transforms all datetime objects to ISO format strings within nested data structures.
Usage
Parameters
value
(any): Input value or data structure
- Handles dictionaries, lists, datetime objects
- Processes nested structures recursively
Returns
Data structure with datetime objects converted to ISO format strings
Notes
- Uses "%Y-%m-%dT%H:%M:%S.%fZ" format
- Handles both datetime and date objects
- Preserves original data structure
- Safe for non-datetime values
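A minimal sketch of the recursion, assuming the format string from the notes is applied with strftime. The name deep_convert_sketch is hypothetical, and treating plain date objects as midnight on that day is an assumption. Note that the trailing "Z" in the format is a literal character, so this sketch does not perform any timezone conversion.

```python
from datetime import date, datetime

ISO_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def deep_convert_sketch(value):
    """Hypothetical stand-in for deep_convert_datetimes."""
    if isinstance(value, dict):
        return {k: deep_convert_sketch(v) for k, v in value.items()}
    if isinstance(value, list):
        return [deep_convert_sketch(v) for v in value]
    # datetime is a subclass of date, so it must be checked first.
    if isinstance(value, datetime):
        return value.strftime(ISO_FORMAT)
    if isinstance(value, date):
        # Assumption: a plain date is formatted as midnight on that day.
        return datetime(value.year, value.month, value.day).strftime(ISO_FORMAT)
    # Non-datetime values pass through untouched.
    return value


payload = {
    "event": "signup",
    "at": datetime(2024, 1, 15, 9, 30, 0, 123000),
    "history": [date(2024, 1, 1), {"seen": datetime(2023, 12, 31, 23, 59, 59)}],
}
converted = deep_convert_sketch(payload)
print(converted)
```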
exception
Recommended: Standardized error handling and logging for ETL pipelines.
Usage
Parameters
exception
(Exception): Caught exception
root_dir
(str): Directory for error log
error_message
(str): Additional context message
Notes
- Creates consistent error format
- Logs errors to 'errors.txt'
- Preserves original exception details
- Adds contextual information
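The following sketch shows one way such a handler could work, using the three documented parameters. The name log_exception_sketch and the layout of each log entry (timestamp, context message, exception type, traceback) are assumptions made for illustration; only the errors.txt file name comes from the notes above.

```python
import os
import tempfile
import traceback
from datetime import datetime, timezone


def log_exception_sketch(exception, root_dir, error_message=""):
    """Hypothetical stand-in for exception: append one entry to errors.txt."""
    timestamp = datetime.now(timezone.utc).isoformat()
    details = "".join(
        traceback.format_exception(type(exception), exception, exception.__traceback__)
    )
    entry = (
        f"[{timestamp}] {error_message}\n"
        f"{type(exception).__name__}: {exception}\n"
        f"{details}\n"
    )
    path = os.path.join(root_dir, "errors.txt")
    # Append so earlier errors from the same pipeline run are kept.
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(entry)
    return path


# Demo in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    try:
        int("not a number")
    except ValueError as exc:
        log_path = log_exception_sketch(exc, tmp, "Failed to parse row 12")
    with open(log_path, encoding="utf-8") as handle:
        logged = handle.read()

print(logged)
```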