# ETL

> hotglue CLI commands for ETL

Programmatically manage your transformation scripts using the `etl` commands below.

# ETL Download

### Description

Clones the remote ETL script saved in hotglue to your local machine.

### Sample

```shell theme={null}

$ hotglue etl download [--overwrite] [--downloadTo]
✔ Finished: Verifying user and authorizing.
✔ Finished: Scanning for downloadable files.
ℹ Info: Downloading script files to ./scripts/tap.
✔ Finished: Downloading file: etl.ipynb.
┌───────────┬────────────┐
│ File      │ Status     │
├───────────┼────────────┤
│ etl.ipynb │ Downloaded │
└───────────┴────────────┘

```

### Parameters

| Option               | Default | Description                                                                      |
| -------------------- | ------- | -------------------------------------------------------------------------------- |
| `--overwrite`, `-o`  | `false` | When enabled, overwrites any files that already exist in the download directory. |
| `--downloadTo`, `-d` | `.`     | The directory to download the ETL script to. Defaults to the current directory.  |

# ETL Deploy

### Description

Deploys the local ETL script to hotglue.

### Sample

```shell theme={null}
$ hotglue etl deploy [--sourceFolder]
✔ Finished: Verifying user and authorizing.
✔ Finished: Validating flow and tap location.
✔ Finished: Preparing deployment target.
ℹ Info: Deploying ETL scripts.
✔ Finished: Pushing file: default/flows/bTHIweD0W/taps/cin7/etl/etl.ipynb.
┌─────────────────────────────────────────────────┬──────────┐
│ File                                            │ Status   │
├─────────────────────────────────────────────────┼──────────┤
│ default/flows/bTHIweD0W/taps/cin7/etl/etl.ipynb │ Deleted  │
├─────────────────────────────────────────────────┼──────────┤
│ default/flows/bTHIweD0W/taps/cin7/etl/etl.ipynb │ Deployed │
└─────────────────────────────────────────────────┴──────────┘
```

### Parameters

| Option                 | Default | Description                                                                     |
| ---------------------- | ------- | ------------------------------------------------------------------------------- |
| `--sourceFolder`, `-s` | `.`     | The directory to upload the ETL script from. Defaults to the current directory. |

# ETL Delete

### Description

Deletes a deployed ETL script on hotglue.

### Sample

```shell theme={null}
$ hotglue etl delete
ℹ Info: Deleting ETL scripts for Tenant tenantId Flow flowId and Tap tapId to envId.
✔ Finished: Verifying user and authorizing.
✔ Finished: Deleting ETL scripts.
┌───────────┬─────────┐
│ File      │ Status  │
├───────────┼─────────┤
│ etl.ipynb │ Deleted │
└───────────┴─────────┘

```

# ETL Set up Local Job Data

### Description

Clones hotglue job data to your local machine and creates a `.env` file with the job's environment variables.
The file structure and content are identical to the file system your ETL script ran in.

The downloaded `etl-output` folder will be renamed to `etl-output-reference`\
and the `snapshots` folder will be renamed to `snapshots-reference`.

We recommend using this command to reproduce ETL failures or to backtest against successful jobs.

### Sample

```shell theme={null}

$ hotglue etl setup-local-run tenant123/flows/At_kHalC/jobs/2026/02/09/05/7Y6iUA [--include-configs] [--overwrite] [--downloadTo]
✔ Finished: Verifying user and authorizing.
✔ Finished: Scanning for downloadable files.
┌──────────────────────────────────────────┬─────────┬──────────────────────┐
│ File                                     │ Size    │ LastModified         │
├──────────────────────────────────────────┼─────────┼──────────────────────┤
│ catalog.json                             │ 571     │ 1/9/2026, 6:29:30 PM │
├──────────────────────────────────────────┼─────────┼──────────────────────┤
│ sync-output/products-20220209T222727.csv │ 3226419 │ 1/9/2026, 5:27:34 PM │
└──────────────────────────────────────────┴─────────┴──────────────────────┘
ℹ Info: Downloading files to `.`
✔ Finished: Downloading file: catalog.json
✔ Finished: Downloading file: sync-output/products-20220209T222727.csv

```

### Parameters

| Option               | Default | Description                                                                                                                                     |
| -------------------- | ------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
| `--include-configs`  | `false` | When enabled, also downloads `target-config.json`, `source-config.json`, and `tenant-config.json`, and sets the `API_KEY` environment variable. |
| `--overwrite`, `-o`  | `false` | When enabled, overwrites any files that already exist in the download directory.                                                                |
| `--downloadTo`, `-d` | `.`     | The directory to download the job data to. Defaults to the current directory.                                                                   |

### Running the ETL with the job's environment variables

Running the `setup-local-run` command creates a `.env` file containing the same environment variables that were available when the job ran in the hotglue environment.\
To run the ETL with those same environment variables, use one of the following methods:

#### For VSCode and its variants

Open the launcher file `{project_folder}/.vscode/launch.json` and add the `envFile` entry to the launch configuration:

```json theme={null}
{
    "version": "0.2.0",
    "configurations": [
        ...,
        {
            "name": "Run ETL script",
            "type": "debugpy",
            "request": "launch",
            "program": "${workspaceFolder}/etl.py",
            "console": "integratedTerminal",
            "cwd": "${workspaceFolder}",
            "python": "${workspaceFolder}/.venv/bin/python",
            "envFile": "${workspaceFolder}/.env"
        }
    ]
}
```

#### For Linux/macOS/Git bash/WSL terminal

Run the following:

```shell theme={null}
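set -a          # export every variable defined while sourcing
source .env     # load the job's environment variables
set +a          # stop auto-exporting
python etl.py   # the script now inherits those variables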
```
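
When the ETL is started from another Python process rather than from a shell, the same file could be loaded with a small stdlib-only helper. The sketch below is illustrative and not part of the hotglue CLI: `load_env_file` is a hypothetical name, and it assumes the `.env` file holds simple `KEY=VALUE` lines (optionally prefixed with `export`), which may not cover every value the file can contain.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines and copy them into os.environ."""
    loaded: dict[str, str] = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Tolerate an optional leading "export ".
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        loaded[key.strip()] = value
    os.environ.update(loaded)
    return loaded
```

Calling `load_env_file()` before importing or launching the ETL code makes the variables visible to that process and to any child processes it starts.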

# ETL Local Run

### Description

Runs the ETL locally, replicating the hotglue environment, and compares the `etl-output` files with `etl-output-reference` (the output from the job).

The comparator checks for extra or missing files in the `etl-output` folder
and compares the matching `.csv` and `.singer` files. If they are not the same, an error is shown.

To use it, first download the transformation script using the `etl download` command
and the job data using the `etl setup-local-run` command.

We recommend using this command to reproduce ETL failures or to backtest against successful jobs.
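
The extra/missing file check described above can be pictured as a set difference between the two folders. This is a minimal sketch of the idea, not the comparator's actual code: `diff_file_sets` is a hypothetical helper, and restricting the check to `.csv` and `.singer` files is an assumption based on the extensions the comparator reports comparing.

```python
from pathlib import Path

# Assumption: only these extensions take part in the check.
COMPARED_EXTENSIONS = (".csv", ".singer")


def diff_file_sets(output_dir: str, reference_dir: str) -> tuple[set[str], set[str]]:
    """Return (extra, missing) file names relative to the reference folder."""

    def names(folder: str) -> set[str]:
        return {
            p.name
            for p in Path(folder).iterdir()
            if p.is_file() and p.suffix in COMPARED_EXTENSIONS
        }

    produced = names(output_dir)
    expected = names(reference_dir)
    # Extra: produced locally but absent from the job's output.
    # Missing: present in the job's output but not produced locally.
    return produced - expected, expected - produced
```

For example, `diff_file_sets("etl-output", "etl-output-reference")` would return `customers.csv` in the first set if the local run produced a file the original job did not.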

### Sample where the output matches the output from the job

```shell theme={null}
$ hotglue etl local-run
✔ Finished: Building Docker image. With code 0
2026-05-06 13:28:15,194 - ETL Local Run - INFO - Installing dependencies...
2026-05-06 13:28:17,316 - ETL Local Run - INFO - Dependencies installed successfully.
2026-05-06 13:28:17,317 - ETL Local Run - INFO - 
                #########################################################
                ################# Running ETL ###########################
                #########################################################
2026-05-06 13:28:17,707 - ETL Local Run - INFO - ETL run successfully completed.
2026-05-06 13:28:17,707 - ETL Local Run - INFO - 
                #########################################################
                ################# Comparing ETL output ##################
                #########################################################
2026-05-06 13:28:17,708 - ETL Local Run - INFO - Comparing file: users.csv
2026-05-06 13:28:17,709 - ETL Local Run - INFO - No differences found in users.csv
2026-05-06 13:28:17,709 - ETL Local Run - INFO - 
                ##############################################################
                ################ ETL output comparator result ################
                ##############################################################
2026-05-06 13:28:17,709 - ETL Local Run - INFO - NOTE: Only files with ('.csv', '.singer') extension are compared.

2026-05-06 13:28:17,709 - ETL Local Run - INFO - Files compared: 
2026-05-06 13:28:17,710 - ETL Local Run - INFO -   - users.csv
2026-05-06 13:28:17,710 - ETL Local Run - INFO - No differences found in the ETL output!
✔ Finished: Running ETL locally. With code 0
```

### Sample where the output doesn't match the output from the job

```shell theme={null}
$ hotglue etl local-run
✔ Finished: Building Docker image. With code 0
Docker run failed with code 1.
2026-05-06 14:02:43,905 - ETL Local Run - INFO - Installing dependencies...
2026-05-06 14:02:46,200 - ETL Local Run - INFO - Dependencies installed successfully.
2026-05-06 14:02:46,204 - ETL Local Run - INFO - 
                #########################################################
                ################# Running ETL ###########################
                #########################################################
2026-05-06 14:02:46,579 - ETL Local Run - INFO - ETL run successfully completed.
2026-05-06 14:02:46,579 - ETL Local Run - INFO - 
                #########################################################
                ################# Comparing ETL output ##################
                #########################################################
2026-05-06 14:02:46,584 - ETL Local Run - ERROR - Extra files in etl-output: {'customers.csv'}
2026-05-06 14:02:46,584 - ETL Local Run - INFO - Comparing file: users.csv
2026-05-06 14:02:46,585 - ETL Local Run - INFO - No differences found in users.csv
2026-05-06 14:02:46,585 - ETL Local Run - INFO - 
                ##############################################################
                ################ ETL output comparator result ################
                ##############################################################

2026-05-06 14:02:46,585 - ETL Local Run - INFO - NOTE: Only files with ('.csv', '.singer') extension are compared.

2026-05-06 14:02:46,585 - ETL Local Run - INFO - Files compared: 
2026-05-06 14:02:46,585 - ETL Local Run - INFO -   - users.csv
2026-05-06 14:02:46,585 - ETL Local Run - ERROR - Found differences in the ETL output:

Extra files in etl-output: {'customers.csv'}
✖ Error: Running ETL locally.
```

### Parameters

| Option              | Default | Description                                                                                                    |
| ------------------- | ------- | -------------------------------------------------------------------------------------------------------------- |
| `--etlScriptFolder` | `.`     | ETL script folder (downloaded using `etl download`)                                                            |
| `--jobDataFolder`   | `.`     | Job data folder (downloaded using `etl setup-local-run`)                                                       |
| `--dockerPlatform`  | (empty) | Docker platform (for example `linux/amd64` or `linux/arm64`). Leave empty to use the Docker daemon's default platform. |

### Output comparator options

The ETL output comparator can be configured by creating a `test-config.json` file in the ETL script folder.
The available options are listed below.
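
Because standard JSON has no comment syntax, one way to produce a clean file is to generate it from Python. The sketch below assumes the three options are top-level keys of a single JSON object in `test-config.json`; that layout is an assumption drawn from the fragments in this section, and `write_test_config` is a hypothetical helper, not a hotglue command.

```python
import json
from pathlib import Path

# Example values taken from the option descriptions in this section.
test_config = {
    "sort_config": {"products": ["id", "images.id", "tags."]},
    "ignore_columns": {"products": ["body_html", "images.alt"]},
    "rename_config": {
        "products": {"created_at": "c_at", "images.created_at": "c_at"}
    },
}


def write_test_config(script_folder: str = ".") -> Path:
    """Write test-config.json into the given ETL script folder."""
    path = Path(script_folder) / "test-config.json"
    path.write_text(json.dumps(test_config, indent=4) + "\n", encoding="utf-8")
    return path
```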

#### 1. **`sort_config`**

Specifies how rows in a stream and nested fields within rows should be sorted. Supports flat fields, nested fields, and lists of scalars.

* **Flat Field Sorting**: Specifies the column used to sort the rows of a stream.
* **Nested Field Sorting**: Uses dot notation to sort lists of dictionaries within a row.
* **List of Scalars Sorting**: Uses a trailing `.` to sort lists of scalars within a row.

**Example Configuration**:

```json theme={null}
"sort_config": {
    "products": [
        "id",            // Sort rows by "id"
        "images.id",     // Sort "images" (list of dictionaries) by "id"
        "tags."          // Sort "tags" (list of scalars) alphabetically
    ]
}
```
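
To make the three key forms concrete, here is a rough sketch of how such a configuration might be applied to the rows of one stream. It is an illustration only and not the comparator's implementation: `apply_sort_config` is a hypothetical function, and it handles a single level of nesting.

```python
from typing import Any

Row = dict[str, Any]


def apply_sort_config(rows: list[Row], sort_keys: list[str]) -> list[Row]:
    """Return a sorted copy of rows following the three sort_config key forms."""
    row_keys = [k for k in sort_keys if "." not in k]
    result = []
    for row in rows:
        row = dict(row)  # shallow copy so the caller's rows are untouched
        for key in sort_keys:
            if "." not in key:
                continue
            field, _, nested = key.partition(".")
            items = row.get(field)
            if not isinstance(items, list):
                continue
            if nested:  # "images.id": list of dictionaries sorted by a nested key
                row[field] = sorted(items, key=lambda item: item[nested])
            else:  # "tags.": list of scalars
                row[field] = sorted(items)
        result.append(row)
    # Flat keys such as "id" order the rows themselves.
    return sorted(result, key=lambda r: tuple(r[k] for k in row_keys))
```

Sorting both the local output and the reference output the same way means that differences in row or list order alone are not reported as mismatches.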

#### 2. **`ignore_columns`**

Specifies fields to ignore during the comparison. Supports flat fields and nested fields using dot notation.

* **Flat Fields**: Directly removes the specified field from rows.
* **Nested Fields**: Removes specified fields within nested structures using dot notation.

**Example Configuration**:

```json theme={null}
"ignore_columns": {
    "products": [
        "body_html",       // Ignore "body_html" column
        "images.alt"       // Ignore "alt" field in "images" (list of dictionaries)
    ]
}
```
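
The effect on a single row can be sketched as follows. This is an assumed reading of the option rather than the comparator's own code: `drop_ignored` is a hypothetical helper and supports only one level of dot notation.

```python
from typing import Any

Row = dict[str, Any]


def drop_ignored(row: Row, ignored: list[str]) -> Row:
    """Return a copy of row without the ignored flat or nested fields."""
    cleaned = dict(row)
    for path in ignored:
        field, _, nested = path.partition(".")
        if not nested:
            cleaned.pop(field, None)  # flat field, e.g. "body_html"
            continue
        value = cleaned.get(field)
        if isinstance(value, list):  # e.g. "images.alt" on a list of dictionaries
            cleaned[field] = [
                {k: v for k, v in item.items() if k != nested}
                if isinstance(item, dict)
                else item
                for item in value
            ]
        elif isinstance(value, dict):  # a single nested dictionary
            cleaned[field] = {k: v for k, v in value.items() if k != nested}
    return cleaned
```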

#### 3. **`rename_config`**

Specifies fields to rename in the `etl-output` files only (the `etl-output-reference` files are left unchanged). Supports flat and nested fields using dot notation.

* **Flat Fields**: Renames top-level fields in rows.
* **Nested Fields**: Renames fields within nested structures using dot notation.

**Example Configuration**:

```json theme={null}
"rename_config": {
    "products": {
        "created_at": "c_at",         // Rename "created_at" to "c_at"
        "images.created_at": "c_at"  // Rename "created_at" to "c_at" within "images" (list of dictionaries)
    }
}
```
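
As a rough illustration, the mapping above might be applied to each row of the local output like this before comparison. `rename_fields` is a hypothetical helper written for this sketch, it handles one level of nesting only, and it is not taken from the comparator's source.

```python
from typing import Any

Row = dict[str, Any]


def rename_fields(row: Row, renames: dict[str, str]) -> Row:
    """Return a copy of row with flat and nested fields renamed."""
    renamed = dict(row)
    for path, new_name in renames.items():
        field, _, nested = path.partition(".")
        if not nested:
            # Flat field, e.g. "created_at" -> "c_at".
            if field in renamed:
                renamed[new_name] = renamed.pop(field)
            continue
        value = renamed.get(field)
        if isinstance(value, list):
            # Nested field inside a list of dictionaries, e.g. "images.created_at".
            renamed[field] = [
                {(new_name if k == nested else k): v for k, v in item.items()}
                if isinstance(item, dict)
                else item
                for item in value
            ]
    return renamed
```

This kind of renaming is useful when a script change intentionally renames a column and the rest of the output should still be compared against the original job.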
