Connector Details

Name        Value
---------   -------------
Platform    AWS S3
Auth Type   API Keys
Direction   Bidirectional

Overview

You can use the S3 connector to read files from and write files to your S3 buckets.

How to authenticate with S3

Determining the appropriate IAM Policy

However you choose to authenticate, you will need an IAM policy that grants the right permissions. The policy depends on what your integration does:

  • Exporting to S3: Only needs s3:PutObject

  • Fetching from S3: Needs s3:ListBucket and s3:GetObject

Here’s a policy that allows both reading and writing. Replace BUCKET_NAME with your bucket name:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::BUCKET_NAME/*",
                "arn:aws:s3:::BUCKET_NAME"
            ]
        }
    ]
}
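
If you prefer scripting to the console, the same policy can be created with boto3. A minimal sketch (the policy name is a placeholder, and it assumes your local AWS credentials are allowed to manage IAM):

import json

import boto3

iam = boto3.client("iam")

policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::BUCKET_NAME/*",
                "arn:aws:s3:::BUCKET_NAME"
            ]
        }
    ]
}

# "hotglue-s3-read-write" is a placeholder; pick your own policy name
response = iam.create_policy(
    PolicyName="hotglue-s3-read-write",
    PolicyDocument=json.dumps(policy_document)
)
print(response["Policy"]["Arn"])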

Once you have your policy, you have two ways to authenticate with S3:

  1. HMAC Access Key ID and Secret (default)

  2. Cross-account role with external ID

Method 1: HMAC Access Key ID and Secret

By default, the S3 connector authenticates using an Access Key ID and Secret Access Key.

Step 1: Create an IAM User

  1. In AWS, navigate to the IAM service

  2. Go to Users > Create user

  3. Enter a user name

  4. Select Access key - Programmatic access

  5. Click Next: Permissions

Step 2: Create and Attach Policy

  1. Click Attach policies directly > Create Policy

  2. Select the JSON tab and paste the policy from above.

  3. Click Next: Tags > Next: Review

  4. Name the policy and click Create policy

  5. Return to the user creation workflow

  6. Select your newly created policy

  7. Click Next: Review > Create user

  8. Copy your Access Key ID and Secret Access Key, and paste them into the widget.

⚠️ Important: This is your only chance to copy the Secret Access Key. Store it securely.
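
Before pasting the credentials into the widget, you may want to confirm they work. A minimal boto3 sketch (the key values and bucket name are placeholders):

import boto3

s3 = boto3.client(
    "s3",
    aws_access_key_id="AKIA...",          # placeholder
    aws_secret_access_key="YOUR_SECRET",  # placeholder
)

# Exercises s3:PutObject and s3:ListBucket from the policy above
s3.put_object(Bucket="BUCKET_NAME", Key="hotglue-test.txt", Body=b"hello")
print(s3.list_objects_v2(Bucket="BUCKET_NAME", MaxKeys=5).get("Contents", []))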

Method 2: Cross-account AssumeRole with External ID

For enterprises requiring temporary credentials and enhanced security controls, you can use cross-account role access.

To enable this in hotglue, change the S3 connect_ui_params in your connector settings to include aws_external_id and aws_role_arn in lieu of aws_access_key_id and aws_secret_access_key, like below:

[
   {
      "id": "s3",
      ...
      "connect_ui_params": {
         "aws_external_id": {
            "label": "AWS External Id",
            "description": "AWS External Id"
         },
         "aws_role_arn": {
            "label": "AWS Role ARN",
            "description": "Role ARN for IAM role"
         },
         ...
      }
   }
]

Once aws_external_id is configured in your connector settings, the widget will automatically update to display a copy-pastable external ID for each tenant. Tenants will need this value to configure their settings in AWS.

Step 1: Create IAM Policy

  1. Navigate to the IAM service

  2. Go to Policies > Create policy

  3. Select the JSON tab and paste the policy above.

  4. Click Next: Tags > Next: Review

  5. Name the policy and click Create policy

Step 2: Create IAM Role

  1. Navigate to IAM > Roles > Create role

  2. Select Custom trust policy

  3. Paste the trust policy below, replacing EXTERNAL_ID with the external ID copied from the widget:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::581362835603:root"
            },
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {
                    "sts:ExternalId": "EXTERNAL_ID"
                }
            }
        }
    ]
}

  4. Click Next

  5. Select the policy created in Step 1 and click Next

  6. Enter a Role name and click Create role

  7. Copy the role’s ARN from the summary page, and paste it into the widget.
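
Under the hood, the connector exchanges the role ARN and external ID for temporary credentials via STS. A sketch of that handshake with boto3 (the role ARN is a placeholder; note that the trust policy above only lets hotglue’s AWS account assume the role, so this is illustrative rather than something to run from your own account):

import boto3

sts = boto3.client("sts")

creds = sts.assume_role(
    RoleArn="arn:aws:iam::YOUR_ACCOUNT_ID:role/YOUR_ROLE_NAME",  # placeholder
    RoleSessionName="hotglue-s3",
    ExternalId="EXTERNAL_ID",  # the value copied from the widget
)["Credentials"]

# The resulting credentials are temporary and expire automatically
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)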

Configuring your bucket connection

Each S3 connection accepts two key pieces of information:

  • bucket: Your S3 bucket name (e.g., my-company-data)

  • path_prefix: (Optional) A specific path within your bucket (e.g., incoming/orders)

The path prefix helps organize your data. For example, if you set:

path_prefix: exports/invoices

Files will be read from and written to that location in your bucket:

my-company-data/exports/invoices/
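
A path prefix scopes reads and writes the same way S3’s own Prefix parameter does. To see which keys fall under a prefix, a boto3 sketch (using the example bucket and prefix above):

import boto3

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")

# Only keys under exports/invoices/ are returned
for page in paginator.paginate(Bucket="my-company-data", Prefix="exports/invoices/"):
    for obj in page.get("Contents", []):
        print(obj["Key"])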

Using variables in paths

Make your paths dynamic by using variables in curly braces. For example:

orders/{tenant}/{date}/incoming

Could resolve to:

orders/acme-corp/202312/incoming
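
hotglue resolves these variables at job time; conceptually the substitution is equivalent to Python’s str.format:

template = "orders/{tenant}/{date}/incoming"
print(template.format(tenant="acme-corp", date="202312"))
# orders/acme-corp/202312/incoming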

Available variables:

Variable        Description                        Example Output
--------------  ---------------------------------  ----------------
tenant          Your tenant ID                     acme_corp
root_tenant_id  Root tenant ID (for subtenants)    acme
tap             Tap ID this job ran for            salesforce
date            Current date (YYYYMM)              202312
full_date       Full date (YYYYMMDD)               20231206
flow_id         Flow identifier                    Fq183dc
job_id          Internal job ID                    jklc4
env_id          Environment                        prod.hg.acme.com

Reading Files

When reading from S3:

  • Files are copied from your bucket, maintaining their directory structure

  • All files within your path_prefix are processed, unless you are using incremental mode

  • File contents must be parsed in your transformation script (see the sketch below)
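
For example, if the synced files are CSVs, a transformation script could parse them like this. A minimal sketch (the local sync directory and the CSV format are assumptions for illustration):

import glob

import pandas as pd

# Hypothetical local folder where synced files land, mirroring the bucket layout
for path in glob.glob("sync/**/*.csv", recursive=True):
    df = pd.read_csv(path)
    print(path, df.shape)  # transform df as needed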

Writing Files

When writing to S3:

  • Files are written to your specified path_prefix

  • Directory structure is created automatically if it doesn’t exist

  • File naming and organization can also be controlled in your transformation script (see the sketch at the end of this section)

For example, setting:

bucket: customer-exports
path_prefix: data/{tenant}/{date}

Might write files to:

customer-exports/
  data/
    acme-corp/
      202312/
        orders.csv
        customers.csv
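
In the transformation script itself, naming typically comes down to what you write before the connector uploads it. A hedged sketch (the local output folder name and the CSV format are assumptions for illustration):

import pandas as pd

orders = pd.DataFrame([{"id": 1, "total": 99.5}])

# Anything written to the local output folder is uploaded under the resolved
# path_prefix, e.g. customer-exports/data/acme-corp/202312/orders.csv
orders.to_csv("output/orders.csv", index=False)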