Name | Value |
---|---|
Platform | AWS S3 |
Auth Type | API Keys |
Direction | Bidirectional |
Tap Metrics | Usage: |
Target Metrics | Usage: |
You can use the S3 connector to read files from and write files to S3.
However you choose to authenticate, you will need an appropriate IAM policy for your integration. The policy you create depends on your integration needs:

- Exporting to S3: only needs `s3:PutObject`
- Fetching from S3: needs `s3:ListBucket` and `s3:GetObject`
Here’s a policy for both reading and writing; replace `BUCKET_NAME` with your bucket.
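A minimal sketch of such a policy looks like the following (note that `s3:ListBucket` applies to the bucket ARN itself, while the object-level actions apply to `BUCKET_NAME/*`):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListBucket",
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::BUCKET_NAME"
    },
    {
      "Sid": "ReadWriteObjects",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::BUCKET_NAME/*"
    }
  ]
}
```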
Once you have your policy, you have two ways to authenticate with S3:
- HMAC Access Key ID and Secret (default)
- Cross-account role with external ID
By default, the S3 connector authenticates using an Access Key ID and Secret Access Key.
1. In AWS, navigate to the IAM service
2. Go to Users > Create user
3. Enter a user name
4. Select Access key - Programmatic access
5. Click Next: Permissions
6. Click Attach policies directly > Create Policy
7. Select the JSON tab and paste the policy from above
8. Click Next: Tags > Next: Review
9. Name the policy and click Create policy
10. Return to the user creation workflow
11. Select your newly created policy
12. Click Next: Review > Create user
13. Copy your Access Key ID and Secret Access Key, and paste them into the widget.
⚠️ Important: This is your only chance to copy the Secret Access Key. Store it securely.
For enterprises requiring temporary credentials and enhanced security controls, you can use cross-account role access.
To enable this in hotglue, change the S3 `connect_ui_params` in your connector settings to include `aws_external_id` and `aws_role_arn` in place of `aws_access_key_id` and `aws_secret_access_key`, like below:
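As a rough sketch only (the key names come from above, but the exact structure of the connector settings is assumed here, so check your actual settings format):

```json
{
  "connect_ui_params": [
    "aws_role_arn",
    "aws_external_id"
  ]
}
```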
Once `aws_external_id` is configured in your connector settings, the widget will automatically update to display a copy-pastable external ID for each tenant. Tenants will need this value to configure their settings in AWS.
1. Navigate to the IAM service
2. Go to Policies > Create policy
3. Select the JSON tab and paste the policy above
4. Click Next: Tags > Next: Review
5. Name the policy and click Create policy
6. Navigate to IAM > Roles > Create role
7. Select Custom trust policy
8. Paste the trust policy (see the sketch after these steps), replacing `EXTERNAL_ID` with the external ID copied from the widget
9. Click Next
10. Select the policy you created earlier and click Next
11. Enter a Role name and click Create role
12. Copy the role's ARN from the summary page, and paste it into the widget.
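The trust policy referenced in step 8 follows the standard cross-account pattern sketched below. Here `HOTGLUE_PRINCIPAL_ARN` is only a placeholder for the AWS principal that assumes the role (use the value hotglue provides), and `EXTERNAL_ID` is the external ID copied from the widget:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "HOTGLUE_PRINCIPAL_ARN"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "EXTERNAL_ID"
        }
      }
    }
  ]
}
```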
Each S3 connection accepts two key pieces of information:
- `bucket`: Your S3 bucket name (e.g., `my-company-data`)
- `path_prefix`: (Optional) A specific path within your bucket (e.g., `incoming/orders`)
The path prefix helps organize your data: files will be read from and written to that location in your bucket. For example:
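With the example values above (bucket `my-company-data`, prefix `incoming/orders`), reads and writes would be scoped to:

```
s3://my-company-data/incoming/orders/
```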
Make your paths dynamic by using variables in curly braces. For example, a templated prefix might resolve as follows:
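A hypothetical template (the `exports/` folder name is just an example; the variable values follow the table below):

```
path_prefix: exports/{tenant}/{full_date}
# resolves to:
exports/acme_corp/20231206
```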
Available variables:
Variable | Description | Example Output |
---|---|---|
tenant | Your tenant ID | acme_corp |
root_tenant_id | Root tenant ID (for subtenants) | acme |
tap | Tap ID this job ran for | salesforce |
date | Current date (YYYYMM) | 202312 |
full_date | Full date (YYYYMMDD) | 20231206 |
flow_id | Flow identifier | Fq183dc |
job_id | Internal job ID | jklc4 |
env_id | Environment | prod.hg.acme.com |
When reading from S3:

- Files are copied from your bucket, maintaining their directory structure
- All files within your `path_prefix` are processed, unless using incremental mode
- File contents must be parsed in your transformation script

When writing to S3:

- Files are written to your specified `path_prefix`
- Directory structure is created automatically if it doesn’t exist
- File naming and organization can also be controlled in your transformation script
For example, a templated `path_prefix` might write files to tenant- and date-specific locations:
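A hypothetical illustration (the `exports/` prefix and `orders.csv` file name are examples only; the resolved values follow the variables table above):

```
path_prefix: exports/{tenant}/{date}
# might write to:
s3://my-company-data/exports/acme_corp/202312/orders.csv
```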