Credential vending

By default, an Iceberg client that opens a table reads and writes object storage using the catalog’s own long-lived credentials — anyone who can reach the catalog effectively has warehouse-wide access to the bucket.

With vending enabled, the catalog’s bucket keys stay inside puddle. Clients instead get short-lived, table-scoped tokens that:

  • expire after a configurable TTL (default 1 hour)
  • only grant access to the requested table’s prefix in the bucket
  • are read-only when the client is reading and read-write when it’s mutating
  • are consumed transparently by PyIceberg, Spark, Trino, and other Iceberg clients, with no client-side configuration

Three backends are supported:

  • AWS — real STS AssumeRole
  • MinIO — built-in STS on the S3 endpoint
  • Ceph (RGW) — STS AssumeRole, with a role created via radosgw-admin

The setup is the same shape in all three. Define a role granting the access you want vended tokens to be able to use — this is the upper bound. Give puddle long-lived credentials that are allowed to sts:AssumeRole that role. Puddle assumes the role per request and narrows the result with an inline session policy scoped to the table being accessed, so a token vended for table A can’t touch table B.
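To make the narrowing step concrete, here is a minimal Go sketch of what a per-table inline session policy could look like. The sessionPolicy helper, the struct names, and the s3:prefix condition on s3:ListBucket are assumptions for illustration only; puddle’s actual default policy may be shaped differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// statement and policy mirror the IAM policy JSON shape.
type statement struct {
	Effect    string                         `json:"Effect"`
	Action    []string                       `json:"Action"`
	Resource  string                         `json:"Resource"`
	Condition map[string]map[string][]string `json:"Condition,omitempty"`
}

type policy struct {
	Version   string      `json:"Version"`
	Statement []statement `json:"Statement"`
}

// sessionPolicy is a hypothetical helper (not puddle's API). It builds an
// inline session policy limited to one table's key prefix; write=false
// yields a read-only policy.
func sessionPolicy(bucket, keyPrefix string, write bool) policy {
	actions := []string{"s3:GetObject"}
	if write {
		actions = append(actions, "s3:PutObject", "s3:DeleteObject")
	}
	return policy{
		Version: "2012-10-17",
		Statement: []statement{
			{
				Effect:   "Allow",
				Action:   actions,
				Resource: "arn:aws:s3:::" + bucket + "/" + keyPrefix + "/*",
			},
			{
				// Listing is a bucket-level action, so it is restricted
				// with a prefix condition instead of an object ARN.
				Effect:   "Allow",
				Action:   []string{"s3:ListBucket"},
				Resource: "arn:aws:s3:::" + bucket,
				Condition: map[string]map[string][]string{
					"StringLike": {"s3:prefix": {keyPrefix + "/*"}},
				},
			},
		},
	}
}

func main() {
	doc, err := json.MarshalIndent(sessionPolicy("my-warehouse", "db/events_v1", false), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(doc))
}
```

A policy like this is passed alongside the AssumeRole request, so the returned token can never exceed it, whatever the role itself allows.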

Puddle configuration

Add a vended-credentials block to a warehouse’s s3: config:

warehouses:
  prod:
    location: s3://my-warehouse
    s3:
      region: us-east-1
      endpoint: https://s3.amazonaws.com  # omit for AWS, set for MinIO/Ceph
      path-style-access: false            # true for MinIO; AWS supports both
      access-key-id: ${PUDDLE_AWS_ACCESS_KEY_ID}
      secret-access-key: ${PUDDLE_AWS_SECRET_ACCESS_KEY}
      vended-credentials:
        enabled: true
        role-arn: arn:aws:iam::123456789012:role/IcebergTableAccess
        duration-seconds: 3600       # default; AWS allows 900..43200
        # Optional:
        # sts-endpoint: ""           # defaults to s3.endpoint, then SDK regional default
        # sts-region: ""             # defaults to s3.region
        # external-id: ""            # cross-account sts:ExternalId
        # read-policy-template: ""   # text/template, see Custom policy templates
        # write-policy-template: ""

When vended-credentials.enabled: true, s3.access-key-id and s3.secret-access-key become required — the catalog needs explicit credentials to call sts:AssumeRole. This is enforced at config-load time, so a misconfig fails at boot, not on the first request.
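The load-time check described above might look roughly like the following Go sketch. The struct and function names are hypothetical, and two of the rules are assumptions rather than documented behavior: that role-arn is mandatory, and that the validator enforces the AWS upper bound of 43200 seconds.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical structs mirroring the YAML keys; not puddle's real types.
type vendedCredentials struct {
	Enabled         bool
	RoleARN         string
	DurationSeconds int // 0 means "use the default"
}

type s3Config struct {
	AccessKeyID     string
	SecretAccessKey string
	Vended          vendedCredentials
}

// validate returns an error for configurations that could never vend
// credentials, so the process can refuse to start.
func validate(c s3Config) error {
	v := c.Vended
	if !v.Enabled {
		return nil
	}
	if c.AccessKeyID == "" || c.SecretAccessKey == "" {
		return errors.New("vended-credentials: s3.access-key-id and s3.secret-access-key are required")
	}
	if v.RoleARN == "" { // assumption: role-arn is mandatory
		return errors.New("vended-credentials: role-arn is required")
	}
	d := v.DurationSeconds
	if d == 0 {
		d = 3600 // documented default
	}
	if d < 900 || d > 43200 { // upper bound check is an assumption
		return fmt.Errorf("vended-credentials: duration-seconds %d is outside 900..43200", d)
	}
	return nil
}

func main() {
	cfg := s3Config{Vended: vendedCredentials{
		Enabled: true,
		RoleARN: "arn:aws:iam::123456789012:role/IcebergTableAccess",
	}}
	fmt.Println(validate(cfg)) // keys missing, so this prints an error
}
```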

AWS

1. Create an IAM user (or role) for the catalog

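This is the principal whose long-lived keys puddle uses to call sts:AssumeRole. On EC2/EKS, the SDK credential chain picks up an instance role or IRSA when access-key-id / secret-access-key are left empty. Note, however, that enabling vended-credentials makes both keys required at config load (see Puddle configuration), so a vending setup needs the static keys described next.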

For a static-key setup (simplest), create a user puddle-catalog with no AWS-managed policies attached, then create an access key for it and put the values in s3.access-key-id / s3.secret-access-key.

2. Create the table-access role

Trust policy (who can assume it):

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "AWS": "arn:aws:iam::123456789012:user/puddle-catalog" },
    "Action": "sts:AssumeRole"
  }]
}

For cross-account, add an sts:ExternalId condition and set the matching value in vended-credentials.external-id:

"Condition": { "StringEquals": { "sts:ExternalId": "puddle-prod-2026" } }

Permissions policy on the role (the upper bound; puddle’s inline session policy narrows per table):

{
  "Version": "2012-10-17",
  "Statement": [
    { "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::my-warehouse/*" },
    { "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::my-warehouse" }
  ]
}

3. Allow the user to assume it

Attach this inline policy to puddle-catalog (or its instance role):

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": "sts:AssumeRole",
    "Resource": "arn:aws:iam::123456789012:role/IcebergTableAccess"
  }]
}

4. Wire it into puddle

s3:
  region: us-east-1
  access-key-id: AKIA...
  secret-access-key: ...
  vended-credentials:
    enabled: true
    role-arn: arn:aws:iam::123456789012:role/IcebergTableAccess

Leave sts-endpoint and sts-region unset — the AWS SDK picks the correct regional STS endpoint automatically.

MinIO

MinIO accepts AssumeRole against its S3 listener and enforces the inline session policy directly. role-arn is nominal: any well-formed ARN is accepted, and access is gated by the catalog user’s own policies as narrowed by the inline session policy.

1. Enable STS (default-on)

STS is enabled by default in modern MinIO releases. No config change is needed unless you’ve explicitly disabled it.

2. Create a user with bucket access

The catalog’s user must hold at least the permissions you want vended tokens to be able to use — the inline session policy can only narrow, not widen.

mc admin user add myminio puddle-catalog 'long-random-secret'
mc admin policy attach myminio readwrite --user puddle-catalog

(Attach a tighter custom policy in production; readwrite is the fastest path to a working setup.)

3. Wire it into puddle

warehouses:
  dev:
    location: s3://my-warehouse
    s3:
      region: us-east-1
      endpoint: http://minio.local:9000
      path-style-access: true
      access-key-id: puddle-catalog
      secret-access-key: long-random-secret
      vended-credentials:
        enabled: true
        # Placeholder ARN — MinIO accepts any well-formed ARN.
        role-arn: arn:aws:iam::000000000000:role/dummy
        duration-seconds: 900   # MinIO STS minimum

sts-endpoint defaults to s3.endpoint, which is the right value for MinIO. Leave it unset.

There is a working end-to-end example in fileio/s3/vending_integration_test.go — it spins up MinIO via testcontainers, configures vending against it, and asserts that vended tokens can read/write inside the table prefix and are rejected outside.

Ceph (RGW)

Ceph’s RGW supports STS AssumeRole against the S3 endpoint, but unlike MinIO the role must actually exist (created via radosgw-admin) and the calling user must hold sts:AssumeRole permission via a user policy.

1. Enable STS in ceph.conf

[client.rgw.<id>]
rgw_sts_key = abcdefghijklmnop          # exactly 16 chars; encrypts session tokens (example value, use your own)
rgw_s3_auth_use_sts = true

Restart the RGW daemon after editing.

2. Create a real role

radosgw-admin role create \
  --role-name=IcebergTableAccess \
  --path=/ \
  --assume-role-policy-doc='{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Principal": { "AWS": ["arn:aws:iam:::user/puddle-catalog"] },
      "Action": ["sts:AssumeRole"]
    }]
  }'

Attach a permissions policy to the role (warehouse-wide upper bound):

radosgw-admin role-policy put \
  --role-name=IcebergTableAccess \
  --policy-name=AllowWarehouseAccess \
  --policy-doc='{
    "Version": "2012-10-17",
    "Statement": [
      { "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
        "Resource": "arn:aws:s3:::my-warehouse/*" },
      { "Effect": "Allow",
        "Action": "s3:ListBucket",
        "Resource": "arn:aws:s3:::my-warehouse" }
    ]
  }'

3. Grant the catalog user sts:AssumeRole

radosgw-admin user policy put \
  --uid=puddle-catalog \
  --policy-name=AllowAssumeIcebergRole \
  --policy-doc='{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Resource": "arn:aws:iam:::role/IcebergTableAccess"
    }]
  }'

4. Wire it into puddle

warehouses:
  prod:
    location: s3://my-warehouse
    s3:
      region: us-east-1
      endpoint: https://rgw.local:8080
      path-style-access: true
      access-key-id: <puddle-catalog access key>
      secret-access-key: <puddle-catalog secret key>
      vended-credentials:
        enabled: true
        role-arn: arn:aws:iam:::role/IcebergTableAccess
        duration-seconds: 3600

sts-endpoint defaults to s3.endpoint, which is correct for RGW.
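The fallback order for the STS settings (explicit sts-endpoint, then s3.endpoint, then the SDK’s regional default) can be sketched in Go as below. resolveSTS is a hypothetical helper, and returning an empty endpoint to mean "let the SDK decide" is an assumption of this sketch.

```go
package main

import "fmt"

// resolveSTS is a hypothetical helper illustrating the documented defaults.
// An empty returned endpoint means "let the SDK pick its regional default".
func resolveSTS(stsEndpoint, stsRegion, s3Endpoint, s3Region string) (endpoint, region string) {
	endpoint = stsEndpoint
	if endpoint == "" {
		endpoint = s3Endpoint // MinIO and RGW serve STS on the S3 endpoint
	}
	region = stsRegion
	if region == "" {
		region = s3Region
	}
	return endpoint, region
}

func main() {
	// RGW: nothing set for STS, so both values come from the s3 block.
	fmt.Println(resolveSTS("", "", "https://rgw.local:8080", "us-east-1"))
	// AWS: no endpoints at all, so the endpoint stays empty for the SDK.
	fmt.Println(resolveSTS("", "", "", "us-east-1"))
}
```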

Custom policy templates

The defaults scope vended tokens to arn:aws:s3:::{{.Bucket}}/{{.KeyPrefix}}/* and grant the minimum set of actions Iceberg clients need (s3:GetObject, s3:PutObject, s3:DeleteObject, s3:ListBucket).

Override one or both via read-policy-template / write-policy-template when you need to:

  • Tag uploads (s3:PutObjectTagging)
  • Allow s3:AbortMultipartUpload for large writers
  • Add condition keys (e.g. aws:SourceVpc, aws:SourceIp)
  • Restrict to specific storage classes

Templates use Go’s text/template. Available fields:

Field        Example
.Bucket      my-warehouse
.KeyPrefix   db/events_v1

Bad template syntax fails at warehouse construction time, not at request time — misconfig is loud.

Troubleshooting

AccessDenied from STS on the first request, not at startup. Puddle does not call STS on boot, so failures only surface on the first table-returning request. Check the catalog’s logs for the wrapped vending: AssumeRole: ... error.

AccessDenied on the vended token. The session’s effective permissions are the intersection of (a) the role’s permissions policy and (b) the inline session policy rendered from the template. If the role’s permissions policy doesn’t cover the table’s prefix, the session gets nothing; widen the role first.
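The intersection rule can be modeled with a deliberately simplified Go sketch. The grant type below reduces a policy to a set of actions on a key prefix; real IAM evaluation also handles wildcards, conditions, and explicit denies, none of which are modeled here.

```go
package main

import (
	"fmt"
	"strings"
)

// grant is a toy stand-in for a policy: a set of actions on a key prefix.
type grant struct {
	Actions map[string]bool
	Prefix  string // object keys must start with this; "" covers the bucket
}

func (g grant) allows(action, key string) bool {
	return g.Actions[action] && strings.HasPrefix(key, g.Prefix)
}

// effective reports whether a vended session may perform action on key:
// both the role's policy and the session policy must allow it.
func effective(role, session grant, action, key string) bool {
	return role.allows(action, key) && session.allows(action, key)
}

func main() {
	rw := map[string]bool{"s3:GetObject": true, "s3:PutObject": true, "s3:DeleteObject": true}
	role := grant{Actions: rw, Prefix: ""}
	session := grant{Actions: map[string]bool{"s3:GetObject": true}, Prefix: "db/events_v1/"}

	fmt.Println(effective(role, session, "s3:GetObject", "db/events_v1/data/f.parquet")) // true
	fmt.Println(effective(role, session, "s3:GetObject", "db/orders/data/f.parquet"))    // false: other table
	fmt.Println(effective(role, session, "s3:PutObject", "db/events_v1/data/f.parquet")) // false: read-only session

	// A role that does not cover the table's prefix leaves the session with nothing.
	narrowRole := grant{Actions: rw, Prefix: "staging/"}
	fmt.Println(effective(narrowRole, session, "s3:GetObject", "db/events_v1/data/f.parquet")) // false
}
```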

MinIO returns InvalidParameterValue for DurationSeconds. MinIO’s minimum is 900s. Anything below that is rejected; puddle’s config validator also rejects it.

Wrong STS endpoint on MinIO/Ceph. Leave sts-endpoint empty — it defaults to s3.endpoint, which is where MinIO and RGW serve STS. Setting it to AWS’s https://sts.amazonaws.com will cause AssumeRole to fail.

Cross-account on AWS: AccessDenied even with the right role. If the role’s trust policy has an sts:ExternalId condition, it must match the value puddle sends. Set vended-credentials.external-id to the same string used in the condition.