The following information applies only to in-VPC deployments of Unstructured Enterprise. For dedicated instance deployments of Unstructured Enterprise, contact your Unstructured sales representative, or email Unstructured Sales at sales@unstructured.io.
You can deploy in one of two ways:
- Do it all for me: Have Unstructured set up the required infrastructure in your AWS account and then deploy the Unstructured UI and API into that newly created infrastructure.
- Bring my own infrastructure: Set up the required infrastructure yourself in your AWS account, and then have Unstructured deploy the Unstructured UI and API into your existing infrastructure.
Questions? Need help?
If you have questions or need help as you go, contact your Unstructured sales representative or technical enablement contact. If you do not know who they are, email Unstructured Sales at sales@unstructured.io, and a member of the Unstructured sales or technical enablement teams will get back to you as soon as possible.
Do it all for me
If you want Unstructured to set up the required infrastructure for you in your AWS account and then deploy the Unstructured UI and API into that newly created infrastructure, provide your Unstructured sales representative or technical enablement contact with the access credentials for an IAM user or service principal in your AWS account that has the following required permissions.
Core networking permissions
For VPC and subnet management:
- `ec2:CreateVpc`
- `ec2:CreateSubnet`
- `ec2:CreateRouteTable`
- `ec2:CreateInternetGateway`
- `ec2:CreateNatGateway`
- `ec2:ModifyVpcAttribute` (for DNS settings)
- `ec2:AssociateRouteTable`, `ec2:CreateRoute` (for public and private route tables)
- `ec2:AllocateAddress` (for Elastic IP assignment to the NAT Gateway)
For security groups:
- `ec2:AuthorizeSecurityGroupIngress` and `ec2:AuthorizeSecurityGroupEgress` (to configure cluster and node security groups to allow VPC CIDR traffic)
EKS permissions
For the cluster role:
- Attach the managed policies `AmazonEKSClusterPolicy` and `AmazonEKSVPCResourceController` to a role with `sts:AssumeRole` trust for `eks.amazonaws.com`
For the node group role:
- `AmazonEKSWorkerNodePolicy` (for node operations)
- `AmazonEKS_CNI_Policy` (for networking)
- `AmazonEC2ContainerRegistryReadOnly` (for ECR access)
For the OIDC provider and service accounts:
- `iam:CreateOpenIDConnectProvider` (to associate the EKS cluster with IAM OIDC)
- `iam:CreateRole` + `iam:AttachRolePolicy` (for service accounts in the `recommender`, `etl-operator`, and `data-broker` namespaces)
Storage and database
These permissions:
- `s3:CreateBucket`
- `s3:PutBucketVersioning`
- `s3:PutBucketEncryption`
For these buckets:
- `u10d-*-etl-blob-cache`
- `u10d-*-etl-job-db`
- `u10d-*-etl-job-status`
- `u10d-*-job-files`
For RDS:
- `rds:CreateDBInstance`
- `rds:CreateDBSubnetGroup`
- `rds:CreateDBSecurityGroup` + `ec2:AuthorizeSecurityGroupIngress` (to allow VPC CIDR access)
Add-ons and utilities
- For the EBS CSI Driver: `eks:CreateAddon` with IAM role attachment permissions for the `ebs.csi.aws.com` service account
- `ec2:CreateKeyPair` + `ec2:ExportKeyPair` (for node group remote access)
Cross-service requirements
- For IAM: `iam:PassRole` (to assign roles to EKS, RDS, and S3)
- For KMS: `kms:CreateKey` (if using CMK for S3 and RDS encryption)
- For CloudFormation: `cloudformation:*`
u10d-*-etl*).
The EKS Pod Identity Agent requires eks-auth:AssumeRoleForPodIdentity permission on node roles when used with IRSA.
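
As a rough illustration of how the core networking permissions listed earlier in this section could be expressed, the following Python sketch assembles them into an IAM policy document. The function name `build_policy`, the statement ID, and the `"Resource": "*"` scope are assumptions made for this example and are not values supplied by Unstructured. The sketch covers only the networking actions, not the full set of required permissions.

```python
import json

# Core networking actions taken from the permissions list in this section.
CORE_NETWORKING_ACTIONS = [
    "ec2:CreateVpc",
    "ec2:CreateSubnet",
    "ec2:CreateRouteTable",
    "ec2:CreateInternetGateway",
    "ec2:CreateNatGateway",
    "ec2:ModifyVpcAttribute",
    "ec2:AssociateRouteTable",
    "ec2:CreateRoute",
    "ec2:AllocateAddress",
    "ec2:AuthorizeSecurityGroupIngress",
    "ec2:AuthorizeSecurityGroupEgress",
]


def build_policy(actions, sid="UnstructuredCoreNetworking"):
    """Return an IAM policy document (as a dict) that allows the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": sid,  # hypothetical statement ID
                "Effect": "Allow",
                "Action": sorted(set(actions)),  # de-duplicate and order the actions
                "Resource": "*",  # assumption: narrow this scope where you can
            }
        ],
    }


policy = build_policy(CORE_NETWORKING_ACTIONS)
print(json.dumps(policy, indent=2))
```

You could paste the printed JSON into the IAM console or save it to a file for review before attaching it to the IAM user or role that you share with Unstructured.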
Bring my own infrastructure
If you want to set up the required infrastructure yourself, configure your AWS account as follows so that Unstructured can deploy the Unstructured UI and API into it. You must also provide your Unstructured sales representative or technical enablement contact with the access credentials for an IAM user or service principal in your AWS account that has access to the target Amazon Elastic Kubernetes Service (EKS) cluster.
VPC and networking
- VPC
  - CIDR: `10.0.0.0/16` (any CIDR should work, but make sure it has enough address space)
  - DNS Hostnames: Enabled
  - DNS Support: Enabled
- Internet Gateway
  - Attached to the VPC
- Public Subnet
  - CIDR: `10.0.0.0/24`
  - Public IP on launch: true
  - Availability Zone: `${region}a`
- NAT Gateway + Elastic IP
  - Lives in the public subnet
- Private Subnets (x2)
  - CIDRs: `10.0.1.0/24`, `10.0.2.0/24`
  - AZs: `${region}a` and `${region}b`
- Route Tables
  - Public: default route (`0.0.0.0/0`) via IGW
  - Private (x2): default route via NAT Gateway
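
If you choose different CIDR ranges from the ones listed in this section, a quick check such as the following sketch can confirm that every subnet fits inside the VPC range and that no two subnets overlap. It uses only the Python standard library. The function name `check_subnets` and the subnet labels are hypothetical names chosen for this example.

```python
import ipaddress

# CIDR values taken from the VPC and networking list in this section.
VPC_CIDR = "10.0.0.0/16"
SUBNET_CIDRS = {
    "public": "10.0.0.0/24",
    "private-a": "10.0.1.0/24",
    "private-b": "10.0.2.0/24",
}


def check_subnets(vpc_cidr, subnet_cidrs):
    """Return a list of problems; an empty list means the layout is consistent."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = {name: ipaddress.ip_network(cidr) for name, cidr in subnet_cidrs.items()}
    problems = []

    # Every subnet must fall inside the VPC range.
    for name, net in subnets.items():
        if not net.subnet_of(vpc):
            problems.append(f"{name} ({net}) is outside the VPC range {vpc}")

    # No two subnets may share addresses.
    names = list(subnets)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if subnets[first].overlaps(subnets[second]):
                problems.append(f"{first} overlaps {second}")
    return problems


problems = check_subnets(VPC_CIDR, SUBNET_CIDRS)
print(problems or "Subnet layout is consistent")
```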
IAM roles and policies
- EKS Cluster Role
  - Trusts: `eks.amazonaws.com`
  - Attached policies:
    - `AmazonEKSClusterPolicy`
    - `AmazonEKSVPCResourceController`
- EKS Node Group Role
  - Trusts: `ec2.amazonaws.com`, `eks.amazonaws.com`
  - Attached policies:
    - `AmazonEKSWorkerNodePolicy`
    - `AmazonEKS_CNI_Policy`
    - `AmazonEC2ContainerRegistryReadOnly`
- OIDC Service Account IAM Roles (x3)
  - Namespaces: `recommender`, `etl-operator`, `data-broker`
  - Each role is assumed via `sts:AssumeRoleWithWebIdentity` with the OIDC provider
  - Each has an S3 policy allowing access to specific buckets (see the S3 buckets section)
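
The trust relationship for the three OIDC service account roles might look something like the output of the following sketch, which builds one trust policy per namespace. The account ID, the OIDC provider value, and the function name `build_trust_policy` are placeholders for illustration. The wildcard on the service account name is also an assumption, because this page lists only the namespaces and not the service account names.

```python
import json

# Namespaces taken from the OIDC service account roles list in this section.
NAMESPACES = ["recommender", "etl-operator", "data-broker"]


def build_trust_policy(account_id, oidc_provider, namespace):
    """Build a trust policy that lets service accounts in one namespace assume a role.

    oidc_provider is the cluster's OIDC issuer without the "https://" prefix.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {f"{oidc_provider}:aud": "sts.amazonaws.com"},
                    # Assumption: any service account in the namespace may assume the role.
                    "StringLike": {
                        f"{oidc_provider}:sub": f"system:serviceaccount:{namespace}:*"
                    },
                },
            }
        ],
    }


# Placeholder values for illustration only.
ACCOUNT_ID = "111122223333"
OIDC_PROVIDER = "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE"

trust_policies = {
    ns: build_trust_policy(ACCOUNT_ID, OIDC_PROVIDER, ns) for ns in NAMESPACES
}
print(json.dumps(trust_policies["recommender"], indent=2))
```

If you know the exact service account names, replacing the `StringLike` wildcard with a `StringEquals` match on the full subject would be a tighter choice.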
EKS cluster
- EKS Control Plane
  - Version: `1.31` or greater
  - Subnet: Private subnets only
  - Public endpoint access: Enabled
  - Private endpoint access: Disabled
- Node Group
  - Instance type: `c5.4xlarge` (or larger, depending on cost factors)
  - Disk size: 100 GB
  - Desired size: 2 (min 2, max 5)
  - Remote SSH access: Enabled (with generated SSH key)
  - SSH key: Key pair created and exported
- Security Groups
  - EKS Cluster SG (implicitly created by AWS)
  - Node SG: Allows all traffic within cluster CIDR (`10.0.0.0/16`), self, and metadata IP
  - Egress: Allows all
Kubernetes add-ons
Installed via `aws.eks.Addon`:
- EKS Pod Identity Agent
  - Version: `v1.3.4-eksbuild.1`
- Metrics Server
  - Version: `v0.7.2-eksbuild.1`
- EBS CSI Driver
  - Version: `v1.38.1-eksbuild.2`
  - Configured with:
    - Service account annotation: `eks.amazonaws.com/role-arn`
    - Pod identity access annotation
Storage class
- Name: `ebs-sc`
- Default: Yes
- Provisioner: `ebs.csi.aws.com`
- Parameters: `type=gp3`, `encrypted=true`
- Volume Binding Mode: `WaitForFirstConsumer`
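
The storage class settings in this section could be captured in a manifest along the lines of the following sketch, which builds a Kubernetes `StorageClass` object and prints it as JSON. The function name `build_storage_class` is hypothetical, and the sketch assumes that "Default: Yes" maps to the standard `storageclass.kubernetes.io/is-default-class` annotation.

```python
import json


def build_storage_class():
    """Return a StorageClass manifest (as a dict) matching the settings listed in this section."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {
            "name": "ebs-sc",
            # Assumption: this annotation is how the class is marked as the default.
            "annotations": {"storageclass.kubernetes.io/is-default-class": "true"},
        },
        "provisioner": "ebs.csi.aws.com",
        # Parameter values are strings in a StorageClass manifest.
        "parameters": {"type": "gp3", "encrypted": "true"},
        "volumeBindingMode": "WaitForFirstConsumer",
    }


storage_class = build_storage_class()
print(json.dumps(storage_class, indent=2))
```

Because Kubernetes also accepts manifests in JSON, you could save the printed output to a file and apply it as-is, or convert it to YAML if you prefer.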
RDS
- RDS Subnet Group
  - Uses the private subnets
- RDS Instance
  - Engine: Postgres 16
  - Size: `db.t3.micro`
  - Allocated storage: 20 GB
  - Auth: Set up a username and password, and keep them secure.
  - Security group: Allows all traffic from `10.0.0.0/16` (use your own VPC CIDR if it differs)
  - DB name: `postgres`
S3 buckets
Buckets:
- `u10d-{stack_name}-etl-blob-cache`
- `u10d-{stack_name}-etl-job-db`
- `u10d-{stack_name}-etl-job-status`
- `u10d-{stack_name}-job-files`
Settings for each bucket:
- Versioning enabled
- Server-side encryption (AES256)
- Force destroy: true
Keys
- SSH Key Pair (RSA 4096-bit)
  - Key exported as `private_key` (PEM)
Secrets and ConfigMaps
After your infrastructure is set up, but before Unstructured can deploy the Unstructured UI and API into your infrastructure, Unstructured needs to know the values of the following Secrets and ConfigMaps. You must provide these values to Unstructured as a set of YAML files in Kubernetes Secret and ConfigMap format. The Secrets are as follows.
Blob storage credentials
- `BLOB_STORAGE_ADAPTER_ACCESS_KEY_ID`
- `BLOB_STORAGE_ADAPTER_SECRET_ACCESS_KEY`
- `BLOB_STORAGE_ADAPTER_REGION_NAME`
Database credentials
- `DB_USERNAME`
- `DB_PASSWORD`
- `DB_HOST`
- `DB_NAME`
- `DB_DATABASE` (used in `platform-api` only)
Authentication
- `JWT_SECRET_KEY`
- `AUTH_STRATEGY` (sometimes encoded, sometimes not)
- `SESSION_SECRET`
- `SHARED_SECRET`
- `KEYCLOAK_CLIENT_SECRET`
- `KEYCLOAK_ADMIN_SECRET`
- `KEYCLOAK_ADMIN`
- `KEYCLOAK_ADMIN_PASSWORD`
- `API_BEARER_TOKEN`
The ConfigMaps are as follows.
Blob storage settings
- `BLOB_STORAGE_ADAPTER_TYPE` (always `s3` for AWS)
- `BLOB_STORAGE_ADAPTER_BUCKET`
- `ETL_BLOB_CACHE_BUCKET_NAME`
- `ETL_API_BLOB_STORAGE_ADAPTER_BUCKET`
- `ETL_API_BLOB_STORAGE_ADAPTER_TYPE`
- `ETL_API_DB_REMOTE_BUCKET_NAME`
- `ETL_API_JOB_STATUS_DEST_BUCKET_NAME`
- `JOB_STATUS_BUCKET_NAME`
- `JOB_DB_BUCKET_NAME`
Environment
- `ENV`
- `ENVIRONMENT`
- `JOB_ENV`
- `JOB_ENVIRONMENT`
Observability and OpenTelemetry (OTel)
- `JOB_OTEL_EXPORTER_OTLP_ENDPOINT`
- `JOB_OTEL_METRICS_EXPORTER`
- `JOB_OTEL_TRACES_EXPORTER`
- `OTEL_EXPORTER_OTLP_ENDPOINT`
- `OTEL_METRICS_EXPORTER`
- `OTEL_TRACES_EXPORTER`
Unstructured API and authentication
- `UNSTRUCTURED_API_URL`
- `JWKS_URL`
- `JWT_ISSUER`
- `JWT_AUDIENCE`
- `SINGLE_PLANE_DEPLOYMENT`
Front end and dashboard
- `API_BASE_URL`
- `API_CLIENT_BASE_URL`
- `API_URL`
- `APM_SERVICE_NAME`
- `APM_SERVICE_NAME_CLIENT`
- `AUTH_STRATEGY`
- `FRONTEND_BASE_URL`
- `KEYCLOAK_CALLBACK_URL`
- `KEYCLOAK_CLIENT_ID`
- `KEYCLOAK_DOMAIN`
- `KEYCLOAK_REALM`
- `KEYCLOAK_SSL_ENABLED`
- `KEYCLOAK_TRUST_ISSUER`
- `PUBLIC_BASE_URL`
- `PUBLIC_RELEASE_CHANNEL`
Sentry and feature flags
- `SENTRY_DSN`
- `SENTRY_SAMPLE_RATE`
- `WORKFLOW_NODE_EDITOR_FF_REQUEST_FORM`
- `CUSTOM_WORKFLOW_FF_REQUEST_FORM`
Redis
- `REDIS_DSN`
Other
- `IMAGE_PULL_SECRETS`
- `PRIVATE_KEY_SECRETS_ADAPTER_TYPE`
- `PRIVATE_KEY_SECRETS_ADAPTER_AWS_REGION`
- `SECRETS_ADAPTER_TYPE`
- `SECRETS_ADAPTER_AWS_REGION`
| File name | Type | Resource name | Namespace | Data keys |
|---|---|---|---|---|
| data-broker-env-cm.yaml | ConfigMap | data-broker-env | api | JOB_STATUS_BUCKET_NAME, JOB_DB_BUCKET_NAME, BLOB_STORAGE_ADAPTER_TYPE |
| data-broker-env-secret.yaml | Secret | data-broker-env | api | BLOB_STORAGE_ADAPTER_ACCESS_KEY_ID, BLOB_STORAGE_ADAPTER_REGION_NAME, BLOB_STORAGE_ADAPTER_SECRET_ACCESS_KEY |
| dataplane-api-env-cm.yaml | Secret | dataplane-api-env | api | DB_PASSWORD, DB_USERNAME, DB_HOST, DB_NAME |
| etl-operator-env-cm.yaml | ConfigMap | etl-operator-env | etl-operator | BLOB_STORAGE_ADAPTER_BUCKET, JOB_STATUS_BUCKET_NAME, JOB_DB_BUCKET_NAME, BLOB_STORAGE_ADAPTER_TYPE, ENV, ENVIRONMENT, REDIS_DSN, ETL_API_BLOB_STORAGE_ADAPTER_BUCKET, ETL_API_BLOB_STORAGE_ADAPTER_TYPE, ETL_API_DB_REMOTE_BUCKET_NAME, ETL_API_JOB_STATUS_DEST_BUCKET_NAME (x2), ETL_BLOB_CACHE_BUCKET_NAME, IMAGE_PULL_SECRETS, JOB_ENV, JOB_ENVIRONMENT, JOB_OTEL_EXPORTER_OTLP_ENDPOINT, JOB_OTEL_METRICS_EXPORTER, JOB_OTEL_TRACES_EXPORTER, OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_METRICS_EXPORTER, OTEL_TRACES_EXPORTER, UNSTRUCTURED_API_URL |
| etl-operator-env-secret.yaml | Secret | etl-operator-env | etl-operator | BLOB_STORAGE_ADAPTER_ACCESS_KEY_ID, BLOB_STORAGE_ADAPTER_REGION_NAME, BLOB_STORAGE_ADAPTER_SECRET_ACCESS_KEY |
| frontend-env-cm.yaml | ConfigMap | frontend-env | www | API_BASE_URL, API_CLIENT_BASE_URL, API_URL, APM_SERVICE_NAME, APM_SERVICE_NAME_CLIENT, AUTH_STRATEGY, ENV, FRONTEND_BASE_URL, KEYCLOAK_CALLBACK_URL, KEYCLOAK_CLIENT_ID, KEYCLOAK_DOMAIN, KEYCLOAK_REALM, KEYCLOAK_SSL_ENABLED, KEYCLOAK_TRUST_ISSUER, PUBLIC_BASE_URL, PUBLIC_RELEASE_CHANNEL, SENTRY_DSN, SENTRY_SAMPLE_RATE, WORKFLOW_NODE_EDITOR_FF_REQUEST_FORM, CUSTOM_WORKFLOW_FF_REQUEST_FORM |
| frontend-env-secret.yaml | Secret | frontend-env | www | API_BEARER_TOKEN, KEYCLOAK_ADMIN_SECRET, KEYCLOAK_CLIENT_SECRET, SESSION_SECRET, SHARED_SECRET |
| keycloak-secret.yaml | Secret | phasetwo-keycloak-env | www | KEYCLOAK_ADMIN, KEYCLOAK_ADMIN_PASSWORD |
| platform-api-env-cm.yaml | ConfigMap | platform-api-env | api | JWKS_URL, JWT_ISSUER, JWT_AUDIENCE, SINGLE_PLANE_DEPLOYMENT |
| platform-api-env-secret.yaml | Secret | platform-api-env | api | DB_PASSWORD, DB_USERNAME, DB_HOST, DB_NAME, DB_DATABASE, JWT_SECRET_KEY, AUTH_STRATEGY |
| recommender-env-cm.yaml | ConfigMap | recommender-env | recommender | BLOB_STORAGE_ADAPTER_TYPE, ETL_BLOB_CACHE_BUCKET_NAME |
| recommender-env-secret.yaml | Secret | recommender-env | recommender | BLOB_STORAGE_ADAPTER_ACCESS_KEY_ID, BLOB_STORAGE_ADAPTER_REGION_NAME, BLOB_STORAGE_ADAPTER_SECRET_ACCESS_KEY |
| secret-provider-api-env-cm.yaml | ConfigMap | secrets-provider-api-env | secrets | ENV, ENVIRONMENT, OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_METRICS_EXPORTER, OTEL_TRACES_EXPORTER, PRIVATE_KEY_SECRETS_ADAPTER_AWS_REGION, PRIVATE_KEY_SECRETS_ADAPTER_TYPE, SECRETS_ADAPTER_AWS_REGION, SECRETS_ADAPTER_TYPE |
| secret-provider-api-env-secret.yaml | Secret | secrets-provider-api-env | secrets | BLOB_STORAGE_ADAPTER_ACCESS_KEY_ID, BLOB_STORAGE_ADAPTER_REGION_NAME, BLOB_STORAGE_ADAPTER_SECRET_ACCESS_KEY |
| usage-collector-env-secret.yaml | Secret | usage-collector-env | api | DB_PASSWORD, DB_USERNAME, DB_HOST, DB_NAME, BLOB_STORAGE_ADAPTER_TYPE |
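Each file uses the standard Kubernetes manifest format for its type. For example, etl-operator-env-cm.yaml is a ConfigMap manifest and etl-operator-env-secret.yaml is a Secret manifest, each containing the data keys listed for it in the preceding table.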
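
As a rough sketch of the file format described in this section, the following Python script builds one Secret manifest and one ConfigMap manifest using resource names, namespaces, and data keys from the preceding table. All values shown are placeholders, and the helper names `secret` and `config_map` are hypothetical. The sketch assumes the Secret uses the `data` field, which requires base64-encoded values. The output is JSON, which Kubernetes accepts in place of YAML; convert it to YAML if your Unstructured contact requires that format.

```python
import base64
import json


def config_map(name, namespace, values):
    """Build a ConfigMap manifest; values are stored as plain strings."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "namespace": namespace},
        "data": dict(values),
    }


def secret(name, namespace, values):
    """Build a Secret manifest; each value is base64-encoded for the data field."""
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": encoded,
    }


# Placeholder values for illustration only; substitute your own.
etl_operator_secret = secret(
    "etl-operator-env",
    "etl-operator",
    {
        "BLOB_STORAGE_ADAPTER_ACCESS_KEY_ID": "<access-key-id>",
        "BLOB_STORAGE_ADAPTER_SECRET_ACCESS_KEY": "<secret-access-key>",
        "BLOB_STORAGE_ADAPTER_REGION_NAME": "us-east-1",
    },
)
recommender_config = config_map(
    "recommender-env",
    "recommender",
    {
        "BLOB_STORAGE_ADAPTER_TYPE": "s3",
        "ETL_BLOB_CACHE_BUCKET_NAME": "u10d-demo-etl-blob-cache",
    },
)

print(json.dumps(etl_operator_secret, indent=2))
print(json.dumps(recommender_config, indent=2))
```

Treat any file that contains real Secret values as sensitive, because base64 is an encoding and not encryption.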

