> ## Documentation Index
> Fetch the complete documentation index at: https://docs.unstructured.io/llms.txt
> Use this file to discover all available pages before exploring further.

# Unstructured Business in-VPC on Google Cloud Platform (GCP) - onboarding checklist

<Note>
  The following information applies only to in-VPC deployments of [Unstructured Business](/business/overview).

  For dedicated instance deployments of Unstructured Business, contact your Unstructured sales representative,
  or email Unstructured Sales at [sales@unstructured.io](mailto:sales@unstructured.io).
</Note>

After your organization has signed the **Business** account agreement with Unstructured, a member of the Unstructured technical enablement team will reach out to you to begin the
deployment onboarding process. To streamline this process, you are encouraged to begin setting up your target environment as soon as possible. Choose one of the following setup options:

* [Do it all for me](#do-it-all-for-me): Have Unstructured set up the required infrastructure in your AWS account and then deploy the Unstructured UI and API into that newly created infrastructure.
* [Bring my own infrastructure](#bring-my-own-infrastructure): Set up the required infrastructure yourself in your AWS account, and then have Unstructured deploy the Unstructured UI and API into your existing infrastructure.

## Questions? Need help?

If you have questions or need help as you go, contact your Unstructured sales representative or technical enablement contact. If you do not know who they are,
email Unstructured Sales at [sales@unstructured.io](mailto:sales@unstructured.io), and a member of the Unstructured sales or technical enablement teams
will get back to you as soon as possible.

## Do it all for me

If you want Unstructured to set up the required infrastructure for you in your GCP account and then deploy the Unstructured UI and API into that newly created infrastructure, then provide your Unstructured sales representative or technical enablement contact with
the access credentials for an IAM user or service account in your GCP account that has the following required permissions:

### Core networking permissions

VPC/subnet management:

* `compute.networks.create`
* `compute.subnetworks.create`
* `compute.routers.create` (for Cloud NAT)
* `compute.addresses.create` (for NAT IPs)
* `compute.firewalls.create` (for intra-cluster traffic rules)

Shared VPC (if used):

* `compute.organizations.admin` (for the host project)
* `compute.networks.use` (for the service project)

### GKE cluster permissions

Control plane:

* `container.clusters.create`
* `container.clusters.update` (for private cluster settings)
* `compute.networks.useExternalIp` (for public endpoint access)

Node pools:

* `compute.instances.create`
* `compute.disks.create` (for node disks)
* `compute.instanceGroups.create` (for autoscaling)

IAM roles:

* For the GKE cluster SA service account: `roles/container.hostServiceAgentUser`
* For the node SA	service account: `roles/container.nodeServiceAccount`
* For the workload identity	service account: `roles/iam.workloadIdentityUser`

### Storage and database

GCS buckets:

* `storage.buckets.create`
* `storage.objects.create` (for versioning)
* `storage.buckets.update` (for encryption/lifecycle rules)

Cloud SQL:

* `cloudsql.instances.create`
* `cloudsql.instances.connect` (for private IPs)
* `vpcaccess.connectors.use` (if using Serverless VPC Access)

Persistent disks (CSI):

* `compute.disks.create` (for `pd.csi.storage.gke.io`)
* `compute.subnetworks.use` (for regional disks)

### Advanced configurations

Workload identity:

* `iam.serviceAccounts.getAccessToken` (for federated access)
* `iam.serviceAccounts.setIamPolicy` (to bind Kubernetes SAs to GCP SAs)

Cloud NAT:

* `compute.routers.update` (for NAT configuration)
* `compute.addresses.use` (for NAT IP allocation)

OS login/SSH:

* `compute.projects.setCommonInstanceMetadata` (for SSH key upload)
* `compute.instances.osAdminLogin`

### Minimum required roles

Project level:

* `roles/editor` (broad access, or scope with custom roles)

Scoped roles:

* `roles/compute.networkAdmin` (for VPC and subnets)
* `roles/container.admin` (for GKE)
* `roles/storage.admin` (for GCS)
* `roles/cloudsql.admin` (for Postgres)

## Bring my own infrastructure

If you want to set up the required infrastructure yourself, set things up as follows within your GCP account for Unstructured to deploy the Unstructured UI and API into.

You must also provide your Unstructured sales representative or technical enablement contact with
the access credentials for an IAM user or service account in your GCP account that has access to the target Google Kubernetes Engine (GKE) cluster to deploy the
Unstructured UI and API into.

### **VPC and networking (GCP equivalent)**

* **VPC Network**

  * Name: `u10d-platform`
  * Subnet Mode: *Custom*
  * CIDR: `10.0.0.0/16`
  * DNS: Internal DNS supported by default

* **Internet Gateway**

  * GCP provides implicit internet access via default internet gateway

    (No need to explicitly create)

* **Public Subnet**

  * Subnet: `public-subnet` — `10.0.0.0/24`
  * Region: `${region}`
  * Enable external IPs on VM instances for internet access

* **NAT Gateway**

  * Use **Cloud NAT** attached to a **Cloud Router** in public subnet
  * Needed to provide egress internet access to private subnet instances

* **Private Subnets (x2)**

  * `private-subnet-a`: `10.0.1.0/24`, region `${region}-a`
  * `private-subnet-b`: `10.0.2.0/24`, region `${region}-b`

* **Routes**

  * Public subnet: default route `0.0.0.0/0` to Internet Gateway (via external IPs)
  * Private subnets: route `0.0.0.0/0` via Cloud NAT

### **IAM roles and policies**

* **GKE Cluster IAM Service Account**

  * Grant roles:

    * `roles/container.clusterAdmin`
    * `roles/compute.networkAdmin`

* **GKE Node IAM Service Account**

  * Grant roles:

    * `roles/container.nodeServiceAccount`
    * `roles/compute.viewer`
    * `roles/storage.objectViewer`

* **Workload Identity IAM Bindings (x3)**

  * Namespaces: `recommender`, `etl-operator`, `data-broker`
  * Use **Workload Identity Federation**
  * Bind GCP IAM Service Accounts to Kubernetes service accounts
  * Grant `roles/storage.objectAdmin` for access to GCS buckets

### **GKE cluster**

* **Control Plane**

  * Version: `1.31` or higher
  * Private Cluster: *Enabled*
  * Master Authorized Networks: your IP(s)
  * Enable Public Endpoint Access: Yes

* **Node Pool**

  * Machine type: `n2-standard-16`
  * Disk: 100GB, SSD (default boot disk)
  * Node count: min 2, max 5, autoscaling enabled
  * SSH access: via OS Login + SSH keys
  * SSH key: Add public key to instance metadata

* **Firewall Rules**

  * Allow:

    * Internal: `10.0.0.0/16`
    * Egress: all
  * Kubernetes master access to nodes

### **Kubernetes add-ons (installed via `kubectl` or Helm)**

* **Workload Identity Config**

* **Metrics Server**

  * Deployed manually (same version: `v0.7.2`)

* **GCP CSI Driver**

  * Provisioner: `pd.csi.storage.gke.io`
  * Role binding needed for controller SA

### **Storage class**

```yaml  theme={null}
apiVersion: [storage.k8s.io/v1](http://storage.k8s.io/v1)
kind: StorageClass
metadata:
 name: pd-ssd
 annotations:
  [storageclass.kubernetes.io/is-default-class:](http://storageclass.kubernetes.io/is-default-class:) "true"
provisioner: [pd.csi.storage.gke.io](http://pd.csi.storage.gke.io/)
parameters:
 type: pd-ssd
 encrypted: "true"
volumeBindingMode: WaitForFirstConsumer
```

### **Cloud SQL (Postgres)**

* **Private IP-enabled Cloud SQL instance**

  * Engine: Postgres 16
  * Size: `db-f1-micro` (or `db-custom-1-3840`)
  * Storage: 20GB
  * Credentials: Username/password
  * Private network: Use the private VPC

* **Cloud SQL Auth Proxy** or private VPC peering to connect from GKE

### **GCS Buckets**

* Buckets:

  * `u10d-{stack_name}-etl-blob-cache`
  * `u10d-{stack_name}-etl-job-db`
  * `u10d-{stack_name}-etl-job-status`
  * `u10d-{stack_name}-job-files`

* Config:

  * Versioning: Enabled
  * Encryption: Default (Google-managed key or CMEK if needed)
  * Lifecycle rule: Auto-delete / force destroy if needed (optional)

### **Keys**

* **SSH Key Pair**

  * Generate manually (`ssh-keygen -t rsa -b 4096`)
  * Upload public key to project metadata or OS Login
  * Export private key as PEM for automation

### Secrets and ConfigMaps

After your infrastructure is set up, but before Unstructured can deploy the Unstructured UI and API into your insfrastructure,
Unstructured will need to know the values of the following Secrets and configuration mappings (also known as *ConfigMaps*).

The Secrets are as follows.

#### **Blob storage credentials**

* `BLOB_STORAGE_ADAPTER_GCP_SERVICE_ACCOUNT_KEY_JSON`
* `BLOB_STORAGE_ADAPTER_REGION_NAME`

#### **Database credentials**

* `DB_USERNAME`
* `DB_PASSWORD`
* `DB_HOST`
* `DB_NAME`
* `DB_DATABASE` (used in `platform-api` only)

#### **Authentication**

* `JWT_SECRET_KEY`
* `AUTH_STRATEGY` (sometimes encoded, sometimes not)
* `SESSION_SECRET`
* `SHARED_SECRET`
* `KEYCLOAK_CLIENT_SECRET`
* `KEYCLOAK_ADMIN_SECRET`
* `KEYCLOAK_ADMIN`
* `KEYCLOAK_ADMIN_PASSWORD`
* `API_BEARER_TOKEN`

The ConfigMaps are as follows.

#### **Blob storage settings**

* `BLOB_STORAGE_ADAPTER_TYPE` (always `gcp`  for GCP)
* `BLOB_STORAGE_ADAPTER_BUCKET`
* `ETL_BLOB_CACHE_BUCKET_NAME`
* `ETL_API_BLOB_STORAGE_ADAPTER_BUCKET`
* `ETL_API_BLOB_STORAGE_ADAPTER_TYPE`
* `ETL_API_DB_REMOTE_BUCKET_NAME`
* `ETL_API_JOB_STATUS_DEST_BUCKET_NAME`
* `JOB_STATUS_BUCKET_NAME`
* `JOB_DB_BUCKET_NAME`

#### **Environment**

* `ENV`
* `ENVIRONMENT`
* `JOB_ENV`
* `JOB_ENVIRONMENT`

#### **Observability and OpenTelemetry (OTel)**

* `JOB_OTEL_EXPORTER_OTLP_ENDPOINT`
* `JOB_OTEL_METRICS_EXPORTER`
* `JOB_OTEL_TRACES_EXPORTER`
* `OTEL_EXPORTER_OTLP_ENDPOINT`
* `OTEL_METRICS_EXPORTER`
* `OTEL_TRACES_EXPORTER`

#### **Unstructured API and authentication**

* `UNSTRUCTURED_API_URL`
* `JWKS_URL`
* `JWT_ISSUER`
* `JWT_AUDIENCE`
* `SINGLE_PLANE_DEPLOYMENT`

#### **Front end and dashboard**

* `API_BASE_URL`
* `API_CLIENT_BASE_URL`
* `API_URL`
* `APM_SERVICE_NAME`
* `APM_SERVICE_NAME_CLIENT`
* `AUTH_STRATEGY`
* `FRONTEND_BASE_URL`
* `KEYCLOAK_CALLBACK_URL`
* `KEYCLOAK_CLIENT_ID`
* `KEYCLOAK_DOMAIN`
* `KEYCLOAK_REALM`
* `KEYCLOAK_SSL_ENABLED`
* `KEYCLOAK_TRUST_ISSUER`
* `PUBLIC_BASE_URL`
* `PUBLIC_RELEASE_CHANNEL`

#### **Sentry & Feature Flags**

* `SENTRY_DSN`
* `SENTRY_SAMPLE_RATE`
* `WORKFLOW_NODE_EDITOR_FF_REQUEST_FORM`
* `CUSTOM_WORKFLOW_FF_REQUEST_FORM`

#### **Redis**

* `REDIS_DSN`

#### **Other**

* `IMAGE_PULL_SECRETS`
* `PRIVATE_KEY_SECRETS_ADAPTER_TYPE`
* `PRIVATE_KEY_SECRETS_ADAPTER_GCP_REGION`
* `SECRETS_ADAPTER_TYPE`
* `SECRETS_ADAPTER_GCP_REGION`

The preceding Secrets and ConfigMaps must be added to the following files:

| File name                             | Type      | Resource name              | Namespace      | Data keys                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
| ------------------------------------- | --------- | -------------------------- | -------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `data-broker-env-cm.yaml`             | ConfigMap | `data-broker-env`          | `api`          | `JOB_STATUS_BUCKET_NAME`, `JOB_DB_BUCKET_NAME`, `BLOB_STORAGE_ADAPTER_TYPE`                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| `data-broker-env-secret.yaml`         | Secret    | `data-broker-env`          | `api`          | `BLOB_STORAGE_ADAPTER_GCP_SERVICE_ACCOUNT_KEY_JSON`, `BLOB_STORAGE_ADAPTER_REGION_NAME`                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| `dataplane-api-env-cm.yaml`           | Secret    | `dataplane-api-env`        | `api`          | `DB_PASSWORD`, `DB_USERNAME`, `DB_HOST`, `DB_NAME`                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| `etl-operator-env-cm.yaml`            | ConfigMap | `etl-operator-env`         | `etl-operator` | `BLOB_STORAGE_ADAPTER_BUCKET`, `JOB_STATUS_BUCKET_NAME`, `JOB_DB_BUCKET_NAME`, `BLOB_STORAGE_ADAPTER_TYPE`, `ENV`, `ENVIRONMENT`, `REDIS_DSN`, `ETL_API_BLOB_STORAGE_ADAPTER_BUCKET`, `ETL_API_BLOB_STORAGE_ADAPTER_TYPE`, `ETL_API_DB_REMOTE_BUCKET_NAME`, `ETL_API_JOB_STATUS_DEST_BUCKET_NAME` (x2), `ETL_BLOB_CACHE_BUCKET_NAME`, `IMAGE_PULL_SECRETS`, `JOB_ENV`, `JOB_ENVIRONMENT`, `JOB_OTEL_EXPORTER_OTLP_ENDPOINT`, `JOB_OTEL_METRICS_EXPORTER`, `JOB_OTEL_TRACES_EXPORTER`, `OTEL_EXPORTER_OTLP_ENDPOINT`, `OTEL_METRICS_EXPORTER`, `OTEL_TRACES_EXPORTER`, `UNSTRUCTURED_API_URL` |
| `etl-operator-env-secret.yaml`        | Secret    | `etl-operator-env`         | `etl-operator` | `BLOB_STORAGE_ADAPTER_GCP_SERVICE_ACCOUNT_KEY_JSON`, `BLOB_STORAGE_ADAPTER_REGION_NAME,`                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| `frontend-env-cm.yaml`                | ConfigMap | `frontend-env`             | `www`          | `API_BASE_URL`, `API_CLIENT_BASE_URL`, `API_URL`, `APM_SERVICE_NAME`, `APM_SERVICE_NAME_CLIENT`, `AUTH_STRATEGY`, `ENV`, `FRONTEND_BASE_URL`, `KEYCLOAK_CALLBACK_URL`, `KEYCLOAK_CLIENT_ID`, `KEYCLOAK_DOMAIN`, `KEYCLOAK_REALM`, `KEYCLOAK_SSL_ENABLED`, `KEYCLOAK_TRUST_ISSUER`, `PUBLIC_BASE_URL`, `PUBLIC_RELEASE_CHANNEL`, `SENTRY_DSN`, `SENTRY_SAMPLE_RATE`, `WORKFLOW_NODE_EDITOR_FF_REQUEST_FORM`, `CUSTOM_WORKFLOW_FF_REQUEST_FORM`                                                                                                                                                |
| `frontend-env-secret.yaml`            | Secret    | `frontend-env`             | `www`          | `API_BEARER_TOKEN`, `KEYCLOAK_ADMIN_SECRET`, `KEYCLOAK_CLIENT_SECRET`, `SESSION_SECRET`, `SHARED_SECRET`                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| `keycloak-secret.yaml`                | Secret    | `phasetwo-keycloak-env`    | `www`          | `KEYCLOAK_ADMIN`, `KEYCLOAK_ADMIN_PASSWORD`                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| `platform-api-env-cm.yaml`            | ConfigMap | `platform-api-env`         | `api`          | `JWKS_URL`, `JWT_ISSUER`, `JWT_AUDIENCE`, `SINGLE_PLANE_DEPLOYMENT`                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| `platform-api-env-secret.yaml`        | Secret    | `platform-api-env`         | `api`          | `DB_PASSWORD`, `DB_USERNAME`, `DB_HOST`, `DB_NAME`, `DB_DATABASE`, `JWT_SECRET_KEY`, `AUTH_STRATEGY`                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| `recommender-env-cm.yaml`             | ConfigMap | `recommender-env`          | `recommender`  | `BLOB_STORAGE_ADAPTER_TYPE`, `ETL_BLOB_CACHE_BUCKET_NAME`                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
| `recommender-env-secret.yaml`         | Secret    | `recommender-env`          | `recommender`  | `BLOB_STORAGE_ADAPTER_GCP_SERVICE_ACCOUNT_KEY_JSON`, `BLOB_STORAGE_ADAPTER_REGION_NAME`                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| `secret-provider-api-env-cm.yaml`     | ConfigMap | `secrets-provider-api-env` | `secrets`      | `ENV`, `ENVIRONMENT`, `OTEL_EXPORTER_OTLP_ENDPOINT`, `OTEL_METRICS_EXPORTER`, `OTEL_TRACES_EXPORTER`, `PRIVATE_KEY_SECRETS_ADAPTER_GCP_REGION`, `PRIVATE_KEY_SECRETS_ADAPTER_TYPE`, `SECRETS_ADAPTER_GCP_REGION`, `SECRETS_ADAPTER_TYPE`                                                                                                                                                                                                                                                                                                                                                     |
| `secret-provider-api-env-secret.yaml` | Secret    | `secrets-provider-api-env` | `secrets`      | `BLOB_STORAGE_ADAPTER_GCP_SERVICE_ACCOUNT_KEY_JSON`, `BLOB_STORAGE_ADAPTER_REGION_NAME`                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| `usage-collector-env-secret.yaml`     | Secret    | `usage-collector-env`      | `api`          | `DB_PASSWORD`, `DB_USERNAME`, `DB_HOST`, `DB_NAME`, `BLOB_STORAGE_ADAPTER_TYPE`                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |

For example, for the `data-broker-env-cm.yaml` [ConfigMap](https://kubernetes.io/docs/concepts/configuration/configmap/) file, the contents would look like this:

```yaml  theme={null}
apiVersion: v1
kind: ConfigMap
metadata:
  name: data-broker-env
  namespace: api
data:
  JOB_STATUS_BUCKET_NAME: "<your-value>"
  JOB_DB_BUCKET_NAME: "<your-value>"
  BLOB_STORAGE_ADAPTER_TYPE: "<your-value>"
```

For the `data-broker-env-secret.yaml` [Secret](https://kubernetes.io/docs/concepts/configuration/secret/) file, the contents would look like this:

```yaml  theme={null}
apiVersion: v1
kind: Secret
metadata:
  name: data-broker-env
  namespace: api
type: Opaque
stringData:
  BLOB_STORAGE_ADAPTER_GCP_SERVICE_ACCOUNT_KEY_JSON: "<your-value>"
  BLOB_STORAGE_ADAPTER_REGION_NAME: "<your-value>"
```
