Google Cloud Storage
Ingest your files into Unstructured from Google Cloud Storage.
You’ll need:
The Google Cloud Storage prerequisites:
- A Google Cloud service account. See Create a service account.
- A service account key for the service account. See Create a service account key in Create and delete service account keys.
  To ensure maximum compatibility across Unstructured service offerings, give the service account key information to Unstructured as a single-line string that contains the contents of the downloaded service account key file (not the service account key file itself). To print this single-line string without line breaks, suitable for copying, you can run one of the following commands from your Terminal or Command Prompt. In the command, replace <path-to-downloaded-key-file> with the path to the service account key file that you downloaded by following the preceding instructions.
  - For macOS or Linux:
  - For Windows:
- The URI for a Google Cloud Storage bucket. This URI consists of the target bucket name, plus any target folder within the bucket, expressed as gs://<bucket-name>[/folder-name]. See Create a bucket.
  This bucket must have, at minimum, one of the following roles applied to the target Google Cloud service account:
  - Storage Object Viewer for bucket read access.
  - Storage Object Creator for bucket write access.
  - The Storage Object Admin role provides read and write access, plus access to additional bucket operations.
To apply one of these roles to a service account for a bucket, see Add a principal to a bucket-level policy in Set and manage IAM policies on buckets.
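The single-line service account key string described in the prerequisites can also be produced with a few lines of Python, which behaves the same on macOS, Linux, and Windows. This is only a minimal sketch and not an Unstructured utility: the function name key_file_to_single_line is hypothetical, and the sketch assumes the downloaded key file is the usual JSON document.

```python
import json
from pathlib import Path


def key_file_to_single_line(path: str) -> str:
    """Return the contents of a service account key file as one line.

    Hypothetical helper for illustration; not part of any Unstructured API.
    """
    text = Path(path).read_text(encoding="utf-8")
    # Parse first so that a malformed or truncated key file fails loudly.
    json.loads(text)
    # Remove only the physical line breaks. Valid JSON cannot contain a raw
    # line break inside a string value, so the escaped "\n" sequences within
    # the private key are left untouched.
    return text.replace("\r", "").replace("\n", "")
```

Calling print(key_file_to_single_line("<path-to-downloaded-key-file>")) with the real path substituted prints a string that can be copied into the Service Account Key field. The string contains the private key, so treat it as a secret.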
To create the source connector:
- On the sidebar, click Connectors.
- Click Sources.
- Click Add new.
- Give the connector a unique Name.
- In the Provider area, click Google GCS.
- Click Continue.
- Follow the on-screen instructions to fill in the fields as described later on this page.
- Click Save and Test.
Fill in the following fields:
- Name (required): A unique name for this connector.
- Bucket URI (required): The URI for the Google Cloud Storage bucket and any target folder path within the bucket. This URI takes the format gs://<bucket-name>[/folder-name].
- Service Account Key (required): The contents of a service account key file, expressed as a single string without line breaks, for a Google Cloud service account that has the required access permissions to the bucket.
- Recursive: Check this box to ingest data recursively from any subfolders, starting from the path specified by Bucket URI.
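Before pasting a value into Bucket URI, it can help to check locally that it matches the gs://<bucket-name>[/folder-name] format. The following is a minimal sketch of such a check, not something Unstructured provides: the names BucketLocation and parse_bucket_uri are hypothetical, and the sketch only inspects the shape of the string without contacting Google Cloud Storage.

```python
from typing import NamedTuple


class BucketLocation(NamedTuple):
    bucket: str
    folder: str  # empty string when the URI names only the bucket


def parse_bucket_uri(uri: str) -> BucketLocation:
    """Split gs://<bucket-name>[/folder-name] into bucket and folder.

    Hypothetical helper for illustration; it checks the format only.
    """
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"expected a URI starting with {prefix!r}: {uri!r}")
    # Everything up to the first "/" is the bucket; the rest is the folder.
    bucket, _, folder = uri[len(prefix):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name in {uri!r}")
    # Trailing slashes carry no meaning for the folder path, so drop them.
    return BucketLocation(bucket, folder.strip("/"))
```

For example, parse_bucket_uri("gs://my-bucket/reports") yields the bucket my-bucket and the folder reports, while a value such as s3://my-bucket raises ValueError. The bucket and folder names here are placeholders.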