- A Google Cloud service account. Create a service account.
- A service account key for the service account. See Create a service account key in Create and delete service account keys.

  To ensure maximum compatibility across Unstructured service offerings, give the service account key information to Unstructured as a single-line string that contains the contents of the downloaded service account key file (and not the service account key file itself).

  To print this single-line string without line breaks, suitable for copying, you can run one of the following commands from your Terminal or Command Prompt. In the command, replace `<path-to-downloaded-key-file>` with the path to the service account key file that you downloaded by following the preceding instructions.

  - For macOS or Linux:
  - For Windows:
- The URI for a Google Cloud Storage bucket. This URI consists of the target bucket name, plus any target folder within the bucket, expressed as `gs://<bucket-name>[/folder-name]`. Create a bucket.

  This bucket must have, at minimum, one of the following roles applied to the target Google Cloud service account:

  - Storage Object Viewer for bucket read access.
  - Storage Object User for bucket write access.
  - The Storage Object Admin role provides read and write access, plus access to additional bucket operations.
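
The single-line key string described in the prerequisites can also be produced with a few lines of Python. This is a minimal sketch, not an official Unstructured command: the function name `key_file_to_single_line` is hypothetical, and it assumes the downloaded key file is valid JSON. It re-serializes the JSON rather than only stripping line breaks, so whitespace between fields is also removed while the data stays the same.

```python
import json
from pathlib import Path


def key_file_to_single_line(path: str) -> str:
    """Return the contents of a JSON key file as a single line of text.

    Hypothetical helper for illustration; assumes the file contains valid JSON.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # json.dumps never emits raw line breaks: newlines inside values
    # (such as those in a private key) are written as the escape sequence \n.
    return json.dumps(data, separators=(",", ":"))


# Example usage (replace the placeholder with your own path):
# print(key_file_to_single_line("<path-to-downloaded-key-file>"))
```

The printed value can then be copied into the `GCS_SERVICE_ACCOUNT_KEY` environment variable.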
- `GCS_SERVICE_ACCOUNT_KEY` - The Google Cloud service account key for Google Cloud Storage, represented by `--service-account-key` (CLI) or `service_account_key` (Python).
- `GCS_REMOTE_URL` - The Google Cloud Storage bucket URL, represented by `--remote-url` (CLI) or `remote_url` (Python).
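
Before passing the bucket URL to the connector, it can help to check that it matches the `gs://<bucket-name>[/folder-name]` form described earlier. The following is a rough sketch using only the Python standard library; `split_gcs_url` is a hypothetical helper, not part of Unstructured Ingest.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_gcs_url(remote_url: str) -> Tuple[str, str]:
    """Split gs://<bucket-name>[/folder-name] into (bucket, folder).

    Hypothetical helper for illustration. The folder is an empty string
    when the URL names only a bucket.
    """
    parsed = urlparse(remote_url)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(
            f"Expected gs://<bucket-name>[/folder-name], got {remote_url!r}"
        )
    # Drop leading and trailing slashes so "gs://b/f/" and "gs://b/f" agree.
    return parsed.netloc, parsed.path.strip("/")
```

For example, the value of `os.getenv("GCS_REMOTE_URL")` could be passed through this function once at startup so that a malformed URL fails early with a clear message.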
Use the `--partition-by-api` option (CLI) or the `partition_by_api` parameter (Python) to specify where files are processed:
- To do local file processing, omit `--partition-by-api` (CLI) or `partition_by_api` (Python), or explicitly specify `partition_by_api=False` (Python).

  Local file processing does not use an Unstructured API key or API URL, so you can also omit the following, if they appear:

  - `--api-key $UNSTRUCTURED_API_KEY` (CLI) or `api_key=os.getenv("UNSTRUCTURED_API_KEY")` (Python)
  - `--partition-endpoint $UNSTRUCTURED_API_URL` (CLI) or `partition_endpoint=os.getenv("UNSTRUCTURED_API_URL")` (Python)
  - The environment variables `UNSTRUCTURED_API_KEY` and `UNSTRUCTURED_API_URL`
- To send files to the Unstructured Partition Endpoint for processing, specify `--partition-by-api` (CLI) or `partition_by_api=True` (Python).

  Unstructured also requires an Unstructured API key and API URL, which you provide by adding the following:

  - `--api-key $UNSTRUCTURED_API_KEY` (CLI) or `api_key=os.getenv("UNSTRUCTURED_API_KEY")` (Python)
  - `--partition-endpoint $UNSTRUCTURED_API_URL` (CLI) or `partition_endpoint=os.getenv("UNSTRUCTURED_API_URL")` (Python)
  - The environment variables `UNSTRUCTURED_API_KEY` and `UNSTRUCTURED_API_URL`, representing your API key and API URL, respectively.

You must specify the API URL only if you are not using the default API URL for Unstructured Ingest, which applies to Starter and Team accounts.

The default API URL for Unstructured Ingest is `https://api.unstructuredapp.io/general/v0/general`, which is the API URL for the Unstructured Partition Endpoint. However, you should always use the URL that was provided to you when your Unstructured account was created. If you do not have this URL, email Unstructured Support at support@unstructured.io.

If you do not have an API key, get one now.

If you are using an Enterprise account, the process for generating Unstructured API keys, and the Unstructured API URL that you use, are different. For instructions, see your Unstructured account administrator, or email Unstructured Support at support@unstructured.io.
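
The choice between local and API-based processing can be sketched as a small helper that returns the Python parameters named above. This is an illustration under stated assumptions: `partition_settings` is a hypothetical function, and falling back to the default API URL when `UNSTRUCTURED_API_URL` is unset is a choice made for this sketch. As noted above, you should prefer the URL provided with your account.

```python
import os
from typing import Dict, Mapping, Optional, Union

# Default API URL for Unstructured Ingest, as stated in this page.
DEFAULT_API_URL = "https://api.unstructuredapp.io/general/v0/general"


def partition_settings(
    partition_by_api: bool,
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, Union[bool, str]]:
    """Build the partition-related parameters (hypothetical helper)."""
    env = os.environ if env is None else env
    if not partition_by_api:
        # Local processing needs neither an API key nor an API URL.
        return {"partition_by_api": False}
    api_key = env.get("UNSTRUCTURED_API_KEY")
    if not api_key:
        raise ValueError("UNSTRUCTURED_API_KEY must be set when partition_by_api=True")
    return {
        "partition_by_api": True,
        "api_key": api_key,
        # Assumption for this sketch: use the default URL if none is set.
        "partition_endpoint": env.get("UNSTRUCTURED_API_URL") or DEFAULT_API_URL,
    }
```

The returned dictionary holds only the `partition_by_api`, `api_key`, and `partition_endpoint` values; how they are passed on depends on the version of the Unstructured Ingest library you are using.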