Batch process all your records to store structured outputs in an Astra DB account.

You will need:

The Astra DB connector prerequisites:

The Astra DB connector dependencies:

pip install "unstructured-ingest[astradb]"

You might also need to install additional dependencies, depending on your needs.

These Astra DB environment variables:

  • ASTRA_DB_API_ENDPOINT - The API endpoint for the Astra DB database, represented by --api-endpoint (CLI) or api_endpoint (Python). To get the endpoint, see the Database Details > API Endpoint value on your database’s Overview tab.
  • ASTRA_DB_APPLICATION_TOKEN - The application token for the database, represented by --token (CLI) or token (Python). To get the token, see the Database Details > Application Tokens box on your database’s Overview tab.
  • ASTRA_DB_KEYSPACE - The name of the keyspace for the database, represented by --keyspace (CLI) or keyspace (Python).
  • ASTRA_DB_COLLECTION - The name of the collection in the keyspace, represented by --collection-name (CLI) or collection_name (Python).
  • ASTRA_DB_EMBEDDING_DIMENSIONS - The number of dimensions of the embedding vectors stored in the collection, represented by --embedding-dimension (CLI) or embedding_dimension (Python).
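
The mapping between these variables and the Python parameter names can be sketched as follows. This is a minimal illustration that uses only the standard library: the helper name `load_astra_settings` and the dict it returns are assumptions made for this example, not part of the `unstructured-ingest` API. Only the variable names and parameter names come from the list above.

```python
import os
from typing import Mapping, Optional

# Environment variable name -> Python parameter name, as listed above.
ASTRA_ENV_TO_PARAM = {
    "ASTRA_DB_API_ENDPOINT": "api_endpoint",
    "ASTRA_DB_APPLICATION_TOKEN": "token",
    "ASTRA_DB_KEYSPACE": "keyspace",
    "ASTRA_DB_COLLECTION": "collection_name",
    "ASTRA_DB_EMBEDDING_DIMENSIONS": "embedding_dimension",
}


def load_astra_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical helper: collect the Astra DB settings into one dict."""
    env = os.environ if env is None else env
    missing = [name for name in ASTRA_ENV_TO_PARAM if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    settings = {param: env[name] for name, param in ASTRA_ENV_TO_PARAM.items()}
    # Environment variables are always strings; the dimension count is numeric.
    settings["embedding_dimension"] = int(settings["embedding_dimension"])
    return settings
```

Calling `load_astra_settings()` with no argument reads from the current process environment. Passing a plain dict instead makes the helper easy to exercise without touching real credentials.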

These Unstructured environment variables:

  • UNSTRUCTURED_API_KEY - Your Unstructured API key value.
  • UNSTRUCTURED_API_URL - Your Unstructured API URL.
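
A quick pre-flight check can catch an unset variable before a run starts. The sketch below is a hedged illustration: `missing_env_vars` is a hypothetical helper written for this example, not a function provided by Unstructured.

```python
import os
from typing import Iterable, List, Mapping, Optional

# The two Unstructured variables named in the list above.
UNSTRUCTURED_ENV_VARS = ("UNSTRUCTURED_API_KEY", "UNSTRUCTURED_API_URL")


def missing_env_vars(
    names: Iterable[str], env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Hypothetical helper: return the names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]
```

An empty list from `missing_env_vars(UNSTRUCTURED_ENV_VARS)` means both values are present in the current environment. The check says nothing about whether the key or URL is valid.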

Now call the Unstructured CLI or Python SDK. The source connector can be any supported source connector. This example uses the local source connector: