
Connect Couchbase Database to your preprocessing pipeline, and use the Unstructured Ingest CLI or the Unstructured Ingest Python library to batch process all your documents and store structured outputs locally on your filesystem.

You will need:

The Couchbase database prerequisites for a Couchbase Capella deployment, or a local Couchbase server.

For a Couchbase Capella deployment:

For a local Couchbase server:

  • Install a local Couchbase server. Learn how.
  • Connect to the local Couchbase server. Learn how.

To learn more about setting up a Couchbase cluster and working with its data, refer to this tutorial.

The Couchbase DB connector dependencies:

For both the CLI and Python:
pip install "unstructured-ingest[couchbase]"

You might also need to install additional dependencies, depending on your needs. Learn more.
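
After installing, a quick import check can confirm that the connector dependencies are visible to your Python environment. The following is a minimal sketch, not part of the Unstructured tooling: the helper name `missing_modules` is hypothetical, and the import names `unstructured_ingest` and `couchbase` are assumed to be the top-level modules installed by the command above.

```python
from importlib.util import find_spec

# Assumption: these are the top-level import names of the installed packages.
ASSUMED_MODULES = ("unstructured_ingest", "couchbase")


def missing_modules(names=ASSUMED_MODULES):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("Connector dependencies look importable.")
```

If a module is reported as not importable, rerun the pip command in the same environment that you use to run the CLI or your Python code.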

These environment variables are required for the Couchbase connector:

  • CB_CONN_STR - The connection string for the Couchbase server, represented by --connection-string (CLI) or connection_string (Python).
  • CB_USERNAME - The username for the Couchbase server, represented by --username (CLI) or username (Python).
  • CB_PASSWORD - The password for the Couchbase server, represented by --password (CLI) or password (Python).
  • CB_BUCKET - The name of the bucket in the Couchbase server, represented by --bucket (CLI) or bucket (Python).
  • CB_SCOPE - The name of the scope in the bucket, represented by --scope (CLI) or scope (Python).
  • CB_COLLECTION - The name of the collection in the scope, represented by --collection (CLI) or collection (Python).
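
Before running the pipeline, it can help to confirm that all six variables are set. The following is a minimal sketch that maps each environment variable to the Python parameter name listed above. The helper `load_couchbase_settings` and the mapping constant are hypothetical names used for illustration, not part of the Unstructured Ingest library.

```python
import os

# Environment variable -> Python parameter name, as listed above.
COUCHBASE_ENV_TO_PARAM = {
    "CB_CONN_STR": "connection_string",
    "CB_USERNAME": "username",
    "CB_PASSWORD": "password",
    "CB_BUCKET": "bucket",
    "CB_SCOPE": "scope",
    "CB_COLLECTION": "collection",
}


def load_couchbase_settings(env=None):
    """Collect the Couchbase settings from `env` (defaults to os.environ).

    Raises ValueError naming every variable that is unset or empty.
    """
    env = os.environ if env is None else env
    missing = [name for name in COUCHBASE_ENV_TO_PARAM if not env.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return {param: env[name] for name, param in COUCHBASE_ENV_TO_PARAM.items()}
```

The returned dictionary uses the Python parameter names as keys. How the ingest library groups these parameters into its configuration objects is not covered by this sketch.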

These environment variables are needed to call the Unstructured API:

  • UNSTRUCTURED_API_KEY - Your Unstructured API key value.
  • UNSTRUCTURED_API_URL - Your Unstructured API URL.
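
A similar check can catch a missing key or a malformed URL before any documents are sent. This is a minimal sketch under stated assumptions: the helper `load_unstructured_api_settings` is a hypothetical name, and the only URL validation it performs is that the value has an http or https scheme and a host.

```python
import os
from urllib.parse import urlparse


def load_unstructured_api_settings(env=None):
    """Return (api_key, api_url) from `env` (defaults to os.environ).

    Raises ValueError if the key is missing or the URL is not an http(s) URL.
    """
    env = os.environ if env is None else env
    api_key = env.get("UNSTRUCTURED_API_KEY", "")
    api_url = env.get("UNSTRUCTURED_API_URL", "")
    if not api_key:
        raise ValueError("UNSTRUCTURED_API_KEY is not set")
    parsed = urlparse(api_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("UNSTRUCTURED_API_URL must be an http(s) URL")
    return api_key, api_url
```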

Now call the Unstructured Ingest CLI or the Unstructured Ingest Python library. The destination connector can be any of the supported ones. This example uses the local destination connector: