Ingest your files into Unstructured from Couchbase.

The requirements are as follows.

For Couchbase Capella, you will need:

For a local Couchbase server, you will need:

To learn more about how to set up a Couchbase cluster and play with data, refer to this tutorial.

To create a Couchbase source connector, see the following example.

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import CreateSourceRequest
from unstructured_client.models.shared import (
    CreateSourceConnector,
    SourceConnectorType,
    CouchbaseSourceConnectorConfigInput
)

# Read the Unstructured API key from the UNSTRUCTURED_API_KEY environment variable.
with UnstructuredClient(api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")) as client:
    response = client.sources.create_source(
        request=CreateSourceRequest(
            create_source_connector=CreateSourceConnector(
                name="<name>",
                type=SourceConnectorType.COUCHBASE,
                config=CouchbaseSourceConnectorConfigInput(
                    username="<username>",
                    bucket="<bucket>",
                    connection_string="<connection-string>",
                    scope="<scope>",
                    collection="<collection>",
                    password="<password>",
                    batch_size=<batch-size>,  # An integer; do not quote it.
                    collection_id="<collection-id>"
                )
            )
        )
    )

    # Print the details of the newly created source connector.
    print(response.source_connector_information)

Replace the preceding placeholders as follows:

  • <name> (required) - A unique name for this connector.
  • <username> (required) - The username for the Couchbase server.
  • <bucket> (required) - The name of the bucket in the Couchbase server.
  • <connection-string> (required) - The connection string for the Couchbase server.
  • <scope> - The name of the scope in the bucket. The default is _default if not otherwise specified.
  • <collection> - The name of the collection in the scope. The default is _default if not otherwise specified.
  • <password> (required) - The password for the Couchbase server.
  • <batch-size> - The maximum number of records to transmit per batch. The default is 50 if not otherwise specified.
  • <collection-id> (source connector only) - The name of the collection field that contains the document ID. The default is id if not otherwise specified.
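
The required and optional settings above can be collected and checked before the connector is created. The following is a minimal sketch, not part of the Unstructured SDK: the helper `build_couchbase_config` and the `COUCHBASE_*` environment variable names are hypothetical and chosen only for illustration. It fails early if a required value is missing and applies the defaults listed above to the optional ones.

```python
from typing import Any, Dict, Mapping

# Settings marked "(required)" in the placeholder list above.
REQUIRED_KEYS = ("username", "password", "bucket", "connection_string")

# Defaults for the optional settings, as stated in the placeholder list above.
DEFAULTS: Dict[str, Any] = {
    "scope": "_default",
    "collection": "_default",
    "batch_size": 50,
    "collection_id": "id",
}


def build_couchbase_config(env: Mapping[str, str]) -> Dict[str, Any]:
    """Build connector settings from a mapping such as os.environ.

    Hypothetical helper: it looks up COUCHBASE_<KEY> names (an assumed
    naming scheme), raises ValueError when a required value is missing,
    and fills in the documented defaults for optional values.
    """
    config: Dict[str, Any] = dict(DEFAULTS)

    missing = []
    for key in REQUIRED_KEYS:
        env_name = f"COUCHBASE_{key.upper()}"
        value = env.get(env_name)
        if value:
            config[key] = value
        else:
            missing.append(env_name)
    if missing:
        raise ValueError("Missing required settings: " + ", ".join(missing))

    # Override the defaults only where a value is explicitly provided.
    for key in DEFAULTS:
        value = env.get(f"COUCHBASE_{key.upper()}")
        if value:
            # batch_size must be an integer; the other settings stay strings.
            config[key] = int(value) if key == "batch_size" else value

    return config
```

Assuming the keys match the keyword arguments shown in the example above, the resulting dictionary could then be passed as `CouchbaseSourceConnectorConfigInput(**build_couchbase_config(os.environ))`.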