
Batch process all your records to store structured outputs in Milvus.

You will need:

The Milvus prerequisites:

The Milvus connector dependencies:

CLI, Python
pip install "unstructured-ingest[milvus]"

You might also need to install additional dependencies, depending on your needs. Learn more.

The following environment variables:

  • MILVUS_URI - The Milvus instance’s URI, represented by --uri (CLI) or uri (Python).
  • MILVUS_USER and MILVUS_PASSWORD, or MILVUS_TOKEN - The username and password, or the token, to access the instance. These are represented by --user and --password, or --token (CLI); or user and password, or token (Python).
  • MILVUS_DB - The database’s name, represented by --db-name (CLI) or db_name (Python).
  • MILVUS_COLLECTION - The collection’s name, represented by --collection-name (CLI) or collection_name (Python).
  • MILVUS_FIELDS_TO_INCLUDE - The fields to include, given as a comma-separated list (CLI) or an array of strings (Python), represented by --field-to-include (CLI) or fields_to_include (Python).
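
As a rough illustration of how these variables map onto the Python parameter names listed above, the following sketch collects them into a plain dictionary. The helper name milvus_settings_from_env is hypothetical and not part of unstructured-ingest. Preferring the token over the username and password when both are set, and treating MILVUS_FIELDS_TO_INCLUDE as optional, are assumptions made for this example.

```python
import os


def milvus_settings_from_env(env=None):
    """Collect Milvus settings from environment variables.

    Hypothetical helper for illustration only; it is not part of
    unstructured-ingest. Keys mirror the Python parameter names above.
    """
    env = os.environ if env is None else env

    settings = {
        "uri": env["MILVUS_URI"],
        "db_name": env["MILVUS_DB"],
        "collection_name": env["MILVUS_COLLECTION"],
    }

    # Assumption: use the token if one is set, else username and password.
    token = env.get("MILVUS_TOKEN")
    if token:
        settings["token"] = token
    else:
        settings["user"] = env["MILVUS_USER"]
        settings["password"] = env["MILVUS_PASSWORD"]

    # The variable holds a comma-separated string; Python expects a list of strings.
    raw_fields = env.get("MILVUS_FIELDS_TO_INCLUDE", "")
    fields = [name.strip() for name in raw_fields.split(",") if name.strip()]
    if fields:
        settings["fields_to_include"] = fields

    return settings
```

Calling milvus_settings_from_env() with no argument reads from the current process environment. Passing a dictionary instead makes the helper easy to try without exporting any variables.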

Additional settings include:

  • To emit the metadata field’s child fields directly into the output, include --flatten-metadata (CLI) or flatten_metadata=True (Python). This is the default if not specified.
  • To keep the metadata field with its child fields intact in the output, include --no-flatten-metadata (CLI) or flatten_metadata=False (Python).
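
To make the difference between the two settings concrete, here is a minimal sketch of what flattening means for a single element. It is an illustration and not the connector's actual code. The function name flatten_element and the sample field names are assumptions, and the sketch promotes only the first level of child fields, letting a metadata child overwrite a top-level field of the same name.

```python
def flatten_element(element):
    """Return a copy of element with metadata child fields at the top level.

    Illustrative only: promotes one level of child fields, and a metadata
    child overwrites a top-level field that has the same name.
    """
    flat = {key: value for key, value in element.items() if key != "metadata"}
    flat.update(element.get("metadata") or {})
    return flat


# Hypothetical element, shaped like an entry in Unstructured's JSON output.
element = {
    "type": "Title",
    "text": "Hello",
    "metadata": {"filename": "a.pdf", "page_number": 1},
}

flattened = flatten_element(element)  # comparable to flatten_metadata=True
intact = dict(element)                # comparable to flatten_metadata=False
```

Here flattened has filename and page_number as top-level keys and no metadata key, while intact still carries the nested metadata field.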

These environment variables:

  • UNSTRUCTURED_API_KEY - Your Unstructured API key value.
  • UNSTRUCTURED_API_URL - Your Unstructured API URL.

Now call the Unstructured CLI or Python SDK. The source connector can be any of the ones supported. This example uses the local source connector: