If you’re new to Unstructured, read this note first.

Before you can create a destination connector, you must first sign in to your Unstructured account:
- If you do not already have an Unstructured account, sign up for free. After you sign up, you are automatically signed in to your new Unstructured Starter account, at https://platform.unstructured.io. To sign up for a Team or Enterprise account instead, contact Unstructured Sales, or learn more.
- If you already have an Unstructured Starter or Team account and are not already signed in, sign in to your account at https://platform.unstructured.io. For an Enterprise account, see your Unstructured account administrator for instructions, or email Unstructured Support at support@unstructured.io.
- For the Unstructured UI or the Unstructured API, only Milvus cloud-based instances (such as Zilliz Cloud, and Milvus on IBM watsonx.data) are supported.
- For Unstructured Ingest, Milvus local and cloud-based instances are supported.
- For Zilliz Cloud, you will need:
  - A Zilliz Cloud account.
  - A Zilliz Cloud cluster.
  - The URI of the cluster, also known as the cluster’s public endpoint, which takes a format such as `https://<cluster-id>.<cluster-type>.<cloud-provider>-<region>.cloud.zilliz.com`. To get this public endpoint value, do the following:
    - After you sign in to your Zilliz Cloud account, on the sidebar, in the list of available projects, select the project that contains the cluster.
    - On the sidebar, click Clusters.
    - Click the tile for the cluster.
    - On the Cluster Details tab, on the Connect subtab, copy the Public Endpoint value.
  - The username and password to access the cluster, as follows:
    - After you sign in to your Zilliz Cloud account, on the sidebar, in the list of available projects, select the project that contains the cluster.
    - On the sidebar, click Clusters.
    - Click the tile for the cluster.
    - On the Users tab, copy the name of the user.
    - Next to the user’s name, under Actions, click the ellipsis (three dots) icon, and then click Reset Password.
    - Enter a new password for the user, and then click Confirm. Copy this new password.
  - The name of the database in the instance.
  - The name of the collection in the database.

    The collection must have a defined schema before Unstructured can write to the collection. The minimum viable schema for Unstructured contains only the fields `element_id`, `embeddings`, and `record_id`, as follows:

    | Field Name | Field Type | Max Length | Dimension |
    |---|---|---|---|
    | `element_id` (primary key field) | VARCHAR | 200 | - |
    | `embeddings` (vector field) | FLOAT_VECTOR | - | 3072 |
    | `record_id` | VARCHAR | 200 | - |

    In the Create Index area for the collection, next to Vector Fields, click Edit Index. Make sure that for the `embeddings` field, the Field Type is set to FLOAT_VECTOR and the Metric Type is set to Cosine.
- For Milvus on IBM watsonx.data, you will need:
  - An IBM Cloud account.
  - The IBM watsonx.data subscription plan.
  - A Milvus service instance in IBM watsonx.data.
  - The URI of the instance, which takes the format of `https://`, followed by the instance’s GRPC host, followed by a colon and the GRPC port, that is, `https://<host>:<port>`. Get the instance’s GRPC host and GRPC port.
  - The name of the database in the instance.
  - The name of the collection in the database. Note the collection requirements at the end of this section.
  - The username and password to access the instance. The username for Milvus on IBM watsonx.data is always `ibmlhapikey`. The password for Milvus on IBM watsonx.data is in the form of an IBM Cloud user API key. Get the user API key.
- For Milvus local, you will need:
  - A Milvus instance.
  - The URI of the instance.
  - The name of the database in the instance.
  - The name of the collection in the database. Note the collection requirements at the end of this section.
  - The username and password, or token, to access the instance.
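The URI formats described above can be illustrated with a short Python sketch. The helper names `build_grpc_uri` and `is_zilliz_public_endpoint` are hypothetical, the host and port values are placeholders, and the endpoint check is only a rough sanity test on the host name suffix, not an official validation rule.

```python
from urllib.parse import urlsplit


def build_grpc_uri(host: str, port: int) -> str:
    """Join a GRPC host and port into the https://<host>:<port> form."""
    return f"https://{host}:{port}"


def is_zilliz_public_endpoint(uri: str) -> bool:
    """Rough check that a URI looks like a Zilliz Cloud public endpoint."""
    parts = urlsplit(uri)
    hostname = parts.hostname or ""
    return parts.scheme == "https" and hostname.endswith(".cloud.zilliz.com")


# Placeholder values for illustration only.
ibm_uri = build_grpc_uri("milvus.example.com", 30439)
print(ibm_uri)  # prints "https://milvus.example.com:30439"
print(is_zilliz_public_endpoint("https://12345.serverless.gcp-us-west1.cloud.zilliz.com"))  # prints "True"
```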
The collection must have a defined schema before Unstructured can write to the collection. The minimum viable schema for Unstructured contains only the fields `element_id`, `embeddings`, and `record_id`, as follows. This example code demonstrates the use of the Python SDK for Milvus to create a collection with this minimum viable schema, targeting Milvus on IBM watsonx.data. For the `connections.connect` arguments to connect to other types of Milvus deployments, see your Milvus provider’s documentation:
Python
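A minimal sketch of creating such a collection, assuming the `pymilvus` package is installed and the instance is reachable over a secure GRPC connection. The function name `create_minimal_collection`, the `MINIMUM_FIELDS` list, the `IVF_FLAT` index settings, and the choice to enable dynamic fields are illustrative assumptions, so adjust them to your deployment.

```python
# Minimum viable schema for Unstructured: field name, Milvus type, and size.
MINIMUM_FIELDS = [
    {"name": "element_id", "dtype": "VARCHAR", "is_primary": True, "max_length": 200},
    {"name": "embeddings", "dtype": "FLOAT_VECTOR", "dim": 3072},
    {"name": "record_id", "dtype": "VARCHAR", "max_length": 200},
]


def create_minimal_collection(collection_name, host, port, user, password, db_name="default"):
    """Create a collection with the minimum viable schema (requires pymilvus)."""
    # Imported lazily so the schema definition above is usable without pymilvus.
    from pymilvus import Collection, CollectionSchema, DataType, FieldSchema, connections

    # Connection arguments for Milvus on IBM watsonx.data; other deployments differ.
    connections.connect(
        alias="default",
        host=host,
        port=port,
        user=user,          # always "ibmlhapikey" on IBM watsonx.data
        password=password,  # the IBM Cloud user API key
        db_name=db_name,
        secure=True,
    )

    fields = []
    for spec in MINIMUM_FIELDS:
        options = {k: v for k, v in spec.items() if k not in ("name", "dtype")}
        fields.append(
            FieldSchema(name=spec["name"], dtype=getattr(DataType, spec["dtype"]), **options)
        )

    # Assumption: dynamic fields are enabled so additional fields can be stored.
    schema = CollectionSchema(fields=fields, enable_dynamic_field=True)
    collection = Collection(name=collection_name, schema=schema, using="default")

    # Assumption: an IVF_FLAT index; the cosine metric matches the vector field setup.
    collection.create_index(
        field_name="embeddings",
        index_params={"index_type": "IVF_FLAT", "metric_type": "COSINE", "params": {"nlist": 1024}},
    )
    collection.load()
    return collection
```

Calling `create_minimal_collection` with the GRPC host, GRPC port, the `ibmlhapikey` username, and the API key gathered earlier would create and load the collection. The `dim` value must match the dimension count of the embedding model in use.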
- On the sidebar, click Connectors.
- Click Destinations.
- Click New or Create Connector.
- Give the connector a unique Name.
- In the Provider area, click Milvus.
- Click Continue.
- Follow the on-screen instructions to fill in the fields as described later on this page.
- Click Save and Test.
For Milvus on IBM watsonx.data, fill in the following fields:

- Name (required): A unique name for this connector.
- GRPC Host (required): The GRPC host name for the Milvus instance.
- GRPC Port: The GRPC port number for the instance.
- DB Name: The name of the database in the instance. The default is `default` if not otherwise specified.
- Collection Name (required): The name of the collection in the database.
- Username: The username to access the Milvus instance. The default is `ibmlhapikey` if not otherwise specified.
- API Key (required): The IBM Cloud user API key.
For Zilliz Cloud and other Milvus cloud-based instances, fill in the following fields:

- Name (required): A unique name for this connector.
- URI (required): The URI of the Milvus instance, for example: `https://12345.serverless.gcp-us-west1.cloud.zilliz.com`.
- DB Name: The name of the database in the instance. The default is `default` if not otherwise specified.
- Collection Name (required): The name of the collection in the database.
- Username (required): The username to access the Milvus instance.
- Password (required): The password corresponding to the username to access the instance.
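Before filling in the form, it can help to check that every required value is at hand. The following sketch models the fields listed above as plain Python dictionaries. The key names (`grpc_host`, `api_key`, and so on) and the `resolve_settings` helper are hypothetical stand-ins for the field labels, not part of any Unstructured API.

```python
# Hypothetical setting names that mirror the field labels; not an official API.
REQUIRED = {
    "ibm_watsonx": ("name", "grpc_host", "collection_name", "api_key"),
    "milvus_cloud": ("name", "uri", "collection_name", "username", "password"),
}


def resolve_settings(kind: str, settings: dict) -> dict:
    """Apply the documented defaults, then fail if a required field is missing."""
    resolved = {"db_name": "default"}
    if kind == "ibm_watsonx":
        # Documented default username for Milvus on IBM watsonx.data.
        resolved["username"] = "ibmlhapikey"
    # Explicit, non-empty values override the defaults.
    resolved.update({k: v for k, v in settings.items() if v not in (None, "")})
    missing = [key for key in REQUIRED[kind] if key not in resolved]
    if missing:
        raise ValueError(f"Missing required fields: {', '.join(missing)}")
    return resolved


# Placeholder values for illustration only.
example = resolve_settings(
    "milvus_cloud",
    {
        "name": "my-milvus-destination",
        "uri": "https://12345.serverless.gcp-us-west1.cloud.zilliz.com",
        "collection_name": "my_collection",
        "username": "db_admin",
        "password": "placeholder-password",
    },
)
print(example["db_name"])  # prints "default"
```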