Weaviate
Send processed data from Unstructured to Weaviate.
The requirements are as follows.
- For the Unstructured Platform: only Weaviate Cloud clusters are supported.
- For Unstructured Ingest: Weaviate Cloud clusters, Weaviate installed locally, and Embedded Weaviate are supported.
- For Weaviate installed locally, you will need the name of the target collection on the local instance.
- For Embedded Weaviate, you will need the instance’s connection URL and the name of the target collection on the instance.
- For Weaviate Cloud, you will need:
  - A Weaviate database instance. The following information assumes that you have a Weaviate Cloud (WCD) account with a Weaviate database cluster in that account. If you do not, create a WCD account, and then create a database cluster. For other database options, see the Weaviate documentation.
  - The URL and API key for the database cluster. Get the URL and API key.
  - The name of the target collection in the database. Create a collection.
Weaviate requires the collection to have a data schema before you add data. At minimum, this schema must contain the record_id property, as follows:
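A minimal sketch of such a schema, written as a Python dictionary that serializes to the JSON class definition used by Weaviate's schema API. The collection name `Elements` and the `text` data type for `record_id` are assumptions for illustration; substitute your own collection name.

```python
import json

# Assumption: "Elements" is a placeholder collection name, not a required value.
# Assumption: record_id is stored as a Weaviate "text" property.
minimal_schema = {
    "class": "Elements",
    "properties": [
        {"name": "record_id", "dataType": ["text"]},
    ],
}

# Serialize to JSON, for example to paste into a schema-creation request.
print(json.dumps(minimal_schema, indent=2))
```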
Weaviate generates any additional properties based on the incoming data.
If you have specific schema requirements, you can define the schema manually. Unstructured cannot provide a schema that is guaranteed to work for everyone in all circumstances. This is because these schemas will vary based on your source files’ types; how you want Unstructured to partition, chunk, and generate embeddings; any custom post-processing code that you run; and other factors.
You can adapt the following collection schema example for your own specific schema requirements:
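As one possible starting point, the sketch below builds a slightly fuller schema in Python. Apart from `record_id`, every property here is an assumption based on fields that Unstructured elements commonly carry (`element_id`, `text`, `type`, and a few `metadata` entries). The helper `build_collection_schema` is hypothetical, and the set of properties you actually need depends on your files and pipeline settings.

```python
import json


def build_collection_schema(collection_name: str) -> dict:
    """Return a Weaviate class definition with a handful of explicit properties.

    Only record_id is required by the text above; the other properties are
    illustrative assumptions and should be adapted or removed as needed.
    """
    return {
        "class": collection_name,
        "properties": [
            {"name": "record_id", "dataType": ["text"]},
            {"name": "element_id", "dataType": ["text"]},
            {"name": "text", "dataType": ["text"]},
            {"name": "type", "dataType": ["text"]},
            {
                # Nested object property holding per-element metadata.
                "name": "metadata",
                "dataType": ["object"],
                "nestedProperties": [
                    {"name": "filename", "dataType": ["text"]},
                    {"name": "filetype", "dataType": ["text"]},
                    {"name": "page_number", "dataType": ["int"]},
                ],
            },
        ],
    }


# "Elements" is a placeholder collection name.
schema = build_collection_schema("Elements")
print(json.dumps(schema, indent=2))
```

Properties left out of a manual schema are not necessarily lost, since Weaviate can generate additional properties from the incoming data, as noted above.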
To create the destination connector:
- On the sidebar, click Connectors.
- Click Destinations.
- Click New or Create Connector.
- Give the connector a unique Name.
- In the Provider area, click Weaviate.
- Click Continue.
- Follow the on-screen instructions to fill in the fields as described later on this page.
- Click Save and Test.
Fill in the following fields:
- Name (required): A unique name for the connector.
- Cluster URL (required): The URL of the Weaviate database cluster.
- Collection Name (required): The name of the target collection within the cluster.
- API Key (required): The API key provided by Weaviate to access the cluster.
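If you collect these values in a script before entering them, a small pre-check can catch empty or malformed entries. The following is only a sketch: the class `WeaviateDestinationFields` and its `problems` method are hypothetical names, not part of any Unstructured or Weaviate API, and the URL check (a scheme and a host must be present) is an assumption about what a usable cluster URL looks like.

```python
from dataclasses import dataclass
from typing import List
from urllib.parse import urlparse


@dataclass(frozen=True)
class WeaviateDestinationFields:
    """Hypothetical container mirroring the four required connector fields."""

    name: str
    cluster_url: str
    collection_name: str
    api_key: str

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means none were found."""
        found = []
        required = (
            ("Name", self.name),
            ("Cluster URL", self.cluster_url),
            ("Collection Name", self.collection_name),
            ("API Key", self.api_key),
        )
        for label, value in required:
            if not value.strip():
                found.append(f"{label} is required")
        # Assumption: a usable cluster URL has both a scheme and a host.
        parsed = urlparse(self.cluster_url)
        if self.cluster_url.strip() and not (parsed.scheme and parsed.netloc):
            found.append("Cluster URL must include a scheme and host")
        return found


# Placeholder values for illustration only.
fields = WeaviateDestinationFields(
    name="my-weaviate-destination",
    cluster_url="https://example-cluster.example.com",
    collection_name="Elements",
    api_key="PLACEHOLDER",
)
print(fields.problems())
```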