Send processed data from Unstructured to Weaviate.

The requirements are as follows.

Weaviate requires the collection to have a data schema before you add data. At minimum, this schema must contain the record_id property, as follows:

{
    "class": "Elements",
    "properties": [
        {
            "name": "record_id",
            "dataType": ["text"]
        }
    ]
}

Weaviate generates any additional properties based on the incoming data.
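
As a quick sanity check before loading data, the following minimal Python sketch tests whether a schema definition declares the required record_id property. The helper name has_record_id is hypothetical and is not part of Unstructured or Weaviate. The sketch only inspects a local dictionary and does not contact a Weaviate cluster.

```python
import json

# Minimal collection schema, copied from the example above.
MINIMAL_SCHEMA = {
    "class": "Elements",
    "properties": [
        {"name": "record_id", "dataType": ["text"]},
    ],
}


def has_record_id(schema: dict) -> bool:
    """Return True if the schema declares a top-level record_id text property.

    Hypothetical helper for illustration; it only inspects the dictionary.
    """
    for prop in schema.get("properties", []):
        if prop.get("name") == "record_id" and prop.get("dataType") == ["text"]:
            return True
    return False


if __name__ == "__main__":
    print(has_record_id(MINIMAL_SCHEMA))
    print(json.dumps(MINIMAL_SCHEMA, indent=4))
```

Running a check like this on a schema file before creating the collection can catch a missing record_id early.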

If you have specific schema requirements, you can define the schema manually. Unstructured cannot provide a single schema that is guaranteed to work in all circumstances, because the schema varies with your source files’ types; how you want Unstructured to partition, chunk, and generate embeddings; any custom post-processing code that you run; and other factors.

You can adapt the following collection schema example for your own specific schema requirements:

{
    "class": "Elements",
    "properties": [
        {
            "name": "record_id",
            "dataType": ["text"]
        },
        {
            "name": "element_id",
            "dataType": ["text"]
        },
        {
            "name": "text",
            "dataType": ["text"]
        },
        {
            "name": "embeddings",
            "dataType": ["number[]"]
        },
        {
            "name": "metadata",
            "dataType": ["object"],
            "nestedProperties": [
                {
                    "name": "parent_id",
                    "dataType": ["text"]
                },
                {
                    "name": "page_number",
                    "dataType": ["text"]
                },
                {
                    "name": "is_continuation",
                    "dataType": ["boolean"]
                },
                {
                    "name": "orig_elements",
                    "dataType": ["text"]
                }
            ]
        }
    ]
}
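
When adapting the example, it can help to see every property the schema declares, including those nested under metadata. The following sketch, which assumes the schema is available as a Python dictionary, flattens it into dotted paths with their data types. The function name flatten_properties is hypothetical and only illustrates one way to review a schema before use.

```python
# The example collection schema from above, as a Python dictionary.
EXAMPLE_SCHEMA = {
    "class": "Elements",
    "properties": [
        {"name": "record_id", "dataType": ["text"]},
        {"name": "element_id", "dataType": ["text"]},
        {"name": "text", "dataType": ["text"]},
        {"name": "embeddings", "dataType": ["number[]"]},
        {
            "name": "metadata",
            "dataType": ["object"],
            "nestedProperties": [
                {"name": "parent_id", "dataType": ["text"]},
                {"name": "page_number", "dataType": ["text"]},
                {"name": "is_continuation", "dataType": ["boolean"]},
                {"name": "orig_elements", "dataType": ["text"]},
            ],
        },
    ],
}


def flatten_properties(properties: list, prefix: str = "") -> dict:
    """Map dotted property paths to their first declared data type.

    Hypothetical helper: recurses into nestedProperties so that nested
    fields appear as, for example, "metadata.parent_id".
    """
    paths = {}
    for prop in properties:
        path = prefix + prop["name"]
        paths[path] = prop["dataType"][0]
        nested = prop.get("nestedProperties", [])
        paths.update(flatten_properties(nested, prefix=path + "."))
    return paths


if __name__ == "__main__":
    for path, data_type in flatten_properties(EXAMPLE_SCHEMA["properties"]).items():
        print(f"{path}: {data_type}")
```

Comparing this flattened list against the fields in your processed output is one way to spot properties that you may want to define explicitly.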

To create or change a Weaviate destination connector, supply values for the following placeholders:

  • <name> (required) - A unique name for this connector.
  • <host-url> (required) - The URL of the Weaviate database cluster.
  • <class-name> (required) - The name of the target collection within the cluster.
  • <api-key> (required) - The API key provided by Weaviate to access the cluster.
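
Because all four values are required, a small pre-flight check can catch an omission before a request is sent. The sketch below is illustrative only: the dictionary keys mirror the placeholder names above, not the field names of any actual API, and missing_placeholders is a hypothetical helper.

```python
# Placeholder names from the list above (assumed keys, not real API field names).
REQUIRED_PLACEHOLDERS = ("name", "host-url", "class-name", "api-key")


def missing_placeholders(values: dict) -> list:
    """Return the required placeholders that are absent or blank, in list order."""
    return [
        key
        for key in REQUIRED_PLACEHOLDERS
        if not str(values.get(key) or "").strip()
    ]


if __name__ == "__main__":
    settings = {
        "name": "my-weaviate-destination",        # hypothetical connector name
        "host-url": "https://example.invalid",    # hypothetical cluster URL
        "class-name": "Elements",
        "api-key": "",                            # left blank on purpose
    }
    print(missing_placeholders(settings))
```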

To change a connector, replace <connector-id> with the destination connector’s unique ID. To get this ID, see List destination connectors.