If you’re new to Unstructured, read this note first.

Before you can create a source connector, you must first sign in to your Unstructured account:
- If you do not already have an Unstructured account, sign up for free. After you sign up, you are automatically signed in to your new Unstructured Let’s Go account, at https://platform.unstructured.io. To sign up for a Business account instead, contact Unstructured Sales, or learn more.
- If you already have an Unstructured Let’s Go, Pay-As-You-Go, or Business SaaS account and are not already signed in, sign in to your account at https://platform.unstructured.io. For other types of Business accounts, see your Unstructured account administrator for sign-in instructions, or contact Unstructured Support.
- A Teradata Vantage system that can be accessed by its host name or IP address.

  For example, a Teradata Vantage system in Teradata ClearScape Analytics Experience includes:

  - A Teradata ClearScape Analytics Experience account.
  - An environment in the account.
  - A Teradata Vantage database in the environment.
  - The name and password for a Teradata user who has the appropriate access to the database.

- The system’s corresponding host name or IP address.

  For example, you can get this value from Teradata ClearScape Analytics Experience as follows:

  - Sign in to your Teradata ClearScape Analytics Experience account.
  - On the sidebar, under Environments, click the name of the database’s corresponding environment.
  - Under Connection details for Vantage database, use the Host value.

- The name of the target database in the system. To get a list of available databases in the system, you can query the `DBC.DatabasesV` data dictionary view with Teradata SQL.
- The name of the target table in the database. To get a list of available tables in a database, you can query the `DBC.TablesV` data dictionary view with Teradata SQL, filtering on the name of the target database.

  When Unstructured writes rows to a table, the table’s columns must have a schema that is compatible with Unstructured. Unstructured cannot provide a schema that is guaranteed to work for everyone in all circumstances, because these schemas vary based on your source files’ types; how you want Unstructured to partition, chunk, and generate embeddings; any custom post-processing code that you run; and other factors. In any case, note the following about table schemas:

  - The following columns are always required by Unstructured: `record_id` and `element_id`.
  - The following columns are optional for Unstructured, but highly recommended: `text` and `type`.
  - The rest of the columns are optional and typically will be output by Unstructured as part of the `metadata` field.
  - If Unstructured is generating vector embeddings, the `embeddings` column is also required.

  In any table definition that uses the placeholders `<database-name>` and `<table-name>`, be sure to replace them with the name of the target database and the name of the target table, respectively (by Unstructured convention, the table name is typically `elements`, but this is not a requirement).
- For the source connector, the name of the primary key column in the table (for example, a column named `id`, typically defined as `"id" VARCHAR(64) NOT NULL, PRIMARY KEY ("id")`).
- For the source connector, the names of any specific columns to fetch from the table. By default, all columns are fetched unless otherwise specified.
- For the destination connector, the name of the column in the table that uniquely identifies each record for Unstructured to perform any necessary record updates. By convention, Unstructured expects this field to be named `record_id`.
- The name of the Teradata user who has the appropriate access to the target database.

  For example, you can get this from Teradata ClearScape Analytics Experience as follows:

  - Sign in to your Teradata ClearScape Analytics Experience account.
  - On the sidebar, under Environments, click the name of the database’s corresponding environment.
  - Under Connection details for Vantage database, use the Username value.

- The password for the user, which was set up when the user was created.

  If the user has forgotten their password, the Teradata SQL statement `MODIFY USER <user-name> AS PASSWORD = <new-password>;` changes it, where `<user-name>` is the name of the user and `<new-password>` is the new password. To change a user’s password, you must be an administrator (such as the `DBC` user or another user with `DROP USER` privileges).
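The database lookup, table lookup, and table schema requirements described above can be sketched as Teradata SQL templates. The following Python sketch is illustrative only: the `fill` helper, the constant names, and the column types in `CREATE_TABLE_SQL` are assumptions made for this example, not something Unstructured prescribes. The `embeddings` column and the `metadata` columns are left out because their types depend on your own setup.

```python
import re

# DBC.DatabasesV is the Teradata data dictionary view that lists databases.
LIST_DATABASES_SQL = "SELECT DatabaseName FROM DBC.DatabasesV ORDER BY DatabaseName;"

# TableKind = 'T' keeps ordinary tables only; remove the filter to see other objects.
LIST_TABLES_SQL = (
    "SELECT TableName FROM DBC.TablesV "
    "WHERE DatabaseName = '<database-name>' AND TableKind = 'T' "
    "ORDER BY TableName;"
)

# Required columns (record_id, element_id), recommended columns (text, type),
# and a primary key for the source connector. Column types are assumptions.
CREATE_TABLE_SQL = """CREATE TABLE "<database-name>"."<table-name>" (
  "id" VARCHAR(64) NOT NULL,
  "record_id" VARCHAR(64),
  "element_id" VARCHAR(64),
  "text" VARCHAR(16000) CHARACTER SET UNICODE,
  "type" VARCHAR(50),
  PRIMARY KEY ("id")
);"""

PLACEHOLDER = re.compile(r"<([a-z-]+)>")


def fill(template: str, **values: str) -> str:
    """Replace <placeholder-name> markers in a template.

    The keyword database_name fills <database-name>, and so on. Values are
    inserted verbatim, so pass only names that you trust.
    """
    needed = set(PLACEHOLDER.findall(template))
    given = {key.replace("_", "-"): value for key, value in values.items()}
    if needed != set(given):
        raise ValueError(f"expected values for {sorted(needed)}, got {sorted(given)}")
    return PLACEHOLDER.sub(lambda match: given[match.group(1)], template)


if __name__ == "__main__":
    print(LIST_DATABASES_SQL)
    print(fill(LIST_TABLES_SQL, database_name="my_database"))
    print(fill(CREATE_TABLE_SQL, database_name="my_database", table_name="elements"))
```

You can paste the printed statements into any Teradata SQL client. Here `my_database` is a stand-in for the name of your own target database.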
- On the sidebar, click Connectors.
- Click Sources.
- Click New or Create Connector.
- Give the connector a unique Name.
- In the Provider area, click Teradata.
- Click Continue.
- Follow the on-screen instructions to fill in the fields as described later on this page.
- Click Save and Test.
- Name (required): A unique name for this connector.
- Host (required): The hostname or IP address associated with the target Teradata Vantage database.
- Database: The name of the target database. If not specified, the default database is used. To get the name of the default database, you can run the Teradata SQL command `SELECT DATABASE;`.
- Table Name (required): The name of the target table in the database.
- Batch Size: The maximum number of rows per batch. The default is `100` if not otherwise specified.
- ID Column (required, source connector only): The name of the primary key column that Teradata uses to uniquely identify each record in the table.
- Record ID Key (destination connector only): The name of the column that Unstructured uses to uniquely identify each record in the table for record update purposes. The default is `record_id` if not otherwise specified.
- Columns (source connector only): The names of the columns to fetch from the table, separated by commas. By default, all columns are fetched unless otherwise specified.
- Username (required): The name of the user who has the appropriate access to the database.
- Password (required): The password for the user.
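
To summarize which of the settings above are required and which have defaults, here is a minimal Python sketch. `TeradataConnectorSettings` and its `problems` method are hypothetical names invented for this example; they are not part of any Unstructured API. The lower bound on the batch size is also an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TeradataConnectorSettings:
    """Hypothetical container mirroring the fields above; not an Unstructured class."""

    name: str
    host: str
    table_name: str
    username: str
    password: str
    database: Optional[str] = None       # None means "use the default database"
    batch_size: int = 100                # documented default
    id_column: Optional[str] = None      # source connector only; required there
    record_id_key: str = "record_id"     # destination connector only
    columns: Optional[List[str]] = None  # source connector only; None fetches all

    def problems(self, role: str) -> List[str]:
        """Return a list of problems for a 'source' or 'destination' connector."""
        if role not in ("source", "destination"):
            raise ValueError("role must be 'source' or 'destination'")
        found = []
        required = {
            "Name": self.name,
            "Host": self.host,
            "Table Name": self.table_name,
            "Username": self.username,
            "Password": self.password,
        }
        for label, value in required.items():
            if not value:
                found.append(f"{label} is required")
        if role == "source" and not self.id_column:
            found.append("ID Column is required for a source connector")
        if self.batch_size < 1:  # assumed lower bound, not stated in the field list
            found.append("Batch Size must be at least 1")
        return found


if __name__ == "__main__":
    settings = TeradataConnectorSettings(
        name="my-teradata-connector",
        host="vantage.example.com",  # placeholder host name
        table_name="elements",
        username="my_user",
        password="my_password",
    )
    print(settings.problems("destination"))
    print(settings.problems("source"))
```

An empty list means that every required field for that connector type has a value. The example settings lack an ID Column, so the source check reports one problem while the destination check reports none.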

