If you’re new to Unstructured, read this note first.
Before you can create a source connector, you must first sign in to your Unstructured account:
- If you do not already have an Unstructured account, sign up for free. After you sign up, you are automatically signed in to your new Unstructured Let’s Go account, at https://platform.unstructured.io. To sign up for a Business account instead, contact Unstructured Sales, or learn more.
- If you already have an Unstructured Let’s Go, Pay-As-You-Go, or Business SaaS account and are not already signed in, sign in to your account at https://platform.unstructured.io. For other types of Business accounts, see your Unstructured account administrator for sign-in instructions, or email Unstructured Support at [email protected].
- After you sign in to your Unstructured Let’s Go, Pay-As-You-Go, or Business account, click API Keys on the sidebar.
  For a Business account, before you click API Keys, make sure you have selected the organizational workspace you want to create an API key for. Each API key works with one and only one organizational workspace. Learn more.
- Click Generate API Key.
- Follow the on-screen instructions to finish generating the key.
- Click the Copy icon next to your new key to add the key to your system’s clipboard. If you lose this key, simply return and click the Copy icon again.
- A Teradata Vantage system that can be accessed by its host name or IP address.
  For example, a Teradata Vantage system in Teradata ClearScape Analytics Experience includes:
  - A Teradata ClearScape Analytics Experience account.
  - An environment in the account.
  - A Teradata Vantage database in the environment.
  - The name and password for a Teradata user who has the appropriate access to the database.
- The system’s corresponding host name or IP address.
  For example, you can get this value from Teradata ClearScape Analytics Experience as follows:
  - Sign in to your Teradata ClearScape Analytics Experience account.
  - On the sidebar, under Environments, click the name of the database’s corresponding environment.
  - Under Connection details for Vantage database, use the Host value.
- The name of the target database in the system. To get a list of available databases in the system, you can run a Teradata SQL query against the DBC.DatabasesV data dictionary view, such as SELECT DatabaseName FROM DBC.DatabasesV.
- The name of the target table in the database. To get a list of available tables in a database, you can run a Teradata SQL query against the DBC.TablesV data dictionary view, such as SELECT TableName FROM DBC.TablesV WHERE DatabaseName = '<database-name>', replacing <database-name> with the name of the target database.
  When Unstructured writes rows to a table, the table’s columns must have a schema that is compatible with Unstructured. Unstructured cannot provide a schema that is guaranteed to work for everyone in all circumstances. This is because these schemas will vary based on your source files’ types; how you want Unstructured to partition, chunk, and generate embeddings; any custom post-processing code that you run; and other factors. In any case, note the following about table schemas:
  - The following columns are always required by Unstructured: record_id and element_id.
  - The following columns are optional for Unstructured, but highly recommended: text and type.
  - The rest of the columns are optional and typically will be output by Unstructured as part of the metadata field.
  - If Unstructured is generating vector embeddings, the embeddings column is also required.
  In a CREATE TABLE statement for the target table, use the name of the target database and the name of the target table (by Unstructured convention, the table name is typically elements, but this is not a requirement).
- For the source connector, the name of the primary key column in the table (for example, a column named id, typically defined as "id" VARCHAR(64) NOT NULL, PRIMARY KEY ("id")).
- For the source connector, the names of any specific columns to fetch from the table. By default, all columns are fetched unless otherwise specified.
- For the destination connector, the name of the column in the table that uniquely identifies each record for Unstructured to perform any necessary record updates. By convention, Unstructured expects this column to be named record_id.
- The name of the Teradata user who has the appropriate access to the target database.
  For example, you can get this from Teradata ClearScape Analytics Experience as follows:
  - Sign in to your Teradata ClearScape Analytics Experience account.
  - On the sidebar, under Environments, click the name of the database’s corresponding environment.
  - Under Connection details for Vantage database, use the Username value.
- The password for the user, which was set up when the user was created.
  If the user has forgotten their password, it can be changed with a Teradata SQL command such as MODIFY USER <user-name> AS PASSWORD = <new-password>, replacing <user-name> with the name of the user and <new-password> with the new password. To change a user’s password, you must be an administrator (such as the DBC user or another user with DROP USER privileges).
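The database and table lookups and the table schema notes in the list above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not reference code from Unstructured or Teradata: the function names are hypothetical, the column types in the CREATE TABLE text are assumptions (only the id definition comes from the list above), and the queries rely on Teradata's DBC.DatabasesV and DBC.TablesV data dictionary views. The functions only build SQL text; running it is left to whichever Teradata client you use.

```python
import re


def _check_identifier(name: str) -> str:
    """Allow only simple identifiers so they can be embedded in SQL text safely."""
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def list_databases_sql() -> str:
    """Query that lists the databases visible in the system."""
    return "SELECT DatabaseName FROM DBC.DatabasesV ORDER BY DatabaseName;"


def list_tables_sql(database_name: str) -> str:
    """Query that lists the tables in one database."""
    db = _check_identifier(database_name)
    return (
        "SELECT TableName FROM DBC.TablesV "
        f"WHERE DatabaseName = '{db}' ORDER BY TableName;"
    )


def create_table_sql(database_name: str, table_name: str) -> str:
    """CREATE TABLE text with the required and recommended columns.

    Column types other than "id" are assumptions for illustration only.
    An embeddings column is not included; add one if embeddings are generated.
    """
    db = _check_identifier(database_name)
    tbl = _check_identifier(table_name)
    return (
        f'CREATE TABLE "{db}"."{tbl}" (\n'
        '  "id" VARCHAR(64) NOT NULL,\n'
        '  "record_id" VARCHAR(64),\n'       # always required by Unstructured
        '  "element_id" VARCHAR(64),\n'      # always required by Unstructured
        '  "text" VARCHAR(32000) CHARACTER SET UNICODE,\n'  # recommended
        '  "type" VARCHAR(64),\n'            # recommended
        '  PRIMARY KEY ("id")\n'
        ");"
    )
```

For example, `print(create_table_sql("my_db", "elements"))` prints a statement for a table named elements in a hypothetical database my_db. Adjust the column types and add metadata columns to match your own source files and workflow.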
- <name> (required): A unique name for this connector.
- <host> (required): The hostname or IP address associated with the target Teradata Vantage database.
- <database>: The name of the target database. By default, the default database name is used if not otherwise specified. To get the name of the default database, you can run the Teradata SQL command SELECT DATABASE;.
- <table-name> (required): The name of the target table in the database.
- <batch-size>: The maximum number of rows per batch. The default is 100 if not otherwise specified.
- <id-column> (required, source connector only): The name of the primary key column that Teradata uses to uniquely identify each record in the table.
- <record-id-key> (destination connector only): The name of the column that Unstructured uses to uniquely identify each record in the table for record update purposes. The default is record_id if not otherwise specified.
- <column-name> (source connector only): The name of a column to fetch from the table. By default, all columns are fetched unless otherwise specified.
- <username> (required): The name of the user who has the appropriate access to the database.
- <password> (required): The password for the user.
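The required, optional, and default values described above can be checked before a connector is created. The following is a minimal sketch, assuming a hypothetical helper named build_teradata_settings. The key names (table_name, id_column, record_id_key, and so on) are stand-ins for the placeholders above and are not confirmed to be the field names that the Unstructured API expects.

```python
from typing import Any, Dict, Optional, Sequence


def build_teradata_settings(
    role: str,
    *,
    name: str,
    host: str,
    table_name: str,
    username: str,
    password: str,
    database: Optional[str] = None,
    batch_size: int = 100,            # documented default
    id_column: Optional[str] = None,  # required for source connectors
    record_id_key: str = "record_id", # documented default, destination only
    columns: Optional[Sequence[str]] = None,  # source only; None fetches all
) -> Dict[str, Any]:
    """Validate the placeholder values and return them as a plain dict."""
    if role not in ("source", "destination"):
        raise ValueError("role must be 'source' or 'destination'")

    required = {
        "name": name,
        "host": host,
        "table_name": table_name,
        "username": username,
        "password": password,
    }
    missing = [key for key, value in required.items() if not value]
    if missing:
        raise ValueError(f"missing required values: {', '.join(missing)}")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    settings: Dict[str, Any] = dict(required)
    settings["batch_size"] = batch_size
    if database:
        # When omitted, the user's default database is used.
        settings["database"] = database

    if role == "source":
        if not id_column:
            raise ValueError("id_column is required for a source connector")
        settings["id_column"] = id_column
        if columns:
            settings["columns"] = list(columns)
    else:
        settings["record_id_key"] = record_id_key
    return settings
```

Calling the helper with role "source" and no id_column raises a ValueError, which mirrors the "required, source connector only" note above. Map the resulting keys to the actual connector fields before sending them anywhere.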

