-
A Teradata Vantage system that can be accessed by its host name or IP address.
For example, a Teradata Vantage system in Teradata ClearScape Analytics Experience includes:
- A Teradata ClearScape Analytics Experience account.
- An environment in the account.
- A Teradata Vantage database in the environment.
- The name and password for a Teradata user who has the appropriate access to the database.
-
The system’s corresponding host name or IP address.
For example, you can get these values from Teradata ClearScape Analytics Experience as follows:
- Sign in to your Teradata ClearScape Analytics Experience account.
- On the sidebar, under Environments, click the name of the database’s corresponding environment.
- Under Connection details for Vantage database, use the Host value.
-
The name of the target database in the system. To get a list of available databases in the system, you can run a Teradata SQL query such as the following:
-
The name of the target table in the database. To get a list of available tables in a database, you can run a Teradata SQL query such as the following, replacing <database-name> with the name of the target database:
When Unstructured writes rows to a table, the table’s columns must have a schema that is compatible with Unstructured. Unstructured cannot provide a schema that is guaranteed to work for everyone in all circumstances. This is because these schemas will vary based on your source files’ types; how you want Unstructured to partition, chunk, and generate embeddings; any custom post-processing code that you run; and other factors. In any case, note the following about table schemas:
- The following columns are always required by Unstructured: record_id and element_id.
- The following columns are optional for Unstructured, but highly recommended: text and type.
- The rest of the columns are optional and typically will be output by Unstructured as part of the metadata field.
- If Unstructured is generating vector embeddings, the embeddings column is also required.
When you create the table, be sure to replace <database-name> with the name of the target database and <table-name> with the name of the target table (by Unstructured convention, the table name is typically elements, but this is not a requirement).
-
For the source connector, the name of the primary key column in the table (for example, a column named id, typically defined as "id" VARCHAR(64) NOT NULL, PRIMARY KEY ("id")).
-
For the source connector, the names of any specific columns to fetch from the table. By default, all columns are fetched unless otherwise specified.
-
For the destination connector, the name of the column in the table that uniquely identifies each record, which Unstructured uses to perform any necessary record updates. By default, Unstructured expects this column to be named record_id.
-
The name of the Teradata user who has the appropriate access to the target database.
For example, you can get this from Teradata ClearScape Analytics Experience as follows:
- Sign in to your Teradata ClearScape Analytics Experience account.
- On the sidebar, under Environments, click the name of the database’s corresponding environment.
- Under Connection details for Vantage database, use the Username value.
-
The password for the user, which was set up when the user was created.
If the user has forgotten their password, the Teradata SQL command to change a user’s password is as follows, replacing <user-name> with the name of the user and <new-password> with the new password:
To change a user’s password, you must be an administrator (such as the DBC user or another user with DROP USER privileges).
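The Teradata SQL statements and column requirements mentioned in the prerequisites above can be sketched in Python as follows. This is a minimal sketch, not the connector's own code: the helper names are hypothetical, the statements are assumptions based on the standard DBC.DatabasesV and DBC.TablesV dictionary views and the MODIFY USER statement, and the identifier check is a deliberately strict assumption to keep untrusted input out of the generated SQL.

```python
import re

# Assumption: only simple, unquoted names are accepted (letters, digits, _, $, #).
_IDENTIFIER = re.compile(r"^[A-Za-z_$#][A-Za-z0-9_$#]*$")

# Column requirements as stated in the prerequisites above.
REQUIRED_COLUMNS = {"record_id", "element_id"}


def _check_identifier(name: str) -> str:
    """Reject anything that is not a simple identifier before it reaches SQL."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"not a simple identifier: {name!r}")
    return name


def list_databases_sql() -> str:
    # DBC.DatabasesV is the data dictionary view that lists databases and users.
    return "SELECT DatabaseName FROM DBC.DatabasesV ORDER BY DatabaseName;"


def list_tables_sql(database_name: str) -> str:
    # TableKind = 'T' keeps ordinary tables; views, macros, and other kinds are left out.
    db = _check_identifier(database_name)
    return (
        "SELECT TableName FROM DBC.TablesV "
        f"WHERE DatabaseName = '{db}' AND TableKind = 'T' ORDER BY TableName;"
    )


def change_password_sql(user_name: str, new_password: str) -> str:
    # Must be run by an administrator with DROP USER privileges on the user.
    user = _check_identifier(user_name)
    password = _check_identifier(new_password)
    return f"MODIFY USER {user} AS PASSWORD = {password};"


def missing_columns(table_columns, generating_embeddings=False):
    """Return the required column names that the table does not have."""
    required = set(REQUIRED_COLUMNS)
    if generating_embeddings:
        required.add("embeddings")
    present = {column.lower() for column in table_columns}
    return sorted(required - present)
```

Each helper only returns text or a list; you would still run the generated statements with whichever Teradata SQL client you already use.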
CLI, Python
- TERADATA_HOST - The host name, represented by --host (CLI) or host (Python).
- TERADATA_PORT - The port number, represented by --dbs-port (CLI) or dbs_port (Python). This is optional, and the default is 1025 if not otherwise specified.
- TERADATA_USERNAME - The name of the user who has access to the database, represented by --user (CLI) or user (Python).
- TERADATA_PASSWORD - The user’s password, represented by --password (CLI) or password (Python).
- TERADATA_DATABASE - The name of the database, represented by --database (CLI) or database (Python). If not otherwise specified, the default database name is used. To get the name of the default database, you can run the Teradata SQL command SELECT DATABASE;.
- TERADATA_TABLE - The name of the table, represented by --table-name (CLI) or table_name (Python).
- TERADATA_ID_COLUMN - For the source connector, the name of the column that uniquely identifies each record in the table, represented by --id-column (CLI) or id_column (Python).
- TERADATA_RECORD_ID_KEY - For the destination connector, the name of the column in the table that uniquely identifies each record, represented by --record-id-key (CLI) or record_id_key (Python).
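As an illustration of how the environment variables above map to the Python parameter names, the following sketch collects them into a plain dictionary. The function name is hypothetical, and treating the host, user name, and password as mandatory is an assumption. Only the default port of 1025 comes from the list above. The sketch deliberately does not import any Unstructured Ingest classes.

```python
import os


def teradata_settings_from_env(env=None) -> dict:
    """Map the TERADATA_* variables to the Python parameter names listed above."""
    env = os.environ if env is None else env

    # Assumption: host, user, and password must always be present.
    settings = {
        "host": env["TERADATA_HOST"],
        "user": env["TERADATA_USERNAME"],
        "password": env["TERADATA_PASSWORD"],
        # The port is optional and defaults to 1025.
        "dbs_port": int(env.get("TERADATA_PORT", "1025")),
    }

    # Remaining values are copied only when they are set and non-empty.
    optional = (
        ("TERADATA_DATABASE", "database"),
        ("TERADATA_TABLE", "table_name"),
        ("TERADATA_ID_COLUMN", "id_column"),          # source connector
        ("TERADATA_RECORD_ID_KEY", "record_id_key"),  # destination connector
    )
    for variable, parameter in optional:
        if env.get(variable):
            settings[parameter] = env[variable]
    return settings
```

Passing a dictionary instead of relying on os.environ makes the mapping easy to check without touching the real environment.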
Use the --partition-by-api option (CLI) or the partition_by_api parameter (Python) to specify where files are processed:
-
To do local file processing, omit --partition-by-api (CLI) or partition_by_api (Python), or explicitly specify partition_by_api=False (Python). Local file processing does not use an Unstructured API key or API URL, so you can also omit the following, if they appear:
- --api-key $UNSTRUCTURED_API_KEY (CLI) or api_key=os.getenv("UNSTRUCTURED_API_KEY") (Python)
- --partition-endpoint $UNSTRUCTURED_API_URL (CLI) or partition_endpoint=os.getenv("UNSTRUCTURED_API_URL") (Python)
- The environment variables UNSTRUCTURED_API_KEY and UNSTRUCTURED_API_URL
-
To send files to the legacy Unstructured Partition Endpoint for processing, specify --partition-by-api (CLI) or partition_by_api=True (Python). Unstructured also requires an Unstructured API key and API URL, which you provide by adding the following:
- --api-key $UNSTRUCTURED_API_KEY (CLI) or api_key=os.getenv("UNSTRUCTURED_API_KEY") (Python)
- --partition-endpoint $UNSTRUCTURED_API_URL (CLI) or partition_endpoint=os.getenv("UNSTRUCTURED_API_URL") (Python)
- The environment variables UNSTRUCTURED_API_KEY and UNSTRUCTURED_API_URL, representing your API key and API URL, respectively.
You must specify the API URL only if you are not using the default API URL for Unstructured Ingest, which applies to Let’s Go, Pay-As-You-Go, and Business SaaS accounts.
The default API URL for Unstructured Ingest is https://api.unstructuredapp.io/general/v0/general, which is the API URL for the legacy Unstructured Partition Endpoint. However, you should always use the URL that was provided to you when your Unstructured account was created. If you do not have this URL, contact Unstructured Support.
If you do not have an API key, get one now.
If you are using a Business account, the process for generating Unstructured API keys, and the Unstructured API URL that you use, are different. For instructions, see your Unstructured account administrator, or contact Unstructured Support.
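The choice between local processing and the legacy Unstructured Partition Endpoint can be sketched as a small helper that assembles the Python keyword arguments named above. The function name is hypothetical, and raising an error when the API key is missing is an assumption about how you might want to fail early. The parameter names partition_by_api, api_key, and partition_endpoint are the ones described above.

```python
import os


def partitioner_kwargs(use_api: bool, env=None) -> dict:
    """Build the partitioning keyword arguments for local or API processing."""
    env = os.environ if env is None else env

    if not use_api:
        # Local processing needs neither an API key nor an API URL.
        return {"partition_by_api": False}

    api_key = env.get("UNSTRUCTURED_API_KEY")
    if not api_key:
        raise ValueError("UNSTRUCTURED_API_KEY must be set when partition_by_api=True")

    kwargs = {"partition_by_api": True, "api_key": api_key}

    # The API URL is only passed when set; otherwise the default URL applies.
    api_url = env.get("UNSTRUCTURED_API_URL")
    if api_url:
        kwargs["partition_endpoint"] = api_url
    return kwargs
```

The resulting dictionary is meant to be unpacked into the partitioning configuration of your Python ingest pipeline.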

