-
A Snowflake account and its account identifier.
To get the identifier for the current Snowflake account:
- Log in to Snowsight with your Snowflake account.
- In Snowsight, on the navigation menu, click your username, and then click Account > View account details.
- On the Account tab, note the value of the Account Identifier field.
-
A Snowflake user, which can be a service user (recommended) or a human user.
To create a service user entry and get their login name (not username):
- Log in to Snowsight with your Snowflake account.
- In Snowsight, on the navigation menu, click Projects > Worksheets.
- Click the + button to create a SQL worksheet.
-
In the worksheet, enter a Snowflake query that creates the service user. In the query:
- Replace <service-user-name> with a name for the service user.
- Replace <default-role-name> with the name of a default role for the service user to use.
- Click the arrow icon to run the worksheet, which creates the service user.
- To get their login name, on the navigation menu, click Admin > Users & Roles.
- On the Users tab, in the list of available users, click the name of the target user.
- In the About tile, note the Login Name for the user.
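The service user query described in the steps above can be sketched as follows. This is a minimal sketch, not the exact statement from the original page: it assumes the standard Snowflake CREATE USER syntax with TYPE = SERVICE and DEFAULT_ROLE, and the function name and the sample values are hypothetical. It renders the statement as text that you could paste into a worksheet.

```python
import re

# Conservative pattern for unquoted Snowflake identifiers: a letter or
# underscore, followed by letters, digits, underscores, or dollar signs.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def build_create_service_user_sql(service_user_name: str, default_role_name: str) -> str:
    """Render a CREATE USER statement for a service user (hypothetical helper)."""
    for value in (service_user_name, default_role_name):
        if not _IDENTIFIER.match(value):
            raise ValueError(f"not a simple unquoted identifier: {value!r}")
    return (
        f"CREATE USER {service_user_name}\n"
        f"  DEFAULT_ROLE = {default_role_name}\n"
        f"  TYPE = SERVICE;"
    )


if __name__ == "__main__":
    # Sample values are placeholders only.
    print(build_create_service_user_sql("unstructured_svc", "unstructured_role"))
```

Rejecting anything that is not a simple identifier keeps the sketch from producing a malformed statement; names that need quoting would have to be handled separately.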
To create a human user entry and get their login name (not username):
- Log in to Snowsight with your Snowflake account.
- In Snowsight, on the navigation menu, click Admin > Users & Roles.
- Click the Users tab.
- Click + User.
- Follow the on-screen guidance to specify the user’s settings.
- Click Create User.
- To get their login name, on the navigation menu, click Admin > Users & Roles.
- On the Users tab, in the list of available users, click the name of the target user.
- In the About tile, note the Login Name for the user.
-
A programmatic access token (PAT) for the Snowflake user.
To create a programmatic access token (PAT) for a user:
- Log in to Snowsight with your Snowflake account.
- In Snowsight, on the navigation menu, click Admin > Users & Roles.
- On the Users tab, in the list of available users, click the name of the target user.
- In the Programmatic access tokens tile, click the Generate new token button.
-
Follow the on-screen guidance to specify the PAT’s settings.
You must set an expiration date for the PAT. This expiration date can be as soon as one day after the PAT is created, or as late as one year or more afterward. Once the PAT expires, the connector will stop working. To keep your connector working, follow this procedure again before your current PAT expires to generate a new PAT, and then update your connector’s settings with the new PAT’s value.
Unstructured does not notify you when a PAT is about to expire or has already expired. You are responsible for tracking your PATs’ expiration dates and taking corrective action before they expire.
- Click Generate.
- Copy the generated PAT’s value to a secure location, as you will not be able to access it again. If you lose this PAT’s value, you will need to repeat this procedure to generate a new, replacement one.
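Because PAT expiration is not reported to you, one option is to record the expiration date when you generate the token and check it on a schedule. The following is a minimal sketch of such a check. The function names and the 14-day warning window are assumptions for illustration and are not part of Snowflake or Unstructured.

```python
from datetime import date


def days_until_expiry(expires_on: date, today: date) -> int:
    """Return the number of whole days between today and the expiration date."""
    return (expires_on - today).days


def needs_rotation(expires_on: date, today: date, warn_days: int = 14) -> bool:
    """Return True when the PAT expires within warn_days (or has already expired)."""
    return days_until_expiry(expires_on, today) <= warn_days


if __name__ == "__main__":
    # The expiration date here is a made-up example; use the date you chose
    # when generating the PAT.
    pat_expires_on = date(2025, 3, 1)
    if needs_rotation(pat_expires_on, date.today()):
        print("Generate a new PAT and update the connector settings.")
```

Passing the current date in as a parameter keeps the check easy to test and to run from a scheduler.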
To create a network rule for the Snowflake account:
- Log in to Snowsight with your Snowflake account.
- In Snowsight, on the navigation menu, click Admin > Security > Network Rules.
- Click + Network Rule.
- Enter some name for the network rule.
- For Type, select IPv4.
- For Mode, select Ingress.
-
For Identifiers, next to the magnifying glass icon, enter 0.0.0.0/0, and then press Enter.
The 0.0.0.0/0 value allows all IP addresses to access the Snowflake account. You can specify a more specific IP address range if you prefer. However, this more specific IP address range will apply to all users, including the user for which you created the PAT.
- Click Create Network Rule.
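To see what a given identifier covers before you enter it, you can test addresses against the CIDR range locally. The sketch below uses Python's standard ipaddress module and only illustrates CIDR matching; it is not how Snowflake evaluates network rules, and the helper name and sample addresses are hypothetical.

```python
import ipaddress
from typing import Iterable


def is_allowed(client_ip: str, identifiers: Iterable[str]) -> bool:
    """Return True if client_ip falls inside any of the CIDR ranges."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in identifiers)


if __name__ == "__main__":
    # 0.0.0.0/0 matches every IPv4 address.
    print(is_allowed("198.51.100.7", ["0.0.0.0/0"]))
    # A narrower range matches only addresses inside it.
    print(is_allowed("198.51.100.7", ["203.0.113.0/24"]))
```

If you choose a narrower range, check that every client that needs to reach the account, including wherever the connector runs, falls inside it.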
To create a network policy that uses the network rule:
- Log in to Snowsight with your Snowflake account.
- In Snowsight, on the navigation menu, click Admin > Security > Network Policies.
- Click + Network Policy.
- Enter some name for the network policy.
- Make sure Allowed is selected.
- In the Select rule drop-down list, select the preceding network rule to attach to this network policy.
- Click Create Network Policy.
To activate the network policy on the account:
- Log in to Snowsight with your Snowflake account.
- In Snowsight, on the navigation menu, click Admin > Security > Network Policies.
- Click the name of the preceding network policy to activate.
- In the policy’s side panel, click the ellipsis (three dots) icon, and then click Activate On Account.
- Click Activate policy.
- (No longer recommended, as passwords are being deprecated by Snowflake. Use PATs instead.) The Snowflake user’s login name (not username) and the user’s password in the account. This user must be a human user. Passwords are not supported for service users.
-
The name of the Snowflake role that the user belongs to and that also has sufficient access to the Snowflake database, schema, table, and host.
- To create a database in Snowflake, the role needs to be granted the CREATE DATABASE privilege at the current account level, and the USAGE privilege on the warehouse that is used to create the database.
- To create a schema in a database in Snowflake, the role needs to be granted the USAGE privilege on the database and on the warehouse that is used to create the schema, and the CREATE SCHEMA privilege on the database.
- To create a table in a schema in Snowflake, the role needs to be granted the USAGE privilege on the database, the schema, and the warehouse that is used to create the table, and the CREATE TABLE privilege on the schema.
- To write to a table in Snowflake, the role needs to be granted the USAGE privilege on the database, the schema, and the warehouse that is used to write to the table, and the INSERT privilege on the table.
- To read from a table in Snowflake, the role needs to be granted the USAGE privilege on the database, the schema, and the warehouse that is used to read from the table, and the SELECT privilege on the table.
To view the roles in the current Snowflake account:
- Log in to Snowsight with your Snowflake account.
- In Snowsight, on the navigation menu, click Admin > Users & Roles.
- Click the Roles tab.
Then grant the privileges listed above to the role.
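The read and write privileges listed above can be expressed as Snowflake GRANT statements. The following is a minimal sketch that renders those statements as text. The helper name and all object names are hypothetical; review the output against your own account before running it in a worksheet.

```python
from typing import List


def build_table_grants(
    role: str, warehouse: str, database: str, schema: str, table: str, write: bool = True
) -> List[str]:
    """Render GRANT statements for writing to (INSERT) or reading from (SELECT) a table."""
    table_privilege = "INSERT" if write else "SELECT"
    return [
        f"GRANT USAGE ON WAREHOUSE {warehouse} TO ROLE {role};",
        f"GRANT USAGE ON DATABASE {database} TO ROLE {role};",
        f"GRANT USAGE ON SCHEMA {database}.{schema} TO ROLE {role};",
        f"GRANT {table_privilege} ON TABLE {database}.{schema}.{table} TO ROLE {role};",
    ]


if __name__ == "__main__":
    # Placeholder names only.
    for statement in build_table_grants(
        "unstructured_role", "my_wh", "my_db", "my_schema", "elements"
    ):
        print(statement)
```

A destination connector needs the write variant; a source connector needs the read variant, which you get by passing write=False.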
-
The Snowflake warehouse’s hostname and its port number in the account.
To view a list of available warehouses in the current Snowflake account:
- Log in to Snowsight with your Snowflake account.
- In Snowsight, on the navigation menu, click Admin > Warehouses. This view does not provide access to the warehouses’ hostnames or port numbers. To get this information, you must run a Snowflake query.
In the query results, look for the row with a type of SNOWFLAKE_DEPLOYMENT.
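One way to get this information is Snowflake's SYSTEM$ALLOWLIST function, which returns a JSON list of entries that each have a type, host, and port. The sketch below assumes you have already run SELECT SYSTEM$ALLOWLIST() in a worksheet and copied the JSON result; whether the original page used this exact query is an assumption. The helper name and the sample hostnames are hypothetical.

```python
import json
from typing import Tuple


def find_deployment_endpoint(allowlist_json: str) -> Tuple[str, int]:
    """Return (host, port) from the first SNOWFLAKE_DEPLOYMENT entry."""
    for entry in json.loads(allowlist_json):
        if entry.get("type") == "SNOWFLAKE_DEPLOYMENT":
            return entry["host"], int(entry["port"])
    raise LookupError("no SNOWFLAKE_DEPLOYMENT entry found")


if __name__ == "__main__":
    # Made-up sample data in the shape described above.
    sample = json.dumps([
        {"type": "STAGE", "host": "stage.example.com", "port": 443},
        {"type": "SNOWFLAKE_DEPLOYMENT", "host": "abc123.example.com", "port": 443},
    ])
    print(find_deployment_endpoint(sample))
```

The returned host and port are the values to use for the connector's host and port settings.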
-
The name of the Snowflake database in the account.
To view a list of available databases in the current Snowflake account:
- Log in to Snowsight with your Snowflake account.
- In Snowsight, on the navigation menu, click Data > Databases.
-
The name of the schema in the database.
To view a list of available schemas for a database in the current Snowflake account:
- Log in to Snowsight with your Snowflake account.
- In Snowsight, on the navigation menu, click Data > Databases.
- Expand the name of the target database.
Alternatively, a Snowflake query such as SHOW SCHEMAS IN DATABASE <database_name> returns a list of available schemas for the database named <database_name> in the current account.
-
The name of the table in the schema.
To view a list of available tables for a schema in a database in the current Snowflake account:
- Log in to Snowsight with your Snowflake account.
- In Snowsight, on the navigation menu, click Data > Databases.
- Expand the name of the database that contains the target schema.
- Expand the name of the target schema.
- Expand Tables.
Alternatively, a Snowflake query such as SHOW TABLES IN SCHEMA <database_name>.<schema_name> returns a list of available tables for the schema named <schema_name> in the database named <database_name> in the current account.
Snowflake requires the target table to have a defined schema before Unstructured can write to the table. The recommended table schema for Unstructured is defined with a CREATE TABLE statement. In that statement, replace the following placeholders with the appropriate values:
- <database_name>: The name of the target database in the Snowflake account.
- <schema_name>: The name of the target schema in the database.
- <number-of-dimensions>: The number of dimensions for any embeddings that you plan to use. This value must match the number of dimensions for any embeddings that are specified in your related Unstructured workflows or pipelines. If you plan to use Snowflake vector embedding generation or Snowflake vector search, this value must match the number of dimensions that you plan to have Snowflake generate or search against.
-
The name of the column in the table that uniquely identifies each record (for example, RECORD_ID).
CLI, Python
- SNOWFLAKE_ACCOUNT - The ID of the target Snowflake account, represented by --account (CLI) or account (Python).
- SNOWFLAKE_USER - The name of the target Snowflake user, represented by --user (CLI) or user (Python).
- SNOWFLAKE_PROGRAMMATIC_ACCESS_TOKEN - The user’s programmatic access token (PAT), represented by --password (CLI) or password (Python). Specifying a password is no longer recommended, as passwords are being deprecated by Snowflake. Use a PAT instead.
- SNOWFLAKE_ROLE - The target role for the user, represented by --role (CLI) or role (Python).
- SNOWFLAKE_HOST - The hostname for the target Snowflake warehouse, represented by --host (CLI) or host (Python).
- SNOWFLAKE_PORT - The warehouse’s port number, represented by --port (CLI) or port (Python). The default is 443 if not otherwise specified.
- SNOWFLAKE_DATABASE - The name of the target Snowflake database, represented by --database (CLI) or database (Python).
- SNOWFLAKE_SCHEMA - The name of the target schema in the database, represented by --schema (CLI) or schema (Python).
- SNOWFLAKE_TABLE - The name of the target table in the schema, represented by --table-name (CLI) or table_name (Python). For the destination connector, the default is elements if not otherwise specified.
- SNOWFLAKE_RECORD_ID_KEY - The name of the column in the table that uniquely identifies each record, represented by:
  - For the source connector, --id-column (CLI) or id_column (Python).
  - For the destination connector, --record-id-key (CLI) or record_id_key (Python). For the destination connector, the default is record_id if not otherwise specified.
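The settings listed above can be collected from the environment before they are handed to the connector. The following is a minimal sketch that uses only the standard library. The SnowflakeSettings class and load_settings function are hypothetical and are not part of the Unstructured Ingest library; the field names mirror the Python parameter names listed above, and the defaults are the destination connector defaults stated above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Python parameter name -> environment variable, for values with no default.
_REQUIRED = {
    "account": "SNOWFLAKE_ACCOUNT",
    "user": "SNOWFLAKE_USER",
    "password": "SNOWFLAKE_PROGRAMMATIC_ACCESS_TOKEN",
    "role": "SNOWFLAKE_ROLE",
    "host": "SNOWFLAKE_HOST",
    "database": "SNOWFLAKE_DATABASE",
    "schema": "SNOWFLAKE_SCHEMA",
}


@dataclass
class SnowflakeSettings:
    account: str
    user: str
    password: str
    role: str
    host: str
    database: str
    schema: str
    port: int = 443
    table_name: str = "elements"
    record_id_key: str = "record_id"


def load_settings(env: Optional[Mapping[str, str]] = None) -> SnowflakeSettings:
    """Build settings from environment variables, failing early on missing ones."""
    env = os.environ if env is None else env
    missing = [name for name in _REQUIRED.values() if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    values = {field: env[name] for field, name in _REQUIRED.items()}
    return SnowflakeSettings(
        **values,
        port=int(env.get("SNOWFLAKE_PORT", "443")),
        table_name=env.get("SNOWFLAKE_TABLE", "elements"),
        record_id_key=env.get("SNOWFLAKE_RECORD_ID_KEY", "record_id"),
    )
```

Failing early on a missing variable gives a clearer error than a connection failure later. For the source connector, the record ID value would be passed as id_column instead of record_id_key.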
Use the --partition-by-api option (CLI) or the partition_by_api parameter (Python) to specify where files are processed:
- To do local file processing, omit --partition-by-api (CLI) or partition_by_api (Python), or explicitly specify partition_by_api=False (Python). Local file processing does not use an Unstructured API key or API URL, so you can also omit the following, if they appear:
  - --api-key $UNSTRUCTURED_API_KEY (CLI) or api_key=os.getenv("UNSTRUCTURED_API_KEY") (Python)
  - --partition-endpoint $UNSTRUCTURED_API_URL (CLI) or partition_endpoint=os.getenv("UNSTRUCTURED_API_URL") (Python)
  - The environment variables UNSTRUCTURED_API_KEY and UNSTRUCTURED_API_URL
- To send files to the Unstructured Partition Endpoint for processing, specify --partition-by-api (CLI) or partition_by_api=True (Python). Unstructured also requires an Unstructured API key and API URL, which you provide by adding the following:
  - --api-key $UNSTRUCTURED_API_KEY (CLI) or api_key=os.getenv("UNSTRUCTURED_API_KEY") (Python)
  - --partition-endpoint $UNSTRUCTURED_API_URL (CLI) or partition_endpoint=os.getenv("UNSTRUCTURED_API_URL") (Python)
  - The environment variables UNSTRUCTURED_API_KEY and UNSTRUCTURED_API_URL, representing your API key and API URL, respectively.
You must specify the API URL only if you are not using the default API URL for Unstructured Ingest, which applies to Starter and Team accounts.
The default API URL for Unstructured Ingest is https://api.unstructuredapp.io/general/v0/general, which is the API URL for the Unstructured Partition Endpoint. However, you should always use the URL that was provided to you when your Unstructured account was created. If you do not have this URL, email Unstructured Support at support@unstructured.io.
If you do not have an API key, get one now.
If you are using an Enterprise account, the process for generating Unstructured API keys, and the Unstructured API URL that you use, are different. For instructions, see your Unstructured account administrator, or email Unstructured Support at support@unstructured.io.
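The choice between local processing and the Unstructured Partition Endpoint can be sketched as a small helper that assembles the relevant keyword arguments. This is an illustration only: the partitioner_kwargs function is hypothetical, and how the resulting dictionary is passed to the Unstructured Ingest Python library is left as an assumption. The keys partition_by_api, api_key, and partition_endpoint are the parameter names given above.

```python
import os
from typing import Dict, Mapping, Optional


def partitioner_kwargs(
    partition_by_api: bool, env: Optional[Mapping[str, str]] = None
) -> Dict[str, object]:
    """Return keyword arguments for local or API-based file processing."""
    env = os.environ if env is None else env
    if not partition_by_api:
        # Local processing needs neither an API key nor an API URL.
        return {"partition_by_api": False}
    api_key = env.get("UNSTRUCTURED_API_KEY")
    if not api_key:
        raise ValueError("UNSTRUCTURED_API_KEY is required when partition_by_api=True")
    kwargs: Dict[str, object] = {"partition_by_api": True, "api_key": api_key}
    api_url = env.get("UNSTRUCTURED_API_URL")
    if api_url:
        # Only set the endpoint when a non-default URL was provided.
        kwargs["partition_endpoint"] = api_url
    return kwargs
```

Leaving partition_endpoint unset when UNSTRUCTURED_API_URL is absent matches the note above that the URL is needed only when you are not using the default.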