If you’re new to Unstructured, read this note first.

Before you can create a source connector, you must first sign in to your Unstructured account:
- If you do not already have an Unstructured account, sign up for free. After you sign up, you are automatically signed in to your new Unstructured Starter account, at https://platform.unstructured.io. To sign up for a Team or Enterprise account instead, contact Unstructured Sales, or learn more.
- If you already have an Unstructured Starter or Team account and are not already signed in, sign in to your account at https://platform.unstructured.io. For an Enterprise account, see your Unstructured account administrator for instructions, or email Unstructured Support at support@unstructured.io.
- A Google Cloud account.
- The Google Drive API enabled in the account. Learn how.
- Within the account, a Google Cloud service account and its related `credentials.json` key file or its contents in JSON format. Create a service account. Create credentials for a service account.

  To ensure maximum compatibility across Unstructured service offerings, you should give the service account key information to Unstructured as a single-line string that contains the contents of the downloaded service account key file (and not the service account key file itself). To print this single-line string without line breaks, suitable for copying, you can run one of the following commands from your Terminal or Command Prompt. In this command, replace `<path-to-downloaded-key-file>` with the path to the `credentials.json` key file that you downloaded by following the preceding instructions.

  - For macOS or Linux:
  - For Windows:
- A Google Drive shared folder or shared drive.
- Give the service account access to the shared folder or shared drive. To do this, share the folder or drive with the service account’s email address. Learn how. Learn more.
- Get the shared folder’s ID or shared drive’s ID. This is a part of the URL for your Google Drive shared folder or shared drive, represented in the following URL as `{folder_id}`: `https://drive.google.com/drive/folders/{folder_id}`.
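The two preparation steps above (producing the single-line key string and reading the ID out of the folder URL) can be sketched in Python. This is a minimal illustration and not an official Unstructured tool: the function names `key_file_to_single_line` and `folder_id_from_url` are hypothetical, and the URL pattern assumes the `/folders/{folder_id}` form shown above.

```python
import json
import re
from pathlib import Path


def key_file_to_single_line(path: str) -> str:
    """Return the contents of a service account key file as one line of JSON."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # No indent plus compact separators yields a string with no line breaks;
    # newlines inside values (such as the private key) stay escaped as \n.
    return json.dumps(data, separators=(",", ":"))


def folder_id_from_url(url: str) -> str:
    """Extract the ID that follows /folders/ in a Google Drive URL."""
    match = re.search(r"/folders/([^/?#]+)", url)
    if match is None:
        raise ValueError(f"No folder ID found in URL: {url}")
    return match.group(1)
```

For example, `print(key_file_to_single_line("credentials.json"))` prints a string that can be copied as-is, provided a key file exists at that path.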
Document permissions metadata
The source connector outputs any permissions information that it can find in the source location about the processed source documents and associates that information with each corresponding element that is generated. This permissions information is output into the `permissions_data` field, which is within the `data_source` field under the element’s `metadata` field. This information lists the users or groups, if any, that have permissions to read, update, or delete the element’s associated source document.
The following example shows what the output looks like. Ellipses indicate content that has been omitted from this example for brevity.
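As a rough illustration of how this nesting might be consumed, the Python sketch below walks `metadata` → `data_source` → `permissions_data` on a single element. The exact shape of `permissions_data` is an assumption here (a list of single-key objects keyed by `read`, `update`, or `delete`, each holding `users` and `groups` lists); the helper name `permissions_for` and all IDs are hypothetical placeholders, not real connector output.

```python
from typing import Any


def permissions_for(element: dict[str, Any]) -> dict[str, dict[str, list[str]]]:
    """Collect read/update/delete principals from an element's permissions_data."""
    data_source = element.get("metadata", {}).get("data_source", {})
    merged: dict[str, dict[str, list[str]]] = {}
    # Assumed shape: [{"read": {"users": [...], "groups": [...]}}, ...]
    for entry in data_source.get("permissions_data") or []:
        for action, principals in entry.items():
            bucket = merged.setdefault(action, {"users": [], "groups": []})
            bucket["users"].extend(principals.get("users", []))
            bucket["groups"].extend(principals.get("groups", []))
    return merged


# Hypothetical element with placeholder IDs, for illustration only.
example_element = {
    "metadata": {
        "data_source": {
            "permissions_data": [
                {"read": {"users": ["user-id-1"], "groups": ["group-id-1"]}},
                {"update": {"users": ["user-id-1"], "groups": []}},
                {"delete": {"users": [], "groups": []}},
            ]
        }
    }
}
```

Calling `permissions_for(example_element)` returns one entry per action, which makes it straightforward to check who can read a document before exposing its elements downstream.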
- On the sidebar, click Connectors.
- Click Sources.
- Click New or Create Connector.
- Give the connector some unique Name.
- In the Provider area, click Google Drive.
- Click Continue.
- Follow the on-screen instructions to fill in the fields as described later on this page.
- Click Save and Test.
- Name (required): A unique name for this connector.
- Drive ID (required): The target folder’s or drive’s ID.
- Extensions: A comma-separated list of any file extensions to be included in the ingestion process (such as `jpg,pdf`), if filtering is needed. The default is to include all files, if not otherwise specified. Do not include the leading dot in the file extensions. For example, use `jpg` or `pdf` instead of `.jpg` or `.pdf`.
- Recursive: Check this box to also access files from all subfolders within the folder or drive.
- Account Key (required): The contents of the `credentials.json` key file for the target service account. These contents must be expressed as a single-line string without line breaks.
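Before pasting values into the fields above, it may help to check them locally. The following Python sketch shows one way to do that, assuming only the rules stated in the field descriptions (no leading dots in extensions, a single-line JSON account key). The helper names `normalize_extensions` and `check_account_key` are hypothetical and are not part of any Unstructured library.

```python
import json


def normalize_extensions(raw: str) -> str:
    """Strip spaces and leading dots, e.g. '.jpg, pdf' becomes 'jpg,pdf'."""
    parts = [part.strip().lstrip(".") for part in raw.split(",")]
    # Drop empty entries left by stray commas.
    return ",".join(part for part in parts if part)


def check_account_key(key: str) -> None:
    """Raise ValueError if the key is not single-line, parseable JSON."""
    if "\n" in key or "\r" in key:
        raise ValueError("Account key must be a single line without line breaks.")
    # json.JSONDecodeError is a subclass of ValueError.
    json.loads(key)
```

An empty Extensions value stays empty after normalization, which matches the default of including all files.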