Google Drive
If you’re new to Unstructured, read this note first.
Before you can create a source connector, you must first sign up for Unstructured and get your Unstructured API key. After you sign up, the Unstructured user interface (UI) appears, which you use to get the key. To learn how, watch this 40-second how-to video.
After you create the source connector, add it along with a destination connector to a workflow. Then run the workflow as a job. To learn how, try out the hands-on Workflow Endpoint quickstart, go directly to the quickstart notebook, or watch the two 4-minute video tutorials for the Unstructured Python SDK.
You can also create source connectors with the Unstructured user interface (UI). Learn how.
If you need help, reach out to the community on Slack, or contact us directly.
You are now ready to start creating a source connector! Keep reading to learn how.
Ingest your files into Unstructured from Google Drive.
The requirements are as follows.
-
The Google Drive API enabled in the account. Learn how.
-
Within the account, a Google Cloud service account and its related credentials.json key file or its contents in JSON format. Create a service account. Create credentials for a service account.
To ensure maximum compatibility across Unstructured service offerings, give the service account key information to Unstructured as a single-line string that contains the contents of the downloaded service account key file (and not the service account key file itself). To print this single-line string without line breaks, suitable for copying, you can run one of the following commands from your Terminal or Command Prompt. In this command, replace <path-to-downloaded-key-file> with the path to the credentials.json key file that you downloaded by following the preceding instructions.
-
For macOS or Linux:
-
For Windows:
-
-
A Google Drive shared folder or shared drive.
-
Give the service account access to the shared folder or shared drive. To do this, share the folder or drive with the service account’s email address. Learn how. Learn more.
-
Get the shared folder’s ID or shared drive’s ID. This is a part of the URL for your Google Drive shared folder or shared drive, represented in the following URL as {folder-id}: https://drive.google.com/drive/folders/{folder-id}
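As a cross-platform alternative to the terminal commands, the two preparation steps above (flattening the key file to one line and reading the ID out of the folder URL) can be sketched in Python. This is a minimal sketch using only the standard library. The function names are hypothetical helpers, not part of any Unstructured SDK, and the URL pattern assumes the https://drive.google.com/drive/folders/{folder-id} form shown above.

```python
import json
import re
from pathlib import Path


def key_file_to_single_line(path: str) -> str:
    """Return the contents of a JSON key file as one line with no line breaks."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # Compact separators drop the pretty-printing whitespace; newlines inside
    # string values (such as the private key) stay escaped as \n.
    return json.dumps(data, separators=(",", ":"))


def extract_folder_id(url: str) -> str:
    """Pull the ID segment out of a .../drive/folders/{folder-id} URL."""
    match = re.search(r"/drive/folders/([^/?#]+)", url)
    if match is None:
        raise ValueError(f"No folder ID found in URL: {url}")
    return match.group(1)


if __name__ == "__main__":
    # Hypothetical URL, for illustration only.
    print(extract_folder_id("https://drive.google.com/drive/folders/abc123"))  # prints abc123
```

Parsing and re-serializing the file, instead of only deleting line breaks, also confirms that the key file is valid JSON before you paste it anywhere.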
Document permissions metadata
The source connector outputs any permissions information that it can find in the source location about the processed source documents and associates that information with each corresponding element that is generated. This permissions information is output into the permissions_data field, which is within the data_source field under the element’s metadata field. This information lists the users or groups, if any, that have permissions to read, update, or delete the element’s associated source document.
The following example shows what the output looks like. Ellipses indicate content that has been omitted from this example for brevity.
To look up information about a particular Google Cloud user, use the user’s ID along with the Admin SDK API or the People API for Google Cloud.
To look up information about a particular Google Cloud group, use the group’s ID along with the Admin SDK API or the Cloud Identity API for Google Cloud.
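To work with this metadata after a job finishes, you might read the permissions_data field from each element. The following is a minimal sketch that relies only on the nesting described above (metadata, then data_source, then permissions_data). The helper name and the sample element are hypothetical, and the inner shape of the sample entry is an assumption for illustration, not a documented format.

```python
from typing import Any


def get_permissions_data(element: dict[str, Any]) -> list[Any]:
    """Return metadata.data_source.permissions_data, or [] if any level is missing."""
    metadata = element.get("metadata") or {}
    data_source = metadata.get("data_source") or {}
    return data_source.get("permissions_data") or []


# Hypothetical element; the entry inside permissions_data is an assumed shape.
sample_element = {
    "type": "NarrativeText",
    "metadata": {
        "data_source": {
            "permissions_data": [
                {"read": {"users": ["hypothetical-user-id"], "groups": []}}
            ]
        }
    },
}

permissions = get_permissions_data(sample_element)
no_permissions = get_permissions_data({"type": "Title", "metadata": {}})
```

Returning an empty list when a level is missing keeps downstream code simple, because the source states that users or groups are listed only "if any" are found.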
To create a Google Drive source connector, see the following examples.
Replace the preceding placeholders as follows:
-
<name> (required) - A unique name for this connector.
-
<drive-id> - The ID for the target Google Drive folder or drive.
-
<service-account-key> - The contents of the credentials.json key file as a single-line string.
-
For extensions, set one or more <extension> values (such as pdf or docx) to process files with only those extensions. The default is to include all extensions.
Do not include the leading dot in the file extensions. For example, use pdf or docx instead of .pdf or .docx.
-
Set recursive to true to recursively process data from subfolders within the target folder or drive. The default is false if not otherwise specified.
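The placeholder rules above can be checked before you send anything to Unstructured. The following is a minimal sketch, using only the Python standard library, that assembles the connector settings and enforces the single-line key and no-leading-dot rules. The function names are hypothetical, and the dictionary keys drive_id and service_account_key are assumptions inferred from the placeholder names; only extensions and recursive are named in the text above.

```python
import json


def normalize_extensions(extensions):
    """Strip whitespace and leading dots, so '.pdf' becomes 'pdf'."""
    cleaned = (ext.strip().lstrip(".") for ext in extensions)
    return [ext for ext in cleaned if ext]


def build_google_drive_config(drive_id, service_account_key,
                              extensions=None, recursive=False):
    """Assemble connector settings (key names are assumed, not confirmed)."""
    if "\n" in service_account_key or "\r" in service_account_key:
        raise ValueError("service_account_key must be a single-line string")
    json.loads(service_account_key)  # raises ValueError if not valid JSON
    config = {
        "drive_id": drive_id,
        "service_account_key": service_account_key,
        "recursive": recursive,  # false by default, as described above
    }
    if extensions:
        # Omitting the key keeps the default of including all extensions.
        config["extensions"] = normalize_extensions(extensions)
    return config


# Hypothetical values, for illustration only.
example = build_google_drive_config(
    drive_id="abc123",
    service_account_key='{"type":"service_account"}',
    extensions=[".pdf", "docx"],
    recursive=True,
)
```

Normalizing the extensions in one place means a stray .pdf in user input does not silently filter out every file.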