Airtable
Connect Airtable to your preprocessing pipeline, and batch process all your documents with unstructured-ingest to store structured outputs locally on your filesystem.
Make sure to have the Airtable dependencies installed:
pip install "unstructured-ingest[airtable]"
Before connecting your preprocessing pipeline to Airtable, obtain a personal access token to authenticate with Airtable. See the Airtable documentation for more information.
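Once you have a token, a common pattern is to expose it to the ingest command through an environment variable. The variable name AIRTABLE_PERSONAL_ACCESS_TOKEN below is an assumed convention, not something the tool requires:

```shell
# Export the Airtable personal access token so the ingest command can
# read it from the environment; replace the placeholder with your token.
export AIRTABLE_PERSONAL_ACCESS_TOKEN="<your-airtable-personal-access-token>"
```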
Unless otherwise specified, Unstructured will process all tables in every base within an Airtable organization.
Optionally, you can specify which locations within Airtable to ingest data from using the --list-of-paths argument (list_of_paths in the Python example).
An Airtable path has the following structure: base_id/table_id(optional)/view_id(optional)/
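For illustration, the sketch below passes two such paths to the CLI: one covering an entire base and one narrowed to a single table. The base and table IDs are hypothetical placeholders, and the space-separated string syntax for --list-of-paths is an assumption; check `unstructured-ingest airtable --help` for the exact flag behavior in your installed version:

```shell
# Restrict ingestion to one whole base plus one table in another base.
# appA1b2C3d4E5f6G7 and tblO5p6Q7r8S9t0U1 are made-up placeholder IDs.
unstructured-ingest airtable \
  --personal-access-token "$AIRTABLE_PERSONAL_ACCESS_TOKEN" \
  --output-dir airtable-ingest-output \
  --list-of-paths "appA1b2C3d4E5f6G7 appH8i9J0k1L2m3N4/tblO5p6Q7r8S9t0U1"
```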
Refer to the Airtable documentation to learn how to obtain IDs in bulk.
Finally, make sure to set the --partition-by-api flag and pass in your API key with --api-key:
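Putting this together, an invocation might look like the sketch below. The subcommand and flag spellings can vary between unstructured-ingest releases, so treat this as an assumed shape rather than a verified command:

```shell
# Batch process all tables the token can reach, partitioning via the
# hosted API, and write structured JSON outputs to a local directory.
unstructured-ingest airtable \
  --personal-access-token "$AIRTABLE_PERSONAL_ACCESS_TOKEN" \
  --output-dir airtable-ingest-output \
  --num-processes 2 \
  --partition-by-api \
  --api-key "$UNSTRUCTURED_API_KEY"
```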
Additionally, if you’re using the Unstructured Serverless API, a locally deployed Unstructured API, or an Unstructured API deployed on Azure or AWS, you also need to specify the API URL via the --partition-endpoint argument.
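For the Python route, here is a sketch based on the v1 unstructured.ingest runner interface. The class names (AirtableRunner, SimpleAirtableConfig, AirtableAccessConfig) and keyword arguments match older releases of the library and may differ in current versions, so verify them against the installed package before relying on this:

```python
import os

from unstructured.ingest.connector.airtable import (
    AirtableAccessConfig,
    SimpleAirtableConfig,
)
from unstructured.ingest.interfaces import (
    PartitionConfig,
    ProcessorConfig,
    ReadConfig,
)
from unstructured.ingest.runner import AirtableRunner

if __name__ == "__main__":
    runner = AirtableRunner(
        processor_config=ProcessorConfig(
            verbose=True,
            output_dir="airtable-ingest-output",  # where structured outputs land
            num_processes=2,
        ),
        read_config=ReadConfig(),
        partition_config=PartitionConfig(
            # Partition via the hosted API; endpoint is only needed for
            # Serverless, local, or Azure/AWS deployments.
            partition_by_api=True,
            api_key=os.getenv("UNSTRUCTURED_API_KEY"),
            partition_endpoint=os.getenv("UNSTRUCTURED_API_URL"),
        ),
        connector_config=SimpleAirtableConfig(
            access_config=AirtableAccessConfig(
                personal_access_token=os.getenv("AIRTABLE_PERSONAL_ACCESS_TOKEN"),
            ),
            # Pass a list of base_id/table_id/view_id paths to restrict
            # ingestion; leave as None to process every table in the org.
            list_of_paths=None,
        ),
    )
    runner.run()
```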