Unstructured Serverless API
This page describes how to get started with the Unstructured Serverless API, including how to obtain the API key and API URL that you need to call it.
Get started
To call the Unstructured Serverless API, you need an API key and API URL:
- Go to https://app.unstructured.io.
- Enter your email address and then click Sign In to receive a magic link to sign in to your personalized dashboard, or authenticate yourself with your Google or GitHub account.
- Once you authenticate with the magic link or with your Google or GitHub account, your dashboard appears.
- On the sidebar, click API Keys, if it is not already selected.
- To get your API key, click the copy icon in the Actions column for your API key. Store your copied API key in a secure location. Do not share it with others.
- To get your API URL, click the copy icon next to the URL shown for API URL. Store your copied API URL in a secure location. Do not share it with others.
Unstructured Serverless API keys do not work with the Free Unstructured API. If you try to use an Unstructured Serverless API key with a Free Unstructured API URL, the call will fail. Use your Unstructured Serverless API URL instead.
Set up billing
Once you sign up for the Unstructured Serverless API, you can enjoy a free 14-day trial with usage capped at 1000 pages per day.
At the end of the 14-day free trial, or if you need to go past the trial’s page processing limits during the 14-day free trial, you must set up your billing information to keep using the Unstructured Serverless API:
- Go to https://app.unstructured.io and sign in.
- On the sidebar, click Billing.
- Click Manage Payment Method, follow the on-screen instructions to enter your payment details through Stripe, and then click Save card.
Your card is billed monthly based on your usage. The Billing page shows a billing overview for the current month and a list of your billing invoices.
We calculate a page as follows:
- For these file types, a page is a page, slide, or image: .pdf, .pptx, and .tiff.
- For .docx files that have page metadata, we calculate the number of pages based on that metadata.
- For all other file types, we calculate the number of pages as the file’s size divided by 100 KB.
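The page-counting rules above can be sketched in a few lines of Python. This is only an illustration for estimating usage, not Unstructured's billing code. The function name estimate_billable_pages is hypothetical, and the sketch assumes that 1 KB is 1000 bytes, that a partial page rounds up, and that every file counts as at least one page. None of those details are stated on this page.

```python
import math
from typing import Optional

# File types where a page, slide, or image counts directly as one page.
COUNTED_TYPES = {".pdf", ".pptx", ".tiff"}

# Assumption: 100 KB means 100,000 bytes. The page does not say whether
# KB is 1000 or 1024 bytes.
BYTES_PER_PAGE = 100 * 1000


def estimate_billable_pages(
    extension: str, size_bytes: int, page_count: Optional[int] = None
) -> int:
    """Estimate how many pages a file counts as, following the rules above.

    page_count is the number of pages, slides, or images in the file,
    or the page metadata of a .docx file, when that is known.
    """
    ext = extension.lower()
    if ext in COUNTED_TYPES:
        if page_count is None:
            raise ValueError(f"page_count is required for {ext} files")
        return page_count
    if ext == ".docx" and page_count is not None:
        return page_count
    # All other file types: file size divided by 100 KB.
    # Assumption: round up, and count at least one page per file.
    return max(1, math.ceil(size_bytes / BYTES_PER_PAGE))


print(estimate_billable_pages(".pdf", 5_000_000, page_count=12))
print(estimate_billable_pages(".txt", 250_000))
```

An estimate like this can help you judge whether a batch of files fits within the trial's daily page limit before you send it.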
Quickstart
These examples use your local machine. They send source (input) files from your local machine to the Unstructured Serverless API, which delivers the processed data to a destination (output) location, also on your local machine. Data is processed on Unstructured-hosted compute resources.
Unstructured Ingest CLI
To work with the Unstructured Serverless API by using the Unstructured Ingest CLI, you will need to:
- Install Python, and then install the CLI package:
  pip install unstructured-ingest
- Set the following environment variables:
  - Set UNSTRUCTURED_API_KEY to your API key.
  - Set UNSTRUCTURED_API_URL to your API URL.
- Have some compatible files on your local machine to be processed. See the list of supported file types. If you do not have any files available, you can download some from the example-docs folder in the Unstructured repo on GitHub.
Now, use the CLI to call the API, replacing:
- <path/to/input> with the source (input) path to the directory on your local machine that contains the compatible files for Unstructured to process on its hosted compute resources.
- <path/to/output> with the destination (output) path to the directory on your local machine that will contain the processed data that Unstructured returns from its hosted compute resources.
unstructured-ingest \
  local \
    --input-path <path/to/input> \
    --output-dir <path/to/output> \
    --partition-by-api \
    --api-key $UNSTRUCTURED_API_KEY \
    --partition-endpoint $UNSTRUCTURED_API_URL \
    --strategy hi_res \
    --additional-partition-args="{\"split_pdf_page\":\"true\", \"split_pdf_allow_failed\":\"true\", \"split_pdf_concurrency_level\": 15}"
After the command successfully runs, see the results in the specified output path on your local machine.
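To get a quick overview of the results, you could summarize the files in the output path with a short script. The following is a minimal sketch, assuming that each output file is a JSON array of element objects and that each element has a "type" key. The function name summarize_output is hypothetical. Check one of your own output files to confirm its structure before relying on this.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_output(output_dir: str) -> Counter:
    """Count elements by type across all JSON files under output_dir.

    Assumption: each file holds a JSON array of objects with a "type" key.
    Files with any other top-level structure are skipped.
    """
    counts: Counter = Counter()
    for path in sorted(Path(output_dir).rglob("*.json")):
        with path.open(encoding="utf-8") as handle:
            elements = json.load(handle)
        if not isinstance(elements, list):
            continue
        for element in elements:
            if isinstance(element, dict):
                counts[element.get("type", "unknown")] += 1
    return counts
```

For example, calling summarize_output with your output path and printing the result shows how many elements of each type were produced.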
Unstructured Ingest Python library
To work with the Unstructured Serverless API by using the Unstructured Ingest Python library, you will need to:
- Install Python, and then install the unstructured-ingest package:
  pip install unstructured-ingest
- Set the following environment variables:
  - Set UNSTRUCTURED_API_KEY to your API key.
  - Set UNSTRUCTURED_API_URL to your API URL.
- Have some compatible files on your local machine to be processed. See the list of supported file types. If you do not have any files available, you can download some from the example-docs folder in the Unstructured repo on GitHub.
Now, use the Python library to call the API. The following code reads the source and destination paths from two additional environment variables, so set them first:
- Set LOCAL_FILE_INPUT_DIR to the source (input) path to the directory on your local machine that contains the compatible files for Unstructured to process on its hosted compute resources.
- Set LOCAL_FILE_OUTPUT_DIR to the destination (output) path to the directory on your local machine that will contain the processed data that Unstructured returns from its hosted compute resources.
import os

from unstructured_ingest.v2.pipeline.pipeline import Pipeline
from unstructured_ingest.v2.interfaces import ProcessorConfig
from unstructured_ingest.v2.processes.connectors.local import (
    LocalIndexerConfig,
    LocalDownloaderConfig,
    LocalConnectionConfig,
    LocalUploaderConfig
)
from unstructured_ingest.v2.processes.partitioner import PartitionerConfig

if __name__ == "__main__":
    Pipeline.from_configs(
        context=ProcessorConfig(),
        indexer_config=LocalIndexerConfig(input_path=os.getenv("LOCAL_FILE_INPUT_DIR")),
        downloader_config=LocalDownloaderConfig(),
        source_connection_config=LocalConnectionConfig(),
        partitioner_config=PartitionerConfig(
            partition_by_api=True,
            api_key=os.getenv("UNSTRUCTURED_API_KEY"),
            partition_endpoint=os.getenv("UNSTRUCTURED_API_URL"),
            strategy="hi_res",
            additional_partition_args={
                "split_pdf_page": True,
                "split_pdf_allow_failed": True,
                "split_pdf_concurrency_level": 15
            }
        ),
        uploader_config=LocalUploaderConfig(output_dir=os.getenv("LOCAL_FILE_OUTPUT_DIR"))
    ).run()
After the script runs successfully, see the results in the specified output path on your local machine.
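The script above reads four environment variables with os.getenv, which returns None when a variable is not set. A small pre-flight check, sketched below, can report missing settings before the pipeline starts. The helper name missing_settings is hypothetical and is not part of the unstructured-ingest package.

```python
import os
from typing import List, Mapping, Optional

# The environment variables that the pipeline script above reads.
REQUIRED_VARS = (
    "UNSTRUCTURED_API_KEY",
    "UNSTRUCTURED_API_URL",
    "LOCAL_FILE_INPUT_DIR",
    "LOCAL_FILE_OUTPUT_DIR",
)


def missing_settings(environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
print(missing_settings({"UNSTRUCTURED_API_KEY": "example-key"}))
```

Calling missing_settings() with no arguments checks the real environment. If the returned list is not empty, set the listed variables before you run the pipeline.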
Manage your account
To manage your account, begin by going to https://app.unstructured.io and signing in.
To manage your API keys:
- On the sidebar, click API Keys.
- To create a key, click Generate New Key, and follow the on-screen instructions.
- To enable or disable a key, use the On/Off switch for that key.
- To delete a key, click the trash can in the Actions column for that key.
To view your usage: On the sidebar, click Usage.
To view your billing costs and invoices and to manage your payment method: On the sidebar, click Billing.
To log out of your account: On the sidebar, click your email address, and then click Logout.
If you need direct assistance, our support team is just an email away. Contact us at support@unstructured.io.
Telemetry
We’ve partnered with Scarf to collect anonymized user statistics to understand which features our community is using and how to prioritize product decision-making in the future.
To learn more about how we collect and use this data, please read our Privacy Policy.
To opt out of this data collection, you can set the environment variable SCARF_NO_ANALYTICS=true before running any commands that call Unstructured Serverless API services.
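If you launch commands from a Python script, one way to apply this opt-out is to pass the variable in the environment of the child process, so that it is set before the command starts. The following is a minimal sketch. The helper name env_without_telemetry is hypothetical.

```python
import os
from typing import Dict, Mapping, Optional


def env_without_telemetry(
    base_env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of the environment with the telemetry opt-out set."""
    env = dict(os.environ if base_env is None else base_env)
    env["SCARF_NO_ANALYTICS"] = "true"
    return env


child_env = env_without_telemetry()
print(child_env["SCARF_NO_ANALYTICS"])
```

You can then pass child_env as the env argument of subprocess.run when you start a command that calls the API. The helper builds a copy, so the environment of the parent process is left unchanged.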