Get an API key

The Free Unstructured API requires authentication via an API key. Here’s how you can obtain your API key:

  1. Go to https://unstructured.io/api-key-free.
  2. Fill out the registration form. Make sure your contact information (especially your email address) is valid.
  3. Check the I agree box if you consent to Unstructured contacting you about our products and services.
  4. Click the Terms and Conditions link, read it, and check the related box to agree.
  5. Click Submit. You will receive a Free Unstructured API key at the email address you provided. Store your API key in a secure location. Do not share it with others.
  6. For the Free Unstructured API, you need to provide an API URL only when you make POST requests. The API URL for POST requests is always https://api.unstructured.io/general/v0/general. In all other cases, the API URL is an empty string.

Free Unstructured API keys do not work with the Unstructured Serverless API. If you try to use a Free Unstructured API key with an Unstructured Serverless API URL, the call will fail. Use your Free Unstructured API URL instead.

Try the quickstart.

Free Unstructured API limitations

The Free Unstructured API is designed for prototyping purposes, and not for production use:

  • The API usage is limited to 1000 pages per month.
  • Unlike users of the Unstructured Serverless API, users of the Free Unstructured API do not get their own dedicated infrastructure.
  • The data sent over the Free Unstructured API can be used for model training purposes, and other service improvements.

If you require a production-ready API, consider using the Unstructured Serverless API instead.

We calculate a page as follows:

  • For these file types, a page is a page, slide, or image: .pdf, .pptx, and .tiff.
  • For .docx files that have page metadata, we calculate the number of pages based on that metadata.
  • For all other file types, we calculate the number of pages as the file’s size divided by 100 KB.
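
The size-based rule in the last bullet can be sketched as a small helper. This is only an illustration: the text does not say how "100 KB" is defined or how fractions are rounded, so the sketch assumes 100 * 1024 bytes per page and rounds partial pages up. The function names are hypothetical and not part of any Unstructured package.

```python
import math

# Assumption: "100 KB" means 100 * 1024 bytes. The source does not specify this.
BYTES_PER_PAGE = 100 * 1024

# Monthly page limit of the Free Unstructured API, as stated above.
MONTHLY_PAGE_LIMIT = 1000


def estimate_pages_by_size(size_in_bytes: int) -> int:
    """Estimate the page count of a file that has no native page concept.

    Assumption: partial pages are rounded up.
    """
    if size_in_bytes < 0:
        raise ValueError("size_in_bytes must not be negative")
    return math.ceil(size_in_bytes / BYTES_PER_PAGE)


def fits_in_monthly_limit(pages_used: int, new_pages: int) -> bool:
    """Return True if processing new_pages stays within the monthly limit."""
    return pages_used + new_pages <= MONTHLY_PAGE_LIMIT
```

Under these assumptions, a 250 KB .eml file would count as 3 pages.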

Quickstart

Let’s say you want to preprocess an .eml file using the Free Unstructured API. There are several ways to do this, and they all lead to the same result, so pick your preferred method: POST, CLI, SDK, or open source.

When using the Free Unstructured API, use https://api.unstructured.io/general/v0/general as the URL for POST requests, or a blank string as the URL when using the CLI, the SDKs, or the open source library. Note that the Free Unstructured API uses the domain api.unstructured.io, which differs slightly from the Unstructured Serverless API domain, api.unstructuredapp.io.
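
Because the two domain names are easy to confuse, a quick check before making a call can catch a mismatched key and URL early. The following is a minimal sketch that relies only on the two domain names mentioned above. The function name `api_flavor` is hypothetical.

```python
from urllib.parse import urlparse

# Domain names taken from the text above.
FREE_API_HOST = "api.unstructured.io"
SERVERLESS_API_HOST = "api.unstructuredapp.io"


def api_flavor(api_url: str) -> str:
    """Classify an API URL as "free", "serverless", or "unknown".

    A blank URL is treated as "free", because the Free Unstructured API
    uses a blank URL with the CLI, the SDKs, and the open source library.
    """
    if not api_url:
        return "free"
    host = urlparse(api_url).hostname
    if host == FREE_API_HOST:
        return "free"
    if host == SERVERLESS_API_HOST:
        return "serverless"
    return "unknown"
```

A script could call `api_flavor` on its configured URL and stop with a clear message if a Free Unstructured API key is about to be sent to a Serverless URL.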

POST request

To work with the Free Unstructured API by making POST requests to the Unstructured REST API with curl, first do the following:

  • Set the UNSTRUCTURED_API_KEY environment variable to your Free Unstructured API key.
  • Set the UNSTRUCTURED_API_URL environment variable to your Free Unstructured API URL, which for POST requests is always https://api.unstructured.io/general/v0/general.
Setting the UNSTRUCTURED_API_URL environment variable makes your code forward-compatible if you later upgrade to the Unstructured Serverless API, which requires a different API URL.

Now, use curl to call the API, specifying the source (input) file to preprocess and the destination (output) file where the processed data will be written:

curl -X POST "$UNSTRUCTURED_API_URL" \
     -H 'accept: application/json' \
     -H 'Content-Type: multipart/form-data' \
     -H "unstructured-api-key: $UNSTRUCTURED_API_KEY" \
     -F 'files=@<local/path/to/input/file>' \
     -o '<local/path/to/output/file>'

If you do not have any files available, you can download some from the example-docs folder in the Unstructured repo on GitHub.
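
For readers who prefer Python to curl, the same POST request can be sketched with only the standard library. This is an illustration under stated assumptions, not an official client: the helper name `build_partition_request` is hypothetical, the file part is sent as application/octet-stream, and only the header names and the `files` form field are taken from the curl command above.

```python
import urllib.request
import uuid

# POST URL of the Free Unstructured API, as given above.
FREE_API_URL = "https://api.unstructured.io/general/v0/general"


def build_partition_request(
    file_name: str,
    file_bytes: bytes,
    api_key: str,
    api_url: str = FREE_API_URL,
) -> urllib.request.Request:
    """Build (but do not send) a multipart POST request for one file."""
    boundary = uuid.uuid4().hex
    # Hand-built multipart/form-data body with a single "files" part.
    body = b"".join(
        [
            f"--{boundary}\r\n".encode(),
            (
                'Content-Disposition: form-data; name="files"; '
                f'filename="{file_name}"\r\n'
            ).encode(),
            b"Content-Type: application/octet-stream\r\n\r\n",
            file_bytes,
            f"\r\n--{boundary}--\r\n".encode(),
        ]
    )
    headers = {
        "accept": "application/json",
        "unstructured-api-key": api_key,
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(api_url, data=body, headers=headers, method="POST")


# Example usage (performs a real network call, so it is left commented out):
#
#   import os
#   from pathlib import Path
#
#   path = Path("PATH_TO_FILE")
#   request = build_partition_request(
#       path.name, path.read_bytes(), os.environ["UNSTRUCTURED_API_KEY"]
#   )
#   with urllib.request.urlopen(request) as response:
#       Path("output.json").write_bytes(response.read())
```

Building the request separately from sending it makes it possible to inspect the URL and headers without spending any of the monthly page allowance.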

The result will look something like this: Sample Output

POST requests support using only local machine paths as the source (input) for the file to preprocess and as the destination (output) that Unstructured sends the processed data to. To specify a source or destination other than a local machine, use the CLI, the Python SDK, or the open source library instead.
Unstructured does not recommend using POST requests to process multiple files at a time. Instead, use the Unstructured CLI or the Unstructured Python SDK with their provided source connectors and destination connectors.

Learn more about how to use POST requests.

Unstructured CLI

To work with the Free Unstructured API by using the Unstructured CLI, you will need to:

  • Install Python, and then install the CLI package:

    pip install unstructured
    
  • Set the UNSTRUCTURED_API_KEY environment variable to your Free Unstructured API key.

Now, use the CLI to call the API, specifying the location of the files to preprocess and the directory where Unstructured will write the processed data.

unstructured-ingest \
  local \
    --input-path <local/path/to/input/files> \
    --output-dir <local/path/to/output/files> \
    --partition-by-api \
    --api-key $UNSTRUCTURED_API_KEY

If you do not have any files available, you can download some from the example-docs folder in the Unstructured repo on GitHub.

To learn more about how to use the local command, run the command unstructured-ingest local --help.
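
If you want to launch the CLI from a Python script, one option is to assemble the same command as an argument list and pass it to `subprocess`. This is a sketch only: it uses just the flags shown in the command above, and the helper names `build_ingest_command` and `run_ingest` are hypothetical.

```python
import subprocess


def build_ingest_command(input_path: str, output_dir: str, api_key: str) -> list[str]:
    """Return the unstructured-ingest command shown above as an argument list."""
    return [
        "unstructured-ingest",
        "local",
        "--input-path", input_path,
        "--output-dir", output_dir,
        "--partition-by-api",
        "--api-key", api_key,
    ]


def run_ingest(input_path: str, output_dir: str, api_key: str) -> None:
    """Run the CLI and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_ingest_command(input_path, output_dir, api_key), check=True)
```

Passing an argument list rather than a single shell string avoids shell quoting problems with paths that contain spaces.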

Unstructured Python SDK and JavaScript/TypeScript SDK

To work with the Free Unstructured API in Python, JavaScript, or TypeScript, use the Unstructured Python SDK or the Unstructured JavaScript/TypeScript SDK.

The JavaScript/TypeScript SDK supports using only local machine paths as the source (input) for the files to preprocess and as the destination (output) that Unstructured sends the processed data to. To specify a source or destination other than a local machine, use the CLI, the Python SDK, or the open source library instead.

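First, install your preferred SDK: for Python, run pip install unstructured-client; for JavaScript/TypeScript, run npm install unstructured-client.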

Next, set the UNSTRUCTURED_API_KEY environment variable to your Free Unstructured API key.

Now call the API.

If you do not have any files available, you can download some from the example-docs folder in the Unstructured repo on GitHub.

The partition method in the Python and JavaScript/TypeScript SDKs processes single files only. To process multiple files at a time, use the Unstructured CLI or Unstructured Python SDK with their provided source connectors and destination connectors.

Learn more about how to use the Unstructured Python and JavaScript/TypeScript SDKs.

Calling the Unstructured API from the Unstructured open source library

You can call the Free Unstructured API directly from the Unstructured open source library. Unstructured recommends this approach only for rapid local script or code prototyping or simple proofs-of-concept, not for production scenarios.

You will need to:

  • Install Python, and then install the open source library:

    pip install unstructured
    
  • Set the UNSTRUCTURED_API_KEY environment variable to your API key.

Now, use the open source library to call the API, specifying the file to preprocess:

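import os

from unstructured.partition.api import partition_via_api

# Replace with the local path of the file to preprocess.
filename = "PATH_TO_FILE"

# Send the file to the Unstructured API and get back a list of elements.
# The API key is read from the UNSTRUCTURED_API_KEY environment variable.
elements = partition_via_api(
    filename=filename,
    api_key=os.getenv("UNSTRUCTURED_API_KEY"),
)

# Print the text of each returned element.
for element in elements:
    print(element.text)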

If you do not have any files available, you can download some from the example-docs folder in the Unstructured repo on GitHub.

The partition_via_api function in the open source library processes single files only. To process multiple files at a time, use the Unstructured CLI or the open source library with their provided source connectors and destination connectors.

Learn more about how to use the Unstructured open source library.

Telemetry

We’ve partnered with Scarf to collect anonymized user statistics to understand which features our community is using and how to prioritize product decision-making in the future.

To learn more about how we collect and use this data, please read our Privacy Policy.

To opt out of this data collection, set the environment variable SCARF_NO_ANALYTICS=true before running any commands that call Unstructured API services.
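
In a Python script, the opt-out can be applied by setting the variable at the very top of the program. This is a minimal sketch: it assumes the variable is read when Unstructured code is imported or run, so it is set before any such import. The function name is hypothetical.

```python
import os


def opt_out_of_telemetry() -> None:
    """Set SCARF_NO_ANALYTICS=true for this process and its child processes."""
    os.environ["SCARF_NO_ANALYTICS"] = "true"


# Call this before importing or invoking any Unstructured code.
opt_out_of_telemetry()
```

Child processes started afterwards, such as a CLI launched with `subprocess`, inherit the setting.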