The Free Unstructured API is different from the free 14-day trial of the Unstructured Serverless API.

Get an API key

The Free Unstructured API requires authentication via an API key. Here’s how you can obtain your API key:

  1. Go to https://unstructured.io/api-key-free.
  2. Fill out the registration form. Make sure your contact information (especially your email address) is valid.
  3. Check the I agree box if you consent to Unstructured contacting you about our products and services.
  4. Click the Terms and Conditions link, read it, and check the related box to agree.
  5. Click Submit. You will receive a Free Unstructured API key at the email address you provided. Store your API key in a secure location. Do not share it with others.
  6. For the Free Unstructured API, the API URL is https://api.unstructured.io/general/v0/general

Free Unstructured API keys do not work with the Unstructured Serverless API. If you try to use a Free Unstructured API key with an Unstructured Serverless API URL, the call will fail. Use your Free Unstructured API URL instead.

Try the quickstart.

Free Unstructured API limitations

The Free Unstructured API is designed for prototyping purposes, and not for production use:

  • The API usage is limited to 1000 pages per month.
  • Unlike users of the Unstructured Serverless API, users of the Free Unstructured API do not get their own dedicated infrastructure.
  • Data sent to the Free Unstructured API can be used for model training and other service improvements.

If you require a production-ready API, consider using the Unstructured Serverless API instead.

We calculate a page as follows:

  • For these file types, a page is a page, slide, or image: .pdf, .pptx, and .tiff.
  • For .docx files that have page metadata, we calculate the number of pages based on that metadata.
  • For all other file types, we calculate the number of pages as the file’s size divided by 100 KB.
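
As a rough illustration of the size-based rule above, the following sketch estimates the page count for a file and compares a running total against the 1000-page monthly limit. It is not Unstructured's actual metering code. The function names are hypothetical, and the sketch assumes that 1 KB is 1000 bytes, that partial pages round up, and that every file counts as at least one page; none of these details are stated above.

```python
import math
import os

FREE_TIER_PAGE_LIMIT = 1000   # pages per month, from the limitations above
BYTES_PER_PAGE = 100 * 1000   # assumption: 100 KB means 100,000 bytes

# File types whose pages are counted from their content, not their size.
# (.docx is counted from page metadata when that metadata is present.)
COUNTED_BY_CONTENT = {".pdf", ".pptx", ".tiff", ".docx"}


def estimate_pages_from_size(size_bytes: int) -> int:
    """Estimate pages as file size divided by 100 KB.

    Assumption: partial pages round up and the minimum is one page.
    """
    return max(1, math.ceil(size_bytes / BYTES_PER_PAGE))


def estimate_pages_for_file(path: str):
    """Return a size-based estimate, or None when the size rule does not apply."""
    extension = os.path.splitext(path)[1].lower()
    if extension in COUNTED_BY_CONTENT:
        return None
    return estimate_pages_from_size(os.path.getsize(path))


def remaining_pages(pages_used_this_month: int) -> int:
    """Pages left under the free monthly limit (never negative)."""
    return max(0, FREE_TIER_PAGE_LIMIT - pages_used_this_month)
```

For example, under these assumptions a 250,000-byte .html file would be estimated at 3 pages.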

Quickstart

These examples use your local machine. They send source (input) files from your local machine to the Free Unstructured API, which delivers the processed data to a destination (output) location, also on your local machine. Data is processed on Unstructured-hosted compute resources.

Unstructured Ingest CLI

To work with the Free Unstructured API by using the Unstructured Ingest CLI, you will need to:

  • Install Python, and then install the CLI package:

    pip install unstructured
    
  • Set the UNSTRUCTURED_API_KEY environment variable to your Free Unstructured API key.

  • Set the UNSTRUCTURED_API_URL environment variable to your Free Unstructured API URL, which is https://api.unstructured.io/general/v0/general

  • Have some compatible files on your local machine to be processed. See the list of supported file types. If you do not have any files available, you can download some from the example-docs folder in the Unstructured repo on GitHub.
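
Before running the CLI, it can help to confirm that both environment variables are set and that the URL is the Free Unstructured API URL rather than a Serverless one. The following is a minimal sketch of such a check. The function name check_free_api_env is hypothetical and is not part of any Unstructured package.

```python
import os

# The Free Unstructured API URL given earlier in this document.
FREE_API_URL = "https://api.unstructured.io/general/v0/general"


def check_free_api_env(env=None):
    """Return a list of problems found in the environment (empty if none)."""
    env = os.environ if env is None else env
    problems = []

    if not env.get("UNSTRUCTURED_API_KEY"):
        problems.append("UNSTRUCTURED_API_KEY is not set.")

    url = env.get("UNSTRUCTURED_API_URL", "")
    if not url:
        problems.append("UNSTRUCTURED_API_URL is not set.")
    elif url.rstrip("/") != FREE_API_URL:
        # A Free Unstructured API key fails against any other API URL.
        problems.append(
            "UNSTRUCTURED_API_URL is not the Free Unstructured API URL: " + url
        )

    return problems
```

Calling check_free_api_env() with no arguments inspects the current process environment; an empty list means both variables look usable.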

Now, use the CLI to call the API, replacing:

  • <path/to/input> with the source (input) path to the directory on your local machine that contains the compatible files for Unstructured to process on its hosted compute resources.
  • <path/to/output> with the destination (output) path to the directory on your local machine that will contain the processed data that Unstructured returns from its hosted compute resources.
CLI
unstructured-ingest \
  local \
    --input-path <path/to/input> \
    --output-dir <path/to/output> \
    --partition-by-api \
    --api-key $UNSTRUCTURED_API_KEY \
    --partition-endpoint $UNSTRUCTURED_API_URL \
    --strategy hi_res \
    --additional-partition-args="{\"split_pdf_page\":\"true\", \"split_pdf_allow_failed\":\"true\", \"split_pdf_concurrency_level\": 15}"

After the command successfully runs, see the results in the specified output path on your local machine.

Unstructured Ingest Python library

To work with the Free Unstructured API by using the Unstructured Ingest Python library, you will need to:

  • Install Python, and then install the Unstructured Ingest Python library:

    pip install unstructured-ingest
    
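  • Set the following environment variables:

    • Set UNSTRUCTURED_API_KEY to your Free Unstructured API key.
    • Set UNSTRUCTURED_API_URL to your Free Unstructured API URL, which is https://api.unstructured.io/general/v0/general

    To get your API key and API URL, see "Get an API key" above.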

  • Have some compatible files on your local machine to be processed. See the list of supported file types. If you do not have any files available, you can download some from the example-docs folder in the Unstructured repo on GitHub.

Now, use the Python library to call the API. The following code reads its input and output locations from two more environment variables, so first set:

  • LOCAL_FILE_INPUT_DIR to the source (input) path to the directory on your local machine that contains the compatible files for Unstructured to process on its hosted compute resources.
  • LOCAL_FILE_OUTPUT_DIR to the destination (output) path to the directory on your local machine that will contain the processed data that Unstructured returns from its hosted compute resources.

Python Ingest v2
import os

from unstructured_ingest.v2.pipeline.pipeline import Pipeline
from unstructured_ingest.v2.interfaces import ProcessorConfig
from unstructured_ingest.v2.processes.connectors.local import (
    LocalIndexerConfig,
    LocalDownloaderConfig,
    LocalConnectionConfig,
    LocalUploaderConfig
)
from unstructured_ingest.v2.processes.partitioner import PartitionerConfig

if __name__ == "__main__":
    Pipeline.from_configs(
        context=ProcessorConfig(),
        indexer_config=LocalIndexerConfig(input_path=os.getenv("LOCAL_FILE_INPUT_DIR")),
        downloader_config=LocalDownloaderConfig(),
        source_connection_config=LocalConnectionConfig(),
        partitioner_config=PartitionerConfig(
            partition_by_api=True,
            api_key=os.getenv("UNSTRUCTURED_API_KEY"),
            partition_endpoint=os.getenv("UNSTRUCTURED_API_URL"),
            strategy="hi_res",
            additional_partition_args={
                "split_pdf_page": True,
                "split_pdf_allow_failed": True,
                "split_pdf_concurrency_level": 15
            }
        ),
        uploader_config=LocalUploaderConfig(output_dir=os.getenv("LOCAL_FILE_OUTPUT_DIR"))
    ).run()

After the script successfully runs, see the results in the specified output path on your local machine.
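
One quick way to inspect the results is to count the element types in the output directory. The sketch below assumes that each output file is a .json file containing a list of element objects, each with a "type" field; that layout is an assumption and is not specified above. The function name summarize_output is hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_output(output_dir):
    """Count elements by their "type" across all .json files in output_dir.

    Assumption: each file holds a JSON list of element dictionaries.
    Files with any other top-level structure are skipped.
    """
    counts = Counter()
    for json_file in sorted(Path(output_dir).rglob("*.json")):
        with json_file.open(encoding="utf-8") as handle:
            elements = json.load(handle)
        if not isinstance(elements, list):
            continue
        for element in elements:
            if isinstance(element, dict):
                counts[element.get("type", "Unknown")] += 1
    return counts
```

For example, summarize_output(os.getenv("LOCAL_FILE_OUTPUT_DIR")) (after importing os) would return a Counter mapping each element type to the number of times it appears.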

Telemetry

We’ve partnered with Scarf to collect anonymized user statistics to understand which features our community is using and how to prioritize product decision-making in the future.

To learn more about how we collect and use this data, please read our Privacy Policy.

To opt out of this data collection, set the environment variable SCARF_NO_ANALYTICS=true before running any commands that call the Free Unstructured API.
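
If you run the Python example rather than the CLI, you can also set the variable from within the script. This is a minimal sketch, assuming the variable is read from the process environment when the library is imported or run, which is why it is set before any Unstructured import.

```python
import os

# Set the opt-out flag first, so it is already present in the environment
# before any Unstructured code is imported or run.
os.environ["SCARF_NO_ANALYTICS"] = "true"

# Imports of unstructured_ingest modules would follow here.
```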