Product offerings

Unstructured offers three products:

  • Unstructured Platform - No-code UI. Production-ready. Pay as you go.
  • Unstructured Serverless API Services - Use scripts or code. Production-ready. Pay as you go. (There is also a non-production, free edition with limits.)
  • Unstructured open source library - Use scripts or code. Not production-ready. Limited.

Learn more about these products:

Unstructured Serverless API Services


Use scripts or code to call the Unstructured CLI, SDKs, or REST API to get all of your data RAG-ready.

Unstructured Serverless API Services have a Serverless pay-as-you-go edition and a Free limited edition, both of which process data on Unstructured-hosted compute resources.

If you need to use compute resources that you host instead, there are also Azure pay-as-you-go and AWS pay-as-you-go editions; these editions process data by using the Unstructured API installed on compute resources hosted in your own Azure or AWS account.

Try the quickstart.

Learn more.

  Read the launch announcement.


Quickstart: Unstructured Platform

If you want to use your local machine for either the source (input) or the destination (output) location, you cannot use this quickstart. Instead, run scripts on your local machine: skip to Quickstart: Unstructured Serverless API Services or Quickstart: Unstructured open source library, later in this article.

This quickstart uses a no-code, point-and-click user interface in your web browser to get all of your data RAG-ready. Data is processed on Unstructured-hosted compute resources.

You will need:

2

Sign in

Use the sign in URL, username, and temporary password in the welcome email that Unstructured sends you.

3

Set the source (input) location

  1. In the sidebar, click Sources.
  2. Click New Source.
  3. In the Type dropdown list, select the source location type that matches yours.
  4. Fill in the rest of the fields with the appropriate settings. Learn more.
  5. Click Test connection.
  6. Click Submit.
4

Set the destination (output) location

  1. In the sidebar, click Destinations.
  2. Click New Destination.
  3. In the Type dropdown list, select the destination location type that matches yours.
  4. Fill in the rest of the fields with the appropriate settings. Learn more.
  5. Click Test connection.
  6. Click Submit.
5

Process the documents

  1. In the sidebar, click Jobs.
  2. Click Run Job.
  3. In the Select a Workflow or create a new one dropdown list, select New.
  4. In the Sources dropdown list, select your source location from Step 3.
  5. In the Destination dropdown list, select your destination location from Step 4.
  6. Click Run.
6

Monitor the processing job

  1. In the list of Jobs, click the Workflow link for your New job.
  2. When the Status shows JOB FINISHED, go to the next step.
7

View the processed data

Go to your destination location to view the processed data.

Learn more about the Unstructured Platform.


Quickstart: Unstructured Serverless API Services

This quickstart uses your local machine for the source (input) and destination (output) locations, and the Free Unstructured API edition. Data is processed on Unstructured-hosted compute resources.

The Free Unstructured API has limits. To remove these limits, sign up for the Unstructured Serverless API.

You will need:

  • Python installed on your local machine.
  • Compatible files on your local machine to be processed. See the list of supported file types. If you do not have any files available, you can download some from the example-docs folder in the Unstructured repo on GitHub.
2

Get your API key and API URL

  1. Get your Unstructured API key from the welcome email that Unstructured sends you. Store your API key in a secure location. Do not share it with others.
  2. For this quickstart, your Unstructured API URL is an empty string.
3

Set environment variables

  1. Set an environment variable named UNSTRUCTURED_API_KEY to the value of your Unstructured API key.
  2. Set another environment variable named UNSTRUCTURED_API_URL to an empty string.
    Setting the environment variable named UNSTRUCTURED_API_URL to an empty string makes your code forward-compatible if you later upgrade to the Unstructured Serverless API, which requires an API URL instead of an empty string.
    To learn how to set environment variables, see your operating system’s documentation.
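Before you run the command in a later step, it can help to confirm that both variables are visible to your process. The following is a minimal sketch that uses only the Python standard library. The helper name missing_variables is hypothetical and is not part of any Unstructured package. It treats an empty string as a valid value for UNSTRUCTURED_API_URL, as this quickstart requires, but expects UNSTRUCTURED_API_KEY to be non-empty.

```python
import os


def missing_variables(env=None):
    """Return the names of the quickstart variables that are not usable.

    Illustrative helper (not an Unstructured API):
    - UNSTRUCTURED_API_KEY must be set and non-empty.
    - UNSTRUCTURED_API_URL must be set, but may be an empty string.
    """
    env = os.environ if env is None else env
    missing = []
    if not env.get("UNSTRUCTURED_API_KEY"):
        missing.append("UNSTRUCTURED_API_KEY")
    if "UNSTRUCTURED_API_URL" not in env:
        missing.append("UNSTRUCTURED_API_URL")
    return missing
```

Calling missing_variables() with no arguments checks the current process environment; an empty list means both variables are ready to use.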
4

Install the API library

Run the following command:

pip install "unstructured[all-docs]"
5

Run the code

Run the following command, replacing:

  • <path/to/input> with the source (input) path on your local machine that contains the compatible files for Unstructured to process on its hosted compute resources.
  • <path/to/output> with the destination (output) path on your local machine that will contain the processed data that Unstructured returns from its hosted compute resources.
unstructured-ingest \
  local \
    --input-path <path/to/input> \
    --output-dir <path/to/output> \
    --partition-by-api \
    --api-key "$UNSTRUCTURED_API_KEY" \
    --partition-endpoint "$UNSTRUCTURED_API_URL"
For speed, this quickstart uses the Unstructured CLI with the minimum number of required command options. You can also use the Unstructured Python or JavaScript/TypeScript SDKs or call the REST API directly.
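If you prefer to launch the same CLI command from a Python script, one possible approach is to build the argument list and pass it to subprocess.run. This is a sketch, not an official Unstructured SDK call: the function names build_ingest_command and run_ingest are hypothetical, and the flags are only the ones shown in the command above. Passing the arguments as a list keeps an empty UNSTRUCTURED_API_URL as its own argument instead of letting a shell drop it.

```python
import os
import subprocess


def build_ingest_command(input_path, output_dir, api_key, api_url):
    """Build the argument list for the CLI call shown in this quickstart."""
    return [
        "unstructured-ingest",
        "local",
        "--input-path", input_path,
        "--output-dir", output_dir,
        "--partition-by-api",
        "--api-key", api_key,
        "--partition-endpoint", api_url,
    ]


def run_ingest(input_path, output_dir):
    """Run the CLI with the key and URL read from the environment."""
    command = build_ingest_command(
        input_path,
        output_dir,
        os.environ["UNSTRUCTURED_API_KEY"],
        os.environ["UNSTRUCTURED_API_URL"],
    )
    # check=True raises CalledProcessError if the CLI exits with an error.
    return subprocess.run(command, check=True)
```

For example, run_ingest("<path/to/input>", "<path/to/output>") with the placeholders replaced as described above would start the same job as the CLI command.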
6

View the processed data

Go to your destination location to view the processed data.
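To get a quick overview of what was produced, you could summarize the output folder with a short script. The sketch below assumes that the destination contains JSON files and that each file holds a list of element objects with a "type" field; check a sample file first, because this layout is an assumption here and is not stated in this quickstart. The function name summarize_output is hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_output(output_dir):
    """Count elements by their "type" field across all JSON files.

    Assumes each JSON file contains a list of element dictionaries.
    Files with any other top-level structure are skipped.
    """
    counts = Counter()
    for json_path in sorted(Path(output_dir).rglob("*.json")):
        with json_path.open(encoding="utf-8") as handle:
            elements = json.load(handle)
        if not isinstance(elements, list):
            continue
        for element in elements:
            if isinstance(element, dict):
                counts[element.get("type", "unknown")] += 1
    return counts
```

Printing the returned Counter for your output path shows how many elements of each type were found across all processed files.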

Learn more about the Unstructured Serverless API.


Quickstart: Unstructured open source library

This quickstart uses your local machine for the source (input) and destination (output) locations and for local data processing. It does not call Unstructured Serverless API Services.

The open source library is limited compared to the Unstructured Platform and the Unstructured Serverless API.

You will need:

  • Python installed on your local machine.
  • Compatible files on your local machine to be processed. See the list of supported file types. If you do not have any files available, you can download some from the example-docs folder in the Unstructured repo on GitHub.
1

Install the open source library

Run the following command:

pip install "unstructured[all-docs]"
2

Run the code

Run the following command, replacing:

  • <path/to/input> with the source (input) path on your local machine that contains the compatible files to process.
  • <path/to/output> with the destination (output) path on your local machine that will contain the processed data.
unstructured-ingest \
  local \
    --input-path <path/to/input> \
    --output-dir <path/to/output>
For speed, this quickstart uses the open source CLI with the minimum number of required command options. You can also use Python code.
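As a rough illustration of the Python route, the sketch below processes every file in a folder with the open source library's partition function (imported from unstructured.partition.auto) and writes one JSON file per input. The wrapper name partition_folder, the partition_fn parameter, and the output file naming are assumptions made for this example, and the result is not guaranteed to match the CLI's output layout.

```python
import json
from pathlib import Path


def partition_folder(input_dir, output_dir, partition_fn=None):
    """Partition each file in input_dir and write its elements as JSON.

    partition_fn is a hypothetical hook for this sketch; by default the
    open source library's partition function is used.
    """
    if partition_fn is None:
        # Imported lazily so the library is only needed for real runs.
        from unstructured.partition.auto import partition as partition_fn

    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for source in sorted(Path(input_dir).iterdir()):
        if not source.is_file():
            continue
        elements = partition_fn(filename=str(source))
        # File naming is an assumption of this sketch.
        target = out / f"{source.name}.json"
        with target.open("w", encoding="utf-8") as handle:
            json.dump([element.to_dict() for element in elements], handle, indent=2)
        written.append(target)
    return written
```

Calling partition_folder("<path/to/input>", "<path/to/output>") with the placeholders replaced as described above returns the list of JSON files that were written.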
3

View the processed data

Go to your destination location to view the processed data.

Learn more about the Unstructured open source library.


Get in touch

If you don’t find the information you’re looking for in the documentation, or require assistance, get in touch with our Support team at support@unstructured.io, or join our Slack where our team and community can help you.
