The Unstructured Platform API is separate from Unstructured Serverless API services.

For information about Unstructured Serverless API services, see the Unstructured Serverless API services overview.

The Unstructured Platform features a no-code user interface for transforming your unstructured data into data that is ready for Retrieval Augmented Generation (RAG).

The Unstructured Platform also provides a back-end API for automation usage scenarios as well as for documentation, reporting, and recovery needs. This page provides an overview of the Unstructured Platform API.

Requirements

To use the Unstructured Platform API, you must have:

  • An Unstructured account, created through the For Developers page, with one of the following plans:

  • An Unstructured API key, created through your Unstructured account console.

    If you signed up through the For Enterprise page, your API key creation process might be different. For API key creation guidance, email Unstructured Sales at sales@unstructured.io.

    Free Unstructured API keys are not supported.

    To create an API key, do the following:

    1. Sign in to the Unstructured Platform. Learn how.
    2. At the bottom of the sidebar, click your user icon, and then click Account Settings.
    3. On the API Keys tab, click Generate New Key.
    4. Enter a descriptive name for the API key, and then click Save.
    5. Click the Copy icon for your new API key. The API key’s value is copied to your system’s clipboard.
  • The Unstructured Platform API URL. For the Unstructured Python SDK, this is typically https://platform.unstructuredapp.io, a form that is unique to the SDK. For standard REST-enabled utilities (such as curl), tools (such as Postman), programming languages, packages, and libraries, this is typically https://platform.unstructuredapp.io/api/v1.

    Important: Do not use https://platform.unstructuredapp.io/api/v1 with the Unstructured Python SDK, or else calls made by the Python SDK will fail. Use https://platform.unstructuredapp.io instead.

    Do not use the Unstructured Serverless API URL, which is separate from the Unstructured Platform API URL.

    If you signed up through the For Enterprise page, your API URL might be different. For API URL guidance, email Unstructured Sales at sales@unstructured.io. If your API URL is different, be sure to substitute your API URL for https://platform.unstructuredapp.io/api/v1 throughout the following examples.
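
The two typical URL forms described above differ only in the /api/v1 suffix. The following is a minimal sketch, using only the Python standard library, of deriving either form from the other. The helper names rest_url and sdk_url are hypothetical and are not part of any Unstructured package.

```python
API_SUFFIX = "/api/v1"


def rest_url(base_url: str) -> str:
    """Return the form for REST clients such as curl (ends with /api/v1)."""
    base = base_url.rstrip("/")
    return base if base.endswith(API_SUFFIX) else base + API_SUFFIX


def sdk_url(base_url: str) -> str:
    """Return the form for the Python SDK (no /api/v1 suffix)."""
    base = base_url.rstrip("/")
    return base[: -len(API_SUFFIX)] if base.endswith(API_SUFFIX) else base


print(rest_url("https://platform.unstructuredapp.io"))
print(sdk_url("https://platform.unstructuredapp.io/api/v1"))
```

A helper like this can make it harder to pass the wrong form to the wrong client when both the Python SDK and REST calls are used in the same project.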

The Unstructured Platform API is offered as follows:

  • As part of the Unstructured Python SDK beginning with version 0.30.0, which you can call through standard Python code.

    To install the Unstructured Python SDK, run the following command from within your Python virtual environment:

    pip install "unstructured-client>=0.30.0"
    

    If you already have the Unstructured Python SDK installed, upgrade to at least version 0.30.0 by running the following command instead:

    pip install --upgrade "unstructured-client>=0.30.0"
    
  • As a set of Representational State Transfer (REST) endpoints, which you can call through standard REST-enabled utilities, tools, programming languages, packages, and libraries. The following sections describe how to call the Unstructured Platform API with curl and Postman. You can adapt this information as needed for your preferred programming languages and libraries, for example by using the requests library with Python.

    You can also use the Unstructured Platform API - Swagger UI to call the REST endpoints that are available through https://platform.unstructuredapp.io.

The Unstructured Platform API is separate from Unstructured Serverless API services and Unstructured Ingest.

Because of this separation, the following Unstructured SDKs, tools, and libraries do not work with the Unstructured Platform API:

Free Unstructured API accounts are also not supported.

The following Unstructured API URLs are also not supported:

  • https://api.unstructuredapp.io/general/v0/general (the Unstructured Serverless API URL)
  • https://api.unstructured.io/general/v0/general (the Free Unstructured API URL)

Basics

The Unstructured Platform API enables you to work with connectors, workflows, and jobs in the Unstructured Platform.

  • A source connector ingests files or data into Unstructured from a source location.
  • A destination connector sends the processed data from Unstructured to a destination location.
  • A workflow defines how Unstructured will process the data.
  • A job runs a workflow at a specific point in time.

For general information about these objects, see:

The following sections show how to use the Unstructured Python SDK for all of the supported API operations, and how to use curl and Postman for all of the supported REST endpoints.

You can also use the Unstructured Platform API - Swagger UI to call the REST endpoints that are available through https://platform.unstructuredapp.io.

The following Unstructured Python SDK examples use the following environment variables, which you can set as follows:

export UNSTRUCTURED_API_URL="https://platform.unstructuredapp.io"
export UNSTRUCTURED_API_KEY="<your-unstructured-api-key>"

Important: Do not use https://platform.unstructuredapp.io/api/v1 with the Python SDK, or else calls made by the Python SDK will fail. Use https://platform.unstructuredapp.io instead.

The following curl and Postman examples use the following environment variables, which you can set as follows:

export UNSTRUCTURED_API_URL="https://platform.unstructuredapp.io/api/v1"
export UNSTRUCTURED_API_KEY="<your-unstructured-api-key>"

Important: For standard REST-enabled clients (such as curl), do not use https://platform.unstructuredapp.io (which is unique to the Unstructured Python SDK), or else calls made by these REST-enabled clients will fail. Use https://platform.unstructuredapp.io/api/v1 instead.

These environment variables make the following Unstructured Python SDK and curl examples easier to run. They also help keep sensitive URLs and API keys out of scripts that you store in public source code repositories.
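
As an illustration of reading these variables from code instead of hard-coding them, here is a minimal sketch that uses only the Python standard library and the REST form of the URL. It builds a request but does not send it. The build_request helper is hypothetical, and the unstructured-api-key header name is an assumption that this page does not confirm.

```python
import os
import urllib.request
from typing import Mapping


def build_request(
    path: str, method: str = "GET", env: Mapping[str, str] = os.environ
) -> urllib.request.Request:
    """Build (but do not send) a request from the two environment variables."""
    base_url = env["UNSTRUCTURED_API_URL"].rstrip("/")
    api_key = env["UNSTRUCTURED_API_KEY"]
    return urllib.request.Request(
        url=base_url + path,
        method=method,
        headers={
            "accept": "application/json",
            # Assumption: the API key travels in this header.
            "unstructured-api-key": api_key,
        },
    )


# Demo with explicit placeholder values instead of the real environment.
demo_env = {
    "UNSTRUCTURED_API_URL": "https://platform.unstructuredapp.io/api/v1",
    "UNSTRUCTURED_API_KEY": "<your-unstructured-api-key>",
}
request = build_request("/sources", env=demo_env)
print(request.get_method(), request.full_url)
```

To send the request, you could pass it to urllib.request.urlopen after setting the real environment variables and omitting the env argument.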

The following Postman examples use variables, which you can set as follows:

  1. In Postman, on your workspace’s sidebar, click Environments.

  2. Click Globals.

  3. Create two global variables with the following settings:

    • Variable: UNSTRUCTURED_API_URL
    • Type: default
    • Initial value: https://platform.unstructuredapp.io/api/v1
    • Current value: https://platform.unstructuredapp.io/api/v1

      Important: Do not use https://platform.unstructuredapp.io (which is unique to the Unstructured Python SDK), or else calls made by Postman will fail.

    • Variable: UNSTRUCTURED_API_KEY
    • Type: secret
    • Initial value: <your-unstructured-api-key>
    • Current value: <your-unstructured-api-key>
  4. Click Save.

These variables make the following examples easier to run in Postman. They also help keep sensitive URLs and API keys out of Postman collections that you store in public source code repositories.

Connectors

You can list, get, create, update, and delete source connectors. You can also list, get, create, update, and delete destination connectors.

For general information, see Connectors.

List source connectors

To list source connectors, use the UnstructuredClient object’s sources.list_sources function (for the Python SDK) or the GET method to call the /sources endpoint (for curl or Postman).

To filter the list of source connectors, use the ListSourcesRequest object’s source_type parameter (for the Python SDK) or the query parameter source_type=<type> (for curl or Postman), replacing <type> with the source connector type’s unique ID (for example, s3 for the Amazon S3 source connector type). To get this ID, see Sources.
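
For REST clients, the filter described above is a single query parameter on the /sources endpoint. A minimal sketch of building that URL with the Python standard library follows; list_sources_url is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import urlencode


def list_sources_url(base_url: str, source_type: Optional[str] = None) -> str:
    """Return the /sources URL, optionally filtered by connector type ID."""
    url = base_url.rstrip("/") + "/sources"
    if source_type is not None:
        # urlencode escapes any characters that are unsafe in a query string.
        url += "?" + urlencode({"source_type": source_type})
    return url


base = "https://platform.unstructuredapp.io/api/v1"
print(list_sources_url(base))
print(list_sources_url(base, source_type="s3"))
```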

Get a source connector

To get information about a source connector, use the UnstructuredClient object’s sources.get_source function (for the Python SDK) or the GET method to call the /sources/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the source connector’s unique ID. To get this ID, see List source connectors.

Create a source connector

To create a source connector, use the UnstructuredClient object’s sources.create_source function (for the Python SDK) or the POST method to call the /sources endpoint (for curl or Postman).

In the CreateSourceConnector object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the connector. For the specific settings to include, which differ by connector, see Sources.
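
The REST form of this call is a POST with a JSON request body. The following sketch builds such a request with the Python standard library without sending it. The create_source_request helper is hypothetical, the unstructured-api-key header name is an assumption, and the body shown (including the config key) is a placeholder only. See Sources for the settings that your connector type actually requires.

```python
import json
import urllib.request


def create_source_request(
    base_url: str, api_key: str, settings: dict
) -> urllib.request.Request:
    """Build (but do not send) a POST to /sources with a JSON body."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/sources",
        data=json.dumps(settings).encode("utf-8"),
        method="POST",
        headers={
            "accept": "application/json",
            "content-type": "application/json",
            # Assumption: the API key travels in this header.
            "unstructured-api-key": api_key,
        },
    )


# Placeholder body; the real keys and values differ by connector type.
example_settings = {
    "name": "my-source",
    "type": "s3",
    "config": {"<setting>": "<value>"},
}
request = create_source_request(
    "https://platform.unstructuredapp.io/api/v1",
    "<your-unstructured-api-key>",
    example_settings,
)
print(request.get_method(), request.full_url)
```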

Update a source connector

To update information about a source connector, use the UnstructuredClient object’s sources.update_source function (for the Python SDK) or the PUT method to call the /sources/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the source connector’s unique ID. To get this ID, see List source connectors.

In the UpdateSourceConnector object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the connector. For the specific settings to include, which differ by connector, see Sources.

You must specify all of the settings for the connector, even for settings that are not changing.

You can change any of the connector’s settings except for its name and type.
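
Because an update must carry every setting, one way to prepare the request body is to start from the connector's current settings and overlay only the values that change. The sketch below assumes that the settings can be treated as a flat dictionary. The full_update_body helper and the example setting names are hypothetical.

```python
LOCKED_SETTINGS = ("name", "type")  # These cannot be changed by an update.


def full_update_body(current_settings: dict, changes: dict) -> dict:
    """Return the complete settings for an update: current values plus changes."""
    locked = sorted(key for key in changes if key in LOCKED_SETTINGS)
    if locked:
        raise ValueError(f"these settings cannot be changed: {locked}")
    # Unchanged settings are carried over so that none are omitted.
    return {**current_settings, **changes}


# Hypothetical setting names, for illustration only.
current = {"remote_url": "s3://my-bucket/input/", "recursive": False}
print(full_update_body(current, {"recursive": True}))
```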

Delete a source connector

To delete a source connector, use the UnstructuredClient object’s sources.delete_source function (for the Python SDK) or the DELETE method to call the /sources/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the source connector’s unique ID. To get this ID, see List source connectors.

List destination connectors

To list destination connectors, use the UnstructuredClient object’s destinations.list_destinations function (for the Python SDK) or the GET method to call the /destinations endpoint (for curl or Postman).

To filter the list of destination connectors, use the ListDestinationsRequest object’s destination_type parameter (for the Python SDK) or the query parameter destination_type=<type> (for curl or Postman), replacing <type> with the destination connector type’s unique ID (for example, s3 for the Amazon S3 destination connector type). To get this ID, see Destinations.

Get a destination connector

To get information about a destination connector, use the UnstructuredClient object’s destinations.get_destination function (for the Python SDK) or the GET method to call the /destinations/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the destination connector’s unique ID. To get this ID, see List destination connectors.

Create a destination connector

To create a destination connector, use the UnstructuredClient object’s destinations.create_destination function (for the Python SDK) or the POST method to call the /destinations endpoint (for curl or Postman).

In the CreateDestinationConnector object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the connector. For the specific settings to include, which differ by connector, see Destinations.

Update a destination connector

To update information about a destination connector, use the UnstructuredClient object’s destinations.update_destination function (for the Python SDK) or the PUT method to call the /destinations/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the destination connector’s unique ID. To get this ID, see List destination connectors.

In the UpdateDestinationConnector object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the connector. For the specific settings to include, which differ by connector, see Destinations.

You must specify all of the settings for the connector, even for settings that are not changing.

You can change any of the connector’s settings except for its name and type.

Delete a destination connector

To delete a destination connector, use the UnstructuredClient object’s destinations.delete_destination function (for the Python SDK) or the DELETE method to call the /destinations/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the destination connector’s unique ID. To get this ID, see List destination connectors.
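
The source and destination connector endpoints described above follow the same pattern. As a compact summary, the following sketch maps each operation to its HTTP method and path. The CONNECTOR_ROUTES table and the connector_route helper are hypothetical names used for illustration.

```python
from typing import Tuple

# Operation -> (HTTP method, path template), as described in this section.
CONNECTOR_ROUTES = {
    "list": ("GET", "/{kind}"),
    "get": ("GET", "/{kind}/{connector_id}"),
    "create": ("POST", "/{kind}"),
    "update": ("PUT", "/{kind}/{connector_id}"),
    "delete": ("DELETE", "/{kind}/{connector_id}"),
}


def connector_route(kind: str, operation: str, connector_id: str = "") -> Tuple[str, str]:
    """Return (method, path) for a source or destination connector operation."""
    if kind not in ("sources", "destinations"):
        raise ValueError(f"unknown connector kind: {kind!r}")
    method, template = CONNECTOR_ROUTES[operation]
    if "{connector_id}" in template and not connector_id:
        raise ValueError(f"operation {operation!r} requires a connector ID")
    return method, template.format(kind=kind, connector_id=connector_id)


print(connector_route("sources", "list"))
print(connector_route("destinations", "update", "<connector-id>"))
```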

Workflows

You can list, get, create, run, update, and delete workflows.

For general information, see Workflows.

List workflows

To list workflows, use the UnstructuredClient object’s workflows.list_workflows function (for the Python SDK) or the GET method to call the /workflows endpoint (for curl or Postman).

To filter the list of workflows, use one or more of the following ListWorkflowsRequest parameters (for the Python SDK) or query parameters (for curl or Postman):

  • source_id=<connector-id>, replacing <connector-id> with the source connector’s unique ID. To get this ID, see List source connectors.
  • destination_id=<connector-id>, replacing <connector-id> with the destination connector’s unique ID. To get this ID, see List destination connectors.
  • status=<status>, replacing <status> with one of the following workflow statuses: active or inactive.

You can specify multiple query parameters, for example ?source_id=<connector-id>&status=<status>.
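
A minimal sketch of combining these filters into one query string with the Python standard library follows. The list_workflows_url helper is a hypothetical name. Filters that are not supplied are left out of the URL.

```python
from typing import Optional
from urllib.parse import urlencode


def list_workflows_url(
    base_url: str,
    source_id: Optional[str] = None,
    destination_id: Optional[str] = None,
    status: Optional[str] = None,
) -> str:
    """Return the /workflows URL with any supplied filters joined by '&'."""
    candidates = (
        ("source_id", source_id),
        ("destination_id", destination_id),
        ("status", status),
    )
    # Keep only the filters that were actually provided.
    params = {name: value for name, value in candidates if value is not None}
    url = base_url.rstrip("/") + "/workflows"
    return url + "?" + urlencode(params) if params else url


base = "https://platform.unstructuredapp.io/api/v1"
print(list_workflows_url(base, source_id="<connector-id>", status="active"))
```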

Get a workflow

To get information about a workflow, use the UnstructuredClient object’s workflows.get_workflow function (for the Python SDK) or the GET method to call the /workflows/<workflow-id> endpoint (for curl or Postman), replacing <workflow-id> with the workflow’s unique ID. To get this ID, see List workflows.

Create a workflow

To create a workflow, use the UnstructuredClient object’s workflows.create_workflow function (for the Python SDK) or the POST method to call the /workflows endpoint (for curl or Postman).

In the CreateWorkflow object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the workflow. For the specific settings to include, see Create a workflow.

Run a workflow

To run a workflow manually, use the UnstructuredClient object’s workflows.run_workflow function (for the Python SDK) or the POST method to call the /workflows/<workflow-id>/run endpoint (for curl or Postman), replacing <workflow-id> with the workflow’s unique ID. To get this ID, see List workflows.
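
For REST clients, a manual run is a POST to the workflow's /run path. The sketch below builds that request with the Python standard library without sending it. The run_workflow_request helper is hypothetical, and the unstructured-api-key header name is an assumption.

```python
import urllib.request
from urllib.parse import quote


def run_workflow_request(
    base_url: str, api_key: str, workflow_id: str
) -> urllib.request.Request:
    """Build (but do not send) a POST to /workflows/<workflow-id>/run."""
    # quote() escapes the ID so that it is safe as a single path segment.
    url = f"{base_url.rstrip('/')}/workflows/{quote(workflow_id, safe='')}/run"
    return urllib.request.Request(
        url=url,
        method="POST",
        headers={
            "accept": "application/json",
            # Assumption: the API key travels in this header.
            "unstructured-api-key": api_key,
        },
    )


request = run_workflow_request(
    "https://platform.unstructuredapp.io/api/v1", "<your-unstructured-api-key>", "1234"
)
print(request.get_method(), request.full_url)
```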

To run a workflow on a schedule instead, specify the schedule setting in the request body when you create or update a workflow. See Create a workflow or Update a workflow.

Update a workflow

To update information about a workflow, use the UnstructuredClient object’s workflows.update_workflow function (for the Python SDK) or the PUT method to call the /workflows/<workflow-id> endpoint (for curl or Postman), replacing <workflow-id> with the workflow’s unique ID. To get this ID, see List workflows.

In the UpdateWorkflow object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the workflow. For the specific settings to include, see Update a workflow.

Delete a workflow

To delete a workflow, use the UnstructuredClient object’s workflows.delete_workflow function (for the Python SDK) or the DELETE method to call the /workflows/<workflow-id> endpoint (for curl or Postman), replacing <workflow-id> with the workflow’s unique ID. To get this ID, see List workflows.

Jobs

You can list, get, and cancel jobs.

A job is created automatically whenever a workflow runs on a schedule; see Create a workflow. A job is also created whenever you run a workflow manually; see Run a workflow.

For general information, see Jobs.

List jobs

To list jobs, use the UnstructuredClient object’s jobs.list_jobs function (for the Python SDK) or the GET method to call the /jobs endpoint (for curl or Postman).

To filter the list of jobs, use one or both of the following ListJobsRequest parameters (for the Python SDK) or query parameters (for curl or Postman):

  • workflow_id=<workflow-id>, replacing <workflow-id> with the workflow’s unique ID. To get this ID, see List workflows.
  • status=<status>, replacing <status> with one of the following job statuses: failed, finished, or running.

For curl or Postman, you can specify multiple query parameters as ?workflow_id=<workflow-id>&status=<status>.
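
Because status accepts only the three values listed above, a client can reject other values before making a call. A minimal sketch with the Python standard library follows; list_jobs_url is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import urlencode

JOB_STATUSES = ("failed", "finished", "running")


def list_jobs_url(
    base_url: str, workflow_id: Optional[str] = None, status: Optional[str] = None
) -> str:
    """Return the /jobs URL, validating the status filter before using it."""
    if status is not None and status not in JOB_STATUSES:
        raise ValueError(f"status must be one of {JOB_STATUSES}, got {status!r}")
    params = {}
    if workflow_id is not None:
        params["workflow_id"] = workflow_id
    if status is not None:
        params["status"] = status
    query = "?" + urlencode(params) if params else ""
    return base_url.rstrip("/") + "/jobs" + query


base = "https://platform.unstructuredapp.io/api/v1"
print(list_jobs_url(base, workflow_id="<workflow-id>", status="running"))
```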

Get a job

To get information about a job, use the UnstructuredClient object’s jobs.get_job function (for the Python SDK) or the GET method to call the /jobs/<job-id> endpoint (for curl or Postman), replacing <job-id> with the job’s unique ID. To get this ID, see List jobs.

Cancel a job

To cancel a running job, use the UnstructuredClient object’s jobs.cancel_job function (for the Python SDK) or the POST method to call the /jobs/<job-id>/cancel endpoint (for curl or Postman), replacing <job-id> with the job’s unique ID. To get this ID, see List jobs.
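
The following sketch builds the REST cancel request with the Python standard library without sending it. As a design choice of this sketch, not a documented API rule, it returns None for a job whose status is not running. The cancel_job_request helper is hypothetical, and the unstructured-api-key header name is an assumption.

```python
import urllib.request
from typing import Optional
from urllib.parse import quote


def cancel_job_request(
    base_url: str, api_key: str, job_id: str, status: str
) -> Optional[urllib.request.Request]:
    """Build a POST to /jobs/<job-id>/cancel, or return None if not running."""
    if status != "running":
        return None  # This sketch skips jobs that are not running.
    url = f"{base_url.rstrip('/')}/jobs/{quote(job_id, safe='')}/cancel"
    return urllib.request.Request(
        url=url,
        method="POST",
        headers={
            "accept": "application/json",
            # Assumption: the API key travels in this header.
            "unstructured-api-key": api_key,
        },
    )


base = "https://platform.unstructuredapp.io/api/v1"
request = cancel_job_request(base, "<your-unstructured-api-key>", "42", "running")
print(request.get_method(), request.full_url)
```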