Overview

The Unstructured UI features a no-code user interface for transforming your unstructured data into data that is ready for retrieval-augmented generation (RAG). The Unstructured Workflow Endpoint, part of the Unstructured API, lets you automate the Unstructured UI programmatically, and it also supports documentation, reporting, and recovery needs. It offers a full range of partitioning, chunking, embedding, and enrichment options for your files and data. It is designed to batch-process files and data in remote locations; send processed results to various storage locations, databases, and vector stores; and use the latest and highest-performing models on the market today. It has built-in logic designed to deliver the highest quality results at the lowest cost. This page provides an overview of the Unstructured Workflow Endpoint.

Getting started

Choose one of the following options to get started with the Unstructured Workflow Endpoint:

Follow the quickstart, which uses the Unstructured Python SDK from a remotely hosted Google Colab notebook.
Start using the Unstructured Python SDK.
Start using a REST client, such as curl or Postman.

Quickstart

This quickstart uses the Unstructured Python SDK to call the Unstructured Workflow Endpoint to get your data RAG-ready. The Python code for this quickstart is in a remotely hosted Google Colab notebook. Data is processed on Unstructured-hosted compute resources. The requirements are as follows:

A compatible source (input) location that contains your data for Unstructured to process. See the list of supported source types. This quickstart uses an Amazon S3 bucket as the source location. If you use a different source type, you will need to modify the quickstart notebook accordingly.
For document-based source locations, compatible files in that location. See the list of supported file types. If you do not have any files available, you can download some from the example-docs folder in the Unstructured-IO/unstructured-ingest repository in GitHub.
A compatible destination (output) location for Unstructured to put the processed data. See the list of supported destination types. For this quickstart’s destination location, a different folder in the same Amazon S3 bucket as the source location is used. If you use a different destination S3 bucket or a different destination type, you will need to modify the quickstart notebook accordingly.

Sign in to your Unstructured account:
- If you do not already have an Unstructured account, go to https://unstructured.io/contact and fill out the online form to indicate your interest.
- If you already have an Unstructured account, sign in by using the URL of the sign in page that Unstructured provided to you when your Unstructured account was created. After you sign in, the Unstructured user interface (UI) then appears, and you can start using it right away. If you do not have this URL, contact Unstructured Sales at sales@unstructured.io.
Get your Unstructured API key:
a. In the Unstructured UI, click API Keys on the sidebar.
b. Click Generate API Key.
c. Follow the on-screen instructions to finish generating the key.
d. Click the Copy icon next to your new key to add the key to your system’s clipboard. If you lose this key, simply return and click the Copy icon again.

Create and set up the S3 bucket

This quickstart uses an Amazon S3 bucket as both the source location and the destination location. (You can use other source and destination types that are supported by Unstructured. If you use a different source or destination type, or if you use a different S3 bucket for the destination location, you will need to modify the quickstart notebook accordingly.)

Inside of the S3 bucket, a folder named input represents the source location. This is where your files to be processed will be stored. The S3 URI to the source location will be s3://<your-bucket-name>/input.

Inside of the same S3 bucket, a folder named output represents the destination location. This is where Unstructured will put the processed data. The S3 URI to the destination location will be s3://<your-bucket-name>/output.

Learn how to create an S3 bucket and set it up for Unstructured. (Do not run the Python SDK code or REST commands at the end of those setup instructions.)
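As a small illustration of the folder layout described above, the following sketch derives the source and destination S3 URIs from a bucket name. The helper name `s3_locations` and the bucket name are hypothetical and are not part of the quickstart notebook.

```python
def s3_locations(bucket_name: str) -> dict:
    """Return the source and destination S3 URIs for the quickstart layout."""
    base = f"s3://{bucket_name}"
    return {
        "source": f"{base}/input",        # files waiting to be processed
        "destination": f"{base}/output",  # where the processed data is written
    }


# "my-example-bucket" is a placeholder; substitute your own bucket name.
locations = s3_locations("my-example-bucket")
print(locations["source"])
print(locations["destination"])
```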

Run the quickstart notebook

After your S3 bucket is created and set up, follow the instructions in this quickstart notebook.

View the processed data

After you run the quickstart notebook, go to your destination location to view the processed data.

Unstructured Python SDK

Watch the following 4-minute video to learn how to use the Python SDK to call the Unstructured Workflow Endpoint to create connectors in the Unstructured UI.

Watch the following 4-minute video to learn how to use the Python SDK to call the Unstructured Workflow Endpoint to create workflows and jobs in the Unstructured UI.

Open a related notebook that covers many of the concepts that are shown in the preceding videos.

The Unstructured Python SDK, beginning with version 0.30.6, allows you to call the Unstructured Workflow Endpoint through standard Python code. To install the Unstructured Python SDK, run the following command from within your Python virtual environment:

pip install "unstructured-client>=0.30.6"

If you already have the Unstructured Python SDK installed, upgrade to at least version 0.30.6 by running the following command instead:

pip install --upgrade "unstructured-client>=0.30.6"

The Unstructured Python SDK code examples, shown later on this page and on related pages, use the following environment variable, which you can set as follows:

export UNSTRUCTURED_API_KEY="<your-unstructured-api-key>"

This environment variable enables you to more easily run the following Unstructured Python SDK examples and helps prevent you from storing scripts that contain sensitive API keys in public source code repositories. To get your Unstructured API key, do the following:

Sign in to your Unstructured account:
- If you do not already have an Unstructured account, go to https://unstructured.io/contact and fill out the online form to indicate your interest.
- If you already have an Unstructured account, sign in by using the URL of the sign in page that Unstructured provided to you when your Unstructured account was created. After you sign in, the Unstructured user interface (UI) then appears, and you can start using it right away. If you do not have this URL, contact Unstructured Sales at sales@unstructured.io.
Get your Unstructured API key:
a. In the Unstructured UI, click API Keys on the sidebar.
b. Click Generate API Key.
c. Follow the on-screen instructions to finish generating the key.
d. Click the Copy icon next to your new key to add the key to your system’s clipboard. If you lose this key, simply return and click the Copy icon again.

Calls made by the Unstructured Python SDK’s unstructured_client functions for creating, listing, updating, and deleting connectors, workflows, and jobs in the Unstructured UI all use the Unstructured Workflow Endpoint URL. This URL was provided to you when your Unstructured account was created. If you do not have this URL, contact Unstructured Sales at sales@unstructured.io.

The default URL for the Unstructured Workflow Endpoint is https://platform.unstructuredapp.io/api/v1. However, you should always use the URL that was provided to you when your Unstructured account was created.
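The following is a minimal sketch of passing your account's URL to the SDK through the UnstructuredClient constructor's server_url parameter. The helper names `client_settings` and `make_client` are hypothetical, and reading the URL from an UNSTRUCTURED_API_URL environment variable is an assumption borrowed from the curl examples on this page, not an SDK requirement.

```python
import os

DEFAULT_WORKFLOW_URL = "https://platform.unstructuredapp.io/api/v1"


def client_settings(env=os.environ) -> dict:
    """Collect UnstructuredClient keyword arguments from environment variables."""
    api_key = env.get("UNSTRUCTURED_API_KEY")
    if not api_key:
        raise RuntimeError("Set UNSTRUCTURED_API_KEY before creating the client.")
    return {
        "api_key_auth": api_key,
        # Assumption: the account-specific URL is stored in UNSTRUCTURED_API_URL.
        # The default URL is used only if that variable is not set.
        "server_url": env.get("UNSTRUCTURED_API_URL", DEFAULT_WORKFLOW_URL),
    }


def make_client():
    """Create the SDK client; requires the unstructured-client package."""
    # Imported lazily so that client_settings works without the SDK installed.
    from unstructured_client import UnstructuredClient

    return UnstructuredClient(**client_settings())


# Show which URL would be used for a placeholder key.
settings = client_settings({"UNSTRUCTURED_API_KEY": "<your-unstructured-api-key>"})
print(settings["server_url"])
```

Keeping the settings in one helper means every script in a project resolves the URL the same way, which makes it harder to call the default URL by accident.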

To specify an API URL in your code, set the server_url parameter in the UnstructuredClient constructor to the target API URL. The Unstructured Workflow Endpoint enables you to work with connectors, workflows, and jobs in the Unstructured UI.

A source connector ingests files or data into Unstructured from a source location.
A destination connector sends the processed data from Unstructured to a destination location.
A workflow defines how Unstructured will process the data.
A job runs a workflow at a specific point in time.

For general information about these objects, see:

Skip ahead to start learning about how to use the Unstructured Python SDK to work with connectors, workflows, and jobs programmatically.

REST endpoints

The Unstructured Workflow Endpoint is callable from a set of Representational State Transfer (REST) endpoints, which you can call through standard REST-enabled utilities, tools, programming languages, packages, and libraries. The examples, shown later on this page and on related pages, describe how to call the Unstructured Workflow Endpoint with curl and Postman. You can adapt this information as needed for your preferred programming languages and libraries, for example by using the requests library with Python.
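As one way to adapt the curl examples to Python, the sketch below builds the same kind of request with the standard library's urllib.request module (swapped in here for the requests library so that nothing extra needs to be installed). The helper name `build_request` is hypothetical; the header names match those used in the curl examples on this page.

```python
import json
import os
import urllib.request


def build_request(path, api_url, api_key, method="GET", body=None):
    """Build (but do not send) a request to the Workflow Endpoint."""
    headers = {
        "accept": "application/json",
        "unstructured-api-key": api_key,
    }
    data = None
    if body is not None:
        # Requests with a body are sent as JSON.
        headers["content-type"] = "application/json"
        data = json.dumps(body).encode("utf-8")
    url = api_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(url, data=data, headers=headers, method=method)


request = build_request(
    "/sources",
    api_url=os.getenv("UNSTRUCTURED_API_URL", "https://platform.unstructuredapp.io/api/v1"),
    api_key=os.getenv("UNSTRUCTURED_API_KEY", "<your-unstructured-api-key>"),
)
print(request.get_method(), request.full_url)

# To actually send the request, uncomment the following lines:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```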

You can also use the Unstructured Workflow Endpoint - Swagger UI to call the REST endpoints that are available through the default Unstructured Workflow Endpoint URL: https://platform.unstructuredapp.io. To use the Swagger UI, you must provide your Unstructured API key with each call. To get this API key, see the quickstart, earlier on this page.

Note that you should always use the URL that was provided to you when your Unstructured account was created. If you do not have this URL, contact Unstructured Sales at sales@unstructured.io.

curl and Postman

The following curl examples use the following environment variables, which you can set as follows:

export UNSTRUCTURED_API_URL="https://platform.unstructuredapp.io/api/v1"
export UNSTRUCTURED_API_KEY="<your-unstructured-api-key>"

For the API URL, use the URL that was provided to you when your Unstructured account was created. If you do not have this URL, contact Unstructured Sales at sales@unstructured.io.

These environment variables enable you to more easily run the following curl examples and help prevent you from storing scripts that contain sensitive URLs and API keys in public source code repositories. To get your Unstructured API key, do the following:

Sign in to your Unstructured account:
- If you do not already have an Unstructured account, go to https://unstructured.io/contact and fill out the online form to indicate your interest.
- If you already have an Unstructured account, sign in by using the URL of the sign in page that Unstructured provided to you when your Unstructured account was created. After you sign in, the Unstructured user interface (UI) then appears, and you can start using it right away. If you do not have this URL, contact Unstructured Sales at sales@unstructured.io.
Get your Unstructured API key:
a. In the Unstructured UI, click API Keys on the sidebar.
b. Click Generate API Key.
c. Follow the on-screen instructions to finish generating the key.
d. Click the Copy icon next to your new key to add the key to your system’s clipboard. If you lose this key, simply return and click the Copy icon again.

The following Postman examples use variables, which you can set as follows:

In Postman, on your workspace’s sidebar, click Environments.
Click Globals.
Create two global variables with the following settings:
- Variable: UNSTRUCTURED_API_URL
- Type: default
- Initial value: The Unstructured Workflow Endpoint URL that was provided to you when your Unstructured account was created.
- Current value: The Unstructured Workflow Endpoint URL that was provided to you when your Unstructured account was created.
- Variable: UNSTRUCTURED_API_KEY
- Type: secret
- Initial value: <your-unstructured-api-key>
- Current value: <your-unstructured-api-key>
Click Save.

These variables enable you to more easily run the following examples in Postman and help prevent you from storing Postman collections that contain sensitive URLs and API keys in public source code repositories.

Unstructured offers a Postman collection that you can import into Postman to make Workflow Endpoint requests through a graphical user interface.

Install Postman.
Sign in to Postman.
In your workspace, click Import.

In the Paste cURL, Raw text or URL box, enter the following URL, and then press Enter:

https://raw.githubusercontent.com/Unstructured-IO/docs/main/examplecode/codesamples/api/Unstructured-REST-API-Workflow-Endpoint.postman_collection.json

On the sidebar, click Collections.
Expand Unstructured REST API - Workflow Endpoint.
Select the request that you want to use.
As applicable, modify the URL as needed to specify any required resource IDs for the request.
On the Headers tab, next to unstructured-api-key, enter your Unstructured API key in the Value column. As applicable, add, remove, or modify any other required headers for the request.
As applicable, on the Params tab, add, remove, or modify any required parameters for the request.
As applicable, on the Body tab, add, remove, or modify the required request body for the request.
Click Send.
To save the response, in the response area, click the ellipses, and then click Save response to file.

The Unstructured Workflow Endpoint enables you to work with connectors, workflows, and jobs in the Unstructured UI.

A source connector ingests files or data into Unstructured from a source location.
A destination connector sends the processed data from Unstructured to a destination location.
A workflow defines how Unstructured will process the data.
A job runs a workflow at a specific point in time.

For general information about these objects, see:

Skip ahead to start learning about how to use the REST endpoints to work with connectors, workflows, and jobs programmatically.

Restrictions

The following Unstructured SDKs, tools, and libraries do not work with the Unstructured Workflow Endpoint:

The Unstructured JavaScript/TypeScript SDK
Local single-file POST requests to the Unstructured Partition Endpoint
The Unstructured open source Python library
The Unstructured Ingest CLI
The Unstructured Ingest Python library

The following Unstructured API URL is also not supported: https://api.unstructuredapp.io/general/v0/general (the default Unstructured Partition Endpoint URL).

Connectors

You can list, get, create, update, delete, and test source connectors. You can also list, get, create, update, delete, and test destination connectors. For general information, see Connectors.

List source connectors

To list source connectors, use the UnstructuredClient object’s sources.list_sources function (for the Python SDK) or the GET method to call the /sources endpoint (for curl or Postman). To filter the list of source connectors, use the ListSourcesRequest object’s source_type parameter (for the Python SDK) or the query parameter source_type=<type> (for curl or Postman), replacing <type> with the source connector type’s unique ID (for example, for the Amazon S3 source connector type, S3 for the Python SDK or s3 for curl or Postman). To get this ID, see Sources.

Python SDK

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import ListSourcesRequest
from unstructured_client.models.shared import SourceConnectorType

client = UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
)

response = client.sources.list_sources(
    request=ListSourcesRequest(
        source_type=SourceConnectorType.<type> # Optional, list only for this source type.
    )
)

# Print the list in alphabetical order by connector name.
sorted_sources = sorted(
    response.response_list_sources, 
    key=lambda source: source.name.lower()
)

for source in sorted_sources:
    print(f"{source.name} ({source.id})")

Python SDK (async)

import os
import asyncio

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import ListSourcesRequest
from unstructured_client.models.shared import SourceConnectorType

async def list_sources():
    client = UnstructuredClient(
        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
    )

    response = await client.sources.list_sources_async(
        request=ListSourcesRequest(
            source_type=SourceConnectorType.<type> # Optional, list only for this source type. 
        )
    )

    # Print the list in alphabetical order by connector name.
    sorted_sources = sorted(
        response.response_list_sources, 
        key=lambda source: source.name.lower()
    )

    for source in sorted_sources:
        print(f"{source.name} ({source.id})")

asyncio.run(list_sources())

curl

curl --request 'GET' --location \
"$UNSTRUCTURED_API_URL/sources" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"

To filter the list of source connectors:

curl --request 'GET' --location \
"$UNSTRUCTURED_API_URL/sources?source_type=<type>" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"

Postman

In the method drop-down list, select GET.
In the address box, enter the following URL:
```
{{UNSTRUCTURED_API_URL}}/sources
```
On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
To filter the list of source connectors, on the Params tab, enter the following query parameter:
- Key: source_type, Value: <type>
Click Send.
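The optional source_type filter described above is an ordinary query-string parameter. The following sketch shows one way to build the list URL with and without the filter; the helper name `list_sources_url` is hypothetical, and the lowercase s3 value follows the REST convention noted earlier in this section.

```python
from urllib.parse import urlencode


def list_sources_url(api_url, source_type=None):
    """Return the URL for listing source connectors, optionally filtered by type."""
    url = api_url.rstrip("/") + "/sources"
    if source_type:
        # For REST calls, the connector type ID is lowercase, for example "s3".
        url += "?" + urlencode({"source_type": source_type})
    return url


base_url = "https://platform.unstructuredapp.io/api/v1"  # default URL; use your own
print(list_sources_url(base_url))
print(list_sources_url(base_url, "s3"))
```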

Get a source connector

To get information about a source connector, use the UnstructuredClient object’s sources.get_source function (for the Python SDK) or the GET method to call the /sources/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the source connector’s unique ID. To get this ID, see List source connectors.

Python SDK

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import GetSourceRequest

client = UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
)

response = client.sources.get_source(
    request=GetSourceRequest(
        source_id="<connector-id>"
    )
)

info = response.source_connector_information

print(f"name: {info.name}")
    
for key, value in info.config:
    print(f"{key}: {value}")

Python SDK (async)

import os
import asyncio

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import GetSourceRequest

async def get_source():
    client = UnstructuredClient(
        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
    )

    response = await client.sources.get_source_async(
        request=GetSourceRequest(
            source_id="<connector-id>"
        )
    )

    info = response.source_connector_information

    print(f"name: {info.name}")
        
    for key, value in info.config:
        print(f"{key}: {value}")

asyncio.run(get_source())

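curl

curl --request 'GET' --location \
"$UNSTRUCTURED_API_URL/sources/<connector-id>" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"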

Postman

In the method drop-down list, select GET.
In the address box, enter the following URL:
```
{{UNSTRUCTURED_API_URL}}/sources/<connector-id>
```
On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
Click Send.

Create a source connector

To create a source connector, use the UnstructuredClient object’s sources.create_source function (for the Python SDK) or the POST method to call the /sources endpoint (for curl or Postman). In the CreateSourceConnector object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the connector. For the specific settings to include, which differ by connector, see Sources. For the Python SDK, replace <type> with the source connector type’s unique ID (for example, for the Amazon S3 source connector type, S3). To get this ID, see Sources.

Python SDK

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import CreateSourceRequest
from unstructured_client.models.shared import (
    CreateSourceConnector,
    SourceConnectorType,
    <type>SourceConnectorConfigInput
)

client = UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
)

source_connector = CreateSourceConnector(
    name="<name>",
    type=SourceConnectorType.<type>,
    config=<type>SourceConnectorConfigInput(
        # Specify the settings for the connector here.
    )
)

response = client.sources.create_source(
    request=CreateSourceRequest(
        create_source_connector=source_connector
    )
)

info = response.source_connector_information

print(f"name: {info.name}")
print(f"id: {info.id}")
    
for key, value in info.config:
    print(f"{key}: {value}")

Python SDK (async)

import os
import asyncio

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import CreateSourceRequest
from unstructured_client.models.shared import (
    CreateSourceConnector,
    SourceConnectorType,
    <type>SourceConnectorConfigInput
)

async def create_source():
    client = UnstructuredClient(
        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
    )

    source_connector = CreateSourceConnector(
        name="<name>",
        type=SourceConnectorType.<type>,
        config=<type>SourceConnectorConfigInput(
            # Specify the settings for the connector here.
        )
    )

    response = await client.sources.create_source_async(
        request=CreateSourceRequest(
            create_source_connector=source_connector
        )
    )

    info = response.source_connector_information

    print(f"name: {info.name}")
    print(f"id: {info.id}")
        
    for key, value in info.config:
        print(f"{key}: {value}")

asyncio.run(create_source())

curl

curl --request 'POST' --location \
"$UNSTRUCTURED_API_URL/sources" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY" \
--header 'content-type: application/json' \
--data \
'{
    # Specify the settings for the connector here.
}'

Postman

In the method drop-down list, select POST.
In the address box, enter the following URL:
```
{{UNSTRUCTURED_API_URL}}/sources
```
On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
- Key: content-type, Value: application/json
On the Body tab, select raw and JSON, and specify the settings for the connector.
Click Send.
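The request body for creating a connector must be valid JSON, so the placeholder comment in the curl example has to be replaced with real settings. The sketch below assembles such a body in Python, assuming that the REST body mirrors the name, type, and config fields of the SDK's CreateSourceConnector object. The helper name `create_source_body` is hypothetical, and the setting name and value are placeholders; see Sources for the settings that each connector type actually expects.

```python
import json


def create_source_body(name, connector_type, config):
    """Assemble a JSON body for creating a source connector."""
    return json.dumps(
        {"name": name, "type": connector_type, "config": config},
        indent=4,
    )


body = create_source_body(
    name="<name>",
    connector_type="s3",  # lowercase type ID, as used for curl and Postman
    config={"<setting-name>": "<setting-value>"},  # placeholder settings
)
print(body)
```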

Update a source connector

To update information about a source connector, use the UnstructuredClient object’s sources.update_source function (for the Python SDK) or the PUT method to call the /sources/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the source connector’s unique ID. To get this ID, see List source connectors. In the UpdateSourceConnector object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the connector. For the specific settings to include, which differ by connector, see Sources. For the Python SDK, replace <type> with the source connector type’s unique ID (for example, for the Amazon S3 source connector type, S3). To get this ID, see Sources. You must specify all of the settings for the connector, even for settings that are not changing. You can change any of the connector’s settings except for its name and type.

Python SDK

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import UpdateSourceRequest
from unstructured_client.models.shared import (
    UpdateSourceConnector,
    <type>SourceConnectorConfigInput
)

client = UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
)

source_connector = UpdateSourceConnector(
    config=<type>SourceConnectorConfigInput(
        # Specify the settings for the connector here.
    )
)

response = client.sources.update_source(
    request=UpdateSourceRequest(
        source_id="<connector-id>",
        update_source_connector=source_connector
    )
)

info = response.source_connector_information

print(f"name: {info.name}")
print(f"id: {info.id}")
    
for key, value in info.config:
    print(f"{key}: {value}")

Python SDK (async)

import os
import asyncio

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import UpdateSourceRequest
from unstructured_client.models.shared import (
    UpdateSourceConnector,
    <type>SourceConnectorConfigInput
)

async def update_source():
    client = UnstructuredClient(
        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
    )

    source_connector = UpdateSourceConnector(
        config=<type>SourceConnectorConfigInput(
            # Specify the settings for the connector here.
        )
    )

    response = await client.sources.update_source_async(
        request=UpdateSourceRequest(
            source_id="<connector-id>",
            update_source_connector=source_connector
        )
    )

    info = response.source_connector_information

    print(f"name: {info.name}")
    print(f"id: {info.id}")
        
    for key, value in info.config:
        print(f"{key}: {value}")

asyncio.run(update_source())

curl

curl --request 'PUT' --location \
"$UNSTRUCTURED_API_URL/sources/<connector-id>" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY" \
--header 'content-type: application/json' \
--data \
'{
    # Specify the settings for the connector here.
}'

Postman

In the method drop-down list, select PUT.
In the address box, enter the following URL:
```
{{UNSTRUCTURED_API_URL}}/sources/<connector-id>
```
On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
- Key: content-type, Value: application/json
On the Body tab, select raw and JSON, and specify the settings for the connector.
Click Send.
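Because an update must include every setting, even those that are not changing, one approach is to start from the connector's current settings and overlay only the changes. The sketch below shows that merge with plain dictionaries, assuming that the update body carries a single config field as the UpdateSourceConnector object does. The helper name `full_update_body` and the setting names are hypothetical.

```python
def full_update_body(current_config: dict, changes: dict) -> dict:
    """Return an update body containing all settings, with changes applied."""
    # Copy the current settings first so that unchanged values are preserved,
    # then overlay the new values. Neither input dictionary is modified.
    return {"config": {**current_config, **changes}}


# Placeholder settings; in practice, read the current ones with "Get a source connector".
current = {"setting_a": "old-value", "setting_b": True}
body = full_update_body(current, {"setting_a": "new-value"})
print(body)
```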

Delete a source connector

To delete a source connector, use the UnstructuredClient object’s sources.delete_source function (for the Python SDK) or the DELETE method to call the /sources/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the source connector’s unique ID. To get this ID, see List source connectors.

Python SDK

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import DeleteSourceRequest

client = UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
)

response = client.sources.delete_source(
    request=DeleteSourceRequest(
        source_id="<connector-id>"
    )
)

print(response.raw_response)

Python SDK (async)

import os
import asyncio

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import DeleteSourceRequest

async def delete_source():
    client = UnstructuredClient(
        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
    )

    response = await client.sources.delete_source_async(
        request=DeleteSourceRequest(
            source_id="<connector-id>"
        )
    )

    print(response.raw_response)

asyncio.run(delete_source())

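curl

curl --request 'DELETE' --location \
"$UNSTRUCTURED_API_URL/sources/<connector-id>" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"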

Postman

In the method drop-down list, select DELETE.
In the address box, enter the following URL:
```
{{UNSTRUCTURED_API_URL}}/sources/<connector-id>
```
On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
Click Send.

Test a source connector

To test a source connector, use the POST method to call the /sources/<connector-id>/connection-check endpoint (for curl or Postman), replacing <connector-id> with the connector’s unique ID. To get this ID, see List source connectors. The Python SDK does not support testing source connectors.

curl

curl --request 'POST' --location \
"$UNSTRUCTURED_API_URL/sources/<connector-id>/connection-check" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"

Postman

In the method drop-down list, select POST.

In the address box, enter the following URL:

{{UNSTRUCTURED_API_URL}}/sources/<connector-id>/connection-check

On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
Click Send.

To get information about the most recent connection check for a source connector, use the GET method to call the /sources/<connector-id>/connection-check endpoint (for curl or Postman), replacing <connector-id> with the connector’s unique ID. To get this ID, see List source connectors. The Python SDK does not support getting information about the most recent connection check for a source connector.

curl

curl --request 'GET' --location \
"$UNSTRUCTURED_API_URL/sources/<connector-id>/connection-check" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"

Postman

In the method drop-down list, select GET.

In the address box, enter the following URL:

{{UNSTRUCTURED_API_URL}}/sources/<connector-id>/connection-check

On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
Click Send.
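Because the Python SDK does not cover connection checks, Python code has to call the REST endpoint directly. The sketch below builds the POST request that starts a check and the GET request that reads the most recent one, using the standard library's urllib.request module. The helper name `connection_check_request` is hypothetical, and the requests are only constructed here, not sent.

```python
import urllib.request


def connection_check_request(api_url, api_key, connector_id, start=False):
    """Build a request that starts (POST) or reads (GET) a source connection check."""
    url = f"{api_url.rstrip('/')}/sources/{connector_id}/connection-check"
    headers = {"accept": "application/json", "unstructured-api-key": api_key}
    method = "POST" if start else "GET"
    return urllib.request.Request(url, headers=headers, method=method)


base_url = "https://platform.unstructuredapp.io/api/v1"  # default URL; use your own
start_check = connection_check_request(base_url, "<your-unstructured-api-key>", "<connector-id>", start=True)
read_check = connection_check_request(base_url, "<your-unstructured-api-key>", "<connector-id>")
print(start_check.get_method(), start_check.full_url)
print(read_check.get_method(), read_check.full_url)
```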

List destination connectors

To list destination connectors, use the UnstructuredClient object’s destinations.list_destinations function (for the Python SDK) or the GET method to call the /destinations endpoint (for curl or Postman). To filter the list of destination connectors, use the ListDestinationsRequest object’s destination_type parameter (for the Python SDK) or the query parameter destination_type=<type> (for curl or Postman), replacing <type> with the destination connector type’s unique ID (for example, for the Amazon S3 destination connector type, S3 for the Python SDK or s3 for curl or Postman). To get this ID, see Destinations.

Python SDK

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import ListDestinationsRequest
from unstructured_client.models.shared import DestinationConnectorType

client = UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
)

response = client.destinations.list_destinations(
    request=ListDestinationsRequest(
        destination_type=DestinationConnectorType.<type> # Optional, list only for this destination type.
    )
)

# Print the list in alphabetical order by connector name.
sorted_destinations = sorted(
    response.response_list_destinations, 
    key=lambda destination: destination.name.lower()
)

for destination in sorted_destinations:
    print(f"{destination.name} ({destination.id})")

Python SDK (async)

import os
import asyncio

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import ListDestinationsRequest
from unstructured_client.models.shared import DestinationConnectorType

async def list_destinations():
    client = UnstructuredClient(
        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
    )

    response = await client.destinations.list_destinations_async(
        request=ListDestinationsRequest(
            destination_type=DestinationConnectorType.<type>  # Optional, list only for this destination type.
        )
    )

    # Print the list in alphabetical order by connector name.
    sorted_destinations = sorted(
        response.response_list_destinations, 
        key=lambda destination: destination.name.lower()
    )

    for destination in sorted_destinations:
        print(f"{destination.name} ({destination.id})")

asyncio.run(list_destinations())

curl

curl --request 'GET' --location \
"$UNSTRUCTURED_API_URL/destinations" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"

To filter the list of destination connectors:

curl --request 'GET' --location \
"$UNSTRUCTURED_API_URL/destinations?destination_type=<type>" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"

Postman

In the method drop-down list, select GET.
In the address box, enter the following URL:
```
{{UNSTRUCTURED_API_URL}}/destinations
```
On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
To filter the list of destination connectors, on the Params tab, enter the following query parameter:
- Key: destination_type, Value: <type>
Click Send.

Get a destination connector

To get information about a destination connector, use the UnstructuredClient object’s destinations.get_destination function (for the Python SDK) or the GET method to call the /destinations/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the destination connector’s unique ID. To get this ID, see List destination connectors.

Python SDK

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import GetDestinationRequest

client = UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
)

response = client.destinations.get_destination(
    request=GetDestinationRequest(
        destination_id="<connector-id>"
    )
)

info = response.destination_connector_information

print(f"name: {info.name}")
    
for key, value in info.config:
    print(f"{key}: {value}")

Python SDK (async)

import os
import asyncio

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import GetDestinationRequest

async def get_destination():
    client = UnstructuredClient(
        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
    )

    response = await client.destinations.get_destination_async(
        request=GetDestinationRequest(
            destination_id="<connector-id>"
        )
    )

    info = response.destination_connector_information

    print(f"name: {info.name}")
        
    for key, value in info.config:
        print(f"{key}: {value}")

asyncio.run(get_destination())

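curl

curl --request 'GET' --location \
"$UNSTRUCTURED_API_URL/destinations/<connector-id>" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"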

Postman

In the method drop-down list, select GET.

In the address box, enter the following URL:

{{UNSTRUCTURED_API_URL}}/destinations/<connector-id>

On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
Click Send.

Create a destination connector

To create a destination connector, use the UnstructuredClient object’s destinations.create_destination function (for the Python SDK) or the POST method to call the /destinations endpoint (for curl or Postman). In the CreateDestinationConnector object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the connector. For the specific settings to include, which differ by connector, see Destinations. For the Python SDK, replace <type> with the destination connector type’s unique ID (for example, for the Amazon S3 destination connector type, S3). To get this ID, see Destinations.

Python SDK

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import CreateDestinationRequest
from unstructured_client.models.shared import (
    CreateDestinationConnector,
    DestinationConnectorType,
    <type>DestinationConnectorConfigInput
)

client = UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
)

destination_connector = CreateDestinationConnector(
    name="<name>",
    type=DestinationConnectorType.<type>,
    config=<type>DestinationConnectorConfigInput(
        # Specify the settings for the connector here.
    )
)

response = client.destinations.create_destination(
    request=CreateDestinationRequest(
        create_destination_connector=destination_connector
    )
)

info = response.destination_connector_information

print(f"name: {info.name}")
print(f"id: {info.id}")
    
for key, value in info.config:
    print(f"{key}: {value}")

Python SDK (async)

import os
import asyncio

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import CreateDestinationRequest
from unstructured_client.models.shared import (
    CreateDestinationConnector,
    DestinationConnectorType,
    <type>DestinationConnectorConfigInput
)

async def create_destination():
    client = UnstructuredClient(
        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
    )

    destination_connector = CreateDestinationConnector(
        name="<name>",
        type=DestinationConnectorType.<type>,
        config=<type>DestinationConnectorConfigInput(
            # Specify the settings for the connector here.
        )
    )

    response = await client.destinations.create_destination_async(
        request=CreateDestinationRequest(
            create_destination_connector=destination_connector
        )
    )

    info = response.destination_connector_information

    print(f"name: {info.name}")
    print(f"id: {info.id}")
        
    for key, value in info.config:
        print(f"{key}: {value}")

asyncio.run(create_destination())

curl

curl --request 'POST' --location \
"$UNSTRUCTURED_API_URL/destinations" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY" \
--header 'content-type: application/json' \
--data \
'{
    # Specify the settings for the connector here.
}'

Postman

In the method drop-down list, select POST.
In the address box, enter the following URL:
```
{{UNSTRUCTURED_API_URL}}/destinations
```
On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
- Key: content-type, Value: application/json
On the Body tab, select raw and JSON, and specify the settings for the connector.
Click Send.

Update a destination connector

To update information about a destination connector, use the UnstructuredClient object’s destinations.update_destination function (for the Python SDK) or the PUT method to call the /destinations/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the destination connector’s unique ID. To get this ID, see List destination connectors. In the UpdateDestinationConnector object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the connector. For the specific settings to include, which differ by connector, see Destinations. You must specify all of the settings for the connector, even for settings that are not changing. You can change any of the connector’s settings except for its name and type.
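Because an update must include every setting, one way to avoid dropping a value is to start from the settings the connector already has and overlay only the changes. The following is a minimal sketch of that merge using plain dictionaries. The helper name merge_connector_settings and the setting names shown are illustrative assumptions, not part of the Unstructured API; see Destinations for the actual settings for your connector type.

```python
def merge_connector_settings(current: dict, changes: dict) -> dict:
    """Return a full settings dict: every current setting, with changes applied on top."""
    # Copy first so that the caller's dictionaries are not modified.
    merged = dict(current)
    merged.update(changes)
    return merged


# Hypothetical settings, for illustration only.
current_settings = {"remote_url": "s3://my-bucket/output/", "anonymous": False}
full_settings = merge_connector_settings(
    current_settings,
    {"remote_url": "s3://my-bucket/processed/"},
)

# full_settings now holds all settings, changed and unchanged,
# and can be used to fill in the connector's config for the update.
print(full_settings)
```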

Python SDK

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import UpdateDestinationRequest
from unstructured_client.models.shared import (
    UpdateDestinationConnector,
    <type>DestinationConnectorConfigInput
)

client = UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
)

destination_connector = UpdateDestinationConnector(
    config=<type>DestinationConnectorConfigInput(
        # Specify the settings for the connector here.
    )
)

response = client.destinations.update_destination(
    request=UpdateDestinationRequest(
        destination_id="<connector-id>",
        update_destination_connector=destination_connector
    )
)

info = response.destination_connector_information

print(f"name: {info.name}")
print(f"id: {info.id}")
    
for key, value in info.config:
    print(f"{key}: {value}")

Python SDK (async)

import os
import asyncio

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import UpdateDestinationRequest
from unstructured_client.models.shared import (
    UpdateDestinationConnector,
    <type>DestinationConnectorConfigInput
)

async def update_destination():
    client = UnstructuredClient(
        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
    )

    destination_connector = UpdateDestinationConnector(
        config=<type>DestinationConnectorConfigInput(
            # Specify the settings for the connector here.
        )
    )

    response = await client.destinations.update_destination_async(
        request=UpdateDestinationRequest(
            destination_id="<connector-id>",
            update_destination_connector=destination_connector
        )
    )

    info = response.destination_connector_information

    print(f"name: {info.name}")
    print(f"id: {info.id}")
        
    for key, value in info.config:
        print(f"{key}: {value}")

asyncio.run(update_destination())

curl

curl --request 'PUT' --location \
"$UNSTRUCTURED_API_URL/destinations/<connector-id>" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY" \
--header 'content-type: application/json' \
--data \
'{
    # Specify the settings for the connector here.
}'

Postman

In the method drop-down list, select PUT.

In the address box, enter the following URL:

{{UNSTRUCTURED_API_URL}}/destinations/<connector-id>

On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
- Key: content-type, Value: application/json
On the Body tab, select raw and JSON, and specify the settings for the connector.
Click Send.

Delete a destination connector

To delete a destination connector, use the UnstructuredClient object’s destinations.delete_destination function (for the Python SDK) or the DELETE method to call the /destinations/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the destination connector’s unique ID. To get this ID, see List destination connectors.

Python SDK

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import DeleteDestinationRequest

client = UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
)

response = client.destinations.delete_destination(
    request=DeleteDestinationRequest(
        destination_id="<connector-id>"
    )
)

print(response.raw_response)

Python SDK (async)

import os
import asyncio

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import DeleteDestinationRequest

async def delete_destination():
    client = UnstructuredClient(
        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
    )

    response = await client.destinations.delete_destination_async(
        request=DeleteDestinationRequest(
            destination_id="<connector-id>"
        )
    )

    print(response.raw_response)

asyncio.run(delete_destination())

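curl

curl --request 'DELETE' --location \
"$UNSTRUCTURED_API_URL/destinations/<connector-id>" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"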

Postman

In the method drop-down list, select DELETE.

In the address box, enter the following URL:

{{UNSTRUCTURED_API_URL}}/destinations/<connector-id>

On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
Click Send.

Test a destination connector

To test a destination connector, use the POST method to call the /destinations/<connector-id>/connection-check endpoint (for curl or Postman), replacing <connector-id> with the connector’s unique ID. To get this ID, see List destination connectors. The Python SDK does not support testing destination connectors.

curl

curl --request 'POST' --location \
"$UNSTRUCTURED_API_URL/destinations/<connector-id>/connection-check" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"

Postman

In the method drop-down list, select POST.

In the address box, enter the following URL:

{{UNSTRUCTURED_API_URL}}/destinations/<connector-id>/connection-check

On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
Click Send.

To get information about the most recent connection check for a destination connector, use the GET method to call the /destinations/<connector-id>/connection-check endpoint (for curl or Postman), replacing <connector-id> with the connector’s unique ID. To get this ID, see List destination connectors. The Python SDK does not support getting information about the most recent connection check for a destination connector.

curl

curl --request 'GET' --location \
"$UNSTRUCTURED_API_URL/destinations/<connector-id>/connection-check" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"

Postman

In the method drop-down list, select GET.

In the address box, enter the following URL:

{{UNSTRUCTURED_API_URL}}/destinations/<connector-id>/connection-check

On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
Click Send.
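Because the Python SDK does not cover these two calls, a Python script can issue the same requests as the curl examples by using only the standard library. The following is a minimal sketch, assuming the same UNSTRUCTURED_API_URL and UNSTRUCTURED_API_KEY environment variables as the curl examples. The helper name build_connection_check_request and the fallback placeholder values are hypothetical, and nothing is assumed here about the shape of the response body.

```python
import os
import urllib.request


def build_connection_check_request(
    api_url: str, api_key: str, connector_id: str, method: str = "POST"
) -> urllib.request.Request:
    """Build a request for /destinations/<connector-id>/connection-check.

    Use method="POST" to start a check, or method="GET" to read the most recent one.
    """
    url = f"{api_url.rstrip('/')}/destinations/{connector_id}/connection-check"
    return urllib.request.Request(
        url,
        headers={
            "accept": "application/json",
            "unstructured-api-key": api_key,
        },
        method=method,
    )


request = build_connection_check_request(
    # The fallback values are placeholders for illustration only.
    api_url=os.getenv("UNSTRUCTURED_API_URL", "https://example.invalid/api/v1"),
    api_key=os.getenv("UNSTRUCTURED_API_KEY", "<api-key>"),
    connector_id="<connector-id>",
)
print(request.get_method(), request.full_url)

# To send the request, replace the placeholders and uncomment the following lines:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```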

Workflows

You can list, get, create, run, update, and delete workflows. For general information, see Workflows.

List workflows

To list workflows, use the UnstructuredClient object’s workflows.list_workflows function (for the Python SDK) or the GET method to call the /workflows endpoint (for curl or Postman). To filter the list of workflows, use one or more of the following ListWorkflowsRequest parameters (for the Python SDK) or query parameters (for curl or Postman):

source_id=<connector-id>, replacing <connector-id> with the source connector’s unique ID. To get this ID, see List source connectors.
destination_id=<connector-id>, replacing <connector-id> with the destination connector’s unique ID. To get this ID, see List destination connectors.
status=WorkflowState.<status> (for the Python SDK) or status=<status> (for curl or Postman), replacing <status> with one of the following workflow statuses: ACTIVE or INACTIVE (for the Python SDK) or active or inactive (for curl or Postman).

You can specify multiple query parameters, for example ?source_id=<connector-id>&status=<status>.
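To show how several filters combine into one query string, here is a minimal sketch that uses only the Python standard library. The helper name build_workflows_url and the example base URL and ID are hypothetical; the parameter names come from the list above.

```python
from urllib.parse import urlencode


def build_workflows_url(api_url, source_id=None, destination_id=None, status=None):
    """Return the /workflows URL, adding only the filters that were provided."""
    filters = {
        "source_id": source_id,
        "destination_id": destination_id,
        "status": status,
    }
    # Drop filters that were not set so that they do not appear in the URL.
    query = urlencode({key: value for key, value in filters.items() if value is not None})
    url = f"{api_url.rstrip('/')}/workflows"
    return f"{url}?{query}" if query else url


# Placeholder base URL and connector ID, for illustration only.
print(build_workflows_url("https://example.invalid/api/v1", source_id="1234", status="active"))
```

Using urlencode also percent-encodes any characters in the values that are not safe in a URL.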

Python SDK

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import ListWorkflowsRequest
from unstructured_client.models.shared import WorkflowState

client = UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
)

response = client.workflows.list_workflows(
    request=ListWorkflowsRequest(
        destination_id="<connector-id>", # Optional, list only for this destination connector ID.
        source_id="<connector-id>", # Optional, list only for this source connector ID.
        status=WorkflowState.<status> # Optional, list only for this workflow status.
    )
)

# Print the list in alphabetical order by workflow name.
sorted_workflows = sorted(
    response.response_list_workflows, 
    key=lambda workflow: workflow.name.lower()
)

for workflow in sorted_workflows:
    print(f"{workflow.name} ({workflow.id})")

Python SDK (async)

import os
import asyncio 

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import ListWorkflowsRequest
from unstructured_client.models.shared import WorkflowState

async def list_workflows():
    client = UnstructuredClient(
        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
    )

    response = await client.workflows.list_workflows_async(
        request=ListWorkflowsRequest(
            destination_id="<connector-id>", # Optional, list only for this destination connector ID.
            source_id="<connector-id>", # Optional, list only for this source connector ID.
            status=WorkflowState.<status> # Optional, list only for this workflow status.
        )
    )

    # Print the list in alphabetical order by workflow name.
    sorted_workflows = sorted(
        response.response_list_workflows, 
        key=lambda workflow: workflow.name.lower()
    )

    for workflow in sorted_workflows:
        print(f"{workflow.name} ({workflow.id})")

asyncio.run(list_workflows())

curl

curl --request 'GET' --location \
"$UNSTRUCTURED_API_URL/workflows" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"

To filter the list by source connector ID:

curl --request 'GET' --location \
"$UNSTRUCTURED_API_URL/workflows?source_id=<connector-id>" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"

To filter the list by destination connector ID:

curl --request 'GET' --location \
"$UNSTRUCTURED_API_URL/workflows?destination_id=<connector-id>" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"

To filter the list by workflow status:

curl --request 'GET' --location \
"$UNSTRUCTURED_API_URL/workflows?status=<status>" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"

Postman

In the method drop-down list, select GET.
In the address box, enter the following URL:
```
{{UNSTRUCTURED_API_URL}}/workflows
```
On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
To filter the list of workflows, on the Params tab, enter one or more of the following query parameters:
- By source connector ID: Key: source_id, Value: <connector-id>
- By destination connector ID: Key: destination_id, Value: <connector-id>
- By workflow status: Key: status, Value: <status>
Click Send.

Get a workflow

To get information about a workflow, use the UnstructuredClient object’s workflows.get_workflow function (for the Python SDK) or the GET method to call the /workflows/<workflow-id> endpoint (for curl or Postman), replacing <workflow-id> with the workflow’s unique ID. To get this ID, see List workflows.

Python SDK

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import GetWorkflowRequest

client = UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
)

response = client.workflows.get_workflow(
    request=GetWorkflowRequest(
        workflow_id="<workflow-id>"
    )
)

info = response.workflow_information

print(f"name:           {info.name}")
print(f"id:             {info.id}")
print(f"status:         {info.status}")
print(f"type:           {info.workflow_type}")
print("source(s):")

for source in info.sources:
    print(f"            {source}")

print("destination(s):")

for destination in info.destinations:
    print(f"            {destination}")

print("schedule(s):")

for crontab_entry in info.schedule.crontab_entries:
    print(f"            {crontab_entry.cron_expression}")

Python SDK (async)

import os
import asyncio

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import GetWorkflowRequest

async def get_workflow():
    client = UnstructuredClient(
        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
    )

    response = await client.workflows.get_workflow_async(
        request=GetWorkflowRequest(
            workflow_id="<workflow-id>"
        )
    )

    info = response.workflow_information

    print(f"name: {info.name}")
    print(f"id: {info.id}")
    print(f"status: {info.status}")
    print(f"type: {info.workflow_type}")
    print("source(s):")

    for source in info.sources:
        print(f"    {source}")

    print("destination(s):")

    for destination in info.destinations:
        print(f"    {destination}")

    print("schedule(s):")

    for crontab_entry in info.schedule.crontab_entries:
        print(f"    {crontab_entry.cron_expression}")

asyncio.run(get_workflow())

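curl

curl --request 'GET' --location \
"$UNSTRUCTURED_API_URL/workflows/<workflow-id>" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"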

Postman

In the method drop-down list, select GET.
In the address box, enter the following URL:
```
{{UNSTRUCTURED_API_URL}}/workflows/<workflow-id>
```
On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
Click Send.

Create a workflow

To create a workflow, use the UnstructuredClient object’s workflows.create_workflow function (for the Python SDK) or the POST method to call the /workflows endpoint (for curl or Postman). In the CreateWorkflow object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the workflow. For the specific settings to include, see Create a workflow.

Python SDK

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import CreateWorkflowRequest
from unstructured_client.models.shared import (
    WorkflowNode,
    CreateWorkflow,
    WorkflowType,
    Schedule
)

client = UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
)

workflow = CreateWorkflow(
    # Specify the settings for the workflow here.
)

response = client.workflows.create_workflow(
    request=CreateWorkflowRequest(
        create_workflow=workflow
    )
)

info = response.workflow_information

print(f"name:           {info.name}")
print(f"id:             {info.id}")
print(f"status:         {info.status}")
print(f"type:           {info.workflow_type}")
print("source(s):")

for source in info.sources:
    print(f"            {source}")

print("destination(s):")

for destination in info.destinations:
    print(f"            {destination}")

print("schedule(s):")

for crontab_entry in info.schedule.crontab_entries:
    print(f"            {crontab_entry.cron_expression}")

Python SDK (async)

import os
import asyncio

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import CreateWorkflowRequest
from unstructured_client.models.shared import (
    WorkflowNode,
    CreateWorkflow,
    WorkflowType,
    Schedule
)

async def create_workflow():
    client = UnstructuredClient(
        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
    )

    workflow = CreateWorkflow(
        # Specify the settings for the workflow here.
    )

    response = await client.workflows.create_workflow_async(
        request=CreateWorkflowRequest(
            create_workflow=workflow
        )
    )

    info = response.workflow_information

    print(f"name:           {info.name}")
    print(f"id:             {info.id}")
    print(f"status:         {info.status}")
    print(f"type:           {info.workflow_type}")
    print("source(s):")

    for source in info.sources:
        print(f"            {source}")

    print("destination(s):")

    for destination in info.destinations:
        print(f"            {destination}")

    print("schedule(s):")

    for crontab_entry in info.schedule.crontab_entries:
        print(f"            {crontab_entry.cron_expression}")

asyncio.run(create_workflow())

curl

curl --request 'POST' --location \
"$UNSTRUCTURED_API_URL/workflows" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY" \
--header 'content-type: application/json' \
--data \
'{
    # Specify the settings for the workflow here.
}'

Postman

In the method drop-down list, select POST.
In the address box, enter the following URL:
```
{{UNSTRUCTURED_API_URL}}/workflows
```
On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
- Key: content-type, Value: application/json
On the Body tab, select raw and JSON, and specify the settings for the workflow.
Click Send.

Run a workflow

To run a workflow manually, use the UnstructuredClient object’s workflows.run_workflow function (for the Python SDK) or the POST method to call the /workflows/<workflow-id>/run endpoint (for curl or Postman), replacing <workflow-id> with the workflow’s unique ID. To get this ID, see List workflows.

Python SDK (remote source and remote destination)

If the target workflow was originally created programmatically by the Unstructured Python SDK or with a REST API client such as curl or Postman, and the workflow uses a local source connector, you can run the workflow only with a REST API client such as curl or Postman, as described later in this section. You cannot run the workflow with the Python SDK or the Unstructured user interface (UI), even though the workflow is visible in the UI.

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import RunWorkflowRequest

client = UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
)

response = client.workflows.run_workflow(
    request=RunWorkflowRequest(
        workflow_id="<workflow-id>"
    )
)

print(response.raw_response)

Python SDK (async) (remote source and remote destination)

import os
import asyncio

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import RunWorkflowRequest

async def run_workflow():
    client = UnstructuredClient(
        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
    )

    response = await client.workflows.run_workflow_async(
        request=RunWorkflowRequest(
            workflow_id="<workflow-id>"
        )
    )

    print(response.raw_response)

asyncio.run(run_workflow())

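curl (remote source and remote destination)

curl --request 'POST' --location \
"$UNSTRUCTURED_API_URL/workflows/<workflow-id>/run" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"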

curl (local source and local or remote destination)

In the following command, replace:

</full/path/to/local/filename.extension> with the full path to the local file to upload.
<filename.extension> with the filename of the local file to upload.
<local-file-media-type> with the local file’s media type. For a list of available media types, such as application/pdf, see Media Types.

To upload multiple files, add additional --form entries, one per file.

curl --request 'POST' --location \
"$UNSTRUCTURED_API_URL/workflows/<workflow-id>/run" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY" \
--form "input_files=@</full/path/to/local/filename.extension>;filename=<filename.extension>;type=<local-file-media-type>" \
--form "input_files=@</full/path/to/local/filename.extension>;filename=<filename.extension>;type=<local-file-media-type>" # For each additional file to be uploaded.

For a local destination, to access the processed files’ data, download a processed local file from the workflow’s job run.
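When uploading many local files, writing each --form entry by hand is error-prone. The following is a minimal sketch that derives the filename and media type for each file and prints the matching --form entries. It relies on Python's standard mimetypes module to guess the media type from the file extension, which may not match the Media Types list in every case; the helper name build_form_argument and the example paths are hypothetical.

```python
import mimetypes
from pathlib import Path


def build_form_argument(local_path: str) -> str:
    """Return the value for one curl --form entry for the given local file."""
    filename = Path(local_path).name
    media_type, _ = mimetypes.guess_type(filename)
    if media_type is None:
        # Do not guess further; look up the correct value in Media Types instead.
        raise ValueError(f"Cannot determine the media type for {filename!r}")
    return f"input_files=@{local_path};filename={filename};type={media_type}"


# Placeholder paths, for illustration only.
for local_file in ["/full/path/to/report.pdf", "/full/path/to/notes.txt"]:
    print(f'--form "{build_form_argument(local_file)}"')
```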

Postman (remote source and remote destination)

In the method drop-down list, select POST.

In the address box, enter the following URL:

{{UNSTRUCTURED_API_URL}}/workflows/<workflow-id>/run

On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
Click Send.

Postman (local source and local or remote destination)

In the method drop-down list, select POST.

In the address box, enter the following URL:

{{UNSTRUCTURED_API_URL}}/workflows/<workflow-id>/run

On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
On the Body tab, select form-data, and specify the settings for the workflow run:
- Key: input_files, File, Value: Click the Value box, then click New file from local machine, and select the file to upload. To upload multiple files, add additional input_files entries after this one, one entry per additional file to upload.
- Key: filename, Text, Value: Type the name of the file that you just uploaded. To upload multiple files, add additional filename entries after this one, one entry per additional file to upload. Make sure the order of these filename entries matches the order of the input_files entries, respectively.
- Key: type, Text, Value: <local-file-media-type> To upload multiple files, add additional type entries after this one, one entry per additional file to upload. Make sure the order of these type entries matches the order of the input_files entries, respectively.
For a list of available media types, such as application/pdf, see Media Types.
Click Send.

For a local destination, to access the processed files’ data, download a processed local file from the workflow’s job run.

To run a workflow on a schedule instead, specify the schedule setting in the request body when you create or update a workflow. See Create a workflow or Update a workflow.

Update a workflow

To update information about a workflow, use the UnstructuredClient object’s workflows.update_workflow function (for the Python SDK) or the PUT method to call the /workflows/<workflow-id> endpoint (for curl or Postman), replacing <workflow-id> with the workflow’s unique ID. To get this ID, see List workflows. In the UpdateWorkflow object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the workflow. For the specific settings to include, see Update a workflow.

Python SDK

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import UpdateWorkflowRequest
from unstructured_client.models.shared import (
    WorkflowNode,
    UpdateWorkflow,
    WorkflowType,
    Schedule
)

client = UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
)

workflow = UpdateWorkflow(
    # Specify the settings for the workflow here.
)

response = client.workflows.update_workflow(
    request=UpdateWorkflowRequest(
        workflow_id="<workflow-id>",
        update_workflow=workflow
    )
)

info = response.workflow_information

print(f"name:           {info.name}")
print(f"id:             {info.id}")
print(f"status:         {info.status}")
print(f"type:           {info.workflow_type}")
print("source(s):")

for source in info.sources:
    print(f"            {source}")

print("destination(s):")

for destination in info.destinations:
    print(f"            {destination}")

print("schedule(s):")

for crontab_entry in info.schedule.crontab_entries:
    print(f"            {crontab_entry.cron_expression}")

Python SDK (async)

import os
import asyncio

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import UpdateWorkflowRequest
from unstructured_client.models.shared import (
    WorkflowNode,
    UpdateWorkflow,
    WorkflowType,
    Schedule
)

async def update_workflow():
    client = UnstructuredClient(
        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
    )

    workflow = UpdateWorkflow(
        # Specify the settings for the workflow here.
    )

    response = await client.workflows.update_workflow_async(
        request=UpdateWorkflowRequest(
            workflow_id="<workflow-id>",
            update_workflow=workflow
        )
    )

    info = response.workflow_information

    print(f"name:           {info.name}")
    print(f"id:             {info.id}")
    print(f"status:         {info.status}")
    print(f"type:           {info.workflow_type}")
    print("source(s):")

    for source in info.sources:
        print(f"            {source}")

    print("destination(s):")

    for destination in info.destinations:
        print(f"            {destination}")

    print("schedule(s):")

    for crontab_entry in info.schedule.crontab_entries:
        print(f"            {crontab_entry.cron_expression}")

asyncio.run(update_workflow())

curl

curl --request 'PUT' --location \
"$UNSTRUCTURED_API_URL/workflows/<workflow-id>" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY" \
--header 'content-type: application/json' \
--data \
'{
    # Specify the settings for the workflow here.
}'

Postman

In the method drop-down list, select PUT.
In the address box, enter the following URL:
```
{{UNSTRUCTURED_API_URL}}/workflows/<workflow-id>
```
On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
- Key: content-type, Value: application/json
On the Body tab, select raw and JSON, and specify the settings for the workflow.
Click Send.

Delete a workflow

To delete a workflow, use the UnstructuredClient object’s workflows.delete_workflow function (for the Python SDK) or the DELETE method to call the /workflows/<workflow-id> endpoint (for curl or Postman), replacing <workflow-id> with the workflow’s unique ID. To get this ID, see List workflows.

Python SDK

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import DeleteWorkflowRequest

client = UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
)

response = client.workflows.delete_workflow(
    request=DeleteWorkflowRequest(
        workflow_id="<workflow-id>"
    )
)

print(response.raw_response)

Python SDK (async)

import os
import asyncio

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import DeleteWorkflowRequest

async def delete_workflow():
    client = UnstructuredClient(
        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
    )

    response = await client.workflows.delete_workflow_async(
        request=DeleteWorkflowRequest(
            workflow_id="<workflow-id>"
        )
    )

    print(response.raw_response)

asyncio.run(delete_workflow())

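curl

curl --request 'DELETE' --location \
"$UNSTRUCTURED_API_URL/workflows/<workflow-id>" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"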

Postman

In the method drop-down list, select DELETE.
In the address box, enter the following URL:
```
{{UNSTRUCTURED_API_URL}}/workflows/<workflow-id>
```
On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
Click Send.

Jobs

You can list, get, and cancel jobs. A job is created automatically whenever a workflow runs on a schedule; see Create a workflow. A job is also created whenever you run a workflow; see Run a workflow. For general information, see Jobs.

List jobs

To list jobs, use the UnstructuredClient object’s jobs.list_jobs function (for the Python SDK) or the GET method to call the /jobs endpoint (for curl or Postman). To filter the list of jobs, use one or both of the following ListJobsRequest parameters (for the Python SDK) or query parameters (for curl or Postman):

workflow_id=<workflow-id>, replacing <workflow-id> with the workflow’s unique ID. To get this ID, see List workflows.
status=<status>, replacing <status> with one of the following job statuses: completed, failed, in progress, scheduled, or stopped.

For curl or Postman, you can specify multiple query parameters as ?workflow_id=<workflow-id>&status=<status>.
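
As a minimal sketch of how the query string described above can be assembled, the following helper builds the /jobs URL from whichever filters are supplied. The function name build_jobs_url and the example base URL are hypothetical and are not part of the Unstructured API or SDK; only the workflow_id and status parameter names come from this page.

```python
from typing import Optional
from urllib.parse import urlencode


def build_jobs_url(
    base_url: str,
    workflow_id: Optional[str] = None,
    status: Optional[str] = None,
) -> str:
    """Return the /jobs URL, appending only the filters that were provided."""
    # Collect the filters that are set, keeping workflow_id before status.
    params = {}
    if workflow_id is not None:
        params["workflow_id"] = workflow_id
    if status is not None:
        params["status"] = status

    url = f"{base_url.rstrip('/')}/jobs"
    # urlencode percent-encodes the values, so they are safe to embed in a URL.
    return f"{url}?{urlencode(params)}" if params else url


# Hypothetical base URL and workflow ID, for illustration only.
print(build_jobs_url("https://example.invalid/api/v1", workflow_id="abc-123", status="completed"))
```

Omitting both filters returns the plain /jobs URL, which corresponds to listing all jobs.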

Python SDK

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import ListJobsRequest

client = UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
)

response = client.jobs.list_jobs(
    request=ListJobsRequest(
        workflow_id="<workflow-id>", # Optional, list only for this workflow ID.
        status="<status>", # Optional, list only for this job status.
    )
)

# Print the list in alphabetical order by workflow name.
sorted_jobs = sorted(
    response.response_list_jobs, 
    key=lambda job: job.workflow_name.lower()
)

for job in sorted_jobs:
    print(f"{job.id} (workflow name: {job.workflow_name}, id: {job.workflow_id})")

Python SDK (async)

import os
import asyncio

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import ListJobsRequest

async def list_jobs():
    client = UnstructuredClient(
        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
    )

    response = await client.jobs.list_jobs_async(
        request=ListJobsRequest(
            workflow_id="<workflow-id>", # Optional, list only for this workflow ID.
            status="<status>", # Optional, list only for this job status.
        )
    )

    # Print the list in alphabetical order by workflow name.
    sorted_jobs = sorted(
        response.response_list_jobs, 
        key=lambda job: job.workflow_name.lower()
    )

    for job in sorted_jobs:
        print(f"{job.id} (workflow name: {job.workflow_name}, id: {job.workflow_id})")

asyncio.run(list_jobs())

curl

curl --request 'GET' --location \
"$UNSTRUCTURED_API_URL/jobs" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"

To filter the list by workflow ID:

curl --request 'GET' --location \
"$UNSTRUCTURED_API_URL/jobs?workflow_id=<workflow-id>" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"

To filter the list by job status:

curl --request 'GET' --location \
"$UNSTRUCTURED_API_URL/jobs?status=<status>" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"

Postman

In the method drop-down list, select GET.
In the address box, enter the following URL:
```
{{UNSTRUCTURED_API_URL}}/jobs
```
On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
To filter the list of jobs, on the Params tab, enter one or both of the following query parameters:
- By workflow ID: Key: workflow_id, Value: <workflow-id>
- By job status: Key: status, Value: <status>
Click Send.

Get a job

To get basic information about a job, use the UnstructuredClient object’s jobs.get_job function (for the Python SDK) or the GET method to call the /jobs/<job-id> endpoint (for curl or Postman), replacing <job-id> with the job’s unique ID. To get this ID, see List jobs. This function/endpoint returns basic information about the job, such as:

The job’s unique ID.
The unique ID and name of the workflow that created the job.
The job’s current status.
When the job was created.

To get details about a job’s current processing status, see Get processing details for a job.
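
Because a job's status changes over time, a common pattern is to poll the job until it finishes. The following is a minimal sketch of such a loop, not an SDK feature. The function wait_for_job, its parameters, and the choice of completed, failed, and stopped as terminal statuses are assumptions based on the status list on this page. The status lookup is passed in as a callable so that the loop itself does not depend on any particular client.

```python
import time
from typing import Callable

# Assumption: these statuses (from the list on this page) mean the job has finished.
TERMINAL_STATUSES = {"completed", "failed", "stopped"}


def wait_for_job(
    fetch_status: Callable[[], object],
    poll_seconds: float = 10.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status until it reports a terminal status or max_polls is reached."""
    for attempt in range(max_polls):
        raw = fetch_status()
        # Accept plain strings or enum-like objects that expose a .value attribute,
        # and normalize values such as "IN_PROGRESS" to "in progress".
        status = str(getattr(raw, "value", raw)).lower().replace("_", " ")
        if status in TERMINAL_STATUSES:
            return status
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError(f"Job did not finish after {max_polls} polls")


# Demonstration with canned statuses and a no-op sleep instead of a real API call.
canned = iter(["SCHEDULED", "IN_PROGRESS", "COMPLETED"])
print(wait_for_job(lambda: next(canned), sleep=lambda seconds: None))
```

With the Python SDK, fetch_status could wrap the jobs.get_job call shown in this section and return the status field of its job_information.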

Python SDK

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import GetJobRequest

client = UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
)

response = client.jobs.get_job(
    request=GetJobRequest(
        job_id="<job-id>"
    )
)

info = response.job_information

print(f"id:            {info.id}")
print(f"status:        {info.status}")
print(f"workflow name: {info.workflow_name}")
print(f"workflow id:   {info.workflow_id}")

Python SDK (async)

import os
import asyncio

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import GetJobRequest

async def get_job():
    client = UnstructuredClient(
        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
    )

    response = await client.jobs.get_job_async(
        request=GetJobRequest(
            job_id="<job-id>"
        )
    )

    info = response.job_information

    print(f"id: {info.id}")
    print(f"status: {info.status}")
    print(f"workflow name: {info.workflow_name}")
    print(f"workflow id: {info.workflow_id}")

asyncio.run(get_job())

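curl

curl --request 'GET' --location \
"$UNSTRUCTURED_API_URL/jobs/<job-id>" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"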

Postman

In the method drop-down list, select GET.
In the address box, enter the following URL:
```
{{UNSTRUCTURED_API_URL}}/jobs/<job-id>
```
On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
Click Send.

Get processing details for a job

To get current processing information about a job, use the UnstructuredClient object’s jobs.get_job_details function (for the Python SDK) or the GET method to call the /jobs/<job-id>/details endpoint (for curl or Postman), replacing <job-id> with the job’s unique ID. To get this ID, see List jobs. To get basic information about a job, see Get a job.
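
The per-node counters in the processing details can be condensed into a single progress figure per node. The following is a minimal sketch under stated assumptions: NodeStat is a hypothetical stand-in class whose field names mirror those printed by the Python SDK examples in this section, and the counters are assumed to be integer counts. The helper names finished_fraction and summarize are illustrative only.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class NodeStat:
    """Hypothetical stand-in for one node_stats entry (counters assumed to be integers)."""
    node_name: str
    ready: int
    in_progress: int
    success: int
    failure: int


def finished_fraction(stat: NodeStat) -> float:
    """Return the share of items that this node has finished, whether they succeeded or failed."""
    total = stat.ready + stat.in_progress + stat.success + stat.failure
    if total == 0:
        return 0.0  # Nothing has reached this node yet.
    return (stat.success + stat.failure) / total


def summarize(node_stats: Iterable[NodeStat]) -> Dict[str, float]:
    """Map each node name to its finished fraction."""
    return {stat.node_name: finished_fraction(stat) for stat in node_stats}


# Made-up counters, for illustration only.
example = [NodeStat("Partitioner", 1, 1, 2, 0), NodeStat("Chunker", 0, 0, 0, 0)]
for name, fraction in summarize(example).items():
    print(f"{name}: {fraction:.0%} finished")
```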

Python SDK

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import GetJobDetailsRequest

client = UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
)

response = client.jobs.get_job_details(
    request=GetJobDetailsRequest(
        job_id="<job-id>"
    )
)

info = response.job_details

print(f"job id:            {info.id}")
print(f"processing status: {info.processing_status}")
print(f"message:           {info.message}")
print(f"node stats:")

for node_stat in info.node_stats:
    print(f"---")
    print(f"name:        {node_stat.node_name}")
    print(f"type:        {node_stat.node_type}")
    print(f"subtype:     {node_stat.node_subtype}")
    print(f"ready:       {node_stat.ready}")
    print(f"in progress: {node_stat.in_progress}")
    print(f"success:     {node_stat.success}")
    print(f"failure:     {node_stat.failure}")

Python SDK (async)

import os
import asyncio

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import GetJobDetailsRequest

async def get_job_details():
    client = UnstructuredClient(
        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
    )

    response = await client.jobs.get_job_details_async(
        request=GetJobDetailsRequest(
            job_id="<job-id>"
        )
    )

    info = response.job_details

    print(f"job id:            {info.id}")
    print(f"processing status: {info.processing_status}")
    print(f"message:           {info.message}")
    print(f"node stats:")

    for node_stat in info.node_stats:
        print(f"---")
        print(f"name:        {node_stat.node_name}")
        print(f"type:        {node_stat.node_type}")
        print(f"subtype:     {node_stat.node_subtype}")
        print(f"ready:       {node_stat.ready}")
        print(f"in progress: {node_stat.in_progress}")
        print(f"success:     {node_stat.success}")
        print(f"failure:     {node_stat.failure}")

asyncio.run(get_job_details())

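curl

curl --request 'GET' --location \
"$UNSTRUCTURED_API_URL/jobs/<job-id>/details" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"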

Postman

In the method drop-down list, select GET.
In the address box, enter the following URL:
```
{{UNSTRUCTURED_API_URL}}/jobs/<job-id>/details
```
On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
Click Send.

Get failed file details for a job

To get the list of any failed files for a job and why those files failed, use the UnstructuredClient object’s jobs.get_job_failed_files function (for the Python SDK) or the GET method to call the /jobs/<job-id>/failed-files endpoint (for curl or Postman), replacing <job-id> with the job’s unique ID. To get this ID, see List jobs.
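
When many files fail for the same reason, grouping them by error message makes the list easier to triage. The following is a minimal sketch: FailedFile is a hypothetical stand-in class with the document and error fields used by the Python SDK examples in this section, and group_by_error is an illustrative helper, not an SDK function.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class FailedFile:
    """Hypothetical stand-in for one failed_files entry."""
    document: str
    error: str


def group_by_error(failed_files: Iterable[FailedFile]) -> Dict[str, List[str]]:
    """Map each distinct error message to the documents that failed with it."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for failed_file in failed_files:
        grouped[failed_file.error].append(failed_file.document)
    return dict(grouped)


# Made-up entries, for illustration only.
example = [
    FailedFile("a.pdf", "unsupported file type"),
    FailedFile("b.pdf", "timeout"),
    FailedFile("c.pdf", "unsupported file type"),
]
for error, documents in group_by_error(example).items():
    print(f"{error}: {', '.join(documents)}")
```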

Python SDK

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import GetJobFailedFilesRequest

client = UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
)

response = client.jobs.get_job_failed_files(
    request=GetJobFailedFilesRequest(
        job_id="<job-id>"
    )
)

info = response.job_failed_files

if len(info.failed_files) > 0:
    print(f"{len(info.failed_files)} failed file(s):")

    for failed_file in info.failed_files:
        print(f"---")
        print(f"document: {failed_file.document}")
        print(f"error:    {failed_file.error}")
else:
    print(f"No failed files.")

Python SDK (async)

import os
import asyncio

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import GetJobFailedFilesRequest

async def get_job_failed_files():
    client = UnstructuredClient(
        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
    )

    response = await client.jobs.get_job_failed_files_async(
        request=GetJobFailedFilesRequest(
            job_id="<job-id>"
        )
    )

    info = response.job_failed_files

    if len(info.failed_files) > 0:
        print(f"{len(info.failed_files)} failed file(s):")

        for failed_file in info.failed_files:
            print(f"---")
            print(f"document: {failed_file.document}")
            print(f"error:    {failed_file.error}")
    else:
        print(f"No failed files.")

asyncio.run(get_job_failed_files())

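curl

curl --request 'GET' --location \
"$UNSTRUCTURED_API_URL/jobs/<job-id>/failed-files" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"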

Postman

In the method drop-down list, select GET.
In the address box, enter the following URL:
```
{{UNSTRUCTURED_API_URL}}/jobs/<job-id>/failed-files
```
On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
Click Send.

Cancel a job

To cancel a running job, use the UnstructuredClient object’s jobs.cancel_job function (for the Python SDK) or the POST method to call the /jobs/<job-id>/cancel endpoint (for curl or Postman), replacing <job-id> with the job’s unique ID. To get this ID, see List jobs.

Python SDK

import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import CancelJobRequest

client = UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
)

response = client.jobs.cancel_job(
    request=CancelJobRequest(
        job_id="<job-id>"
    )
)

print(response.raw_response)

Python SDK (async)

import os
import asyncio

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import CancelJobRequest

async def cancel_job():
    client = UnstructuredClient(
        api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")
    )

    response = await client.jobs.cancel_job_async(
        request=CancelJobRequest(
            job_id="<job-id>"
        )
    )

    print(response.raw_response)

asyncio.run(cancel_job())

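curl

curl --request 'POST' --location \
"$UNSTRUCTURED_API_URL/jobs/<job-id>/cancel" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"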

Postman

In the method drop-down list, select POST.
In the address box, enter the following URL:
```
{{UNSTRUCTURED_API_URL}}/jobs/<job-id>/cancel
```
On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
Click Send.

Download a processed local file from a job

This applies only to jobs that use a workflow with a local source and a local destination. To download a processed local file from a completed job, use GET to call the /jobs/<job-id>/download endpoint, replacing <job-id> with the job’s unique ID. To get this ID, see List jobs. You must also provide Unstructured’s IDs for the file to download and the workflow’s output node. To get these IDs, see Get a job. In the response:

Unstructured’s IDs for the file to download and the workflow’s output node are in the output_node_files array.
The ID for the file to download is in the output_node_files array’s file_id field.
The ID for the workflow’s output node is in the output_node_files array’s node_id field.

Currently, you cannot use the Unstructured user interface (UI) or the Unstructured Python SDK to download a file from a job that uses a workflow with a local source and a local destination.
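
The steps above can be sketched as a small helper that turns a parsed Get a job response into download URLs. This is a minimal sketch under stated assumptions: it assumes the response JSON has a top-level id field alongside the output_node_files array, and the function name build_download_urls, the base URL, and the sample values are hypothetical. It only builds the URLs; sending the requests is left to your REST client.

```python
from typing import Any, Dict, List
from urllib.parse import urlencode


def build_download_urls(base_url: str, job: Dict[str, Any]) -> List[str]:
    """Return one /download URL per entry in the job's output_node_files array."""
    job_id = job["id"]  # Assumption: the job's unique ID is in a top-level "id" field.
    urls = []
    for entry in job.get("output_node_files", []):
        # file_id and node_id are the two query parameters that the endpoint requires.
        query = urlencode({"file_id": entry["file_id"], "node_id": entry["node_id"]})
        urls.append(f"{base_url.rstrip('/')}/jobs/{job_id}/download?{query}")
    return urls


# Made-up response fragment, for illustration only.
sample_job = {
    "id": "job-1",
    "output_node_files": [{"file_id": "file-a", "node_id": "node-x"}],
}
for url in build_download_urls("https://example.invalid/api/v1", sample_job):
    print(url)
```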

curl

curl --request 'GET' --location \
"$UNSTRUCTURED_API_URL/jobs/<job-id>/download?file_id=<file-id>&node_id=<node-id>" \
--header 'accept: application/json' \
--header "unstructured-api-key: $UNSTRUCTURED_API_KEY"

Postman

In the method drop-down list, select GET.
In the address box, enter the following URL:
```
{{UNSTRUCTURED_API_URL}}/jobs/<job-id>/download
```
On the Headers tab, enter the following headers:
- Key: unstructured-api-key, Value: {{UNSTRUCTURED_API_KEY}}
- Key: accept, Value: application/json
On the Params tab, enter the following query parameters:
- Key: file_id, Value: <file-id>
- Key: node_id, Value: <node-id>
Click Send.
