Overview
The Unstructured Platform API is separate from Unstructured Serverless API services.
For information about Unstructured Serverless API services, see the Unstructured Serverless API services overview.
The Unstructured Platform features a no-code user interface for transforming your unstructured data into data that is ready for Retrieval Augmented Generation (RAG).
The Unstructured Platform also provides a back-end API for automation usage scenarios as well as for documentation, reporting, and recovery needs. This page provides an overview of the Unstructured Platform API.
Requirements
To use the Unstructured Platform API, you must have:
- An Unstructured account, created through the For Developers page, with one of the following plans:
- An Unstructured API key, created through your Unstructured account console.
If you signed up through the For Enterprise page, your API key creation process might be different. For API key creation guidance, email Unstructured Sales at sales@unstructured.io.
Free Unstructured API keys are not supported.
To create an API key, do the following:
- Sign in to the Unstructured Platform. Learn how.
- At the bottom of the sidebar, click your user icon, and then click Account Settings.
- On the API Keys tab, click Generate New Key.
- Enter a descriptive name for the API key, and then click Save.
- Click the Copy icon for your new API key. The API key’s value is copied to your system’s clipboard.
- The Unstructured Platform API URL. This is typically https://platform.unstructuredapp.io for the Unstructured Python SDK only, and https://platform.unstructuredapp.io/api/v1 for standard REST-enabled utilities (such as curl), tools (such as Postman), programming languages, packages, and libraries.
Important: Do not use https://platform.unstructuredapp.io/api/v1 with the Unstructured Python SDK, or else calls made by the Python SDK will fail. Use https://platform.unstructuredapp.io instead.
Do not use the Unstructured Serverless API URL, which is separate from the Unstructured Platform API URL.
If you signed up through the For Enterprise page, your API URL might be different. For API URL guidance, email Unstructured Sales at sales@unstructured.io. If your API URL is different, be sure to replace https://platform.unstructuredapp.io/api/v1 with your API URL throughout the following examples.
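The two base URLs are easy to mix up, so a small helper can keep them apart. The following is a minimal sketch that uses only the two URLs given above; the function names `base_url_for` and `endpoint_url` are hypothetical and are not part of any Unstructured package.

```python
# The two base URLs stated in the requirements above.
SDK_BASE_URL = "https://platform.unstructuredapp.io"
REST_BASE_URL = "https://platform.unstructuredapp.io/api/v1"


def base_url_for(client_kind: str) -> str:
    """Return the base URL for a client kind: 'sdk' or 'rest'."""
    if client_kind == "sdk":
        return SDK_BASE_URL
    if client_kind == "rest":
        return REST_BASE_URL
    raise ValueError(f"Unknown client kind: {client_kind!r}")


def endpoint_url(path: str, base_url: str = REST_BASE_URL) -> str:
    """Join a base URL and an endpoint path without doubling slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


print(endpoint_url("/sources"))
```

If your API URL is different, pass it as `base_url` instead of relying on the default.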
The Unstructured Platform API is offered as follows:
- As part of the Unstructured Python SDK beginning with version 0.30.0, which you can call through standard Python code.
To install the Unstructured Python SDK, run the following command from within your Python virtual environment:
If you already have the Unstructured Python SDK installed, upgrade to at least version 0.30.0 by running the following command instead:
- As a set of Representational State Transfer (REST) endpoints, which you can call through standard REST-enabled utilities, tools, programming languages, packages, and libraries. The following sections describe how to call the Unstructured Platform API with curl and Postman. You can adapt this information as needed for your preferred programming languages and libraries, for example by using the requests library with Python.
You can also use the Unstructured Platform API - Swagger UI to call the REST endpoints that are available through https://platform.unstructuredapp.io.
The Unstructured Platform API is separate from Unstructured Serverless API services and Unstructured Ingest.
Because of this separation, the following Unstructured SDKs, tools, and libraries do not work with the Unstructured Platform API:
- The Unstructured JavaScript/TypeScript SDK
- Local single-file POST requests to Unstructured Serverless API services
- The Unstructured open source Python library
- The Unstructured Ingest CLI
- The Unstructured Ingest Python library
Free Unstructured API accounts are also not supported.
The following Unstructured API URLs are also not supported:
- https://api.unstructuredapp.io/general/v0/general (the Unstructured Serverless API URL)
- https://api.unstructured.io/general/v0/general (the Free Unstructured API URL)
Basics
The Unstructured Platform API enables you to work with connectors, workflows, and jobs in the Unstructured Platform.
- A source connector ingests files or data into Unstructured from a source location.
- A destination connector sends the processed data from Unstructured to a destination location.
- A workflow defines how Unstructured will process the data.
- A job runs a workflow at a specific point in time.
For general information about these objects, see:
The following sections provide examples that show how to use the Unstructured Python SDK for all of the supported API operations, as well as curl and Postman for all of the supported REST endpoints.
You can also use the Unstructured Platform API - Swagger UI to call the REST endpoints that are available through https://platform.unstructuredapp.io.
The following Unstructured Python SDK examples use the following environment variables, which you can set as follows:
Important: Do not use https://platform.unstructuredapp.io/api/v1 with the Python SDK, or else calls made by the Python SDK will fail. Use https://platform.unstructuredapp.io instead.
The following curl and Postman examples use the following environment variables, which you can set as follows:
Important: For standard REST-enabled clients (such as curl), do not use https://platform.unstructuredapp.io (which is unique to the Unstructured Python SDK), or else calls made by these REST-enabled clients will fail. Use https://platform.unstructuredapp.io/api/v1 instead.
These environment variables enable you to more easily run the following Unstructured Python SDK and curl examples and help prevent you from storing scripts that contain sensitive URLs and API keys in public source code repositories.
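A script can read these values at startup instead of hard-coding them. The following is a minimal sketch that assumes the environment variables are named `UNSTRUCTURED_API_URL` and `UNSTRUCTURED_API_KEY`; the helper name `load_settings` is hypothetical.

```python
import os
from typing import Dict, Mapping

# Assumed variable names; adjust them if your environment uses different ones.
URL_VARIABLE = "UNSTRUCTURED_API_URL"
KEY_VARIABLE = "UNSTRUCTURED_API_KEY"


def load_settings(environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read the API URL and API key, failing early if either is missing."""
    missing = [name for name in (URL_VARIABLE, KEY_VARIABLE) if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        # Drop any trailing slash so endpoint paths can be appended cleanly.
        "api_url": environ[URL_VARIABLE].rstrip("/"),
        "api_key": environ[KEY_VARIABLE],
    }
```

Calling `load_settings()` with no arguments reads from the current process environment. Failing early with a clear message tends to be easier to debug than a rejected request later on.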
The following Postman examples use variables, which you can set as follows:
- In Postman, on your workspace’s sidebar, click Environments.
- Click Globals.
- Create two global variables with the following settings:
  - Variable: UNSTRUCTURED_API_URL
    - Type: default
    - Initial value: https://platform.unstructuredapp.io/api/v1
    - Current value: https://platform.unstructuredapp.io/api/v1
    Important: Do not use https://platform.unstructuredapp.io (which is unique to the Unstructured Python SDK), or else calls made by Postman will fail.
  - Variable: UNSTRUCTURED_API_KEY
    - Type: secret
    - Initial value: <your-unstructured-api-key>
    - Current value: <your-unstructured-api-key>
- Click Save.
These variables enable you to more easily run the following examples in Postman and help prevent you from storing Postman collections that contain sensitive URLs and API keys in public source code repositories.
Connectors
You can list, get, create, update, and delete source connectors. You can also list, get, create, update, and delete destination connectors.
For general information, see Connectors.
List source connectors
To list source connectors, use the UnstructuredClient object’s sources.list_sources function (for the Python SDK) or the GET method to call the /sources endpoint (for curl or Postman).
To filter the list of source connectors, use the ListSourcesRequest object’s source_type parameter (for the Python SDK) or the query parameter source_type=<type> (for curl or Postman), replacing <type> with the source connector type’s unique ID (for example, s3 for the Amazon S3 source connector type). To get this ID, see Sources.
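For readers adapting the REST call to Python, the following is a minimal sketch that builds, but does not send, the GET request with the standard library. It is not the Unstructured Python SDK. The header name `unstructured-api-key` is an assumption, because this page does not name the header, and `list_sources_request` is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the API key is sent in a header with this name. Confirm the
# header name for your account before relying on this sketch.
API_KEY_HEADER = "unstructured-api-key"


def list_sources_request(api_url: str, api_key: str,
                         source_type: Optional[str] = None) -> Request:
    """Build (but do not send) a GET request for the /sources endpoint."""
    url = api_url.rstrip("/") + "/sources"
    if source_type:
        # Optional filter, for example source_type=s3 for Amazon S3.
        url += "?" + urlencode({"source_type": source_type})
    return Request(
        url,
        method="GET",
        headers={"accept": "application/json", API_KEY_HEADER: api_key},
    )


request = list_sources_request(
    "https://platform.unstructuredapp.io/api/v1", "<your-unstructured-api-key>", "s3"
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen` and decode the response body as JSON.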
Get a source connector
To get information about a source connector, use the UnstructuredClient object’s sources.get_source function (for the Python SDK) or the GET method to call the /sources/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the source connector’s unique ID. To get this ID, see List source connectors.
Create a source connector
To create a source connector, use the UnstructuredClient object’s sources.create_source function (for the Python SDK) or the POST method to call the /sources endpoint (for curl or Postman).
In the CreateSourceConnector object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the connector. For the specific settings to include, which differ by connector, see Sources.
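The request body is a JSON document that holds the connector settings. The following is a minimal sketch that builds, but does not send, the POST request with the standard library. The header name `unstructured-api-key`, the `config` key, and the `remote_url` setting are assumptions for illustration only; take the real settings for your connector type from Sources.

```python
import json
from urllib.request import Request

API_KEY_HEADER = "unstructured-api-key"  # Assumed header name.


def create_source_request(api_url: str, api_key: str, settings: dict) -> Request:
    """Build (but do not send) a POST request that creates a source connector."""
    return Request(
        api_url.rstrip("/") + "/sources",
        data=json.dumps(settings).encode("utf-8"),
        method="POST",
        headers={
            "accept": "application/json",
            "content-type": "application/json",
            API_KEY_HEADER: api_key,
        },
    )


# Hypothetical settings: "name" and "type" are mentioned on this page, but the
# "config" wrapper and its "remote_url" field are placeholders.
example_settings = {
    "name": "my-s3-source",
    "type": "s3",
    "config": {"remote_url": "s3://my-bucket/input/"},
}
create_request = create_source_request(
    "https://platform.unstructuredapp.io/api/v1",
    "<your-unstructured-api-key>",
    example_settings,
)
print(create_request.get_method(), create_request.full_url)
```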
Update a source connector
To update information about a source connector, use the UnstructuredClient object’s sources.update_source function (for the Python SDK) or the PUT method to call the /sources/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the source connector’s unique ID. To get this ID, see List source connectors.
In the UpdateSourceConnector object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the connector. For the specific settings to include, which differ by connector, see Sources.
You must specify all of the settings for the connector, even for settings that are not changing.
You can change any of the connector’s settings except for its name and type.
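Because an update must carry every setting, one approach is to start from the connector's current settings and overlay only the values that change. The following is a minimal sketch of that merge. It assumes the settings are available as a flat dictionary, which may not match the real request shape, and `build_update_settings` is a hypothetical helper.

```python
from typing import Any, Dict

# The page states that a connector's name and type cannot be changed.
IMMUTABLE_SETTINGS = {"name", "type"}


def build_update_settings(current: Dict[str, Any],
                          changes: Dict[str, Any]) -> Dict[str, Any]:
    """Return the full settings for an update: current values plus changes."""
    blocked = sorted(IMMUTABLE_SETTINGS & changes.keys())
    if blocked:
        raise ValueError("These settings cannot be changed: " + ", ".join(blocked))
    merged = dict(current)   # Keep every unchanged setting.
    merged.update(changes)   # Overlay only the values that differ.
    return merged


# Hypothetical setting names, for illustration only.
current_settings = {"remote_url": "s3://my-bucket/input/", "recursive": False}
print(build_update_settings(current_settings, {"recursive": True}))
```

The merged dictionary is what you would place in the request body, which avoids accidentally dropping a setting that was not meant to change.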
Delete a source connector
To delete a source connector, use the UnstructuredClient object’s sources.delete_source function (for the Python SDK) or the DELETE method to call the /sources/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the source connector’s unique ID. To get this ID, see List source connectors.
List destination connectors
To list destination connectors, use the UnstructuredClient object’s destinations.list_destinations function (for the Python SDK) or the GET method to call the /destinations endpoint (for curl or Postman).
To filter the list of destination connectors, use the ListDestinationsRequest object’s destination_type parameter (for the Python SDK) or the query parameter destination_type=<type> (for curl or Postman), replacing <type> with the destination connector type’s unique ID (for example, s3 for the Amazon S3 destination connector type). To get this ID, see Destinations.
Get a destination connector
To get information about a destination connector, use the UnstructuredClient object’s destinations.get_destination function (for the Python SDK) or the GET method to call the /destinations/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the destination connector’s unique ID. To get this ID, see List destination connectors.
Create a destination connector
To create a destination connector, use the UnstructuredClient object’s destinations.create_destination function (for the Python SDK) or the POST method to call the /destinations endpoint (for curl or Postman).
In the CreateDestinationConnector object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the connector. For the specific settings to include, which differ by connector, see Destinations.
Update a destination connector
To update information about a destination connector, use the UnstructuredClient object’s destinations.update_destination function (for the Python SDK) or the PUT method to call the /destinations/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the destination connector’s unique ID. To get this ID, see List destination connectors.
In the UpdateDestinationConnector object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the connector. For the specific settings to include, which differ by connector, see Destinations.
You must specify all of the settings for the connector, even for settings that are not changing.
You can change any of the connector’s settings except for its name and type.
Delete a destination connector
To delete a destination connector, use the UnstructuredClient object’s destinations.delete_destination function (for the Python SDK) or the DELETE method to call the /destinations/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the destination connector’s unique ID. To get this ID, see List destination connectors.
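The source and destination connector operations share one pattern, which can be summarized as a lookup from operation to HTTP method and path. The following is a minimal sketch of that summary, using only the methods and paths described in this section; the name `connector_call` is hypothetical.

```python
from typing import Optional, Tuple

# HTTP method and path template for each connector operation described above.
CONNECTOR_OPERATIONS = {
    "list": ("GET", "/{kind}"),
    "get": ("GET", "/{kind}/{connector_id}"),
    "create": ("POST", "/{kind}"),
    "update": ("PUT", "/{kind}/{connector_id}"),
    "delete": ("DELETE", "/{kind}/{connector_id}"),
}


def connector_call(kind: str, operation: str,
                   connector_id: Optional[str] = None) -> Tuple[str, str]:
    """Return the (HTTP method, path) pair for a connector operation."""
    if kind not in ("sources", "destinations"):
        raise ValueError(f"Unknown connector kind: {kind!r}")
    method, template = CONNECTOR_OPERATIONS[operation]
    if "{connector_id}" in template and not connector_id:
        raise ValueError(f"The {operation!r} operation needs a connector ID")
    return method, template.format(kind=kind, connector_id=connector_id)


print(connector_call("destinations", "delete", "<connector-id>"))
```

Append the returned path to your API URL to get the full endpoint address.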
Workflows
You can list, get, create, run, update, and delete workflows.
For general information, see Workflows.
List workflows
To list workflows, use the UnstructuredClient object’s workflows.list_workflows function (for the Python SDK) or the GET method to call the /workflows endpoint (for curl or Postman).
To filter the list of workflows, use one or more of the following ListWorkflowsRequest parameters (for the Python SDK) or query parameters (for curl or Postman):
- source_id=<connector-id>, replacing <connector-id> with the source connector’s unique ID. To get this ID, see List source connectors.
- destination_id=<connector-id>, replacing <connector-id> with the destination connector’s unique ID. To get this ID, see List destination connectors.
- status=<status>, replacing <status> with one of the following workflow statuses: active or inactive.
You can specify multiple query parameters, for example ?source_id=<connector-id>&status=<status>.
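The filters above combine into a single query string. The following is a minimal sketch that assembles the /workflows path from whichever filters are supplied and checks the status value against the two statuses listed; `workflows_path` is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import urlencode

# Workflow statuses listed in this section.
WORKFLOW_STATUSES = {"active", "inactive"}


def workflows_path(source_id: Optional[str] = None,
                   destination_id: Optional[str] = None,
                   status: Optional[str] = None) -> str:
    """Build the /workflows path, adding only the filters that were given."""
    if status is not None and status not in WORKFLOW_STATUSES:
        raise ValueError(f"Unsupported workflow status: {status!r}")
    candidates = (
        ("source_id", source_id),
        ("destination_id", destination_id),
        ("status", status),
    )
    params = {name: value for name, value in candidates if value is not None}
    return "/workflows" + ("?" + urlencode(params) if params else "")


print(workflows_path(source_id="<connector-id>", status="active"))
```

Note that `urlencode` percent-encodes characters such as angle brackets, so real IDs should replace the placeholders before the path is used.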
Get a workflow
To get information about a workflow, use the UnstructuredClient object’s workflows.get_workflow function (for the Python SDK) or the GET method to call the /workflows/<workflow-id> endpoint (for curl or Postman), replacing <workflow-id> with the workflow’s unique ID. To get this ID, see List workflows.
Create a workflow
To create a workflow, use the UnstructuredClient object’s workflows.create_workflow function (for the Python SDK) or the POST method to call the /workflows endpoint (for curl or Postman).
In the CreateWorkflow object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the workflow. For the specific settings to include, see Create a workflow.
Run a workflow
To run a workflow manually, use the UnstructuredClient object’s workflows.run_workflow function (for the Python SDK) or the POST method to call the /workflows/<workflow-id>/run endpoint (for curl or Postman), replacing <workflow-id> with the workflow’s unique ID. To get this ID, see List workflows.
To run a workflow on a schedule instead, specify the schedule setting in the request body when you create or update a workflow. See Create a workflow or Update a workflow.
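A manual run is a POST to the workflow's run endpoint. The following is a minimal sketch that builds, but does not send, that request with the standard library. The header name `unstructured-api-key` is an assumption, and `run_workflow_request` is a hypothetical helper rather than an SDK function.

```python
from urllib.parse import quote
from urllib.request import Request

API_KEY_HEADER = "unstructured-api-key"  # Assumed header name.


def run_workflow_request(api_url: str, api_key: str, workflow_id: str) -> Request:
    """Build (but do not send) a POST request that runs a workflow once."""
    if not workflow_id:
        raise ValueError("A workflow ID is required")
    # Percent-encode the ID so that it is always a single path segment.
    url = f"{api_url.rstrip('/')}/workflows/{quote(workflow_id, safe='')}/run"
    return Request(
        url,
        method="POST",
        headers={"accept": "application/json", API_KEY_HEADER: api_key},
    )


run_request = run_workflow_request(
    "https://platform.unstructuredapp.io/api/v1", "<your-unstructured-api-key>", "1234-abcd"
)
print(run_request.get_method(), run_request.full_url)
```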
Update a workflow
To update information about a workflow, use the UnstructuredClient object’s workflows.update_workflow function (for the Python SDK) or the PUT method to call the /workflows/<workflow-id> endpoint (for curl or Postman), replacing <workflow-id> with the workflow’s unique ID. To get this ID, see List workflows.
In the UpdateWorkflow object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the workflow. For the specific settings to include, see Update a workflow.
Delete a workflow
To delete a workflow, use the UnstructuredClient object’s workflows.delete_workflow function (for the Python SDK) or the DELETE method to call the /workflows/<workflow-id> endpoint (for curl or Postman), replacing <workflow-id> with the workflow’s unique ID. To get this ID, see List workflows.
Jobs
You can list, get, and cancel jobs.
A job is created automatically whenever a workflow runs on a schedule; see Create a workflow. A job is also created whenever you run a workflow; see Run a workflow.
For general information, see Jobs.
List jobs
To list jobs, use the UnstructuredClient object’s jobs.list_jobs function (for the Python SDK) or the GET method to call the /jobs endpoint (for curl or Postman).
To filter the list of jobs, use one or both of the following ListJobsRequest parameters (for the Python SDK) or query parameters (for curl or Postman):
- workflow_id=<workflow-id>, replacing <workflow-id> with the workflow’s unique ID. To get this ID, see List workflows.
- status=<status>, replacing <status> with one of the following job statuses: failed, finished, or running.
For curl or Postman, you can specify multiple query parameters as ?workflow_id=<workflow-id>&status=<status>.
Get a job
To get information about a job, use the UnstructuredClient object’s jobs.get_job function (for the Python SDK) or the GET method to call the /jobs/<job-id> endpoint (for curl or Postman), replacing <job-id> with the job’s unique ID. To get this ID, see List jobs.
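A common use of this operation is to poll a job until it stops running. The following is a minimal sketch of such a loop. It assumes that the job information includes a `status` field holding one of the job statuses listed under List jobs, which this page does not confirm. The `get_job` argument is a caller-supplied function that fetches one job as a dictionary, and `wait_for_job` is a hypothetical helper.

```python
import time
from typing import Any, Callable, Dict


def wait_for_job(get_job: Callable[[str], Dict[str, Any]],
                 job_id: str,
                 interval_seconds: float = 5.0,
                 max_checks: int = 60,
                 sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Poll a job until its status is no longer 'running', then return it."""
    for _ in range(max_checks):
        job = get_job(job_id)
        # Assumption: the job information carries a "status" field.
        if job.get("status") != "running":
            return job
        sleep(interval_seconds)
    raise TimeoutError(f"Job {job_id} was still running after {max_checks} checks")


# Stand-in for a real fetch function, so that the sketch runs offline.
_responses = iter([{"status": "running"}, {"status": "running"}, {"status": "finished"}])
final_job = wait_for_job(lambda job_id: next(_responses), "<job-id>", sleep=lambda s: None)
print(final_job["status"])
```

Passing the fetch and sleep functions as arguments keeps the loop independent of any particular HTTP client and makes it easy to test.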
Cancel a job
To cancel a running job, use the UnstructuredClient object’s jobs.cancel_job function (for the Python SDK) or the POST method to call the /jobs/<job-id>/cancel endpoint (for curl or Postman), replacing <job-id> with the job’s unique ID. To get this ID, see List jobs.