POST /api/v1/workflows
curl --request POST \
  --url "${UNSTRUCTURED_API_URL}/api/v1/workflows/" \
  --header "unstructured-api-key: ${UNSTRUCTURED_API_KEY}" \
  --header "Content-Type: application/json" \
  --data '{
    "name": "my-workflow",
    "workflow_type": "auto",
    "source_id": "7f3e2a1b-4c5d-6e7f-8a9b-0c1d2e3f4a5b",
    "destination_id": "1a2b3c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d",
    "schedule": "daily"
  }'
{
  "id": "9b8c7d6e-5f4a-3b2c-1d0e-9f8a7b6c5d4e",
  "name": "my-workflow",
  "workflow_type": "auto",
  "status": "active",
  "source_id": "7f3e2a1b-4c5d-6e7f-8a9b-0c1d2e3f4a5b",
  "destination_id": "1a2b3c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d",
  "schedule": "daily",
  "dag_nodes": null,
  "created_at": "2026-04-29T10:00:00Z",
  "updated_at": null
}

This endpoint creates a workflow that persists until it is explicitly deleted (a long-lived workflow). To create a workflow that exists only for the duration of a single job run using local files as input, use the create job endpoint instead.
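
As a rough Python counterpart to the curl request above, the following sketch builds the same POST request with only the standard library. The helper name build_create_workflow_request and the fallback URL and key are illustrative assumptions, not part of the Unstructured API or SDK; the request is constructed but not sent.

```python
import json
import os
import urllib.request


def build_create_workflow_request(base_url, api_key, body):
    """Build (but do not send) a POST request for /api/v1/workflows/.

    Hypothetical helper for illustration; not part of any Unstructured SDK.
    """
    url = base_url.rstrip("/") + "/api/v1/workflows/"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "unstructured-api-key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Same body as the curl example.
body = {
    "name": "my-workflow",
    "workflow_type": "auto",
    "source_id": "7f3e2a1b-4c5d-6e7f-8a9b-0c1d2e3f4a5b",
    "destination_id": "1a2b3c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d",
    "schedule": "daily",
}

# The fallback values are placeholders so the sketch runs without credentials.
request = build_create_workflow_request(
    os.environ.get("UNSTRUCTURED_API_URL", "https://example.invalid"),
    os.environ.get("UNSTRUCTURED_API_KEY", "placeholder-key"),
    body,
)

# To send it (requires real credentials and network access):
# with urllib.request.urlopen(request) as response:
#     workflow = json.loads(response.read())
```

Keeping request construction separate from sending makes it possible to inspect the URL, headers, and body before anything goes over the network.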

Body

name
string
required
Workflow name.
workflow_type
string
required
Execution mode. auto uses sensible default workflow settings to enable you to get good-quality results faster. custom enables you to fine-tune the workflow settings to get very specific results.
The workflow types advanced, basic, and platinum are non-operational and will be removed in a future release.
Workflows with no source_id use a local file source. Local-source workflows must set workflow_type to custom, cannot be set to run on a repeating schedule, and cannot be run from the Unstructured UI (though they can be run via the API or Python SDK).
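
The workflow_type rules above can be checked on the client before sending a request. The following is a minimal sketch under that assumption; check_workflow_body is a hypothetical helper, and the API remains the authority on what it accepts.

```python
NON_OPERATIONAL_TYPES = ("advanced", "basic", "platinum")


def check_workflow_body(body):
    """Return a list of problems found in a create-workflow body.

    Hypothetical client-side pre-check based on the rules described above.
    """
    problems = []
    for field in ("name", "workflow_type"):
        if not body.get(field):
            problems.append(f"{field} is required")

    workflow_type = body.get("workflow_type")
    if workflow_type in NON_OPERATIONAL_TYPES:
        problems.append(f"workflow_type {workflow_type!r} is non-operational")

    if not body.get("source_id"):
        # No source_id means the workflow uses a local file source.
        if workflow_type != "custom":
            problems.append("local-source workflows must use workflow_type 'custom'")
        if body.get("schedule"):
            problems.append("local-source workflows cannot have a schedule")
    return problems
```

For example, a body with a name, workflow_type set to auto, and no source_id yields one problem, because a local-source workflow must be custom.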
source_id
string
ID of the source connector.
destination_id
string
ID of the destination connector.
workflow_nodes
array
Processing pipeline stages. Each node requires id (string, UUID) and node_type (string), and supports optional node_subtype (string), config (object), and params (object).
For more information on workflow nodes, see Workflow nodes.
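
The node shape described above can be sketched as a small structural check. check_workflow_node is a hypothetical helper, and it only verifies field presence and types; the valid node_type and node_subtype values are not listed here, so the sketch does not check them.

```python
import uuid


def check_workflow_node(node):
    """Return a list of structural problems in one workflow_nodes entry.

    Hypothetical helper: checks only the field names and types listed above.
    """
    problems = []

    node_id = node.get("id")
    if not isinstance(node_id, str):
        problems.append("id must be a string")
    else:
        try:
            # uuid.UUID also accepts some non-canonical spellings (no hyphens, braces).
            uuid.UUID(node_id)
        except ValueError:
            problems.append("id must be a UUID")

    if not isinstance(node.get("node_type"), str):
        problems.append("node_type must be a string")

    if "node_subtype" in node and not isinstance(node["node_subtype"], str):
        problems.append("node_subtype must be a string")

    for key in ("config", "params"):
        if key in node and not isinstance(node[key], dict):
            problems.append(f"{key} must be an object")
    return problems
```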
template_id
string
ID of a pre-built workflow template to use as the basis for the workflow.
schedule
string
Repeating run schedule. Valid values and their cron equivalents:
| Value | cron | Description |
| --- | --- | --- |
| every 15 minutes | */15 * * * * | Every 15 minutes. |
| every hour | 0 * * * * | At the first minute of every hour. |
| every 2 hours | 0 */2 * * * | At the first minute of every second hour. |
| every 4 hours | 0 */4 * * * | At the first minute of every fourth hour. |
| every 6 hours | 0 */6 * * * | At the first minute of every sixth hour. |
| every 8 hours | 0 */8 * * * | At the first minute of every eighth hour. |
| every 10 hours | 0 */10 * * * | At the first minute of every tenth hour. |
| every 12 hours | 0 */12 * * * | At the first minute of every twelfth hour. |
| daily | 0 0 * * * | At the first minute of every day. |
| weekly | 0 0 * * 0 | At the first minute of every Sunday. |
| monthly | 0 0 1 * * | At the first minute of the first day of every month. |
If omitted, the workflow does not automatically run on a repeating schedule.
Workflows with a local source cannot be set to run on a repeating schedule.
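
The schedule values and their cron equivalents listed above can be captured in a lookup table, which is handy for validating a schedule string before sending it. This is a minimal sketch; SCHEDULE_TO_CRON and cron_for are illustrative names, not part of the API.

```python
# Schedule values accepted by the endpoint, mapped to their cron equivalents.
SCHEDULE_TO_CRON = {
    "every 15 minutes": "*/15 * * * *",
    "every hour": "0 * * * *",
    "every 2 hours": "0 */2 * * *",
    "every 4 hours": "0 */4 * * *",
    "every 6 hours": "0 */6 * * *",
    "every 8 hours": "0 */8 * * *",
    "every 10 hours": "0 */10 * * *",
    "every 12 hours": "0 */12 * * *",
    "daily": "0 0 * * *",
    "weekly": "0 0 * * 0",
    "monthly": "0 0 1 * *",
}


def cron_for(schedule):
    """Return the cron expression for a schedule value, or None if no schedule is set."""
    if schedule is None:
        return None
    if schedule not in SCHEDULE_TO_CRON:
        raise ValueError(f"unsupported schedule value: {schedule!r}")
    return SCHEDULE_TO_CRON[schedule]
```

For example, cron_for("daily") returns "0 0 * * *", and an unlisted value such as "hourly" raises ValueError.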
reprocess_all
boolean
Default: false. If true, reprocesses all documents in the source location on every run. If false, the workflow excludes from future processing any files that Unstructured determines are unchanged since the last time the workflow ran.
Unstructured determines whether a document has changed based on the document version. For each workflow, Unstructured maintains a record of documents (and their versions, if present) processed by that workflow. Each document record consists of:
  • A record_id derived from the document name and path.
  • A record_version derived from either the document ETag (if the source provider generates one) or the source provider’s native version identifier.
When you set reprocess_all to false for a source connector that supports reprocess_all, Unstructured uses this list of records to determine whether or not to process each document:
  • If the record_id does not exist in the workflow records, Unstructured processes the document.
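  • If the record_id exists but the record_version has changed, or there is no record_version, Unstructured processes the document.
  • If the record_id exists and the record_version is unchanged, Unstructured does not process the document.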
The following table lists the possible record_id and record_version combinations, and the action Unstructured takes in each case:
| record_id | record_version | Action |
| --- | --- | --- |
| Exists | Unchanged | Do not process file |
| Exists | Changed | Process file |
| Exists | (none) | Process file |
| New | (Does not apply) | Process file |
Renaming a document results in a new record_id; Unstructured will then reprocess the renamed document when the workflow runs.
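
The decision rules above can be sketched as a single function. This is an illustration of the documented behavior, not Unstructured's implementation: the records dictionary (record_id mapped to the last processed record_version) and the use of a file path as the record_id are assumptions made for the example.

```python
def should_process(records, record_id, record_version, reprocess_all=False):
    """Decide whether a document would be processed on the next run.

    records: hypothetical mapping of record_id -> last processed record_version.
    record_version: the document's current version, or None if it has none.
    """
    if reprocess_all:
        return True  # every document is reprocessed on every run
    if record_id not in records:
        return True  # new document (or a renamed one, which gets a new record_id)
    if record_version is None:
        return True  # no version to compare against
    return records[record_id] != record_version  # process only if changed


records = {"docs/report.pdf": "v1"}
print(should_process(records, "docs/report.pdf", "v1"))   # unchanged
print(should_process(records, "docs/report.pdf", "v2"))   # changed
print(should_process(records, "docs/renamed.pdf", "v1"))  # new record_id
```

The three calls cover the unchanged, changed, and new rows of the table: the first prints False and the other two print True.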
The following table lists the source connectors that support the reprocess_all setting. The record_version base column specifies the versioning information Unstructured uses to generate the corresponding record version for each processed document.
Source connectors that do not support reprocess_all reprocess every document in the source location each time the workflow runs.
| Connector | record_version base |
| --- | --- |
| Amazon S3 | ETag |
| Azure Blob Storage | ETag |
| Box | Provider version ID |
| Dropbox | Provider version ID |
| Elasticsearch | Provider version ID |
| Google Cloud Storage | ETag |
| Google Drive | Provider version ID |
| Microsoft OneDrive | Provider version ID |
| Microsoft SharePoint | Provider version ID |
Additional considerations to take into account when setting reprocess_all to false:
  • Unstructured only adds document records for documents that it successfully processes. Documents that failed to process will be reprocessed the next time the workflow is run.
  • Because S3 ETags are content-based, changing the metadata on an S3 object will not result in it being reprocessed.
  • For source providers that support the S3 protocol, be aware that deleting an object and then reuploading it to the source location will maintain the same record_id, but may result in a different record_version being generated. This is especially true of multipart uploads. This results in Unstructured reprocessing the document.
  • For source providers that offer Key Management Services (KMS), be aware that server-side encryption can change document ETags. This results in the record_version of a document changing, and Unstructured reprocessing the document.
  • If you clone or recreate a source connector, the resulting connector does not include the document processing history of the previous connector.
  • Changing a workflow’s configuration does not automatically result in Unstructured reprocessing all documents. For example, changing chunker, embedder, enrichment, or partitioner settings may not result in reprocessing all documents. To reprocess all documents using new workflow settings, set reprocess_all to true for at least the next workflow run.
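
To illustrate why re-uploading identical content as a multipart upload can yield a different record_version, the sketch below reproduces the ETag scheme commonly observed on Amazon S3 for unencrypted objects: a plain MD5 for single-part uploads, and an MD5 over the concatenated part digests plus a part-count suffix for multipart uploads. This scheme is an assumption about the storage provider, not something Unstructured's documentation specifies, and providers may compute ETags differently.

```python
import hashlib


def single_part_etag(data: bytes) -> str:
    """ETag as commonly observed for a single-part upload: MD5 of the content."""
    return hashlib.md5(data).hexdigest()


def multipart_etag(data: bytes, part_size: int) -> str:
    """ETag as commonly observed for a multipart upload.

    MD5 of the concatenated binary MD5 digests of each part, followed by
    "-<number of parts>". Assumed behavior, shown for illustration only.
    """
    parts = [data[i:i + part_size] for i in range(0, len(data), part_size)]
    digests = b"".join(hashlib.md5(part).digest() for part in parts)
    return f"{hashlib.md5(digests).hexdigest()}-{len(parts)}"


content = b"x" * 10
print(single_part_etag(content))
print(multipart_etag(content, part_size=5))
```

The two printed values differ even though the content is identical, which is the situation in which a document with the same record_id would be treated as changed.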

Response

id
string
required
Unique identifier for the workflow.
name
string
required
Workflow name.
workflow_type
string
required
Workflow type: custom or auto.
status
string
required
Workflow state: active, inactive, or paused.
created_at
string
required
ISO 8601 timestamp when the workflow was created.
source_id
string
Source connector ID.
destination_id
string
Destination connector ID.
schedule
string
Repeating run schedule.
dag_nodes
array
Workflow processing pipeline nodes.
updated_at
string
ISO 8601 timestamp when the workflow was last updated.
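
The response fields above can be read with the standard library. This is a minimal sketch using the sample response from this page; parse_timestamp is a hypothetical helper, and replacing the trailing Z is a workaround for Python versions before 3.11, whose datetime.fromisoformat does not accept it.

```python
import json
from datetime import datetime


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp such as created_at; pass None through."""
    if value is None:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


# Sample response body from this page.
response_text = """
{
  "id": "9b8c7d6e-5f4a-3b2c-1d0e-9f8a7b6c5d4e",
  "name": "my-workflow",
  "workflow_type": "auto",
  "status": "active",
  "source_id": "7f3e2a1b-4c5d-6e7f-8a9b-0c1d2e3f4a5b",
  "destination_id": "1a2b3c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d",
  "schedule": "daily",
  "dag_nodes": null,
  "created_at": "2026-04-29T10:00:00Z",
  "updated_at": null
}
"""

workflow = json.loads(response_text)
created_at = parse_timestamp(workflow["created_at"])
updated_at = parse_timestamp(workflow["updated_at"])  # None until the workflow is updated

# Optional fields may be null, so read them with .get() and handle None.
dag_nodes = workflow.get("dag_nodes") or []
print(workflow["id"], workflow["status"], created_at.isoformat(), len(dag_nodes))
```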