Create workflow - Unstructured

curl --request POST \
  --url "${UNSTRUCTURED_API_URL}/api/v1/workflows/" \
  --header "unstructured-api-key: ${UNSTRUCTURED_API_KEY}" \
  --header "Content-Type: application/json" \
  --data '{
    "name": "my-workflow",
    "workflow_type": "auto",
    "source_id": "7f3e2a1b-4c5d-6e7f-8a9b-0c1d2e3f4a5b",
    "destination_id": "1a2b3c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d",
    "schedule": "daily"
  }'

{
  "id": "9b8c7d6e-5f4a-3b2c-1d0e-9f8a7b6c5d4e",
  "name": "my-workflow",
  "workflow_type": "auto",
  "status": "active",
  "source_id": "7f3e2a1b-4c5d-6e7f-8a9b-0c1d2e3f4a5b",
  "destination_id": "1a2b3c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d",
  "schedule": "daily",
  "dag_nodes": null,
  "created_at": "2026-04-29T10:00:00Z",
  "updated_at": null
}
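
For callers who prefer Python to curl, the same request can be sketched with the standard library alone. This is an illustrative sketch rather than the official Python SDK: the helper name build_create_workflow_request and the fallback values for the two environment variables are assumptions made so the example runs anywhere. The request is built but not sent.

```python
import json
import os
import urllib.request

# Assumed fallbacks so the sketch runs without the environment variables set.
API_URL = os.environ.get("UNSTRUCTURED_API_URL", "https://example.invalid")
API_KEY = os.environ.get("UNSTRUCTURED_API_KEY", "placeholder-key")


def build_create_workflow_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    return urllib.request.Request(
        url=f"{API_URL}/api/v1/workflows/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "unstructured-api-key": API_KEY,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Same body as the curl example above.
request = build_create_workflow_request(
    {
        "name": "my-workflow",
        "workflow_type": "auto",
        "source_id": "7f3e2a1b-4c5d-6e7f-8a9b-0c1d2e3f4a5b",
        "destination_id": "1a2b3c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d",
        "schedule": "daily",
    }
)
```

To send the request, pass it to urllib.request.urlopen and decode the JSON response body.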

POST /api/v1/workflows/

This endpoint creates a workflow that persists until it is explicitly deleted (a long-lived workflow). To create a workflow that exists only for the duration of a single job run using local files as input, use the create job endpoint instead.

Body

name

string

required

Workflow name.

workflow_type

string

required

Execution mode. auto uses sensible default workflow settings to enable you to get good-quality results faster. custom enables you to fine-tune the workflow settings to get very specific results.

The workflow types advanced, basic, and platinum are non-operational and will be removed in a future release.

Workflows with no source_id use a local file source. Local-source workflows must set workflow_type to custom, cannot be set to run on a repeating schedule, and cannot be run from Unstructured Pipelines (though they can be run via the API or Python SDK).
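
The local-source rules above can be checked on the client before the request is sent. The following is a minimal sketch: the function name local_source_errors is hypothetical, and the API may enforce these rules differently or report them with different messages.

```python
def local_source_errors(payload: dict) -> list[str]:
    """Return rule violations for a workflow payload that has no source_id."""
    if payload.get("source_id"):
        # A source connector is set, so the local-source rules do not apply.
        return []
    errors = []
    if payload.get("workflow_type") != "custom":
        errors.append("local-source workflows must set workflow_type to custom")
    if payload.get("schedule"):
        errors.append("local-source workflows cannot run on a repeating schedule")
    return errors


# A local-source payload that breaks both rules.
bad = {"name": "my-workflow", "workflow_type": "auto", "schedule": "daily"}
# A local-source payload that satisfies both rules.
good = {"name": "my-workflow", "workflow_type": "custom"}
```

Calling local_source_errors(bad) returns two messages, while local_source_errors(good) returns an empty list.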

source_id

string

ID of the source connector.

destination_id

string

ID of the destination connector.

workflow_nodes

array

Processing pipeline stages. Each node requires id (string, UUID) and node_type (string), and supports optional node_subtype (string), config (object), and params (object). For more information on workflow nodes, see Workflow nodes.
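
As a sketch of the node shape described above, the following checks one node dictionary against the required and optional fields. The function name node_errors and the node_type value "example-type" are placeholders for illustration only; valid node types are listed in Workflow nodes, not here.

```python
import uuid

# Optional fields and the Python type each JSON type maps to.
OPTIONAL_FIELDS = {"node_subtype": str, "config": dict, "params": dict}


def node_errors(node: dict) -> list[str]:
    """Return shape problems for a single workflow node."""
    errors = []
    try:
        uuid.UUID(str(node.get("id")))
    except ValueError:
        errors.append("id must be a UUID string")
    if not isinstance(node.get("node_type"), str):
        errors.append("node_type must be a string")
    for field, expected in OPTIONAL_FIELDS.items():
        if field in node and not isinstance(node[field], expected):
            errors.append(f"{field} has the wrong type")
    return errors


# "example-type" is a placeholder, not a real node type.
node = {"id": str(uuid.uuid4()), "node_type": "example-type", "config": {}}
```

Here node_errors(node) returns an empty list. A node without an id, or with a config that is not an object, produces one message per problem.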

template_id

string

ID of a pre-built workflow template to use as the basis for the workflow.

schedule

string

Repeating run schedule. Valid values and their cron equivalents:

Value	cron	Description
`every 15 minutes`	`*/15 * * * *`	Every 15 minutes.
`every hour`	`0 * * * *`	At the first minute of every hour.
`every 2 hours`	`0 */2 * * *`	At the first minute of every second hour.
`every 4 hours`	`0 */4 * * *`	At the first minute of every fourth hour.
`every 6 hours`	`0 */6 * * *`	At the first minute of every sixth hour.
`every 8 hours`	`0 */8 * * *`	At the first minute of every eighth hour.
`every 10 hours`	`0 */10 * * *`	At the first minute of every tenth hour.
`every 12 hours`	`0 */12 * * *`	At the first minute of every twelfth hour.
`daily`	`0 0 * * *`	At the first minute of every day.
`weekly`	`0 0 * * 0`	At the first minute of every Sunday.
`monthly`	`0 0 1 * *`	At the first minute of the first day of every month.

If omitted, the workflow does not automatically run on a repeating schedule. Workflows with a local source cannot be set to run on a repeating schedule.
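
A client can check a schedule value before sending it. The lookup below is a sketch that restates the schedule values as standard five-field cron expressions. The names SCHEDULE_TO_CRON and cron_for are hypothetical, and the API accepts the schedule value itself, not the cron string.

```python
from typing import Optional

# Schedule values mapped to five-field cron expressions
# (minute, hour, day of month, month, day of week).
SCHEDULE_TO_CRON = {
    "every 15 minutes": "*/15 * * * *",
    "every hour": "0 * * * *",
    "every 2 hours": "0 */2 * * *",
    "every 4 hours": "0 */4 * * *",
    "every 6 hours": "0 */6 * * *",
    "every 8 hours": "0 */8 * * *",
    "every 10 hours": "0 */10 * * *",
    "every 12 hours": "0 */12 * * *",
    "daily": "0 0 * * *",
    "weekly": "0 0 * * 0",
    "monthly": "0 0 1 * *",
}


def cron_for(schedule: Optional[str]) -> Optional[str]:
    """Return the cron equivalent, or None when no schedule is set."""
    if schedule is None:
        return None  # Omitted schedule: the workflow does not repeat.
    if schedule not in SCHEDULE_TO_CRON:
        raise ValueError(f"unsupported schedule value: {schedule!r}")
    return SCHEDULE_TO_CRON[schedule]
```

Rejecting unknown values locally gives a clearer error than waiting for the API to refuse the request.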

reprocess_all

boolean

Default: false. If true, reprocesses all documents in the source location on every run. If false, the workflow excludes from future processing any files Unstructured determines are unchanged since the last time the workflow ran.

Unstructured determines if a document has changed based on the document version. For each workflow, Unstructured maintains a record of documents (and their versions, if present) processed by that workflow. Each document record consists of:

A record_id derived from the document name and path.
A record_version derived from either the document ETag (if the source provider generates one) or the source provider’s native version identifier.

When you set reprocess_all to false for a source connector that supports reprocess_all, Unstructured uses this list of records to determine whether or not to process each document:

If the record_id does not exist in the workflow records, Unstructured processes the document.
If the record_id exists, but the record_version has changed, or there is no record_version, Unstructured processes the document.

The following table lists the possible record_id and record_version combinations, and the action Unstructured takes in each case:

`record_id`	`record_version`	Action
Exists	Unchanged	Do not process file
Exists	Changed	Process file
Exists	(none)	Process file
New	(Does not apply)	Process file

Renaming a document results in a new record_id; Unstructured will then reprocess the renamed document when the workflow runs.
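
The decision table above can be expressed as a small function. This is an illustrative model only, not Unstructured's implementation: the function name should_process and the dictionary of stored records (record_id mapped to the last processed record_version) are assumptions.

```python
from typing import Optional


def should_process(
    stored: dict[str, Optional[str]],
    record_id: str,
    record_version: Optional[str],
) -> bool:
    """Apply the record_id / record_version table for reprocess_all = false."""
    if record_id not in stored:
        return True  # New record_id: process the file.
    if record_version is None:
        return True  # record_id exists but there is no version: process.
    # record_id exists: process only if the version has changed.
    return stored[record_id] != record_version


# Hypothetical history: one document processed at version "v1".
stored = {"reports/q1.pdf": "v1"}
```

With this history, an unchanged reports/q1.pdf at "v1" is skipped, the same path at "v2" is processed, and a renamed copy is processed because its record_id is new.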

The following table lists the source connectors that support the reprocess_all setting. The record_version base column specifies the versioning information Unstructured uses to generate the record version for each processed document. Source connectors that do not support reprocess_all reprocess every document in the source location each time the workflow runs.

Connector	`record_version` base
Amazon S3	ETag
Azure Blob Storage	ETag
Box	Provider version ID
Dropbox	Provider version ID
Elasticsearch	Provider version ID
Google Cloud Storage	ETag
Google Drive	Provider version ID
Microsoft OneDrive	Provider version ID
Microsoft SharePoint	Provider version ID

Additional considerations to take into account when setting reprocess_all to false:

Unstructured only adds document records for documents that it successfully processes. Documents that failed to process will be reprocessed the next time the workflow is run.
Because S3 ETags are content-based, changing the metadata on an S3 object will not result in it being reprocessed.
For source providers that support the S3 protocol, be aware that deleting an object and then reuploading it to the source location will maintain the same record_id, but may result in a different record_version being generated. This is especially true of multipart uploads. This results in Unstructured reprocessing the document.
For source providers that offer Key Management Services (KMS), be aware that server-side encryption can change document ETags. This results in the record_version of a document changing, and Unstructured reprocessing the document.
If you clone or recreate a source connector, the resulting connector does not include the document processing history of the previous connector.
Changing a workflow’s configuration does not automatically result in Unstructured reprocessing all documents. For example, changing chunker, embedder, enrichment, or partitioner settings may not result in reprocessing all documents. To reprocess all documents using new workflow settings, set reprocess_all to true for at least the next workflow run.
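
Two of these considerations, that failed documents are retried and that reprocess_all set to true forces a full run, can be seen in a toy simulation. This is a simplified model under stated assumptions, not Unstructured's implementation: run_workflow, its arguments, and the process callback are all hypothetical.

```python
from typing import Callable, Optional


def run_workflow(
    records: dict[str, Optional[str]],
    documents: dict[str, Optional[str]],
    process: Callable[[str], bool],
    reprocess_all: bool = False,
) -> list[str]:
    """Simulate one run; return the record_ids processed successfully."""
    succeeded = []
    for record_id, version in documents.items():
        unchanged = version is not None and records.get(record_id) == version
        if unchanged and not reprocess_all:
            continue  # Skip documents already recorded at this version.
        if process(record_id):
            # Only successful documents are added to the records.
            records[record_id] = version
            succeeded.append(record_id)
    return succeeded


records: dict[str, Optional[str]] = {}
documents = {"a.pdf": "v1", "b.pdf": "v1"}

# Run 1: b.pdf fails, so only a.pdf is recorded.
first = run_workflow(records, documents, lambda rid: rid != "b.pdf")
# Run 2: a.pdf is unchanged and skipped; b.pdf is retried.
second = run_workflow(records, documents, lambda rid: True)
# Run 3: reprocess_all forces every document through again.
third = run_workflow(records, documents, lambda rid: True, reprocess_all=True)
```

The third run models what to do after changing workflow settings: one run with reprocess_all set to true, after which it can be set back to false.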

Response

id

string

required

Unique identifier for the workflow.

name

string

required

Workflow name.

workflow_type

string

required

Workflow type: custom or auto.

status

string

required

Workflow state: active, inactive, or paused.

created_at

string

required

ISO 8601 timestamp when the workflow was created.

source_id

string

Source connector ID.

destination_id

string

Destination connector ID.

schedule

string

Repeating run schedule.

dag_nodes

array

Workflow processing pipeline nodes.

updated_at

string

ISO 8601 timestamp when the workflow was last updated.
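
The response fields above can be loaded into a typed structure on the client. The sketch below assumes the JSON shape of the example response. The Workflow dataclass and parse_timestamp helper are hypothetical names, not part of any SDK.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp, treating a trailing Z as UTC."""
    if value is None:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class Workflow:
    # Required response fields.
    id: str
    name: str
    workflow_type: str
    status: str
    created_at: datetime
    # Optional response fields.
    source_id: Optional[str] = None
    destination_id: Optional[str] = None
    schedule: Optional[str] = None
    dag_nodes: Optional[list] = None
    updated_at: Optional[datetime] = None

    @classmethod
    def from_json(cls, text: str) -> "Workflow":
        data = json.loads(text)
        data["created_at"] = parse_timestamp(data["created_at"])
        data["updated_at"] = parse_timestamp(data.get("updated_at"))
        return cls(**data)


sample = """{
  "id": "9b8c7d6e-5f4a-3b2c-1d0e-9f8a7b6c5d4e",
  "name": "my-workflow",
  "workflow_type": "auto",
  "status": "active",
  "source_id": "7f3e2a1b-4c5d-6e7f-8a9b-0c1d2e3f4a5b",
  "destination_id": "1a2b3c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d",
  "schedule": "daily",
  "dag_nodes": null,
  "created_at": "2026-04-29T10:00:00Z",
  "updated_at": null
}"""
workflow = Workflow.from_json(sample)
```

Because from_json passes every key to the constructor, a response containing fields not listed here raises a TypeError. A more tolerant client would filter the keys first.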
