# Workflows

## Workflows dashboard

<img src="https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflows-Sidebar.png?fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=9d5132a4746e738b3e12f261b1ea373f" alt="Workflows in the sidebar" data-og-width="1084" width="1084" data-og-height="413" height="413" data-path="img/ui/Workflows-Sidebar.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflows-Sidebar.png?w=280&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=d6c4fb1e9bad1f998944b31c2c275dc5 280w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflows-Sidebar.png?w=560&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=9804b3a5659bacccf8da2f8e15c977b2 560w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflows-Sidebar.png?w=840&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=cc79d6080f79079ab5d2f08b7d1b49e2 840w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflows-Sidebar.png?w=1100&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=5c2fcdd7d204c5016f44182a464a5cc2 1100w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflows-Sidebar.png?w=1650&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=9dc05b2e7f41a6dc124f3e99f2e6bb99 1650w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflows-Sidebar.png?w=2500&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=31dfa1766d5ca1abad9b0968cec9a0fe 2500w" />

To view the workflows dashboard, on the sidebar, click **Workflows**.

A workflow in Unstructured is a defined sequence of processes that automates data handling from source to destination. It lets you configure how and when data is ingested, processed, and stored.

Workflows are crucial for establishing a systematic approach to managing data flows within the platform, ensuring consistency, efficiency, and adherence to specific data processing requirements.

## Create a workflow

Unstructured provides two types of workflow builders:

* [Automatic](#create-an-automatic-workflow) or **Build it for Me** workflows, which use sensible default workflow settings so that you can get good-quality results faster.
* [Custom](#create-a-custom-workflow) or **Build it Myself** workflows, which enable you to fine-tune the workflow settings behind the scenes to get very specific results.

### Create an automatic workflow

<Warning>
  You must first have an existing source connector and destination connector to add to the workflow.

  You cannot create an automatic workflow that uses a local file as a source.

  If you do not have an existing remote connector for either your target source (input) or destination (output) location, [create the source connector](/ui/sources/overview), [create the destination connector](/ui/destinations/overview), and then return here.

  To see your existing connectors, on the sidebar, click **Connectors**, and then click **Sources** or **Destinations**.
</Warning>

To create an automatic workflow:

1. On the sidebar, click **Workflows**.

2. Click **New Workflow**.

3. Next to **Build it for Me**, click **Create Workflow**.

   <Note>If a radio button appears instead of **Build it for Me**, select it, and then click **Continue**.</Note>

4. For **Workflow Name**, enter a unique name for this workflow.

5. In the **Sources** dropdown list, select your source location.

6. In the **Destinations** dropdown list, select your destination location.

   <Note>You can select multiple source and destination locations. Files will be ingested from all of the selected source locations, and the processed data will be delivered to all of the selected destination locations.</Note>

7. Click **Continue**.

8. The **Reprocess All** box applies only to blob storage connectors such as the Amazon S3, Azure Blob Storage, and Google Cloud Storage connectors:

   * Checking this box reprocesses all documents in the source location on every workflow run.
   * Unchecking this box causes future runs to process only documents that were added to the source location, or updated in the source location, since the last workflow run. An update is detected by checking whether the file's version has changed. Previously processed documents are not processed again. However:

     * Even if this box is unchecked, a renamed file is always treated as a new file, regardless of whether the file's original contents have changed.
     * Even if this box is unchecked, a file that is removed but is added back later with the same file name is processed on future runs only if the file's contents have changed since the file was originally processed.

9. Click **Continue**.

10. If you want this workflow to run on a schedule, in the **Repeat Run** dropdown list, select one of the scheduling options, and fill in the scheduling settings. Otherwise, select **Don't repeat**.

11. Click **Complete**.
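The **Reprocess All** rules in step 8 can be pictured with a short Python sketch. This is an illustration only, not Unstructured's implementation: the `select_files` function, the dictionaries that map file names to version markers, and the idea that a record of processed files persists between runs are all assumptions made for this example.

```python
def select_files(source_files, processed, reprocess_all):
    """Return the sorted names of the files to process on this run.

    source_files: {file name: version marker} for files now in the source location.
    processed:    {file name: version marker} recorded on earlier runs
                  (hypothetical state, kept even after a file is removed).
    """
    if reprocess_all:
        # Box checked: every document in the source location is processed again.
        return sorted(source_files)
    # Box unchecked: only names never seen before, or names whose version changed.
    return sorted(
        name
        for name, version in source_files.items()
        if processed.get(name) != version
    )


# Hypothetical record of what earlier runs processed.
processed = {"a.pdf": "v1", "b.pdf": "v1", "old.pdf": "v1"}

# b.pdf was updated, c.pdf is new, and renamed.pdf is a.pdf's content under a new name.
source = {"a.pdf": "v1", "b.pdf": "v2", "c.pdf": "v1", "renamed.pdf": "v1"}

incremental = select_files(source, processed, reprocess_all=False)
everything = select_files(source, processed, reprocess_all=True)

# old.pdf was removed earlier and is now added back with unchanged contents.
readded = select_files({"old.pdf": "v1"}, processed, reprocess_all=False)
```

Because the sketch keys its record by file name, a renamed file looks new even when its contents are unchanged, and a file that is added back unchanged under its old name is skipped, which matches the two exceptions listed in step 8.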

By default, this workflow partitions, chunks, and generates embeddings as follows:

* **Partitioner**: **Auto** strategy

  Unstructured automatically analyzes and processes files on a page-by-page basis (for PDF files) and on a document-by-document basis for everything else:

  * If the page or document has no images and likely does not have tables, **Fast** partitioning is used, and the page or document is billed at the **Fast** rate for processing.
  * If the page or document has only a few tables or images with standard layouts and languages, **High Res** partitioning is used, and the page or document is billed at the **High Res** rate for processing.
  * If the page or document has more than a few tables or images, **VLM** partitioning is used, and the page or document is billed at the **VLM** rate for processing.

  [Learn about partitioning strategies](/ui/partitioning).

* **Chunker**: **Chunk by Title** strategy

  * **Contextual Chunking**: No (unchecked)
  * **Combine Text Under N Characters**: 3000
  * **Include Original Elements**: Yes (checked)
  * **Max Characters**: 5500
  * **Multipage Sections**: Yes (checked)
  * **New After N Characters**: 3500
  * **Overlap**: 350
  * **Overlap All**: Yes (checked)

  [Learn about chunking strategies](/ui/chunking).

* **Embedder**:

  * **Provider**: Azure OpenAI
  * **Model**: text-embedding-3-large, with 3072 dimensions

  [Learn about embedding providers and models](/ui/embedding).

* Enrichments:

  This workflow contains no enrichments, other than a **Chunker** node.

  [Learn about available enrichments](/ui/enriching/overview).
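The **Auto** routing described above for the partitioner can be pictured as a small decision function. The following Python sketch is a simplification under stated assumptions: the `PageStats` class, the `route` function, and especially the `few` cutoff are hypothetical. This page does not define how many tables or images count as "a few", and the sketch ignores the layout and language checks that the **High Res** rule mentions.

```python
from dataclasses import dataclass

# Hypothetical cutoff for "a few" tables or images; not a documented value.
FEW = 3


@dataclass
class PageStats:
    """Counts detected on one PDF page, or on one whole non-PDF document."""
    tables: int
    images: int


def route(page: PageStats, few: int = FEW) -> str:
    """Return the strategy name, which is also the rate the page is billed at."""
    visuals = page.tables + page.images
    if visuals == 0:
        return "Fast"      # no images and no tables
    if visuals <= few:
        return "High Res"  # only a few tables or images
    return "VLM"           # more than a few tables or images


text_only = route(PageStats(tables=0, images=0))
light = route(PageStats(tables=1, images=1))
heavy = route(PageStats(tables=4, images=2))
```

Under these assumptions, `text_only` is routed to **Fast**, `light` to **High Res**, and `heavy` to **VLM**.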

After this workflow is created, you can change any of its settings, including the workflow's
source connector, destination connector, partitioning, chunking, and embedding settings. You can also add enrichments
to the workflow.

<Warning>
  Unstructured can potentially generate image summary descriptions, table summary descriptions, table-to-HTML output, and generative OCR optimizations only for workflows that are configured in one of the following ways:

  * With a **Partitioner** node set to use the **Auto** or **High Res** partitioning strategy, and with an image summary description node, table summary description node, table-to-HTML output node, or generative OCR optimization node added.
  * With a **Partitioner** node set to use the **VLM** partitioning strategy. No image summary description node, table summary description node, table-to-HTML output node, or generative OCR optimization node is needed (or allowed).

  Even with these configurations, Unstructured actually generates image summary descriptions, table summary descriptions, and table-to-HTML output only for files that contain images or tables and are also eligible
  for processing with the following partitioning strategies:

  * **High Res**, when the workflow's **Partitioner** node is set to use **Auto** or **High Res**.
  * **VLM** or **High Res**, when the workflow's **Partitioner** node is set to use **VLM**.

  Unstructured never generates image summary descriptions, table summary descriptions, or table-to-HTML output for workflows that are configured as follows:

  * With a **Partitioner** node set to use the **Fast** partitioning strategy.
  * With a **Partitioner** node set to use the **Auto**, **High Res**, or **VLM** partitioning strategy, for all files that Unstructured encounters that do not contain images or tables.

  Unstructured never produces generative OCR optimizations for workflows with a **Partitioner** node set to use the **Fast** partitioning strategy.
</Warning>
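The rules in the preceding warning can be read as a small decision function. The following Python sketch covers image summary descriptions, table summary descriptions, and table-to-HTML output only. The function name and its parameters are hypothetical, and the sketch leaves out the additional per-file eligibility condition (which partitioning strategy a given file actually qualifies for).

```python
STRATEGIES = {"Auto", "High Res", "VLM", "Fast"}


def may_generate_summaries(
    partitioner: str,
    has_summary_node: bool,
    file_has_images_or_tables: bool,
) -> bool:
    """Return True if summaries or table-to-HTML output are possible for a file."""
    if partitioner not in STRATEGIES:
        raise ValueError(f"unknown partitioning strategy: {partitioner!r}")
    if partitioner == "Fast":
        return False  # never generated with the Fast strategy
    if not file_has_images_or_tables:
        return False  # nothing to summarize or convert
    if partitioner == "VLM":
        return True   # no separate node is needed (or allowed)
    # Auto or High Res: a matching enrichment node must be in the workflow.
    return has_summary_node


auto_with_node = may_generate_summaries("Auto", True, True)
auto_without_node = may_generate_summaries("Auto", False, True)
vlm_without_node = may_generate_summaries("VLM", False, True)
```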

To change the workflow's default settings or to add enrichments:

1. On the sidebar, click **Workflows**.
2. In the list of available workflows, click the workflow that was just created. This opens a visual designer that shows
   your workflow as a directed acyclic graph (DAG). This DAG contains a node representing each step in the workflow.
   There is one node for the partitioning step, another node for the chunking step, and so on.
3. To learn how to change a node's settings or to add enrichment nodes, click the **FAQ** button in the flyout pane in
   the workflow DAG designer.

If you did not previously set the workflow to run on a schedule, you can [run the workflow](#edit-delete-or-run-a-workflow) now.

### Create a custom workflow

<Tip>
  If you already have an existing workflow that you want to change, do the following:

  1. On the sidebar, click **Workflows**.
  2. Click the name of the workflow that you want to change.
  3. Skip ahead to Step 11 in the following procedure.
</Tip>

<Warning>
  You can create and save a custom workflow that uses a local file as a source or does not have a source or destination connector added. However, you cannot activate the workflow or run the workflow
  either manually or on a schedule until a source and destination connector are added to the workflow.

  If you do not have an existing connector for either your target source or destination location, [create the source connector](/ui/sources/overview), [create the destination connector](/ui/destinations/overview), and then return here.

  To see your existing connectors, on the sidebar, click **Connectors**, and then click **Sources** or **Destinations**.
</Warning>

1. On the sidebar, click **Workflows**.

2. Click **New Workflow**.

3. Click the **Build it Myself** option, and then click **Continue**.

4. In the **This workflow** pane, click the **Details** button.

   <img src="https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflow-Details.png?fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=c1d122e67465f86bb1db9dfbea20c5f1" alt="Workflow details" data-og-width="575" width="575" data-og-height="289" height="289" data-path="img/ui/Workflow-Details.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflow-Details.png?w=280&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=d944223a9e33290c09f3683625e54bd7 280w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflow-Details.png?w=560&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=8aad23cd43b2847c30168e836c1f65d1 560w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflow-Details.png?w=840&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=82b650ccb976445c98cd17cc653020b4 840w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflow-Details.png?w=1100&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=3aeaacf982e57a3a05d63662394cce98 1100w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflow-Details.png?w=1650&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=1282e21aaf2631c1403bf878d780166c 1650w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflow-Details.png?w=2500&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=a5ff4a4507d83ff80730295decf053df 2500w" />

5. Next to **Name**, click the pencil icon, enter a unique name for this workflow, and then click the check mark icon.

6. If you want this workflow to run on a schedule, click the **Schedule** button. In the **Repeat Run** dropdown list, select one of the scheduling options, and fill in the scheduling settings.

7. To overwrite any previously processed files, or to retry any documents that fail to process, click the **Settings** button, and check either or both of the boxes.

   The **Reprocess All Files** box applies only to blob storage connectors such as the Amazon S3, Azure Blob Storage, and Google Cloud Storage connectors:

   * Checking this box reprocesses all documents in the source location on every workflow run.
   * Unchecking this box causes future runs to process only documents that were added to the source location, or updated in the source location, since the last workflow run. An update is detected by checking whether the file's version has changed. Previously processed documents are not processed again. However:

     * Even if this box is unchecked, a renamed file is always treated as a new file, regardless of whether the file's original contents have changed.
     * Even if this box is unchecked, a file that is removed but is added back later with the same file name is processed on future runs only if the file's contents have changed since the file was originally processed.

8. The workflow begins with the following layout:

   ```mermaid
   flowchart LR
     Source-->Partitioner-->Destination
   ```

   The following workflow layouts are also valid:

   ```mermaid
   flowchart LR
     Source-->Partitioner-->Chunker-->Destination
   ```

   ```mermaid
   flowchart LR
     Source-->Partitioner-->Chunker-->Embedder-->Destination
   ```

   ```mermaid
   flowchart LR
     Source-->Partitioner-->Enrichment-->Chunker-->Destination
   ```

   ```mermaid
   flowchart LR
     Source-->Partitioner-->Enrichment-->Chunker-->Embedder-->Destination
   ```

   ```mermaid
   flowchart LR
     Source-->Partitioner-->Extract-->Destination
   ```

   ```mermaid
   flowchart LR
     Source-->Partitioner-->Chunker-->Extract-->Destination
   ```

   ```mermaid
   flowchart LR
     Source-->Partitioner-->Chunker-->Embedder-->Extract-->Destination
   ```

   ```mermaid
   flowchart LR
     Source-->Partitioner-->Enrichment-->Chunker-->Extract-->Destination
   ```

   ```mermaid
   flowchart LR
     Source-->Partitioner-->Enrichment-->Chunker-->Embedder-->Extract-->Destination
   ```

   <Note>
     For workflows that use **Chunker** and enrichment nodes together, the **Chunker** node should be placed after all enrichment nodes. Placing the
     **Chunker** node before any enrichment nodes could cause incomplete or no enrichment results to be generated.
   </Note>

   <Warning>
     You can create and save a workflow that does not use a valid workflow layout. However, you cannot activate the workflow or run the workflow
     either manually or on a schedule until the workflow is changed to use a valid workflow layout.
   </Warning>

9. In the pipeline designer, click the **Source** node. In the **Source** pane, select the source location. Then click **Save**.

   <img src="https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflow-Designer.png?fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=9540fd01d7b9c8c5a79fba5ae13b32cb" alt="Workflow designer" data-og-width="1105" width="1105" data-og-height="414" height="414" data-path="img/ui/Workflow-Designer.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflow-Designer.png?w=280&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=695aa535da592354eabd31d4e08c9539 280w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflow-Designer.png?w=560&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=c9c35e533d821170ce0a3991743ca7e6 560w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflow-Designer.png?w=840&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=6dd3f8d5f23a8599d18e5f9f371b24ee 840w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflow-Designer.png?w=1100&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=1157d536701bd7ad61b5bfd6373e204d 1100w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflow-Designer.png?w=1650&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=1e9dbfbd49217d364c8885fec9721e2e 1650w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflow-Designer.png?w=2500&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=5137ad1c2021a35880cd64a63c0d82cc 2500w" />

   <Note>
     To use a local source location, do not choose a source connector.

     If the workflow uses a local source location, in the **Source** node, drag or click to specify a local file, and then click **Test**. The workflow's
     results are displayed on-screen.

     A workflow that uses a local source location has the following limitations:

     * You cannot save the workflow.
     * You cannot send the results to a remote destination location, even if you have attached a destination connector to
       the workflow. However, you can save the results to a local JSON-formatted file.
   </Note>

10. Click the **Destination** node. In the **Destination** pane, select the destination location. Then click **Save**.

11. As needed, add more nodes by clicking the plus icon (recommended) or the **Add Node** button:

    <img src="https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflow-Add-Node.png?fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=cefd2e6f3b698ee9dcea70a6df0d4b3f" alt="Add node to workflow" data-og-width="1102" width="1102" data-og-height="417" height="417" data-path="img/ui/Workflow-Add-Node.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflow-Add-Node.png?w=280&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=696f38974833f6cd18af461a37967293 280w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflow-Add-Node.png?w=560&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=35a72c0d7a1064742f22269b0fe52ae4 560w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflow-Add-Node.png?w=840&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=7cdeb149bf7945f39f2b792dafabc857 840w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflow-Add-Node.png?w=1100&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=d6a1df32a9b59e393499c465c63d5213 1100w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflow-Add-Node.png?w=1650&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=9000e38332aa7173ef9894f0a17d39b0 1650w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Workflow-Add-Node.png?w=2500&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=9d4ffdcd4145b5a349ce47c88844fc62 2500w" />

    * Click **Connect** to add another **Source** or **Destination** node. You can add multiple source and destination locations. Files will be ingested from all of the source locations, and the processed data will be delivered to all of the destination locations. [Learn more](#custom-workflow-node-types).

    * Click **Enrich** to add a chunker or enrichment node. [Learn more](#custom-workflow-node-types).

      <Warning>
        Unstructured can potentially generate image summary descriptions, table summary descriptions, table-to-HTML output, and generative OCR optimizations only for workflows that are configured in one of the following ways:

        * With a **Partitioner** node set to use the **Auto** or **High Res** partitioning strategy, and with an image summary description node, table summary description node, table-to-HTML output node, or generative OCR optimization node added.
        * With a **Partitioner** node set to use the **VLM** partitioning strategy. No image summary description node, table summary description node, table-to-HTML output node, or generative OCR optimization node is needed (or allowed).

        Even with these configurations, Unstructured actually generates image summary descriptions, table summary descriptions, and table-to-HTML output only for files that contain images or tables and are also eligible
        for processing with the following partitioning strategies:

        * **High Res**, when the workflow's **Partitioner** node is set to use **Auto** or **High Res**.
        * **VLM** or **High Res**, when the workflow's **Partitioner** node is set to use **VLM**.

        Unstructured never generates image summary descriptions, table summary descriptions, or table-to-HTML output for workflows that are configured as follows:

        * With a **Partitioner** node set to use the **Fast** partitioning strategy.
        * With a **Partitioner** node set to use the **Auto**, **High Res**, or **VLM** partitioning strategy, for all files that Unstructured encounters that do not contain images or tables.

        Unstructured never produces generative OCR optimizations for workflows with a **Partitioner** node set to use the **Fast** partitioning strategy.
      </Warning>

    * Click **Transform** to add a **Partitioner** or **Embedder** node. [Learn more](#custom-workflow-node-types).

      <Warning>
        If you add an **Embedder** node, you must set the **Chunker** node's **Max Characters** setting to a value at or below Unstructured's recommended
        maximum chunk size for your selected embedding model. [Learn more](/ui/embedding#chunk-sizing-and-embedding-models).
      </Warning>

    <Tip>
      Make sure to add nodes in the correct order. If you are unsure, see the usage hints in the blue note that appears
      in the node's settings pane.

        <img src="https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Node-Usage-Hints.png?fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=916e1845f39f58dd118d4b5550a8e55e" alt="Node usage hints note" data-og-width="577" width="577" data-og-height="225" height="225" data-path="img/ui/Node-Usage-Hints.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Node-Usage-Hints.png?w=280&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=a5a31a1b9ee8fead7ee8fe1b8afd93dd 280w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Node-Usage-Hints.png?w=560&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=700e078a6c233b55cb5d16a211330720 560w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Node-Usage-Hints.png?w=840&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=13c4cc3c71154eb8faf48bc92653a325 840w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Node-Usage-Hints.png?w=1100&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=a64a32c7de52832c84271c0385ad2ca8 1100w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Node-Usage-Hints.png?w=1650&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=dad98d300d5a94f3c476a674ca874e85 1650w, https://mintcdn.com/unstructured-53/0PpGBVwVpmOG7-W9/img/ui/Node-Usage-Hints.png?w=2500&fit=max&auto=format&n=0PpGBVwVpmOG7-W9&q=85&s=cc4dfeb2563d793b57c12a7bfddf8c32 2500w" />
    </Tip>

    To edit a node, click that node, and then change its settings.

    To delete a node, click that node, and then click the trash can icon above it.

12. Click **Save**.

13. If you did not set the workflow to run on a schedule, you can [run the workflow](#edit-delete-or-run-a-workflow) now.
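The valid layouts drawn in step 8 follow a pattern: a fixed `Source-->Partitioner` start, one of five optional middle sections, an optional `Extract` node, and a `Destination` end. The following Python sketch enumerates exactly those ten layouts and checks a node sequence against them. The names `MIDDLES`, `TAILS`, `VALID_LAYOUTS`, and `is_valid_layout` are hypothetical, and the sketch treats each node type as appearing once, so it does not model workflows with multiple sources, destinations, or enrichment nodes.

```python
from itertools import product

# Optional sections between Partitioner and Extract/Destination, as drawn in step 8.
MIDDLES = [
    (),
    ("Chunker",),
    ("Chunker", "Embedder"),
    ("Enrichment", "Chunker"),
    ("Enrichment", "Chunker", "Embedder"),
]

# The Extract node is optional and always comes directly before Destination.
TAILS = [(), ("Extract",)]

VALID_LAYOUTS = {
    ("Source", "Partitioner", *middle, *tail, "Destination")
    for middle, tail in product(MIDDLES, TAILS)
}


def is_valid_layout(nodes) -> bool:
    """Return True if the node sequence matches one of the ten layouts in step 8."""
    return tuple(nodes) in VALID_LAYOUTS


starting_layout = is_valid_layout(["Source", "Partitioner", "Destination"])

# A Chunker placed before an enrichment node is not one of the listed layouts.
chunker_first = is_valid_layout(
    ["Source", "Partitioner", "Chunker", "Enrichment", "Destination"]
)
```

Here `starting_layout` is `True` and `chunker_first` is `False`, which is consistent with the note in step 8 about placing the **Chunker** node after all enrichment nodes.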

#### Custom workflow node types

<AccordionGroup>
  <Accordion title="Partitioner node">
    Choose one of four available partitioning strategies.

    Unstructured recommends that you choose the **Auto** partitioning strategy in most cases. With **Auto**, Unstructured does all
    the heavy lifting, optimizing at runtime for the highest quality at the lowest cost page-by-page.

    You should consider the following additional strategies only if you are absolutely sure that your documents are of the same
    type. Each of the following strategies is best suited for specific situations. Choosing a strategy other than **Auto**
    for sets of documents of different types could produce undesirable results,
    including a reduction in transformation quality.

    * **VLM**: For the highest-quality transformation of these file types: `.bmp`, `.gif`, `.heic`, `.jpeg`, `.jpg`, `.pdf`, `.png`, `.tiff`, and `.webp`.
    * **High Res**: For all other [supported file types](/ui/supported-file-types), and for the generation of bounding box coordinates.
    * **Fast**: For text-only documents.

    The **Auto** partitioning strategy routes each file as a complete unit to the appropriate partitioning strategy (**VLM**, **High Res**, or **Fast**)
    based on the preceding file types. Additionally, for `.pdf` files, the **Auto** partitioning strategy routes these files' pages
    on a page-by-page basis, as follows:

    * A page is routed to **Fast** when it contains only embedded text and no images or tables are detected.
    * All other kinds of pages are routed to **VLM** or **High Res**, depending on the complexity of a page's
      content. Unstructured constantly optimizes its proprietary algorithm for routing to **VLM** or **High Res** in these cases.

    For **VLM**, you must also choose a VLM provider and model from among the available choices that are shown.

    <Warning>
      The following models are no longer available as of the following dates:

      * Amazon Bedrock Claude Sonnet 3.5: October 22, 2025
      * Anthropic Claude Sonnet 3.5: October 22, 2025

      Unstructured recommends the following actions:

      * For new workflows, do not use any of these models.
      * For any workflow that uses any of these models, update that workflow as soon as possible to use a different model.

      Workflows that attempt to use any of these models on or after the model's associated date will return errors.
    </Warning>

    <Note>
      When you use the **VLM** strategy with embeddings for PDF files of 200 or more pages, you might notice some errors when
      these files are processed. These errors typically occur when these larger PDF files have lots of tables and high-resolution images.
    </Note>

    If you choose the **Fast** strategy, you can also choose from among the following additional settings:

    * **Include Page breaks**: Check this box to include distinct `PageBreak` document elements in the output, if the file type supports it.
    * **Infer Table Structure**: Check this box to add, for each table in a PDF file, a metadata field named `text_as_html` to the output for that table's document element. This field will contain an HTML representation of the table.
    * **Elements to Exclude**: Select the name of each available type of [document element](/ui/document-elements) to exclude from the output.

    If you choose the **High Res** strategy, you can also choose from among the following additional settings:

    * **Include Page breaks**: Check this box to include distinct `PageBreak` document elements in the output, if the file type supports it.
    * **Infer Table Structure**: Check this box to add, for each table in a PDF file, a metadata field named `text_as_html` to the output for that table's document element. This field will contain an HTML representation of the table.
    * **Include Coordinates**: Check this box to add, for each [document element](/ui/document-elements) in the output, a metadata field named `coordinates` to the output for that document element. This field will contain the bounding box coordinates of the document element's content on the page, as well as the bounding box's width and height in pixels.
    * **Extract Image Block Types**: Select the name of each available type of document element to add a metadata field named `image_base64` to the output for that document element. This field will contain a Base64-encoded representation of the document element's content. A Base64-to-image decoding of this field's value will return an image representing the document element's original content.
    * **Elements to Exclude**: Select the name of each available type of document element to exclude from the output.
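As a minimal sketch of the Base64-to-image decoding mentioned for **Extract Image Block Types**, the following Python example reads the `image_base64` metadata field from one document element and decodes it to raw bytes. The `decode_image` helper and the sample element are hypothetical. The sample's `type` key is an assumption about the output shape, and its payload is only the 8-byte PNG file signature rather than a complete image.

```python
import base64


def decode_image(element: dict) -> bytes:
    """Decode the Base64 string stored in an element's image_base64 metadata field."""
    encoded = element["metadata"]["image_base64"]
    return base64.b64decode(encoded)


# Stand-in payload: the 8-byte signature that every PNG file starts with.
png_signature = b"\x89PNG\r\n\x1a\n"

# Hypothetical element, shaped like one entry in a workflow's JSON output.
element = {
    "type": "Image",
    "metadata": {
        "image_base64": base64.b64encode(png_signature).decode("ascii"),
    },
}

image_bytes = decode_image(element)
```

With real output, the decoded bytes could be written to a file opened in binary mode to view the image.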

    [Learn more](/ui/partitioning).
  </Accordion>

  <Accordion title="Chunker node">
    For **Chunkers**, select one of the following:

    * **Chunk by title**: Preserve section boundaries and optionally page boundaries as well. A single chunk will never contain text that occurred in two different sections. When a new section starts, the existing chunk is closed and a new one is started, even if the next element would fit in the prior chunk. Also, specify the following:

      * **Contextual chunking**: When switched on, prepends chunk-specific explanatory context to each chunk. [Learn more](/ui/chunking#contextual-chunking).
      * **Combine text under n chars**: Combine elements until a section reaches a length of this many characters. The default is **0**.
      * **Include original elements**: Check this box to output the elements that were used to form a chunk, to appear in the `metadata` field's `orig_elements` field for that chunk. By default, this box is unchecked.
      * **Max characters**: Cut off new sections after reaching a length of this many characters. This is a strict limit. The default is **2048**.
      * **Multipage sections**: Check this box to allow sections to span multiple pages. By default, this box is unchecked.
      * **New after n chars**: Cut off new sections after reaching a length of this many characters. This is an approximate limit. The default is **1500**.
      * **Overlap**: Apply a prefix of this many trailing characters from the prior text-split chunk to second and later chunks formed from oversized elements by text-splitting. The default is **160**.
      * **Overlap all**: Check this box to apply overlap to "normal" chunks formed by combining whole elements. Use with caution as this can introduce noise into otherwise clean semantic units. By default, this box is unchecked.

    * **Chunk by character** (also known as *basic* chunking): Combine sequential elements to maximally fill each chunk. Also, specify the following:

      * **Contextual chunking**: When switched on, prepends chunk-specific explanatory context to each chunk. [Learn more](/ui/chunking#contextual-chunking).
      * **Include original elements**: Check this box to output the elements that were used to form a chunk, to appear in the `metadata` field's `orig_elements` field for that chunk. By default, this box is unchecked.
      * **Max characters**: Cut off new sections after reaching a length of this many characters. The default is **2048**.
      * **New after n chars**: Cut off new sections after reaching a length of this many characters. This is an approximate limit. The default is **1500**.
      * **Overlap**: Apply a prefix of this many trailing characters from the prior text-split chunk to second and later chunks formed from oversized elements by text-splitting. The default is **160**.
      * **Overlap all**: Check this box to apply overlap to "normal" chunks formed by combining whole elements. Use with caution as this can introduce noise into otherwise clean semantic units. By default, this box is unchecked.

    * **Chunk by page**: Preserve page boundaries. When a new page is detected, the existing chunk is closed and a new one is started, even if the next element would fit in the prior chunk. Also, specify the following:

      * **Contextual chunking**: When switched on, prepends chunk-specific explanatory context to each chunk. [Learn more](/ui/chunking#contextual-chunking).
      * **Include original elements**: Check this box to output the elements that were used to form a chunk, to appear in the `metadata` field's `orig_elements` field for that chunk. By default, this box is unchecked.
      * **Max characters**: Cut off new sections after reaching a length of this many characters. This is a strict limit. The default is **500**.
      * **New after n chars**: Cut off new sections after reaching a length of this many characters. This is an approximate limit. The default is **50**.
      * **Overlap**: Apply a prefix of this many trailing characters from the prior text-split chunk to second and later chunks formed from oversized elements by text-splitting. The default is **30**.
      * **Overlap all**: Check this box to apply overlap to "normal" chunks formed by combining whole elements. Use with caution as this can introduce noise into otherwise clean semantic units. By default, this box is unchecked.

    * **Chunk by similarity**: Use the [sentence-transformers/multi-qa-mpnet-base-dot-v1](https://huggingface.co/sentence-transformers/multi-qa-mpnet-base-dot-v1) embedding model to identify topically similar sequential elements and combine them into chunks. Also, specify the following:

      * **Contextual chunking**: When switched on, prepends chunk-specific explanatory context to each chunk. [Learn more](/ui/chunking#contextual-chunking).
      * **Include original elements**: Check this box to output the elements that were used to form a chunk, to appear in the `metadata` field's `orig_elements` field for that chunk. By default, this box is unchecked.
      * **Max characters**: Cut off new sections after reaching a length of this many characters. This is a strict limit. The default is **500**.
      * **Similarity threshold**: Specify a threshold between 0 and 1 exclusive (0.01 to 0.99 inclusive), where 0 indicates completely dissimilar vectors and 1 indicates identical vectors. A higher threshold favors precision, and a lower threshold favors recall. The default is **0.5**. [Learn more](https://towardsdatascience.com/introduction-to-embedding-clustering-and-similarity-11dd80b00061).
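
    To illustrate how the similarity threshold affects grouping, the following minimal sketch combines sequential elements while the cosine similarity between neighboring elements stays at or above the threshold. It is not Unstructured's implementation: `chunk_by_similarity` is a hypothetical helper, the vectors are made-up toy values rather than real embeddings, comparing only adjacent elements is an assumption, and the **Max characters** limit is ignored.

    ```python
    import math


    def cosine_similarity(a, b):
        """Cosine similarity of two equal-length, non-zero vectors."""
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)


    def chunk_by_similarity(elements, vectors, threshold=0.5):
        """Group sequential elements into chunks (hypothetical helper).

        A new chunk starts whenever an element's similarity to the element
        right before it falls below the threshold.
        """
        if not elements:
            return []
        chunks = [[elements[0]]]
        for i in range(1, len(elements)):
            if cosine_similarity(vectors[i - 1], vectors[i]) >= threshold:
                chunks[-1].append(elements[i])  # similar enough: same chunk
            else:
                chunks.append([elements[i]])  # topic shift: new chunk
        return chunks


    elements = ["Cats purr.", "Cats nap often.", "Stocks fell today."]
    vectors = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0]]  # toy 2-D stand-ins for embeddings

    low = chunk_by_similarity(elements, vectors, threshold=0.1)
    high = chunk_by_similarity(elements, vectors, threshold=0.5)
    print(low)
    print(high)
    ```

    In this toy example, a threshold of 0.5 places the unrelated third element in its own chunk, while a threshold of 0.1 keeps all three elements in a single chunk.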

    Learn more:

    * [Chunking overview](/ui/chunking)
    * [Chunking for RAG: best practices](https://unstructured.io/blog/chunking-for-rag-best-practices)
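
    The interaction between the **Max characters**, **New after n chars**, and **Overlap** settings described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Unstructured's implementation: `chunk_by_character` and `split_oversized` are hypothetical helpers, oversized text is split at raw character positions rather than at word boundaries, and **Overlap all** is not modeled.

    ```python
    def split_oversized(text, max_characters, overlap):
        """Split one oversized text into pieces of at most max_characters.

        Each piece after the first starts with the last `overlap` characters
        of the piece before it.
        """
        pieces = []
        start = 0
        while start < len(text):
            end = min(start + max_characters, len(text))
            pieces.append(text[start:end])
            if end == len(text):
                break
            start = end - overlap  # step back so the next piece repeats the tail
        return pieces


    def chunk_by_character(elements, max_characters=2048, new_after_n_chars=1500, overlap=160):
        """Combine sequential element texts into chunks (hypothetical helper)."""
        if not 0 <= overlap < max_characters:
            raise ValueError("overlap must be non-negative and smaller than max_characters")
        chunks, current = [], ""
        for text in elements:
            if len(text) > max_characters:
                # Oversized element: close the open chunk, then text-split with overlap.
                if current:
                    chunks.append(current)
                    current = ""
                chunks.extend(split_oversized(text, max_characters, overlap))
                continue
            candidate = f"{current} {text}" if current else text
            # Strict limit: a chunk never exceeds max_characters.
            # Approximate limit: stop adding once the open chunk reaches new_after_n_chars.
            if current and (len(candidate) > max_characters or len(current) >= new_after_n_chars):
                chunks.append(current)
                current = text
            else:
                current = candidate
        if current:
            chunks.append(current)
        return chunks


    # Small limits make the behavior easy to see.
    elements = ["aaaa", "bbbb", "cccc", "0123456789ABCDEFGHIJ"]
    chunks = chunk_by_character(elements, max_characters=12, new_after_n_chars=8, overlap=2)
    print(chunks)
    ```

    With these small limits, the first two elements are combined, the third starts a new chunk because adding it would exceed the strict limit, and the oversized last element is split into two pieces where the second piece begins with the last two characters of the first.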
  </Accordion>

  <Accordion title="Enrichment node">
    Each enrichment node type has its own unique settings:

    <Warning>
      The following models are no longer available as of the following dates:

      * Amazon Bedrock Claude Sonnet 3.5: October 22, 2025
      * Anthropic Claude Sonnet 3.5: October 22, 2025

      Unstructured recommends the following actions:

      * For new workflows, do not use any of these models.
      * For any workflow that uses any of these models, update that workflow as soon as possible to use a different model.

      Workflows that attempt to use any of these models on or after its associated date will return errors.
    </Warning>

    <Warning>
      Unstructured can generate image summary descriptions, table summary descriptions, table-to-HTML output, and generative OCR optimizations only for workflows that are configured in one of the following ways:

      * With a **Partitioner** node set to use the **Auto** or **High Res** partitioning strategy, and with an image summary description node, table summary description node, table-to-HTML output node, or generative OCR optimization node added.
      * With a **Partitioner** node set to use the **VLM** partitioning strategy. No image summary description node, table summary description node, table-to-HTML output node, or generative OCR optimization node is needed (or allowed).

      Even with these configurations, Unstructured actually generates image summary descriptions, table summary descriptions, and table-to-HTML output only for files that contain images or tables and are also eligible
      for processing with the following partitioning strategies:

      * **High Res**, when the workflow's **Partitioner** node is set to use **Auto** or **High Res**.
      * **VLM** or **High Res**, when the workflow's **Partitioner** node is set to use **VLM**.

      Unstructured never generates image summary descriptions, table summary descriptions, or table-to-HTML output for workflows that are configured as follows:

      * With a **Partitioner** node set to use the **Fast** partitioning strategy.
      * With a **Partitioner** node set to use the **Auto**, **High Res**, or **VLM** partitioning strategy, for all files that Unstructured encounters that do not contain images or tables.

      Unstructured never produces generative OCR optimizations for workflows with a **Partitioner** node set to use the **Fast** partitioning strategy.
    </Warning>
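
    The rules in the preceding warning for image summary descriptions, table summary descriptions, and table-to-HTML output can be summarized as a small decision function. This is only a sketch that restates the documented rules; it is not Unstructured code, and the function and parameter names are hypothetical.

    ```python
    def can_generate_summaries_or_html(
        partitioner_strategy,
        has_enrichment_node,
        file_has_images_or_tables,
        file_eligible_strategy,
    ):
        """Return True if image or table summaries, or table-to-HTML output, can be generated.

        partitioner_strategy: "Fast", "Auto", "High Res", or "VLM" (the Partitioner node setting).
        has_enrichment_node: whether a matching enrichment node was added to the workflow.
        file_has_images_or_tables: whether the file contains images or tables.
        file_eligible_strategy: the strategy the file is eligible to be processed with.
        """
        if partitioner_strategy == "Fast":
            return False  # never generated with the Fast strategy
        if not file_has_images_or_tables:
            return False  # nothing to describe or convert
        if partitioner_strategy in ("Auto", "High Res"):
            # An enrichment node is required, and the file must be eligible for High Res.
            return has_enrichment_node and file_eligible_strategy == "High Res"
        if partitioner_strategy == "VLM":
            # No enrichment node is needed (or allowed) with VLM.
            return file_eligible_strategy in ("VLM", "High Res")
        raise ValueError(f"unknown partitioning strategy: {partitioner_strategy}")


    print(can_generate_summaries_or_html("Auto", True, True, "High Res"))
    print(can_generate_summaries_or_html("Fast", True, True, "High Res"))
    ```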

    * The **Image Description** node summarizes images. You must select one of the available provider (and model) combinations that are shown.

      [Learn more](/ui/enriching/image-descriptions).

    * The **Table Description** node summarizes tables. You must select one of the available provider (and model) combinations that are shown.

      [Learn more](/ui/enriching/table-descriptions).

    * The **Table to HTML** node generates HTML representations for tables. Also select the following:

      * To use agentic AI to increase HTML accuracy for complex tables, select **Agentic** for **Mode**.
      * To use a VLM for standard tables, select **Standard** for **Mode**. Then select one of the available **Provider** and **Model** combinations that are shown.

      [Learn more](/ui/enriching/table-to-html).

    * The **NER** node generates a list of recognized entities and their relationships by using a technique called *named entity recognition* (NER).
      You must select one of the available provider (and model) combinations that are shown.

      You can also customize the prompt used to add or remove entities and relationships. In the **Details** tab, under **Prompt**, click **Edit**. Click **Run Prompt** in the
      **Edit & Test Prompt** section to test the prompt.

      [Learn more](/ui/enriching/ner).

    * The **Generative OCR** node optimizes the fidelity of text blocks that Unstructured initially processed during its partitioning phase.
      You must select one of the available provider (and model) combinations that are shown.

      <Warning>
        Generative OCR does not process any text blocks by default. You must also explicitly specify which text-containing document element
        types you want generative OCR to process. To do this, in the workflow editor for your workflow:

        1. Click the **Partitioner** node.
        2. In the node's settings pane, scroll down to and then click a blank area inside of the **Extract Image Block Types** list.
        3. Select each [document element type](/ui/document-elements#element-type) that you want generative OCR to process. For example,
           to process only narrative text, select only **NarrativeText**.

        Generative OCR does not process the text of any `Image` or `Table` elements if they have already been processed by
        [image description](#image-description-task) or [table description](#table-description-task) enrichments, respectively. Do
        not remove the **Image** or **Table** document element types from this **Extract Image Block Types** list, or else
        the image description and table description enrichments in your workflow might produce unexpected results or might not work at all.
      </Warning>

      [Learn more](/ui/enriching/generative-ocr).
  </Accordion>

  <Accordion title="Embedder node">
    For **Select Embedding Model**, select one of the available models that are shown.

    <Warning>
      If you add an **Embedder** node, you must set the **Chunker** node's **Max characters** setting to a value at or below Unstructured's recommended
      maximum chunk size for your selected embedding model. [Learn more](/ui/embedding#chunk-sizing-and-embedding-models).
    </Warning>

    Learn more:

    * [Embedding overview](/ui/embedding)
    * [Understanding embedding models: make an informed choice for your RAG](https://unstructured.io/blog/understanding-embedding-models-make-an-informed-choice-for-your-rag)
  </Accordion>

  <Accordion title="Extract node">
    Do one of the following to define the custom schema for the structured data that you want to extract:

    * To use a custom schema that conforms to the [OpenAI Structured Outputs](https://platform.openai.com/docs/guides/structured-outputs#supported-schemas) guidelines,
      click **Upload JSON**; enter your own custom schema or upload a JSON file that contains your custom schema; and then click **Use this Schema**.
      [Learn about the OpenAI Structured Outputs format](https://platform.openai.com/docs/guides/structured-outputs#supported-schemas).
    * To use a visual editor to define the schema, enter your own custom schema objects and their properties. To clear the current schema and start over,
      click the ellipsis (three dots) icon, and then click **Reset form**.
      [Learn about OpenAI Structured Outputs data types](https://platform.openai.com/docs/guides/structured-outputs#supported-schemas).

    [Learn more](/ui/data-extractor).
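
    As a rough illustration of a custom schema, the following sketch builds a JSON schema that follows two of the OpenAI Structured Outputs rules (every property of an object is listed in `required`, and `additionalProperties` is `false`), checks those rules, and prints JSON that could be saved to a file for upload. The invoice fields and the `check_object_rules` helper are hypothetical examples, and the checker covers only these two rules, not the full set of guidelines.

    ```python
    import json

    # Hypothetical schema for illustration only; the field names are made up.
    schema = {
        "type": "object",
        "properties": {
            "invoice_number": {"type": "string", "description": "Identifier printed on the invoice"},
            "total": {"type": "number", "description": "Total amount due"},
            "line_items": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "description": {"type": "string"},
                        "amount": {"type": "number"},
                    },
                    "required": ["description", "amount"],
                    "additionalProperties": False,
                },
            },
        },
        "required": ["invoice_number", "total", "line_items"],
        "additionalProperties": False,
    }


    def check_object_rules(node, path="$"):
        """Return a list of problems found in object nodes of the schema."""
        problems = []
        if node.get("type") == "object":
            props = node.get("properties", {})
            if set(node.get("required", [])) != set(props):
                problems.append(f"{path}: every property must be listed in 'required'")
            if node.get("additionalProperties") is not False:
                problems.append(f"{path}: 'additionalProperties' must be false")
            for name, child in props.items():
                problems.extend(check_object_rules(child, f"{path}.{name}"))
        elif node.get("type") == "array":
            problems.extend(check_object_rules(node.get("items", {}), f"{path}[]"))
        return problems


    problems = check_object_rules(schema)
    schema_json = json.dumps(schema, indent=2)
    print(schema_json if not problems else problems)
    ```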
  </Accordion>
</AccordionGroup>

## Edit, delete, or run a workflow

To run a workflow once, manually:

1. On the sidebar, click **Workflows**.
2. In the list of workflows, click **Run** in the row for the workflow that you want to run.

For each workflow on the **Workflows** list page, the following actions are available when you click the ellipsis (the three dots) in that workflow's row:

* **Edit via Form**: Changes the existing configuration of your workflow.
* **Delete**: Removes the workflow from the platform. Use this action cautiously, as it permanently deletes the workflow and its configuration.
* **Open**: Opens the workflow's settings page.

## Pause a scheduled workflow

To stop running a workflow that is set to run on a repeating schedule:

1. On the sidebar, click **Workflows**.
2. In the list of workflows, turn off the **Status** toggle in the row for the workflow that you want to stop running on a repeating schedule.

Turning off the **Status** toggle also disables the workflow's **Run** button, which prevents that workflow from being run manually as well.

To resume running the workflow on its original repeating schedule, as well as enable the workflow to be run manually as needed, turn on the workflow's **Status** toggle.

## Duplicate a workflow

To duplicate (copy or clone) a workflow:

1. On the sidebar, click **Workflows**.
2. In the list of workflows, click the ellipsis (the three dots) in the row for the workflow that you want to duplicate.
3. Click **Duplicate**.

   A duplicate of the workflow is created with the same configuration as the original workflow. The duplicate workflow has the same display name as the original
   workflow but with **(Copy)** at the end.
