
# Unstructured Pipelines quickstarts

## Local file quickstart

This quickstart shows how, in just a few minutes, you can use Unstructured Pipelines to see Unstructured's
transformation results for a single file that is stored on your local computer.

<Tip>
  This quickstart focuses on a single, local file for ease-of-use demonstration purposes.

  To later use Unstructured for
  large-scale batch processing of multiple files and semi-structured data that are stored in remote locations,
  continue to the [remote quickstart](/pipelines/quickstart#remote-quickstart) after you finish this one.
</Tip>

If you do not already have an Unstructured account, [sign up for free](https://unstructured.io/?modal=try-for-free).
After you sign up, you are automatically signed in to your new Unstructured **Let's Go** account, at [https://platform.unstructured.io](https://platform.unstructured.io).

<Note>
  If you already have an Unstructured **Pay-As-You-Go** or **Business SaaS** account, you are already signed up for Unstructured.
  Sign in to your existing Unstructured **Pay-As-You-Go** or **Business SaaS** account, at [https://platform.unstructured.io](https://platform.unstructured.io).

  If you already have an Unstructured **dedicated instance** or **in-VPC** deployment, your sign-in link will be unique to your deployment.
  If you're not sure what your unique sign-in link is, see your Unstructured account administrator, or email Unstructured Support at
  [support@unstructured.io](mailto:support@unstructured.io).
</Note>

Do the following:

1. After you are signed in, the **Start** page appears.

2. In the **Welcome** area, do one of the following:

   * Click one of the sample files, such as **realestate.pdf**, to have Unstructured parse and transform that sample file.
   * Click **Browse files**, or drag and drop a file onto **Drop file to test**, to have Unstructured parse and transform your own file.

     If you choose to use your own file, the file must be 50 MB or less in size. Also, the file must be one of the following supported file types:

     | File extension |
     | -------------- |
     | `.bmp`         |
     | `.csv`         |
     | `.doc`         |
     | `.docx`        |
     | `.eml`         |
     | `.epub`        |
     | `.heic`        |
     | `.html`        |
     | `.jpeg`        |
     | `.jpg`         |
     | `.md`          |
     | `.msg`         |
     | `.odt`         |
     | `.org`         |
     | `.p7s`         |
     | `.pdf`         |
     | `.png`         |
     | `.ppt`         |
     | `.pptx`        |
     | `.rst`         |
     | `.rtf`         |
     | `.tif`         |
     | `.tiff`        |
     | `.tsv`         |
     | `.txt`         |
     | `.xls`         |
     | `.xlsx`        |
     | `.xml`         |

   <img src="https://mintcdn.com/unstructured-53/MKM9xSjZ6pt1WWvX/img/pipelines/single-file/welcome.png?fit=max&auto=format&n=MKM9xSjZ6pt1WWvX&q=85&s=f595833be0432a2e94e65171b5b11c2b" alt="Welcome interface on the Start page" width="1346" height="591" data-path="img/pipelines/single-file/welcome.png" />

3. After Unstructured has finished parsing and transforming the file (a process known as
   [partitioning](/concepts/partitioning)), you will see the file's contents in the
   **Preview** pane in the center and Unstructured's results in the **Result** pane on the right.

   <img src="https://mintcdn.com/unstructured-53/MKM9xSjZ6pt1WWvX/img/pipelines/single-file/results.png?fit=max&auto=format&n=MKM9xSjZ6pt1WWvX&q=85&s=0f1ee4ceb9776f0128ec4586757214fc" alt="Unstructured's parse and transform results" width="3454" height="1912" data-path="img/pipelines/single-file/results.png" />

4. The **Result** pane shows a formatted view of Unstructured's results by default. This formatted view is designed for human
   readability. To see the underlying JSON view of the results, which is designed for RAG and agentic AI,
   click **JSON** at the top of the **Result** pane.
   [Learn about what's in the JSON view](/concepts/document-elements).

   <img src="https://mintcdn.com/unstructured-53/MKM9xSjZ6pt1WWvX/img/pipelines/single-file/json-view.png?fit=max&auto=format&n=MKM9xSjZ6pt1WWvX&q=85&s=4566790e657f7a75c5548c285ffcc402" alt="Switching to the JSON view of the results" width="784" height="191" data-path="img/pipelines/single-file/json-view.png" />

5. Unstructured's initial results are based on its **High Res** [partitioning strategy](/concepts/partitioning), which
   processes the file's contents and converts them into a series of Unstructured
   [document elements and metadata](/concepts/document-elements). This partitioning strategy generally provides good results, although quality can vary with the complexity of the file's contents.
   This partitioning strategy also generates a bounding box for each detected object in the file. A *bounding box* is
   an imaginary rectangular box drawn around the object to show its location and extent within the file.

   After the High Res partitioning results are shown, Unstructured begins improving these initial results by
   using vision language models (VLMs) to apply a series of generative refinements known as *enrichments*. These
   enrichments include:

   * An [image description](ui/enriching/image-descriptions) enrichment, which uses a VLM to provide a text-based summary of the contents of each detected image.
   * A [generative OCR](/concepts/enriching/generative-ocr) enrichment, which uses a VLM to improve the accuracy of each block of initially-processed text.
   * A [table to HTML](/concepts/enriching/table-to-html) enrichment, which uses a VLM to provide an HTML-structured representation of each detected table.

   While these enrichments are being applied, a banner appears at the top of the **Result** pane.

   <img src="https://mintcdn.com/unstructured-53/MKM9xSjZ6pt1WWvX/img/pipelines/single-file/generative-refinement.png?fit=max&auto=format&n=MKM9xSjZ6pt1WWvX&q=85&s=b42cb9094cb4a30b9ec45a31346c2eb6" alt="Updating the initial results with enrichments" width="779" height="138" data-path="img/pipelines/single-file/generative-refinement.png" />

   To see these enrichments applied to the initial results, click **Update results** in the banner when this button appears,
   which might take a minute or more.

   <img src="https://mintcdn.com/unstructured-53/MKM9xSjZ6pt1WWvX/img/pipelines/single-file/apply-generative-refinement.png?fit=max&auto=format&n=MKM9xSjZ6pt1WWvX&q=85&s=7545479513951098b86d8fef40849ad5" alt="Seeing the initial results updated with the enrichments" width="780" height="136" data-path="img/pipelines/single-file/apply-generative-refinement.png" />

   <Warning>
     Each page that Unstructured processes by using this approach is counted as two pages for usage and billing purposes.

     This is because Unstructured processes each page once with its **High Res** partitioning strategy and then reprocesses each
     page with a VLM to improve the quality, accuracy, and relevance of the initial partitioning results.
     The final results of these two processing passes for each page count as two pages for usage and billing purposes.
     This two-pass process happens regardless of whether you click **Update results** in the banner.

     This two-page usage and billing behavior is a known issue and will be addressed in a future release.
   </Warning>

6. To synchronize the scrolling of the **Preview** pane's selected contents with the **Result** pane's **Formatted** results,
   rest your mouse pointer anywhere inside the contents of the **Preview** pane until a bounding box appears.
   Then click the bounding box. Unstructured automatically scrolls the **Result** pane's **Formatted**
   results to match the selected bounding box. (You cannot synchronize the scrolling of the **JSON** results.)

   <img src="https://mintcdn.com/unstructured-53/MKM9xSjZ6pt1WWvX/img/pipelines/single-file/bounding-box.png?fit=max&auto=format&n=MKM9xSjZ6pt1WWvX&q=85&s=1657719d2c7976ec1210dda5d3eb55b3" alt="Selecting a bounding box" width="2890" height="688" data-path="img/pipelines/single-file/bounding-box.png" />

   To show all of the bounding boxes in the **Preview** pane at once, turn on the **Show all bounding boxes** toggle at the top of the **Preview** pane.
   You can now click any of the bounding boxes without first needing to rest your mouse pointer on them to show them.

   <img src="https://mintcdn.com/unstructured-53/MKM9xSjZ6pt1WWvX/img/pipelines/single-file/show-all-bounding-boxes.png?fit=max&auto=format&n=MKM9xSjZ6pt1WWvX&q=85&s=d7d9adb8f4d2d69c7e21f28833a01454" alt="Showing all bounding boxes" width="1448" height="856" data-path="img/pipelines/single-file/show-all-bounding-boxes.png" />
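
Before you choose your own file in step 2 above, you can check it locally against the size limit and the supported file types listed in the table. The following Python sketch is only an illustration. The function name `check_local_file` is hypothetical, "50 MB" is assumed to mean 50,000,000 bytes (the stricter reading), and matching extensions case-insensitively is also an assumption.

```python
from pathlib import Path

# Assumption: "50 MB" is read as 50,000,000 bytes, the stricter interpretation.
MAX_FILE_BYTES = 50 * 1000 * 1000

# Extensions copied from the supported file types table in step 2.
SUPPORTED_EXTENSIONS = {
    ".bmp", ".csv", ".doc", ".docx", ".eml", ".epub", ".heic", ".html",
    ".jpeg", ".jpg", ".md", ".msg", ".odt", ".org", ".p7s", ".pdf",
    ".png", ".ppt", ".pptx", ".rst", ".rtf", ".tif", ".tiff", ".tsv",
    ".txt", ".xls", ".xlsx", ".xml",
}


def check_local_file(path, size_bytes=None):
    """Return a list of problems; an empty list means the file looks acceptable.

    Hypothetical helper. If size_bytes is None, the size is read from disk.
    """
    file_path = Path(path)
    problems = []

    # Assumption: extensions are compared case-insensitively.
    extension = file_path.suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")

    size = file_path.stat().st_size if size_bytes is None else size_bytes
    if size > MAX_FILE_BYTES:
        problems.append(f"file is {size} bytes; the limit is {MAX_FILE_BYTES} bytes")

    return problems
```

For example, `check_local_file("my-report.pdf")` reads the size from disk and returns an empty list if the file passes both checks. Passing `size_bytes` explicitly lets you try the function without a real file.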

You can also do the following:

* To download the JSON view of the results as a local JSON file, click the download icon to the left of the **Formatted** and **JSON** buttons in the **Result** pane.
  (You cannot download the formatted view of the results.)

  <img src="https://mintcdn.com/unstructured-53/MKM9xSjZ6pt1WWvX/img/pipelines/single-file/download.png?fit=max&auto=format&n=MKM9xSjZ6pt1WWvX&q=85&s=8a606ee656fd4ccb08fca935a6929b87" alt="Downloading the results as a local JSON file" width="784" height="191" data-path="img/pipelines/single-file/download.png" />

* To have Unstructured partition a different file, click **Add new file** in the **Files** pane on the left, and then browse to and select the target file.

* To view the results for a file that was previously partitioned during this session, click the file's name in the **Recent files** list in the **Files** pane.

* To return to the **Start** page, click the **X** (close) button at the left on the title bar, next to **Transform**.

* To have Unstructured do more—such as
  [chunking](/concepts/chunking), [embedding](/concepts/embedding),
  applying additional kinds of [enrichments](/concepts/enriching/overview), and
  processing larger files and semi-structured data in batches at scale—click
  **Edit in Workflow Editor** at the right on the title bar, and then [skip over to the walkthrough](/pipelines/walkthrough-2).

  <img src="https://mintcdn.com/unstructured-53/MKM9xSjZ6pt1WWvX/img/pipelines/single-file/workflow-editor.png?fit=max&auto=format&n=MKM9xSjZ6pt1WWvX&q=85&s=8cd89c3be135534dd429073ec23ab925" alt="Switching to the workflow editor" width="732" height="204" data-path="img/pipelines/single-file/workflow-editor.png" />
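
After you download the JSON view, you can inspect it with a few lines of Python. The sketch below assumes that the downloaded file is a JSON array of element objects and that each element has a `type` key. See the [document elements](/concepts/document-elements) reference for the actual schema. The function names and the file name `results.json` are hypothetical.

```python
import json
from collections import Counter


def load_elements(path):
    """Load a downloaded results file, assumed to hold a JSON array of elements."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of elements")
    return data


def count_element_types(elements):
    """Count elements by their 'type' value (an assumed key).

    Elements without a 'type' key are counted as 'Unknown'.
    """
    return Counter(element.get("type", "Unknown") for element in elements)


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    # With a real download, use: elements = load_elements("results.json")
    elements = [
        {"type": "Title", "text": "Quarterly report"},
        {"type": "NarrativeText", "text": "Revenue grew this quarter."},
        {"type": "NarrativeText", "text": "Costs were flat."},
    ]
    for element_type, count in count_element_types(elements).most_common():
        print(f"{element_type}: {count}")
```

A summary like this is a quick way to confirm that the file was partitioned into the kinds of elements you expect before you pass the JSON to a downstream RAG or agentic AI application.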

What's next?

* <Icon icon="code" />  [Learn how to extract structured data in a custom format from your local file](/concepts/structured-data-extractor/using-the-structured-data-extractor#use-the-structured-data-extractor-from-the-start-page).
* <Icon icon="plus" />  [Learn how to add chunking, embeddings, custom structured data extraction, and additional enrichments to your local file results](/pipelines/walkthrough-2).
* <Icon icon="database" />  [Learn how to do large-scale batch processing of multiple files and semi-structured data that are stored in remote locations instead](/pipelines/quickstart#remote-quickstart).
* <Icon icon="desktop" />  [Learn more about Unstructured Pipelines](/pipelines/overview).

***

## Remote quickstart

The following quickstart shows you how to use Unstructured Pipelines to process remote files (or data).

The requirements are as follows.

* A compatible source (input) location that contains your data for Unstructured to process. [See the list of supported source types](/pipelines/connectors#sources).

  If your source (input) location is not in this list, or if you do not yet have any source locations for Unstructured to process, **stop here** and
  skip over to the [Dropbox source connector quickstart](/pipelines/sources/dropbox-source-quickstart) instead. This quickstart
  guides you through the process of creating a free Dropbox account, uploading your files to Dropbox,
  and creating a source connector to connect Unstructured to those files.

* For document-based source locations, compatible files in that location. [See the list of supported file types](/pipelines/supported-file-types). If you do not have any files available, you can download some from the [example-docs](https://github.com/Unstructured-IO/unstructured-ingest/tree/main/example-docs) folder in the Unstructured repo on GitHub.

* A compatible destination (output) location for Unstructured to put the processed data. [See the list of supported destination types](/pipelines/connectors#destinations).

  If your destination (output) location is not in this list, or if you do not yet have any destination locations for Unstructured to send its processed data, **stop here** and
  skip over to the [Pinecone destination connector quickstart](/pipelines/destinations/pinecone-destination-quickstart) instead. This quickstart
  guides you through the process of creating a free Pinecone account
  and creating a destination connector to connect Unstructured to a Pinecone dense serverless index within your Pinecone account.

<iframe width="560" height="315" src="https://www.youtube.com/embed/Wn2FfHT6H-o" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen />

<Steps>
  <Step title="Sign up and sign in">
    1. If you do not already have an Unstructured account, [sign up for free](https://unstructured.io/?modal=try-for-free).
       After you sign up, you are automatically signed in to your new Unstructured **Let's Go** account, at [https://platform.unstructured.io](https://platform.unstructured.io).

           <Note>
             To sign up for a **Business** account instead, [contact Unstructured Sales](https://unstructured.io/?modal=contact-sales), or [learn more](/pipelines/overview#how-am-i-billed%3F).
           </Note>

    2. If you have an Unstructured **Let's Go**, **Pay-As-You-Go**, or **Business SaaS** account and are not already signed in, sign in to your account at [https://platform.unstructured.io](https://platform.unstructured.io).

           <Note>
             For other types of **Business** accounts, see your Unstructured account administrator for sign-in instructions,
             or email Unstructured Support at [support@unstructured.io](mailto:support@unstructured.io).
           </Note>
  </Step>

  <Step title="Set the source (input) location">
    <img src="https://mintcdn.com/unstructured-53/4PbeTBTFGabETZ0g/img/pipelines/Sources-Sidebar.png?fit=max&auto=format&n=4PbeTBTFGabETZ0g&q=85&s=9b1d6f885d6419fb1c228aa6d6661ba9" alt="Sources in the sidebar" width="522" height="509" data-path="img/pipelines/Sources-Sidebar.png" />

    1. From your Unstructured dashboard, in the sidebar, click **Connectors**.
    2. Click **Sources**.
    3. Click **New** or **Create Connector**.
    4. For **Name**, enter a unique name for this connector.
    5. In the **Provider** area, click the source location type that matches yours.
    6. Click **Continue**.
    7. Fill in the fields with the appropriate settings. [Learn more](/pipelines/sources/overview).
    8. If a **Continue** button appears, click it, and fill in any additional settings fields.
    9. Click **Save and Test**.
  </Step>

  <Step title="Set the destination (output) location">
    <img src="https://mintcdn.com/unstructured-53/4PbeTBTFGabETZ0g/img/pipelines/Destinations-Sidebar.png?fit=max&auto=format&n=4PbeTBTFGabETZ0g&q=85&s=1d60a69c4c2c1c951f4bb6dc35885682" alt="Destinations in the sidebar" width="519" height="505" data-path="img/pipelines/Destinations-Sidebar.png" />

    1. In the sidebar, click **Connectors**.
    2. Click **Destinations**.
    3. Click **New** or **Create Connector**.
    4. For **Name**, enter a unique name for this connector.
    5. In the **Provider** area, click the destination location type that matches yours.
    6. Click **Continue**.
    7. Fill in the fields with the appropriate settings. [Learn more](/pipelines/destinations/overview).
    8. If a **Continue** button appears, click it, and fill in any additional settings fields.
    9. Click **Save and Test**.
  </Step>

  <Step title="Define the workflow">
    <img src="https://mintcdn.com/unstructured-53/MKM9xSjZ6pt1WWvX/img/pipelines/Workflows-Sidebar.png?fit=max&auto=format&n=MKM9xSjZ6pt1WWvX&q=85&s=4bd7cc931376b99241dc0bbc491d3d50" alt="Workflows in the sidebar" width="1084" height="413" data-path="img/pipelines/Workflows-Sidebar.png" />

    1. In the sidebar, click **Workflows**.

    2. Click **New Workflow**.

    3. Next to **Build it for Me**, click **Create Workflow**.

       <Note>If a radio button appears instead of **Build it for Me**, select it, and then click **Continue**.</Note>

    4. For **Workflow Name**, enter a unique name for this workflow.

    5. In the **Sources** dropdown list, select the source location that you created in the **Set the source (input) location** step.

    6. In the **Destinations** dropdown list, select the destination location that you created in the **Set the destination (output) location** step.

       <Note>You can select multiple source and destination locations. Files will be ingested from all of the selected source locations, and the processed data will be delivered to all of the selected destination locations.</Note>

    7. Click **Continue**.

    8. The **Reprocess All** box applies only to blob storage connectors such as the Amazon S3, Azure Blob Storage, and Google Cloud Storage connectors:

       * Checking this box reprocesses all documents in the source location on every workflow run.
       * Unchecking this box causes future runs to process only documents that were added to the source location, or updated in the source location, since the last workflow run. An update is detected by checking whether the file's version has changed. Previously processed documents that have not changed are not processed again. However:

         * Even if this box is unchecked, a renamed file is always treated as a new file, regardless of whether the file's original contents have changed.
         * Even if this box is unchecked, a file that is removed but is added back later with the same file name is processed on future runs only if the file's contents have changed since the file was originally processed.

    9. Click **Continue**.

    10. If you want this workflow to run on a schedule, in the **Repeat Run** dropdown list, select one of the scheduling options, and fill in the scheduling settings. Otherwise, select **Don't repeat**.

    11. Click **Complete**.
  </Step>

  <Step title="Process the documents">
    <img src="https://mintcdn.com/unstructured-53/MKM9xSjZ6pt1WWvX/img/pipelines/Workflows-Sidebar.png?fit=max&auto=format&n=MKM9xSjZ6pt1WWvX&q=85&s=4bd7cc931376b99241dc0bbc491d3d50" alt="Workflows in the sidebar" width="1084" height="413" data-path="img/pipelines/Workflows-Sidebar.png" />

    1. If you did not choose to run this workflow on a schedule in the previous step, you can run the workflow now: in the sidebar, click **Workflows**.
    2. Next to the workflow that you created in the previous step, click **Run**.
  </Step>

  <Step title="Monitor the processing job">
    <img src="https://mintcdn.com/unstructured-53/4PbeTBTFGabETZ0g/img/pipelines/Select-Job.png?fit=max&auto=format&n=4PbeTBTFGabETZ0g&q=85&s=209c93a0a2410e08b3b471521a01658a" alt="Select a job" width="1081" height="413" data-path="img/pipelines/Select-Job.png" />

    <img src="https://mintcdn.com/unstructured-53/4PbeTBTFGabETZ0g/img/pipelines/Job-Complete.png?fit=max&auto=format&n=4PbeTBTFGabETZ0g&q=85&s=c540581beafaeae72b5a160b03ca6dfd" alt="Completed job" width="371" height="575" data-path="img/pipelines/Job-Complete.png" />

    1. In the sidebar, click **Jobs**.
    2. In the list of jobs, wait for the job's **Status** to change to **Finished**.
    3. Click the row for the job.
    4. After **Overview** displays **Finished**, go to the next step.
  </Step>

  <Step title="View the processed data">
    Go to your destination location to view the processed data.
  </Step>
</Steps>
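
The **Reprocess All** rules in the **Define the workflow** step can be read as a small decision procedure. The Python sketch below is a simplified model for illustration only, not Unstructured's actual implementation. The `SourceFile` type, the `version` string, and the `processed_versions` record are all hypothetical. The sketch assumes that the record of processed versions is keyed by file name and is kept even after a file is removed from the source location.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceFile:
    """Hypothetical view of a file in a blob storage source location."""
    name: str
    version: str  # Assumption: any value that changes when the contents change.


def select_files_to_process(current_files, processed_versions, reprocess_all):
    """Return the names of the files that a workflow run would process.

    processed_versions maps file name -> version recorded when the file
    was last processed (a hypothetical record kept between runs).
    """
    selected = []
    for source_file in current_files:
        if reprocess_all:
            # Box checked: every document is processed on every run.
            selected.append(source_file.name)
        elif processed_versions.get(source_file.name) != source_file.version:
            # Box unchecked: process only new names or changed versions.
            selected.append(source_file.name)
    return selected


# Made-up example state from an earlier run.
processed = {"a.pdf": "v1", "b.pdf": "v1", "returned.pdf": "v1"}
current = [
    SourceFile("a.pdf", "v1"),         # unchanged
    SourceFile("b.pdf", "v2"),         # updated
    SourceFile("renamed.pdf", "v1"),   # renamed copy with the same contents
    SourceFile("returned.pdf", "v1"),  # removed earlier, added back unchanged
]
print(select_files_to_process(current, processed, reprocess_all=False))
```

In this model, a renamed file has no record under its new name, so it is selected as a new file even though its contents are unchanged. A file that was removed and later added back under the same name is selected only if its version differs from the recorded one.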

Learn more about Unstructured [source connectors](/pipelines/sources/overview),
[destination connectors](/pipelines/destinations/overview),
[workflows](/pipelines/workflows),
[jobs](/pipelines/jobs), and
[managing your account](/pipelines/account/overview).
