# Unstructured API Quickstart

<Note>
  The following information applies to the legacy Unstructured Partition Endpoint.

  Unstructured recommends that you use the
  [on-demand jobs](/api-reference/api/job/create-job) functionality in the
  [Unstructured API](/api-reference/overview) instead. Unstructured's on-demand jobs provide
  many benefits over the legacy Unstructured Partition Endpoint, including support for:

  * Production-level usage.
  * Multiple local input files in batches.
  * The latest and highest-performing models.
  * Post-transform enrichments.
  * All of Unstructured's chunking strategies.
  * The generation of vector embeddings.

  The Unstructured API also provides support for processing files and data in remote locations.
</Note>

<Tip>
  Do you want to run this quickstart without modifying your local machine?
  [Skip ahead](https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/Unstructured_Partition_Endpoint_Quickstart.ipynb) to run this quickstart as a notebook on Google Colab now!

  Do you want to just copy the sample code for use on your local machine? [Skip ahead](#sample-code) to the code now!

  This quickstart uses the Unstructured Partition Endpoint and processes only local files, to keep the demonstration simple. It also
  covers only a limited set of Unstructured's full capabilities. To unlock the full feature set, and to use Unstructured for
  large-scale batch processing of multiple files and semi-structured data stored in remote locations,
  [skip over](/api-reference/workflow/overview#quickstart) to an expanded, advanced version of this quickstart that uses the
  Unstructured API's workflow operations instead.
</Tip>

<iframe width="560" height="315" src="https://www.youtube.com/embed/0EogKNU_BPU" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen />

The following code shows how to use the [Unstructured Python SDK](/api-reference/legacy-api/partition/sdk-python)
to have Unstructured process one or more local files by using
the [Unstructured Partition Endpoint](/api-reference/legacy-api/partition/overview).

To run this code, you will need the following:

* An Unstructured account and an Unstructured API key for your account. [Learn how](/api-reference/legacy-api/partition/overview#get-started).

* Python 3.9 or higher installed on your local machine.

* A Python virtual environment is recommended, but not required, for isolating and versioning Python project code dependencies.
  To create and activate a virtual environment, you can use a tool such as
  [uv](https://docs.astral.sh/uv/) (recommended) or Python's built-in
  [venv](https://docs.python.org/3/library/venv.html) module.

* The Unstructured Python SDK installed on your local machine, for example by running one of the
  following commands:

  * For `uv`, run `uv add unstructured-client`
  * For `venv` (or for no virtual environment), run `pip install unstructured-client`

* Add the code in the [Sample code](#sample-code) section to a Python file on your local machine, make the following changes, and then run the file to see the results.

  * Replace `<unstructured-api-key>` with your Unstructured API key.

  * To process all files within a directory, change `None` for `input_dir` to a string that contains the path to the directory on your local machine. This can be a relative or absolute path.

  * To process specific files within a directory or across multiple directories, change `None` for `input_files` to a string that contains
    a comma-separated list of filepaths on your local machine, for example `"./input/2507.13305v1.pdf,./input2/table-multi-row-column-cells.pdf"`. These filepaths
    can be relative or absolute.

    <Note>
      If `input_dir` and `input_files` are both set to something other than `None`, then the `input_dir` setting takes precedence, and the `input_files` setting is ignored.
    </Note>

  * For `output_dir`, specify a string that contains the path to the directory on your local machine where you want the code to write Unstructured's JSON output files. If the specified directory does not exist, the code creates it for you. This path can be relative or absolute.

  <Note>
    If you choose to run this code in a notebook (such as a Google Colab notebook), you must do the following first to avoid
    nested event loop errors:

    1. Install the `nest_asyncio` Python package, by running the following command from a notebook cell:

       ```python theme={null}
       !pip install nest_asyncio
       ```

    2. Import the `nest_asyncio` package and then enable nested event loops, by running the following code from a notebook cell:

       ```python theme={null}
       import nest_asyncio

       nest_asyncio.apply()
       ```

    After completing these two steps, you can then run the sample code as normal.
  </Note>
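
The selection rules described above for `input_dir` and `input_files` can also be sketched in isolation, apart from the sample code itself. The following is a minimal sketch that assumes a hypothetical helper named `resolve_input_files`, which is not part of the Unstructured Python SDK. Like the sample code, it walks the directory recursively and skips `.json` files. Unlike the sample code, it sorts the filenames within each directory for a predictable order, trims spaces around the commas, and raises an error if neither setting is provided.

```python
import os


def resolve_input_files(input_dir=None, input_files=None):
    """Return the list of filepaths to process.

    Hypothetical helper for illustration only; it is not part of the
    Unstructured Python SDK.
    """
    if input_dir:
        # input_dir takes precedence; input_files is ignored in this case.
        paths = []
        for root, _, files in os.walk(input_dir):
            for name in sorted(files):
                # Skip JSON files, which may be output from an earlier run.
                if not name.endswith(".json"):
                    paths.append(os.path.join(root, name))
        return paths

    if input_files:
        # Split the comma-separated string and drop empty entries.
        return [p.strip() for p in input_files.split(",") if p.strip()]

    raise ValueError("Set input_dir or input_files before running.")
```

For example, `resolve_input_files(None, "a.pdf, b.pdf")` returns `["a.pdf", "b.pdf"]`, while passing both arguments returns only the files found under `input_dir`.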

## Sample code

```python Python SDK theme={null}
import asyncio
import os
import json
import unstructured_client
from unstructured_client.models import shared, errors

client = unstructured_client.UnstructuredClient(
    api_key_auth="<unstructured-api-key>"
)

async def partition_file_via_api(filename):
    # Read the file up front so that its handle is closed promptly.
    with open(filename, "rb") as f:
        file_content = f.read()

    req = {
        "partition_parameters": {
            "files": {
                "content": file_content,
                "file_name": os.path.basename(filename),
            },
            "strategy": shared.Strategy.AUTO,
            "vlm_model": "gpt-4o",
            "vlm_model_provider": "openai",
            "languages": ["eng"],
            "split_pdf_page": True,
            "split_pdf_allow_failed": True,
            "split_pdf_concurrency_level": 15
        }
    }

    try:
        res = await client.general.partition_async(request=req)
        return res.elements
    except errors.UnstructuredClientError as e:
        print(f"Error partitioning {filename}: {e.message}")
        return []

async def process_file_and_save_result(input_filename, output_dir):
    elements = await partition_file_via_api(input_filename)

    if elements:
        results_name = f"{os.path.basename(input_filename)}.json"
        output_filename = os.path.join(output_dir, results_name)

        with open(output_filename, "w") as f:
            json.dump(elements, f)

def load_filenames_in_directory(input_dir):
    filenames = []
    for root, _, files in os.walk(input_dir):
        for file in files:
            if not file.endswith('.json'):
                filenames.append(os.path.join(root, file))

    return filenames

async def process_files():
    # Initialize with either a directory name, to process everything in the dir,
    # or a comma-separated list of filepaths.
    input_dir = None   # "path/to/input/directory"
    input_files = None # "path/to/file,path/to/file,path/to/file"

    # Set to the directory for output JSON files. This directory
    # will be created if needed.
    output_dir = "./output/"

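    if input_dir:
        filenames = load_filenames_in_directory(input_dir)
    elif input_files:
        # Tolerate spaces around the commas in the list of filepaths.
        filenames = [f.strip() for f in input_files.split(",") if f.strip()]
    else:
        raise ValueError("Set input_dir or input_files before running this code.")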

    os.makedirs(output_dir, exist_ok=True)

    tasks = []
    for filename in filenames:
        tasks.append(
            process_file_and_save_result(filename, output_dir)
        )

    await asyncio.gather(*tasks)

if __name__ == "__main__":
    asyncio.run(process_files())
```
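
After the sample code runs, each successfully processed input file has a corresponding `.json` file in `output_dir`. As a rough way to inspect one of these files, the following sketch counts its elements by type. It assumes that the file contains a JSON array of element objects, each with a `type` field, as Unstructured's element format typically provides. `summarize_elements` is a hypothetical helper name, not part of the Unstructured Python SDK.

```python
import json
from collections import Counter


def summarize_elements(json_path):
    """Count the elements in one output file by their "type" field.

    Hypothetical helper for illustration only. Assumes the file holds
    a JSON array of element objects (dictionaries).
    """
    with open(json_path, "r", encoding="utf-8") as f:
        elements = json.load(f)

    # Elements without a "type" key are counted as "Unknown".
    return Counter(element.get("type", "Unknown") for element in elements)
```

For example, calling `summarize_elements("./output/my-file.pdf.json")` (a placeholder path) returns a `Counter` that maps each element type found in that file to the number of times it appears.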

## Next steps

This quickstart shows how to use the Unstructured Partition Endpoint, which is intended for rapid prototyping of
some of Unstructured's [partitioning](/api-reference/legacy-api/partition/partitioning) strategies, with limited support for [chunking](/api-reference/legacy-api/partition/chunking).
It is designed to process local files only.

Take your code to the next level by switching over to the [Unstructured API's workflow operations](/api-reference/workflow/overview)
for production-level scenarios, file processing in batches, files and data in remote locations, full support for [chunking](/concepts/chunking),
generating [embeddings](/concepts/embedding), applying post-transform [enrichments](/concepts/enriching/overview),
using the latest and highest-performing models, and much more.
[Get started](/api-reference/workflow/overview).