> ## Documentation Index
> Fetch the complete documentation index at: https://docs.unstructured.io/llms.txt
> Use this file to discover all available pages before exploring further.

# Full installation

To install the Unstructured open source library on a local development machine, run one or more of the following commands.

These commands assume that you are using the Python package and project manager
[uv](https://docs.astral.sh/uv/), running within an activated [venv](https://docs.python.org/3/library/venv.html)
virtual environment that was created with `uv`. However, `uv` and `venv` are not required.

To work with all [supported file types](/open-source/introduction/supported-file-types), run:

```bash  theme={null}
uv add "unstructured[all-docs]"
```

To conserve disk space and reduce code dependencies, you can run the following command instead to work with a
default set of supported file types:

```bash  theme={null}
uv add unstructured
```

The preceding command supports plain text files (`.txt`), HTML files (`.html`), XML files (`.xml`), and emails (`.eml`, `.msg`, and `.p7s`) by default.

To further conserve disk space and reduce code dependencies, you can run the following command instead, replacing `<extra>` with the appropriate extra for the target file type:

```bash  theme={null}
uv add "unstructured[<extra>]"
```

The following file type extras are available:

* `all-docs` (for all supported file types in this list)
* `csv` (for `.csv` files only)
* `docx` (for `.doc` and `.docx` files only)
* `epub` (for `.epub` files only)
* `image` (for all supported image file types: `.bmp`, `.heic`, `.jpeg`, `.png`, and `.tiff`)
* `md` (for `.md` files only)
* `odt` (for `.odt` files only)
* `org` (for `.org` files only)
* `pdf` (for `.pdf` files only)
* `pptx` (for `.ppt` and `.pptx` files only)
* `rst` (for `.rst` files only)
* `rtf` (for `.rtf` files only)
* `tsv` (for `.tsv` files only)
* `xlsx` (for `.xls` and `.xlsx` files only)

Note that you can install multiple extras at the same time by separating them with commas, for example:

```bash  theme={null}
uv add "unstructured[pdf,docx]"
```

For maximum compatiblity, you should also install the following system dependencies:

* [libmagic-dev](https://man7.org/linux/man-pages/man3/libmagic.3.html) (for filetype detection)
* [poppler-utils](https://poppler.freedesktop.org/) and [tesseract-ocr](https://github.com/tesseract-ocr/tesseract) (for images and PDFs), and `tesseract-lang` (for additional language support)
* [libreoffice](https://www.libreoffice.org/discover/libreoffice/) (for Microsoft Office documents)
* [pandoc](https://pandoc.org/) (for `.epub`, `.odt`, and `.rtf` files. For `.rtf` files, you must have version 2.14.2 or newer. Running [this script](https://github.com/Unstructured-IO/unstructured/blob/main/scripts/install-pandoc.sh) will install the correct version for you.)

Installation instructured for these system dependencies vary by operating system type. For details, follow the preceding links or see your
operating system's documentation.
