When you install the Unstructured Ingest CLI and the Unstructured Ingest Python library by running `pip install unstructured-ingest` by itself, you get support for the following file types by default:

File type
.bmp
.eml
.heic
.html
.jpg
.jpeg
.tiff
.png
.txt
.xml
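As a quick way to tell whether a given file falls under the default install, the list above can be encoded as a set and checked by extension. This is a minimal sketch only: `DEFAULT_EXTENSIONS`, `supported_by_default`, and `needs_extra` are hypothetical names introduced for illustration and are not part of the `unstructured-ingest` API.

```python
from pathlib import Path

# File types listed above as covered by a bare `pip install unstructured-ingest`.
# (Copied from this page; not read from the library itself.)
DEFAULT_EXTENSIONS = {
    ".bmp", ".eml", ".heic", ".html", ".jpg",
    ".jpeg", ".tiff", ".png", ".txt", ".xml",
}


def supported_by_default(path: str) -> bool:
    """Return True if the file's extension is in the default list (hypothetical helper)."""
    return Path(path).suffix.lower() in DEFAULT_EXTENSIONS


def needs_extra(paths: list[str]) -> list[str]:
    """Return the paths whose extensions are not covered by the default install."""
    return [p for p in paths if not supported_by_default(p)]
```

For example, `needs_extra(["a.txt", "b.docx"])` returns only `b.docx`, which signals that an additional extra is required for that file.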

To add support for additional file types, run the following:

| Command | File type |
| --- | --- |
| `pip install "unstructured-ingest[csv]"` | `.csv` |
| `pip install "unstructured-ingest[doc]"` | `.doc` |
| `pip install "unstructured-ingest[docx]"` | `.docx` |
| `pip install "unstructured-ingest[epub]"` | `.epub` |
| `pip install "unstructured-ingest[md]"` | `.md` |
| `pip install "unstructured-ingest[msg]"` | `.msg` |
| `pip install "unstructured-ingest[odt]"` | `.odt` |
| `pip install "unstructured-ingest[org]"` | `.org` |
| `pip install "unstructured-ingest[pdf]"` | `.pdf` |
| `pip install "unstructured-ingest[ppt]"` | `.ppt` |
| `pip install "unstructured-ingest[pptx]"` | `.pptx` |
| `pip install "unstructured-ingest[rtf]"` | `.rtf` |
| `pip install "unstructured-ingest[rst]"` | `.rst` |
| `pip install "unstructured-ingest[tsv]"` | `.tsv` |
| `pip install "unstructured-ingest[xlsx]"` | `.xlsx` |
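pip accepts several extras in one requirement when they are separated by commas, so the per-type commands above can be merged into a single install. The sketch below derives that command from a list of file names. It assumes each extra is named after its extension, as in the table above; `FILE_TYPE_EXTRAS`, `extras_for`, and `pip_command` are hypothetical helpers, not library functions.

```python
from pathlib import Path

# Extra names copied from the table above; each matches its file extension.
FILE_TYPE_EXTRAS = (
    "csv", "doc", "docx", "epub", "md", "msg", "odt", "org",
    "pdf", "ppt", "pptx", "rtf", "rst", "tsv", "xlsx",
)


def extras_for(paths):
    """Return the sorted, de-duplicated extras needed for the given file paths."""
    found = set()
    for path in paths:
        ext = Path(path).suffix.lower().lstrip(".")
        if ext in FILE_TYPE_EXTRAS:
            found.add(ext)
    return sorted(found)


def pip_command(extras):
    """Build one pip command that installs all of the given extras together."""
    if not extras:
        return "pip install unstructured-ingest"
    joined = ",".join(extras)
    return f'pip install "unstructured-ingest[{joined}]"'
```

Files that need no extra, such as `.txt`, are simply ignored, so a folder of mixed documents yields one command covering only the extras it actually needs.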

To add support for additional connectors, run the following:

| Command | Connector type |
| --- | --- |
| `pip install "unstructured-ingest[airtable]"` | Airtable |
| `pip install "unstructured-ingest[astra]"` | Astra DB |
| `pip install "unstructured-ingest[azure]"` | Azure Blob Storage |
| `pip install "unstructured-ingest[azure-cognitive-search]"` | Azure Cognitive Search Service |
| `pip install "unstructured-ingest[biomed]"` | Biomed |
| `pip install "unstructured-ingest[box]"` | Box |
| `pip install "unstructured-ingest[chroma]"` | Chroma |
| `pip install "unstructured-ingest[clarifai]"` | Clarifai |
| `pip install "unstructured-ingest[confluence]"` | Confluence |
| `pip install "unstructured-ingest[couchbase]"` | Couchbase |
| `pip install "unstructured-ingest[databricks-volumes]"` | Databricks Volumes |
| `pip install "unstructured-ingest[delta-table]"` | Delta Tables |
| `pip install "unstructured-ingest[discord]"` | Discord |
| `pip install "unstructured-ingest[dropbox]"` | Dropbox |
| `pip install "unstructured-ingest[elasticsearch]"` | Elasticsearch |
| `pip install "unstructured-ingest[gcs]"` | Google Cloud Storage |
| `pip install "unstructured-ingest[github]"` | GitHub |
| `pip install "unstructured-ingest[gitlab]"` | GitLab |
| `pip install "unstructured-ingest[google-drive]"` | Google Drive |
| `pip install "unstructured-ingest[hubspot]"` | HubSpot |
| `pip install "unstructured-ingest[jira]"` | JIRA |
| `pip install "unstructured-ingest[kafka]"` | Apache Kafka |
| `pip install "unstructured-ingest[milvus]"` | Milvus |
| `pip install "unstructured-ingest[mongodb]"` | MongoDB |
| `pip install "unstructured-ingest[notion]"` | Notion |
| `pip install "unstructured-ingest[onedrive]"` | OneDrive |
| `pip install "unstructured-ingest[opensearch]"` | OpenSearch |
| `pip install "unstructured-ingest[outlook]"` | Outlook |
| `pip install "unstructured-ingest[pinecone]"` | Pinecone |
| `pip install "unstructured-ingest[postgres]"` | PostgreSQL, SQLite |
| `pip install "unstructured-ingest[qdrant]"` | Qdrant |
| `pip install "unstructured-ingest[reddit]"` | Reddit |
| `pip install "unstructured-ingest[s3]"` | Amazon S3 |
| `pip install "unstructured-ingest[sharepoint]"` | SharePoint |
| `pip install "unstructured-ingest[salesforce]"` | Salesforce |
| `pip install "unstructured-ingest[singlestore]"` | SingleStore |
| `pip install "unstructured-ingest[sftp]"` | SFTP |
| `pip install "unstructured-ingest[slack]"` | Slack |
| `pip install "unstructured-ingest[wikipedia]"` | Wikipedia |
| `pip install "unstructured-ingest[weaviate]"` | Weaviate |
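The set of available extras can change between releases, so it may help to check which extras the installed copy actually declares before relying on a connector name from the table. The sketch below reads the standard `Provides-Extra` metadata field through `importlib.metadata`. It assumes the base package is already installed, and `declared_extras` is a hypothetical helper name.

```python
from importlib import metadata


def declared_extras(distribution: str = "unstructured-ingest") -> list[str]:
    """Return the extras declared by an installed distribution.

    Returns an empty list if the distribution is not installed or
    declares no extras.
    """
    try:
        meta = metadata.metadata(distribution)
    except metadata.PackageNotFoundError:
        return []
    # Provides-Extra may appear zero or more times in the metadata.
    return sorted(meta.get_all("Provides-Extra") or [])
```

A check such as `"s3" in declared_extras()` then indicates whether the installed version knows about that extra. It does not indicate whether the extra's dependencies are themselves installed.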

To add support for available embedding libraries, run the following:

| Command | Embedding library type |
| --- | --- |
| `pip install "unstructured-ingest[bedrock]"` | Amazon Bedrock |
| `pip install "unstructured-ingest[embed-huggingface]"` | Hugging Face |
| `pip install "unstructured-ingest[embed-octoai]"` | OctoAI |
| `pip install "unstructured-ingest[embed-vertexai]"` | Google Vertex AI |
| `pip install "unstructured-ingest[embed-voyageai]"` | Voyage AI |
| `pip install "unstructured-ingest[embed-mixedbreadai]"` | Mixedbread |
| `pip install "unstructured-ingest[openai]"` | OpenAI |
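A typical pipeline needs one extra from each category: a file type, a connector, and an embedding library. The sketch below shows one way to assemble them into a single pip invocation from Python. `install_argv` is a hypothetical helper, and it does not validate the extra names against the tables on this page.

```python
import sys


def install_argv(file_types=(), connectors=(), embeddings=()):
    """Build the argument list for one pip install covering all given extras."""
    # Merge the three categories, dropping duplicates, in a stable order.
    extras = sorted({*file_types, *connectors, *embeddings})
    requirement = "unstructured-ingest"
    if extras:
        requirement += "[" + ",".join(extras) + "]"
    # Running pip as a module of the current interpreter targets the
    # same environment that this script runs in.
    return [sys.executable, "-m", "pip", "install", requirement]


# To actually run the install (not executed here):
#   import subprocess
#   subprocess.run(install_argv(["pdf"], ["s3"], ["openai"]), check=True)
```

Passing the arguments as a list means the square brackets in the requirement need no shell quoting, unlike the commands in the tables.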

For details about the specific dependencies that are installed, see setup.py.
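Besides reading setup.py, you can inspect the dependencies recorded in the metadata of an installed copy. The sketch below groups requirement strings by the extra that pulls them in, using the `extra == "..."` environment marker from standard Python package metadata. `group_by_extra` and `installed_requirements` are hypothetical helpers, and the regular expression is a simplification that assumes at most one `extra` marker per requirement.

```python
import re
from collections import defaultdict
from importlib import metadata

# Matches the extra name inside a marker such as: extra == "s3"
_EXTRA_MARKER = re.compile(r"""extra\s*==\s*['"]([^'"]+)['"]""")


def group_by_extra(requirements):
    """Group requirement strings by extra; base dependencies go under ""."""
    grouped = defaultdict(list)
    for req in requirements:
        # The requirement itself precedes the first ";", markers follow it.
        name = req.split(";", 1)[0].strip()
        match = _EXTRA_MARKER.search(req)
        grouped[match.group(1) if match else ""].append(name)
    return dict(grouped)


def installed_requirements(distribution="unstructured-ingest"):
    """Return the requirement strings of an installed distribution, or []."""
    try:
        return metadata.requires(distribution) or []
    except metadata.PackageNotFoundError:
        return []
```

Calling `group_by_extra(installed_requirements())` on a machine with the package installed gives a dictionary keyed by extra name, which can be compared against the tables on this page.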