Best practices
Chunking
Unstructured API services
Getting started with API services
Process individual files
Batch processing and ingestion
- Overview
- Ingest CLI
- Ingest Python library
- Ingest dependencies
- Ingest configuration
- Source connectors
- Destination connectors
How to
- Choose a partitioning strategy
- Choose a hi-res model
- Get element contents
- Process a subset of files
- Set embedding behavior
- Parse simple PDFs and HTML
- Set partitioning behavior
- Set chunking behavior
- Output unique element IDs
- Output bounding box coordinates
- Set document language for better OCR
- Extract tables as HTML
- Extract images and tables from documents
- Get chunked elements
- Change element coordinate systems
- Work with PowerPoint files
- Use LangChain and Ollama
- Use LangChain and Llama 3
- Transform a JSON file into a different schema
- Generate a JSON schema for a file