The Unstructured Platform API consists of two parts:

  • The Unstructured Platform Workflow Endpoint enables a full range of partitioning, chunking, embedding, and enrichment options for your files and data. It is designed to batch-process files and data in remote locations; send processed results to storage locations, databases, and vector stores; and use the latest and highest-performing models on the market today. It has built-in logic to deliver the highest quality results at the lowest cost.
  • The Unstructured Platform Partition Endpoint is intended for rapid prototyping of Unstructured’s various partitioning strategies, with limited support for chunking. It processes local files only, one file at a time. Use the Unstructured Platform Workflow Endpoint instead for production-level scenarios: processing files in batches, working with files and data in remote locations, generating embeddings, applying post-transform enrichments, using the latest and highest-performing models, and getting the highest quality results at the lowest cost.
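
To illustrate the "local files, one file at a time" model of the Partition Endpoint, here is a minimal Python sketch of a single-file request. It is not an official client. The endpoint URL, the `unstructured-api-key` header name, and the `files` and `strategy` form fields are assumptions made for this example, as are the helper names `build_partition_request` and `partition_local_file`; confirm the actual values against your account settings and the endpoint's reference documentation before relying on them.

```python
from pathlib import Path

# Assumption: this URL is illustrative; use the Partition Endpoint URL
# shown in your own account settings.
PARTITION_URL = "https://api.unstructuredapp.io/general/v0/general"


def build_partition_request(api_key: str, strategy: str = "auto") -> dict:
    """Assemble the URL, headers, and form fields for one partition call.

    Hypothetical helper: the header and field names are assumptions.
    """
    return {
        "url": PARTITION_URL,
        "headers": {
            "unstructured-api-key": api_key,  # assumed header name
            "accept": "application/json",
        },
        "data": {"strategy": strategy},  # assumed form field
    }


def partition_local_file(path: str, api_key: str, strategy: str = "auto") -> list:
    """Send exactly one local file and return the parsed JSON response."""
    import requests  # third-party; imported here so the builder has no dependencies

    req = build_partition_request(api_key, strategy)
    file_path = Path(path)
    with file_path.open("rb") as fh:
        response = requests.post(
            req["url"],
            headers=req["headers"],
            data=req["data"],
            files={"files": (file_path.name, fh)},  # assumed field name
            timeout=120,
        )
    response.raise_for_status()
    return response.json()
```

A call such as `partition_local_file("report.pdf", api_key)` would upload one file and wait for its result. Processing many files this way means looping on the client side, which is the kind of batch workload the Workflow Endpoint is designed to handle instead.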

Benefits over open source

The Unstructured Platform API provides the following benefits beyond the Unstructured open source library offering:

  • Designed for production scenarios.
  • Significantly increased performance on document and table extraction.
  • Access to newer and more sophisticated vision transformer models.
  • Access to Unstructured’s fine-tuned OCR models.
  • Access to Unstructured’s by-page and by-similarity chunking strategies.
  • Adherence to security and compliance standards, including SOC 2 Type 1, SOC 2 Type 2, and HIPAA.
  • Authentication and identity management.
  • Incremental data loading.
  • Image extraction from documents.
  • More sophisticated document hierarchy detection.
  • Unstructured manages code dependencies for you, including libraries such as Tesseract.
  • Unstructured manages its own infrastructure, including parallelization and other performance optimizations.

Get support

Should you require any assistance or have any questions regarding the Unstructured Platform API, please contact us directly.