The following information applies to the legacy Unstructured Partition Endpoint. Unstructured recommends that you use the
on-demand jobs functionality in the
Unstructured API instead. Unstructured’s on-demand jobs provide
many benefits over the legacy Unstructured Partition Endpoint, including support for:
- Production-level usage.
- Multiple local input files in batches.
- The latest and highest-performing models.
- Post-transform enrichments.
- All of Unstructured’s chunking strategies.
- The generation of vector embeddings.
Task
You want to get, save, or show the contents of elements that are represented as HTML, such as tables that are embedded in a PDF document.
Approach
Extract the contents of an element’s text_as_html JSON object, which is nested inside of its parent metadata object.
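As a minimal sketch of that lookup, assume each element is a dictionary shaped like the JSON that partitioning returns. The sample element below is made up for illustration, and the helper name get_text_as_html is hypothetical rather than part of any Unstructured library.

```python
from typing import Optional

# A made-up element for illustration; real output carries more metadata fields.
sample_element = {
    "type": "Table",
    "text": "Name Score Ada 90",
    "metadata": {
        "text_as_html": (
            "<table><tr><td>Name</td><td>Score</td></tr>"
            "<tr><td>Ada</td><td>90</td></tr></table>"
        ),
    },
}


def get_text_as_html(element: dict) -> Optional[str]:
    """Return metadata.text_as_html, or None if the element has no HTML form."""
    return element.get("metadata", {}).get("text_as_html")


print(get_text_as_html(sample_element))
```

Using .get at both levels means elements without a metadata object, or without text_as_html inside it, yield None instead of raising a KeyError.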
To run this example
You will need a document that is one of the document types that can output the text_as_html JSON object. For the list of applicable document types, see the entries in the table at the beginning of Partitioning where “Table Support” is “Yes.”
This example uses a PDF file with an embedded table.
Code
For the Unstructured Python SDK, you’ll need these environment variables:
- UNSTRUCTURED_API_KEY - Your Unstructured API key value.
- UNSTRUCTURED_API_URL - Your Unstructured API URL.
Python SDK
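The following is a minimal sketch of the full flow, not the original example from this page. It assumes the unstructured-client Python package is installed and that its request shape matches what is shown here, so check the SDK reference for your installed version. The hi_res strategy choice, the output file naming, and the helper names (partition_file, html_fragments, save_html, main) are illustrative assumptions.

```python
import os
from pathlib import Path


def partition_file(path: str) -> list:
    """Send a local file to the Partition Endpoint and return its elements.

    Assumes UNSTRUCTURED_API_KEY and UNSTRUCTURED_API_URL are set.
    """
    # Imported here so the helpers below also work without the SDK installed.
    from unstructured_client import UnstructuredClient
    from unstructured_client.models import operations, shared

    client = UnstructuredClient(
        api_key_auth=os.environ["UNSTRUCTURED_API_KEY"],
        server_url=os.environ["UNSTRUCTURED_API_URL"],
    )
    with open(path, "rb") as f:
        files = shared.Files(content=f.read(), file_name=os.path.basename(path))
    request = operations.PartitionRequest(
        partition_parameters=shared.PartitionParameters(
            files=files,
            # Assumption: hi_res is chosen so that table structure is detected.
            strategy=shared.Strategy.HI_RES,
        )
    )
    response = client.general.partition(request=request)
    return response.elements or []


def html_fragments(elements: list) -> list:
    """Collect metadata.text_as_html from every element that has one."""
    return [
        element["metadata"]["text_as_html"]
        for element in elements
        if "text_as_html" in element.get("metadata", {})
    ]


def save_html(elements: list, out_dir: str) -> list:
    """Write each HTML fragment to its own file and return the paths written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for index, html in enumerate(html_fragments(elements), start=1):
        target = out / f"element-{index}.html"
        target.write_text(html, encoding="utf-8")
        written.append(target)
    return written


def main(input_path: str, out_dir: str = "html_output") -> None:
    """Partition one file, then print and save its HTML fragments."""
    elements = partition_file(input_path)
    for html in html_fragments(elements):
        print(html)
    for path in save_html(elements, out_dir):
        print(f"Saved {path}")
```

To try it, call main with the path to your own PDF file that contains an embedded table. Nothing runs against the API until main or partition_file is called.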

