After partitioning, you can have Unstructured generate representations of each detected table in HTML markup format.
This table-to-HTML output is generated by using agentic AI or a vision language model (VLM).
The agentic AI option typically provides more accurate table-to-HTML output than the VLM option.
Here is an example of the HTML markup output of a detected table using GPT-4o. Note specifically the text_as_html field that is added.
Line breaks have been inserted here for readability. The output will not contain these line breaks.
The image_base64 field is generated only for documents or PDF pages that are partitioned by using the High Res strategy. This field is not generated for documents or PDF pages that are partitioned by using the Fast or VLM strategy.
- If a Table element must be chunked, the Table element is replaced by a set of related TableChunk elements.
- Each of these TableChunk elements will contain HTML table output for only its own element.
- None of these TableChunk elements will contain an image_base64 field.
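As a rough illustration of how the text_as_html field might be consumed downstream, the following sketch collects the HTML from Table and TableChunk elements in a list of output elements. It assumes that each element is a JSON object with a type field and that text_as_html sits inside the element's metadata object. The sample elements and the helper name collect_table_html are made up for illustration and are not actual Unstructured output.

```python
import json

# Hypothetical sample that imitates the assumed shape of the JSON output.
# The values are invented for illustration only.
SAMPLE_OUTPUT = json.dumps([
    {"type": "NarrativeText", "text": "Quarterly results.", "metadata": {}},
    {
        "type": "Table",
        "text": "Region Sales",
        "metadata": {
            "text_as_html": "<table><tr><td>Region</td><td>Sales</td></tr></table>",
            "image_base64": "placeholder",  # present only with the High Res strategy
        },
    },
    {
        "type": "TableChunk",
        "text": "North 10",
        # A TableChunk carries HTML for its own chunk and no image_base64 field.
        "metadata": {"text_as_html": "<table><tr><td>North</td><td>10</td></tr></table>"},
    },
])


def collect_table_html(elements):
    """Return (element type, HTML) pairs for elements that carry table HTML."""
    results = []
    for element in elements:
        if element.get("type") not in ("Table", "TableChunk"):
            continue
        # Assumption: text_as_html lives under the element's metadata object.
        html = element.get("metadata", {}).get("text_as_html")
        if html:
            results.append((element["type"], html))
    return results


if __name__ == "__main__":
    for kind, html in collect_table_html(json.loads(SAMPLE_OUTPUT)):
        print(kind, html)
```

Handling Table and TableChunk in the same loop means the code keeps working whether or not chunking replaced a Table element with TableChunk elements.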
Generate table-to-HTML output
To have Unstructured generate table-to-HTML output, do the following:
- For Unstructured UI users, add an Enrichment node of type Table to HTML to an Unstructured custom workflow.
- For Unstructured API users, add a Table to HTML task.
  You add this task as either an object in a workflow_nodes array (for curl) or a WorkflowNode in a WorkflowNodes collection (for Python). This object or collection applies whenever you create a workflow, update a workflow, or create an on-demand workflow job.
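The API option above can be sketched as follows. This is a minimal example that builds the task as a plain dictionary and places it in a workflow_nodes array, without calling any SDK. The field names (name, type, subtype, settings) and their values are assumptions that should be checked against the Unstructured API reference. The helper names, the placeholder partitioner node, and the workflow name are hypothetical.

```python
import json


def build_table_to_html_node(name="Enrichment"):
    """Build a dict describing a Table to HTML task (field values are assumptions)."""
    return {
        "name": name,
        "type": "prompter",              # assumed node type for enrichments
        "subtype": "openai_table2html",  # assumed subtype for Table to HTML
        "settings": {},
    }


def add_node(workflow_nodes, node):
    """Return a new list with the node appended, leaving the input unchanged."""
    return [*workflow_nodes, node]


if __name__ == "__main__":
    # Hypothetical placeholder for a partitioner node that runs first.
    existing_nodes = [{"name": "Partitioner", "type": "partition", "settings": {}}]
    payload = {
        "name": "my-workflow",  # hypothetical workflow name
        "workflow_nodes": add_node(existing_nodes, build_table_to_html_node()),
    }
    # The serialized payload is the kind of body a curl request would carry.
    print(json.dumps(payload, indent=2))
```

With the Python SDK, the same fields would be passed to a WorkflowNode instead of a dictionary.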

