After partitioning and chunking, you can have Unstructured generate a representation of each detected table in HTML markup format. This table-to-HTML output is generated by GPT-4o, provided through OpenAI.
Here is an example of the HTML markup output of a detected table using GPT-4o. Note specifically the text_as_html field that is added.
Line breaks have been inserted here for readability. The output will not contain these line breaks.
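As a rough illustration of consuming this output, the following Python sketch collects the text_as_html value from each table element. It assumes the workflow output is a JSON array of element objects, each with a type field and a metadata object that holds text_as_html for tables. The sample data and the extract_table_html helper are hypothetical and exist only for this example.

```python
import json

# Hypothetical sample output, trimmed to the fields this sketch reads.
# Real output contains more fields per element.
sample_output = """
[
  {"type": "Title", "text": "Quarterly results", "metadata": {}},
  {"type": "Table", "text": "Region Sales North 10 South 12",
   "metadata": {"text_as_html": "<table><tr><th>Region</th><th>Sales</th></tr><tr><td>North</td><td>10</td></tr><tr><td>South</td><td>12</td></tr></table>"}}
]
"""


def extract_table_html(elements):
    """Return the text_as_html string of every Table element that has one."""
    html_tables = []
    for element in elements:
        if element.get("type") != "Table":
            continue
        # Assumption: the enrichment stores the markup under metadata.text_as_html.
        html = element.get("metadata", {}).get("text_as_html")
        if html:
            html_tables.append(html)
    return html_tables


elements = json.loads(sample_output)
tables = extract_table_html(elements)
for html in tables:
    print(html)
```

Elements without a text_as_html value are skipped rather than treated as errors, since tables processed without the enrichment would not carry the field.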
To generate table-to-HTML output, in an Enrichment node in a workflow, for Model, select OpenAI (GPT-4o). After you choose this provider and model, make sure that Table to HTML is also selected.
You can change a workflow’s table-to-HTML settings only through Custom workflow settings.
Table-to-HTML generation happens only when the Partitioner node in a workflow is set to use the High Res partitioning strategy and the workflow also contains a table-to-HTML enrichment node.
Setting the Partitioner node to use Auto, VLM, or Fast in a workflow that also contains a table-to-HTML enrichment node will not generate any table-to-HTML output, and it could also cause the workflow to stop running or produce unexpected results.
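The rule above can be expressed as a small pre-flight check. This is only a sketch: the function name, the plain-string strategy labels, and the list of enrichment task names are hypothetical stand-ins for whatever your own tooling records about a workflow, not part of any Unstructured API.

```python
# Hypothetical labels mirroring the options named in the text above.
HIGH_RES = "High Res"
TABLE_TO_HTML = "Table to HTML"


def check_table_to_html(partitioner_strategy, enrichment_tasks):
    """Classify a workflow configuration against the table-to-HTML rule.

    Returns one of:
      "ok"       - High Res partitioning plus a table-to-HTML enrichment
      "disabled" - no table-to-HTML enrichment, so nothing to generate
      "invalid"  - table-to-HTML enrichment with a non-High Res strategy
    """
    if TABLE_TO_HTML not in enrichment_tasks:
        return "disabled"
    if partitioner_strategy != HIGH_RES:
        # No table-to-HTML output is generated in this case, and the
        # workflow might stop running or produce unexpected results.
        return "invalid"
    return "ok"


print(check_table_to_html("High Res", ["Table to HTML"]))
print(check_table_to_html("Auto", ["Table to HTML"]))
print(check_table_to_html("Fast", []))
```

Running a check like this before saving a workflow definition would flag the Auto, VLM, and Fast combinations early instead of discovering them through missing output.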