> ## Documentation Index
> Fetch the complete documentation index at: https://docs.unstructured.io/llms.txt
> Use this file to discover all available pages before exploring further.

# Tables to HTML

<iframe width="560" height="315" src="https://www.youtube.com/embed/lT2ixyunrvA" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen />

After partitioning, you can have Unstructured generate representations of each detected table in HTML markup format.

This table-to-HTML output is generated by using agentic AI or a vision language model (VLM).
The agentic AI option typically provides more accurate table-to-HTML output than the VLM option.

Here is an example of the HTML markup output of a detected table using GPT-4o. Note specifically the `text_as_html` field that is added.
Line breaks have been inserted here for readability. The output will not contain these line breaks.

````json  theme={null}
{
    "type": "Table",
    "element_id": "31aa654088742f1388d46ea9c8878272",
    "text": "Inhibitor Polarization Corrosion be (V/dec) ba (V/dec) Ecorr (V) icorr 
        (AJcm?) concentration (g) resistance (Q) rate (mmj/year) 0.0335 0.0409 
        \u20140.9393 0.0003 24.0910 2.8163 1.9460 0.0596 .8276 0.0002 121.440 
        1.5054 0.0163 0.2369 .8825 0.0001 42121 0.9476 s NO 03233 0.0540 
        \u20140.8027 5.39E-05 373.180 0.4318 0.1240 0.0556 .5896 5.46E-05 
        305.650 0.3772 = 5 0.0382 0.0086 .5356 1.24E-05 246.080 0.0919",
    "metadata": {
        "text_as_html": "```html\n
            <table>\n
                <tr>\n<th>Inhibitor concentration (g)</th>\n
                    <th>bc (V/dec)</th>\n<th>ba (V/dec)</th>\n<th>Ecorr (V)</th>\n
                    <th>icorr (A/cm\u00b2)</th>\n<th>Polarization resistance (\u03a9)</th>\n
                    <th>Corrosion rate (mm/year)</th>\n
                </tr>\n  
                <tr>\n
                    <td>0</td>\n<td>0.0335</td>\n<td>0.0409</td>\n<td>\u22120.9393</td>\n
                    <td>0.0003</td>\n<td>24.0910</td>\n<td>2.8163</td>\n  
                </tr>\n
                <tr>\n   
                    <td>2</td>\n<td>1.9460</td>\n<td>0.0596</td>\n<td>\u22120.8276</td>\n<td>0.0002</td>\n<td>121.440</td>\n<td>1.5054</td>\n  
                </tr>\n
                <tr>\n
                    <td>4</td>\n<td>0.0163</td>\n<td>0.2369</td>\n<td>\u22120.8825</td>\n<td>0.0001</td>\n<td>42.121</td>\n<td>0.9476</td>\n  
                </tr>\n  
                <tr>\n
                    <td>6</td>\n<td>0.3233</td>\n<td>0.0540</td>\n<td>\u22120.8027</td>\n<td>5.39E-05</td>\n<td>373.180</td>\n<td>0.4318</td>\n  
                </tr>\n  
                <tr>\n
                    <td>8</td>\n<td>0.1240</td>\n<td>0.0556</td>\n<td>\u22120.5896</td>\n<td>5.46E-05</td>\n<td>305.650</td>\n<td>0.3772</td>\n  
                </tr>\n  
                <tr>\n
                    <td>10</td>\n<td>0.0382</td>\n<td>0.0086</td>\n<td>\u22120.5356</td>\n<td>1.24E-05</td>\n<td>246.080</td>\n<td>0.0919</td>\n
                </tr>\n
            </table>\n```",
        "filetype": "application/pdf",
        "languages": [
            "eng"
        ],
        "page_number": 1,
        "image_base64": "/9j...<full results omitted for brevity>...//Z",
        "image_mime_type": "image/jpeg",
        "filename": "embedded-images-tables.pdf",
        "data_source": {}
    }
}
````

<Note>
  The `image_base64` field is generated only for documents or PDF pages that are [partitioned](/ui/partitioning) by using the High Res strategy. This field is not generated for
  documents or PDF pages that are partitioned by using the Fast or VLM strategy.
</Note>

For workflows that use [chunking](/ui/chunking), note the following changes:

* If a `Table` element must be chunked, the `Table` element is replaced by a set of related `TableChunk` elements.
* Each of these `TableChunk` elements will contain HTML table output for only its own element.
* None of these `TableChunk` elements will contain an `image_base64` field.

## Generate table-to-HTML output

To generate table-to-HTML output, add a **Table to HTML** node to your workflow by clicking **+** in the workflow editor, and then click **Enrich > Table to HTML**.

To use agentic AI to generate the table-to-HTML output, for **Mode**, click **Agentic**.

To use a VLM to generate the table-to-HTML output instead, for **Mode**, click **Standard**. Then click
one of the available **Provider** and **Model** options.

<Note>
  You can change a workflow's table description settings only through [Custom](/ui/workflows#create-a-custom-workflow) workflow settings.

  For workflows that use [chunking](/ui/chunking), the **Chunker** node should be placed after all enrichment nodes. Placing the
  **Chunker** node before a table-to-HTML output enrichment node could cause incomplete or no table-to-HTML output to be generated.
</Note>

<Warning>
  The following models are no longer available as of the following dates:

  * Amazon Bedrock Claude Sonnet 3.5: October 22, 2025
  * Anthropic Claude Sonnet 3.5: October 22, 2025

  Unstructured recommends the following actions:

  * For new workflows, do not use any of these models.
  * For any workflow that uses any of these models, update that workflow as soon as possible to use a different model.

  Workflows that attempt to use any of these models on or after its associated date will return errors.
</Warning>

<Warning>
  Unstructured can potentially generate table-to-HTML output only for workflows that are configured as follows:

  * With a **Partitioner** node set to use the **Auto** or **High Res** partitioning strategy, and a table-to-HTML output node is added.
  * With a **Partitioner** node set to use the **VLM** partitioning strategy. No table-to-HTML output node is needed (or allowed).

  Even with these configurations, Unstructured actually generates table-to-HTML output only for files that contain tables and are also eligible
  for processing with the following partitioning strategies:

  * **High Res**, when the workflow's **Partitioner** node is set to use **Auto** or **High Res**.
  * **VLM** or **High Res**, when the workflow's **Partitioner** node is set to use **VLM**.

  Unstructured never generates table-to-HTML output for workflows that are configured as follows:

  * With a **Partitioner** node set to use the **Fast** partitioning strategy.
  * With a **Partitioner** node set to use the **Auto**, **High Res**, or **VLM** partitioning strategy, for all files that Unstructured encounters that do not contain tables.
</Warning>

## Learn more

* <Icon icon="video" />  [How to Extract Data from Complex Tables](https://unstructured.io/events/how-to-extract-data-from-complex-tables)
* <Icon icon="blog" />  [The Case for HTML as the Canonical Representation in Document AI](https://unstructured.io/blog/the-case-for-html-as-the-canonical-representation-in-document-ai)
* <Icon icon="blog" />  [Preserving Table Structure for Better Retrieval](https://unstructured.io/blog/preserving-table-structure-for-better-retrieval)
