# Tables to HTML

<iframe width="560" height="315" src="https://www.youtube.com/embed/lT2ixyunrvA" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen />

After partitioning, you can have Unstructured generate an HTML representation of each detected table.

This table-to-HTML output is generated by using agentic AI or a vision language model (VLM).
The agentic AI option typically provides more accurate table-to-HTML output than the VLM option.

The following example shows the HTML output for a detected table, generated by using GPT-4o. Note the added `text_as_html` field.
Line breaks have been inserted here for readability; the actual output does not contain them.

````json theme={null}
{
    "type": "Table",
    "element_id": "31aa654088742f1388d46ea9c8878272",
    "text": "Inhibitor Polarization Corrosion be (V/dec) ba (V/dec) Ecorr (V) icorr 
        (AJcm?) concentration (g) resistance (Q) rate (mmj/year) 0.0335 0.0409 
        \u20140.9393 0.0003 24.0910 2.8163 1.9460 0.0596 .8276 0.0002 121.440 
        1.5054 0.0163 0.2369 .8825 0.0001 42121 0.9476 s NO 03233 0.0540 
        \u20140.8027 5.39E-05 373.180 0.4318 0.1240 0.0556 .5896 5.46E-05 
        305.650 0.3772 = 5 0.0382 0.0086 .5356 1.24E-05 246.080 0.0919",
    "metadata": {
        "text_as_html": "```html\n
            <table>\n
                <tr>\n<th>Inhibitor concentration (g)</th>\n
                    <th>bc (V/dec)</th>\n<th>ba (V/dec)</th>\n<th>Ecorr (V)</th>\n
                    <th>icorr (A/cm\u00b2)</th>\n<th>Polarization resistance (\u03a9)</th>\n
                    <th>Corrosion rate (mm/year)</th>\n
                </tr>\n  
                <tr>\n
                    <td>0</td>\n<td>0.0335</td>\n<td>0.0409</td>\n<td>\u22120.9393</td>\n
                    <td>0.0003</td>\n<td>24.0910</td>\n<td>2.8163</td>\n  
                </tr>\n
                <tr>\n   
                    <td>2</td>\n<td>1.9460</td>\n<td>0.0596</td>\n<td>\u22120.8276</td>\n<td>0.0002</td>\n<td>121.440</td>\n<td>1.5054</td>\n  
                </tr>\n
                <tr>\n
                    <td>4</td>\n<td>0.0163</td>\n<td>0.2369</td>\n<td>\u22120.8825</td>\n<td>0.0001</td>\n<td>42.121</td>\n<td>0.9476</td>\n  
                </tr>\n  
                <tr>\n
                    <td>6</td>\n<td>0.3233</td>\n<td>0.0540</td>\n<td>\u22120.8027</td>\n<td>5.39E-05</td>\n<td>373.180</td>\n<td>0.4318</td>\n  
                </tr>\n  
                <tr>\n
                    <td>8</td>\n<td>0.1240</td>\n<td>0.0556</td>\n<td>\u22120.5896</td>\n<td>5.46E-05</td>\n<td>305.650</td>\n<td>0.3772</td>\n  
                </tr>\n  
                <tr>\n
                    <td>10</td>\n<td>0.0382</td>\n<td>0.0086</td>\n<td>\u22120.5356</td>\n<td>1.24E-05</td>\n<td>246.080</td>\n<td>0.0919</td>\n
                </tr>\n
            </table>\n```",
        "filetype": "application/pdf",
        "languages": [
            "eng"
        ],
        "page_number": 1,
        "image_base64": "/9j...<full results omitted for brevity>...//Z",
        "image_mime_type": "image/jpeg",
        "filename": "embedded-images-tables.pdf",
        "data_source": {}
    }
}
````

<Note>
  The `image_base64` field is generated only for documents or PDF pages that are [partitioned](/concepts/partitioning) by using the High Res strategy. This field is not generated for
  documents or PDF pages that are partitioned by using the Fast or VLM strategy.
</Note>
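
As a minimal sketch of how you might consume this output, the following Python code reads the `text_as_html` field from one element, strips the surrounding Markdown `html` code fence shown in the example above, and parses the table into rows of cell text by using only the standard library. The names `strip_fence`, `TableRows`, and `table_rows` are hypothetical helpers written for this illustration and are not part of any Unstructured library. The sample element is a shortened, hand-written stand-in for real output.

```python
import re
from html.parser import HTMLParser

FENCE = "`" * 3  # Markdown code-fence marker, built this way to keep this block readable


def strip_fence(markup: str) -> str:
    """Remove a surrounding Markdown code fence (hypothetical helper), if present."""
    text = markup.strip()
    text = re.sub(r"^`{3}[A-Za-z]*\s*", "", text)  # opening fence plus optional language tag
    text = re.sub(r"\s*`{3}$", "", text)           # closing fence
    return text.strip()


class TableRows(HTMLParser):
    """Collect the text of each <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None  # text parts of the cell being read, or None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("th", "td"):
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            if not self.rows:  # tolerate a cell that appears outside any <tr>
                self.rows.append([])
            self.rows[-1].append("".join(self._cell).strip())
            self._cell = None


def table_rows(element: dict) -> list:
    """Return the table in element["metadata"]["text_as_html"] as a list of rows."""
    parser = TableRows()
    parser.feed(strip_fence(element["metadata"]["text_as_html"]))
    parser.close()
    return parser.rows


# Shortened, hand-written sample that mimics the structure of the example above.
element = {
    "type": "Table",
    "metadata": {
        "text_as_html": FENCE + "html\n<table>\n<tr>\n"
        "<th>Inhibitor concentration (g)</th>\n<th>bc (V/dec)</th>\n</tr>\n"
        "<tr>\n<td>0</td>\n<td>0.0335</td>\n</tr>\n</table>\n" + FENCE
    },
}

rows = table_rows(element)
print(rows)
# [['Inhibitor concentration (g)', 'bc (V/dec)'], ['0', '0.0335']]
```

This sketch assumes simple tables without nested tables or `rowspan`/`colspan` attributes. For more complex markup, a full HTML parsing library is likely a better fit.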

For workflows that use [chunking](/concepts/chunking), note the following changes:

* If a `Table` element must be chunked, the `Table` element is replaced by a set of related `TableChunk` elements.
* Each of these `TableChunk` elements will contain HTML table output for only its own element.
* None of these `TableChunk` elements will contain an `image_base64` field.
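
The chunking behavior described above means that downstream code should handle both element types and should not rely on `image_base64` being present. The following Python code is a minimal sketch of that idea. It assumes that each `TableChunk` element carries its HTML in the same `metadata.text_as_html` field as a `Table` element. The function name `collect_table_html` and the sample element IDs are hypothetical.

```python
def collect_table_html(elements):
    """Return (type, element_id, html, has_image) for each Table or TableChunk element."""
    results = []
    for el in elements:
        if el.get("type") not in ("Table", "TableChunk"):
            continue
        metadata = el.get("metadata", {})
        html = metadata.get("text_as_html")
        if html is None:
            continue  # no table-to-HTML output was generated for this element
        # image_base64 is absent on TableChunk elements, so test for it instead of indexing.
        has_image = "image_base64" in metadata
        results.append((el["type"], el.get("element_id"), html, has_image))
    return results


# Hand-written sample elements; the IDs and contents are placeholders.
elements = [
    {"type": "NarrativeText", "element_id": "a1", "text": "Intro", "metadata": {}},
    {"type": "TableChunk", "element_id": "b2",
     "metadata": {"text_as_html": "<table><tr><td>0</td></tr></table>"}},
    {"type": "TableChunk", "element_id": "c3",
     "metadata": {"text_as_html": "<table><tr><td>2</td></tr></table>"}},
]

found = collect_table_html(elements)
for kind, element_id, html, has_image in found:
    print(kind, element_id, has_image, html)
```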

## Generate table-to-HTML output

To have Unstructured generate table-to-HTML output, do the following:

* For **Unstructured UI** users, add an [Enrichment node](/ui/workflows#custom-workflow-node-types) of type **Table to HTML**
  to an Unstructured [custom workflow](/ui/workflows#create-a-custom-workflow).
* For **Unstructured API** users, add a [Table to HTML task](/api-reference/workflow/nodes/enrichment/enrichment-table-to-html).
  You add this task as either an object in a `workflow_nodes` array
  (for curl) or a `WorkflowNode` in a `WorkflowNodes` collection (for Python). This object or collection applies whenever you
  [create a workflow](/api-reference/api/workflow/create-workflow),
  [update a workflow](/api-reference/api/workflow/update-workflow), or
  [create an on-demand workflow job](/api-reference/api/job/create-job).

## Learn more

* <Icon icon="video" />  [How to Extract Data from Complex Tables](https://unstructured.io/events/how-to-extract-data-from-complex-tables)
* <Icon icon="blog" />  [The Case for HTML as the Canonical Representation in Document AI](https://unstructured.io/blog/the-case-for-html-as-the-canonical-representation-in-document-ai)
* <Icon icon="blog" />  [Preserving Table Structure for Better Retrieval](https://unstructured.io/blog/preserving-table-structure-for-better-retrieval)
