After partitioning and chunking, the summarization step generates text summaries of images and tables. The summaries are produced by either the GPT-4o or the Claude 3.5 Sonnet model.

Here is an example of image summarization output produced with GPT-4o. Note the added text field, which contains the summary. Line breaks have been inserted here for readability; the actual output does not contain them.

{
    "type": "Image",
    "element_id": "3303aa13098f5a26b9845bd18ee8c881",
    "text": "{\n  \"type\": \"graph\",\n  \"description\": \"The graph shows 
        the relationship between Potential (V) and Current Density (A/cm2). 
        The x-axis is labeled 'Current Density (A/cm2)' and ranges from 
        0.0000001 to 0.1. The y-axis is labeled 'Potential (V)' and ranges 
        from -2.5 to 1.5. There are six different data series represented 
        by different colors: blue (10g), red (4g), green (6g), purple (2g), 
        orange (Control), and light blue (8g). The data points for each series 
        show how the potential changes with varying current density.\"\n}",
    "metadata": {
        "filetype": "application/pdf",
        "languages": [
            "eng"
        ],
        "page_number": 1,
        "image_base64": "/9j...<full results omitted for brevity>...Q==",
        "image_mime_type": "image/jpeg",
        "filename": "7f239e1d4ef3556cc867a4bd321bbc41.pdf",
        "data_source": {}
    }
}
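In the image example above, the text field is itself a JSON-encoded string with type and description keys. The following is a minimal sketch of how a consumer might pull the description out of such an element. The helper name image_description and the sample element are hypothetical, and the sketch assumes that not every summary is guaranteed to be valid JSON, so it falls back to the raw text.

```python
import json


def image_description(element: dict) -> str:
    """Return the description from an Image element's summary text.

    Hypothetical helper: assumes the summary may be a JSON object with a
    "description" key (as in the example above), and falls back to the
    raw text when it is not.
    """
    text = element.get("text", "")
    try:
        summary = json.loads(text)
    except json.JSONDecodeError:
        # The summary is plain prose rather than JSON.
        return text
    if isinstance(summary, dict):
        return summary.get("description", text)
    return text


# Abbreviated, made-up element shaped like the example output.
sample = {
    "type": "Image",
    "text": "{\n  \"type\": \"graph\",\n  \"description\": \"The graph shows potential versus current density.\"\n}",
}
print(image_description(sample))
```

The fallback keeps the helper usable if a model returns an unstructured summary.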

Here is an example of table summarization output produced with GPT-4o. Note the added text field, which contains the summary. Line breaks have been inserted here for readability; the actual output does not contain them.

{
    "type": "Table",
    "element_id": "5713c0e90194ac7f0f2c60dd614bd24d",
    "text": "The table consists of 6 rows and 7 columns. The columns represent 
        inhibitor concentration (g), bc (V/dec), ba (V/dec), Ecorr (V), icorr 
        (A/cm\u00b2), polarization resistance (\u03a9), and corrosion rate 
        (mm/year). As the inhibitor concentration increases, the corrosion 
        rate generally decreases, indicating the effectiveness of the 
        inhibitor. Notably, the polarization resistance increases with higher 
        inhibitor concentrations, peaking at 6 grams before slightly 
        decreasing. This suggests that the inhibitor is most effective at 
        6 grams, significantly reducing the corrosion rate and increasing 
        polarization resistance. The data provides valuable insights into the 
        optimal concentration of the inhibitor for corrosion prevention.",
    "metadata": {
        "text_as_html": "<table>...<full results omitted for brevity>...</table>",
        "filetype": "application/pdf",
        "languages": [
            "eng"
        ],
        "page_number": 1,
        "image_base64": "/9j...<full results omitted for brevity>...//Z",
        "image_mime_type": "image/jpeg",
        "filename": "7f239e1d4ef3556cc867a4bd321bbc41.pdf",
        "data_source": {}
    }
}
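Because the table summary is stored as plain prose in the text field, collecting summaries from a list of output elements only requires filtering on type. The following is a minimal sketch that assumes the output has already been loaded into a list of dictionaries shaped like the example above. The function name table_summaries and the sample data are hypothetical.

```python
def table_summaries(elements: list) -> list:
    """Collect the summary text and page number of every Table element.

    Hypothetical helper: assumes each element is a dict with "type",
    "text", and "metadata" keys, as in the example output above.
    """
    results = []
    for element in elements:
        if element.get("type") != "Table":
            continue
        metadata = element.get("metadata", {})
        results.append(
            {
                "element_id": element.get("element_id"),
                "page_number": metadata.get("page_number"),
                "summary": element.get("text", ""),
            }
        )
    return results


# Abbreviated, made-up elements shaped like the example output.
elements = [
    {"type": "Image", "element_id": "a1", "text": "{}", "metadata": {"page_number": 1}},
    {
        "type": "Table",
        "element_id": "b2",
        "text": "The table consists of 6 rows and 7 columns.",
        "metadata": {"page_number": 1, "text_as_html": "<table></table>"},
    },
]
for row in table_summaries(elements):
    print(row["page_number"], row["summary"])
```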

Summarize images and tables

You can change a workflow’s summarization settings only through Custom workflow settings.

To summarize images and tables, specify the following in the Enrichment model section of an Enrichment node in a workflow:

For image summarization, choose one of the following:

  • OpenAI Image Description: Use GPT-4o to summarize images.
  • Anthropic Image Description: Use Claude 3.5 Sonnet to summarize images.

For table summarization, choose one of the following:

  • OpenAI Table Description: Use GPT-4o to summarize tables.
  • Anthropic Table Description: Use Claude 3.5 Sonnet to summarize tables.