Type: chunk Subtype: chunk_by_character

Settings

unstructured_api_url
string
If specified, overrides the default API URL used for chunker calls. Default: none (uses Unstructured’s internal default).
unstructured_api_key
string
If specified, overrides the default API key used for chunker calls. Default: none (uses Unstructured’s internal default).
include_orig_elements
boolean
If true, the elements used to form a chunk appear in .metadata.orig_elements for that chunk. Default: false.
new_after_n_chars
integer
Soft maximum length of a chunk, in characters. The chunker closes a chunk once it reaches approximately this length. Default: none.
max_characters
integer
Hard maximum number of characters in a chunk. Default: none.
overlap
integer
Number of trailing characters from the prior text-split chunk to prepend to each subsequent chunk formed by splitting an oversized element. Default: none.
overlap_all
boolean
If true, also applies overlap to chunks formed by combining whole elements, not just to chunks produced by splitting oversized elements. Use with caution: this can introduce noise into otherwise clean semantic units. Default: false.
isolate_table
boolean
If true, each table is placed in its own dedicated chunk, separate from any surrounding text elements. If false, small tables may share a chunk with adjacent text, producing mixed chunks that contain both text and table content. Regardless of this setting, narrative overlap is never prepended to table-only chunks, and table overlap is never appended to following narrative chunks. Note that max_characters applies to a chunk’s visible .text content. When a table joins a mixed chunk with isolate_table set to false, the table’s metadata.text_as_html is not subject to the same size budget. If the table has a large HTML representation, the serialized chunk payload may exceed max_characters. Use isolate_table: true if you have strict payload size requirements. Default: true.
contextual_chunking_strategy
string
If specified, prepends chunk-specific explanatory context to each chunk. Allowed value: v1. Default: none.
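The way max_characters and overlap interact when an oversized element is split can be sketched in plain Python. This is a simplified illustration, not Unstructured's implementation: the function name split_oversized is hypothetical, it cuts at raw character offsets (a real chunker may prefer word boundaries), and it assumes that each chunk, including its overlap prefix, must fit within max_characters.

```python
def split_oversized(text: str, max_characters: int, overlap: int = 0) -> list[str]:
    """Hypothetical sketch: split one oversized element into overlapping chunks.

    Each chunk holds at most max_characters characters. Every chunk after the
    first begins with the last `overlap` characters of the previous chunk.
    """
    if max_characters <= 0:
        raise ValueError("max_characters must be positive")
    if not 0 <= overlap < max_characters:
        raise ValueError("overlap must be >= 0 and smaller than max_characters")

    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_characters, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        # Step back so the next chunk repeats the tail of this one.
        start = end - overlap
    return chunks


if __name__ == "__main__":
    for chunk in split_oversized("abcdefghij", max_characters=4, overlap=1):
        print(chunk)
```

With these inputs the chunks are "abcd", "defg", and "ghij": each one stays within the 4-character limit and repeats the final character of its predecessor.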
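# Assumes the unstructured-client Python SDK is installed.
from unstructured_client.models.shared import WorkflowNode

chunk_by_character_chunker_workflow_node = WorkflowNode(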
    name="Chunker",
    subtype="chunk_by_character",
    type="chunk",
    settings={
        "unstructured_api_url": None,
        "unstructured_api_key": None,
        "include_orig_elements": <True|False>,
        "new_after_n_chars": <new-after-n-chars>,
        "max_characters": <max-characters>,
        "overlap": <overlap>,
        "overlap_all": <True|False>,
        "isolate_table": <True|False>,
        "contextual_chunking_strategy": "<contextual-chunking-strategy>"
    }
)
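As a rough illustration, the placeholders in the template might be filled in as shown here. The values are arbitrary examples, not recommendations, and the check_settings helper is hypothetical: it only compares each value against the type documented in the Settings list and is not part of any SDK. Treating None as "use the default" is an assumption based on the template above.

```python
# Illustrative values only; none of these are recommended defaults.
example_settings = {
    "include_orig_elements": False,
    "new_after_n_chars": 1500,
    "max_characters": 2048,
    "overlap": 160,
    "overlap_all": False,
    "isolate_table": True,
    "contextual_chunking_strategy": "v1",
}

# Documented type of each setting, taken from the Settings list.
EXPECTED_TYPES = {
    "unstructured_api_url": str,
    "unstructured_api_key": str,
    "include_orig_elements": bool,
    "new_after_n_chars": int,
    "max_characters": int,
    "overlap": int,
    "overlap_all": bool,
    "isolate_table": bool,
    "contextual_chunking_strategy": str,
}


def check_settings(settings: dict) -> list[str]:
    """Hypothetical helper: return a list of type problems (empty if none)."""
    problems = []
    for key, value in settings.items():
        expected = EXPECTED_TYPES.get(key)
        if expected is None:
            problems.append(f"unknown setting: {key}")
        elif value is None:
            continue  # Assumption: None means "use the default".
        elif expected is int and isinstance(value, bool):
            # bool is a subclass of int in Python, so reject it explicitly.
            problems.append(f"{key}: expected integer, got boolean")
        elif not isinstance(value, expected):
            problems.append(f"{key}: expected {expected.__name__}")
    return problems


if __name__ == "__main__":
    print(check_settings(example_settings))
```

A dictionary like example_settings would then be passed as the settings argument of the WorkflowNode call in the template.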