Type: chunk
Subtype: chunk_by_character

Settings

unstructured_api_url
string
If specified, overrides the default API URL used for chunker calls. Default: none (uses Unstructured’s internal default).
unstructured_api_key
string
If specified, overrides the default API key used for chunker calls. Default: none (uses Unstructured’s internal default).
include_orig_elements
boolean
If true, the elements used to form a chunk appear in .metadata.orig_elements for that chunk. Default: false.
new_after_n_chars
integer
Soft maximum length of a chunk, in characters. The chunker closes the current chunk once it reaches approximately this length. Default: none.
max_characters
integer
Hard maximum number of characters in a chunk. Default: none.
overlap
integer
Number of trailing characters from the prior text-split chunk to prepend to each subsequent chunk formed by splitting an oversized element. Default: none.
overlap_all
boolean
If true, also applies overlap to chunks formed by combining whole elements, not only to chunks produced by splitting oversized elements. Use with caution, because this can introduce noise into otherwise clean semantic units. Default: false.
contextual_chunking_strategy
string
If specified, prepends chunk-specific explanatory context to each chunk. Allowed value: v1. Default: none.
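
To make the interaction between max_characters and overlap concrete, the following is a minimal sketch of splitting one oversized piece of text. It is not Unstructured's implementation: the function name split_with_overlap is hypothetical, and the sketch cuts at fixed character positions, whereas a real chunker would normally prefer to break on whitespace or other separators.

```python
def split_with_overlap(text: str, max_characters: int, overlap: int = 0) -> list[str]:
    """Toy illustration only (hypothetical helper, not part of any SDK).

    Splits `text` into pieces of at most `max_characters` characters, where
    each piece after the first begins with the last `overlap` characters of
    the previous piece.
    """
    if max_characters <= 0:
        raise ValueError("max_characters must be positive")
    if not 0 <= overlap < max_characters:
        raise ValueError("overlap must be >= 0 and smaller than max_characters")

    chunks = []
    step = max_characters - overlap  # how far the window advances each time
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_characters])
        if start + max_characters >= len(text):
            break  # the window already reached the end of the text
        start += step
    return chunks


pieces = split_with_overlap("abcdefghij", max_characters=4, overlap=1)
print(pieces)  # ['abcd', 'defg', 'ghij']
```

In this sketch every piece respects the hard limit, and the repeated characters ("d", then "g") play the role of the overlap described above.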
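
# Requires the Unstructured Python SDK (the unstructured-client package).
from unstructured_client.models.shared import WorkflowNode

# Replace each <placeholder> below with a real value before running.
chunk_by_character_chunker_workflow_node = WorkflowNode(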
    name="Chunker",
    subtype="chunk_by_character",
    type="chunk",
    settings={
        "unstructured_api_url": None,
        "unstructured_api_key": None,
        "include_orig_elements": <True|False>,
        "new_after_n_chars": <new-after-n-chars>,
        "max_characters": <max-characters>,
        "overlap": <overlap>,
        "overlap_all": <True|False>,
        "contextual_chunking_strategy": "<contextual-chunking-strategy>"
    }
)
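
As an illustration of filling in the template, here is a sketch of a settings dictionary with concrete values, together with a small pre-flight check. The values (1500, 2000, 100) are arbitrary examples, not recommendations. The helper check_chunker_settings is hypothetical and not part of the SDK, and the rules it enforces are assumptions inferred from the setting descriptions (a soft limit above the hard limit, or an overlap as large as a chunk, would not be meaningful).

```python
from typing import Any

# Per the settings list: the only allowed value is "v1"; None means "not set".
ALLOWED_CONTEXTUAL_STRATEGIES = {None, "v1"}


def check_chunker_settings(settings: dict[str, Any]) -> list[str]:
    """Hypothetical sanity check. Returns a list of problems (empty if none)."""
    problems = []
    max_chars = settings.get("max_characters")
    soft_max = settings.get("new_after_n_chars")
    overlap = settings.get("overlap")

    if max_chars is not None and max_chars <= 0:
        problems.append("max_characters must be positive")
    if soft_max is not None and max_chars is not None and soft_max > max_chars:
        problems.append("new_after_n_chars exceeds max_characters")
    if overlap is not None and overlap < 0:
        problems.append("overlap must not be negative")
    if overlap is not None and max_chars is not None and overlap >= max_chars:
        problems.append("overlap must be smaller than max_characters")
    if settings.get("contextual_chunking_strategy") not in ALLOWED_CONTEXTUAL_STRATEGIES:
        problems.append("contextual_chunking_strategy must be 'v1' or omitted")
    return problems


# Illustrative values only.
example_settings = {
    "unstructured_api_url": None,
    "unstructured_api_key": None,
    "include_orig_elements": False,
    "new_after_n_chars": 1500,
    "max_characters": 2000,
    "overlap": 100,
    "overlap_all": False,
    "contextual_chunking_strategy": "v1",
}

print(check_chunker_settings(example_settings))  # []
```

A dictionary that passes the check could then be supplied as the settings argument of the WorkflowNode shown above.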