Type: chunk
Subtype: chunk_by_similarity
Settings
If specified, overrides the default API URL used for chunker calls. Default: none (uses Unstructured’s internal default).
If specified, overrides the default API key used for chunker calls. Default: none (uses Unstructured’s internal default).
If true, the elements used to form a chunk appear in .metadata.orig_elements for that chunk. Default: false.
Soft maximum length of a chunk in characters. Closes a section after reaching approximately this length. Default: none.
Hard maximum number of characters in a chunk. Default: none.
Number of trailing characters from the prior text-split chunk to prepend to each subsequent chunk formed by splitting an oversized element. Default: none.
If true, applies overlap to chunks formed by combining whole elements, not just oversized ones. Use with caution: this can introduce noise into otherwise clean semantic units. Default: false.
If specified, prepends chunk-specific explanatory context to each chunk. Allowed value: v1. Default: none.
Minimum similarity score required for consecutive elements to be combined into the same chunk. Must be between 0.0 and 1.0 exclusive (i.e., 0.01 to 0.99). Default: none.
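To make the similarity and hard-maximum settings more concrete, here is a minimal Python sketch of how consecutive elements might be combined when their similarity score meets the minimum. It is an illustration only, not Unstructured's implementation: the function names (cosine_similarity, group_by_similarity), the parameter names, the use of cosine similarity, the blank-line separator, and the toy two-dimensional vectors are all assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def group_by_similarity(texts, vectors, similarity_threshold, max_characters=None):
    """Hypothetical helper: combine consecutive texts into chunks.

    A text joins the current chunk only if it is similar enough to the text
    immediately before it AND the combined chunk stays within max_characters
    (when a hard maximum is given). Otherwise a new chunk is started.
    """
    if not 0.0 < similarity_threshold < 1.0:
        raise ValueError("similarity_threshold must be between 0.0 and 1.0, exclusive")
    if len(texts) != len(vectors):
        raise ValueError("texts and vectors must have the same length")

    separator = "\n\n"  # assumption: elements are joined with a blank line
    chunks = []
    current = []
    for i, text in enumerate(texts):
        if current:
            similar = cosine_similarity(vectors[i - 1], vectors[i]) >= similarity_threshold
            combined_length = len(separator.join(current + [text]))
            fits = max_characters is None or combined_length <= max_characters
            if not (similar and fits):
                # Close the current chunk and start a new one with this text.
                chunks.append(separator.join(current))
                current = []
        current.append(text)
    if current:
        chunks.append(separator.join(current))
    return chunks


# Toy data: the first two sentences point in nearly the same direction,
# the third is almost orthogonal to them.
texts = ["Cats purr.", "Cats nap.", "Stocks fell."]
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]

by_topic = group_by_similarity(texts, vectors, similarity_threshold=0.5)
with_hard_max = group_by_similarity(texts, vectors, similarity_threshold=0.5, max_characters=15)
```

In this sketch, the first call combines the two cat sentences into one chunk and leaves the third on its own. The second call keeps all three separate, because combining the first two would exceed the 15-character hard maximum. A real chunker would obtain the vectors from an embedding model rather than from hand-written values.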
