CompositeElement
: Any text element becomes a CompositeElement after chunking. A composite element can be a combination of two or more original text elements that together fit within the max characters setting. It can also be a single element that leaves no room in the chunk for any others but fits by itself. Or it can be a fragment of an original text element that was too big to fit in one chunk and required splitting.

Table
: A table element is not combined with other elements. If it fits within the max characters setting, it remains as is.

TableChunk
: Large tables that exceed the max characters setting are split into special TableChunk elements.

metadata field’s orig_elements field for that chunk.
This setting applies to all of the chunking strategies.
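The CompositeElement, Table, and TableChunk behavior described above can be pictured with a minimal sketch. This is not the product's implementation: the `Element` and `Chunk` classes, the `chunk_elements` and `split_text` functions, the blank-line separator, and the fixed-width splitting of oversized text are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    kind: str  # "Text" or "Table" (hypothetical labels for this sketch)
    text: str


@dataclass
class Chunk:
    kind: str  # "CompositeElement", "Table", or "TableChunk"
    text: str
    orig_elements: list = field(default_factory=list)


def split_text(text, max_characters):
    """Cut text into consecutive pieces of at most max_characters each."""
    return [text[i:i + max_characters] for i in range(0, len(text), max_characters)]


def chunk_elements(elements, max_characters):
    chunks = []
    pending = []  # text elements waiting to be combined into one chunk

    def flush():
        if pending:
            text = "\n\n".join(e.text for e in pending)
            chunks.append(Chunk("CompositeElement", text, list(pending)))
            pending.clear()

    for el in elements:
        if el.kind == "Table":
            flush()  # tables are never combined with other elements
            if len(el.text) <= max_characters:
                chunks.append(Chunk("Table", el.text, [el]))
            else:
                for piece in split_text(el.text, max_characters):
                    chunks.append(Chunk("TableChunk", piece, [el]))
        elif len(el.text) > max_characters:
            flush()  # an oversized text element is split into fragments
            for piece in split_text(el.text, max_characters):
                chunks.append(Chunk("CompositeElement", piece, [el]))
        else:
            # Length of the pending texts, their separators, and the new text.
            joined = sum(len(e.text) for e in pending) + 2 * len(pending) + len(el.text)
            if pending and joined > max_characters:
                flush()
            pending.append(el)
    flush()
    return chunks
```

With `max_characters=10`, two four-character text elements combine into one CompositeElement, a five-character table stays a Table, and a 25-character table becomes three TableChunk elements.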
0.0 and 1.0, exclusive (0.01 to 0.99). The default is 0.5 if not otherwise specified.
To specify this setting, enter a number into the Similarity threshold field.
This setting applies only to the chunking strategy Chunk by similarity.
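The range and default described above can be expressed as a small validation helper. This is only a sketch: the function name `resolve_similarity_threshold` and the constant name are hypothetical and do not come from the product.

```python
DEFAULT_SIMILARITY_THRESHOLD = 0.5  # default stated in the text above


def resolve_similarity_threshold(value=None):
    """Return a usable threshold, falling back to the default when unset."""
    if value is None:
        return DEFAULT_SIMILARITY_THRESHOLD
    # The allowed range is between 0.0 and 1.0, exclusive.
    if not 0.0 < value < 1.0:
        raise ValueError("similarity threshold must be between 0.0 and 1.0, exclusive")
    return value
```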
Prefix: and ends with a semicolon (;).
The chunk’s original content begins with Original:.
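When consuming chunks downstream, it can be useful to separate the contextual prefix from the original content. The following is a minimal sketch that assumes the chunk text has the shape `Prefix: <context>; Original: <content>`. The function name `split_contextual_chunk` is hypothetical, and the sample strings in the usage note are made up for illustration.

```python
def split_contextual_chunk(text):
    """Return (context, original); context is None if no prefix is present."""
    if not text.startswith("Prefix:"):
        return None, text
    # Assumption: the first "Original:" marker separates the two parts.
    head, marker, tail = text.partition("Original:")
    if not marker:
        return None, text
    context = head[len("Prefix:"):].strip()
    if context.endswith(";"):
        context = context[:-1].rstrip()  # drop the closing semicolon
    return context, tail.strip()
```

For instance, `split_contextual_chunk("Prefix: Summary of the report.; Original: Revenue rose.")` yields the context and the original content as separate strings, while text without a prefix is returned unchanged as the original content.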
For example, without contextual chunking applied, elements would be generated similar to the following.
Line breaks have been inserted here for readability. The output will not contain these line breaks: