# Choose an extraction method: LLM or Regex

The structured data extractor supports two extraction methods, **LLM** and **Regex**. Each suits different document types and use cases, and the extraction **method** setting determines how each field is populated.

## How each extraction method works

Use this table to compare the two methods at a high level — how each processes your documents, the schema format it expects, and what the output looks like.

|                  | **LLM**                                                                                                                                                                       | **Regex**                                                                                                                               |
| ---------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |
| Choose when      | Values depend on context; you need nested or typed fields                                                                                                                     | Values follow a stable, recognizable pattern (for example: invoice numbers, dates, phone numbers)                                       |
| How it works     | A model reads meaning from text and populates schema-defined fields with inferred values                                                                                      | The extractor scans partitioned text for named patterns and returns matched strings                                                     |
| Schema format    | JSON in [OpenAI Structured Outputs](https://platform.openai.com/docs/guides/structured-outputs#supported-schemas) format: named fields, types, descriptions, optional nesting | `name` / `pattern` pairs: a label and a regex for each capture field                                                                    |
| Output structure | Typed fields — objects, arrays, numbers, booleans, and strings. See [output examples](/concepts/structured-data-extractor/data-extractor#custom-defined-output).              | An array of matched substrings per pattern name. See [output examples](/concepts/structured-data-extractor/regex-options#regex-output). |
| Model selection  | The provider and model are configurable                                                                                                                                       | No model required — extraction uses a regex engine that matches patterns directly against partitioned text, not a language model        |
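
To make the Regex column concrete, here is a minimal sketch of applying `name` / `pattern` pairs to a text sample and collecting an array of matched substrings per pattern name. It uses Python's `re` module as a stand-in: the extractor's own regex engine and schema file layout may differ, and the field names, patterns, sample text, and `extract_with_regex` helper are all hypothetical.

```python
import re

# Hypothetical name/pattern pairs. The field names and patterns are
# illustrative only and are not taken from the Unstructured docs.
patterns = [
    {"name": "invoice_number", "pattern": r"INV-\d{6}"},
    {"name": "iso_date", "pattern": r"\d{4}-\d{2}-\d{2}"},
]


def extract_with_regex(text, patterns):
    """Return a dict mapping each pattern name to a list of matched substrings."""
    results = {}
    for entry in patterns:
        # group(0) is the full match, so capture groups inside a pattern
        # do not change what is returned.
        results[entry["name"]] = [
            m.group(0) for m in re.finditer(entry["pattern"], text)
        ]
    return results


sample_text = "Invoice INV-004217 issued 2024-03-15, due 2024-04-14. Ref INV-004218."
matches = extract_with_regex(sample_text, patterns)
print(matches)
# {'invoice_number': ['INV-004217', 'INV-004218'], 'iso_date': ['2024-03-15', '2024-04-14']}
```

Every value comes back as a string, and a pattern that matches nothing yields an empty list in this sketch. If you need numbers, booleans, or nested objects, that is a sign the LLM method is the better fit.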

<Tip>If both methods could fit, run a small sample with each and compare quality and maintenance cost before you standardize on one.</Tip>
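
For comparison, an LLM schema describes named, typed fields with descriptions and optional nesting. The sketch below builds one as a Python dictionary and serializes it to JSON. It follows the general rules of the OpenAI Structured Outputs format (every property listed in `required`, `additionalProperties` set to `false` on each object); the invoice fields are hypothetical, and whether the extractor expects any wrapper around this object is not covered here, so check the LLM options page before uploading.

```python
import json

# Hypothetical invoice schema. Field names and descriptions are illustrative.
invoice_schema = {
    "type": "object",
    "properties": {
        "vendor_name": {
            "type": "string",
            "description": "Legal name of the vendor that issued the invoice",
        },
        "total_amount": {
            "type": "number",
            "description": "Invoice total, including tax",
        },
        "is_paid": {
            "type": "boolean",
            "description": "Whether the invoice is marked as paid",
        },
        # Nested field: an array of objects, one per billed item.
        "line_items": {
            "type": "array",
            "description": "One entry per billed item",
            "items": {
                "type": "object",
                "properties": {
                    "item": {"type": "string", "description": "Item description"},
                    "quantity": {"type": "number", "description": "Units billed"},
                },
                "required": ["item", "quantity"],
                "additionalProperties": False,
            },
        },
    },
    "required": ["vendor_name", "total_amount", "is_paid", "line_items"],
    "additionalProperties": False,
}

# Serialize to JSON text, for example to save as a file for upload.
schema_json = json.dumps(invoice_schema, indent=2)
print(schema_json)
```

Running a small sample through a schema like this and through an equivalent set of regex patterns gives you a direct way to compare typed, context-aware output against plain matched strings.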

## Available options

The following table shows which options are available for each method. Links go to the detail pages where each option is described.

|                                                                                                        | **LLM**                                                                                    | **Regex**                                                                         |
| ------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------- |
| Model selection — choose the LLM provider and model that powers your extraction                        | [Yes](/concepts/structured-data-extractor/llm-options#select-llm-provider-and-model)       | No                                                                                |
| Visual schema builder and JSON upload / export — build your schema visually or import from a JSON file | [Yes](/concepts/structured-data-extractor/llm-options#upload-a-json-file)                  | [Yes](/concepts/structured-data-extractor/regex-options#define-your-schema)       |
| Schema-only output toggle — return extracted fields only, without Unstructured document elements       | [Yes](/concepts/structured-data-extractor/llm-options#schema-only-output-llm)              | [Yes](/concepts/structured-data-extractor/regex-options#schema-only-output-regex) |
| Schema prompt — generate a schema from plain-language instructions                                     | [Yes](/concepts/structured-data-extractor/llm-options#prompt-a-schema)                     | No                                                                                |
| Extraction guidance — instruct the LLM how to format or normalize extracted values                     | [Yes](/concepts/structured-data-extractor/llm-options#extraction-guidance-workflow-editor) | No                                                                                |

## Next steps

**To learn more about the options for each method:**

* [Structured extraction with LLM](/concepts/structured-data-extractor/llm-options) — schema definition, model selection, schema prompt, and extraction guidance
* [Structured extraction with Regex](/concepts/structured-data-extractor/regex-options) — schema definition, validation behavior, output examples, pattern examples, and tools for testing your patterns

**To go straight to step-by-step procedures for using either method:**

* [Using the structured data extractor](/concepts/structured-data-extractor/using-the-structured-data-extractor)
