Models
Depending on your needs, Unstructured provides OCR-based and Transformer-based models for detecting elements in documents. These models are useful for detecting complex document layouts and predicting element types.
Basic usage:
- To use any model with the partition function, set the strategy to hi_res.
- To maintain consistency between the unstructured and unstructured-api libraries, we are deprecating the model_name parameter. Please use the hi_res_model_name parameter when specifying a model.
The hi_res_model_name parameter supports the yolox and detectron2_onnx arguments.
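As a minimal sketch of the basic usage described above, the snippet below builds the keyword arguments for a hi_res partition call and passes them to partition. The helper names hi_res_kwargs and partition_hi_res are hypothetical and exist only for illustration; the sketch assumes the unstructured package and its hi_res dependencies are installed, and that partition is imported from unstructured.partition.auto.

```python
# Model names this section lists as supported by hi_res_model_name.
SUPPORTED_MODELS = ("yolox", "detectron2_onnx")


def hi_res_kwargs(model_name):
    """Build keyword arguments for partition (hypothetical helper)."""
    if model_name not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model_name!r}")
    # Use hi_res_model_name rather than the deprecated model_name parameter.
    return {"strategy": "hi_res", "hi_res_model_name": model_name}


def partition_hi_res(filename, model_name):
    """Partition a document with the chosen model (hypothetical wrapper)."""
    # Imported lazily so this module loads even if unstructured is missing.
    from unstructured.partition.auto import partition

    return partition(filename=filename, **hi_res_kwargs(model_name))
```

Calling partition_hi_res with a path to a local document and one of the two model names should return the list of detected elements. The validation step is a design choice of this sketch, not a behavior of the library.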
Using a Non-Default Model
Unstructured will download the model specified in the UNSTRUCTURED_HI_RES_MODEL_NAME environment variable. If the variable is not defined, it will download the default model.
There are three ways to use a non-default model:
- Store the model name in the environment variable.
- Pass the model name to the partition function.
- Use the unstructured-inference library.
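The first option, storing the model name in the environment variable, might look like the sketch below. The helpers set_hi_res_model and partition_with_env_model are hypothetical names used only for illustration. Setting the variable before unstructured is imported is an assumption made here to be safe; this section does not state when the variable is read.

```python
import os

# Environment variable named in this section.
ENV_VAR = "UNSTRUCTURED_HI_RES_MODEL_NAME"


def set_hi_res_model(model_name):
    """Store the model name in the environment; return the previous value."""
    previous = os.environ.get(ENV_VAR)
    os.environ[ENV_VAR] = model_name
    return previous


def partition_with_env_model(filename, model_name):
    """Partition a document using the model named in the environment."""
    # Assumption: set the variable before importing unstructured.
    set_hi_res_model(model_name)
    from unstructured.partition.auto import partition

    # No hi_res_model_name is passed, so the environment variable applies.
    return partition(filename=filename, strategy="hi_res")
```

The variable can also be exported in the shell before starting Python, which avoids any question about import order.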