Embedding configuration
The embedding configuration selects which embedder creates vectors from your data and sets that embedder's parameters. You can choose among several embedding providers and models, which lets you match the embedding process to your data and downstream applications so that the generated vectors capture the semantic relationships and context in the dataset.
Configs
- api_key: The API key to use, if one is required to generate the embeddings through an API service, such as OpenAI.
- aws_access_key_id: The AWS access key ID to be used for AWS-based embedders, such as Amazon Bedrock.
- aws_region: The AWS Region ID to be used for AWS-based embedders, such as Amazon Bedrock.
- aws_secret_access_key: The AWS secret access key to be used for AWS-based embedders, such as Amazon Bedrock.
- embedding_provider: The embedding provider to use while doing embedding. Available values include aws-bedrock, azure-openai, huggingface, mixedbread-ai, octoai, openai, togetherai, vertexai, and voyageai.
- embedding_api_key: The API key to use, if one is required to generate the embeddings through an API service, such as OpenAI.
- embedding_aws_access_key_id: The AWS access key ID to be used for AWS-based embedders, such as Amazon Bedrock.
- embedding_aws_region: The AWS Region ID to be used for AWS-based embedders, such as Amazon Bedrock.
- embedding_aws_secret_access_key: The AWS secret access key to be used for AWS-based embedders, such as Amazon Bedrock.
- embedding_model_name: The specific model to use for the embedding provider, if necessary.
- model_name: The specific model to use for the embedding provider, if necessary.
- provider: The embedding provider to use while doing embedding. Available values include aws-bedrock, azure-openai, huggingface, mixedbread-ai, octoai, openai, togetherai, vertexai, and voyageai.
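As an illustration of how these settings fit together, the following is a minimal sketch in Python. The `EmbeddingConfig` class and its `to_options` method are hypothetical names invented for this example, not part of any library's API; only the field names and the provider values are taken from the list above.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional

# Provider values listed in the documentation above.
SUPPORTED_PROVIDERS = {
    "aws-bedrock", "azure-openai", "huggingface", "mixedbread-ai",
    "octoai", "openai", "togetherai", "vertexai", "voyageai",
}


@dataclass
class EmbeddingConfig:
    """Hypothetical container for the settings above (illustration only)."""

    provider: str
    model_name: Optional[str] = None
    api_key: Optional[str] = None
    aws_access_key_id: Optional[str] = None
    aws_secret_access_key: Optional[str] = None
    aws_region: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject provider values that are not in the documented list.
        if self.provider not in SUPPORTED_PROVIDERS:
            raise ValueError(f"Unsupported embedding provider: {self.provider!r}")

    def to_options(self) -> Dict[str, str]:
        # Keep only the settings that were actually supplied.
        return {k: v for k, v in asdict(self).items() if v is not None}


# Example: an API-key-based provider needs no AWS settings.
config = EmbeddingConfig(provider="openai", api_key="<your-api-key>")
print(config.to_options())
```

Dropping unset values keeps the resulting options limited to what the chosen provider needs, for example the AWS fields for aws-bedrock or an API key for openai.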
The default model_name values for each provider, unless otherwise specified, are:
- aws-bedrock: None
- azure-openai: text-embedding-ada-002, with 1536 dimensions
- huggingface: sentence-transformers/all-MiniLM-L6-v2, with 384 dimensions
- mixedbread-ai: mixedbread-ai/mxbai-embed-large-v1, with 1024 dimensions
- octoai: thenlper/gte-large, with 1024 dimensions
- openai: text-embedding-ada-002, with 1536 dimensions
- togetherai: togethercomputer/m2-bert-80M-8k-retrieval, with 768 dimensions
- vertexai: textembedding-gecko@001, with 768 dimensions
- voyageai: None
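The defaults above can be expressed as a simple lookup. This is a sketch only: the `DEFAULT_MODELS` table and the `resolve_model` helper are hypothetical names for this example, and the fallback behavior shown (an explicit model_name wins, otherwise the provider's default is used) is an assumption about how such defaults are typically applied.

```python
from typing import Dict, Optional, Tuple

# (default model name, vector dimensions) per provider, from the list above.
# None means the documentation lists no default model for that provider.
DEFAULT_MODELS: Dict[str, Optional[Tuple[str, int]]] = {
    "aws-bedrock": None,
    "azure-openai": ("text-embedding-ada-002", 1536),
    "huggingface": ("sentence-transformers/all-MiniLM-L6-v2", 384),
    "mixedbread-ai": ("mixedbread-ai/mxbai-embed-large-v1", 1024),
    "octoai": ("thenlper/gte-large", 1024),
    "openai": ("text-embedding-ada-002", 1536),
    "togetherai": ("togethercomputer/m2-bert-80M-8k-retrieval", 768),
    "vertexai": ("textembedding-gecko@001", 768),
    "voyageai": None,
}


def resolve_model(provider: str, model_name: Optional[str] = None) -> Optional[str]:
    """Return the explicit model name if given, else the provider's default."""
    if provider not in DEFAULT_MODELS:
        raise ValueError(f"Unknown embedding provider: {provider!r}")
    if model_name is not None:
        return model_name
    default = DEFAULT_MODELS[provider]
    return default[0] if default is not None else None


# Example: look up the default model for a provider.
print(resolve_model("huggingface"))
```

For providers whose default is None (aws-bedrock and voyageai), this helper returns None, so a caller would need to pass model_name explicitly.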