EMBEDDINGS that will contain vector embeddings for the text in the table’s TEXT column.
The following Streamlit example app assumes that the EMBEDDINGS column contains 1,024-dimension vector embeddings and has a data type of VECTOR(FLOAT, 1024).
To create this table, you can create a custom Unstructured workflow that uses any supported source connector along with the Snowflake destination connector. Then run the workflow to generate the data and insert it into the target Snowflake table.
After the data is inserted into the target Snowflake table, you can run the following Snowflake SQL statement to generate 1,024-dimension vector embeddings for the text in the table’s TEXT column and insert them into the table’s EMBEDDINGS column. The model specified here for generating the vector embeddings is the same one that is used by the Streamlit example app:
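As a minimal sketch, assuming the table is named ELEMENTS, such a statement could look like the one produced by the Python helper below. The model name `snowflake-arctic-embed-l-v2.0` and the helper `build_embeddings_update` are assumptions made for illustration; substitute whichever 1,024-dimension model your Streamlit app uses.

```python
# Assumption: any model supported by EMBED_TEXT_1024 can be used here, as long
# as the Streamlit app uses the same one for the user's search queries.
EMBEDDING_MODEL = "snowflake-arctic-embed-l-v2.0"


def build_embeddings_update(table: str = "ELEMENTS", model: str = EMBEDDING_MODEL) -> str:
    """Return SQL that fills the EMBEDDINGS column from the TEXT column."""
    return (
        f"UPDATE {table}\n"
        f"SET EMBEDDINGS = SNOWFLAKE.CORTEX.EMBED_TEXT_1024('{model}', TEXT);"
    )


if __name__ == "__main__":
    # Print the statement so that it can be pasted into a Snowflake worksheet.
    print(build_embeddings_update())
```

Keeping the model name in one constant makes it harder for the embeddings stored in the table and the embeddings generated for search queries to drift apart.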
Create the Streamlit app
Unstructured Demo Streamlit App.
Add code to the Streamlit app
ELEMENTS table. The ELEMENTS table contains the data that was generated by Unstructured. The code uses the SNOWFLAKE.CORTEX.EMBED_TEXT_1024 function to generate vector embeddings for the user’s search query and the VECTOR_COSINE_SIMILARITY function to get the similarity between the vector embeddings for the user’s search query and the vector embeddings for the TEXT column for each row in the ELEMENTS table. The code then orders the results by similarity and limits the results to the row with the greatest similarity between the search query and the target text.
TEXT column from the top result and use it as context for the user’s search query.
SNOWFLAKE.CORTEX.COMPLETE function to generate a response to the user’s search query based on the context from the top result.
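The three steps described above can be sketched roughly as follows. This is an illustrative outline, not the app's actual code: the helper names `find_top_text` and `answer`, both model names, and the prompt wording are assumptions. The `session` argument is assumed to be a Snowpark session whose `sql` method accepts bound parameters.

```python
# Both model names are assumptions; use models available in your Snowflake
# account. The embedding model must match the one used to fill EMBEDDINGS.
EMBEDDING_MODEL = "snowflake-arctic-embed-l-v2.0"
COMPLETION_MODEL = "mistral-large2"

# Rank rows by cosine similarity to the query and keep only the best one.
# Rows without embeddings are excluded because NULL similarities would
# otherwise sort first in descending order.
SEARCH_SQL = f"""
SELECT
    TEXT,
    VECTOR_COSINE_SIMILARITY(
        EMBEDDINGS,
        SNOWFLAKE.CORTEX.EMBED_TEXT_1024('{EMBEDDING_MODEL}', ?)
    ) AS SIMILARITY
FROM ELEMENTS
WHERE EMBEDDINGS IS NOT NULL
ORDER BY SIMILARITY DESC
LIMIT 1
"""

COMPLETE_SQL = (
    f"SELECT SNOWFLAKE.CORTEX.COMPLETE('{COMPLETION_MODEL}', ?) AS RESPONSE"
)


def find_top_text(session, query):
    """Return the TEXT of the row most similar to the query, or None."""
    rows = session.sql(SEARCH_SQL, params=[query]).collect()
    return rows[0]["TEXT"] if rows else None


def answer(session, query):
    """Answer the query, using the most similar TEXT value as context."""
    context = find_top_text(session, query)
    if context is None:
        return "No matching text was found."
    prompt = (
        "Answer the question using only this context.\n\n"
        f"Context: {context}\n\nQuestion: {query}"
    )
    rows = session.sql(COMPLETE_SQL, params=[prompt]).collect()
    return rows[0]["RESPONSE"]
```

In a Streamlit in Snowflake app, the session would typically come from `get_active_session()` in `snowflake.snowpark.context`, and the query from a widget such as `st.text_input`. Binding the user's text as a parameter, rather than formatting it into the SQL string, avoids quoting problems in the query.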
Run the Streamlit app
TEXT column in the table.