Streamlit in Snowflake
Streamlit is an open-source Python framework for data scientists and AI/ML engineers to deliver dynamic data apps with only a few lines of code.
Streamlit in Snowflake enables data scientists and Python developers to combine Streamlit’s component-rich, open-source Python library with the scale, performance and security of the Snowflake platform. Streamlit Python scripts can define user interface (UI) components such as filters, graphs, sliders, and more to interact with your data.
In this example, you use Snowsight in your Snowflake account to create a simple Streamlit app that uses Snowflake Cortex Search for RAG to ask natural-language questions about an existing table in your Snowflake account. This table contains data that was generated by Unstructured. Answers are returned in natural-language, chatbot-style format.
Prerequisites
- A table in Snowflake that contains data that was generated by Unstructured. The target Snowflake table must have a column named EMBEDDINGS that will contain vector embeddings for the text in the table’s TEXT column. The following Streamlit example app assumes that the EMBEDDINGS column contains 1,024-dimensional vector embeddings and has a data type of VECTOR(FLOAT, 1024).
  To create this table, you can create a custom Unstructured workflow that uses any supported source connector along with the Snowflake destination connector. Then run the workflow to generate the data and insert it into the target Snowflake table.
  After the data is inserted into the target Snowflake table, you can run a Snowflake SQL statement that generates the 1,024-dimensional vector embeddings for the text in the table’s TEXT column and inserts them into the table’s EMBEDDINGS column. The model specified in that statement is the same one that is used by the Streamlit example app.
  To learn how to run Snowflake SQL statements, see for example Querying data using worksheets.
- You must have the appropriate privileges to create and use a Streamlit app in your Snowflake account. These privileges include ones for the target table’s parent database and schema as well as the Snowflake warehouse that runs the Streamlit app. For details, see Getting started with Streamlit in Snowflake.
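The embedding step from the first prerequisite can be sketched as a small Python helper that builds the SQL text. This is only an illustration: the helper name build_embedding_update_sql is hypothetical, the table name ELEMENTS is taken from the app steps later on this page, and the model name is an assumption (one of the models that SNOWFLAKE.CORTEX.EMBED_TEXT_1024 accepts), not necessarily the model the example app uses.

```python
def build_embedding_update_sql(table="ELEMENTS",
                               model="snowflake-arctic-embed-l-v2.0"):
    """Return an UPDATE statement that fills EMBEDDINGS from TEXT.

    The default model name is an assumption; substitute whichever
    EMBED_TEXT_1024 model your Streamlit app uses for search queries.
    """
    return (
        f"UPDATE {table} "
        f"SET EMBEDDINGS = SNOWFLAKE.CORTEX.EMBED_TEXT_1024('{model}', TEXT)"
    )


# Print the statement so it can be pasted into a Snowsight worksheet.
print(build_embedding_update_sql())
```

Whichever model you choose, use the same one for the table and for the search query. Vectors produced by different embedding models are not comparable.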
Create and run the example app
Create the Streamlit app
- In Snowsight for your Snowflake account, on the sidebar, click Projects > Streamlit.
- Click + Streamlit App.
- For App title, enter a name for your app, such as Unstructured Demo Streamlit App.
- For App location, choose the target database and schema to store the app in.
- For App warehouse, choose the warehouse that you want to use to run your app and execute its queries.
- Click Create.
Add code to the Streamlit app
In this step, you add Python code to the Streamlit app that you created in the previous step.
This step explains each part of the code as you add it. If you want to skip past these explanations, add the code in the complete code example all at once, and then skip ahead to the next step, “Run the Streamlit app.”
- Import Python dependencies that get the current connection to the Snowflake database and schema and get Streamlit functions and features.
- Get the current connection to the Snowflake database and schema.
- Display the title of the app in the Streamlit UI, and get the user’s search query from the Streamlit UI.
- After the user enters a search query, display a progress indicator in the UI.
- Use the user’s search query to get the top result from the ELEMENTS table. The ELEMENTS table contains the data that was generated by Unstructured. The code uses the SNOWFLAKE.CORTEX.EMBED_TEXT_1024 function to generate vector embeddings for the user’s search query and the VECTOR_COSINE_SIMILARITY function to get the similarity between the vector embeddings for the user’s search query and the vector embeddings for the TEXT column for each row in the ELEMENTS table. The code then orders the results by similarity and limits the results to the row with the greatest similarity between the search query and the target text.
- Get the TEXT column from the top result and use it as context for the user’s search query.
- Use the user’s search query and the context from the top result to get a response from Snowflake Cortex Search for RAG. The code uses the SNOWFLAKE.CORTEX.COMPLETE function to generate a response to the user’s search query based on the context from the top result.
- Display the generated response in the Streamlit UI.
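To make the retrieval step above concrete, the following sketch reproduces locally what ordering by cosine similarity and keeping the top row amounts to. It is an illustration only, not Snowflake’s implementation: the function names are hypothetical, and the toy 3-dimensional vectors and sample sentences stand in for the real 1,024-dimensional embeddings and table rows.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_match(query_vec, rows):
    """Return the text whose embedding is most similar to query_vec.

    rows is a list of (text, embedding) pairs, mirroring the TEXT and
    EMBEDDINGS columns of the table.
    """
    return max(rows, key=lambda row: cosine_similarity(query_vec, row[1]))[0]


# Hypothetical sample data: toy vectors instead of real embeddings.
rows = [
    ("Invoices are due in 30 days.", [0.9, 0.1, 0.0]),
    ("The office closes at 6 pm.", [0.0, 0.2, 0.9]),
]
print(top_match([1.0, 0.0, 0.0], rows))  # prints "Invoices are due in 30 days."
```

In the app, the equivalent work happens inside Snowflake, so the embeddings never leave the warehouse.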
Run the Streamlit app
- In the upper right corner, click Run.
- For Enter your search query, enter a natural-language question about the TEXT column in the table.
- Press Enter.
Snowflake Cortex Search for RAG returns its answer to your question in natural-language, chatbot-style format.
Complete code example
The full code example for the Streamlit app is as follows:
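A minimal sketch of such an app, following the steps in “Add code to the Streamlit app,” might look like the code below. Several details are assumptions rather than the original listing: the two model names, the prompt wording, the spinner text, and the helper names build_prompt, answer_question, and run_app are all hypothetical. The Streamlit module and the Snowpark session are passed in as parameters so that the logic can be exercised outside Snowflake.

```python
# Assumption: any model accepted by EMBED_TEXT_1024; it must match the
# model that was used to populate the EMBEDDINGS column.
EMBED_MODEL = "snowflake-arctic-embed-l-v2.0"
# Assumption: any model accepted by SNOWFLAKE.CORTEX.COMPLETE.
LLM_MODEL = "mistral-large2"

# Rank rows by similarity to the embedded query and keep the best one.
SEARCH_SQL = """
SELECT TEXT
FROM ELEMENTS
ORDER BY VECTOR_COSINE_SIMILARITY(
    EMBEDDINGS,
    SNOWFLAKE.CORTEX.EMBED_TEXT_1024(?, ?)
) DESC
LIMIT 1
"""

COMPLETE_SQL = "SELECT SNOWFLAKE.CORTEX.COMPLETE(?, ?) AS RESPONSE"


def build_prompt(question, context):
    """Combine the retrieved text and the user's question into one prompt."""
    return (
        "Answer the question using only the context below.\n\n"
        f"Context: {context}\n\n"
        f"Question: {question}"
    )


def answer_question(session, question):
    """Retrieve the most similar row, then ask the model to answer from it."""
    rows = session.sql(SEARCH_SQL, params=[EMBED_MODEL, question]).collect()
    if not rows:
        return "No matching rows were found in the ELEMENTS table."
    context = rows[0]["TEXT"]
    prompt = build_prompt(question, context)
    result = session.sql(COMPLETE_SQL, params=[LLM_MODEL, prompt]).collect()
    return result[0]["RESPONSE"]


def run_app(st, session):
    """Render the title, the query box, a progress indicator, and the answer."""
    st.title("Unstructured Demo Streamlit App")
    question = st.text_input("Enter your search query")
    if question:
        with st.spinner("Searching..."):
            answer = answer_question(session, question)
        st.write(answer)
```

In the Snowsight editor, the script would then import streamlit as st, import get_active_session from snowflake.snowpark.context, and end with a call to run_app(st, get_active_session()). The queries use bind parameters (the ? placeholders) instead of string formatting so that the user’s input is never spliced into the SQL text.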