Notebooks
Notebooks contain complete working sample code for end-to-end solutions.
Agentic RAG with Hugging Face smolagents vs Vanilla RAG
Build Agentic RAG with the smolagents library and compare the results with Vanilla RAG in pure Python
Unstructured Platform UI
GPT-4o
smolagents
Agents
DataStax
S3
Advanced notebook
Llama3.2 RAG evaluation on unstructured text
Evaluate Llama3.2 for your RAG system with Unstructured Platform, GPT-4o, Ragas, and LangChain
Unstructured Platform UI
GPT-4o
Ragas
LangChain
Llama3.2
Pinecone
S3
Advanced notebook
Multimodal RAG: Enhancing RAG outputs with image results
Process a file in S3 with Unstructured Platform and return images in your RAG output
Unstructured Platform UI
S3
FAISS
GPT-4o-mini
Advanced notebook
Quantitative Reasoning with tables inside PDFs
From Pixels to Insights: Seamlessly Extracting and Visualizing Table Data with Unstructured and Hex
Unstructured API
Hex
Advanced notebook
PII removal with GLiNER in unstructured data ETL
Remove Personally Identifiable Information (PII) as a part of unstructured data preprocessing.
Unstructured API
PII
GLiNER
Advanced notebook
Custom metadata extraction and self-querying retrieval
Extract custom metadata, and enable metadata pre-filtering in your RAG.
Unstructured API
MongoDB
Metadata
Advanced notebook
Selecting an embedding model for custom data
End-to-end data processing pipeline using Unstructured Serverless API.
Unstructured API
Hugging Face
Advanced notebook
RAG with PDFs, LangChain and Llama 3
A RAG system with the Llama 3 model from Hugging Face.
Unstructured API
🤗 Hugging Face
LangChain
Llama 3
Introductory notebook
Unstructured data ETL from S3 to SingleStore DB
Learn to ingest, partition, chunk, embed and load data from an S3 bucket into SingleStore DB.
Unstructured API
SingleStoreDB
AWS S3
Introductory notebook
Google Drive to DataStax Astra DB
Embed your Google Drive Docs in an Astra Vector Database with Unstructured Serverless API
Unstructured API
Google
DataStax
Introductory notebook
Weaviate RAG quickstart
Embed your local documents in a Weaviate Vector Database with Unstructured Serverless API
Unstructured API
OpenAI
Weaviate
Introductory notebook
Preprocess PDFs in AWS S3, load into Elasticsearch
Ingest PDF documents from an S3 bucket, transform them into normalized JSON with Unstructured Serverless API, then chunk, embed, and load them into Elasticsearch.
Unstructured API
AWS S3
Elasticsearch
Introductory notebook
Preprocess documents in Google Drive, load into Databricks Volume
Preprocess documents from Google Drive with Unstructured Serverless API and load them into a Databricks Volume.
Unstructured API
Google Drive
Databricks
Introductory notebook
Source references in RAG responses
Add document source references to RAG responses based on document metadata.
Unstructured API
RAG
LangChain
Intermediate notebook
Query processed PDF with HuggingChat
Send a PDF to Unstructured for processing, and send a subset of the returned PDF’s processed text to HuggingChat for chatbot-style querying.
Unstructured API
🤗 Hugging Face
🤗 HuggingChat
Introductory notebook
Llama 3 Local RAG with emails
Build a local RAG app for your emails with Unstructured, LangChain and Ollama.
Unstructured API
LangChain
Ollama
Llama 3
Introductory notebook
Building RAG With PowerPoint presentations
A RAG solution that is based on PowerPoint files.
Unstructured API
🤗 Hugging Face
LangChain
Llama 3
Introductory notebook
Synthetic test dataset generation
Build a Synthetic Test Dataset for your RAG system in 5 easy steps
Unstructured API
GPT-4o
Ragas
LangChain
Advanced notebook