Notebooks
Notebooks contain complete working sample code for end-to-end solutions.
PII removal with GLiNER in unstructured data ETL
Remove Personally Identifiable Information (PII) as a part of unstructured data preprocessing.
Unstructured
Advanced
PII
GLiNER
Custom metadata extraction and self-querying retrieval
Extract custom metadata, and enable metadata pre-filtering in your RAG.
Unstructured
MongoDB
Advanced
Metadata
Selecting an embedding model for custom data
An end-to-end data processing pipeline using the Unstructured Serverless API.
Unstructured
RAG with PDFs, LangChain and Llama 3
A RAG system with the Llama 3 model from Hugging Face.
Unstructured
🤗 Hugging Face
LangChain
Llama 3
Unstructured data ETL from S3 to SingleStore DB
Learn to ingest, partition, chunk, embed and load data from an S3 bucket into SingleStore DB.
Unstructured
SingleStoreDB
AWS S3
Google Drive to DataStax Astra DB
Embed your Google Drive docs in an Astra vector database with the Unstructured Serverless API.
Unstructured
Google
DataStax
Weaviate RAG quickstart
Embed your local documents in a Weaviate vector database with the Unstructured Serverless API.
Unstructured
OpenAI
Weaviate
Preprocess PDFs in AWS S3, load into Elasticsearch
Ingest PDF documents from an S3 bucket, transform them into normalized JSON with the Unstructured Serverless API, then chunk, embed, and load them into Elasticsearch.
Unstructured
AWS S3
Elasticsearch
Preprocess documents in Google Drive, load into Databricks Volume
Preprocess documents from Google Drive with the Unstructured Serverless API and load them into a Databricks Volume.
Unstructured
Google Drive
Databricks
Source references in RAG responses
Add document source references to RAG responses based on document metadata.
Unstructured
RAG
LangChain
Query processed PDF with HuggingChat
Send a PDF to Unstructured for processing, and send a subset of the returned PDF’s processed text to HuggingChat for chatbot-style querying.
Unstructured
🤗 Hugging Face
🤗 HuggingChat
Llama 3 Local RAG with emails
Build a local RAG app for your emails with Unstructured, LangChain and Ollama.
Unstructured
LangChain
Ollama
Llama 3
Building RAG with PowerPoint presentations
A RAG solution that is based on PowerPoint files.
Unstructured
🤗 Hugging Face
LangChain
Llama 3
Synthetic test dataset generation
Build a synthetic test dataset for your RAG system in five easy steps.
Unstructured
GPT-4o
Ragas
LangChain
Llama 3.1 RAG evaluation on unstructured text
Evaluate a Llama 3.1 RAG system on unstructured text.
Unstructured
GPT-4o
Ragas
LangChain
Llama3.1