Custom metadata extraction and self-querying retrieval


Extract custom metadata and enable metadata pre-filtering in your RAG pipeline.


Unstructured MongoDB Advanced Metadata


Selecting an embedding model for custom data


End-to-end data processing pipeline using Unstructured Serverless API.


Unstructured


Simple PDF and HTML parsing


Quickstart guide for parsing simple PDF and HTML documents with Unstructured.


Unstructured


Llama 3 Local RAG with emails


Build a local RAG app for your emails with Unstructured, LangChain and Ollama.


Unstructured LangChain Ollama Llama 3

RAG with PDFs, LangChain and Llama 3


A RAG system with the Llama 3 model from Hugging Face.


Unstructured 🤗 Hugging Face LangChain Llama 3

Building RAG with PowerPoint presentations


A RAG solution based on PowerPoint files.


Unstructured 🤗 Hugging Face LangChain Llama 3

Unstructured data ETL from S3 to SingleStore DB


Learn to ingest, partition, chunk, embed and load data from an S3 bucket into SingleStore DB.


Unstructured SingleStoreDB AWS S3

LLM chatbot with Databricks


A chatbot on Databricks with RAG, DBRX Instruct and Vector Search.


Unstructured Databricks LangChain

Synthetic test dataset generation


Build a Synthetic Test Dataset for your RAG system in 5 easy steps


Unstructured GPT-4o Ragas LangChain


Llama 3.1 RAG evaluation on unstructured text


Build a Synthetic Test Dataset for your RAG system in 5 easy steps


Unstructured GPT-4o Ragas LangChain Llama 3.1


Google Drive to DataStax Astra DB


Embed your Google Drive Docs in an Astra Vector Database with Unstructured Serverless API


Unstructured Google DataStax


Weaviate RAG quickstart


Embed your local documents in a Weaviate Vector Database with Unstructured Serverless API


Unstructured OpenAI Weaviate


Preprocess PDFs in AWS S3, load into Elasticsearch


Ingest PDF documents from an S3 bucket, transform them into normalized JSON with Unstructured Serverless API, then chunk, embed and load them into Elasticsearch.


Unstructured AWS S3 Elasticsearch


Preprocess documents in Google Drive, load into Databricks Volume


Preprocess documents from Google Drive with Unstructured Serverless API and load them into a Databricks Volume.


Unstructured Google Drive Databricks


Source references in RAG responses


Add document source references to RAG responses based on document metadata.


Unstructured RAG LangChain