PII removal with GLiNER in unstructured data ETL


Remove Personally Identifiable Information (PII) as a part of unstructured data preprocessing.


Unstructured Advanced PII GLiNER


Custom metadata extraction and self-querying retrieval


Extract custom metadata, and enable metadata pre-filtering in your RAG.


Unstructured MongoDB Advanced Metadata


Selecting an embedding model for custom data


End-to-end data processing pipeline using Unstructured Serverless API.


Unstructured


RAG with PDFs, LangChain and Llama 3


A RAG system with the Llama 3 model from Hugging Face.


Unstructured 🤗 Hugging Face LangChain Llama 3

Unstructured data ETL from S3 to SingleStore DB


Learn to ingest, partition, chunk, embed and load data from an S3 bucket into SingleStore DB.


Unstructured SingleStoreDB AWS S3

Google Drive to DataStax Astra DB


Embed your Google Drive docs in an Astra vector database with Unstructured Serverless API.


Unstructured Google DataStax


Weaviate RAG quickstart


Embed your local documents in a Weaviate vector database with Unstructured Serverless API.


Unstructured OpenAI Weaviate


Preprocess PDFs in AWS S3, load into Elasticsearch


Ingest PDF documents from an S3 bucket, transform them into a normalized JSON with Unstructured Serverless API, chunk, embed and load into Elasticsearch.


Unstructured AWS S3 Elasticsearch


Preprocess documents in Google Drive, load into Databricks Volume


Preprocess documents from Google Drive with Unstructured Serverless API and load them into a Databricks Volume.


Unstructured Google Drive Databricks


Source references in RAG responses


Add document source references to RAG responses based on document metadata.


Unstructured RAG LangChain


Llama 3 Local RAG with emails


Build a local RAG app for your emails with Unstructured, LangChain and Ollama.


Unstructured LangChain Ollama Llama 3

Building RAG with PowerPoint presentations


A RAG solution that is based on PowerPoint files.


Unstructured 🤗 Hugging Face LangChain Llama 3

Synthetic test dataset generation


Build a synthetic test dataset for your RAG system in 5 easy steps.


Unstructured GPT-4o Ragas LangChain


Llama 3.1 RAG evaluation on unstructured text


Build a Synthetic Test Dataset for your RAG system in 5 easy steps


Unstructured GPT-4o Ragas LangChain Llama 3.1