This example notebook provides a step-by-step demonstration of building and using a RAG system with Feast and the custom `FeastRagRetriever`. The notebook walks through:
1. **Data Preparation**
   - Loads a subset of the Wikipedia DPR dataset (1% of the training split)
   - Implements text chunking with configurable chunk size and overlap
   - Processes text into manageable passages with unique IDs
2. **Embedding Generation**
   - Uses the `all-MiniLM-L6-v2` sentence transformer model
   - Generates 384-dimensional embeddings for text passages
   - Demonstrates batch processing with GPU support
3. **Feature Store Setup**
   - Creates a Parquet file as the historical data source
   - Configures Feast with the feature repository
   - Demonstrates writing embeddings from the data source to the Milvus online store; the historical data remains available for model training later
4. **RAG System Implementation**
   - Embedding Model: `all-MiniLM-L6-v2` (configurable)
   - Generator Model: `granite-3.2-2b-instruct` (configurable)
   - Vector Store: custom implementation with Feast integration
   - Retriever: custom implementation extending Hugging Face's `RagRetriever`
5. **Query Demonstration**
   - Performs inference with retrieved context
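The chunking step above (fixed-size passages with overlap, each with a unique ID) can be sketched in plain Python. This is an illustrative stand-in, not the notebook's actual implementation; the function name and the `doc42` ID are hypothetical:

```python
def chunk_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    """Split text into word-based chunks of up to `chunk_size` words,
    each overlapping the previous chunk by `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the final chunk already covers the tail of the text
    return chunks

# Unique passage IDs can then be derived from the source document ID:
passages = {f"doc42_{i}": c for i, c in enumerate(chunk_text("some long article text ...", 4, 1))}
```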
To run this example you will need:
- A Kubernetes cluster with:
  - GPU nodes available (for model inference)
  - At least 200GB of storage
- A standalone Milvus deployment. See example here.
Clone this repository (https://github.com/feast-dev/feast.git) and navigate to the `examples/rag-retriever` directory, where you will find the following files:
- `feature_repo/feature_store.yaml`: the core configuration file for the RAG project's feature store, configuring a Milvus online store on a local provider. To configure Milvus, update `feature_store.yaml` with your Milvus connection details:
  - host
  - port (default: 19530)
  - credentials (if required)
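A minimal `feature_store.yaml` for this setup might look like the sketch below. The exact keys accepted by the Milvus online store (notably `embedding_dim`, `index_type`, and `metric_type`) depend on your Feast version, so treat this as an illustrative starting point rather than a drop-in config:

```yaml
project: ragproject
provider: local
registry: data/registry.db
online_store:
  type: milvus
  host: localhost      # your Milvus host
  port: 19530          # default Milvus port
  vector_enabled: true
  embedding_dim: 384   # matches all-MiniLM-L6-v2
  index_type: FLAT
  metric_type: COSINE
```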
- `feature_repo/ragproject_repo.py`: the Feast feature repository configuration that defines the schema and data source for Wikipedia passage embeddings.
- `rag_feast.ipynb`: a notebook demonstrating the implementation of a RAG system using Feast. The notebook provides a complete end-to-end example of building a RAG system with:
  - Data preparation using the Wiki DPR dataset
  - Text chunking and preprocessing
  - Vector embedding generation using sentence-transformers
  - Integration with the Milvus vector store
  - Inference utilising a custom RagRetriever: `FeastRagRetriever`
  - `all-MiniLM-L6-v2` for generating embeddings
  - `granite-3.2-2b-instruct` as the generator model
Open `rag_feast.ipynb` and follow the steps in the notebook to run the example.
As an alternative to the manual data preparation steps in the notebook above, Feast provides the `DocEmbedder` class, which automates the entire document-to-embeddings pipeline: chunking, embedding generation, FeatureView creation, and writing to the online store.
First install the required extras:

```shell
pip install feast[milvus,rag]
```

```python
from feast import DocEmbedder
from datasets import load_dataset

# Load your dataset
dataset = load_dataset("facebook/wiki_dpr", "psgs_w100.nq.exact", split="train[:1%]",
                       with_index=False, trust_remote_code=True)
df = dataset.select(range(100)).to_pandas()

# DocEmbedder handles everything in one step
embedder = DocEmbedder(
    repo_path="feature_repo_docembedder/",
    feature_view_name="text_feature_view",
)
result = embedder.embed_documents(
    documents=df,
    id_column="id",
    source_column="text",
    column_mapping=("text", "text_embedding"),
)
```

Under the hood, `DocEmbedder`:
- Generates a FeatureView: automatically creates a Python file with Entity and FeatureView definitions compatible with `feast apply`
- Applies the repo: registers the FeatureView in the Feast registry and deploys infrastructure (e.g., a Milvus collection)
- Chunks documents: splits text into smaller passages using `TextChunker` (configurable chunk size, overlap, etc.)
- Generates embeddings: produces vector embeddings using `MultiModalEmbedder` (defaults to `all-MiniLM-L6-v2`)
- Writes to the online store: stores the processed data in your configured online store (e.g., Milvus)
`DocEmbedder` can be customized:
- Custom Chunker: subclass `BaseChunker` for your own chunking strategy
- Custom Embedder: subclass `BaseEmbedder` to use a different embedding model
- Logical Layer Function: provide a `SchemaTransformFn` to control how the output maps to your FeatureView schema
See `rag_feast_docembedder.ipynb` for a complete end-to-end example that uses `DocEmbedder` with the Wiki DPR dataset and then queries the results using `FeastRAGRetriever`.
- Ensure your Milvus instance is properly configured and running
- Vector dimensions and similarity metrics can be adjusted in the feature store configuration
- The example uses Wikipedia data, but the system can be adapted for other datasets
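On similarity metrics: retrieval ranks stored embeddings against the query embedding using a metric such as cosine similarity. A minimal pure-Python illustration of cosine top-k retrieval (this is not Feast's or Milvus's implementation, just the underlying idea):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query: list[float], passages: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """passages: list of (passage_id, embedding); returns IDs of the k most similar."""
    ranked = sorted(passages, key=lambda p: cosine_similarity(query, p[1]), reverse=True)
    return [pid for pid, _ in ranked[:k]]
```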
