
RAG Pipeline Visualizer

An interactive visualization of how Retrieval-Augmented Generation works — from query embedding to vector search to LLM response generation, using a fictional Space Colony knowledge base.

JavaScript, TF-IDF, Canvas 2D, RAG Pipeline

How It Works

Type a question (or pick a preset) and watch the full RAG pipeline animate step by step. The knowledge base contains 20 chunks from a fictional Space Colony Handbook.

Stage               | What Happens
1. Embedding        | Your query is converted into a vector representation (animated bars)
2. Vector Search    | TF-IDF cosine similarity finds the most relevant chunks
3. Context Assembly | Top-K retrieved chunks are assembled into the LLM context window
4. Generation       | The LLM generates a response token-by-token
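
The embedding and vector search stages can be sketched roughly as follows. This is a minimal illustration, not the demo's actual source: the function names (tokenize, buildIdf, vectorize, cosine, retrieve), the smoothed IDF formula ln(N / df) + 1, and the three sample chunks are all assumptions made for the example.

```javascript
// Hypothetical helpers; the visualizer's real implementation may differ.

// Lowercase the text and split it on anything that is not a letter or digit.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Inverse document frequency per term: ln(N / df) + 1 (one common smoothed variant, assumed here).
function buildIdf(chunks) {
  const df = new Map();
  for (const chunk of chunks) {
    for (const term of new Set(tokenize(chunk))) {
      df.set(term, (df.get(term) || 0) + 1);
    }
  }
  const idf = new Map();
  for (const [term, count] of df) {
    idf.set(term, Math.log(chunks.length / count) + 1);
  }
  return idf;
}

// Sparse TF-IDF vector (Map of term -> weight). Terms unseen in the corpus are dropped.
function vectorize(text, idf) {
  const tokens = tokenize(text);
  const vec = new Map();
  for (const term of tokens) {
    if (idf.has(term)) vec.set(term, (vec.get(term) || 0) + 1);
  }
  for (const [term, tf] of vec) {
    vec.set(term, (tf / tokens.length) * idf.get(term));
  }
  return vec;
}

// Cosine similarity between two sparse vectors; returns 0 if either is empty.
function cosine(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (const [term, w] of a) {
    normA += w * w;
    if (b.has(term)) dot += w * b.get(term);
  }
  for (const w of b.values()) normB += w * w;
  return normA && normB ? dot / Math.sqrt(normA * normB) : 0;
}

// Score every chunk against the query and keep the topK highest-scoring ones.
function retrieve(query, chunks, topK) {
  const idf = buildIdf(chunks);
  const q = vectorize(query, idf);
  return chunks
    .map((text, index) => ({ index, text, score: cosine(q, vectorize(text, idf)) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Invented sample chunks, not the demo's Space Colony Handbook text.
const sampleChunks = [
  "The hydroponics bay grows lettuce and tomatoes under LED lights.",
  "Oxygen is recycled by the algae reactors in the life support ring.",
  "Colonists report to the medical bay for monthly health checks.",
];
const results = retrieve("How is oxygen recycled?", sampleChunks, 2);
console.log(results[0].text); // the oxygen chunk ranks first
```

Sparse Map-based vectors keep the sketch short, since only terms shared by the query and a chunk contribute to the dot product. A production system would typically precompute the chunk vectors once rather than on every query.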

Controls

Control            | Action
Preset buttons     | Load a pre-written question
Top-K slider       | Control how many chunks are retrieved (1-5)
Click a chunk card | View its full content and keywords

The Concept

RAG (Retrieval-Augmented Generation) enhances LLM responses by first searching a knowledge base for relevant context and then including that context in the prompt. This demo uses real TF-IDF vectorization and cosine similarity to find the best-matching chunks.
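
As a rough illustration of "including that context in the prompt", the sketch below assembles retrieved chunks and a question into a single prompt string. The function name buildPrompt, the instruction wording, and the numbered-chunk layout are assumptions for the example, not the demo's actual prompt format.

```javascript
// Hypothetical prompt builder; the wording and layout are illustrative only.
function buildPrompt(question, retrievedChunks) {
  // Number each chunk so the model (and the reader) can tell them apart.
  const context = retrievedChunks
    .map((chunk, i) => `[${i + 1}] ${chunk}`)
    .join("\n");
  return [
    "Answer the question using only the context below.",
    "",
    "Context:",
    context,
    "",
    `Question: ${question}`,
    "Answer:",
  ].join("\n");
}

// Invented example input, not taken from the demo's knowledge base.
const prompt = buildPrompt("How is oxygen recycled?", [
  "Oxygen is recycled by the algae reactors in the life support ring.",
]);
console.log(prompt);
```

Raising Top-K simply passes more chunks into retrievedChunks, which gives the model more context at the cost of a longer prompt.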