# RAG Pipeline Visualizer
An interactive visualization of how Retrieval-Augmented Generation works — from query embedding to vector search to LLM response generation, using a fictional Space Colony knowledge base.
## How It Works
Type a question (or pick a preset) and watch the full RAG pipeline animate step-by-step. The knowledge base contains 20 chunks from a fictional Space Colony Handbook.
| Stage | What Happens |
|---|---|
| 1. Embedding | Your query is converted into a vector representation (animated bars) |
| 2. Vector Search | TF-IDF cosine similarity finds the most relevant chunks |
| 3. Context Assembly | Top-K retrieved chunks are assembled into the LLM context window |
| 4. Generation | The LLM generates a response token-by-token |
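Stages 1–2 can be sketched with a hand-rolled TF-IDF retriever. This is a minimal illustration of the technique the demo describes, not the project's actual code; the function names and sample chunks are invented for the example:

```python
import math
from collections import Counter

def tfidf_vectors(tokenized_docs):
    """Build a sparse TF-IDF vector (dict of term -> weight) per document."""
    n = len(tokenized_docs)
    df = Counter()
    for doc in tokenized_docs:
        df.update(set(doc))                      # document frequency per term
    idf = {t: math.log(n / df[t]) for t in df}   # rarer terms weigh more
    vecs = []
    for doc in tokenized_docs:
        tf = Counter(doc)
        vecs.append({t: (tf[t] / len(doc)) * idf[t] for t in tf})
    return vecs, idf

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, top_k=3):
    """Return the top_k chunks most similar to the query, with scores."""
    tokenized = [c.lower().split() for c in chunks]
    vecs, idf = tfidf_vectors(tokenized)
    q_tf = Counter(query.lower().split())
    total = sum(q_tf.values())
    # Terms unseen in the corpus get zero weight (idf defaults to 0).
    q_vec = {t: (q_tf[t] / total) * idf.get(t, 0.0) for t in q_tf}
    scores = sorted(((cosine(q_vec, v), i) for i, v in enumerate(vecs)),
                    reverse=True)
    return [(chunks[i], s) for s, i in scores[:top_k]]
```

Because TF-IDF is purely lexical, retrieval here rewards exact keyword overlap between the query and a chunk, which is why the visualizer can animate each matching term.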
## Controls
| Control | Action |
|---|---|
| Preset buttons | Load a pre-written question |
| Top-K slider | Control how many chunks are retrieved (1-5) |
| Click a chunk card | View its full content and keywords |
## The Concept
RAG (Retrieval-Augmented Generation) enhances LLM responses by first searching a knowledge base for relevant context, then including that context in the prompt. This demo uses real TF-IDF vectorization and cosine similarity (rather than neural embeddings) to find the best-matching chunks.
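The "include that context in the prompt" step (stage 3 above) amounts to concatenating the top-K chunks ahead of the question. A minimal sketch, with an invented prompt template rather than the demo's actual wording:

```python
def build_prompt(query, retrieved_chunks):
    """Assemble retrieved chunks into a single LLM prompt (stage 3)."""
    # Number each chunk so the answer could cite its sources.
    context = "\n\n".join(f"[{i + 1}] {chunk}"
                          for i, chunk in enumerate(retrieved_chunks))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )
```

The assembled string is what the LLM actually sees in stage 4; raising Top-K adds more chunks to the context at the cost of a longer prompt.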