
Advanced RAG Chatbot

RAG, Cross-encoder re-ranking, Sentence-transformers, ChromaDB, Vector search, Flask, Session management, RAGAS evaluation, Faithfulness scoring, OpenAI · Ollama

GitHub ↗

Overview

A complete Retrieval-Augmented Generation (RAG) system built from scratch, demonstrating end-to-end AI engineering, from raw document ingestion to evaluated, context-aware responses. It is designed to highlight real-world architectural decisions rather than prototype-level RAG.

What makes it non-trivial: most RAG demos stop at "embed → retrieve → generate." This one adds a cross-encoder re-ranking stage, sliding-window conversation memory with LLM context injection, and a RAGAS-inspired evaluation suite that scores every response on faithfulness and relevancy, which is the kind of observability a production team needs.
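As a rough illustration of two of the stages named above, here is a minimal sketch of re-ranking retrieved passages and keeping a sliding window of conversation turns for prompt injection. It is not the project's code: `token_overlap`, `rerank`, `SlidingWindowMemory`, and `build_prompt` are hypothetical names, and the token-overlap scorer is only a stand-in for a real cross-encoder model, so the example runs with the standard library alone.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


def token_overlap(query: str, passage: str) -> float:
    """Placeholder scorer (assumption): fraction of query tokens in the passage.

    A real system would score each (query, passage) pair with a cross-encoder.
    """
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / len(q) if q else 0.0


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float] = token_overlap,
    top_k: int = 2,
) -> List[str]:
    """Second-stage ranking: re-score first-stage candidates, keep the best top_k."""
    ranked = sorted(candidates, key=lambda c: score_fn(query, c), reverse=True)
    return ranked[:top_k]


class SlidingWindowMemory:
    """Keeps only the most recent max_turns (user, assistant) exchanges."""

    def __init__(self, max_turns: int = 3) -> None:
        self.turns: Deque[Tuple[str, str]] = deque(maxlen=max_turns)

    def add(self, user: str, assistant: str) -> None:
        self.turns.append((user, assistant))  # oldest turn drops off when full

    def as_context(self) -> str:
        return "\n".join(f"User: {u}\nAssistant: {a}" for u, a in self.turns)


def build_prompt(query: str, passages: List[str], memory: SlidingWindowMemory) -> str:
    """Injects recent conversation and re-ranked passages into one prompt string."""
    sources = "\n".join(f"- {p}" for p in passages)
    return (
        f"Conversation so far:\n{memory.as_context()}\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    candidates = [
        "chroma stores vectors",
        "flask serves the web app",
        "vectors are searched in chroma by similarity",
    ]
    query = "how are vectors searched in chroma"
    memory = SlidingWindowMemory(max_turns=2)
    memory.add("what is chroma", "a vector database")
    print(build_prompt(query, rerank(query, candidates), memory))
```

Because the scorer is passed in as `score_fn`, the overlap heuristic could be swapped for a model-based scorer without changing the rest of the sketch.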
