Machine learning chatbot for technical queries with semantic retrieval and generation.
An ML-powered chatbot that answers technical questions using semantic retrieval from PostgreSQL and context-aware generation with LLaMA 3.1, with Scikit-learn supporting the ML workflow.
Students and learners need quick answers to ML concepts without digging through scattered notes. The project needed a retrieval-backed chatbot that could surface relevant material before generating a response.
I built Flask APIs for document indexing and retrieval and stored content in PostgreSQL. Semantic search retrieves the relevant passages, and LLaMA 3.1 uses them to produce context-aware answers. Scikit-learn supports ML-related processing in the pipeline.
Documents are indexed and stored in PostgreSQL. User queries trigger retrieval of relevant content, which is passed to LLaMA 3.1 for generation. Flask exposes the chat and retrieval endpoints.
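The retrieval step can be illustrated with a minimal sketch. It assumes documents and queries are turned into vectors and ranked by cosine similarity. The word-count `embed` function is a toy lexical stand-in for a real embedding model, and `embed`, `cosine`, `retrieve`, and `DOCS` are hypothetical names that do not come from the project.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts.

    Assumption: the real pipeline would use learned embeddings here.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, docs, k=2):
    """Return up to k documents ranked by similarity to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(d)), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents that share nothing with the query.
    return [d for score, d in scored[:k] if score > 0]


# Hypothetical sample notes standing in for rows stored in PostgreSQL.
DOCS = [
    "Gradient descent updates parameters in the direction that reduces the loss.",
    "Overfitting happens when a model memorizes training data and generalizes poorly.",
    "A confusion matrix counts true and false positives and negatives.",
]

top = retrieve("what is overfitting in a model", DOCS, k=1)
```

The passages returned by `retrieve` would then be handed to the generation step as context.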
Architecture Preview
Flask API: Chat + IR
PostgreSQL: Indexed docs
Semantic Retrieval: Context
LLaMA 3.1: Generation
Scikit-learn: ML workflow
Indexing technical content
ML concepts needed structured storage and retrieval so the chatbot could surface relevant explanations.
Retrieval before generation
Responses had to use retrieved PostgreSQL content as context for LLaMA 3.1 rather than generating blindly.
API design for chat flows
Flask endpoints needed to handle indexing, retrieval, and response generation in a maintainable way.
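One way to picture the retrieval-before-generation requirement is a prompt builder that places retrieved passages ahead of the question. This is a sketch under assumptions: the prompt wording, the `max_chars` budget, and the name `build_prompt` are all hypothetical, and the actual call to LLaMA 3.1 is left out.

```python
def build_prompt(question, passages, max_chars=1500):
    """Assemble a grounded prompt from retrieved passages.

    Passages are numbered so an answer can point back to its source.
    max_chars is a hypothetical budget that keeps the context short.
    """
    context_lines = []
    used = 0
    for i, passage in enumerate(passages, start=1):
        line = f"[{i}] {passage.strip()}"
        if used + len(line) > max_chars:
            break  # stop once the context budget is exhausted
        context_lines.append(line)
        used += len(line)

    if context_lines:
        context = "\n".join(context_lines)
    else:
        context = "(no relevant material found)"

    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question.strip()}\nAnswer:"
    )


prompt = build_prompt(
    "What is overfitting?",
    ["Overfitting happens when a model memorizes training data."],
)
```

Telling the model to admit when the context lacks an answer is a design choice that keeps the response tied to stored material instead of guesses.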
PostgreSQL for document storage
Indexed content lives in PostgreSQL for reliable retrieval during chat sessions.
LLaMA 3.1 for context-aware answers
Generation uses retrieved context to keep responses tied to stored material.
Flask service APIs
Flask exposes indexing, retrieval, and chat endpoints behind a single application.
[Image: ML Chatbot conversation interface]
[Image: Intent classification confidence breakdown]
[Image: Model evaluation metrics per intent]