Vikramaditya Mishra
Building GenAI systems and open-source AI projects, and exploring LLM workflows with a focus on experimentation, reliability, and real-world impact.
About
AI/ML Engineer focused on building production-grade Retrieval-Augmented Generation systems and intelligent workflows for real-world, high-stakes domains.
I work across the modern AI stack, including LangChain, LangGraph, MCP, vector databases, and Pydantic, with a focus on structured retrieval, evaluation frameworks, and reliable LLM reasoning pipelines.
I care about systems that work under constraints, not just demos that work on clean data. You can find me on X and LinkedIn.
Experience
VRCYN
Data Science Intern
Dec 2024 – May 2025
- Assisted in preparing, cleaning, and structuring customer feedback datasets (chat logs and surveys) using Python, NLTK, spaCy, and SQL.
- Explored and implemented NLP techniques, including LSTM-based sentiment analysis, keyword extraction, and basic topic modeling to identify customer pain points and satisfaction drivers.
- Contributed to insights that supported data-driven discussions for improving product features, customer experience, and internal decision-making.
Featured Projects
ClaimLens
Jan 2026 – Mar 2026
Built a production-grade RAG pipeline for insurance policy analysis across multi-insurer documents.
Designed for real-world insurance policy analysis, where naive RAG fails and structure-aware retrieval wins.
- Built a production-grade Retrieval-Augmented Generation (RAG) pipeline for insurance policy analysis across multi-insurer documents.
- Implemented a deterministic clause splitter that parses legal PDFs into atomic clause chunks across 5 structural formats, with canonical clause IDs and duplicate detection.
- Designed a two-stage retrieval architecture using embeddings + FAISS (Top-40) followed by cross-encoder reranking (Top-5), achieving Recall@20 of 0.93 and MRR of 0.89 on single-clause queries.
- Developed a multi-clause evaluation framework with stage-wise diagnostics, achieving Coverage@20 of 0.87, Full Recall@20 of 0.60, and MRR of 0.77 on composite multi-hop queries.
- Developed an LLM reasoning engine with strict Pydantic schema validation, citation grounding, and JSON retry logic that returns structured answers with confidence scores and clause citations.
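For readers unfamiliar with the retrieval metrics quoted above, here is a minimal sketch of how Recall@k and MRR are commonly computed. It is an illustration only: the function names and toy clause IDs are hypothetical and not taken from the ClaimLens codebase, and it covers just the two standard metrics, not the multi-clause Coverage or Full Recall variants.

```python
from typing import List, Sequence, Set, Tuple


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the relevant clauses that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    top_k = set(ranked_ids[:k])
    return len(top_k & relevant_ids) / len(relevant_ids)


def reciprocal_rank(ranked_ids: Sequence[str], relevant_ids: Set[str]) -> float:
    """1 / rank of the first relevant clause, or 0.0 if none was retrieved."""
    for rank, clause_id in enumerate(ranked_ids, start=1):
        if clause_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (ranked_ids, relevant_ids) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, relevant) for ranked, relevant in runs) / len(runs)


# Toy data (hypothetical clause IDs): two queries with known relevant clauses.
query_a = (["c3", "c1", "c7", "c2"], {"c1", "c2"})
query_b = (["c9", "c4"], {"c9"})

recall_a_at_2 = recall_at_k(query_a[0], query_a[1], k=2)
recall_a_at_4 = recall_at_k(query_a[0], query_a[1], k=4)
mrr = mean_reciprocal_rank([query_a, query_b])
```

In this toy run, only one of the two relevant clauses for the first query lands in the top 2, while both land in the top 4. The first relevant hit sits at rank 2 for the first query and rank 1 for the second, and MRR averages those reciprocal ranks.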
Insurance-Aware RAG
Jan 2026 – Feb 2026
Advanced RAG system for extracting and reasoning over complex insurance policy clauses.
Improved retrieval reliability in complex legal documents with deterministic, traceable clause parsing.
- Implemented deterministic clause splitting with stable canonical IDs (e.g., ICICILombard_p8_Grace_Period_1) to ensure full evaluation traceability and prevent vector overwrites.
- Built a two-stage retrieval pipeline: high-recall dense FAISS retrieval using BGE embeddings, followed by cross-encoder reranking to surface precise evidence.
- Enforced strict Pydantic schema validation (RAGResponse, Citation) as a structural firewall, rejecting hallucinated citations and logically contradictory LLM outputs.
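The canonical ID scheme mentioned above can be sketched roughly as follows. This is an assumption-heavy illustration: the format (insurer, page number, heading, running counter) is inferred from the single example ID ICICILombard_p8_Grace_Period_1, and the helper names `slugify_heading` and `assign_clause_ids` are hypothetical rather than the project's actual code.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple


def slugify_heading(heading: str) -> str:
    """Keep alphanumeric words only and join them with underscores."""
    return "_".join(re.findall(r"[A-Za-z0-9]+", heading))


def assign_clause_ids(insurer: str, clauses: List[Tuple[int, str]]) -> List[str]:
    """Assign a deterministic ID to each (page, heading) clause, in document order.

    Assumed format: <insurer>_p<page>_<Heading_Slug>_<n>, where n counts repeats
    of the same insurer/page/heading so that no two clauses share an ID.
    """
    counters: Dict[str, int] = defaultdict(int)
    ids: List[str] = []
    for page, heading in clauses:
        base = f"{insurer}_p{page}_{slugify_heading(heading)}"
        counters[base] += 1
        ids.append(f"{base}_{counters[base]}")
    return ids


# Toy input (hypothetical): two clauses share a page and heading.
clause_ids = assign_clause_ids(
    "ICICILombard",
    [(8, "Grace Period"), (8, "Grace Period"), (9, "Waiting Period")],
)
```

Because the IDs depend only on the insurer, page, heading, and order of appearance, re-running the splitter on the same document yields the same IDs. The counter suffix keeps repeated headings on one page from colliding, which is what would otherwise let one vector overwrite another in the index.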
Skills
AI Systems
Machine Learning
Data & Analysis
Backend
Languages
Achievements
Education
Indian Institute of Technology Patna
Master of Technology in AI and DSE
Jan 2026 – Present
Lovely Professional University
Bachelor of Technology in Computer Science and Engineering
Aug 2020 – Aug 2024