Vikramaditya Mishra

Building GenAI systems and open-source AI projects, and exploring LLM workflows with a focus on experimentation, reliability, and real-world impact.

GitHub | Resume

About

I'm an AI Engineer who builds systems that actually work, translating research papers into production AI and intelligent agents that solve real problems.

Right now, I'm at Grant Thornton INDUS, where I spend my days building LLM pipelines that automate the messy parts of business operations. Before this, I was researching multimodal medical AI at Neurapex.

Most of my work is on the engineering side of AI: figuring out how to make LLMs retrieve reliably, reason properly, and use tools without breaking in production. You can find me talking about it on X and LinkedIn.

Experience

Grant Thornton INDUS

AI DevSecOps Engineer

Jul 2026 – Present

•Developing enterprise AI agents for workflow orchestration and intelligent task automation using Large Language Models (LLMs) and agentic workflows.
•Building an AI-powered Task Orchestration Agent that automates task assignment, reminder generation, completion tracking, escalations, and operational reporting.
•Designing rule-based workflow orchestration by integrating task ownership, scheduling, business rules, and enterprise systems.
•Collaborating with cross-functional teams to build production-ready AI solutions for internal operations and process automation.

Neurapex AI

AI Research Engineer

Jun 2026 – Jul 2026

•Researched multimodal AI systems for 3D medical imaging using vision-language models across CT and MRI modalities.
•Designed AI pipeline architectures compatible with DICOM imaging standards and PACS-based clinical imaging workflows.
•Evaluated medical vision-language models and representation learning techniques for volumetric/3D medical imaging.
•Explored generalized multimodal learning approaches for building foundation models that transfer across multiple medical imaging modalities.

VRCYN

Data Science Intern

Dec 2024 – May 2025

•Assisted in preparing, cleaning, and structuring customer feedback datasets (chat logs and surveys) using Python, NLTK, spaCy, and SQL.
•Explored and implemented NLP techniques, including LSTM-based sentiment analysis, keyword extraction, and basic topic modeling to identify customer pain points and satisfaction drivers.
•Contributed to insights that supported data-driven discussions for improving product features, customer experience, and internal decision-making.

Featured Projects

ClaimLens

Jan 2026 – Mar 2026

Built a production-grade RAG pipeline for insurance policy analysis across multi-insurer documents.


Designed for real-world insurance policy analysis, where naive RAG fails and structure-aware retrieval wins.

•Built a production-grade Retrieval-Augmented Generation (RAG) pipeline for insurance policy analysis across multi-insurer documents.
•Implemented a deterministic clause splitter parsing legal PDFs into atomic clause chunks across 5 structural formats with canonical clause IDs and duplicate detection.
•Designed a two-stage retrieval architecture: dense embedding retrieval with FAISS (top 40) followed by cross-encoder reranking (top 5), achieving Recall@20 of 0.93 and MRR of 0.89 on single-clause queries.
•Developed a multi-clause evaluation framework with stage-wise diagnostics, achieving Coverage@20: 0.87, Full Recall@20: 0.60, and MRR: 0.77 on composite multi-hop queries.
•Developed an LLM reasoning engine with strict Pydantic schema validation, citation grounding, and JSON retry logic that returns structured answers with confidence and clause citations.
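
For readers unfamiliar with the retrieval metrics quoted above, here is a minimal sketch of how Recall@k and MRR are commonly computed. It is an illustration, not the project's evaluation code: the function names and clause IDs are hypothetical, and `full_recall_at_k` reflects only an assumed reading of "Full Recall@20" (every gold clause for a query appears in the top k).

```python
from typing import Iterable, List, Sequence, Set, Tuple


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Iterable[str], k: int) -> float:
    """Fraction of the relevant clause IDs that appear in the top-k results."""
    relevant: Set[str] = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(ranked_ids[:k]) & relevant) / len(relevant)


def full_recall_at_k(ranked_ids: Sequence[str], relevant_ids: Iterable[str], k: int) -> float:
    """1.0 only if every relevant clause is in the top-k (assumed definition)."""
    return 1.0 if recall_at_k(ranked_ids, relevant_ids, k) == 1.0 else 0.0


def reciprocal_rank(ranked_ids: Sequence[str], relevant_ids: Iterable[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    relevant = set(relevant_ids)
    for rank, clause_id in enumerate(ranked_ids, start=1):
        if clause_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Tuple[Sequence[str], Iterable[str]]]) -> float:
    """Average reciprocal rank over (ranked_ids, relevant_ids) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, relevant) for ranked, relevant in runs) / len(runs)


# Hypothetical query: two gold clauses, four retrieved results.
ranked = ["c3", "c1", "c7", "c2"]
gold = {"c1", "c2"}
print(recall_at_k(ranked, gold, k=2), full_recall_at_k(ranked, gold, k=4))
```

Averaging `recall_at_k` over all queries gives a per-clause measure, while averaging `full_recall_at_k` gives the stricter all-or-nothing number, which is one plausible reason a multi-clause benchmark would report both.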

LangChain, FAISS, RAG, LLM

GitHub ↗

Insurance-Aware RAG

Jan 2026 – Feb 2026

Advanced RAG system for extracting and reasoning over complex insurance policy clauses.

Improved retrieval reliability in complex legal documents with deterministic, traceable clause parsing.

•Implemented deterministic clause splitting with stable canonical IDs (e.g., ICICILombard_p8_Grace_Period_1) to ensure perfect evaluation traceability and prevent vector overwrites.
•Built a two-stage retrieval pipeline: high-recall dense FAISS retrieval using BGE embeddings, followed by cross-encoder reranking to surface the most precise evidence.
•Enforced strict Pydantic schema validation (RAGResponse, Citation) as a structural firewall, rejecting hallucinated citations and logically contradictory LLM outputs.
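
The canonical-ID idea above can be sketched in a few lines. This is an illustrative guess at the scheme, inferred only from the example ID `ICICILombard_p8_Grace_Period_1`; the class name, method name, and the insurer/page/heading/counter layout are assumptions, not the project's actual implementation.

```python
import re
from collections import defaultdict


class ClauseIdAllocator:
    """Assigns stable, unique clause IDs in document order (hypothetical helper)."""

    def __init__(self) -> None:
        # Number of clauses seen so far for each insurer/page/heading prefix.
        self._counts = defaultdict(int)

    def allocate(self, insurer: str, page: int, heading: str) -> str:
        # Collapse anything that is not a letter or digit into single underscores.
        slug = re.sub(r"[^A-Za-z0-9]+", "_", heading).strip("_")
        base = f"{insurer}_p{page}_{slug}"
        # A running suffix keeps repeated headings on one page from colliding,
        # so a store keyed by ID never overwrites an earlier vector.
        self._counts[base] += 1
        return f"{base}_{self._counts[base]}"


allocator = ClauseIdAllocator()
first = allocator.allocate("ICICILombard", 8, "Grace Period")
second = allocator.allocate("ICICILombard", 8, "Grace Period")
print(first, second)
```

Because the suffix depends only on the order in which clauses are parsed, re-running the splitter on the same PDF yields the same IDs, which is what makes evaluation labels traceable across runs.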

FAISS, BM25, Cross-Encoder, RAG

GitHub ↗


Skills

LLM / GenAI

LLMs, RAG, AI Agents, LangChain, LangGraph, MCP, Copilot Studio

Machine Learning

PyTorch, TF, Sklearn, YOLO

Data & Analysis

Pandas, NumPy, Matplotlib

Backend

FastAPI, Docker, Git

Languages

Python, SQL, C++

Achievements

IFERP | Conference

GitHub ↗

•Paper "Real-Time Sign Language Detection Using CNN and OpenCV" accepted at the IFERP International Conference (2024).

GDSC | AI/ML Team

•Team member at Google Developer Student Clubs.
•Helped organize multiple technical and ML-focused events under GDSC.

Education

Indian Institute of Technology Patna

Master of Technology in AI and DSE

Jan 2026 – Present

Lovely Professional University

Bachelor of Technology in Computer Science and Engineering

Aug 2020 – Aug 2024
