Projects
Project Archive
Generative AI systems, model experiments, and applied AI projects built with real implementation depth.
This page brings together the projects that best reflect how I build with GenAI, explore LLM workflows, work on open-source AI ideas, and experiment across model architecture, retrieval, and computer vision.
Highlighted Work
Featured Projects
The most representative work for structured retrieval, system design, and end-to-end AI engineering.
ClaimLens
Jan 2026 – Mar 2026
Built a production-grade Retrieval-Augmented Generation (RAG) pipeline for insurance policy analysis across multi-insurer documents.
Designed for real-world insurance policy analysis, where naive RAG fails and structure-aware retrieval wins.
- Implemented a deterministic clause splitter that parses legal PDFs into atomic clause chunks across 5 structural formats, with canonical clause IDs and duplicate detection.
- Designed a two-stage retrieval architecture: dense embedding retrieval over a FAISS index (top 40 candidates) followed by cross-encoder reranking (top 5), achieving Recall@20 of 0.93 and MRR of 0.89 on single-clause queries.
- Developed a multi-clause evaluation framework with stage-wise diagnostics, achieving Coverage@20 of 0.87, Full Recall@20 of 0.60, and MRR of 0.77 on composite multi-hop queries.
- Developed an LLM reasoning engine with strict Pydantic schema validation, citation grounding, and JSON retry logic, returning structured answers with confidence scores and clause citations.
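The retrieval metrics quoted above can be illustrated with a small, dependency-free sketch. The definitions here are assumptions rather than the project's actual evaluation code: Coverage@k is taken to be the fraction of a query's required clauses found in the top k results, and Full Recall@k is taken to be 1 only when every required clause is found. All function names are hypothetical.

```python
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """1.0 if any relevant clause ID appears in the top-k results, else 0.0."""
    return 1.0 if any(cid in relevant for cid in ranked[:k]) else 0.0


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant clause ID, or 0.0 if none is retrieved."""
    for rank, cid in enumerate(ranked, start=1):
        if cid in relevant:
            return 1.0 / rank
    return 0.0


def coverage_at_k(ranked: Sequence[str], required: Set[str], k: int) -> float:
    """Assumed definition: share of required clauses present in the top-k."""
    if not required:
        return 0.0
    return len(required.intersection(ranked[:k])) / len(required)


def full_recall_at_k(ranked: Sequence[str], required: Set[str], k: int) -> float:
    """Assumed definition: 1.0 only if every required clause is in the top-k."""
    return 1.0 if required and required.issubset(ranked[:k]) else 0.0


def mean(values: Sequence[float]) -> float:
    """Average per-query scores into a dataset-level metric (e.g. MRR)."""
    return sum(values) / len(values) if values else 0.0


# Hypothetical multi-clause query that needs three clauses to be answered.
ranked_ids = ["c7", "c2", "c9", "c4"]
required_ids = {"c2", "c4", "c5"}
print(coverage_at_k(ranked_ids, required_ids, k=4))
print(full_recall_at_k(ranked_ids, required_ids, k=4))
```

Under these definitions Full Recall@k can never exceed Coverage@k for the same query, which matches the ordering of the reported multi-clause numbers.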
Insurance-Aware RAG
Jan 2026 – Feb 2026
Advanced RAG system for extracting and reasoning over complex insurance policy clauses.
Improved retrieval reliability in complex legal documents with deterministic, traceable clause parsing.
- Implemented deterministic clause splitting with stable canonical IDs (e.g., ICICILombard_p8_Grace_Period_1) so that evaluation results trace back to exact clauses and vectors are not overwritten.
- Built a two-stage retrieval pipeline: high-recall dense FAISS retrieval using BGE embeddings, followed by cross-encoder reranking to surface the most relevant evidence.
- Enforced strict Pydantic schema validation (RAGResponse, Citation) as a structural firewall, rejecting hallucinated citations and logically contradictory LLM outputs.
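As a rough illustration of the ID scheme and the citation check described above, here is a minimal sketch. It assumes the trailing number in an ID such as ICICILombard_p8_Grace_Period_1 is an occurrence counter, which the description does not confirm. It also swaps Pydantic for a standard-library dataclass to stay dependency-free, so the `Citation` fields, `ClauseIdRegistry`, and `ungrounded_citations` are all hypothetical names.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


class ClauseIdRegistry:
    """Builds stable clause IDs and numbers repeated (insurer, page, title) keys."""

    def __init__(self) -> None:
        self._counts: Dict[str, int] = defaultdict(int)

    def next_id(self, insurer: str, page: int, title: str) -> str:
        # Collapse anything that is not a letter or digit into underscores.
        slug = re.sub(r"[^A-Za-z0-9]+", "_", title).strip("_")
        base = f"{insurer}_p{page}_{slug}"
        # Assumption: the suffix counts occurrences, so a second clause with
        # the same heading on the same page gets a distinct ID.
        self._counts[base] += 1
        return f"{base}_{self._counts[base]}"


@dataclass(frozen=True)
class Citation:
    clause_id: str  # hypothetical field: ID of the cited clause
    quote: str      # hypothetical field: supporting text from that clause


def ungrounded_citations(
    citations: Iterable[Citation], retrieved_ids: Set[str]
) -> List[Citation]:
    """Return citations whose clause ID was never retrieved for this query."""
    return [c for c in citations if c.clause_id not in retrieved_ids]


registry = ClauseIdRegistry()
first = registry.next_id("ICICILombard", 8, "Grace Period")
second = registry.next_id("ICICILombard", 8, "Grace Period")
print(first, second)
```

A response containing any ungrounded citation could then be rejected or retried, which is one way to read the "structural firewall" idea.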
Broader Exploration
Other Projects
Additional projects across model architecture, computer vision, and applied ML experimentation.
GPT From Scratch
Oct 2025 – Nov 2025
Built a decoder-only transformer language model from scratch in PyTorch.
Demonstrates a deep understanding of transformer internals: attention, training dynamics, and optimization.
- Built a decoder-only transformer language model from scratch, implementing multi-head self-attention, positional embeddings, feed-forward layers, and layer normalization across 6 transformer blocks with 384-dimensional embeddings.
- Trained a character-level transformer with the AdamW optimizer on 200K tokens of structured Shakespearean dialogue (20K+ lines with speaker annotations), modeling dependencies within a 256-token window and generating coherent multi-turn text; reached a training loss of 1.5 and a validation loss of 1.8.
- Optimized the training pipeline with gradient clipping, dropout (0.2), and efficient batching (64 sequences, 256-token context), reducing overfitting while improving validation stability.
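To give a sense of scale for the architecture above, the following back-of-the-envelope sketch estimates the weight count from the stated sizes (6 blocks, 384-dimensional embeddings, 256-token context). Several inputs are assumptions not stated in the project description: a 4x feed-forward expansion, a vocabulary of 65 characters, and the omission of biases, layer-norm parameters, and the output head. The function name is hypothetical.

```python
from typing import Dict


def transformer_weight_count(
    n_layers: int,
    d_model: int,
    context_len: int,
    vocab_size: int,
    mlp_ratio: int = 4,  # assumed feed-forward expansion factor
) -> Dict[str, int]:
    """Rough weight count; ignores biases, layer norm, and the output head."""
    # Query, key, value, and output projections are each d_model x d_model.
    # Splitting into multiple heads does not change this total.
    attention = 4 * d_model * d_model
    # Feed-forward: one up-projection and one down-projection.
    feed_forward = 2 * mlp_ratio * d_model * d_model
    blocks = n_layers * (attention + feed_forward)
    token_embedding = vocab_size * d_model
    position_embedding = context_len * d_model
    return {
        "blocks": blocks,
        "token_embedding": token_embedding,
        "position_embedding": position_embedding,
        "total": blocks + token_embedding + position_embedding,
    }


# vocab_size=65 is a placeholder for a small character vocabulary.
counts = transformer_weight_count(
    n_layers=6, d_model=384, context_len=256, vocab_size=65
)
print(counts["total"])  # 10740096 under the assumptions above
```

With these assumptions the transformer blocks dominate at roughly 10.6M weights, which is a useful figure to hold against a 200K-token corpus when reading the overfitting controls above.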
Footfall Counting System
May 2025 – Jul 2025
Real-time system to detect, track, and count people in video streams.
Production-ready people counting with 98.2% mAP@0.5, reliable enough for retail and security deployments.
- Built with YOLOv8 for detection and BoT-SORT for long-term ID tracking; adaptive trip-wire logic, motion filtering, and cooldown logic enable accurate directional counts (entries vs. exits).
- Achieved mAP@0.5 of 98.2%, precision of 91.9%, recall of 95.4%, and ID stability of 96.3%.
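The directional counting idea can be sketched without any detector or tracker: given a track ID and a centroid per frame, count a crossing when the centroid changes sides of a virtual line, and suppress repeat counts for the same ID during a cooldown. This is a simplified illustration, not the project's implementation. It tests against an infinite line rather than a segment, omits the adaptive and motion-filtering parts, and the class name, the 30-frame default, and the choice of which direction counts as an entry are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]


def side_of_line(p: Point, a: Point, b: Point) -> float:
    """Signed cross product: the sign tells which side of line a->b holds p."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


@dataclass
class TripWireCounter:
    a: Point
    b: Point
    cooldown_frames: int = 30  # assumed default
    entries: int = 0
    exits: int = 0
    _last_side: Dict[int, float] = field(default_factory=dict)
    _last_count_frame: Dict[int, int] = field(default_factory=dict)

    def update(self, track_id: int, centroid: Point, frame_idx: int) -> None:
        side = side_of_line(centroid, self.a, self.b)
        prev: Optional[float] = self._last_side.get(track_id)
        if side != 0:
            self._last_side[track_id] = side
        # No crossing: first sighting, exactly on the line, or same side.
        if prev is None or side == 0 or (prev > 0) == (side > 0):
            return
        # Cooldown: ignore jittery re-crossings by the same track.
        last = self._last_count_frame.get(track_id)
        if last is not None and frame_idx - last < self.cooldown_frames:
            return
        if side > 0:
            self.entries += 1  # assumption: negative -> positive is an entry
        else:
            self.exits += 1
        self._last_count_frame[track_id] = frame_idx


counter = TripWireCounter(a=(0.0, 5.0), b=(10.0, 5.0))
counter.update(1, (5.0, 3.0), frame_idx=0)
counter.update(1, (5.0, 7.0), frame_idx=1)  # crosses the line: one entry
counter.update(1, (5.0, 3.0), frame_idx=2)  # inside cooldown: ignored
print(counter.entries, counter.exits)
```

In a real pipeline the centroids and IDs would come from the tracker's per-frame output; here they are hand-written values.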
Hand Glove Detection
Jul 2025 – Aug 2025
Custom YOLOv8-based model for detecting and classifying gloved vs bare hands in safety-critical environments.
Lightweight real-time safety classifier deployable in industrial and lab settings.
- Synthesized and preprocessed a custom annotated classification dataset covering gloved and bare-hand instances across varied lighting and backgrounds.
- Trained and evaluated a lightweight YOLOv8 nano model, optimizing for real-time inference speed without sacrificing detection accuracy.
Real-Time Sign Language Recognition
Jan 2024 – Apr 2024
Tracks hand movements and interprets signs from video using a combined CNN + LSTM architecture for spatial and temporal action recognition.
End-to-end gesture recognition pipeline bridging computer vision and sequential modeling.
- Extracted per-frame spatial features with CNN layers.
- Modeled the temporal sequence of actions with LSTM networks.