
👋 Hi, I'm Oshin Dutta! 🤖✨

Applied AI Researcher | AI Solution Architect | Ph.D., IIT Delhi

I specialize in Agentic Frameworks, LLM Optimization, and turning wild AI ideas into scalable production reality. If it hallucinates, I tame it. If it's too big, I compress it! 🗜️🧠

🌐 Website • 📫 Email • 💼 LinkedIn • 🎓 Google Scholar • 📄 Resume/CV


🚀 What I'm building...

  • 🏢 AI Consultant / Applied AI Researcher @ KPMG: Architecting and deploying Enterprise-Scale Agentic Solutions. I recently took a globally adopted, skill-based agentic platform to production, built with FastAPI, React, and Azure OpenAI.
  • πŸ› οΈ Vibe-Coding Full-Stack: Bridging the gap between cutting-edge AI research and slick, scalable, containerised applications.
  • πŸ›‘οΈ Trusted AI Governance: Building novel hallucination detection mechanisms and transforming traditional SDLC into an AI-Driven Life Cycle (AIDLC).

🌟 Featured Open Source

To get a taste of how I architect AI systems, check out my open-source work:

  • πŸ—οΈ EvalAgent – An open-source architecture demo of a full-stack agentic idea-evaluation platform. Built with a FastAPI backend, async SQLAlchemy, Azure OpenAI integration, and a React frontend.
  • 🧠 Efficient AI & LLM Compression – During my Ph.D., I developed VTrans (10× speed-up for LLM fine-tuning) and its upgraded version, TVA-prune (60% GPU inference speed-up for LLaMA/Mistral).
  • πŸƒ Action Recognition Compression – Designed algorithm achieving over 70x compression and ~100x Raspberry Pi speedup vs. full LSTMs. Code Repo

🧰 Tech Stack & Superpowers

  • AI/ML Jedi Skills: Multi-Agent Systems, Structured Tool Calling, LLM Compression & Quantization (LoRA, PEFT), Hardware-Aware NAS.
  • Languages & Frameworks: Python, PyTorch, LangChain, FastAPI, React.
  • MLOps & Deployment: Docker, Azure (App Service, OpenAI, Blob Storage), CI/CD pipelines, async architectures.
  • Research Chops: Published at top-tier venues including ICML and WACV.
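Structured tool calling, listed above, comes down to letting a model emit JSON that a validating dispatcher executes. A minimal stdlib-only sketch with a hypothetical tool registry (no real agent-framework API is assumed here):

```python
import json

# Hypothetical tool registry: each tool declares the parameters the
# model may pass, plus the Python callable behind it.
TOOLS = {
    "get_speedup": {
        "description": "Ratio of baseline to optimized latency.",
        "parameters": {"baseline_ms": "number", "optimized_ms": "number"},
        "fn": lambda baseline_ms, optimized_ms: baseline_ms / optimized_ms,
    },
}

def dispatch(tool_call_json: str):
    """Validate and execute a model-emitted tool call."""
    call = json.loads(tool_call_json)
    tool = TOOLS[call["name"]]          # KeyError -> unknown tool
    args = call["arguments"]
    # Reject arguments the declared schema does not allow.
    unknown = set(args) - set(tool["parameters"])
    if unknown:
        raise ValueError(f"unexpected arguments: {unknown}")
    return tool["fn"](**args)

result = dispatch(
    '{"name": "get_speedup", "arguments": {"baseline_ms": 100, "optimized_ms": 10}}'
)
```

Production frameworks add JSON-schema type checking, retries on malformed calls, and audit logging, but this validate-then-execute loop is the pattern they all share.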

⚡ Fun Facts

  • I can accelerate LLM fine-tuning by 10×, but I still can't speed up my morning coffee. ☕
  • From coding precise lunar landings (IISc Internship) to deploying enterprise multi-agent systems, I love making complex architectures land smoothly! 🌕🚀

📫 Let's Collaborate!

Looking for an AI Solution Architect or Applied Researcher who can speak both "deep learning math" and "production architecture"? Let's talk!

Twitter • GitHub

Pinned Repositories

  1. TVAprune – [ICML 2024 Es-FoMo] Efficient LLM Pruning with Global Token-Dependency Awareness and Hardware-Adapted Inference. (Python)

  2. DCA-NAS – [PReMI 2023] Device-Constraint-Aware Neural Architecture Search: constrains the architecture search to given device limits and speeds up the search itself.

  3. CoFiPruning_RemovedErrors – Forked from princeton-nlp/CoFiPruning. ACL 2022: Structured Pruning Learns Compact and Accurate Models (https://arxiv.org/abs/2204.00408). (Python)

  4. Compression-Related-Papers – Reading list of papers on model compression of transformers, CNNs, and RNNs, and on Neural Architecture Search (NAS), including work on the Variational Information Bottleneck.

  5. tempo-estimation – MATLAB code to estimate the tempo of various genres of music. (MATLAB)