Lead Applied Scientist (AI/ML) • Production AI & LLM Systems Engineer

Building production-grade AI and LLM systems that scale from research to real-world impact.

Hi, I am Anmol Gautam, a Lead Applied Scientist specializing in large-scale GenAI systems. I build, optimize, and deploy LLM pipelines, including fine-tuning, pruning, quantization, and GPU-backed inference, for enterprise Text-to-SQL, RAG, and agentic platforms.

  • Open-source LLM training & alignment
  • Production Text-to-SQL & RAG
  • GPU inference with vLLM & SGLang
  • Multi-agent platforms in production

Technologies I work with

Agentic AI
Multi-Agent Systems
RAG Systems
Text-to-SQL
Fine-Tuning (SFT, DPO, GRPO)
LoRA / PEFT
Model Pruning & Quantization
LLM Evaluation
Large Language Models (LLMs)
PyTorch
Hugging Face Transformers
NLP
Computer Vision
vLLM
SGLang
GPU Inference
Python
FastAPI
Go
Node.js
REST APIs
SSE
WebSockets
PostgreSQL
MongoDB
Trino
ChromaDB
Milvus
pgvector
Docker
AWS
Azure
MLflow
About Me

Building the future, from applied research to production AI systems.

I am a Lead Applied Scientist with 4+ years of experience architecting and shipping production-grade AI systems, from multi-agent platforms and Text-to-SQL engines to enterprise RAG and real-time multimodal solutions.

My work sits at the intersection of applied research, large-scale system design, and real-world deployment. I have built and owned enterprise GenAI platforms including Text-to-SQL, RAG pipelines, GPU-backed inference systems, and multi-agent architectures running in production.

Across roles at 8bit.ai, SuperAGI, Oracle, and NVIDIA, I have taken systems end-to-end: from research and experimentation, through model fine-tuning and alignment (SFT, DPO, GRPO), to deployment on scalable infrastructure with inference optimization using vLLM and SGLang.

Alongside industry work, I have 6 published papers across IEEE, Springer, and arXiv spanning LLMs, multimodal learning, and computer vision. My master’s research at NIT Meghalaya, completed in collaboration with NVIDIA, received the Institute Best Thesis Award and achieved state-of-the-art results.

🎓

M.Tech CSE - NIT Meghalaya | Gold Medalist | 10.0 CGPA

Institute Best Master’s Thesis Award - Region of Interest Segmentation in Biomedical Images

“I care deeply about systems that actually ship, balancing performance, reliability, and cost, and about building AI products that move beyond demos into real impact.”

Production LLM Systems

Designing, optimizing, and deploying large-scale LLM pipelines with GPU-backed inference, pruning, and quantization.

Enterprise RAG & Text-to-SQL

Building reliable retrieval and structured-query systems used in real enterprise workflows.

Agentic Architectures

Developing multi-agent platforms with tool orchestration, planning, and execution in production environments.

Applied Research → Production

Translating research ideas into shipped systems, backed by evaluation, experimentation, and peer-reviewed work.

Career Journey

Experience that speaks volumes.

A timeline of my professional growth, from curious beginner to Lead Applied Scientist leading teams and building products at scale.

Oct 2024 - Present

Lead Applied Scientist (AI/ML)

8bit.ai

Current
  • Architected Neutrino, a multi-agent AI platform powering enterprise search, Text-to-SQL, and workflow automation with human-in-the-loop execution and multi-LLM orchestration, deployed across 5 major ISV partners.
  • Fine-tuned domain-specific LLMs using LoRA, DoRA, PEFT, and alignment techniques (DPO, GRPO).
  • Built a multi-schema Text-to-SQL engine using agentic ReAct workflows across PostgreSQL and Trino.
  • Designed end-to-end enterprise search spanning RAG ingestion, hybrid retrieval, knowledge-graph augmentation, PII tagging, and multi-tenant data discovery.
  • Optimized inference latency and cost using vLLM and SGLang, and delivered real-time multimodal solutions including voice and sign-language AI.
Multi-Agent Systems, Text-to-SQL, RAG, vLLM, SGLang, FastAPI, SSE, LoRA, DPO, GRPO, PostgreSQL, Trino, Azure
Nov 2023 - Oct 2024

Applied Scientist (AI/ML)

SuperAGI

  • Built Text-to-SQL and RAG-based conversational multi-agent systems for SuperSales.
  • Developed SuperCoder 2.0, an autonomous code navigation and issue-resolution system achieving 33% on SWE-Bench-Lite.
  • Architected a fully autonomous multi-agent platform using open-source and closed-source LLMs with ReAct-style execution and Planner-Orchestration patterns, taking projects from PoC to AWS production.
  • Developed SAM-7B, an instruction-tuned Mistral-7B model achieving GPT-3.5-comparable performance and outperforming Orca on GSM8K and ARC.
Multi-Agent Systems, RAG, Text-to-SQL, Open-Source LLMs, SFT, DPO, LoRA / PEFT, AWS
Aug 2022 - Oct 2023

Associate Consultant

Oracle

  • Built document AI and information extraction pipelines using OCI Document Understanding and EasyOCR, improving NER and key-value extraction by 7%.
  • Developed RAG-based question answering and computer vision systems using Falcon, Llama, ChromaDB, TensorFlow, and transfer learning.
  • Built a face recognition pipeline using transfer learning that improved performance by 37%.
RAG, OCR, NER, Falcon, LLaMA, ChromaDB, TensorFlow, OCI
May 2021 - Apr 2022

Research Intern

NVIDIA

  • Worked on NLP and computer vision systems using NVIDIA NeMo, Hugging Face, and transfer learning.
  • Built English-to-Hindi machine translation, object detection, and image segmentation pipelines.
NVIDIA NeMo, Machine Translation, Computer Vision, Transformers, PyTorch
Research

Research that informs real-world AI systems.

Alongside building production AI platforms, I actively publish and experiment on problems across LLMs, multimodal learning, and computer vision. My research focuses on ideas that translate into practical, deployable systems.

arXiv 2024

SuperCoder 2.0: Exploring the Feasibility of LLMs as Autonomous Programmers

A technical study on multi-agent autonomous code generation and navigation systems, evaluated on SWE-Bench-Lite. Explores how LLM-driven agents can reason, plan, and execute code-level tasks.

arXiv 2024

Veagle: Advancements in Multimodal Representation Learning

Research on multimodal representation learning, focusing on improved alignment across vision and language modalities.

IEEE 2022

SAU-NET: Scale-Aware Polyp Segmentation using Encoder–Decoder Networks

Introduces a scale-aware segmentation architecture for biomedical imaging, achieving state-of-the-art performance and outperforming UNet and DeepLab variants.

Springer 2022

ED-NET: Educational Teaching Video Classification Network

Proposes a deep learning architecture for classifying educational videos, combining CNN and RNN-based temporal modeling.

IEEE 2022

Batch Image Encryption and Compression using Chaotic Map Infused Autoencoder Network

A novel approach combining chaotic map-based encryption with autoencoder networks for secure batch image encryption and compression.

IEEE 2022

Li-SegPNet: Encoder-Decoder Mode Lightweight Segmentation Network for Colorectal Polyps Analysis

A lightweight encoder-decoder segmentation network designed for efficient colorectal polyp analysis in biomedical imaging.

Selected Systems & Research

A curated selection of applied research and production AI systems spanning autonomous agents, open-source LLMs, enterprise RAG, and large-scale inference infrastructure.

Dendrux: Open-Source Runtime for Real-World Agents

Dendrux is an open-source runtime built for real-world agents that need to do more than chat. It provides tool calling, persistence, observability, and a client-tool bridge that lets agents pause mid-execution, hand off to a human or external system, and resume exactly where they left off.

  • Tool calling with pause/resume execution
  • Built-in persistence & observability
  • FastAPI/SSE hosting out of the box
  • Client-tool bridge for human-in-the-loop
Open Source, Agentic AI, FastAPI, Tool Calling, SSE
SuperCoder 2.0: Autonomous Multi-Agent Programming System

An autonomous multi-agent coding system that reasons, plans, navigates repositories, and executes code changes. SuperCoder 2.0 achieved 33% on SWE-Bench-Lite, outperforming several open coding agents and approaching proprietary systems.

  • Multi-agent planning & execution loops
  • Repository-level code navigation
  • Custom RAG pipelines for code context
  • Evaluated on SWE-Bench-Lite
Multi-Agent Systems, Autonomous Coding, RAG, LLM Evaluation
SAM-7B: Small Agentic Model (Open-Source LLM)

SAM is an open-source 7B parameter agentic language model fine-tuned for reasoning and task execution using explanation-trace supervision. Designed for efficiency, controllability, and strong reasoning performance in agent workflows.

  • LoRA-based fine-tuning & PEFT
  • Explanation-trace dataset construction
  • Agent-centric instruction tuning
  • 3k+ Hugging Face downloads
Open-Source LLMs, LoRA / PEFT, Agentic Models, Alignment

Other Production Systems

Enterprise Text-to-SQL & RAG Platform

A production conversational AI system translating natural-language queries into SQL and retrieval workflows over structured and unstructured enterprise data.

  • Schema-aware query generation
  • Latency vs correctness tradeoffs
  • Evaluation & monitoring in production

LLM Inference & Optimization Platform

GPU-backed inference infrastructure built using vLLM and SGLang, with autoscaling and optimization pipelines for cost-efficient deployment.

  • Quantization & pruning pipelines
  • High-throughput GPU inference
  • Benchmarking & cost optimization
Contact

Let’s connect and talk systems.

I’m always open to discussing AI systems, research, collaboration, and interesting opportunities.

Open to collaboration & opportunities