Available for new opportunities

I'm Thillai, an ML Engineer building scalable AI systems.

About Me

I'm an ML Engineer specializing in LLM systems, scalable inference, and production-scale GenAI infrastructure. I'm currently pursuing my Master's in Data Science at Stony Brook University.

My work centers on optimizing LLM inference pipelines - from FlashAttention and speculative decoding to KV cache optimization and CUDA kernel profiling. I build systems that serve models faster and more efficiently at scale.

I'm an active open source contributor to vLLM, llm-d, and EasyEdit, working on the infrastructure that powers LLM serving for thousands of developers. Previously, I shipped production AI systems at Zideas LLC and conducted research at ISRO and IIT Tirupati.

What I bring to the table:

① LLM Inference & Optimization

Deep expertise in LLM serving: vLLM, SGLang, FlashAttention, speculative decoding, quantization, KV cache optimization, Triton kernel development, and CUDA profiling with Nsight and nvprof.

② Scalable AI Systems

Building production-grade AI infrastructure - distributed training (FSDP, DDP), agentic RAG architectures, multi-agent systems, and end-to-end ML pipelines on AWS, GCP, and Kubernetes.

③ Open Source & Research

Active contributor to vLLM (52k+ ★), llm-d, and EasyEdit. Published research in computer vision and deep learning across IEEE, MDPI, and MICCAI venues.

Experience

Work Experience

Dec 2025 - Present

Research Assistant

Stony Brook University - New York, USA

Researching LLM knowledge editing methods and unsafe compliance behavior, investigating why models comply with unsafe requests by tracing that behavior to measurable evidence in the pretraining corpus.
Developing evaluation frameworks to trace model safety failures back to pretraining data, enabling targeted interventions for improving LLM alignment and safety.

May 2025 - Aug 2025

Applied AI Engineer Intern

Zideas LLC - New York, USA

Built a production-grade LLM document intelligence system to autonomously crawl, parse, and validate KYC artifacts across multiple regulatory sources.
Designed an agentic hybrid RAG + vector indexing architecture with optimized LLM inference via prompt compression and caching.

Jan 2024 - Aug 2024

Computer Vision Researcher

ISRO - Liquid Propulsion Systems Centre, Bengaluru

Developed a visual defect detection pipeline for X-ray radiography analysis of welded aerospace components using deep learning.
Designed a SegFormer-based segmentation model integrated with Kubeflow pipelines for automated quality inspection workflows.

May 2023 - Jul 2023

Research Intern

Indian Institute of Technology, Tirupati

Implemented a UniFormer transformer model for liver lesion diagnosis from multi-phase MRI scans.
Ranked among the top 15 teams globally in the MICCAI Liver Lesion Diagnosis Challenge.

Dec 2022 - Apr 2023

Machine Learning Engineer

BillOK

Built an OCR model integrated with a language model to process invoices and extract essential fields for financial operations.
Implemented an automation pipeline linking the system with WhatsApp and email for large-scale invoice processing.

Open Source

Contributions

52k+

vLLM

vllm-project

A high-throughput and memory-efficient inference and serving engine for large language models. Contributed to core infrastructure, improving serving performance and developer experience.

LLM Inference, Python, CUDA, Performance

View on GitHub ↗

2.8k+

llm-d

Distributed LLM serving infrastructure designed for Kubernetes-native deployments. Contributed to the disaggregated serving architecture and deployment tooling for scalable LLM inference.

Kubernetes, Distributed Systems, Go, Infrastructure

View on GitHub ↗

2.7k+

EasyEdit (ACL 2024)

zjunlp

An easy-to-use knowledge editing framework for large language models. Contributed to improving model editing capabilities and extending the framework's support for new editing methods.

Knowledge Editing, LLMs, Python, Research

View on GitHub ↗

Education

Academic Background

M.S. in Data Science

Stony Brook University

Expected Graduation: May 2026

B.Tech. in Computer Science and Engineering

Vellore Institute of Technology

Graduated: May 2024

Competencies

Technical Skills

Languages

Python, C++, CUDA, Triton, Java, Bash, SQL, Rust

ML & Inference

PyTorch, TensorFlow, JAX, vLLM, SGLang, Hugging Face, ONNX Runtime, Triton Kernels, FlashAttention, Speculative Decoding, Quantization, KV Cache Optimization, FSDP / DDP, MLflow, W&B

Cloud, DevOps & Agents

AWS, GCP, Azure, Docker, Kubernetes, Spark, Kafka, Linux, Git, CI/CD, LangChain, LangGraph, LlamaIndex, AutoGen, MCP

Highlights

Featured Highlights

Harvard Project for Asian and International Relations

Delegate for HPAIR Asia Conference 2022

Selected as a delegate for the prestigious HPAIR Asia Conference 2022 in New Delhi, presenting on AI solutions for global crises and climate change.

Research Paper - IEEE

Deep Learning-driven Detection of Nuclear Fusion Ignition

Investigated three deep learning architectures - Transformers, LSTM, and ResNet50 - for nuclear fusion event detection. Transformers achieved the highest accuracy.

Read Paper ↗

Research Paper - IEEE

Martian Terrain Classification through Federated Learning

Developed a novel federated learning approach for multi-class Martian terrain classification using a DenseNet-121 architecture while preserving data privacy.

Read Paper ↗

Association for Computing Machinery (ACM)

Research and Development Head of ACM-VIT Chapter

Served as R&D Head in 2023, fostering a research-oriented culture through Data Science workshops and mentoring aspiring researchers.

Review Article - MDPI

Exploring Huntington's Disease Diagnosis via AI Models

Comprehensive review of AI-powered algorithms for Huntington's Disease diagnosis, analyzing clinical, genetic, and neuroimaging data.

Read Paper ↗

Portfolio

Latest Projects

Multi-Agent AI

Agentic Research Assistant

Multi-agent system that automates academic research tasks, from literature review to research paper generation, using LLM agents.

View Project ↗

Vision-Language Models

Vision Language Driving Perception

VLM fine-tuning pipeline for autonomous driving with distributed training, TensorRT optimization, and custom evaluation metrics.

View Project ↗

Mental Health AI

CBT-Copilot

Fine-tuned Llama-3.2-3B-Instruct for compassionate CBT-style therapeutic conversations while maintaining professional boundaries.

View Project ↗

Generative AI

Flash AI Search Engine

AI-powered search engine using Gemini 2.0 Flash with live web search results for fast, precise, source-backed answers.

View Project ↗

Generative AI

Dynamic Benchmarking Framework

Dynamic benchmarking framework evaluating LLM accuracy using real-time, location-specific data from WeatherAPI.

View Project ↗

Astroinformatics

Continual LIGO Glitch Detection

Continual learning architecture for LIGO glitch detection using Vision Transformer, achieving 93.4% accuracy in glitch classification.

View Project ↗

Generative AI

MediQuill LLM

Fine-tuned Llama-2 7B on curated medical Q&A data to answer questions about diagnoses, treatment recommendations, and drug information.

View Project ↗

Astroinformatics

Super Resolution Astronomical Denoiser

SRGAN for galaxy image denoising, improving PSNR by 32.7% and SSIM by 19.8% using transfer learning techniques.

View Project ↗
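For readers unfamiliar with the PSNR metric quoted above, here is a minimal sketch of how it is commonly computed. It is the standard textbook definition, not the project's own evaluation code; the function name `psnr`, the flat pixel lists, and the 8-bit `max_val` default are assumptions made for illustration.

```python
import math

def psnr(reference, restored, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Hypothetical helper: assumes flat sequences of pixel intensities
    and an 8-bit range by default.
    """
    if len(reference) != len(restored):
        raise ValueError("inputs must have the same number of pixels")
    # Mean squared error between the two images
    mse = sum((r - t) ** 2 for r, t in zip(reference, restored)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_val ** 2 / mse)
```

A higher value means the restored image is closer to the reference, so a denoiser is typically judged by how much it raises PSNR relative to the noisy input.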

Fitness Analytics

AI-powered Virtual Fitness Trainer

Real-time exercise tracking using MediaPipe for body landmark detection, angle calculation, and form correction feedback.

View Project ↗
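The angle calculation step mentioned above can be sketched as follows. This is one common way to compute a joint angle from three 2D landmarks, not necessarily the project's implementation; the function name `joint_angle` and the `(x, y)` tuple inputs are assumptions, and the landmarks would come from whatever pose estimator is in use.

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at point b, formed by the segments b->a and b->c.

    Hypothetical helper: a, b, c are (x, y) tuples, for example
    shoulder, elbow, and wrist coordinates for an elbow angle.
    """
    # Direction of each segment measured from the joint b
    angle_to_a = math.atan2(a[1] - b[1], a[0] - b[0])
    angle_to_c = math.atan2(c[1] - b[1], c[0] - b[0])
    angle = abs(math.degrees(angle_to_c - angle_to_a))
    # Fold into [0, 180] so the result is the interior joint angle
    return 360.0 - angle if angle > 180.0 else angle
```

A form check could then compare this angle against a threshold per exercise, for instance counting a repetition when the elbow angle passes below and back above chosen limits.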
