AI/ML Engineer focused on LLMOps, large language models, and scalable inference systems.
Experienced in deploying and evaluating open-source LLMs using vLLM, Kubernetes, and GPU-accelerated infrastructure.
Actively working on agentic systems, RAG pipelines, and model evaluation frameworks.
- Open-source LLM evaluation (GSM8K, ARC, MMLU, HumanEval)
- High-throughput inference using vLLM on H100 / A30 GPUs
- Agentic workflows with AutoGen
- Multi-tenant LLM platforms on Kubernetes
