Ammar Ammar-Alnagar

Hey, I'm Ammar

I build systems that actually run LLMs, everywhere from Raspberry Pis to B200 servers to your smart fridge (yes, really). It's all about figuring out how to integrate these models into real software and make them work within whatever constraints you've got.

What I'm Working On

Helios-Engine - A Rust-based agent framework with streaming and tool calling. Built this for projects that need reliable LLM integration without the overhead.

Zllm - Building an LLM inference engine from scratch in C++/CUDA. Learning how FlashAttention, paged KV caching, and RadixAttention work under the hood helps a lot when you need to deploy these things in weird places.

MILI - An end-to-end LLM inference engine in Mojo. Experimenting with getting good performance while keeping the code readable. Includes RoPE, RMSNorm, and FlashAttention implementations.

Axion - A Rust inference engine from an earlier project. Good learning experience in getting models to run efficiently in production environments.

⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⣤⡶⠿⠿⠷⣶⣄⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣰⡿⠁⠀⠀⢀⣀⡀⠙⣷⡀⠀⠀⠀
⠀⠀⠀⡀⠀⠀⠀⠀⠀⢠⣿⠁⠀⠀⠀⠘⠿⠃⠀⢸⣿⣿⣿⣿
⠀⣠⡿⠛⢷⣦⡀⠀⠀⠈⣿⡄⠀⠀⠀⠀⠀⠀⠀⣸⣿⣿⣿⠟
⢰⡿⠁⠀⠀⠙⢿⣦⣤⣤⣼⣿⣄⠀⠀⠀⠀⠀⢴⡟⠛⠋⠁⠀
⣿⠇⠀⠀⠀⠀⠀⠉⠉⠉⠉⠉⠁⠀⠀⠀⠀⠀⠈⣿⡀⠀⠀⠀
⣿⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢹⡇⠀⠀⠀
⣿⡆⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣼⡇⠀⠀⠀
⠸⣷⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢠⡿⠀⠀⠀⠀
⠀⠹⣷⣤⣀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⣰⡿⠁⠀⠀⠀⠀
⠀⠀⠀⠉⠙⠛⠿⠶⣶⣶⣶⣶⣶⠶⠿⠟⠛⠉⠀⠀⠀⠀⠀⠀
