Deciphering the GPU manuscript.....

Ammar-Alnagar/README.md

Hey, I'm Ammar

I build systems that actually run LLMs, everywhere from Raspberry Pis to B200 servers to your smart fridge (yes, really). Most of the work is figuring out how to integrate these models into real software and make them run within whatever constraints you've got.

What I'm Working On

Helios-Engine - A Rust-based agent framework with streaming and tool calling. I built this for projects that need reliable LLM integration without extra overhead.

Zllm - An LLM inference engine I'm building from scratch in C++/CUDA. Learning how FlashAttention, paged KV caching, and RadixAttention work under the hood helps a lot when you need to deploy these things in weird places.

MILI - An end-to-end LLM inference engine in Mojo. Experimenting with getting good performance while keeping the code readable. Includes RoPE, RMSNorm, and FlashAttention implementations.

Axion - A Rust inference engine from an earlier project. Good learning experience in getting models to run efficiently in production environments.

⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⣤⡶⠿⠿⠷⣶⣄⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣰⡿⠁⠀⠀⢀⣀⡀⠙⣷⡀⠀⠀⠀
⠀⠀⠀⡀⠀⠀⠀⠀⠀⢠⣿⠁⠀⠀⠀⠘⠿⠃⠀⢸⣿⣿⣿⣿
⠀⣠⡿⠛⢷⣦⡀⠀⠀⠈⣿⡄⠀⠀⠀⠀⠀⠀⠀⣸⣿⣿⣿⠟
⢰⡿⠁⠀⠀⠙⢿⣦⣤⣤⣼⣿⣄⠀⠀⠀⠀⠀⢴⡟⠛⠋⠁⠀
⣿⠇⠀⠀⠀⠀⠀⠉⠉⠉⠉⠉⠁⠀⠀⠀⠀⠀⠈⣿⡀⠀⠀⠀
⣿⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢹⡇⠀⠀⠀
⣿⡆⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣼⡇⠀⠀⠀
⠸⣷⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢠⡿⠀⠀⠀⠀
⠀⠹⣷⣤⣀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⣰⡿⠁⠀⠀⠀⠀
⠀⠀⠀⠉⠙⠛⠿⠶⣶⣶⣶⣶⣶⠶⠿⠟⠛⠉⠀⠀⠀⠀⠀⠀

Pinned

  1. Helios-Engine

    Helios Engine is a powerful and flexible Rust framework for building LLM-powered agents with tool support, chat capabilities, and easy configuration management. Create intelligent agents that can i…

    Rust, 40 stars, 3 forks

  2. YAIE

    YAIE (Yet Another Inference Engine) is an educational project designed to help students and developers understand how modern LLM inference engines work. This implementation is inspired by state-of-…

    Python

  3. VisoLearn

    VisoLearn-2 is an AI-powered educational platform designed specifically for children with Autism Spectrum Disorder (ASD). Our mission is to leverage cutting-edge artificial intelligence to create p…

    Python

  4. MILI

    A comprehensive, hands-on guide to building a high-performance LLM inference system in Mojo and Python.

    Mojo

  5. Axion

    Axion is a high-performance LLM serving platform built with Rust that provides OpenAI-compatible APIs for chat completions, embeddings, and reranking. Designed for production environments, Axion de…

    Rust 1

  6. cronnx

    Cronnx is a high-performance, asynchronous Machine Learning inference server built in Rust. It demonstrates how to take a raw ONNX model and serve it via a robust HTTP API with features like dynami…

    Rust