# SentinelAI

SentinelAI is a full-stack, GPU-accelerated AI inference platform designed for real-world production deployment. It is built with FastAPI, CUDA, Llama 3, Kubernetes, MLflow, Prometheus, and Streamlit, with CI/CD handled by GitHub Actions.
## ✨ Features

- 🔥 Llama 3 inference (CUDA-accelerated)
- ⚡ FastAPI REST API
- 📊 Streamlit live dashboard
- 📦 Docker + Render deployment
- ☸️ Kubernetes GPU orchestration
- 📈 Prometheus monitoring
- 🤖 n8n automation workflows
- 🔐 Rate limiting & auth ready
- 🧪 Test suite + CI/CD
## 🛠️ Installation

```bash
git clone https://github.com/Trojan3877/SentinelAI
cd SentinelAI
pip install -r requirements.txt
uvicorn api.main:app --reload
```
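The API entrypoint referenced above is `api/main.py`. As a minimal sketch, assuming an `/infer` endpoint and a placeholder model backend (the real module will differ), it could look like this:

```python
# api/main.py — illustrative sketch only; endpoint paths, schema,
# and model wiring are assumptions, not the repository's actual code.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="SentinelAI")


class InferenceRequest(BaseModel):
    prompt: str
    max_tokens: int = 256


def generate(prompt: str, max_tokens: int) -> str:
    # Stand-in for the CUDA-accelerated Llama 3 backend
    # (e.g., llama-cpp-python or transformers in the real project).
    return f"echo: {prompt[:max_tokens]}"


@app.get("/health")
async def health() -> dict:
    return {"status": "ok"}


@app.post("/infer")
async def infer(req: InferenceRequest) -> dict:
    return {"completion": generate(req.prompt, req.max_tokens)}
```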
<img width="1536" height="1024" alt="image" src="https://github.com/user-attachments/assets/4cbc93b4-a7ba-4615-9ddb-82f06745151a" />
## 🧰 Tech Stack

**Frontend**
- Next.js (TypeScript)
- Tailwind CSS
- Streamlit (live metrics dashboard)

**Backend**
- FastAPI
- Llama 3 (CUDA)
- Auth + rate limiting
- MLflow experiment tracking (see the sketch below)

**Infrastructure**
- Docker
- Kubernetes (GPU scheduling)
- Prometheus
- Render deployment
- GitHub Actions CI/CD
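As a minimal sketch of the MLflow tracking piece, an inference call could be logged like this (the run, parameter, and metric names are assumptions, not the repository's actual schema):

```python
# Illustrative only — metric and parameter names are hypothetical.
import time

import mlflow


def log_inference_run(prompt: str, completion: str, latency_s: float) -> None:
    """Record a single inference call as an MLflow run."""
    with mlflow.start_run(run_name="llama3-inference"):
        mlflow.log_param("prompt_chars", len(prompt))
        mlflow.log_metric("latency_seconds", latency_s)
        mlflow.log_metric("completion_tokens", len(completion.split()))


start = time.perf_counter()
completion = "..."  # stand-in for the model's output
log_inference_run("hello", completion, time.perf_counter() - start)
```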
## ⚡ Quick Start (Local)

```bash
docker compose up --build
```

- API → http://localhost:8000
- UI → http://localhost:3000
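Once the stack is up, you can smoke-test the API from Python. The `/infer` path and payload shape below match the endpoint sketch above and are assumptions about the real API:

```python
# Quick smoke test against the locally running API.
import requests

resp = requests.post(
    "http://localhost:8000/infer",
    json={"prompt": "Summarize SentinelAI in one sentence.", "max_tokens": 64},
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```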
## ☸️ Kubernetes Deployment

```bash
kubectl apply -f k8s/
```

Supports NVIDIA GPU nodes, metrics scraping, and horizontal scaling.
## 🧪 Testing

```bash
pytest tests/
```

The suite includes:

- Health checks
- Auth validation
- Rate limiting
- LLM inference validation
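As a hedged illustration, a minimal health-check test against the endpoint sketched earlier might look like this (the real tests may be organized differently):

```python
# tests/test_health.py — illustrative sketch; file name is hypothetical.
from fastapi.testclient import TestClient

from api.main import app

client = TestClient(app)


def test_health_returns_ok() -> None:
    resp = client.get("/health")
    assert resp.status_code == 200
    assert resp.json() == {"status": "ok"}
```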
## 📊 Observability

- `/metrics` → Prometheus (sketch below)
- MLflow UI → experiment tracking
- Streamlit → live dashboard
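One common way to expose `/metrics` from FastAPI is via `prometheus_client`'s ASGI app. This is a minimal sketch under that assumption; the metric name is hypothetical and the project may use a different instrumentation library:

```python
# Illustrative sketch — the project may wire up metrics differently.
from fastapi import FastAPI
from prometheus_client import Counter, make_asgi_app

app = FastAPI()

# Counts inference requests served; the metric name is hypothetical.
INFERENCE_REQUESTS = Counter(
    "sentinelai_inference_requests_total",
    "Total number of inference requests served.",
)

# Expose Prometheus metrics at /metrics for scraping.
app.mount("/metrics", make_asgi_app())


@app.post("/infer")
async def infer() -> dict:
    INFERENCE_REQUESTS.inc()
    return {"status": "accepted"}
```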
## 🎯 Why SentinelAI

- ✔ Production-ready
- ✔ GPU-accelerated
- ✔ Full-stack TypeScript + Python
- ✔ MLOps + platform engineering
- ✔ Recruiter-credible system design
## 🧠 Design Questions & Reflections
**Q: What problem does this project aim to solve?**

A: SentinelAI is designed to explore how a real-time monitoring and alerting system could automatically detect and respond to important changes in live streams of data. The goal wasn't just to build alerts, but to investigate how pattern detection, rule-based triggers, and scalable event processing work together in a monitoring pipeline that could be extended to different domains.
**Q: Why did I choose this approach instead of alternatives?**

A: I chose a modular architecture that separates ingestion, detection, and notification logic, making it easier to experiment with different detection strategies and to scale individual components independently (a simplified sketch follows below). This was more complex than a monolithic script, but it allowed clearer reasoning about how each part contributes to overall behavior.
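To make that separation concrete, here is a toy sketch of the three-stage split; every name here is hypothetical, not taken from the codebase:

```python
# Hypothetical sketch of the ingestion → detection → notification split.
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Event:
    source: str
    value: float


def ingest(raw_records: Iterable[dict]) -> list[Event]:
    """Normalize raw records into typed events."""
    return [Event(r["source"], float(r["value"])) for r in raw_records]


def detect(events: list[Event], threshold: float) -> list[Event]:
    """Flag events whose value crosses a simple threshold."""
    return [e for e in events if e.value > threshold]


def notify(alerts: list[Event], send: Callable[[str], None]) -> None:
    """Hand flagged events to whatever notification channel is configured."""
    for alert in alerts:
        send(f"ALERT {alert.source}: value={alert.value}")


# Each stage can be swapped or scaled independently.
notify(detect(ingest([{"source": "gpu0", "value": 97.5}]), threshold=90.0), print)
```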
**Q: What were the main trade-offs I made?**

A: The main trade-off was flexibility versus immediate simplicity. I could have built a quick script that triggered hard-coded alerts, but that wouldn't scale or adapt to varied signal types. By building modular components with an event-driven mindset, I gained clarity and extensibility at the cost of added upfront complexity.
**Q: What didn't work as expected?**

A: One challenge was balancing detection sensitivity against false positives. Initially the system generated too many alerts, most of them not meaningful, which made it hard to trust its output. That forced me to think about thresholds and event aggregation rather than just adding more detection rules (a toy illustration follows below).
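As an illustration of that aggregation idea (not the project's actual logic), requiring several detections inside a sliding time window before firing suppresses one-off noise:

```python
# Hypothetical illustration: fire an alert only if min_events occur
# within window_s seconds, rather than on every raw detection.
from collections import deque


class WindowedAlerter:
    def __init__(self, min_events: int, window_s: float) -> None:
        self.min_events = min_events
        self.window_s = window_s
        self.timestamps: deque[float] = deque()

    def observe(self, ts: float) -> bool:
        """Record one detection at time ts; return True if an alert should fire."""
        self.timestamps.append(ts)
        # Drop detections that have fallen out of the window.
        while self.timestamps and ts - self.timestamps[0] > self.window_s:
            self.timestamps.popleft()
        return len(self.timestamps) >= self.min_events


alerter = WindowedAlerter(min_events=3, window_s=60.0)
print([alerter.observe(t) for t in (0.0, 10.0, 20.0, 300.0)])
# → [False, False, True, False]
```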
**Q: What did I learn from building this project?**

A: I learned how critical evaluation and tuning are when moving from simple logic to production-style detection systems. Building observability into the system early helped me understand failure patterns and iteratively refine detection criteria rather than guess at configurations.
**Q: If I had more time or resources, what would I improve next?**

A: I would add more advanced anomaly detection modules using basic statistical techniques or lightweight ML models so the system could adapt its sensitivity over time. I'd also build clearer logging and dashboard integration so a user could visually explore why alerts were generated.
## 👤 Author

**Corey Leath**
- Senior Software Engineering Student
- AI / ML / Platform Engineering
- https://github.com/Trojan3877
