A production-grade ML pipeline demonstrating end-to-end machine learning engineering: from exploratory research to deployed web application with REST API, comprehensive testing, and CI/CD automation.
🚀 Try the Live Application | 📊 View Research Notebook | 🚂 Trigger a Training Run | 📚 Read Documentation
- Project Highlights
- Live Application
- Research Notebook
- Quick Start
- Architecture
- Model Performance
- Technology Stack
- Development
- Deployment
- Contact
This project showcases professional ML engineering practices through two complementary components:
Comprehensive Jupyter notebook featuring:
- Exploratory Data Analysis (EDA) with 20+ visualizations
- Advanced Feature Engineering: Title extraction, cabin analysis, fare normalization
- Model Comparison: Evaluated 8 algorithms (Random Forest, XGBoost, CatBoost, SVM, etc.)
- Hyperparameter Optimization: GridSearchCV with 5-fold cross-validation
- Model Interpretability: SHAP analysis, feature importance, force plots
Enterprise-ready web application with:
- RESTful API with Swagger/OpenAPI documentation
- Flask Web Interface for real-time predictions with confidence scores
- Modular ML Pipeline: Separate data ingestion, transformation, and training modules
- Production Best Practices: Type hints, logging, error handling, comprehensive testing
- CI/CD Pipeline: Automated testing, security scanning, Docker builds
- Cloud Deployment: Azure App Service with container deployment
Key Differentiator: Unlike typical Kaggle projects, this demonstrates the complete ML lifecycle from research to production deployment.
Experience a production-grade ML training pipeline running entirely in GitHub Actions. Watch the complete model training process from data download to performance reporting.
- View Past Training Runs - See detailed logs and metrics from previous executions
- Trigger a Training Run - Click "Run workflow" to start a new training session
- Download Artifacts - Get trained models, predictions, and detailed performance reports
The automated pipeline executes a complete ML workflow (~15 minutes total):
- 📥 Data Acquisition (10s) - Downloads Titanic dataset from Kaggle API or fallback source
- ✅ Data Validation (5s) - Verifies data integrity and structure
- 🧪 Pre-Training Tests (15s) - Runs unit tests to ensure code quality
- 🚂 Model Training (~12 minutes) - Trains ensemble of 6 models with hyperparameter optimization
- 📊 Performance Reporting (30s) - Generates comprehensive metrics and visualizations
- 🎯 Kaggle Submission (20s) - Creates competition-ready prediction file
Each run produces downloadable artifacts:
- Trained Models (.pkl) - Serialized model objects ready for deployment
- Performance Metrics (JSON) - Accuracy, precision, recall, F1 scores
- Training Report (Markdown) - Detailed analysis with metric tables
- Kaggle Submission (CSV) - Competition-ready predictions
- Training Logs (TXT) - Complete execution trace
- Automatic Data Download - No data files committed to repo
- Comprehensive Logging - Track every step with detailed status indicators
- Error Handling - Graceful fallbacks for missing data sources
- Performance Tracking - Timing and metrics for each pipeline stage
- Artifact Management - 90-day retention for models, 30-day for submissions
- Job Summaries - Beautiful GitHub Actions dashboard with key metrics
- Multiple Triggers - Manual, scheduled (weekly), or on code changes
💡 Note: Training takes approximately 12-15 minutes to complete. The model training phase itself requires ~12 minutes due to extensive hyperparameter optimization across multiple algorithms.
🔮 Real-Time Predictions
- Enter passenger details through intuitive web form
- Receive instant survival prediction with confidence percentage
- Confidence scores are realistic probabilities across the full range (not just 0% or 100%)
🎨 User Interface
- Clean, responsive design optimized for mobile and desktop
- Clear visualization of prediction results
- Direct links to research notebook and project repository
🔌 REST API
- `/api/predict` - Get prediction with JSON input
- `/api/health` - Health check endpoint
- `/api/docs` - Interactive Swagger UI documentation
- Full CORS support for integration
📊 Under the Hood
- VotingClassifier ensemble (6 models)
- Automatic feature engineering (family size, title extraction, fare inference)
- Confidence calculated by averaging individual model probabilities
- Input validation with Pydantic schemas
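To make the validation layer concrete, here is a minimal sketch of a Pydantic schema for the documented request payload. The field names mirror the example request below; the project's actual schema lives in `titanic_ml/models/schemas.py` and may differ in detail.

```python
from typing import Literal

from pydantic import BaseModel, Field

# Hypothetical schema mirroring the documented /api/predict payload;
# the real definitions are in titanic_ml/models/schemas.py.
class PassengerInput(BaseModel):
    age: float = Field(ge=0, le=120)
    sex: Literal["male", "female"]
    pclass: Literal["1", "2", "3"]
    sibsp: int = Field(ge=0)
    parch: int = Field(ge=0)
    embarked: Literal["C", "Q", "S"]
    name_title: str
    cabin_multiple: int = Field(ge=0)

# Invalid payloads raise pydantic.ValidationError before any model code runs.
PassengerInput(age=22, sex="female", pclass="1", sibsp=0, parch=0,
               embarked="C", name_title="Miss", cabin_multiple=1)
```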
Example API Request:
```bash
curl -X POST https://YOUR-AZURE-APP.azurewebsites.net/api/predict \
  -H "Content-Type: application/json" \
  -d '{
    "age": 22,
    "sex": "female",
    "pclass": "1",
    "sibsp": 0,
    "parch": 0,
    "embarked": "C",
    "name_title": "Miss",
    "cabin_multiple": 1
  }'
```

Response:

```json
{
  "prediction": "survived",
  "confidence": "high",
  "probability": 0.968,
  "message": "Passenger likely survived with 96.8% confidence",
  "features": {
    "family_size": 1,
    "inferred_fare": 84.15
  }
}
```

The Jupyter notebook contains the complete data science workflow with reproducible results and detailed analysis.
- Dataset overview and statistics
- Missing value analysis (Age: 19.9%, Cabin: 77.1%, Embarked: 0.2%)
- Distribution analysis with histograms and box plots
- Correlation heatmaps and pair plots
- Survival rate analysis by features (Sex, Pclass, Age groups, etc.)
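For reference, the core of that analysis is a few lines of pandas; this sketch assumes the standard Kaggle `train.csv` under `data/raw/`:

```python
import pandas as pd

df = pd.read_csv("data/raw/train.csv")

# Fraction of missing values per column (Age ≈ 19.9%, Cabin ≈ 77.1%,
# Embarked ≈ 0.2% on the standard training set)
missing_pct = (df.isnull().mean() * 100).sort_values(ascending=False)
print(missing_pct.round(1).to_string())

# Survival rate broken down by sex and passenger class
print(df.groupby(["Sex", "Pclass"])["Survived"].mean().round(2))
```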
Advanced feature creation demonstrating domain knowledge:
- Title Extraction: From Name field (Mr, Mrs, Miss, Master, Rare titles)
- Family Features: `family_size`, `is_alone` flags
- Cabin Analysis: Deck extraction, cabin multiplicity
- Fare Engineering: Log transformation, per-person fare, fare groups
- Interaction Features: `sex_pclass`, `age_class`
- Age Imputation: Median by title and class
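A hedged sketch of what these transforms might look like in pandas; column names follow the Kaggle dataset, while the project's actual implementation lives in the notebook and `titanic_ml/features/build_features.py`:

```python
import numpy as np
import pandas as pd

def engineer_features(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative version of the feature engineering steps above."""
    out = df.copy()

    # Title extraction from "Surname, Title. Given names"
    out["title"] = out["Name"].str.extract(r",\s*([^.]+)\.", expand=False)
    common = {"Mr", "Mrs", "Miss", "Master"}
    out["title"] = out["title"].where(out["title"].isin(common), "Rare")

    # Family features
    out["family_size"] = out["SibSp"] + out["Parch"] + 1
    out["is_alone"] = (out["family_size"] == 1).astype(int)

    # Fare engineering: log transform and per-person fare
    out["log_fare"] = np.log1p(out["Fare"])
    out["fare_per_person"] = out["Fare"] / out["family_size"]

    # Age imputation: median age within each (title, class) group
    out["Age"] = out.groupby(["title", "Pclass"])["Age"].transform(
        lambda s: s.fillna(s.median())
    )
    return out
```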
Systematic comparison of multiple algorithms:
| Model | Baseline Accuracy | Tuned Accuracy | Improvement |
|---|---|---|---|
| Logistic Regression | 80.2% | 82.1% | +1.9% |
| K-Nearest Neighbors | 78.5% | 81.3% | +2.8% |
| Support Vector Machine | 81.5% | 83.7% | +2.2% |
| Random Forest | 79.8% | 84.2% | +4.4% |
| XGBoost | 82.1% | 85.9% | +3.8% |
| CatBoost | 81.9% | 85.1% | +3.2% |
| Voting Ensemble | - | 86.2% | Best |
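The "Tuned Accuracy" column comes from grid search; the sketch below shows that step for a single estimator. The parameter grid and the synthetic stand-in data are illustrative assumptions, not the notebook's actual search space:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the engineered Titanic features
X, y = make_classification(n_samples=800, n_features=10, random_state=42)

# Illustrative grid; the notebook searches a larger space per model
param_grid = {
    "n_estimators": [200, 400],
    "max_depth": [4, 6, 8],
    "min_samples_split": [2, 5],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    cv=5,                # 5-fold cross-validation, as described above
    scoring="accuracy",
    n_jobs=-1,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```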
- SHAP Analysis: Waterfall plots, force plots, summary plots
- Feature Importance: Random Forest, XGBoost importances
- Permutation Importance: Model-agnostic feature ranking
- Partial Dependence Plots: Effect of individual features
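Roughly how such plots are produced, shown here with an XGBoost model on synthetic stand-in data (the notebook applies the same idea to the engineered Titanic features):

```python
import pandas as pd
import shap
import xgboost as xgb
from sklearn.datasets import make_classification

# Synthetic stand-in for the engineered feature matrix
X_arr, y = make_classification(n_samples=500, n_features=8, random_state=42)
X = pd.DataFrame(X_arr, columns=[f"f{i}" for i in range(8)])

model = xgb.XGBClassifier(n_estimators=100, random_state=42).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)   # one row of SHAP values per sample

shap.summary_plot(shap_values, X)        # global feature importance
shap.force_plot(explainer.expected_value, shap_values[0], X.iloc[0],
                matplotlib=True)         # explanation for a single prediction
```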
Most Predictive Features:
- Sex (female = 74% survival, male = 19% survival)
- Passenger Class (1st = 63%, 2nd = 47%, 3rd = 24%)
- Fare (log-transformed, normalized)
- Title (grouped: Mr, Mrs, Miss, Master, Rare)
- Age (children < 12 had higher survival)
Key Findings:
- Feature engineering improved accuracy by 3-5%
- Ensemble methods outperformed individual models
- Cross-validation showed stable performance (σ < 2%)
- Model achieves 86.2% accuracy on validation set
- Python 3.11+ (tested on 3.11, 3.12, 3.13)
- uv package manager (recommended)
- Docker (optional, for containerized deployment)
```bash
# Clone repository
git clone https://github.com/JustaKris/Titanic-Machine-Learning-from-Disaster.git
cd Titanic-Machine-Learning-from-Disaster

# Install with uv (recommended - 10-100x faster than pip)
uv sync

# Or install with pip
pip install -e .
```

Option 1: Web Application
```bash
# Start Flask server
uv run titanic-api

# Navigate to http://localhost:5000
```

Option 2: Command-Line Inference
```bash
# Single prediction
uv run titanic-predict \
  --age 22 \
  --sex female \
  --pclass 1 \
  --name-title Miss \
  --embarked C

# Batch predictions from CSV
uv run titanic-predict --input data/test.csv --output predictions.csv
```

Option 3: Python API
```python
from titanic_ml.models.predict import CustomData, PredictPipeline

# Create passenger data
passenger = CustomData(
    age=22,
    sex="female",
    name_title="Miss",
    sibsp=0,
    pclass="1",
    embarked="C",
    cabin_multiple=1
)

# Get prediction
pipeline = PredictPipeline()
df = passenger.get_data_as_dataframe()
prediction, probability = pipeline.predict(df)
print(f"Survived: {bool(prediction[0])}")
print(f"Confidence: {probability[0]:.1%}")
```

```bash
# Install notebook dependencies
uv sync --group notebooks

# Launch Jupyter Lab
jupyter lab notebooks/Titanic-Machine-Learning-from-Disaster.ipynb
```

```bash
# Build image
docker build -t titanic-ml .

# Run container
docker run -p 5000:5000 titanic-ml

# Access at http://localhost:5000
```

```text
Titanic-Machine-Learning-from-Disaster/
├── titanic_ml/ # Main package
│ ├── config/
│ │ └── settings.py # Centralized configuration with Pydantic
│ ├── data/
│ │ ├── loader.py # Data ingestion from CSV
│ │ └── transformer.py # Feature engineering pipeline
│ ├── features/
│ │ └── build_features.py # Advanced feature creation
│ ├── models/
│ │ ├── train.py # Model training with hyperparameter tuning
│ │ ├── predict.py # Inference pipeline
│ │ └── schemas.py # Pydantic validation schemas
│ ├── app/
│ │ ├── routes.py # Flask application with API endpoints
│ │ ├── templates/ # HTML templates
│ │ └── static/ # CSS, JS, images
│ └── utils/
│ ├── logger.py # Structured logging
│ ├── helpers.py # Model persistence utilities
│ └── exception.py # Custom exception handling
├── notebooks/
│ └── Titanic-Machine-Learning-from-Disaster.ipynb
├── tests/
│ ├── unit/ # Unit tests (60 tests)
│ └── integration/ # API integration tests (22 tests)
├── data/
│ ├── raw/ # Original Kaggle datasets
│ └── processed/ # Engineered features
├── models/ # Saved model artifacts
├── scripts/ # CLI entry points
├── docs/ # MkDocs documentation
├── pyproject.toml # Modern Python packaging
└── Dockerfile # Multi-stage production build
```
```mermaid
graph LR
    A[Raw Data] --> B[Data Loader]
    B --> C[Feature Engineering]
    C --> D[Preprocessing]
    D --> E[Model Training]
    E --> F[Evaluation]
    F --> G{Deploy?}
    G -->|Yes| H[Save Model]
    H --> I[Flask API]
    I --> J[Predictions]
```
- Package Structure: Migrated to the `titanic_ml/` package layout for proper Python packaging
- Configuration Management: Centralized settings with Pydantic for type safety
- Pipeline Architecture: sklearn Pipeline with FeatureUnion for reproducibility (see the sketch after this list)
- Testing Strategy: 82 tests (66% coverage) with unit + integration tests
- CI/CD: GitHub Actions for automated testing, security scanning, Docker builds
- Logging: Structured JSON logging with different levels for dev/prod
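To sketch the pipeline item above, note one substitution: the project describes an sklearn Pipeline with FeatureUnion, while this example uses the closely related ColumnTransformer to handle mixed column types. The column lists are illustrative; the real pipeline lives in `titanic_ml/data/transformer.py`.

```python
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Illustrative column lists; the real ones live in the transformer module
numeric = ["age", "fare", "family_size"]
categorical = ["sex", "pclass", "embarked", "name_title"]

preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                      ("onehot", OneHotEncoder(handle_unknown="ignore"))]), categorical),
])
# preprocess.fit_transform(train_df) yields the model-ready feature matrix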
Composition: 6 base estimators with hard voting
- Random Forest Classifier
- XGBoost Classifier
- Logistic Regression
- CatBoost Classifier
- Support Vector Classifier
- K-Neighbors Classifier
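A minimal sketch of how this ensemble is assembled; hyperparameters are omitted here, whereas the deployed model uses the tuned settings from grid search:

```python
from catboost import CatBoostClassifier
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from xgboost import XGBClassifier

ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(random_state=42)),
        ("xgb", XGBClassifier(random_state=42)),
        ("lr", LogisticRegression(max_iter=1000)),
        ("cb", CatBoostClassifier(verbose=0, random_state=42)),
        ("svc", SVC(probability=True, random_state=42)),
        ("knn", KNeighborsClassifier()),
    ],
    voting="hard",  # hard voting, per the composition above
)
# ensemble.fit(X_train, y_train)
```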
| Metric | Training | Validation | Test |
|---|---|---|---|
| Accuracy | 88.5% | 86.2% | TBD |
| Precision | 87.3% | 84.7% | TBD |
| Recall | 82.1% | 79.8% | TBD |
| F1 Score | 84.6% | 82.2% | TBD |
| ROC AUC | 0.923 | 0.901 | TBD |
5-Fold stratified cross-validation:
- Mean Accuracy: 85.1%
- Std Deviation: 1.8%
- Min: 82.9%
- Max: 87.4%
Interpretation: Low variance indicates stable, generalizable model.
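For illustration, a stratified 5-fold evaluation in sklearn; the data and model here are synthetic stand-ins, while the project runs this on the tuned ensemble and engineered features:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Stand-in data and model
X, y = make_classification(n_samples=800, n_features=10, random_state=42)
model = LogisticRegression(max_iter=1000)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
print(f"mean={scores.mean():.3f}  std={scores.std():.3f}  "
      f"min={scores.min():.3f}  max={scores.max():.3f}")
```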
- Sex - 28.3%
- Title (Grouped) - 15.7%
- Fare (Normalized) - 12.4%
- Age - 9.8%
- Pclass - 8.6%
- Family Size - 6.2%
- Cabin Known - 5.1%
- Embarked - 4.3%
- Sex × Pclass - 3.9%
- Is Alone - 2.8%
✅ Ensemble Diversity: Combines tree-based (RF, XGB, CB) with linear (LR) and distance-based (KNN) models
✅ Feature Engineering: Domain knowledge improves signal extraction
✅ Proper Validation: Stratified CV prevents data leakage
✅ Hyperparameter Tuning: Grid search optimizes each estimator
✅ Probability Calibration: Averaging individual model probabilities gives realistic confidence scores
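On the last point, a hedged sketch of deriving confidence from the fitted ensemble: the `confidence` helper below is hypothetical, and a hard-voting VotingClassifier does not itself expose `predict_proba`, so probabilities are averaged over the fitted base estimators.

```python
import numpy as np

def confidence(ensemble, X) -> np.ndarray:
    """Hypothetical helper: mean P(survived) across the fitted base models.

    Assumes every base estimator supports predict_proba
    (e.g. SVC must be constructed with probability=True).
    """
    probas = [est.predict_proba(X) for est in ensemble.estimators_]
    return np.mean(probas, axis=0)[:, 1]
```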
- pandas `2.2+` - Data manipulation and analysis
- numpy `1.26+` - Numerical computing
- scikit-learn `1.5+` - ML algorithms and pipelines
- XGBoost `2.1+` - Gradient boosting
- CatBoost `1.2+` - Categorical boosting
- SHAP `0.46+` - Model explainability
- Flask `3.0+` - Web application framework
- flask-swagger-ui `4.11+` - API documentation
- Pydantic `2.10+` - Data validation
- pydantic-settings `2.6+` - Configuration management
- pytest `8.3+` - Testing framework (82 tests, 66% coverage)
- pytest-cov - Code coverage reporting
- black `24.0+` - Code formatting
- flake8 `7.1+` - Linting
- mypy `1.13+` - Static type checking
- isort `5.13+` - Import sorting
- uv - Fast Python package manager (10-100x faster than pip)
- Docker - Containerization
- GitHub Actions - CI/CD automation
- Azure App Service - Cloud hosting
- Jupyter Lab `4.4+` - Interactive notebooks
- MkDocs Material `9.5+` - Documentation site
- matplotlib `3.9+` - Visualizations
- seaborn `0.13+` - Statistical plots
```bash
# Run all tests with coverage
uv run pytest tests/ --cov=titanic_ml --cov-report=term-missing

# Run specific test categories
uv run pytest tests/unit -v           # Unit tests only
uv run pytest tests/integration -v    # Integration tests only

# Run with specific markers
uv run pytest -m "not slow" -v        # Skip slow tests
```

Test Coverage: 82 tests, 66% coverage (exceeds 40% threshold)
```bash
# Format code
uv run black titanic_ml/ tests/

# Sort imports
uv run isort titanic_ml/ tests/

# Lint code
uv run flake8 titanic_ml/ tests/

# Type checking
uv run mypy titanic_ml/

# Security scan
uv run pip-audit
```

```bash
# Train with default settings
uv run titanic-train

# Train with custom data paths
uv run python scripts/run_training.py \
  --train-path data/raw/train.csv \
  --test-path data/raw/test.csv \
  --output-dir models/
```

The project includes three CLI entry points:
```bash
# Train models
uv run titanic-train

# Make predictions
uv run titanic-predict --age 22 --sex female --pclass 1

# Start API server
uv run titanic-api --port 8000
```

Live Demo: 🔗 Titanic Survival Predictor on Render
Render provides free hosting perfect for portfolio projects and demos:
```bash
# 1. Build Docker image
docker build -t titanic-ml .

# 2. Push to Docker Hub
docker tag titanic-ml:latest YOUR_USERNAME/titanic-ml:latest
docker push YOUR_USERNAME/titanic-ml:latest

# 3. Create Render service:
#    - Go to render.com → New Web Service
#    - Select "Deploy existing image from registry"
#    - Image URL: docker.io/YOUR_USERNAME/titanic-ml:latest
#    - Configure health check: /health
#    - Click Create Web Service

# 4. Setup automatic deployments:
#    - GitHub secret: RENDER_DEPLOY_WEBHOOK (copy from Render settings)
#    - CI/CD pipeline automatically redeploys on push
```

Features:
- ✅ Free tier with 750 hours/month
- ✅ Auto-deploys on image updates
- ✅ HTTPS included
⚠️ Cold starts (~30s) on free tier
For production workloads with better performance and scaling:
```bash
# 1. Build Docker image
docker build -t titanic-ml:latest .

# 2. Tag for Azure Container Registry
docker tag titanic-ml:latest YOUR_REGISTRY.azurecr.io/titanic-ml:latest

# 3. Push to registry
docker push YOUR_REGISTRY.azurecr.io/titanic-ml:latest

# 4. Deploy to Azure App Service
az webapp create \
  --resource-group YOUR_RG \
  --plan YOUR_PLAN \
  --name YOUR_APP_NAME \
  --deployment-container-image-name YOUR_REGISTRY.azurecr.io/titanic-ml:latest
```

```bash
# Pull pre-built image
docker pull justakris/titanic-ml:latest

# Run container
docker run -d -p 5000:5000 \
  --name titanic-api \
  --health-cmd "curl -f http://localhost:5000/api/health || exit 1" \
  --health-interval=30s \
  justakris/titanic-ml:latest
```

Automated workflows handle testing, building, and deployment:
```text
Push to main
    ↓
build.yml (Docker image → Docker Hub)
    ↓
    ├→ deploy-render.yml (auto-deploy to Render)
    └→ deploy-azure.yml (auto-deploy to Azure)
```
Workflows:
- build.yml - Builds multi-platform Docker image, pushes to Docker Hub
- deploy-render.yml - Auto-deploys to Render via webhook (triggered after build)
- deploy-azure.yml - Auto-deploys to Azure (triggered after build)
- ci.yml - Tests on Python 3.11-3.13, security scanning
- security.yml - Bandit, pip-audit vulnerability scans
- deploy-docs.yml - Publishes documentation to GitHub Pages
Required GitHub Secrets:
- `DOCKERHUB_USERNAME`, `DOCKERHUB_TOKEN` - Docker Hub credentials
- `RENDER_DEPLOY_WEBHOOK` - Render auto-deploy webhook (optional)
- `AZURE_CLIENT_ID`, `AZURE_TENANT_ID`, `AZURE_SUBSCRIPTION_ID` - Azure credentials (optional)
📖 See .github/workflows/README.md for complete CI/CD setup documentation.
Comprehensive documentation available at justakris.github.io/Titanic-Machine-Learning-from-Disaster
- Quick Start Guide: Installation and first predictions
- API Reference: Complete function/class documentation
- Architecture Guide: System design and patterns
- Deployment Guide: Production setup instructions
- Methodology: Detailed explanation of approach
- Advanced Features: Feature engineering deep-dive
Build docs locally:

```bash
# Install docs dependencies
uv sync --group docs

# Serve documentation
uv run mkdocs serve

# Navigate to http://localhost:8001
```

Contributions welcome! This is a portfolio project demonstrating ML engineering best practices.
```bash
# Clone and install
git clone https://github.com/JustaKris/Titanic-Machine-Learning-from-Disaster.git
cd Titanic-Machine-Learning-from-Disaster
uv sync --all-groups

# Create feature branch
git checkout -b feature/amazing-feature

# Make changes and test
uv run pytest tests/
uv run black titanic_ml/ tests/
uv run flake8 titanic_ml/ tests/

# Commit and push
git commit -m "Add amazing feature"
git push origin feature/amazing-feature
```

All PRs must:
- ✅ Pass all 82 tests (`pytest`)
- ✅ Maintain ≥40% code coverage (`pytest-cov`)
- ✅ Pass security scans (`bandit`, `pip-audit`)
- ✅ Follow code style (`black`, `flake8`, `isort`)
- ✅ Include type hints (`mypy` compatible)
- ✅ Update documentation if adding features
This project is licensed under the MIT License - see the LICENSE file for details.
Kristiyan Bonev - ML Engineer
- 💼 LinkedIn: kristiyan-bonev
- 🐙 GitHub: @JustaKris
- 📧 Email: k.s.bonev@gmail.com
- 🌐 Project: Titanic-Machine-Learning-from-Disaster
This portfolio project demonstrates:
✅ End-to-End ML Engineering - Research → Production
✅ Software Engineering Best Practices - Testing, CI/CD, Documentation
✅ Cloud Deployment - Containerization, Azure integration
✅ API Development - RESTful design, Swagger docs
✅ Code Quality - Type hints, linting, formatting
✅ Model Interpretability - SHAP, feature importance
✅ Production Readiness - Logging, error handling, validation
⭐ If you found this project helpful, please consider giving it a star! ⭐
Built with ❤️ as a portfolio demonstration of professional ML engineering