This project demonstrates provenance tracking for a machine learning model trained on the MNIST dataset. It includes comprehensive tracking of data, model, and training provenance, along with verification capabilities.
- Data provenance tracking
- Model architecture and weights tracking
- Training process monitoring
- Comprehensive verification system
- Detailed reporting with markdown output
mnist_provenance/
├── src/
│   ├── provenance/
│   │   ├── tracker.py
│   │   ├── verifier.py
│   │   └── generate_final_report.py
│   └── training/
│       └── train.py
├── scripts/
│   └── run_training.sh
├── artifacts/
│   ├── models/
│   └── provenance/
└── tests/
    └── test_provenance.py
- Create a virtual environment:

      python -m venv venv

- Activate the virtual environment:

      source venv/bin/activate    # On Unix/macOS
      # or
      .\venv\Scripts\activate     # On Windows

- Install dependencies:

      pip install -r requirements.txt

Run the training script:

    ./scripts/run_training.sh

This will:
- Train a model on the MNIST dataset
- Track all provenance information
- Generate a detailed report in the artifacts directory
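
As an illustration of the last step, a markdown report can be rendered from a nested dictionary of provenance fields. This is a sketch only; `render_report` and the section layout are assumptions and do not describe what `generate_final_report.py` actually produces. The values in the example are placeholders, not real results.

```python
from typing import Dict


def render_report(title: str, sections: Dict[str, Dict[str, str]]) -> str:
    """Render a title and named sections of key/value pairs as markdown."""
    lines = [f"# {title}", ""]
    for section, fields in sections.items():
        lines.append(f"## {section}")
        for key, value in fields.items():
            lines.append(f"- **{key}**: {value}")
        lines.append("")  # blank line between sections
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    report = render_report(
        "Provenance Report",
        {
            "Data": {"dataset": "MNIST", "sha256": "<digest>"},
            "Model": {"weights_sha256": "<digest>"},
            "Training": {"epochs": "<n>"},
        },
    )
    print(report)
```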
- Python 3.8+
- TensorFlow 2.x
- NumPy
- pytest (for testing)
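
Because training results depend on the installed versions of these packages, one option is to capture them as part of the training provenance. The sketch below uses only the standard library (`importlib.metadata` is available from Python 3.8); `capture_environment` and its output fields are hypothetical and are not taken from this project's code.

```python
import platform
from importlib import metadata  # standard library since Python 3.8


def package_version(name: str) -> str:
    """Return the installed version of a package, or 'not installed'."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def capture_environment(packages=("tensorflow", "numpy", "pytest")) -> dict:
    """Collect interpreter, OS, and package versions into one record."""
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": {name: package_version(name) for name in packages},
    }


if __name__ == "__main__":
    print(capture_environment())
```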
MIT License