Awesome-OL: A General Toolkit for Online Learning Approaches

📖 Table of Contents

🌟 Overview
🧠 OAL Strategies
⚙️ OAL Classifiers
🔍 OSSL Classifiers
📊 Supervised Classifiers
🔮 OAL Regression
🚨 Drift Detection
🧩 Summary of Features
🛠 Usage Guide
📚 References
📝 Note
✨ Contributor Declaration

🌟 Overview

Welcome to Awesome-OL, your comprehensive toolkit for online learning strategies and classifiers. This repository includes state-of-the-art implementations for Online Active Learning (OAL) and Online Semi-Supervised Learning (OSSL), complete with classifiers, datasets, and visualizations.

For usage instructions, please see the Usage Guide.

🧠 OAL Strategies

Explore a variety of online active learning strategies in the OAL_strategies/ folder.

🧩 Strategy	📝 Description	📚 Reference	💾 Code	📅 Year	🏛️ Journal/Conference
CogDQS	Dual-query strategy using human memory cognition	IEEE	—	2023	TNNLS
DSA-AI	Dynamic submodular learning for imbalanced drifting streams	IEEE	GitHub	2024	TNNLS
MTSGQS	Memory-triggered submodularity-guided strategy	IEEE	—	2023	TITS
DMI-DD	Explanation-based query strategy at chunk level	IEEE	GitHub	2024	TCYB

Baseline Strategies

🧩 Strategy	📝 Description	📚 Reference	💾 Code	📅 Year	🏛️ Journal/Conference
RS	Random Sampling	—	—	—	—
US_fix	Uncertainty sampling with fixed threshold	IEEE	—	2014	TNNLS
US_var	Uncertainty sampling with variable threshold	IEEE	—	2014	TNNLS
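
The two uncertainty-sampling baselines above can be illustrated with a minimal sketch. It assumes the classifier exposes per-class posterior probabilities, and it leaves out the labeling-budget check that normally accompanies these strategies. The class and method names are hypothetical and are not the interface used in OAL_strategies/.

```python
class FixedUncertainty:
    """Hypothetical sketch: query when the top posterior is below a fixed threshold."""

    def __init__(self, theta=0.6):
        self.theta = theta  # assumed default, not taken from the repository

    def should_query(self, probabilities):
        # Low maximum posterior means the classifier is uncertain.
        return max(probabilities) < self.theta


class VariableUncertainty:
    """Hypothetical sketch: the threshold adapts after every decision."""

    def __init__(self, theta=1.0, step=0.01):
        self.theta = theta
        self.step = step  # relative adjustment applied to the threshold

    def should_query(self, probabilities):
        if max(probabilities) < self.theta:
            # A label was requested: shrink the uncertainty region.
            self.theta *= 1.0 - self.step
            return True
        # No label requested: widen the uncertainty region.
        self.theta *= 1.0 + self.step
        return False
```

With the variable threshold, a stream of confident predictions gradually raises the threshold until queries resume, which keeps labels flowing even when the classifier is stable.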

⚙️ OAL Classifiers

🤖 Classifier	📝 Description	📚 Reference	💾 Source	📅 Year	🏛️ Journal/Conference
ROALE-DI	Reinforcement-based ensemble for drifting imbalanced data	IEEE	GitHub	2022	TKDE
OALE	Online ensemble with hybrid labeling	IEEE	—	2019	TNNLS

🔍 OSSL Classifiers

🤖 Classifier	📝 Description	📚 Reference	💾 Source	📅 Year	🏛️ Journal/Conference
OSSBLS	Semi-supervised BLS with static anchors	IEEE	—	2021	TII
ISSBLS	Semi-supervised BLS without historical dependency	IEEE	—	2021	TII

Baseline Classifier

🤖 Classifier	📝 Description	📚 Reference	📅 Year	🏛️ Journal/Conference
SOSELM	Semi-supervised ELM	ScienceDirect Paper	2016	Neurocomputing

📊 Supervised Classifiers

🤖 Classifier	📝 Description	📚 Reference	💾 Source	📅 Year	🏛️ Journal/Conference
OLI2DS	Imbalanced data stream learner with dynamic costs	IEEE	GitHub	2023	TKDE
IWDA	Learner-agnostic drift adaptation using density estimation	IEEE	GitHub	2023	TNNLS
DES	Drift-adaptive ensemble with SMOTE	IEEE	GitHub	2024	TNNLS
ACDWM	Adaptive chunk selection for stability and drift	IEEE	GitHub	2020	TNNLS
ARF	Adaptive resampling ensemble with ADWIN	Springer	GitHub	2017	Machine Learning
SRP	Random subspace + online bagging	IEEE	GitHub	2019	ICDM
BLS-W	Online BLS with Sherman–Morrison–Woodbury update	IEEE	GitHub	2023	TCYB
QRBLS	BLS with QR factorization	IEEE	GitHub	2025	TNNLS

Baseline Classifier

🤖 Classifier	📝 Description	📚 Reference	💾 Source	📅 Year	🏛️ Journal/Conference
OSELM	Sequential ELM without drift detection	IEEE	GitHub	2006	TNNLS

🔮 OAL Regression

🤖 Regressor	📝 Description	📚 Reference	💾 Source	📅 Year	🏛️ Journal/Conference
KNN	k-Nearest Neighbors for online regression with sliding window	—	—	—	—
Lasso	Online Lasso regression with ℓ1 regularization	—	—	—	—
Ridge	Online Ridge regression with ℓ2 regularization	—	—	—	—
Linear	Ordinary Least Squares with incremental updates	—	—	—	—
HoeffdingTree	Adaptive decision tree for regression with drift detection	IEEE	GitHub	2007	ICIAfS
ARF	Adaptive Random Forest regressor	—	GitHub	—	—

🚨 Drift Detection

🤖 Detector	📝 Description	📚 Reference	💾 Source	📅 Year	🏛️ Journal/Conference
DDM	Drift Detection Method based on error rate monitoring	—	GitHub	—	—
EDDM	Enhanced DDM for gradual drift detection	—	GitHub	—	—
KSWIN	Kolmogorov-Smirnov Windowing method for concept drift	—	GitHub	—	—
PageHinkley	Sequential change detection with cumulative sum	—	GitHub	—	—
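
To make the error-rate monitoring behind DDM concrete, here is a minimal sketch written from the standard description of the method. It is not the code in Drift_detection/. The class name, the default of 30 warm-up instances, and the 2 and 3 standard-deviation levels are assumptions chosen for illustration.

```python
import math


class SimpleDDM:
    """Hypothetical DDM sketch: tracks the running error rate p and its std s."""

    def __init__(self, min_instances=30, warning_level=2.0, drift_level=3.0):
        self.min_instances = min_instances
        self.warning_level = warning_level
        self.drift_level = drift_level
        self.reset()

    def reset(self):
        self.n = 0
        self.errors = 0
        self.p_min = math.inf
        self.s_min = math.inf

    def update(self, error):
        """error is 1 for a misclassification, 0 otherwise."""
        self.n += 1
        self.errors += int(error)
        p = self.errors / self.n
        s = math.sqrt(p * (1.0 - p) / self.n)
        if self.n < self.min_instances:
            return "stable"
        # Remember the lowest p + s seen since the last reset.
        if p + s <= self.p_min + self.s_min:
            self.p_min, self.s_min = p, s
        # Strict comparisons avoid a false alarm when the error rate is zero.
        if p + s > self.p_min + self.drift_level * self.s_min:
            self.reset()
            return "drift"
        if p + s > self.p_min + self.warning_level * self.s_min:
            return "warning"
        return "stable"
```

Feeding the detector one 0/1 error flag per prediction is enough; after a "drift" result the statistics restart so the detector can follow the new concept.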

🧩 Summary of Features

🔹 Method	🧠 OAL Strategy	🤖 Classifier	⚪ Binary	🟢 Multi-class	🔄 Drift Adaptation	🧩 Ensemble
ROALE-DI	✅	✅	✅	✅	✅	✅
CogDQS	✅	—	✅	✅	✅	—
DSA-AI	✅	—	✅	✅	✅	—
DMI-DD	✅	—	✅	✅	✅	—
MTSGQS	✅	—	✅	✅	✅	—
RS	✅	—	✅	✅	—	—
US_fix	✅	—	✅	✅	—	—
US_var	✅	—	✅	✅	—	—
OLI2DS	—	✅	✅	—	✅	—
IWDA	—	✅	✅	✅	✅	✅
DES	—	✅	✅	—	✅	✅
ACDWM	—	✅	✅	—	✅	✅
SRP	—	✅	✅	✅	✅	✅
ARF	—	✅	✅	✅	✅	✅
QRBLS	—	✅	✅	✅	—	—

🛠 Usage Guide

👉 This section explains how to use the project: environment setup, data loading, model selection, and result visualization.

🔧 Environment Setup

💡 Follow these steps to complete the environment setup (using VSCode as an example):

1. Open Anaconda Prompt or a terminal.
2. Navigate to the directory containing the env.yml file.
3. Create the Conda environment by running:
```
conda env create -f env.yml
```
4. Activate the Conda environment. Open the integrated terminal in VSCode (Terminal > New Terminal) and execute:
```
conda activate OL
```
5. Select the Python interpreter in VSCode:
- Press Ctrl + Shift + P (Windows/Linux) or Cmd + Shift + P (Mac) to open the Command Palette
- Type and select Python: Select Interpreter
- Choose the interpreter corresponding to the activated Conda environment (OL)
6. Run your Python code. Open your Python files and run them as usual. The activated environment provides all required packages and dependencies.

🧪 Demo

In the project root directory, locate the notebooks classify.ipynb, regression.ipynb, and drift_detection.ipynb. Within these notebooks, you can select the framework, dataset, classifier, strategies, drift detector, and hyperparameters you wish to use.
Optionally, you can output visualization results for an intuitive comparison of model performance. These results are also saved automatically in the Results folder.
For detailed guidance, follow the step-by-step instructions provided within each notebook.

📂 Datasets

Datasets are stored as .csv files in the datasets folder. Each file contains:

🔹 Attributes (features)
🔹 Labels

You can select any .csv file as your test dataset.
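
As a rough illustration of the attributes-plus-labels layout, the sketch below splits CSV rows into feature vectors and labels using only the standard library. It assumes a header row and assumes the label sits in the last column. Neither is stated above, so check the actual files in the datasets folder. The function name load_stream is hypothetical.

```python
import csv
import io


def load_stream(csv_file, has_header=True):
    """Split CSV rows into features and labels (label assumed in the last column)."""
    reader = csv.reader(csv_file)
    if has_header:
        next(reader, None)  # skip the assumed header row
    features, labels = [], []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        features.append([float(value) for value in row[:-1]])
        labels.append(row[-1])
    return features, labels


# In-memory stand-in for a file from the datasets folder.
sample = io.StringIO("f1,f2,label\n0.5,1.2,0\n0.1,0.7,1\n")
X, y = load_stream(sample)
```

For a real file, pass an open file handle instead of the in-memory sample, for example `open(path, newline="")`.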

📈 Visualization 📉

Visualization tools are provided in the visualization folder, including:

Multi-model confusion matrix
Dynamic GIFs displaying Accuracy curves and Macro F1 scores
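
For readers unfamiliar with the Macro F1 metric shown in these plots, the following sketch computes it from scratch: F1 is calculated per class and then averaged with equal weight, so minority classes count as much as majority ones. This is a generic illustration, not the code in the visualization folder, and the function name is hypothetical.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); zero when the class is never hit.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

In a streaming setting the same function can be applied to a sliding window of recent predictions to obtain one point of the curve per time step.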

Example results can be viewed directly in main.ipynb.

📜 Logs

You can find detailed log information for each demo run in the Logs folder located at the project root. Example log snippet:

16:08 --------------------------------------------------------------------------------------------------
16:08 Max samples: 1000
16:08 n_round: 3
16:08 n_pt: 100
16:08 dataset_name: Waveform
16:08 chunk_size: 1
16:08 framework: OL
16:08 stream: None
16:08 clf_name_list: ['BLS', 'NB', 'ISSBLS', 'OSSBLS', 'DWM']
16:08 num_method: 5
16:08 directory_path: C:\Users\Projects\Online-Learning-Framework\Results\Results_Waveform_OL_100_1_1000
16:08 --------------------------------------------------------------------------------------------------
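
If you want to compare runs programmatically, header lines like the ones above can be parsed into a dictionary. The sketch below assumes every line has the form "HH:MM key: value", as in the snippet. The function name parse_log_header is hypothetical, and the actual log files may contain other line types.

```python
import ast


def parse_log_header(lines):
    """Turn 'HH:MM key: value' lines into a dict, skipping separator rows."""
    config = {}
    for line in lines:
        _, _, body = line.strip().partition(" ")  # drop the leading timestamp
        key, sep, value = body.partition(": ")
        if not sep:
            continue  # separator row or a line without a key/value pair
        config[key] = value
    return config


snippet = [
    "16:08 ----------",
    "16:08 Max samples: 1000",
    "16:08 dataset_name: Waveform",
    "16:08 clf_name_list: ['BLS', 'NB', 'ISSBLS', 'OSSBLS', 'DWM']",
]
config = parse_log_header(snippet)
# The classifier list is logged as a Python literal, so it can be evaluated safely.
classifiers = ast.literal_eval(config["clf_name_list"])
```

All values are kept as strings; convert numeric fields such as Max samples with int() where needed.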

🧰 Utility

utils.py: Interfaces between classifiers and strategies, enabling smooth combination and extension.

📚 References

Explore related resources and inspiration at:

GitHub - scikit-multiflow

📝 Note

We sincerely hope this toolkit becomes a valuable resource in your journey with online learning. Our dedicated team at the THUFDD Research Group, led by Prof. Xiao He and Prof. Donghua Zhou from the Department of Automation at Tsinghua University, is committed to driving innovation and excellence in machine learning applications for industry.

Wishing you a rewarding and inspiring learning experience!

Project contributors include:

Zeyi Liu:
- liuzy21@mails.tsinghua.edu.cn
Songqiao Hu:
- hsq23@mails.tsinghua.edu.cn
Pengyu Han:
- hpy24@mails.tsinghua.edu.cn
Jiaming Liu:
- 23371007@buaa.edu.cn

✨ Contributor Declaration

If you are interested in becoming a contributor to this project, we welcome your participation. Together, we can continue to refine and expand this toolkit to empower researchers, practitioners, and enthusiasts in the field.

Please feel free to get in touch!

Contact: Zeyi Liu
Email: liuzy21@mails.tsinghua.edu.cn

We look forward to your participation and collaboration to push this project forward! 💪
