A General Toolkit for Advanced Online Learning, Online Active Learning, Online Semi-supervised Learning Approaches

liuzy0708/Awesome-OL

Awesome-OL: A General Toolkit for Online Learning Approaches

🌟 Overview

Welcome to Awesome-OL, your comprehensive toolkit for online learning strategies and classifiers. This repository includes state-of-the-art implementations for Online Active Learning (OAL) and Online Semi-Supervised Learning (OSSL), complete with classifiers, datasets, and visualizations.

For usage instructions, please see the Usage Guide.


🧠 OAL Strategies

Explore a variety of online active learning strategies in the OAL_strategies/ folder.

| 🧩 Strategy | 📝 Description | 📚 Reference | 💾 Code | 📅 Year | 🏛️ Journal/Conference |
|---|---|---|---|---|---|
| CogDQS | Dual-query strategy using human memory cognition | IEEE | - | 2023 | TNNLS |
| DSA-AI | Dynamic submodular learning for imbalanced drifting streams | IEEE | GitHub | 2024 | TNNLS |
| MTSGQS | Memory-triggered submodularity-guided strategy | IEEE | - | 2023 | TITS |
| DMI-DD | Explanation-based query strategy at chunk level | IEEE | GitHub | 2024 | TCYB |

Baseline Strategies

| 🧩 Strategy | 📝 Description | 📚 Reference | 💾 Code | 📅 Year | 🏛️ Journal/Conference |
|---|---|---|---|---|---|
| RS | Random Sampling | - | - | - | - |
| US_fix | Uncertainty sampling with fixed threshold | IEEE | - | 2014 | TNNLS |
| US_var | Uncertainty sampling with variable threshold | IEEE | - | 2014 | TNNLS |
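
To make the three baselines concrete, here is a minimal sketch of the general query rules as they are commonly described for stream-based active learning. It is not the code in OAL_strategies/: the function and class names, the default parameters, and the simple budget check are all assumptions made for illustration.

```python
import random


def random_sampling(budget, rng):
    """RS: query a label with probability equal to the labeling budget."""
    return rng.random() < budget


def us_fix(max_posterior, threshold):
    """US_fix: query when the classifier's top posterior is below a fixed threshold."""
    return max_posterior < threshold


class VariableUncertainty:
    """US_var (sketch): the threshold shrinks after a query and grows otherwise."""

    def __init__(self, threshold=1.0, step=0.01, budget=0.1):
        self.threshold = threshold  # current confidence threshold
        self.step = step            # relative adjustment applied per sample
        self.budget = budget        # maximum fraction of samples to label
        self.seen = 0
        self.queried = 0

    def query(self, max_posterior):
        self.seen += 1
        # Assumed budget rule: stop querying once the labeled fraction is spent.
        if self.queried / self.seen >= self.budget:
            return False
        if max_posterior < self.threshold:
            self.queried += 1
            self.threshold *= 1 - self.step  # become more selective
            return True
        self.threshold *= 1 + self.step      # become less selective
        return False


rng = random.Random(0)
strategy = VariableUncertainty(threshold=0.9, step=0.1, budget=0.5)
decisions = [strategy.query(p) for p in (0.55, 0.97, 0.60, 0.99)]
```

The variable threshold lets the strategy keep spending its budget when the classifier becomes uniformly confident, which a fixed threshold cannot do.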

⚙️ OAL Classifiers

| 🤖 Classifier | 📝 Description | 📚 Reference | 💾 Source | 📅 Year | 🏛️ Journal/Conference |
|---|---|---|---|---|---|
| ROALE-DI | Reinforcement-based ensemble for drifting imbalanced data | IEEE | GitHub | 2022 | TKDE |
| OALE | Online ensemble with hybrid labeling | IEEE | - | 2019 | TNNLS |

🔍 OSSL Classifiers

| 🤖 Classifier | 📝 Description | 📚 Reference | 💾 Source | 📅 Year | 🏛️ Journal/Conference |
|---|---|---|---|---|---|
| OSSBLS | Semi-supervised BLS with static anchors | IEEE | - | 2021 | TII |
| ISSBLS | Semi-supervised BLS without historical dependency | IEEE | - | 2021 | TII |

Baseline Classifier

| 🤖 Classifier | 📝 Description | 📚 Reference | 📅 Year | 🏛️ Journal/Conference |
|---|---|---|---|---|
| SOSELM | Semi-supervised ELM | ScienceDirect Paper | 2016 | Neurocomputing |

📊 Supervised Classifiers

| 🤖 Classifier | 📝 Description | 📚 Reference | 💾 Source | 📅 Year | 🏛️ Journal/Conference |
|---|---|---|---|---|---|
| OLI2DS | Imbalanced data stream learner with dynamic costs | IEEE | GitHub | 2023 | TKDE |
| IWDA | Learner-agnostic drift adaptation using density estimation | IEEE | GitHub | 2023 | TNNLS |
| DES | Drift-adaptive ensemble with SMOTE | IEEE | GitHub | 2024 | TNNLS |
| ACDWM | Adaptive chunk selection for stability and drift | IEEE | GitHub | 2020 | TNNLS |
| ARF | Adaptive resampling ensemble with ADWIN | Springer | GitHub | 2017 | Machine Learning |
| SRP | Random subspace + online bagging | IEEE | GitHub | 2019 | ICDM |
| BLS-W | Online BLS with Sherman–Morrison–Woodbury update | IEEE | GitHub | 2023 | TCYB |
| QRBLS | BLS with QR factorization | IEEE | GitHub | 2025 | TNNLS |

Baseline Classifier

| 🤖 Classifier | 📝 Description | 📚 Reference | 💾 Source | 📅 Year | 🏛️ Journal/Conference |
|---|---|---|---|---|---|
| OSELM | Sequential ELM without drift detection | IEEE | GitHub | 2006 | TNNLS |

🔮 OAL Regression

| 🤖 Regressor | 📝 Description | 📚 Reference | 💾 Source | 📅 Year | 🏛️ Journal/Conference |
|---|---|---|---|---|---|
| KNN | k-Nearest Neighbors for online regression with sliding window | - | - | - | - |
| Lasso | Online Lasso regression with ℓ1 regularization | - | - | - | - |
| Ridge | Online Ridge regression with ℓ2 regularization | - | - | - | - |
| Linear | Ordinary Least Squares with incremental updates | - | - | - | - |
| HoeffdingTree | Adaptive decision tree for regression with drift detection | IEEE | GitHub | 2007 | ICIAfS |
| ARF | Adaptive Random Forest regressor | - | GitHub | - | - |
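
As an illustration of the first row, the sketch below shows one way a sliding-window k-NN regressor can work: only the most recent samples are kept, and a prediction is the mean target of the k closest ones. The class name, the `learn_one`/`predict_one` method names, and the defaults are assumptions, not the toolkit's interface.

```python
import math
from collections import deque


class SlidingWindowKNNRegressor:
    """Hypothetical sketch: k-NN regression over the most recent samples only."""

    def __init__(self, k=3, window_size=100):
        self.k = k
        # A bounded deque silently drops the oldest sample when it is full.
        self.window = deque(maxlen=window_size)

    def learn_one(self, x, y):
        self.window.append((tuple(x), float(y)))

    def predict_one(self, x):
        if not self.window:
            return 0.0  # assumed default before any sample has been seen
        nearest = sorted(self.window, key=lambda item: math.dist(item[0], x))
        nearest = nearest[: self.k]
        return sum(target for _, target in nearest) / len(nearest)


model = SlidingWindowKNNRegressor(k=2, window_size=3)
for value in range(4):
    model.learn_one([value], value)  # the sample for value 0 falls out of the window
prediction = model.predict_one([0])
```

Forgetting old samples through the window is what lets this simple learner follow a drifting target without an explicit drift detector.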

🚨 Drift Detection

| 🤖 Detector | 📝 Description | 📚 Reference | 💾 Source | 📅 Year | 🏛️ Journal/Conference |
|---|---|---|---|---|---|
| DDM | Drift Detection Method based on error rate monitoring | - | GitHub | - | - |
| EDDM | Enhanced DDM for gradual drift detection | - | GitHub | - | - |
| KSWIN | Kolmogorov-Smirnov Windowing method for concept drift | - | GitHub | - | - |
| PageHinkley | Sequential change detection with cumulative sum | - | GitHub | - | - |
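
The cumulative-sum idea behind the Page-Hinkley test can be sketched in a few lines. This is a textbook-style version for detecting an increase in the monitored value, written under assumed parameter names (`delta`, `threshold`, `min_samples`); it is not the detector shipped with the toolkit.

```python
class PageHinkleySketch:
    """Hypothetical sketch of the Page-Hinkley test for an upward mean shift."""

    def __init__(self, delta=0.005, threshold=50.0, min_samples=30):
        self.delta = delta              # tolerated magnitude of change
        self.threshold = threshold      # alarm level for the test statistic
        self.min_samples = min_samples  # warm-up period before alarms are raised
        self.n = 0
        self.mean = 0.0
        self.cumulative = 0.0
        self.minimum = 0.0

    def update(self, value):
        """Feed one observation; return True when drift is signalled."""
        self.n += 1
        self.mean += (value - self.mean) / self.n         # running mean
        self.cumulative += value - self.mean - self.delta  # cumulative deviation
        self.minimum = min(self.minimum, self.cumulative)
        if self.n < self.min_samples:
            return False
        return self.cumulative - self.minimum > self.threshold


detector = PageHinkleySketch(delta=0.005, threshold=5.0, min_samples=30)
stream = [0.0] * 100 + [1.0] * 20  # error signal that jumps after 100 samples
alarms = [i for i, v in enumerate(stream) if detector.update(v)]
```

Here `alarms` holds the stream positions at which the statistic exceeds the threshold. A real pipeline would typically reset the detector and retrain or replace the model at the first alarm.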

🧩 Summary of Features

| 🔹 Method | 🧠 OAL Strategy | 🤖 Classifier | ⚪ Binary | 🟢 Multi-class | 🔄 Drift Adaptation | 🧩 Ensemble |
|---|---|---|---|---|---|---|
| ROALE-DI | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| CogDQS | ✅ | - | ✅ | ✅ | ✅ | - |
| DSA-AI | ✅ | - | ✅ | ✅ | ✅ | - |
| DMI-DD | ✅ | - | ✅ | ✅ | ✅ | - |
| MTSGQS | ✅ | - | ✅ | ✅ | ✅ | - |
| RS | ✅ | - | ✅ | ✅ | - | - |
| US-fix | ✅ | - | ✅ | ✅ | - | - |
| US-var | ✅ | - | ✅ | ✅ | - | - |
| OLI2DS | - | ✅ | ✅ | - | ✅ | - |
| IWDA | - | ✅ | ✅ | ✅ | ✅ | ✅ |
| DES | - | ✅ | ✅ | - | ✅ | ✅ |
| ACDWM | - | ✅ | ✅ | - | ✅ | ✅ |
| SRP | - | ✅ | ✅ | ✅ | ✅ | ✅ |
| ARF | - | ✅ | ✅ | ✅ | ✅ | ✅ |
| QRBLS | - | ✅ | ✅ | ✅ | - | - |

🛠 Usage Guide

👉 This section explains how to use the project. It covers environment preparation, data loading, model selection, and result visualization.


🔧 Environment Setup

💡 Follow these steps to set up the environment (VSCode is used as an example):

  1. Open Anaconda Prompt or Terminal

  2. Navigate to the directory containing the env.yml file

  3. Create the Conda environment by running:

    conda env create -f env.yml
  4. Activate the Conda environment
    Open the integrated terminal in VSCode (Terminal > New Terminal) and execute:

    conda activate OL
  5. Select Python Interpreter in VSCode

    • Press Ctrl + Shift + P (Windows/Linux) or Cmd + Shift + P (Mac) to open the Command Palette
    • Type and select Python: Select Interpreter
    • Choose the interpreter corresponding to the activated Conda environment (OL)
  6. Run your Python code
    Open your Python files and run them as usual. The activated environment will provide all required packages and dependencies.


🧪 Demo

  • In the project root directory, locate the notebooks classify.ipynb, regression.ipynb, and drift_detection. Within these notebooks, you can select the framework, dataset, classifier, strategies, drift detector, and hyperparameters you wish to use.
  • Optionally, you can output visualization results for an intuitive comparison of model performance. These results are also saved automatically in the Results folder.
  • For detailed guidance, follow the step-by-step instructions provided within each notebook.

📂 Datasets

Datasets are stored as .csv files in the datasets folder. Each file contains:

🔹 Attributes (features)
🔹 Labels

You can select any .csv file as your test dataset.
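
As a rough illustration of how such a file can be turned into a sample-by-sample stream, the sketch below reads rows one at a time and splits them into features and a label. It assumes a header row and that the label is the last column; neither is stated above, so check the actual files before relying on it. The function name `read_stream` and the inline sample data are made up for the example.

```python
import csv
import io


def read_stream(handle, has_header=True):
    """Yield (features, label) pairs from an open CSV file, one row at a time.

    Assumption: the label is the last column and all features are numeric.
    """
    reader = csv.reader(handle)
    if has_header:
        next(reader, None)  # skip the column names
    for row in reader:
        if not row:
            continue  # ignore blank lines
        *features, label = row
        yield [float(value) for value in features], label


# Inline stand-in for a file; with a real dataset use open(path, newline="").
sample = io.StringIO("f1,f2,label\n0.1,0.2,0\n0.3,0.4,1\n")
rows = list(read_stream(sample))
```

Because the function is a generator, a large file is never loaded into memory at once, which matches the one-sample-at-a-time setting of online learning.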


📈 Visualization 📉

Visualization tools are provided in the visualization folder, including:

  • Multi-model confusion matrix
  • Dynamic GIFs displaying Accuracy curves and Macro F1 scores
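
For readers unfamiliar with the two plotted metrics, here is a small self-contained sketch of how accuracy and macro F1 can be computed from true and predicted labels. It is illustrative only; the function name is an assumption and the toolkit may compute these metrics differently (for example, over a sliding window).

```python
def accuracy_and_macro_f1(y_true, y_pred):
    """Return (accuracy, macro-averaged F1) for two equal-length label lists."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denominator = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denominator if denominator else 0.0)

    # Macro averaging weights every class equally, regardless of its frequency.
    return accuracy, sum(f1_scores) / len(f1_scores)


acc, macro_f1 = accuracy_and_macro_f1([0, 0, 1, 1], [0, 1, 1, 1])
```

Macro F1 is the more informative of the two on imbalanced streams, because a classifier that ignores the minority class can still reach high accuracy.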

The following example results can be viewed directly in main.ipynb:

[Figure: Combined Animation]

[Figure: Confusion Matrix]

[Figure: Regression]


📜 Logs

You can find detailed log information for each demo run in the Logs folder located at the project root. Example log snippet:

16:08 --------------------------------------------------------------------------------------------------
16:08 Max samples: 1000
16:08 n_round: 3
16:08 n_pt: 100
16:08 dataset_name: Waveform
16:08 chunk_size: 1
16:08 framework: OL
16:08 stream: None
16:08 clf_name_list: ['BLS', 'NB', 'ISSBLS', 'OSSBLS', 'DWM']
16:08 num_method: 5
16:08 directory_path: C:\Users\Projects\Online-Learning-Framework\Results\Results_Waveform_OL_100_1_1000
16:08 --------------------------------------------------------------------------------------------------
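
If you want to compare runs programmatically, a log in the format shown above can be parsed back into a dictionary. The sketch below assumes every entry follows the `HH:MM key: value` pattern of the snippet; the function name `parse_log` is hypothetical and not part of the toolkit.

```python
import ast


def parse_log(lines):
    """Turn 'HH:MM key: value' log lines into a dict of run settings."""
    config = {}
    for line in lines:
        _, _, rest = line.strip().partition(" ")  # drop the timestamp
        key, separator, value = rest.partition(": ")
        if not separator:
            continue  # dashed separator rows carry no setting
        try:
            # Recover numbers, None, and lists where possible.
            config[key] = ast.literal_eval(value)
        except (ValueError, SyntaxError):
            config[key] = value  # keep plain text such as names and paths
    return config


example = [
    "16:08 ----------------------------------------",
    "16:08 Max samples: 1000",
    "16:08 dataset_name: Waveform",
    "16:08 stream: None",
    "16:08 clf_name_list: ['BLS', 'NB', 'ISSBLS', 'OSSBLS', 'DWM']",
]
settings = parse_log(example)
```

Using `ast.literal_eval` rather than `eval` keeps the parser safe on untrusted log files, since it only accepts Python literals.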

🧰 Utility

  • utils.py: Interfaces between classifiers and strategies, enabling smooth combination and extension.

📚 References

Explore related resources and inspiration at:


📝 Note

We sincerely hope this toolkit becomes a valuable resource in your journey with online learning. Our dedicated team at the THUFDD Research Group, led by Prof. Xiao He and Prof. Donghua Zhou from the Department of Automation at Tsinghua University, is committed to driving innovation and excellence in machine learning applications for industry.

Wishing you a rewarding and inspiring learning experience!

Project contributors include:


✨ Contributor Declaration

If you are interested in becoming a contributor to this project, we welcome your participation. Together, we can continue to refine and expand this toolkit to empower researchers, practitioners, and enthusiasts in the field.

Please feel free to get in touch!

We look forward to your participation and collaboration to push this project forward! 💪

