
western-oc2-lab / msana-online-data-stream-analytics-and-concept-drift-adaptation


Data stream analytics: Implement online learning methods to address concept drift and model drift in dynamic data streams. Code for the paper entitled "A Multi-Stage Automated Online Network Data Stream Analytics Framework for IIoT Systems" published in IEEE Transactions on Industrial Informatics.

License: MIT License



MSANA-Online-Data-Stream-Analytics-And-Concept-Drift-Adaptation

This repository contains the code for the paper entitled "A Multi-Stage Automated Online Network Data Stream Analytics Framework for IIoT Systems" published in IEEE Transactions on Industrial Informatics (Q1, IF: 11.648), doi: 10.1109/TII.2022.3212003.
Authors: Li Yang and Abdallah Shami
Organization: The Optimized Computing and Communications (OC2) Lab, ECE Department, Western University

In this work, we propose a comprehensive online learning framework for data stream analytics and concept drift adaptation in dynamic environments.
Two other tutorial code repositories for concept drift, online machine learning, and data stream analytics are available: PWPAE-Concept-Drift-Detection-and-Adaptation and OASW-Concept-Drift-Detection-and-Adaptation

Paper Link

Open access version on arXiv
Published version on IEEE

Abstract of The Paper

Industry 5.0 aims at maximizing the collaboration between humans and machines. Machines are capable of automating repetitive jobs, while humans handle creative tasks. As a critical component of Industrial Internet of Things (IIoT) systems for service delivery, network data stream analytics often encounter concept drift issues due to dynamic IIoT environments, causing performance degradation and automation difficulties. In this paper, we propose a novel Multi-Stage Automated Network Analytics (MSANA) framework for concept drift adaptation in IIoT systems, consisting of dynamic data pre-processing, the proposed Drift-based Dynamic Feature Selection (DD-FS) method, dynamic model learning & selection, and the proposed Window-based Performance Weighted Probability Averaging Ensemble (W-PWPAE) model. It is a complete automated data stream analytics framework that enables automatic, effective, and efficient data analytics for IIoT systems in Industry 5.0. Experimental results on two public IoT datasets demonstrate that the proposed framework outperforms state-of-the-art methods for IIoT data stream analytics.

Concept Drift

In non-stationary and dynamic environments, such as IoT environments, the distribution of input data often changes over time. This phenomenon is known as concept drift. When concept drift occurs, the performance of the currently trained data analytics model degrades. Traditional offline machine learning (ML) models cannot deal with concept drift, making it necessary to develop online adaptive analytics models that can adapt to both predictable and unpredictable changes in data streams.

To address concept drift, effective methods should be able to detect concept drift and adapt to the changes accordingly. Therefore, concept drift detection and adaptation are the two major steps for online learning on data streams.
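
The two steps above (detection, then adaptation) can be illustrated with a minimal test-then-train loop. This is only a sketch under simplifying assumptions: the toy `MajorityClassModel`, the `run_stream` helper, and the fixed-window error-rate check are hypothetical names and choices made for illustration, and they are much simpler than the detectors and learners used in this repository.

```python
from collections import Counter, deque


class MajorityClassModel:
    """Toy online learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # Fall back to label 0 before any sample has been learned.
        return self.counts.most_common(1)[0][0] if self.counts else 0

    def learn(self, x, y):
        self.counts[y] += 1


def run_stream(stream, window=20, threshold=0.5):
    """Test-then-train loop with a naive windowed error-rate drift check.

    Returns (accuracy, positions at which drift was flagged).
    """
    model = MajorityClassModel()
    recent_errors = deque(maxlen=window)
    correct, n, drifts = 0, 0, []
    for i, (x, y) in enumerate(stream):
        n += 1
        err = int(model.predict(x) != y)   # 1) test on the new sample first
        correct += 1 - err
        recent_errors.append(err)
        # 2) detect: flag drift when the recent error rate exceeds the threshold
        if len(recent_errors) == window and sum(recent_errors) / window > threshold:
            drifts.append(i)
            model = MajorityClassModel()   # 3) adapt: discard the outdated model
            recent_errors.clear()
        model.learn(x, y)                  # 4) train on the now-labelled sample
    return (correct / n if n else 0.0), drifts


# Synthetic stream whose label flips abruptly halfway through.
stream = [((0.0,), 0)] * 100 + [((1.0,), 1)] * 100
accuracy, drift_points = run_stream(stream)
```

Resetting the whole model is the bluntest possible adaptation strategy; the adaptive learners listed later in this README adapt more gradually.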

Implementation

AutoML Pipeline and Procedures

  1. Dynamic Data Pre-Processing
    • Data Balancing
    • Data Normalization
  2. Dynamic Feature Engineering
    • Drift-based Dynamic Feature Selection
  3. Base Model Learning and Selection
    • Online Base Model Learning
    • Dynamic Model Selection
  4. Online Ensemble Model Development
    • Online Model Ensemble
    • Concept Drift Detection
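
As an illustration of what "dynamic" pre-processing can mean for the Data Normalization step, the sketch below keeps running per-feature minima and maxima and rescales each incoming sample with the statistics seen so far. Min-max scaling is an assumption made here for illustration (the list above does not say which normalization is used), and `OnlineMinMaxScaler` is a hypothetical class name.

```python
class OnlineMinMaxScaler:
    """Incremental min-max scaling over a stream of feature dictionaries."""

    def __init__(self):
        self.min = {}
        self.max = {}

    def learn_one(self, x):
        # Update the running minimum and maximum of every feature in x.
        for name, value in x.items():
            self.min[name] = min(value, self.min.get(name, value))
            self.max[name] = max(value, self.max.get(name, value))
        return self

    def transform_one(self, x):
        # Scale with the range observed so far; unseen or constant
        # features map to 0.0 to avoid dividing by zero.
        scaled = {}
        for name, value in x.items():
            lo = self.min.get(name, value)
            hi = self.max.get(name, value)
            scaled[name] = (value - lo) / (hi - lo) if hi > lo else 0.0
        return scaled


scaler = OnlineMinMaxScaler()
for sample in ({"bytes": 10.0}, {"bytes": 30.0}):
    scaler.learn_one(sample)
scaled = scaler.transform_one({"bytes": 20.0})
```

Because the range is learned incrementally, a value outside the range seen so far is scaled to a number outside [0, 1] until the statistics are updated with it.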

Online Learning/Concept Drift Adaptation Algorithms

  • Adaptive Random Forest (ARF) with ADWIN drift detector (ARF-ADWIN)
  • Adaptive Random Forest (ARF) with EDDM drift detector (ARF-EDDM)
  • Streaming Random Patches (SRP)
  • Extremely Fast Decision Tree (EFDT)
  • K-Nearest Neighbors (KNN) classifier with ADWIN change detector (KNN-ADWIN)
  • Self Adapting Memory (SAM) KNN model (SAM-KNN)
  • Online Passive-Aggressive (OPA)
  • Leveraging Bagging (LB)
  • Performance Weighted Probability Averaging Ensemble (PWPAE)
  • Window-based Performance Weighted Probability Averaging Ensemble (W-PWPAE)
    • Proposed in this work
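
The idea behind a window-based, performance-weighted probability averaging ensemble can be sketched as follows. This is not the authors' implementation: it assumes that each base learner's weight is inversely proportional to its error rate over a sliding window, and `WindowedWeightedEnsemble`, `window`, and `eps` are hypothetical names and parameters chosen for illustration. See the paper for the actual W-PWPAE formulation.

```python
from collections import deque


class WindowedWeightedEnsemble:
    """Combines class probabilities from several base learners.

    Assumption: weight is proportional to 1 / (windowed error rate + eps).
    """

    def __init__(self, n_learners, window=100, eps=1e-3):
        self.errors = [deque(maxlen=window) for _ in range(n_learners)]
        self.eps = eps

    def weights(self):
        raw = []
        for errs in self.errors:
            # With no history yet, treat the learner as a coin flip.
            rate = sum(errs) / len(errs) if errs else 0.5
            raw.append(1.0 / (rate + self.eps))
        total = sum(raw)
        return [w / total for w in raw]

    def combine(self, probas):
        """probas: one {label: probability} dict per base learner."""
        w = self.weights()
        labels = set().union(*probas)
        return {
            label: sum(wi * p.get(label, 0.0) for wi, p in zip(w, probas))
            for label in labels
        }

    def update(self, predictions, y_true):
        # Record a 0/1 error for each learner's hard prediction.
        for errs, pred in zip(self.errors, predictions):
            errs.append(int(pred != y_true))


ensemble = WindowedWeightedEnsemble(n_learners=2, window=10)
for _ in range(10):
    ensemble.update(predictions=[1, 0], y_true=1)  # learner 0 right, learner 1 wrong
combined = ensemble.combine([{0: 0.4, 1: 0.6}, {0: 0.9, 1: 0.1}])
predicted_label = max(combined, key=combined.get)
```

Because only the most recent `window` outcomes count, a learner that was accurate before a drift but fails afterwards loses its influence quickly.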

Drift Detection Algorithms

  • Adaptive Windowing (ADWIN)
  • Early Drift Detection Method (EDDM)
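
To give a feel for how such a detector works, here is a simplified sketch of the EDDM idea: it tracks the distance (in samples) between consecutive errors and signals drift when the mean distance plus two standard deviations falls well below the best value observed. The class name `SimpleEDDM` is hypothetical, the thresholds (0.95 for warning, 0.90 for drift, 30 errors minimum) follow the commonly cited defaults, and this is not the implementation used in the notebooks.

```python
import math


class SimpleEDDM:
    """Simplified Early Drift Detection Method (illustrative sketch)."""

    def __init__(self, warning_level=0.95, drift_level=0.90, min_errors=30):
        self.warning_level = warning_level
        self.drift_level = drift_level
        self.min_errors = min_errors
        self.reset()

    def reset(self):
        self.n_samples = 0
        self.n_errors = 0
        self.last_error_at = 0
        self.mean = 0.0   # running mean of distances between errors
        self.m2 = 0.0     # running sum of squared deviations (Welford)
        self.best = 0.0   # largest mean + 2 * std observed so far

    def update(self, error):
        """Feed one 0/1 error; returns 'stable', 'warning' or 'drift'."""
        self.n_samples += 1
        if not error:
            return "stable"
        distance = self.n_samples - self.last_error_at
        self.last_error_at = self.n_samples
        self.n_errors += 1
        delta = distance - self.mean
        self.mean += delta / self.n_errors
        self.m2 += delta * (distance - self.mean)
        score = self.mean + 2.0 * math.sqrt(self.m2 / self.n_errors)
        if self.n_errors < self.min_errors:
            return "stable"            # not enough errors to judge yet
        if score > self.best:
            self.best = score          # errors are getting further apart
            return "stable"
        ratio = score / self.best
        if ratio < self.drift_level:
            self.reset()               # start fresh for the new concept
            return "drift"
        if ratio < self.warning_level:
            return "warning"
        return "stable"


# One error every 10 samples, then an abrupt change to constant errors.
errors = [1 if (i + 1) % 10 == 0 else 0 for i in range(400)] + [1] * 200
detector = SimpleEDDM()
states = [detector.update(e) for e in errors]
```

In this synthetic stream the detector stays stable during the first 400 samples and passes through a warning phase before reporting drift once errors become frequent.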

Dataset

  1. CICIDS2017 dataset, a popular network traffic dataset for intrusion detection problems

  2. IoTID20 dataset, a novel IoT botnet dataset

Code

Requirements & Libraries

Contact-Info

Please feel free to contact me for any questions or cooperation opportunities. I'd be happy to help.

Citation

If you find this repository useful in your research, please cite this article as:

L. Yang and A. Shami, “A Multi-Stage Automated Online Network Data Stream Analytics Framework for IIoT Systems,” IEEE Transactions on Industrial Informatics, vol. 19, no. 2, pp. 2107-2116, Feb. 2023, doi: 10.1109/TII.2022.3212003.

@ARTICLE{9910406,
  author={Yang, Li and Shami, Abdallah},
  journal={IEEE Transactions on Industrial Informatics}, 
  title={A Multi-Stage Automated Online Network Data Stream Analytics Framework for IIoT Systems}, 
  year={2023},
  volume={19},
  number={2},
  pages={2107-2116},
  doi={10.1109/TII.2022.3212003}}
