ocsvm-anomaly-detection's Introduction

Unsupervised Anomaly Detection with One-Class Support Vector Machine

This repository contains code for unsupervised anomaly detection with a One-Class SVM (Support Vector Machine). The code expects CIFAR-10 as the dataset. First, a convolutional autoencoder (CAE) compresses the image data into feature vectors. Second, a One-Class SVM is trained only on the features of the data you define as normal. Finally, you run anomaly detection with the trained One-Class SVM and evaluate the models by the AUCs of the ROC and PR curves.

Dependencies

scikit-learn, Keras, NumPy, OpenCV

My test environment: Python 3.6, scikit-learn==0.21.2, Keras==2.2.4, numpy==1.16.4, opencv-python==4.1.0.25

How to use

1. Prepare data

Prepare the data and labels to use. For instance, CIFAR-10 consists of 10 classes; each label should be an integer that uniquely identifies a class. Place the prepared data in the "data" directory.

You can download the CIFAR-10 data from:
https://www.kaggle.com/janzenliu/cifar-10-batches-py

Put the downloaded files in the "data" directory and run the following command to pack them into an NPZ file.

python make_cifar10_npz.py

After running this command, cifar10.npz is created in the "data" directory.
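The conversion step might look roughly like the sketch below. It assumes the standard "python version" CIFAR-10 batch layout (pickled dicts with a b'data' array of shape (N, 3072) and a b'labels' list). The function names load_batch, batch_to_arrays and save_npz are hypothetical and are not taken from make_cifar10_npz.py.

```python
import pickle

import numpy as np


def load_batch(path):
    """Load one pickled CIFAR-10 batch file into a dict with bytes keys."""
    with open(path, "rb") as f:
        return pickle.load(f, encoding="bytes")


def batch_to_arrays(batch):
    """Convert one batch dict to (images, labels).

    Each row of b'data' stores 1024 red, then 1024 green, then 1024 blue
    values, so we reshape to (N, 3, 32, 32) and move the channels last.
    """
    data = np.asarray(batch[b"data"], dtype=np.uint8)
    images = data.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)
    labels = np.asarray(batch[b"labels"], dtype=np.int64)
    return images, labels


def save_npz(batches, out_path):
    """Concatenate several batches and store them under 'images' and 'labels'."""
    pairs = [batch_to_arrays(b) for b in batches]
    images = np.concatenate([p[0] for p in pairs])
    labels = np.concatenate([p[1] for p in pairs])
    np.savez_compressed(out_path, images=images, labels=labels)
    return images.shape, labels.shape
```

The channels-last layout (N, 32, 32, 3) is an assumption chosen because it is the default image layout in Keras; adjust it if your model expects something else.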

(Optional)

When you use your own dataset, prepare an NPZ file in the same format as the CIFAR-10 one.

import numpy as np

data = np.load('your_data.npz')
data.files
-> ['images', 'labels'] # the file must contain the "images" and "labels" keys

data['labels']
-> array([6, 9, 9, ..., 5, 1, 7]) # labels is a vector of integers, each identifying a class
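A quick sanity check of a custom file could look like the following sketch. check_npz_format is a hypothetical helper, not part of this repository, and it only checks the properties described above: the "images" and "labels" keys, an integer label vector, and one label per image.

```python
import numpy as np


def check_npz_format(data):
    """Raise ValueError if `data` does not follow the expected layout.

    `data` can be the object returned by np.load(...) or a plain dict.
    Returns (image array shape, sorted list of class identifiers).
    """
    keys = data.files if hasattr(data, "files") else data.keys()
    for key in ("images", "labels"):
        if key not in keys:
            raise ValueError(f"missing key: {key!r}")
    images = np.asarray(data["images"])
    labels = np.asarray(data["labels"])
    if labels.ndim != 1:
        raise ValueError("labels must be a 1-D vector")
    if not np.issubdtype(labels.dtype, np.integer):
        raise ValueError("labels must be integers")
    if len(images) != len(labels):
        raise ValueError("images and labels must have the same length")
    return images.shape, sorted(np.unique(labels).tolist())
```

Calling check_npz_format(np.load('your_data.npz')) before training catches format problems early instead of partway through the CAE run.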

Note: Be careful about the input image size expected by model.py.
You might need to change the network architecture's parameters so that it can handle your images.
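To see why the input size matters, the sketch below traces the spatial size of an image through an encoder and decoder. It assumes a hypothetical architecture with 2x2 pooling stages in the encoder and matching 2x upsampling stages in the decoder; the actual layers in model.py may differ, so treat this as an illustration rather than a description of the repository's network.

```python
import math


def trace_sizes(size, num_pools, padding="same"):
    """Return (encoded_size, decoded_size) for a square input of side `size`.

    Each encoder stage halves the side with 2x2 pooling (rounding up for
    'same' padding, down for 'valid'); each decoder stage doubles it.
    """
    encoded = size
    for _ in range(num_pools):
        encoded = math.ceil(encoded / 2) if padding == "same" else encoded // 2
    decoded = encoded * 2 ** num_pools
    return encoded, decoded


def is_compatible(size, num_pools, padding="same"):
    """True when the decoder output has the same side length as the input."""
    return trace_sizes(size, num_pools, padding)[1] == size
```

Under these assumptions, 32x32 CIFAR-10 images survive three pooling stages (32, 16, 8, 4 and back to 32), while a 28x28 input would be reconstructed as 32x32, so the reconstruction loss could not be computed without changing the number of stages or resizing the images.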

2. Train CAE

Run the following command. Settable parameters such as the number of epochs, the batch size, and the output directory are described in the script.

python cae.py

The features encoded by the CAE are saved in the "data" directory as cifar10_cae.npz.

3. Run Anomaly Detection

First, define the normal class with "normal_label". All classes other than the normal class are then automatically treated as abnormal. Running anomaly_detection_ocsvm.py trains the OC-SVM on the normal data only. As evaluation metrics, the AUCs of the ROC (Receiver Operating Characteristic) and PR (Precision-Recall) curves are calculated.

By default, training and testing are repeated over different values of the nu parameter (see the scikit-learn documentation; gamma and kernel are fixed in the script). For each nu and its trained model, the AUCs are averaged over 10 different test data sets.
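The idea can be sketched with scikit-learn as follows. This is a minimal illustration, not the code of anomaly_detection_ocsvm.py: the function name evaluate_ocsvm, the half-and-half split of the normal class, the rbf kernel with gamma="scale", and the choice of the abnormal class as the positive class are all assumptions.

```python
import numpy as np
from sklearn.metrics import auc, precision_recall_curve, roc_auc_score
from sklearn.svm import OneClassSVM


def evaluate_ocsvm(features, labels, normal_label, nu=0.1, seed=0):
    """Train on part of the normal class and score the remaining data.

    Returns (roc_auc, pr_auc), treating abnormal samples as the positive class.
    """
    rng = np.random.RandomState(seed)
    normal_idx = rng.permutation(np.where(labels == normal_label)[0])
    abnormal_idx = np.where(labels != normal_label)[0]

    # Hold out half of the normal samples so the test set contains both classes.
    split = len(normal_idx) // 2
    train_idx, test_normal_idx = normal_idx[:split], normal_idx[split:]
    test_idx = np.concatenate([test_normal_idx, abnormal_idx])

    model = OneClassSVM(nu=nu, kernel="rbf", gamma="scale")
    model.fit(features[train_idx])

    # decision_function is larger for inliers, so negate it to get an anomaly score.
    scores = -model.decision_function(features[test_idx])
    y_true = (labels[test_idx] != normal_label).astype(int)

    roc_auc = roc_auc_score(y_true, scores)
    precision, recall, _ = precision_recall_curve(y_true, scores)
    return roc_auc, auc(recall, precision)
```

Here features would be the CAE-encoded vectors and labels the integer class identifiers; only samples whose label equals normal_label are ever shown to the model during training.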

python anomaly_detection_ocsvm.py

Please look into the script for the settable parameters.
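The repetition over nu values and test sets might be organized like the sketch below. It is an assumption-laden illustration: sweep_nu, the test subset size, and the way test subsets are drawn are hypothetical and may not match how the script builds its 10 test sets.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.svm import OneClassSVM


def sweep_nu(train_x, test_x, test_y, nus, n_repeats=10, test_size=100, seed=0):
    """Return {nu: mean ROC AUC} averaged over random test subsets.

    test_y uses 1 for abnormal and 0 for normal samples. Every subset must
    contain both classes, otherwise roc_auc_score raises an error.
    """
    rng = np.random.RandomState(seed)
    # Draw the test subsets once so every nu is scored on the same data.
    subsets = [rng.choice(len(test_x), size=test_size, replace=False)
               for _ in range(n_repeats)]
    results = {}
    for nu in nus:
        model = OneClassSVM(nu=nu, kernel="rbf", gamma="scale").fit(train_x)
        aucs = []
        for idx in subsets:
            scores = -model.decision_function(test_x[idx])  # higher = more anomalous
            aucs.append(roc_auc_score(test_y[idx], scores))
        results[nu] = float(np.mean(aucs))
    return results
```

Reusing the same test subsets for every nu keeps the comparison between nu values fair, since differences in the averaged AUC then come from the model and not from the sampling.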

scikit-learn(sklearn.svm.OneClassSVM)
http://scikit-learn.org/stable/modules/generated/sklearn.svm.OneClassSVM.html

ocsvm-anomaly-detection's People

Contributors

hiram64
