Malware Analysis

This repository contains code for analyzing malware using four different machine learning models: Convolutional Neural Networks (CNN), Generative Adversarial Networks (GAN), Support Vector Machines (SVM), and Random Forest. The analysis covers both detection and classification of malware files.

Generative Adversarial Networks (GAN)

GAN Image

Figure 1: GAN Architecture

A Generative Adversarial Network (GAN) is a deep learning model used for generating synthetic data, and is one of the models used for malware analysis in this repository. The code for this model can be found in the GAN folder. The idea is to use a GAN-based algorithm to generate adversarial malware examples that can bypass black-box, machine-learning-based detection models. Figure 1 shows the training architecture of the adversarial malware generator.
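A minimal sketch of how an adversarial example might be assembled, assuming a MalGAN-style setup in which each sample is a binary feature vector (for instance, which API calls are present) and the generator may only add features, never remove them, so the original behaviour is preserved. The feature layout, the threshold value, and the helper name make_adversarial are illustrative assumptions and are not taken from the code in the GAN folder.

```python
from typing import List


def make_adversarial(malware: List[int], generator_output: List[float],
                     threshold: float = 0.5) -> List[int]:
    """Combine a binary malware feature vector with generator scores.

    Hypothetical helper: binarise the generator's per-feature scores, then
    take an element-wise OR with the original features, so features that
    are already present are never removed.
    """
    if len(malware) != len(generator_output):
        raise ValueError("feature vectors must have the same length")
    added = [1 if score > threshold else 0 for score in generator_output]
    return [m | a for m, a in zip(malware, added)]


original = [1, 0, 0, 1, 0]           # assumed binary features of a sample
scores = [0.1, 0.9, 0.2, 0.3, 0.7]   # assumed generator output in [0, 1]
adversarial = make_adversarial(original, scores)
print(adversarial)  # [1, 1, 0, 1, 1]
```

In a full training loop the adversarial vector would be labelled by the black-box detector, and those labels would drive the updates of a substitute detector and of the generator.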

Convolutional Neural Networks (CNN)

CNN Image

Figure 2: CNN Architecture

A Convolutional Neural Network (CNN) is a deep learning model used for image classification, and is another model used for malware analysis in this repository. The CNN model used for this project consists of several convolutional layers, followed by max pooling layers and fully connected layers. The model is trained on the dataset using backpropagation and gradient descent to minimize the cross-entropy loss. The code for this model can be found in the CNN folder.
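To make the training objective concrete, here is a small sketch of softmax cross-entropy and a single gradient descent step. As a simplifying assumption, the class scores (logits) are treated as the trainable parameters themselves; in the actual CNN the same gradient would be backpropagated through the fully connected, pooling, and convolutional layers. The function names and the learning rate are illustrative only.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Convert raw class scores into probabilities (numerically stable)."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: List[float], label: int) -> float:
    """Negative log-probability assigned to the true class."""
    return -math.log(softmax(logits)[label])


def gradient_step(logits: List[float], label: int, lr: float = 0.5) -> List[float]:
    """One gradient descent step on the logits.

    For softmax cross-entropy, d(loss)/d(logit_i) = p_i - y_i,
    where y is the one-hot encoding of the true label.
    """
    probs = softmax(logits)
    grad = [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]
    return [z - lr * g for z, g in zip(logits, grad)]


logits = [0.0, 0.0, 0.0]   # three assumed malware families, no preference yet
label = 2                  # index of the true class
before = cross_entropy(logits, label)
after = cross_entropy(gradient_step(logits, label), label)
print(after < before)  # True
```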

Random Forest

RF Image

Figure 3: Random Forest Architecture

The Random Forest model used for this project consists of multiple decision trees, each trained on a subset of the dataset. The model is trained on the dataset using the Random Forest algorithm, which generates predictions by aggregating the predictions of multiple decision trees.
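The two ingredients described above, per-tree subsets of the data and aggregation of the tree predictions, can be sketched as follows. This assumes bootstrap sampling (drawing with replacement) for the subsets and a majority vote for aggregation, which is the usual formulation for classification; the three threshold rules standing in for trained trees are hypothetical.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]   # (feature vector, label)
Tree = Callable[[Sequence[float]], int]


def bootstrap(data: List[Sample], rng: random.Random) -> List[Sample]:
    """Draw len(data) samples with replacement, one such set per tree."""
    return [rng.choice(data) for _ in data]


def majority_vote(votes: Sequence[int]) -> int:
    """Return the most common label among the votes."""
    return Counter(votes).most_common(1)[0][0]


def forest_predict(trees: Sequence[Tree], x: Sequence[float]) -> int:
    """Aggregate the predictions of all trees for one sample."""
    return majority_vote([tree(x) for tree in trees])


# Hypothetical stand-ins for trained decision trees (1 = malware, 0 = benign).
trees: List[Tree] = [
    lambda x: int(x[0] > 0.5),
    lambda x: int(x[1] > 0.5),
    lambda x: int(x[0] + x[1] > 1.2),
]

print(forest_predict(trees, [0.9, 0.2]))  # votes 1, 0, 0 -> 0
print(forest_predict(trees, [0.9, 0.8]))  # votes 1, 1, 1 -> 1
```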

Support Vector Machine (SVM)

SVM Image

Figure 4: SVM Architecture

In the DL-SVM classifier we use three models for malware classification: MLP-SVM, GRU-SVM, and CNN-SVM. MLP-SVM combines a multilayer perceptron (MLP) neural network with an SVM classifier; GRU-SVM and CNN-SVM do the same with a gated recurrent unit (GRU) network and a CNN, respectively. In all three models, the dataset is divided into training and testing sets, and the model is trained using the training set. The model is then evaluated on the testing set using metrics such as accuracy, precision, and recall.
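As a reference for the evaluation metrics named above, the following sketch computes accuracy, precision, and recall from true and predicted labels. It assumes a binary labelling (1 = malware, 0 = benign) for simplicity; for multi-class malware family classification, precision and recall would typically be computed per class and averaged. The helper name binary_metrics and the example labels are made up for illustration.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, and recall for binary labels (1 = positive)."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    total = len(pairs)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


y_true = [1, 1, 1, 0, 0, 0, 1, 0]   # made-up test-set labels
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]   # made-up model predictions
print(binary_metrics(y_true, y_pred))
# {'accuracy': 0.75, 'precision': 0.75, 'recall': 0.75}
```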

Results

GAN Results

The existing malware samples are modified by adding noise and certain parameters. These samples are then tested against various models to assess each model's capabilities. Parameters such as the learning rate (LR) and optimizer can also be changed to better understand how the model behaves.

Index:
Blue: RandomForest
Pink: Logistic Regression
Yellow: Decision Tree
White: MultiLayerPerceptron

Detector Loss

GAN Image

Generator Loss

GAN Image

CNN Results

Accuracy of the model

CNN Image

SVM Results

Accuracy of the 3 models

CNN Image

Random Forest Results

Accuracy in percentages

CNN Image
