
gnn-vulnerability-prediction's Introduction

Predicting vulnerability inducing function versions using node embeddings and graph neural networks

This repository contains the source code used in the journal paper "Predicting vulnerability inducing function versions using node embeddings and graph neural networks".

We aim to propose a vulnerability prediction model that runs after every code change and identifies the vulnerability-inducing functions in that version. We also aim to assess how well node-based and token-based source code representations over abstract syntax trees (ASTs) predict vulnerability-inducing functions.

This research was conducted mainly on the Wireshark project, using the Wireshark security advisories and the Wireshark bug repository.

The dataset created and used in this study is available through our submission to the Mendeley repository.

Please cite our work if you use our dataset or source code.

@article{SAHIN2022106822,
title = {Predicting vulnerability inducing function versions using node embeddings and graph neural networks},
journal = {Information and Software Technology},
volume = {145},
pages = {106822},
year = {2022},
issn = {0950-5849},
doi = {https://doi.org/10.1016/j.infsof.2022.106822},
url = {https://www.sciencedirect.com/science/article/pii/S0950584922000015},
author = {Sefa Eren Şahin and Ecem Mine Özyedierler and Ayse Tosun},
keywords = {Software vulnerabilities, Graph neural networks, Graph embeddings, Abstract syntax trees},
}

Installation

Python 3.6+ is required. Additionally, the LLVM backend is required for AST parsing.

First, install the requirements:

pip install -r requirements.txt

Then add the project to PYTHONPATH, as appropriate for your OS.
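As an alternative to exporting PYTHONPATH in the shell, a launcher script can put the checkout on the import path itself. The sketch below is illustrative only: the helper name `add_project_to_path` is hypothetical and not part of this repository, and it assumes the directory you pass is the one that contains the `vulnerability_prediction` folder.

```python
import sys
from pathlib import Path


def add_project_to_path(project_root):
    """Prepend project_root to sys.path unless it is already there.

    Hypothetical helper, not part of this repository. Returns the
    resolved path as a string.
    """
    root = str(Path(project_root).resolve())
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


if __name__ == "__main__":
    # Assumption: the script is run from the repository checkout.
    print(add_project_to_path("."))
```

This only affects the current Python process. Setting PYTHONPATH in the environment remains the simpler option when the scripts are run directly from the command line.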

Usage

Config Setup

Rename vulnerability_prediction/config/config.yaml.example to vulnerability_prediction/config/config.yaml and fill it in accordingly.
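If you prefer to keep the example file around, the step can be done as a copy instead of a rename. The following is a minimal sketch, assuming only the two file names given above; the function `ensure_config` is a hypothetical helper, not something the repository ships.

```python
import shutil
from pathlib import Path


def ensure_config(config_dir):
    """Create config.yaml from config.yaml.example if it does not exist yet.

    Hypothetical helper. Copies rather than renames, so the example file
    stays available, and never overwrites an existing config.yaml.
    Returns the path to config.yaml.
    """
    config_dir = Path(config_dir)
    example = config_dir / "config.yaml.example"
    target = config_dir / "config.yaml"
    if not target.exists():
        shutil.copyfile(example, target)
    return target
```

Calling `ensure_config("vulnerability_prediction/config")` from the project directory would create the file once. The values inside still have to be filled in by hand.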

Scrapers

Scrapers are currently implemented for the Wireshark and Mozilla Foundation bug repositories. Run their scripts:

python vulnerability_prediction/scrapers/wireshark_scraper.py
python vulnerability_prediction/scrapers/mozilla_scraper.py

Commit Mining

Commit mining is done sequentially. First, file changes are extracted. Then, commits are matched to bugs. Finally, vulnerability-inducing code changes are identified with the SZZ algorithm.

Execute the following scripts in order:

python vulnerability_prediction/commit_mining/extract_file_changes.py
python vulnerability_prediction/commit_mining/bug_commit_matching.py
python vulnerability_prediction/commit_mining/szz.py
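To make the last two steps concrete, here is a minimal sketch of bug-commit matching followed by a basic SZZ pass on toy data. It is not the code in bug_commit_matching.py or szz.py. The `Commit` class, the bug ID pattern, the file name dissector.c, and the in-memory `blame` table (standing in for `git blame` on the parent of the fix) are all assumptions made for illustration. Refinements that SZZ variants commonly add, such as ignoring comment or whitespace changes, are left out.

```python
import re
from dataclasses import dataclass, field

# Assumed message convention for illustration, e.g. "Bug: 101" or "bug 101".
BUG_ID_PATTERN = re.compile(r"\bbug[:\s#]*(\d+)", re.IGNORECASE)


@dataclass
class Commit:
    sha: str
    message: str
    # file path -> line numbers (in the parent revision) removed or rewritten
    deleted_lines: dict = field(default_factory=dict)


def match_commits_to_bugs(commits, known_bug_ids):
    """Return {sha: [bug ids]} for commits whose message cites a known bug."""
    matches = {}
    for commit in commits:
        cited = {int(m) for m in BUG_ID_PATTERN.findall(commit.message)}
        hits = sorted(cited & set(known_bug_ids))
        if hits:
            matches[commit.sha] = hits
    return matches


def szz(fix_commit, blame):
    """Return the commits that last touched the lines a fix removed.

    `blame` maps (path, line number) in the fix's parent revision to the
    sha that last modified that line, as `git blame` would report.
    """
    inducing = set()
    for path, lines in fix_commit.deleted_lines.items():
        for line in lines:
            sha = blame.get((path, line))
            if sha is not None:
                inducing.add(sha)
    return inducing


# Toy history: c3 fixes bug 101 by rewriting two lines of dissector.c.
commits = [
    Commit("c3", "Fix out-of-bounds read. Bug: 101", {"dissector.c": [10, 11]}),
    Commit("c4", "Update docs"),
]
blame = {("dissector.c", 10): "c1", ("dissector.c", 11): "c2"}

matches = match_commits_to_bugs(commits, {101})
print(matches)  # {'c3': [101]}

fix = commits[0]
print(sorted(szz(fix, blame)))  # ['c1', 'c2']
```

The core idea is that the lines a fix had to remove or rewrite are treated as the faulty ones, so the commits that introduced those lines become the vulnerability-inducing candidates.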

AST Extraction

Make sure that you have the LLVM backend and Clang installed. Then execute:

python vulnerability_prediction/ast_extraction/ast_extractor.py
