navex

Navex is an exploit generation framework for web applications. It consists of two main steps: (1) identifying vulnerable sinks through static analysis, and (2) generating concrete exploits for the identified sinks through dynamic analysis of the web applications. Navex extends or uses many open-source tools: Joern, PHPJoern, Z3, Z3-str2, crawler4j, the Narcissus JavaScript engine, and Xdebug. For more information on Navex, please read our paper "Precise and Scalable Exploit Generation for Dynamic Web Applications", published at USENIX Security 2018: https://www.usenix.org/system/files/conference/usenixsecurity18/sec18-alhuzali.pdf

Step 1: vulnerable sinks identification

For Step 1, we enhanced Joern and PHPJoern. The enhanced tools are forks of the original Joern and PHPJoern and are available at https://github.com/aalhuz/joern/tree/navex and https://github.com/aalhuz/phpjoern/tree/navex.

Using our PHPJoern and Joern forks

  • Follow all installation instructions at https://github.com/aalhuz/phpjoern/tree/navex.

  • Before parsing an application using PHPJoern, the database schema of the application has to be analyzed and formatted as a CSV file. The dbAnalysis package in https://github.com/aalhuz/joern/tree/navex/projects/extensions/joern-php/src/main/java/dbAnalysis parses the schema files and produces one file (called schema.csv by default) that holds the schema information in CSV format.

  • Run the main class in DBAnalysis.java and provide the directory that has the schema files. For example:

      	cd joern/projects/extensions/joern-php
      	java -classpath "build/libs/*:lib/*" dbAnalysis/DBAnalysis
    
  • To run the parser, you have to supply the database schema file (i.e., schema.csv), as in the following example:

      ./php2ast -f jexp -n nodes.csv -r edges.csv -d $PATH/schema.csv $APPLICATION
    

Here, $PATH stands for the directory that contains schema.csv (a placeholder, not the shell's PATH environment variable), and $APPLICATION is the path to the application to parse.

  • Edit joern/projects/extensions/joern-php/build.gradle as explained in the file.

  • Follow the rest of the instructions on how to generate code property graphs with Joern and import them into Neo4j*.

  • * Please see https://github.com/aalhuz/batch-importer2.1. It has the missing binary that you will need to load your graph into Neo4j.
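The two command lines above can also be assembled from a script. The following is a minimal sketch, not part of Navex: the helper names `php2ast_command` and `render` and the example paths are hypothetical, while the flags and the DBAnalysis command are copied from the instructions above. The sketch only builds and prints the commands; it does not run them.

```python
import shlex
from pathlib import Path

# Working directory and command for the schema analysis, as given above.
DB_ANALYSIS_CWD = Path("joern/projects/extensions/joern-php")
DB_ANALYSIS_CMD = ["java", "-classpath", "build/libs/*:lib/*", "dbAnalysis/DBAnalysis"]


def php2ast_command(schema_csv, application):
    """Build the php2ast argument list with the flags shown above."""
    return [
        "./php2ast",
        "-f", "jexp",
        "-n", "nodes.csv",
        "-r", "edges.csv",
        "-d", str(schema_csv),
        str(application),
    ]


def render(argv):
    """Render an argument list as a copy-pastable shell line."""
    return " ".join(shlex.quote(arg) for arg in argv)


# Hypothetical paths, for illustration only.
print(render(DB_ANALYSIS_CMD))
print(render(php2ast_command("/tmp/schemas/schema.csv", "/var/www/html/appName")))
```

Keeping the arguments as a list (rather than one string) avoids quoting problems if an application path contains spaces.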

Graph Traversals guided by our Attack Dictionary

To find vulnerabilities using our attack dictionary, we need to search the enhanced code property graph using Gremlin queries (graph traversals). We have added several Joern steps in our python-joern fork at https://github.com/aalhuz/python-joern/tree/navex.

  • Follow the installation instructions at https://github.com/aalhuz/python-joern/tree/navex. The Python wrapper static-main.py is the script that invokes Analysis.py, which has our attack dictionary.

  • The traversal output is written to results/static_analysis_results.txt and results/include_map_results.txt. The first file holds the analysis results, which summarize all vulnerable paths found as well as the safe sinks. The vulnerable paths are written as TAC formulas, as described in the paper. The second file holds the inclusion relationships between PHP files, which are used in Step 2. Note that the paths to the result files are hardcoded in static-main.py and need to be changed before running the Python script.
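Because the result paths are hardcoded, it can help to keep them in one place and check that both files exist before moving on to the solver. The snippet below is a sketch only and is not taken from static-main.py: the environment variable `NAVEX_RESULTS_DIR` and both function names are hypothetical; only the two file names come from the text above.

```python
import os
from pathlib import Path

# File names as listed above; the dictionary keys are illustrative labels.
RESULT_FILES = {
    "static_analysis": "static_analysis_results.txt",
    "include_map": "include_map_results.txt",
}


def result_paths(base_dir=None):
    """Return the expected result files under base_dir.

    Falls back to the hypothetical NAVEX_RESULTS_DIR environment variable,
    then to the relative "results" directory.
    """
    if base_dir is None:
        base_dir = os.environ.get("NAVEX_RESULTS_DIR", "results")
    base = Path(base_dir)
    return {label: base / name for label, name in RESULT_FILES.items()}


def missing_results(paths):
    """List the result files that do not exist yet."""
    return [str(path) for path in paths.values() if not path.is_file()]


# Usage: report what is still missing before running the solver step.
for path in missing_results(result_paths()):
    print("missing:", path)
```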

Generating exploit strings (exploit seeds) using Z3 solver

Prerequisites: Install the Z3 solver and the Z3-str2 extension. We used Z3-str2 in Navex's implementation (not Z3-str3, which was not available during our evaluation). You can find Z3-str2 at https://github.com/z3str/Z3-str and the installation instructions at https://github.com/z3str/Z3-str/blob/master/README_OLD.md.

  • The TAC formulas of each path to a vulnerable sink have to be rewritten as Z3 specifications to verify the exploitability of the path. The solver package at https://github.com/aalhuz/navex/tree/master/src/solver contains our implementation of this translation. Specifically, run StaticSolver.java, and you will be prompted to enter the vulnerability type (e.g., SQL or XSS) that you are investigating. This Java program does the following:
    • Reads both result files generated by the traversals.
    • Generates solver specification files for the vulnerable paths. The spec files can be found in the staticAnalysisSpec directory.
    • Invokes Z3-str2 to solve the constructed formulas in the spec files. The models are also written to the staticAnalysisSpec directory.
    • Creates a file that resolves the inclusion relationships to find candidate URLs (used in Step 2, as described in the paper). The file is written to results/include_map_resolution_results.txt. Note that the paths to the above directories are hardcoded and have to be changed before running the program.
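The inclusion resolution in the last item can be pictured as a reverse walk over the include map: starting from the file that contains a vulnerable sink, collect every file that includes it directly or transitively. The sketch below illustrates that idea under stated assumptions and is not the StaticSolver.java implementation. The in-memory dictionary format, the function names, and the example file names are all hypothetical; the real format of include_map_results.txt is not described here.

```python
from collections import defaultdict, deque


def candidate_entry_files(include_map, vulnerable_file):
    """Return vulnerable_file plus every file that transitively includes it.

    include_map maps a PHP file to the list of files it includes
    (an assumed representation, chosen for illustration).
    """
    # Invert the map: for each file, which files include it?
    included_by = defaultdict(set)
    for includer, included_files in include_map.items():
        for included in included_files:
            included_by[included].add(includer)

    # Breadth-first walk "upwards" from the vulnerable file.
    seen = {vulnerable_file}
    queue = deque([vulnerable_file])
    while queue:
        current = queue.popleft()
        for parent in included_by[current]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sorted(seen)


def to_url(base_url, relative_file):
    """Join a deployment base URL and a file path into a candidate URL."""
    return base_url.rstrip("/") + "/" + relative_file.lstrip("/")


# Hypothetical application layout.
include_map = {
    "index.php": ["lib/db.php"],
    "admin/edit.php": ["lib/db.php", "lib/auth.php"],
    "lib/db.php": ["lib/query.php"],
}
for name in candidate_entry_files(include_map, "lib/query.php"):
    print(to_url("http://localhost/appName/", name))
```

The `seen` set also guards against circular includes, so the walk terminates even if two files include each other.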

Step 2: concrete exploit generation

Prerequisites: On a server (e.g., localhost), deploy the applications in which Step 1 found vulnerabilities (not all the applications that you have tested). Read more about this under "setup" in the evaluation section of our paper. Xdebug is also required for trace generation; we used version 2.5.2 in our evaluation. Xdebug and its installation instructions are at https://xdebug.org/.

In addition, Narcissus and the SpiderMonkey 1.8.5 JavaScript engine are required. Download the SpiderMonkey 1.8.5 JS engine from https://developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey/Releases/1.8.5 and build it. You can find the build instructions at https://developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey/Build_Documentation. Let $SpiderMonkey be the SpiderMonkey 1.8.5 installation directory. Download our modified version of the Narcissus JavaScript interpreter from https://github.com/aalhuz/narcissus and place the downloaded folder inside $SpiderMonkey/js. Finally, edit the path to the narcissus folder inside the "extractJSConstraints" function in https://github.com/aalhuz/navex/blob/master/src/Main.java.
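A quick check that the Narcissus folder ended up in the expected place can save a failed crawl later. This is a small sketch rather than anything shipped with Navex: the function names are hypothetical, and the folder name "narcissus" is an assumption based on the repository name.

```python
from pathlib import Path


def narcissus_dir(spidermonkey_root, folder_name="narcissus"):
    """Expected location of the Narcissus folder: $SpiderMonkey/js/<folder_name>.

    folder_name defaults to the repository name, which is an assumption.
    """
    return Path(spidermonkey_root) / "js" / folder_name


def check_narcissus(spidermonkey_root):
    """Return the Narcissus directory, or raise if it is not in place."""
    target = narcissus_dir(spidermonkey_root)
    if not target.is_dir():
        raise FileNotFoundError(f"expected the Narcissus folder at {target}")
    return target


# Usage (hypothetical install location):
#   print(check_narcissus("/opt/js-1.8.5"))
```

The path returned here is the value that the "extractJSConstraints" function would need to point to.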

In this step, Navex crawls web applications to construct their navigation graphs (Neo4j graph).

The Navigation graph setup

  • Specify the name of the navigation graph that we are about to construct in "org.neo4j.server.database.location" in $PATH_TO_YOUR_NEO4J_INSTALLATION/conf/neo4j-server.properties. For example:

    org.neo4j.server.database.location = "$YOUR_PATH_TO_BATCH_IMPORT/navigationGraph.db"

Then, point to your Neo4j installation and start the server.
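If you switch between several navigation graphs, editing the properties file by hand gets tedious. The helper below is a sketch of one way to update the setting programmatically. It is not part of Navex; the function name and the example path are hypothetical, and the helper writes the value exactly as passed in, so whether to quote it is left to the caller.

```python
def set_property(text, key, value):
    """Return properties-file text with `key` set to `value`.

    Existing non-comment lines that define `key` are replaced; if there
    is none, a new line is appended. Other lines are kept unchanged.
    """
    lines = text.splitlines()
    new_line = f"{key}={value}"
    replaced = False
    for index, line in enumerate(lines):
        stripped = line.lstrip()
        if stripped.startswith("#"):
            continue  # leave comments alone
        name = stripped.split("=", 1)[0].strip()
        if name == key:
            lines[index] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Usage with an in-memory example; the path is hypothetical.
original = "# Neo4j server settings\norg.neo4j.server.database.location=data/graph.db\n"
print(set_property(
    original,
    "org.neo4j.server.database.location",
    "/opt/batch-import/navigationGraph.db",
))
```

To apply it to a real installation, read neo4j-server.properties into a string, pass it through `set_property`, and write the result back before starting the server.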

Application crawling

We have extended crawler4j in the fork https://github.com/aalhuz/crawler4j/tree/navex to allow for web forms and JavaScript reasoning.

  • To run the extended crawler and construct the navigation graph, edit "run.pl" in the navex directory as explained in the script. Then run "run.pl" as follows:

      cd navex
      ./run.pl data 1 config/auth-appName.txt $SEED_URL
    

config/auth-appName.txt is a file that you have to create to store login information for appName. A sample file is provided. $SEED_URL is the seed URL for the crawler (e.g., http://localhost/appName/index.php). While crawling the applications, nodes and edges will be added to the navigationGraph.db simultaneously.

Concrete exploit generation

To find navigation paths to exploit seeds, we have to traverse the navigation graph using exploitFinding.py in our python-joern fork. This wrapper script invokes traversals that check the inclusion map (in results/include_map_resolution_results.txt), match it with the exploit seeds (i.e., exploit strings), and finally output the concrete exploits to results/navigation_sequences.txt.

	    cd python-joern
	    python exploitFinding.py $ATTACK_TYPE

For XSS, for instance, the $ATTACK_TYPE would be "xss" (python exploitFinding.py xss).
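Conceptually, a navigation sequence is a path in the navigation graph from the seed URL to a page that reaches the vulnerable sink. The toy example below illustrates this with a breadth-first search over an in-memory edge list. It is a sketch only: exploitFinding.py runs its traversals against the Neo4j graph and also matches exploit seeds, and choosing the shortest path is an assumption made here for illustration. The function name and URLs are hypothetical.

```python
from collections import deque


def navigation_sequence(edges, seed_url, target_url):
    """Return a shortest list of pages from seed_url to target_url, or None."""
    # Adjacency list built from (source, target) navigation edges.
    successors = {}
    for source, target in edges:
        successors.setdefault(source, []).append(target)

    previous = {seed_url: None}  # page -> page it was first reached from
    queue = deque([seed_url])
    while queue:
        page = queue.popleft()
        if page == target_url:
            # Walk the predecessor links back to the seed.
            path = []
            while page is not None:
                path.append(page)
                page = previous[page]
            return path[::-1]
        for next_page in successors.get(page, []):
            if next_page not in previous:
                previous[next_page] = page
                queue.append(next_page)
    return None


# Hypothetical navigation edges for an application called appName.
base = "http://localhost/appName/"
edges = [
    (base + "index.php", base + "login.php"),
    (base + "index.php", base + "list.php"),
    (base + "list.php", base + "edit.php"),
    (base + "login.php", base + "index.php"),
]
print(navigation_sequence(edges, base + "index.php", base + "edit.php"))
```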
