andresriancho / w3af

w3af: web application attack and audit framework, the open source web vulnerability scanner.

Home Page: http://w3af.org/

Shell 0.13% Python 73.74% PHP 0.01% Assembly 0.03% HTML 24.35% JavaScript 0.45% C 0.10% Perl 0.02% C++ 0.02% PLpgSQL 0.01% ASP 0.02% Java 0.01% Roff 0.99% Rebol 0.05% Smarty 0.04% Dockerfile 0.03% Hack 0.01% TSQL 0.01%
scanner security appsec cross-site-scripting sql-injection

w3af's Introduction

w3af - Web Application Attack and Audit Framework

w3af is an open source web application security scanner which helps developers and penetration testers identify and exploit vulnerabilities in their web applications.

The scanner is able to identify 200+ vulnerabilities, including Cross-Site Scripting, SQL injection and OS commanding.

Contributing

Pull requests are always welcome! If you're not sure where to start, please take a look at the First steps as a contributor document in our wiki. All contributions, no matter how small, are welcome.

Links and documentation

Sponsors

Holm Security sponsors the project and uses w3af as part of their amazing automated and continuous vulnerability assessment platform.

Found this project useful? Donations are accepted via ethereum at 0xb1B56F04E6cc5F4ACcB19678959800824DA8DE82

w3af's People

Contributors

andresriancho, aronmolnar, artem-smotrakov, attwad, bdamele, codychamberlain, foobarmonk, glira, gxsghsn, inkz, jarrodcoulter, jekil, leks-ha, linerd0196, maniqui, mdeous, meatballs1, mleblebici, mpoindexter, nixwizard, owentuz, paralax, pvdl, q-back, rhabacker, stamparm, swans0n, tvelazquez, vinnytroia, vulnerscom

w3af's Issues

Improve error handling in extended_urllib.py

Review this comment and the associated code: "This except clause will catch unexpected errors For the first N errors, return an empty response... Then a w3afMustStopException will be raised"
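
A minimal sketch of that intended behaviour, assuming a hypothetical wrapper; MAX_ERROR_COUNT and the stand-in exception class are illustrative, not w3af's actual extended_urllib API:

class w3afMustStopException(Exception):
    """Stand-in for w3af's real exception class."""


class ErrorTolerantSender(object):
    # Hypothetical threshold: the "N" from the comment quoted above.
    MAX_ERROR_COUNT = 10

    def __init__(self, send_func):
        self._send = send_func
        self._error_count = 0

    def send(self, request):
        try:
            return self._send(request)
        except Exception as e:
            # For the first N unexpected errors return an empty response so
            # the scan can keep going; after that, stop the whole scan.
            self._error_count += 1
            if self._error_count <= self.MAX_ERROR_COUNT:
                return ''  # placeholder for an "empty" HTTP response object
            raise w3afMustStopException('Too many HTTP errors: %s' % e)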

Show progress and status

Show progress and status in some meaningful way in both consoleUI and gtkUI. Also, fix this error: "Current value can never be greater than max value!"

Migrate unittests that use pysvn

Some unittests in w3af use SVN repo meta-data to determine if the file needs to be updated. Migrate this to use git meta-data

Migrate ticket creation

Migrate automated ticket creation to use the w3af issues for tickets.
Create a new label to report vulnerabilities there.

Valid scan, exploit, run remote command

Perform a valid scan and verify vulnerabilities appear in log and KB; exploit the vulnerability and execute a command in the shell; execute a payload in the shell

Payload testing in GUI

  • Unittest GTK_UI lsp command output (./w3af_console -n -s scripts/script-local_file_include-payload-debug.w3af)
  • Unittest GTK_UI running payload without parameters
  • Unittest GTK_UI running payload with parameters

Export request tool tests

Export request tool:

  • Export request to python
  • Export request to html
  • Export request to ajax
  • Export request to ruby

Improve w3af's score for WAVSEP XSS by at least 20%

User story

As a user I want w3af to find as many XSS vulnerabilities as possible.

Conditions of satisfaction

  • Unittests for all XSS sections of WAVSEP need to be written
  • Coverage % needs to be calculated and asserted
  • Once that's done, improve detection rate by 20%
  • If we're not at 100%, create a new task (similar to this one) to improve another 10%

Context code rewrite v3.0

While I was re-writing the code as specified in the Context code rewrite v2.0 section, one main issue appeared: finding the context based on inside_context (see below and the code in the xss branch) was too error/false-positive prone. With inside_context the context accuracy is rather low.

The main reason for using the previous strategy was that we're sending payloads that might break the HTML structure, so classic parsers would be unable to determine the payload's location. A solution for this issue would be to (see the sketch after this list):

  • Send the payload
  • Replace the payload with something "clean" such as "123456789abcdef"
  • Find the context for the clean payload, which doesn't break any HTML structures
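
A rough sketch of that strategy, assuming a hypothetical get_context() callable that stands in for whichever HTML parser we end up choosing:

CLEAN_TOKEN = '123456789abcdef'

def find_payload_context(response_body, payload, get_context):
    # Replace the (potentially HTML-breaking) payload with a clean token so
    # that a regular HTML parser can determine the context of the token.
    clean_body = response_body.replace(payload, CLEAN_TOKEN)

    # get_context(html, token) -> Context instance; hypothetical helper.
    return get_context(clean_body, CLEAN_TOKEN)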

Using that strategy we would be able to use almost any HTML parser to determine the context. The problems that I foresee now are:

  • Parsers might not be able to identify all the contexts we need
  • Parsers might not distinguish between single-, double- and backtick-delimited tag attributes
  • lxml had problems with memory leaks - if we choose an HTML parser which gives us everything we need but is not well tested, we might end up with a problem like that once again

The same problem with context appears when parsing JavaScript and CSS, so we might need to find a parser for that too.

An option that we might experiment with is to build our own parser-lexer

This method looks promising. If we use it, there is a simple trick that might come in handy to determine whether the payload is in single, double or backtick quotes: dump the tag text, get the attribute value (containing the payload) and search the tag text for "fooPAYLOADbar" in double quotes; if that's not there, search for the same string in single quotes, and so on. The only detail would be to check for escaped " and ' characters. A sketch of this trick follows below.
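
A minimal sketch of the quote-detection trick (escape handling left out on purpose):

def get_attr_quote_char(tag_text, attr_value):
    # Search the raw tag text for the attribute value wrapped in each of the
    # possible quote characters; the first match tells us the delimiter.
    for quote in ('"', "'", '`'):
        if quote + attr_value + quote in tag_text:
            return quote
    return None  # unquoted attribute, or escaping got in the way


# Example: the payload landed inside a double-quoted href attribute
# >>> get_attr_quote_char('<a href="fooPAYLOADbar">click</a>', 'fooPAYLOADbar')
# '"'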

Context code rewrite v2.0

The code that processes the HTML and makes it possible to identify the HTML context where the payload landed is complex, hard to debug and extend. So I propose a rewrite with the following objectives:

  • Completely remove the ByteChunk
  • Make each context self-contained and testable
  • Write contexts in such a way that they can be nested
  • Write contexts to decode JS

New context classes

class Context(object):
    NAME = 'HTML'

    @staticmethod
    def match(normalized_html):
        return True

    def can_break(self, payload):
        raise NotImplementedError

    def executable(self):
        return False

    @staticmethod
    def inside_context(normalized_html, context_start, context_end):
        """
        :return: True if we perform a reverse find of context_start (ie '<script'), then
                 a reverse find of context_end (ie. '</script>') and the second has a
                 lower index; meaning we're still in the context of the '<script>' tag

        :param context_start: Would be '<script' in the example above
        :param context_end: Would be '</script>' in the example above
        """
        return normalized_html.rfind(context_start) > normalized_html.rfind(context_end)

    @staticmethod
    def current_context_content(normalized_html, context_start, context_end):
        """
        Extract the current context text, handles the following cases:
            <script type="application/json">foo();</script>
            <script>foo();</script>

        Returning 'foo();' in both.

        :param context_start: Would be '<script' in the example above
        :param context_end: Would be '</script>' in the example above
        """
        pass

This context matches if we're inside a <script> tag:

class ScriptTagContext(Context):
    NAME = 'SCRIPT_TAG'

    @staticmethod
    def match(normalized_html):
        return Context.inside_context(normalized_html, '<script', '</script>')

This context matches when we're inside a script tag and a multi-line comment:

class ScriptTagMultiLineCommentContext(ScriptTagContext):
    NAME = 'SCRIPT_MULTI_COMMENT'

    @staticmethod
    def match(normalized_html):
        if not ScriptTagContext.match(normalized_html):
            return False
        script_code = Context.current_context_content(normalized_html, '<script', '</script>')
        js_context = get_js_context(script_code)
        if js_context is None:
            return False
        return isinstance(js_context, JSMultiLineComment)

    def can_break(self, payload):
        return '*/' in payload

Since contexts don't hold any data I believe it would be smart to make the match method a @staticmethod and only create an instance when we find a match and need to return it to the xss plugin.

With this new approach the order in which we match the contexts is very important. It MUST start with the most specific contexts and then move down to the generic cases (ScriptTagMultiLineCommentContext goes before ScriptTagContext), as in the dispatcher sketch below.
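
A sketch of the dispatcher the xss plugin could use, reusing the context classes sketched above; the exact list of contexts is illustrative:

# Most specific contexts first, generic ones last.
ALL_CONTEXTS = [ScriptTagMultiLineCommentContext, ScriptTagContext, Context]

def get_context_for(normalized_html):
    for context_klass in ALL_CONTEXTS:
        if context_klass.match(normalized_html):
            return context_klass()  # instantiate only when there is a match
    return None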

It might be a good idea to write inside_context(normalized_html, '<script', '</script>') with an LRU cache, since we might call it with the same parameters several times (once for each context and the sub-contexts inside it); a sketch is shown below.
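
A sketch of the cached helper; note that functools.lru_cache is Python 3 standard library, so on Python 2 (which w3af targets) a backport such as functools32 or a small memo dict would be needed:

from functools import lru_cache

@lru_cache(maxsize=256)
def inside_context(normalized_html, context_start, context_end):
    # True when the last occurrence of context_start comes after the last
    # occurrence of context_end, i.e. the payload is still inside that tag.
    return normalized_html.rfind(context_start) > normalized_html.rfind(context_end)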

Sub-contexts for JavaScript

JavaScript can be found in several places:
  • <script>...</script>
  • <a href="javascript:...">
  • <a onmouseover="...">

When found we need to analyze the JS code to understand if in this context we can run arbitrary code (or not). In order to do this we'll need JavaScript sub-contexts.

What I have in mind is to:

  • First write a context analyzer for JS which, given a piece of JS code and a payload, will tell me if I can_break or if it's executable (just like the other contexts)
  • Then call this context handler each time JS is found in the HTML-level contexts

Sub-contexts for Style Sheets

Same as above but for CSS

Encode/Decode tool tests

Encode/Decode tool:

  • Decode using URL encoding
  • Encode using URL encoding
  • Decode invalid data using base64

OS Detection refactoring

OS detection needs to be in the core (near 404 fingerprinting). Also add ICMP fingerprinting and a URL filename case sensitivity check (index.html vs. indEX.html); a sketch of the latter is shown below.
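
A minimal sketch of the case sensitivity check, assuming a hypothetical http_get() helper that returns an object with status_code and body attributes (not w3af's actual HTTP client API):

def filesystem_is_case_insensitive(url_path, http_get):
    # Request the resource with the original filename and with its case
    # swapped (index.html vs. INDEX.HTML). If both succeed with the same
    # body, the remote filesystem is most likely case-insensitive, which
    # hints at a Windows-based OS.
    original = http_get(url_path)
    if original.status_code != 200:
        return None  # can't tell from this URL

    directory, _, filename = url_path.rpartition('/')
    swapped = http_get('%s/%s' % (directory, filename.swapcase()))

    return swapped.status_code == 200 and swapped.body == original.body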

Manual request editor tests

Manual request editor:

  • Send GET request to a local web server, assert at the web server level that the information is correct
  • Send POST and make the same assertions as in the previous case
  • Send a request to an offline web server
  • Modify the request to be invalid, assert that the send button is disabled and that the background is red. Modify the request again to be valid and verify that it was properly sent.

Migrate auto-update to git

The current auto-update code looks for updates from Sourceforge's repository using pysvn. Modify the code to make it look for updates in this repo.

8.3 filename detection plugin

Verify if I can write a plugin or core component that exploits the 8.3 filename format as explained by Bogdan in a blog post. Tomas sent iis_short_name_brute.py a while ago which could be useful; but I was thinking about something that wouldn't depend on a separate wordlist. My idea would work more like:

  • Intercept all HTTP requests and responses
  • Verify if the remote server supports 8.3
  • If the response was a 404 and the remote server supports 8.3, try the short name instead.
    The good thing about this is that if the user enabled the 8.3 plugin and nikto, and nikto requests /backup2012.tgz and it doesn't exist, the 8.3 plugin would request /backup~1.tgz and that might exist. The bad thing is that it is a mixture between a grep plugin (needs to read all HTTP traffic) and a crawl plugin (needs to perform requests and return new URLs to the core), which might be difficult to implement while respecting the framework's rules. A sketch of the short-name rewrite follows below.
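
A rough sketch of deriving an 8.3 short-name candidate from a long filename; the real Windows algorithm also involves hashing and collision handling (~2, ~3, ...), so this only produces a first guess to retry after a 404:

def short_name_candidate(filename):
    name, _, ext = filename.rpartition('.')
    if not name:
        name, ext = ext, ''

    # 8.3 names are upper-case and only keep a restricted character set;
    # stripping down to alphanumerics is a simplification.
    name = ''.join(c for c in name if c.isalnum()).upper()
    ext = ''.join(c for c in ext if c.isalnum()).upper()[:3]

    if len(name) > 8:
        name = name[:6] + '~1'

    return '%s.%s' % (name, ext) if ext else name


# >>> short_name_candidate('backup2012.tgz')
# 'BACKUP~1.TGZ'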

Remove output manager singleton

The simplest solution would be to set the output manager instance as an attribute of the w3afCore, but maybe there is a different design pattern I could use? Research design patterns for logging.

This task gets even more complicated because of "Move grep plugins to a multiprocessing process pool" #28 , which requires me to be able to send logging messages from different processes.

More details and related information at "Remove KB and CF singleton objects" #26

Fast Exploit refactoring using vulnerability templates

The idea is to rewrite the attack plugin "fast exploit" feature, which never worked as expected, using a simple idea: "Users add vulnerabilities to the KB manually, attack plugins exploit them just as they would exploit a vulnerability added by a plugin". To do this, I have to define vulnerability templates that allow users to add the vulns with the required data.

This will require code at the core level, testing of all templates with the corresponding attack plugin, and modifications in the console and GUI in order to give the user the ability to add vulnerabilities to the KB.

Fuzzy request editor tests

Fuzzy request editor:

  • Send N requests to a local web server, assert at the web server level that the information is correct
  • Try to send an invalid request, assert the exception is correctly handled
  • Send a request to an offline web server

New plugin to detect XXE

User story

As a user I want to be able to detect XXE vulnerabilities

Scenarios to cover

  • Web application receives application/xml data which we got via a spider_man. The XXE is in the XML we send in the post-body. In this case test two different things:
    • Watobo/Arachni seem to modify the original XML and send that instead, which seems to be a good way to find more vulnerabilities. Add the payload which reads files and match file contents in output
    • "Blindly" send the XEE payload as the post-body (overwrite the original XML) which reads files and match file contents in output
  • Web application receives the XML to process in an input (QS, form, inside JSON, etc.) and parses it from there. Send the XEE payload which reads files by creating mutants and match file contents in output

Payloads

  • <?xml version="1.0" encoding="ISO-8859-1"?><!DOCTYPE foo [<!ELEMENT foo ANY><!ENTITY xxe SYSTEM "file://c:/boot.ini">]><foo>&xxe;</foo>
  • <?xml version="1.0" encoding="ISO-8859-1"?><!DOCTYPE foo [<!ELEMENT foo ANY><!ENTITY xxe SYSTEM "file:////etc/passwd">]><foo>&xxe;</foo>
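
A rough sketch of the detection step, assuming a hypothetical http_post() helper; the file-content signatures are the usual /etc/passwd and boot.ini markers:

import re

# Signatures showing that the XXE payload managed to read a local file.
FILE_CONTENT_SIGNATURES = [
    re.compile(r'root:.*:0:0:'),                     # /etc/passwd
    re.compile(r'\[boot loader\]', re.IGNORECASE),   # c:/boot.ini
]

def response_contains_file(response_body):
    return any(sig.search(response_body) for sig in FILE_CONTENT_SIGNATURES)

def blind_xxe_check(url, payloads, http_post):
    # "Blindly" overwrite the original XML post-body with each XXE payload
    # and look for file contents in the response.
    for payload in payloads:
        response = http_post(url, data=payload,
                             headers={'Content-Type': 'application/xml'})
        if response_contains_file(response.body):
            return payload
    return None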

Testing

  • Basic vulnerable script available @ moth (audit/xxe/).
  • Vulnerable scripts at MCIR

Exploitation

Once detected, I could check for RCE with a payload that contains (]>) just like it is explained here: https://gist.github.com/3623896

Going to leave exploitation out of the initial implementation since it's complex to code and users can do it on their own.

Move grep plugins to a different process to improve performance

Introduction

In order to achieve this EPIC task, many things need to be analyzed.

History

We already tried to do this, and failed: https://github.com/andresriancho/w3af/commits/multiprocessing

Measure performance

I need a good way to measure the performance impact of this improvement, so before even starting I'll need to define how performance will be measured.

A good idea would be to:

  • Enable all grep plugins
  • Enable web_spider
  • Scan django-moth 100 times
  • Measure the average, min, max times.

Compare before and after.
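
A minimal sketch of the measurement loop; run_scan() is a hypothetical wrapper that launches a full w3af scan against django-moth and blocks until it finishes:

import time

def benchmark(run_scan, iterations=100):
    durations = []
    for _ in range(iterations):
        start = time.time()
        run_scan()
        durations.append(time.time() - start)

    return {'avg': sum(durations) / len(durations),
            'min': min(durations),
            'max': max(durations)}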

Output manager refactoring

Grep plugins call the output manager to print information about newly identified vulnerabilities. Calling the output manager from another process is an already solved problem; see how this was done in the multiprocessing document parser.

The same ideas could be applied to communication with cf and kb from the main thread.

Grep worker refactoring

We can re-use a lot of the things we learnt from the multiprocessing document parser.

Serialization is completely transparent when using pebble. If all attributes of the request and response are serializable then we wouldn't have any issues. The only worry I have is the rework of, for example, having to parse the same HTTP response in each process, because that attribute had to be removed from the HTTP response instance before sending it over the wire.

The main thread would create N grep consumer processes, each with its own queue. Each process would have a subset of the enabled grep plugins. Each enabled grep plugin would have only one instance, living in one of the grep consumer processes.

When the main thread receives a request / response, it has to send it to all grep consumer processes.

Having multiple processes for the grep consumer means that the multiprocessing document parser cache will have N instances and be only 1/N as effective. A lot of rework would be done to parse the same response multiple times. This is something to solve.
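
A rough sketch of the fan-out described above, using the standard multiprocessing module; the plugin objects are assumed to be picklable and the dispatch details are simplified:

import multiprocessing

def grep_worker(plugins, queue):
    # Each worker owns a fixed subset of the grep plugins and processes every
    # request/response pair the main thread puts on its queue.
    while True:
        item = queue.get()
        if item is None:          # poison pill: shut the worker down
            break
        request, response = item
        for plugin in plugins:
            plugin.grep(request, response)

def start_grep_consumers(plugin_subsets):
    queues = []
    for plugins in plugin_subsets:
        queue = multiprocessing.Queue()
        multiprocessing.Process(target=grep_worker, args=(plugins, queue)).start()
        queues.append(queue)
    return queues

def dispatch(queues, request, response):
    # The main thread sends every request/response to all consumer processes.
    for queue in queues:
        queue.put((request, response))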

KnowledgeBase and Configuration refactoring

Grep plugins query the KnowledgeBase and cf objects; how are we going to "proxy" (?) those calls to the parent process / main thread?

Fingerprint object in knowledge base

Add an object that inherits from info() and represents a system fingerprint. We should have different classes of fingerprint objects: one for the OS, another for the HTTP daemon, another for the programming language, another for the programming framework, etc. These should be stored in the KB by the infrastructure plugins that do this job. In the future this could be used by the core to cross-reference this info with an XML file that says "PHP version X.Y.Z has vulnerability CVE-12345". A sketch of the class hierarchy follows below.
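
An illustrative sketch of the hierarchy; the Info stand-in, attribute names and KB usage are assumptions, not w3af's actual API:

class Info(object):
    """Stand-in for w3af's existing info() class."""


class Fingerprint(Info):
    """Base class for system fingerprints stored in the KB."""
    def __init__(self, value):
        self.value = value  # e.g. 'Linux 3.2', 'Apache 2.4', 'PHP 5.3.2'


class OSFingerprint(Fingerprint):
    pass


class HTTPDaemonFingerprint(Fingerprint):
    pass


class ProgrammingLanguageFingerprint(Fingerprint):
    pass


class FrameworkFingerprint(Fingerprint):
    pass


# Hypothetical usage from an infrastructure plugin:
#     kb.kb.append('fingerprint', 'language', ProgrammingLanguageFingerprint('PHP 5.3.2'))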

Profile testing

Create a profile with 1 enabled plugin from each family, click on "Empty Profile" and then back on the newly created plugin. Verify that the configuration is there.

Attack plugins: Add support for duplicated command output in response body

Introduction

Attack plugins should support the case when the application returns the expected command output more than once. The problem looks like this:

Works

http://localhost/foo?cmd=whoami
...
root

Doesn't work

http://localhost/foo?cmd=whoami
...
root
...
root
...
root

Conditions of satisfaction

  • Write a django-moth test with os_commanding
  • Write a unittest showing that it doesn't work
  • Fix it to work with 1-N copies of the command output in the HTTP response body (see the matching sketch below)
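
A rough sketch of the matching logic; marker_start/marker_end are hypothetical delimiters the attack plugin wraps around the command output, not w3af's actual implementation:

import re

def extract_command_output(response_body, marker_start, marker_end):
    pattern = re.escape(marker_start) + '(.*?)' + re.escape(marker_end)
    matches = re.findall(pattern, response_body, re.DOTALL)

    if not matches:
        return None

    # 1..N copies of the command output are all fine as long as they agree.
    first = matches[0]
    return first if all(m == first for m in matches) else None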

Local Proxy tests

Local Proxy:

  • Send request using urllib2 configured to go through the proxy, do not modify and forward to the remote server
  • Send request using urllib2 configured to go through the proxy, modify and forward to the remote server
  • Send request to an offline URL using urllib2 configured to go through the proxy, forward to the remote server

Unittest: Two consecutive GUI scans

Perform a valid scan and verify vulnerabilities appear in log and KB; clear results; perform a scan with different configuration and verify results

Remove KB and CF singleton objects

Remove the ugly kb and cf "singleton" objects. The main goal is to be able to have two w3afCore objects in the same python process. This is a continuation of the feature/module efforts.

A good idea would be to make the kb and cf objects w3afCore attributes, as sketched below.
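
A sketch of the proposed direction; KnowledgeBase and Config are stand-ins for the real w3af classes:

class KnowledgeBase(object):
    pass


class Config(object):
    pass


class w3afCore(object):
    def __init__(self):
        # Per-instance objects instead of module-level singletons
        self.kb = KnowledgeBase()
        self.cf = Config()


# Two independent cores in the same python process, each with its own KB:
core_a, core_b = w3afCore(), w3afCore()
assert core_a.kb is not core_b.kb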

Related with #25 (Remove output manager singleton).

Glob DoS plugin

Glob DoS plugin: " with .../blah.php?a=/..//..//..//..//..//../* "

unittest wavsep

  • Install WAVSEP in moth
  • Make sure we run unittests against it
  • No matter how many vulnerabilities we're missing, just make sure WAVSEP is integrated into our unittests. We'll improve later.
