andresriancho / w3af

w3af: web application attack and audit framework, the open source web vulnerability scanner.

Home Page: http://w3af.org/

Shell 0.13% Python 73.74% PHP 0.01% Assembly 0.03% HTML 24.35% JavaScript 0.45% C 0.10% Perl 0.02% C++ 0.02% PLpgSQL 0.01% ASP 0.02% Java 0.01% Roff 0.99% Rebol 0.05% Smarty 0.04% Dockerfile 0.03% Hack 0.01% TSQL 0.01%
scanner security appsec cross-site-scripting sql-injection

w3af's Introduction

w3af - Web Application Attack and Audit Framework

w3af is an open source web application security scanner which helps developers and penetration testers identify and exploit vulnerabilities in their web applications.

The scanner is able to identify 200+ vulnerabilities, including Cross-Site Scripting, SQL injection and OS commanding.

Contributing

Pull requests are always welcome! If you're not sure where to start, please take a look at the First steps as a contributor document in our wiki. All contributions, no matter how small, are welcome.

Links and documentation

Sponsors

Holm Security sponsors the project and uses w3af as part of their amazing automated and continuous vulnerability assessment platform.

Found this project useful? Donations are accepted via ethereum at 0xb1B56F04E6cc5F4ACcB19678959800824DA8DE82

w3af's People

Contributors

andresriancho, aronmolnar, artem-smotrakov, attwad, bdamele, codychamberlain, foobarmonk, glira, gxsghsn, inkz, jarrodcoulter, jekil, leks-ha, linerd0196, maniqui, mdeous, meatballs1, mleblebici, mpoindexter, nixwizard, owentuz, paralax, pvdl, q-back, rhabacker, stamparm, swans0n, tvelazquez, vinnytroia, vulnerscom

w3af's Issues

Improve error handling in extended_urllib.py

Review this comment and the associated code: "This except clause will catch unexpected errors For the first N errors, return an empty response... Then a w3afMustStopException will be raised"
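
A minimal sketch of that intended behaviour, assuming a hypothetical wrapper; MAX_ERROR_COUNT and the stand-in exception class are illustrative, not w3af's actual extended_urllib API:

class w3afMustStopException(Exception):
    """Stand-in for w3af's real exception class."""


class ErrorTolerantSender(object):
    # Hypothetical threshold: the "N" from the comment quoted above.
    MAX_ERROR_COUNT = 10

    def __init__(self, send_func):
        self._send = send_func
        self._error_count = 0

    def send(self, request):
        try:
            return self._send(request)
        except Exception as e:
            # For the first N unexpected errors return an empty response so
            # the scan can keep going; after that, stop the whole scan.
            self._error_count += 1
            if self._error_count <= self.MAX_ERROR_COUNT:
                return ''  # placeholder for an "empty" HTTP response object
            raise w3afMustStopException('Too many HTTP errors: %s' % e)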

Show progress and status

Show progress and status in some meaningful way in both consoleUI and gtkUI. Also, fix this error: "Current value can never be greater than max value!"

Migrate unittests that use pysvn

Some unittests in w3af use SVN repo meta-data to determine if the file needs to be updated. Migrate this to use git meta-data

Migrate ticket creation

Migrate automated ticket creation to use the w3af issues for tickets.
Create a new label to report vulnerabilities there.

Valid scan, exploit, run remote command

Perform a valid scan and verify vulnerabilities appear in log and KB; exploit the vulnerability and execute a command in the shell; execute a payload in the shell

Payload testing in GUI

  • Unittest GTK_UI lsp command output (./w3af_console -n -s scripts/script-local_file_include-payload-debug.w3af)
  • Unittest GTK_UI running payload without parameters
  • Unittest GTK_UI running payload with parameters

Export request tool tests

Export request tool:

  • Export request to python
  • Export request to html
  • Export request to ajax
  • Export request to ruby

Improve w3af's score for WAVSEP XSS by at least 20%

User story

As a user I want w3af to find as many XSS vulnerabilities as possible.

Conditions of satisfaction

  • Unittests for all XSS sections of WAVSEP need to be written
  • Coverage % needs to be calculated and asserted
  • Once that's done, improve detection rate by 20%
  • If we're not at 100%, create a new task (similar to this one) to improve another 10%

Context code rewrite v3.0

While I was re-writing the code as specified in the Context code rewrite v2.0 section, one main issue appeared: finding the context based on inside_context (see below and the code in the xss branch) was too error/false-positive prone. With inside_context the context accuracy is rather low.

The main reason for using the previous strategy was that we're sending payloads that might break the HTML structure, so classic parsers would be unable to determine the payload's location. A solution for this issue would be to (see the sketch after this list):

  • Send the payload
  • Replace the payload with something "clean" such as "123456789abcdef"
  • Find the context for the clean payload, which doesn't break any HTML structures
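
A rough sketch of that strategy, assuming a hypothetical get_context() callable that stands in for whichever HTML parser we end up choosing:

CLEAN_TOKEN = '123456789abcdef'

def find_payload_context(response_body, payload, get_context):
    # Replace the (potentially HTML-breaking) payload with a clean token so
    # that a regular HTML parser can determine the context of the token.
    clean_body = response_body.replace(payload, CLEAN_TOKEN)

    # get_context(html, token) -> Context instance; hypothetical helper.
    return get_context(clean_body, CLEAN_TOKEN)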

Using that strategy we would be able to use almost any HTML parser to determine the context. The problems that I foresee now are:

  • Parsers might not be able to identify all the contexts we need
  • Parsers might not distinguish between single-, double- and backtick-delimited tag attributes
  • lxml had problems with memory leaks - if we choose an HTML parser which gives us everything we need but is not well tested, we might end up with a problem like that once again

The same problem with context appears when parsing JavaScript and CSS, so we might need to find a parser for that too.

An option that we might experiment with is to build our own parser-lexer

This method looks promising. If we use it, there is a simple trick that might come in handy to determine whether the payload is in single, double or backtick quotes: dump the tag text, get the attribute value (containing the payload) and search the tag text for "fooPAYLOADbar" in double quotes; if that's not there, search for the same string in single quotes, and so on. The only detail would be to check for escaped " and ' characters. A sketch of this trick follows below.
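
A minimal sketch of the quote-detection trick (escape handling left out on purpose):

def get_attr_quote_char(tag_text, attr_value):
    # Search the raw tag text for the attribute value wrapped in each of the
    # possible quote characters; the first match tells us the delimiter.
    for quote in ('"', "'", '`'):
        if quote + attr_value + quote in tag_text:
            return quote
    return None  # unquoted attribute, or escaping got in the way


# Example: the payload landed inside a double-quoted href attribute
# >>> get_attr_quote_char('<a href="fooPAYLOADbar">click</a>', 'fooPAYLOADbar')
# '"'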

Context code rewrite v2.0

The code that processes the HTML and makes it possible to identify the HTML context where the payload landed is complex, hard to debug and extend. So I propose a rewrite with the following objectives:

  • Completely remove the ByteChunk
  • Make each context self-contained and testable
  • Write contexts in such a way that they can be nested
  • Write contexts to decode JS

New context classes

class Context(object):
    NAME = 'HTML'

    @staticmethod
    def match(normalized_html):
        return True

    def can_break(self, payload):
        raise NotImplementedError

    def executable(self):
        return False

    @staticmethod
    def inside_context(normalized_html, context_start, context_end):
        """
        :return: True if we perform a reverse find of context_start (ie '<script'), then
                 a reverse find of context_end (ie. '</script>') and the second has a
                 lower index; meaning we're still in the context of the '<script>' tag

        :param context_start: Would be '<script' in the example above
        :param context_end: Would be '</script>' in the example above
        """
        return normalized_html.rfind(context_start) > normalized_html.rfind(context_end)

    @staticmethod
    def current_context_content(normalized_html, context_start, context_end):
        """
        Extract the current context text, handles the following cases:
            <script type="application/json">foo();</script>
            <script>foo();</script>

        Returning 'foo();' in both.

        :param context_start: Would be '<script' in the example above
        :param context_end: Would be '</script>' in the example above
        """
        pass

This context matches if we're inside a <script> tag:

class ScriptTagContext(Context):
    NAME = 'SCRIPT_TAG'

    @staticmethod
    def match(normalized_html):
        return Context.inside_context(normalized_html, '<script', '</script>')

This context matches when we're inside a script tag and a multi-line comment:

class ScriptTagMultiLineCommentContext(ScriptTagContext):
    NAME = 'SCRIPT_MULTI_COMMENT'

    @staticmethod
    def match(normalized_html):
        if not ScriptTagContext.match(normalized_html):
            return False
        script_code = Context.current_context_content(normalized_html, '<script', '</script>')
        js_context = get_js_context(script_code)
        if js_context is None:
            return False
        return isinstance(js_context, JSMultiLineComment)

    def can_break(self, payload):
        return '*/' in payload

Since contexts don't hold any data I believe it would be smart to make the match method a @staticmethod and only create an instance when we find a match and need to return it to the xss plugin.

With this new approach the order in which we match the contexts is very important. It MUST start with the most specific contexts and then move down to the generic cases (ScriptTagMultiLineCommentContext goes before ScriptTagContext), as in the dispatcher sketch below.
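
A sketch of the dispatcher the xss plugin could use, reusing the context classes sketched above; the exact list of contexts is illustrative:

# Most specific contexts first, generic ones last.
ALL_CONTEXTS = [ScriptTagMultiLineCommentContext, ScriptTagContext, Context]

def get_context_for(normalized_html):
    for context_klass in ALL_CONTEXTS:
        if context_klass.match(normalized_html):
            return context_klass()  # instantiate only when there is a match
    return None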

It might be a good idea to write inside_context(normalized_html, '<script', '</script>') with an LRU cache, since we might call it with the same parameters several times (once for each context and the sub-contexts inside it); a sketch is shown below.
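
A sketch of the cached helper; note that functools.lru_cache is Python 3 standard library, so on Python 2 (which w3af targets) a backport such as functools32 or a small memo dict would be needed:

from functools import lru_cache

@lru_cache(maxsize=256)
def inside_context(normalized_html, context_start, context_end):
    # True when the last occurrence of context_start comes after the last
    # occurrence of context_end, i.e. the payload is still inside that tag.
    return normalized_html.rfind(context_start) > normalized_html.rfind(context_end)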

Sub-contexts for JavaScript

JavaScript can be found in several places:
  • <script>...</script>
  • <a href="javascript:...">
  • <a onmouseover="...">

When found we need to analyze the JS code to understand if in this context we can run arbitrary code (or not). In order to do this we'll need JavaScript sub-contexts.

What I have in mind is to:

  • First write a context analyzer for JS which, given a piece of JS code and a payload, will tell me if I can_break or if it's executable (just like the other contexts)
  • Then call this context handler each time JS is found in the HTML-level contexts

Sub-contexts for Style Sheets

Same as above but for CSS

Encode/Decode tool tests

Encode/Decode tool:

  • Decode using URL encoding
  • Encode using URL encoding
  • Decode invalid data using base64

OS Detection refactoring

OS detection needs to be in the core (near 404 fingerprinting). Also add ICMP fingerprinting and a URL filename case sensitivity check (index.html vs. indEX.html); a sketch of the latter is shown below.
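
A minimal sketch of the case sensitivity check, assuming a hypothetical http_get() helper that returns an object with status_code and body attributes (not w3af's actual HTTP client API):

def filesystem_is_case_insensitive(url_path, http_get):
    # Request the resource with the original filename and with its case
    # swapped (index.html vs. INDEX.HTML). If both succeed with the same
    # body, the remote filesystem is most likely case-insensitive, which
    # hints at a Windows-based OS.
    original = http_get(url_path)
    if original.status_code != 200:
        return None  # can't tell from this URL

    directory, _, filename = url_path.rpartition('/')
    swapped = http_get('%s/%s' % (directory, filename.swapcase()))

    return swapped.status_code == 200 and swapped.body == original.body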

Manual request editor tests

Manual request editor:

  • Send GET request to a local web server, assert at the web server level that the information is correct
  • Send POST and make the same assertions as in the previous case
  • Send a request to an offline web server
  • Modify the request to be invalid, assert that the send button is disabled and that the background is red. Modify the request again to be valid and verify that it was properly sent.

Migrate auto-update to git

The current auto-update code looks for updates from Sourceforge's repository using pysvn. Modify the code to make it look for updates in this repo.

8.3 filename detection plugin

Verify if I can write a plugin or core component that exploits the 8.3 filename format as explained by Bogdan in a blog post. Tomas sent iis_short_name_brute.py a while ago which could be useful; but I was thinking about something that wouldn't depend on a separate wordlist. My idea would work more like:

  • Intercept all HTTP requests and responses
  • Verify if the remote server supports 8.3
  • If the response was a 404 and the remote server supports 8.3, try the short name instead.
    The good thing about this is that if the user enabled the 8.3 plugin and nikto, and nikto requests /backup2012.tgz and it doesn't exist, the 8.3 plugin would request /backup~1.tgz and that might exist. The bad thing is that it is a mixture between a grep plugin (needs to read all HTTP traffic) and a crawl plugin (needs to perform requests and return new URLs to the core), which might be difficult to implement while respecting the framework's rules. A sketch of the short-name rewrite follows below.
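
A rough sketch of deriving an 8.3 short-name candidate from a long filename; the real Windows algorithm also involves hashing and collision handling (~2, ~3, ...), so this only produces a first guess to retry after a 404:

def short_name_candidate(filename):
    name, _, ext = filename.rpartition('.')
    if not name:
        name, ext = ext, ''

    # 8.3 names are upper-case and only keep a restricted character set;
    # stripping down to alphanumerics is a simplification.
    name = ''.join(c for c in name if c.isalnum()).upper()
    ext = ''.join(c for c in ext if c.isalnum()).upper()[:3]

    if len(name) > 8:
        name = name[:6] + '~1'

    return '%s.%s' % (name, ext) if ext else name


# >>> short_name_candidate('backup2012.tgz')
# 'BACKUP~1.TGZ'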

Remove output manager singleton

The simplest solution would be to set the output manager instance as an attribute of the w3afCore, but maybe there is a different design pattern I could use? Research design patterns for logging.

This task gets even more complicated because of "Move grep plugins to a multiprocessing process pool" #28 , which requires me to be able to send logging messages from different processes.

More details and related information at "Remove KB and CF singleton objects" #26

Fast Exploit refactoring using vulnerability templates

The idea is to rewrite the attack plugin "fast exploit" feature, which never worked as expected, using a simple idea: "Users add vulnerabilities to the KB manually, attack plugins exploit them just as they would exploit a vulnerability added by a plugin". To do this, I have to define vulnerability templates that allow users to add the vulns with the required data.

This will require code at the core level, testing of all templates with the corresponding attack plugin, and modifications in the console and GUI in order to give the user the ability to add vulnerabilities to the KB.

Fuzzy request editor tests

Fuzzy request editor:

  • Send N requests to a local web server, assert at the web server level that the information is correct
  • Try to send an invalid request, assert the exception is correctly handled
  • Send a request to an offline web server

New plugin to detect XXE

User story

As a user I want to be able to detect XXE vulnerabilities

Scenarios to cover

  • Web application receives application/xml data which we got via a spider_man. The XXE is in the XML we send in the post-body. In this case test two different things:
    • Watobo/Arachni seem to modify the original XML and send that instead, which seems to be a good way to find more vulnerabilities. Add the payload which reads files and match file contents in output
    • "Blindly" send the XEE payload as the post-body (overwrite the original XML) which reads files and match file contents in output
  • Web application receives the XML to process in an input (QS, form, inside JSON, etc.) and parses it from there. Send the XEE payload which reads files by creating mutants and match file contents in output

Payloads

  • <?xml version="1.0" encoding="ISO-8859-1"?><!DOCTYPE foo [<!ELEMENT foo ANY><!ENTITY xxe SYSTEM "file://c:/boot.ini">]><foo>&xxe;</foo>
  • <?xml version="1.0" encoding="ISO-8859-1"?><!DOCTYPE foo [<!ELEMENT foo ANY><!ENTITY xxe SYSTEM "file:////etc/passwd">]><foo>&xxe;</foo>
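
A rough sketch of the detection step, assuming a hypothetical http_post() helper; the file-content signatures are the usual /etc/passwd and boot.ini markers:

import re

# Signatures showing that the XXE payload managed to read a local file.
FILE_CONTENT_SIGNATURES = [
    re.compile(r'root:.*:0:0:'),                     # /etc/passwd
    re.compile(r'\[boot loader\]', re.IGNORECASE),   # c:/boot.ini
]

def response_contains_file(response_body):
    return any(sig.search(response_body) for sig in FILE_CONTENT_SIGNATURES)

def blind_xxe_check(url, payloads, http_post):
    # "Blindly" overwrite the original XML post-body with each XXE payload
    # and look for file contents in the response.
    for payload in payloads:
        response = http_post(url, data=payload,
                             headers={'Content-Type': 'application/xml'})
        if response_contains_file(response.body):
            return payload
    return None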

Testing

  • Basic vulnerable script available @ moth (audit/xxe/).
  • Vulnerable scripts at MCIR

Exploitation

Once detected, I could check for RCE with a payload that contains (]>) just like it is explained here: https://gist.github.com/3623896

Going to leave exploitation out of the initial implementation since it's complex to code and users can do it on their own.

Move grep plugins to a different process to improve performance

Introduction

In order to achieve this EPIC task, many things need to be analyzed.

History

We already tried to do this, and failed: https://github.com/andresriancho/w3af/commits/multiprocessing

Measure performance

I need a good way to measure the performance impact of this improvement, so before even starting I'll need to define how performance will be measured.

A good idea would be to:

  • Enable all grep plugins
  • Enable web_spider
  • Scan django-moth 100 times
  • Measure the average, min, max times.

Compare before and after.
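
A minimal sketch of the measurement loop; run_scan() is a hypothetical wrapper that launches a full w3af scan against django-moth and blocks until it finishes:

import time

def benchmark(run_scan, iterations=100):
    durations = []
    for _ in range(iterations):
        start = time.time()
        run_scan()
        durations.append(time.time() - start)

    return {'avg': sum(durations) / len(durations),
            'min': min(durations),
            'max': max(durations)}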

Output manager refactoring

Grep plugins call the output manager to print information about newly identified vulnerabilities. Calling the output manager from another process is an already solved problem; see how this was done in the multiprocessing document parser.

The same ideas could be applied to communication with cf and kb from the main thread.

Grep worker refactoring

We can re-use a lot of the things we learnt from the multiprocessing document parser.

Serialization is completely transparent when using pebble. If all attributes of the request and response are serializable then we wouldn't have any issues. The only worry I have is the rework of, for example, having to parse the same HTTP response in each process, because that attribute had to be removed from the HTTP response instance before sending it over the wire.

The main thread would create N grep consumer processes, each with its own queue. Each process would have a subset of the enabled grep plugins. Each enabled grep plugin would have only one instance, living in one of the grep consumer processes.

When the main thread receives a request / response, it has to send it to all grep consumer processes.

Having multiple processes for the grep consumer means that the multiprocessing document parser cache will have N instances and be only 1/N as effective. A lot of rework would be done to parse the same response multiple times. This is something to solve.
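
A rough sketch of the fan-out described above, using the standard multiprocessing module; the plugin objects are assumed to be picklable and the dispatch details are simplified:

import multiprocessing

def grep_worker(plugins, queue):
    # Each worker owns a fixed subset of the grep plugins and processes every
    # request/response pair the main thread puts on its queue.
    while True:
        item = queue.get()
        if item is None:          # poison pill: shut the worker down
            break
        request, response = item
        for plugin in plugins:
            plugin.grep(request, response)

def start_grep_consumers(plugin_subsets):
    queues = []
    for plugins in plugin_subsets:
        queue = multiprocessing.Queue()
        multiprocessing.Process(target=grep_worker, args=(plugins, queue)).start()
        queues.append(queue)
    return queues

def dispatch(queues, request, response):
    # The main thread sends every request/response to all consumer processes.
    for queue in queues:
        queue.put((request, response))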

KnowledgeBase and Configuration refactoring

Grep plugins query the KnowledgeBase and cf objects; how are we going to "proxy" (?) those calls to the parent process / main thread?

Fingerprint object in knowledge base

Add an object that inherits from info() and represents a system fingerprint. We should have different classes of fingerprint objects: one for the OS, another for the HTTP daemon, another for the programming language, another for the programming framework, etc. These should be stored in the KB by the infrastructure plugins that do this job. In the future this could be used by the core to cross-reference this info with an XML file that says "PHP version X.Y.Z has vulnerability CVE-12345". A sketch of the class hierarchy follows below.
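
An illustrative sketch of the hierarchy; the Info stand-in, attribute names and KB usage are assumptions, not w3af's actual API:

class Info(object):
    """Stand-in for w3af's existing info() class."""


class Fingerprint(Info):
    """Base class for system fingerprints stored in the KB."""
    def __init__(self, value):
        self.value = value  # e.g. 'Linux 3.2', 'Apache 2.4', 'PHP 5.3.2'


class OSFingerprint(Fingerprint):
    pass


class HTTPDaemonFingerprint(Fingerprint):
    pass


class ProgrammingLanguageFingerprint(Fingerprint):
    pass


class FrameworkFingerprint(Fingerprint):
    pass


# Hypothetical usage from an infrastructure plugin:
#     kb.kb.append('fingerprint', 'language', ProgrammingLanguageFingerprint('PHP 5.3.2'))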

Profile testing

Create a profile with 1 enabled plugin from each family, click on "Empty Profile" and then back on the newly created plugin. Verify that the configuration is there.

Attack plugins: Add support for duplicated command output in response body

Introduction

Attack plugins should support the case when the application returns the expected command output more than once. The problem looks like this:

Works

http://localhost/foo?cmd=whoami
...
root

Doesn't work

http://localhost/foo?cmd=whoami
...
root
...
root
...
root

Conditions of satisfaction

  • Write a django-moth test with os_commanding
  • Write a unittest showing that it doesn't work
  • Fix it to work with 1-N copies of the command output in the HTTP response body (see the matching sketch below)
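
A rough sketch of the matching logic; marker_start/marker_end are hypothetical delimiters the attack plugin wraps around the command output, not w3af's actual implementation:

import re

def extract_command_output(response_body, marker_start, marker_end):
    pattern = re.escape(marker_start) + '(.*?)' + re.escape(marker_end)
    matches = re.findall(pattern, response_body, re.DOTALL)

    if not matches:
        return None

    # 1..N copies of the command output are all fine as long as they agree.
    first = matches[0]
    return first if all(m == first for m in matches) else None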

Local Proxy tests

Local Proxy:

  • Send request using urllib2 configured to go through the proxy, do not modify and forward to the remote server
  • Send request using urllib2 configured to go through the proxy, modify and forward to the remote server
  • Send request to an offline URL using urllib2 configured to go through the proxy, forward to the remote server

Unittest: Two consecutive GUI scans

Perform a valid scan and verify vulnerabilities appear in log and KB; clear results; perform a scan with different configuration and verify results

Remove KB and CF singleton objects

Remove the ugly kb and cf "singleton" objects. The main goal is to be able to have two w3afCore objects in the same python process. This is a continuation of the feature/module efforts.

A good idea would be to make the kb and cf objects w3afCore attributes, as sketched below.
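
A sketch of the proposed direction; KnowledgeBase and Config are stand-ins for the real w3af classes:

class KnowledgeBase(object):
    pass


class Config(object):
    pass


class w3afCore(object):
    def __init__(self):
        # Per-instance objects instead of module-level singletons
        self.kb = KnowledgeBase()
        self.cf = Config()


# Two independent cores in the same python process, each with its own KB:
core_a, core_b = w3afCore(), w3afCore()
assert core_a.kb is not core_b.kb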

Related with #25 (Remove output manager singleton).

Glob DoS plugin

Glob DoS plugin: " with .../blah.php?a=/..//..//..//..//..//../* "

unittest wavsep

  • Install WAVSEP in moth
  • Make sure we run unittests against it
  • No matter how many vulnerabilities we're missing, just make sure WAVSEP is integrated into our unittests. We'll improve later.
