domainaccessibilityaudit's Introduction

Domain Accessibility Audit

This web application automatically crawls websites and checks for accessibility violations. It can crawl within subdomains of the initial domain it starts with. It reports statistics of violations for the whole audit, domains and pages.

To start it

  • Install Docker and docker-compose if needed.
  • Create a .env file at the root of this folder (next to the README), with the following parameters:
    ADMIN_USERNAME='username'
    ADMIN_PASSWORD='password'
    
    (this password is needed to create and remove audits)
  • Run docker-compose up -d
  • Direct a browser to http://localhost/.
  • Use the username and the password you entered to create audits.

To stop it

  • A running audit can be stopped with the Stop button in the form to start a new audit.
  • docker-compose stop will stop the containers.
  • docker-compose down will stop and remove the containers. They are recreated automatically with docker-compose up -d.

To check the server logs

  • Get a list of container ids: docker ps.
  • Look at the logs for a container: docker logs <container_id>.
  • Keep looking in real time: docker logs -f <container_id>.
    (another way to do that is to use docker-compose up without the -d option)

To uninstall

Warning: this will remove all the data!

  • docker-compose down -v --rmi all --remove-orphans
  • Remove the files.

Features

  • Accessibility testing based on axe, which is designed to avoid reporting false positives.
  • Choice of accessibility standard to use: WCAG 2.0 Level A or AA, WCAG 2.1 Level AA or Section 508.
  • Choice of web browser for testing: Firefox or Chromium.
  • Option to check subdomains automatically.
  • Options to use site maps and/or crawling to discover pages to test.
  • Option to limit the number of pages checked per domain.
  • Option to include only pages matching a regular expression.
  • Results can be browsed on a dynamic website. Access to create new audits or remove them is protected by password.
  • Results include violation statistics with links to Deque documentation given for the whole audit (including subdomains), for each domain and for each page.
  • Easy way to see which domains or pages are most affected by specific violations.
  • User and group management, with authorizations based on domains.
  • 2 methods of authentication: local and SAML.

Other environment variables

Besides the required ADMIN_PASSWORD variable, other variables can be used in .env:

  • MODE: running environment, development or production (production by default)
  • RESTRICTED_IP: an IP address which will be the only one able to access the app (127.0.0.1 by default for development, 0.0.0.0 by default for production, set to 0.0.0.0 to allow connections from everywhere even in development)
  • DEVELOPMENT_PORT: the port used for development (3142 by default)
  • DEVELOPMENT_API_PORT: the port used for API calls in development (3143 by default)
  • PRODUCTION_PORT: the port used for production, except with SSL (80 by default)
  • SAML_ENTRYPOINT: SAML authentication: identity provider entrypoint
  • SAML_ISSUER: SAML authentication: issuer string to supply to identity provider
  • SAML_CERT_FILENAME: SAML authentication: name of the IdP's public signing certificate used to validate the signatures of the incoming SAML Responses (should be placed in /certs)
  • SAML_PRIVATE_CERT_FILENAME: SAML authentication: name of the certificate used to sign requests sent to the IdP
  • NODE_USER_UID: optional user id to use for the node user (this should be set before the image is built); default is 1000, which could conflict with host users.
  • NODE_USER_GID: optional group id to use for the node group (this should be set before the image is built); default is 1000, which could conflict with host groups.
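
As an illustration, a .env file combining the required credentials with a few of the optional variables might look like the sketch below. All values are placeholders, not recommendations. Port 8080 is an arbitrary choice for a host where port 80 is already taken, and it assumes docker-compose.yml takes the published port from PRODUCTION_PORT.

```shell
# Example .env (placeholder values, adjust before use)
ADMIN_USERNAME=admin
ADMIN_PASSWORD=change-me
# Run in production mode on a port other than the default 80
MODE=production
PRODUCTION_PORT=8080
# Allow connections from any address
RESTRICTED_IP=0.0.0.0
```

These values are read when the containers are created, so after editing the file the containers need to be recreated with docker-compose down followed by docker-compose up -d.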

Permissions

Permissions are always applied to groups. Two groups are automatically created:

  • Superusers: for application administrators, with all permissions enabled. The administrator given in the .env file is automatically added to this group.
  • Guests: for users who are not logged in. By default, they are only able to read created audits, but this permission can be removed.

Another group can be created with SAML authentication:

  • Authenticated: users who passed SAML authentication but do not have a matching user. They can have different permissions from guests. More groups can be created and assigned users.

There are separate permissions to read audits, create audits, remove audits, and edit users and groups. The audit permissions can also be given for specific domains (which include subdomains).

Service Installation (systemd)

A template service example is provided in the root directory of this project (daa.service.example).

  • Copy the example to daa.service and edit [FULL_PATH] to be the path of the project. Check the path to docker-compose (it may be /usr/bin/docker-compose) and make sure the environment variables are set (this can be done in the .env file described above).
cp daa.service.example daa.service
nano daa.service
  • Copy or move daa.service to the systemd service directory (/etc/systemd/system/ or /lib/systemd/system/):
sudo cp daa.service /etc/systemd/system/
  • Enable and start the service:
sudo systemctl enable daa.service
sudo systemctl start daa.service

FAQ

  • Does this tool accurately reflect a website's state of accessibility?
    No. Because it does not report potential false positives, it will miss a number of real web accessibility issues. Even if it did report potential false positives, it might still miss other issues that are hard to identify automatically. It is meant as a tool to help identify and fix the most common issues, and it does not replace a full manual audit.
    That being said, fixing the reported issues is a good and economical first step before doing a more thorough manual audit, and there is often a lot to fix. Also, results over a large number of websites are more likely to be consistent, objective and uniform than with manual audits, so it is a useful tool to compare standards compliance and measure progress.
  • How do I set up SSL?
    This could be done with a proxy, but if you want to set up SSL directly in node, this is possible in production mode on port 443:
    • Add server.key and server.crt inside the certs directory.
    • Restart with docker-compose: docker-compose up -d
  • Why would I ever want to not use site maps when they're available?
    Site maps are great for checking entire sites. A crawling depth of 0 can even be used when they are complete. However, one might want to focus on the most visible pages (based on the number of clicks needed to reach them). Ignoring site maps and crawling the sites with a maximum depth is a better option in this case.
  • How can I check only part of a site?
    With the "Include only paths matching the regular expression" option, for instance ^/section1 would only match paths starting with /section1 (the paths the expression is checked against start with a slash, but do not include the protocol or domain parts of the URL).
  • How can I precisely control which ports are exposed by the application?
    Edit the ports section in docker-compose.yml. Development mode needs 2 ports (one for the static web files and one for the API), but production mode only needs 1.
  • Do I need to add a "delay to let dynamic pages load"?
    The application always waits for the initial page load. However, some web pages (such as the ones generated by this application!) make a server request after the initial load before they display anything. If the accessibility checks start before the content is loaded, they might fail one way or another.
    Issues can also occur when a page has a meta refresh tag (see below).
    In both cases, adding a delay between the initial load and the checks can resolve problems.
    A long delay (such as 1000 ms) is more likely to resolve problems, but it can slow down the audit. A short one (100 ms) might be sufficient for very fast websites, but might not be enough for slow ones. Experimenting might be the best way to choose a value.
  • How can I customize the application's header and footer?
    client/src/Header.js and client/src/Footer.js can be customized. They are using the React JSX syntax. Images can be added to client/public. When git is used, these files can be added to .git/info/exclude to avoid warnings when the application is updated. The container will have to be restarted in production.
  • I set an admin password in the .env file but I can't log in. What is going on?
    The variables passed to the Docker container with the .env file on the host are used only when the container is created, and the container is not updated when the file is modified afterwards. If you have modified the .env file after launching the application for the first time, you can simply delete the containers and recreate them. Since they don't contain any data (which is saved in a Docker volume), you will not lose any saved audit data.
    docker-compose down
    docker-compose up -d
    The administrator user is only created the first time the application is launched with an administrator password. If you have started it once with a password but without ADMIN_USERNAME, it will have been created with the default admin username. You can use that name to log in and modify the user name.
    Also make sure the .env file is created in the same directory as the README.md file, and that ADMIN_USERNAME and ADMIN_PASSWORD are uppercase.
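
To get a feel for the "Include only paths matching the regular expression" option before starting an audit, a pattern can be tried against a few sample paths on the command line. The sketch below uses grep -E as a stand-in for the application's own matcher, which is an assumption: the two should agree for simple patterns like these, but may differ for more advanced syntax. The sample paths are made up.

```shell
# Made-up sample paths; the first pattern is the one from the FAQ answer.
paths='/section1
/section1/page.html
/section10/other
/about
/blog/section1'

# ^/section1 keeps every path that starts with /section1, including /section10.
loose=$(printf '%s\n' "$paths" | grep -E '^/section1')

# Requiring a slash or the end of the path after /section1 excludes /section10.
strict=$(printf '%s\n' "$paths" | grep -E '^/section1(/|$)')

echo "loose:";  echo "$loose"
echo "strict:"; echo "$strict"
```

The first pattern also keeps /section10/other, so the stricter form is worth considering when section names share a prefix.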

Current issues

  • Browsers and drivers might crash sometimes, resulting in scan errors, but the audit will recover and continue.
  • Some redirects like meta refresh can also cause scan errors, as the page changes while axe tries to check the page.

Current non-features that would be nice to have in the future

  • Possibility to access non-public websites.
  • Taking robots.txt into account.
  • Option to only start subdomain audits at the root.
  • Reporting more than accessibility violations.
  • Ability to pause an audit.
  • Option to ignore pages returning a 404 status code.
  • Regular expression to ignore some URLs.

Licence

GPL 3.0.

Development

This project uses the MERN stack. Docker is used for both development and production.

Tests should be run in Docker:

  • docker-compose run --rm accessibility_audit npm run test:server
  • docker-compose run --rm accessibility_audit npm run test:client

ESLint should be integrated in the editor, which might require an npm install on the host machine. It can also be used to check the whole project, using Docker:

  • docker-compose run --rm accessibility_audit npm run lint

Database backup and restore

Backup:

docker exec domainaccessibilityaudit_mongodb_1 sh -c 'exec mongodump -d accessibility_audit --archive --gzip' > db_dump.gz

Restore will remove all existing data in the database:

docker exec -i domainaccessibilityaudit_mongodb_1 sh -c 'exec mongorestore --nsInclude "accessibility_audit.*" --drop --archive --gzip' < db_dump.gz
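
For regular backups, the dump command can be wrapped so that each archive gets a dated file name. This is only a sketch: the function names and the file name pattern are hypothetical, and the default container name is the one used in the backup command above. The container name depends on the project folder name, so it may differ on your host (docker ps shows the actual name).

```shell
# Hypothetical helper: build a dated file name such as db_dump_2024-01-31.gz.
backup_name() {
  printf 'db_dump_%s.gz\n' "$(date +%Y-%m-%d)"
}

# Hypothetical wrapper around the backup command shown above.
# $1: MongoDB container name (defaults to the one used in this README).
backup_db() {
  container=${1:-domainaccessibilityaudit_mongodb_1}
  docker exec "$container" sh -c 'exec mongodump -d accessibility_audit --archive --gzip' > "$(backup_name)"
}

# Usage: backup_db   (or: backup_db <container_name>)
```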

domainaccessibilityaudit's People

Contributors

damien-git, dependabot[bot], karbassi, meganschanz, warpcoil

domainaccessibilityaudit's Issues

Install error

I tried to run the latest from github & got this error:

38.42 gyp verb build type Release
38.42 gyp verb architecture arm64
38.42 gyp verb node dev dir /home/node/.cache/node-gyp/12.22.12
38.42 gyp ERR! build error
38.42 gyp ERR! stack Error: not found: make
38.42 gyp ERR! stack at getNotFoundError (/app/client/node_modules/node-gyp/node_modules/which/which.js:10:17)
38.42 gyp ERR! stack at /app/client/node_modules/node-gyp/node_modules/which/which.js:57:18
38.42 gyp ERR! stack at new Promise (<anonymous>)
38.42 gyp ERR! stack at step (/app/client/node_modules/node-gyp/node_modules/which/which.js:54:21)
38.42 gyp ERR! stack at /app/client/node_modules/node-gyp/node_modules/which/which.js:71:22
38.42 gyp ERR! stack at new Promise (<anonymous>)
38.42 gyp ERR! stack at subStep (/app/client/node_modules/node-gyp/node_modules/which/which.js:69:33)
38.42 gyp ERR! stack at /app/client/node_modules/node-gyp/node_modules/which/which.js:80:22
38.42 gyp ERR! stack at /app/client/node_modules/isexe/index.js:42:5
38.42 gyp ERR! stack at /app/client/node_modules/isexe/mode.js:8:5
38.42 gyp ERR! System Linux 5.15.49-linuxkit-pr
38.42 gyp ERR! command "/usr/bin/node" "/app/client/node_modules/node-gyp/bin/node-gyp.js" "rebuild" "--verbose" "--libsass_ext=" "--libsass_cflags=" "--libsass_ldflags=" "--libsass_library="
38.42 gyp ERR! cwd /app/client/node_modules/node-sass
38.42 gyp ERR! node -v v12.22.12
38.42 gyp ERR! node-gyp -v v7.1.2
38.42 gyp ERR! not ok
38.43 Build failed with error code: 1
38.94 npm WARN optional SKIPPING OPTIONAL DEPENDENCY: [email protected] (node_modules/webpack-dev-server/node_modules/fsevents):
38.94 npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for [email protected]: wanted {"os":"darwin","arch":"any"} (current: {"os":"linux","arch":"arm64"})
38.94 npm WARN optional SKIPPING OPTIONAL DEPENDENCY: [email protected] (node_modules/watchpack-chokidar2/node_modules/fsevents):
38.94 npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for [email protected]: wanted {"os":"darwin","arch":"any"} (current: {"os":"linux","arch":"arm64"})
38.95 npm WARN optional SKIPPING OPTIONAL DEPENDENCY: [email protected] (node_modules/fsevents):
38.95 npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for [email protected]: wanted {"os":"darwin","arch":"any"} (current: {"os":"linux","arch":"arm64"})
38.95
38.96 npm ERR! code ELIFECYCLE
38.96 npm ERR! errno 1
38.96 npm ERR! [email protected] postinstall: node scripts/build.js
38.96 npm ERR! Exit status 1
38.96 npm ERR!
38.96 npm ERR! Failed at the [email protected] postinstall script.
38.96 npm ERR! This is probably not a problem with npm. There is likely additional logging output above.
38.97
38.97 npm ERR! A complete log of this run can be found in:
38.97 npm ERR! /home/node/.npm/_logs/2023-09-14T14_24_10_724Z-debug.log
39.01 npm ERR! code ELIFECYCLE
39.01 npm ERR! errno 1
39.01 npm ERR! [email protected] postinstall: cd backend && npm install && cd ../client && npm install
39.01 npm ERR! Exit status 1
39.01 npm ERR!
39.01 npm ERR! Failed at the [email protected] postinstall script.
39.01 npm ERR! This is probably not a problem with npm. There is likely additional logging output above.
39.01
39.01 npm ERR! A complete log of this run can be found in:
39.01 npm ERR! /home/node/.npm/_logs/2023-09-14T14_24_10_766Z-debug.log

Exclude all paths matching the regular expression

Would be great to have a field to allow this tool to exclude URLs like:

?page=
?external_url=
?sort_bef_combine=
?search_api_fulltext=
?f%5B0%5D=

So that they aren't treated as new pages. These generally aren't new pages, but they are features of many dynamic CMS tools.

Axe error

When I try to run an audit I get this in Docker:

aXe analyze error this.driver.switchTo(...).parentFrame is not a function

and in Audit Status I get scan errors on every URL.

Thoughts?

I tried updating Axe, made no difference. Ultimately, I had to go back to a version from February and that seems to work.

Thanks!

Cannot login

Hello, I created a .env file in the root folder with this information:

ADMIN_USERNAME=giulio
ADMIN_PASSWORD=12345

And restarted the service with:

docker-compose up -d

Everything starts, and it doesn't complain about a missing admin password as it would if I hadn't created a .env file:

WARNING: The SAML_ENTRYPOINT variable is not set. Defaulting to a blank string.
WARNING: The SAML_ISSUER variable is not set. Defaulting to a blank string.
WARNING: The SAML_CERT variable is not set. Defaulting to a blank string.
WARNING: The SAML_PRIVATE_CERT variable is not set. Defaulting to a blank string.
domainaccessibilityaudit-master_mongodb_1 is up-to-date
Recreating domainaccessibilityaudit-master_accessibility_audit_1 ... done

The problem is that whatever admin user and password I use, I cannot log in.

Port 80 already in use

I run
sudo docker-compose up -d

and it throws the error -

domainaccessibilityaudit_mongodb_1 is up-to-date
Starting accessibility_audit ... 
Starting accessibility_audit ... error

ERROR: for accessibility_audit  Cannot start service accessibility_audit: driver failed programming external connectivity on endpoint accessibility_audit (485d0fd6c8a014ea59922afc03f6534e4ebd59795f61c7255a451056eadf606c): Error starting userland proxy: listen tcp4 0.0.0.0:80: bind: address already in use

ERROR: for accessibility_audit  Cannot start service accessibility_audit: driver failed programming external connectivity on endpoint accessibility_audit (485d0fd6c8a014ea59922afc03f6534e4ebd59795f61c7255a451056eadf606c): Error starting userland proxy: listen tcp4 0.0.0.0:80: bind: address already in use
ERROR: Encountered errors while bringing up the project.

I understand that it says port 80 is in use, and it is indeed in use by the apache server. Since I am not at all familiar with docker, could someone tell me how I can start it on another port?

Thank you !

Why not ship with a .env file?

I'm having trouble logging in after Docker is installed. I didn't create the .env first. I've tried removing & re-installing this, but it doesn't seem to be working.

I'll keep trying, but I'd like to know why the .env isn't just shipped with an admin/admin type of username and password. Yes, this would be bad if it were a public site, but most won't be at first. And we can put many warnings into the README so folks are reminded to change the password.

Simple Report Export Function

Would be nice to be able to quickly export the accessibility report in a format that could be easily shared with a client.

Basic HTML (or .mhtml) would be good, but even CSV could be very useful. Pushing it up to a Google Spreadsheet would be a bonus too.

Respect Robots.txt Files

Robots have rights too. Well, ok, not yet, but we can get into trouble by ignoring the robots.txt files that some sites use.

Would be great if by default the scanner respected the wishes of the site owner.

Export all scans

I just spent a bunch of time trying to recover a docker instance. Finally got it back.

Must remember never to change the name of the Docker folder!

Anyways, would be good if there was an easy way to export all of the scans en masse (and possibly even encrypted).

accepting contributors?

I've been working on my own tool, similar to yours, which uses @axe-core/puppeteer, but it is more of a proof of concept and it is becoming too much to manage myself. I was wondering if you are accepting contributors so we could perhaps pool our efforts. What I've been working on has some features that I'd want in any tool, such as (among others) allowing users to:

  • choose Chrome device profiles to use when running scans
  • limit axe rules that are used based on category tags
  • provide a list of URLs to scan, in addition to using sitemaps and crawling
  • save what I call scan configurations

Would you be receptive to working together or should I just fork?

Use Detail/Summary Rather than JS Button

Would be more semantic if the report just used HTML/CSS rather than JS to expand/collapse the violations:

Screenshot of expand/collapse function

<button title="See affected pages" type="button" class="btn btn-info btn-xs"><svg aria-hidden="true" focusable="false" data-prefix="fas" data-icon="plus-square" class="svg-inline--fa fa-plus-square fa-w-14 " role="img" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 448 512"><path fill="currentColor" d="M400 32H48C21.5 32 0 53.5 0 80v352c0 26.5 21.5 48 48 48h352c26.5 0 48-21.5 48-48V80c0-26.5-21.5-48-48-48zm-32 252c0 6.6-5.4 12-12 12h-92v92c0 6.6-5.4 12-12 12h-56c-6.6 0-12-5.4-12-12v-92H92c-6.6 0-12-5.4-12-12v-56c0-6.6 5.4-12 12-12h92v-92c0-6.6 5.4-12 12-12h56c6.6 0 12 5.4 12 12v92h92c6.6 0 12 5.4 12 12v56z"></path></svg></button>

If you save the HTML as a .mhtml format the expand/collapse function is lost.

This is defined here:
client/src/audits/ViolationStats.test.js

Mike

Recurring checks or re-test function

Would be great to be able to test pages over time. If MD5 hashes were captured and stored for each page in a review, then it would be possible to skip a re-test if the content hadn't changed.

Mostly it is about trying to narrow in on what has changed. Are there fewer errors? Have the pages that changed gotten better or worse?
