
ssh_scan_api's Introduction

WARNING Deprecated - please use ssh_scan from command-line

ssh_scan_api


A web API to scale ssh_scan operations.

Setup

To install and run from source, type:

# clone repo
git clone https://github.com/mozilla/ssh_scan_api.git
cd ssh_scan_api

# install rvm,
# you might have to provide root to install missing packages
gpg2 --keyserver hkp://keys.gnupg.net --recv-keys 409B6B1796C275462A1703113804BB82D39DC0E3
curl -sSL https://get.rvm.io | bash -s stable

# install Ruby 2.3.1 with rvm,
# again, you might have to install missing devel packages
rvm install 2.3.1
rvm use 2.3.1

# resolve dependencies
gem install bundler
bundle install

./bin/ssh_scan_api

ssh_scan as a command-line tool?

This project is focused on providing ssh_scan as a service/API.

If you would like to run ssh_scan from the command-line, check out the ssh_scan project.

Rubies Supported

This project is integrated with travis-ci and is regularly tested to work with multiple rubies.

To see the current build status for these rubies, check the project's Travis CI page.

Contributing

If you are interested in contributing to this project, please see CONTRIBUTING.md.

Credits

Sources of Inspiration for ssh_scan

  • Mozilla OpenSSH Security Guide - For providing a sane baseline policy recommendation for SSH configuration parameters (e.g., ciphers, MACs, and KexAlgos).

ssh_scan_api's People

Contributors

april, hvardhanx, mozilla-github-standards


ssh_scan_api's Issues

Port bug

/root/code/ssh_scan/lib/ssh_scan/result.rb:38:in `port=': Invalid attempt to set port to a non-port value (ArgumentError)
        from /root/code/ssh_scan/lib/ssh_scan/scan_engine.rb:26:in `scan_target'
        from /root/code/ssh_scan/lib/ssh_scan/scan_engine.rb:171:in `block (2 levels) in scan'
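The error above comes from a port setter rejecting a non-port value. A hypothetical sketch of that kind of validation is below; the method name is an assumption, not the actual ssh_scan code.

```ruby
# Accept only values that parse as an integer in the valid TCP port range;
# mirrors the ArgumentError message seen in the trace above.
def validate_port(port)
  value = (Integer(port) rescue nil)
  unless value && value.between?(1, 65_535)
    raise ArgumentError, "Invalid attempt to set port to a non-port value"
  end
  value
end
```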

Should we allow scanning localhost/127/RFC1918?

I was thinking this would be a no-brainer, but then again, I kind of want to scan localhost. I suppose maybe this could be just adding features that describe what can/cannot be scanned, in case someone runs this on their edge and wants to prevent internal scanning from external sources.

I suppose one simple solution would be to allow the ability to restrict RFC1918 ranges in the API config and reject any submission requests for that. This could be just a set of CIDRs or individual addrs that we check before we scan.

It's currently not an issue as we host the service in a VPS, but would be more relevant if we self-hosted.
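The CIDR-based restriction described above could be sketched like this, assuming targets arrive as IP literals (hostnames would need resolving first); the constant and method names are assumptions.

```ruby
require "ipaddr"

# Reject submissions for loopback and RFC 1918 ranges before scanning.
BLOCKED_RANGES = %w[127.0.0.0/8 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16]
                 .map { |cidr| IPAddr.new(cidr) }

def scannable?(target)
  ip = IPAddr.new(target)
  BLOCKED_RANGES.none? { |range| range.include?(ip) }
rescue IPAddr::InvalidAddressError
  false # not an IP literal; treat as unscannable in this sketch
end
```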

Caching does not appear to be working

We implemented caching at one point, but I've received reports from people stating that they are getting unique UUIDs for repeated scans of the same property.

It's possible we're doing it wrong, or the config isn't set properly, but I'm creating this bug to verify so it's not forgotten.
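For verification purposes, the expected behavior could be sketched as a TTL cache keyed on target and port, so repeat scans within the window reuse the same UUID; the class and method names here are assumptions, not the actual implementation.

```ruby
require "securerandom"

# Cache scan UUIDs per target:port for a fixed TTL; a repeat scan of the
# same property within the window should get the same UUID back.
class ScanCache
  def initialize(ttl_seconds)
    @ttl = ttl_seconds
    @entries = {}
  end

  def uuid_for(target, port, now = Time.now)
    key = "#{target}:#{port}"
    uuid, created = @entries[key]
    return uuid if uuid && now - created < @ttl
    @entries[key] = [SecureRandom.uuid, now]
    @entries[key].first
  end
end
```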

Go Serverless? (FaaS)

This is more of an experimental thing about whether we could make ssh_scan_api a server-less function.

What would it take?

  • Re-writing the API as a Lambda service
  • Hosting the Data storage container somewhere in AWS (probably)
  • Maybe we could even make workers that are server-less functions

Drop compression from the modern policy

This really doesn't have any security implications, and we might have configuration limitations with openssh (the most popular ssh lib).

By removing it, we will effectively not care what they have for compression settings.

Consume SSH endpoints from R7's sonar dataset

Doing broad internet scanning to find SSH ports is time consuming and simply outside the scope of this project. However, using the tool against endpoints that are already known (like the sonar dataset) might be helpful from a research and progress-tracking perspective, and would also help inform decision making around policy creation. It's also helpful for finding and fixing bugs in the engine by giving it more examples to work against.

To do this, it likely depends on running a research instance of the infra or just having a low priority queue where we can dump large quantities of hosts to scan (#77)

Add Service Monitoring

Usually, April is the first person to hear about Mozilla SSH Observatory issues because she's working on Observatory stuff a lot more than I am. However, these issues generally boil down to one of two areas, so I should just add monitoring to let me know, so I'm the first person to know.

1.) Alert me when the site is not responding (this is usually nginx restarting and failing, or a failed Let's Encrypt renewal)
2.) Alert me when the queues are non-zero and not changing (this is usually an indication that something is broken or site abuse)
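The two checks could be sketched as below; the URL, polling cadence, and method names are all assumptions.

```ruby
require "net/http"

# Check 1: is the site responding with a 2xx?
def site_up?(url)
  Net::HTTP.get_response(URI(url)).is_a?(Net::HTTPSuccess)
rescue StandardError
  false
end

# Check 2: queues are non-zero and unchanged since the last poll,
# suggesting a stuck worker or abuse.
def queue_stalled?(previous_size, current_size)
  current_size > 0 && current_size == previous_size
end
```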

getaddrinfo: Name or service not known (SocketError)

I, [2017-06-05T20:05:25.287073 #13870] INFO -- : Started job: 2b9baa46-335c-48da-880c-b83e1bc765ab
/usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/socket.rb:231:in `getaddrinfo': getaddrinfo: Name or service not known (SocketError)
        from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/socket.rb:231:in `foreach'
        from /usr/local/rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/socket.rb:626:in `tcp'
        from /root/code/ssh_scan/lib/ssh_scan/client.rb:21:in `connect'
        from /root/code/ssh_scan/lib/ssh_scan/scan_engine.rb:53:in `scan_target'
        from /root/code/ssh_scan/lib/ssh_scan/scan_engine.rb:145:in `block (2 levels) in scan'

How do we want to deal with scans that failed auth_method detection?

Options...

1.) Investigate further why the auth_method detection isn't working or why the client is erroring
2.) Return a partial result, but fail on compliance for the auth_method part
3.) In cases where we can't determine compliance, maybe we give them a pass on auth_method detection

Cases I can think of that could cause this would be services that expect client-side certs or have some sort of MFA requirement; pokeinthe.io is a good repro target to work with.

Add rate limiting for unauth'd users

This hasn't been a problem yet, but it's probably worth thinking about and adding some rate limiting or throttling to prevent a single-IP DoS scenario.

We could also have a max queue size at any one time; when that limit is hit, we stop queuing scans until the queues subside. Though this could also make it easier to DoS, so we'll need to make sure that the single-user limit is less than the global limit.
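A minimal sketch of the per-IP limiter, using a fixed time window; the class name and limits are assumptions.

```ruby
# Allow at most `max_per_window` requests per IP within a sliding window
# of `window_seconds`; anything beyond that is rejected.
class RateLimiter
  def initialize(max_per_window:, window_seconds:)
    @max = max_per_window
    @window = window_seconds
    @hits = Hash.new { |h, k| h[k] = [] }
  end

  def allow?(ip, now = Time.now)
    @hits[ip].reject! { |t| now - t > @window } # drop expired hits
    return false if @hits[ip].size >= @max
    @hits[ip] << now
    true
  end
end
```

A global cap could wrap the same interface, with the per-IP max set below the global max as noted above.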

Make stats interface based on actual DB content

The stats interface is sort of a crude shim right now; it would be nice if it looked at the DB and answered...

  • How many scans in what state transition? (queued, running, errored, completed)
  • How many scans attempted total?
    ....
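Deriving those answers from the scan records themselves might look like this; the record shape and state names are assumptions.

```ruby
# Aggregate scan records into the stats view: counts per state
# transition plus a total of all scans attempted.
def scan_stats(scans)
  states = scans.group_by { |s| s[:state] }.transform_values(&:size)
  { states: states, total: scans.size }
end
```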

Create an InvalidTargetChecker

Basically, we're always going to have targets or target matches that we don't want scanned...

#91

Let's make a full-featured invalid target checker that can take a list of invalid targets/regexes to check against, to act as a gatekeeper.

This should also help in abuse cases where people can send their own blacklist items as data rather than code. Should also make it easier to spec.
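The blocklist-as-data idea could be sketched like this, with CIDRs and regex patterns supplied as plain strings; the class and method names are assumptions.

```ruby
require "ipaddr"

# Gatekeeper that rejects targets matching any configured CIDR or
# regex pattern; the blocklist is data, not code, so it's easy to spec.
class InvalidTargetChecker
  def initialize(cidrs: [], patterns: [])
    @cidrs = cidrs.map { |c| IPAddr.new(c) }
    @patterns = patterns.map { |p| Regexp.new(p) }
  end

  def invalid?(target)
    return true if @patterns.any? { |p| p.match?(target) }
    ip = (IPAddr.new(target) rescue nil)
    ip ? @cidrs.any? { |c| c.include?(ip) } : false
  end
end
```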

Enable ipv6 on prod host

We're seeing errors with hosts that support IPv6 because we don't have IPv6 enabled in prod. This issue is to fix that.

In progress scan polling returns weak/confusing messaging

Scenario:

1.) I use the API to task a scan
2.) I then poll the API periodically to see when the scan is completed
3.) During that window of waiting for the scan, we provide very little feedback to the end user about what's happening (currently just { scan: "not found" } until the scan comes back)

This is partially due to the crude queuing strategy currently in place, which doesn't maintain state and isn't bound to the DB in any way. Once this is DB-integrated, we should provide an active status so the end user can surface that feedback.
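Once state lives in the DB, the polling response could report it directly instead of `{ scan: "not found" }`; the payload shape here is an assumption, and the state names mirror the transitions mentioned elsewhere (queued, running, errored, completed).

```ruby
require "json"

# Return the scan's current state while it's in flight, rather than
# "not found", so the caller can surface progress to the user.
def poll_response(scan)
  return { scan: "not found" }.to_json if scan.nil?
  { uuid: scan[:uuid], status: scan[:state] }.to_json
end
```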

Refactor ssh_scan_worker bin/lib

The workers are not as robust as I would prefer. Things that I think we could do better include...

  • Having a worker manager (that can spin up multiple worker processes in a single command-line run)
  • Have some sort of service recovery (that can make sure if a worker dies/dead-locks that it is respawned)
  • Have some sort of system service (that can ensure services start back up on a reboot)
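The service-recovery bullet could be sketched as a supervisor that re-forks a worker when it dies, up to a restart budget. This is POSIX-only (it uses fork), and the names are assumptions.

```ruby
# Run a worker in a child process; if it exits, respawn it until the
# restart budget is exhausted. Returns the number of restarts performed.
def supervise(max_restarts:, &work)
  restarts = 0
  loop do
    pid = fork(&work)
    Process.wait(pid)
    break if restarts >= max_restarts
    restarts += 1
  end
  restarts
end
```

A real manager would also spin up multiple workers at once and hook into a system service (systemd or similar) for reboot survival.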

Add some sort of BATCHED_QUEUE functionality

The idea here is say we want to scan 1000 servers for research purposes, but we know if we do that we might impact a real user sitting in front of a screen.

So this would allow someone the ability to throw those 1000 servers into a lower-priority queue so they can be completed when the normal queue isn't so busy.
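The two-tier behavior could be sketched as below, where the interactive queue always drains before the batch queue; the class name is an assumption.

```ruby
# Two-tier queue: normal (interactive) scans are always served before
# anything in the lower-priority batch queue.
class TieredQueue
  def initialize
    @normal = []
    @batch = []
  end

  def push(target, priority: :normal)
    (priority == :batch ? @batch : @normal) << target
  end

  def pop
    @normal.shift || @batch.shift
  end
end
```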

Dockerize the ssh_scan_api

This is primarily so I can more easily maintain the production deployment, but I suspect it will open up more doors for viable integration testing.


Provide configuration restrictions on what ports can be scanned

Currently, the prod API allows the scanning of really any TCP port. We should probably provide some controls around this that limit its exposure, so people can make their own choices about what ports are acceptable. I say this because some orgs have conventions of splitting, say, SCM services into SCM + MGMT ssh services, which require alternative ports.
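A minimal sketch of a config-driven port check, accepting individual ports or ranges; the default list and method name are assumptions.

```ruby
# Allow a scan only if the requested port matches a configured entry,
# which may be a single port or a Range of ports.
def port_allowed?(port, allowed: [22, (2000..2999)])
  allowed.any? { |entry| entry === port }
end
```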

Move production API && DB to AWS

Localized Mongo instances are great for local testing and such, but for the production deployment it would be really handy to have the DB live in a very stable place where it can get backups and such, in case API server corruption results in data loss.
