This repository has been migrated to https://github.com/mozilla/fxa/tree/master/packages/browserid-verifier
Please file issues and open pull requests against https://github.com/mozilla/fxa
DEPRECATED - Migrated to https://github.com/mozilla/fxa
To help us make operational decisions about the standalone version of this service hosted at https://verifier.accounts.firefox.com, we need to better understand who is using it and why. Let's ensure we have logging to capture:
the audience field, to tell us who is using the service
the issuer field from the assertion, to give us an idea about why
I can do the logging, but @jrgm @mostlygeek is the stand-alone FxA verifier plugged into our logging infra in a way that would enable easy analysis of these values? e.g. do the application logs feed into Kibana somewhere?
(From https://github.com/mozilla-services/puppet-config/pull/384#issuecomment-41078194)
I deployed a single tokenserver canary into production with the container. It appears that the CPU usage is much higher than the other production servers:
In stage testing it was the verifier container that was using up the bulk of the CPU.
I suspect this may be an issue where node 4 requires a different compile flag for browserid-crypto.
My internal test suite shows different error messages from the v1 verifier and v2 verifier. I think the v2 verifier is wrong. I'm using the following endpoints for v1 and v2:
public static String VERIFIER_URL_10 = "https://verifier.login.persona.org/verify";
public static String VERIFIER_URL_20 = "https://verifier.accounts.firefox.com/v2";
A representative failing test is at [1], and looks like:
@Test
public void testCertificateExpired() throws Exception {
    // Certificate is issued 2ms in the past and expired 1ms after issuance.
    long ciat = System.currentTimeMillis() - 2;
    long cexp = ciat + 1;
    // Assertion is issued now and is long-lived (default duration).
    long aiat = System.currentTimeMillis();
    long aexp = aiat + JSONWebTokenUtils.DEFAULT_ASSERTION_DURATION_IN_MILLISECONDS;
    String certificate = mockMyIdTokenFactory.createMockMyIDCertificate(publicKey, TEST_USERNAME, ciat, cexp);
    String assertion = JSONWebTokenUtils.createAssertion(privateKey, certificate, TEST_AUDIENCE, JSONWebTokenUtils.DEFAULT_ASSERTION_ISSUER, aiat, aexp);
    // The v1 verifier reported "assertion has expired" here; v2 now says "certificate expired".
    assertVerifyFailure(TEST_AUDIENCE, assertion, "assertion has expired");
}
You can see that the certificate is issued in the past, is short-lived, and expires immediately; but the assertion is long-lived and includes the time of verification. I'm getting "certificate expired" instead of "assertion has expired" like I used to.
A minor issue, but it's already really hard to debug these things. Let's not make them harder!
This is htop output from two servers in the tokenserver cluster. The top panel is running 0.4.0 and the bottom panel is running 0.5.1. I filtered to only the worker.js processes, and after watching for a little while, 0.5.1 consistently uses ~5% more CPU.
Here's the canary server compared to a couple others in the cluster.
Just a tracker issue to figure out the use/feasibility...
Because we start a verifier process for each test, jscoverage, which keeps its global state in-process, is tricky to leverage. We should figure out a simple, non-intrusive means of getting code coverage numbers.
We should implement application level metrics reporting at least on parity with what the current verifier reports.
Just as with mozilla IdP and our other projects, we should focus on the right level of metrics for key interesting events. The goal is a dashboard where we can understand at a glance if the system is healthy.
We should follow the conventions in browserid, mozilla-idp, and sideshow and implement a simple healthcheck middleware that is registered before toobusy (the health check is the first middleware), as sketched below.
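A minimal sketch of that ordering, assuming Express and the toobusy-js package (the /__heartbeat__ path is an assumption, not an agreed-upon route):

const express = require('express');
const toobusy = require('toobusy-js');
const app = express();

// Health check comes first so load shedding can never block it.
app.get('/__heartbeat__', function (req, res) {
  res.json({ status: 'ok' });
});

// toobusy runs second and sheds load on every other route.
app.use(function (req, res, next) {
  if (toobusy()) {
    res.status(503).send('server is too busy');
  } else {
    next();
  }
});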
Each HTTP request should be logged to the intel logger.
It would be great if the verifier logged one summary line per request, similar to fxa-auth-server: mozilla/fxa-auth-server#541
In particular, I'll propose these fields:
op ("verify.summary")
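For illustration only, such a summary line might look like the following; every field other than op is an assumption, not part of the proposal:

{"op":"verify.summary","time":"2014-05-13T22:28:37.328Z","audience":"https://token.stage.mozaws.net","issuer":"api.accounts.firefox.com","result":"okay","rt_ms":12}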
During some TokenServer/Verifier Stage testing this morning, we ran across this:
https://verifier.stage.mozaws.net/v2 returns 404
https://verifier.stage.mozaws.net returns 404
https://verifier.accounts.firefox.com/v2 returns 404
https://verifier.accounts.firefox.com/ returns 404
Per @rfk, we should return a 405 for these cases:
14:31 < rfkelly> jbonacci re: verifier above, pls file a bug to make it return a 405 rather than a 404
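A minimal sketch of the fix, assuming Express (which the stack traces elsewhere in this dump suggest the verifier uses):

// Answer non-POST requests on the verify endpoints with 405 instead of 404.
['/', '/v2'].forEach(function (path) {
  app.all(path, function (req, res, next) {
    if (req.method !== 'POST') {
      res.set('Allow', 'POST');
      return res.status(405).send('method not allowed');
    }
    next();
  });
});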
This is a throwaway branch; I'm just using it to generate an RPM.
We should document the REST API. Perhaps we should document the v1 endpoint as deprecated, and guide folks toward the v2 API.
Longer term issue. Currently in persona we cache well-knowns at the squid (outbound http proxy) layer. We might consider caching at the application level. The complexity and benefits of this should be carefully weighed.
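If we do cache at the application level, a minimal sketch might be an in-memory TTL cache (the names and TTL value are illustrative assumptions):

// Cache fetched well-known support documents for a short TTL.
const supportDocCache = new Map();
const TTL_MS = 5 * 60 * 1000;

function getSupportDoc(domain, fetchFn, cb) {
  const hit = supportDocCache.get(domain);
  if (hit && Date.now() - hit.fetchedAt < TTL_MS) {
    return cb(null, hit.doc);
  }
  fetchFn(domain, function (err, doc) {
    if (err) return cb(err);
    supportDocCache.set(domain, { doc: doc, fetchedAt: Date.now() });
    cb(null, doc);
  });
}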
@6a68 Could you please change the repo description to be "BrowserID verifier hosted at verifier.accounts.firefox.com"
IMO, there are too many footguns trapping FxA-only reliers (i.e., reliers who only want to accept FxA assertions) when verifying and extracting information from FxA-generated assertions. The Mozilla FxA Sync token server (https://github.com/mozilla-services/tokenserver) is an example of such a service. Here are two pitfalls I can think of:
The relier needs to check that the assertion is really backed by an authority it trusts. For example, the token server configures itself with a list of allowedIssuers and checks that the issuer of the assertion is on that list: https://github.com/mozilla-services/tokenserver/blob/b55e9533835258f16bf1611521a9efc11aa2afbc/tokenserver/verifiers.py#L90
Another issue is that the FxA uid is buried in a non-routable email (i.e., <uid>@accounts.firefox.com), which is returned as email to the caller. To extract the proper FxA uid, the relier needs to parse the returned email.
I don't expect these checks to be reliably employed as the number and type of reliers scales.
A simple mitigation would be to provide an FxA-only verifier endpoint, which only verifies FxA assertions, and extracts and returns the user information in a more meaningful format.
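For illustration, the two checks might look like this on the relier side; the response shape and allowed issuer are assumptions based on the token server code linked above:

const ALLOWED_ISSUERS = ['api.accounts.firefox.com'];

function extractFxAUid(verifierResponse) {
  if (verifierResponse.status !== 'okay') {
    throw new Error('assertion did not verify');
  }
  // Pitfall 1: only trust assertions backed by a known authority.
  if (ALLOWED_ISSUERS.indexOf(verifierResponse.issuer) === -1) {
    throw new Error('untrusted issuer: ' + verifierResponse.issuer);
  }
  // Pitfall 2: the uid is buried in a non-routable email address.
  const match = /^([0-9a-f]+)@/.exec(verifierResponse.email);
  if (!match) {
    throw new Error('unexpected email format: ' + verifierResponse.email);
  }
  return match[1];
}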
Hi,
I am trying to compile this on FreeBSD. I am getting an error reporting that gmp.h is not found. FreeBSD stores this file in /usr/local/include/gmp.h. Where would I modify the include path to include /usr/local/include?
Thanks
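One possible workaround, offered as an assumption rather than a verified fix: the gmp binding is compiled by node-gyp with the system compiler, and both gcc and clang honor the CPATH and LIBRARY_PATH environment variables, so something like this may pick up the FreeBSD header location:

CPATH=/usr/local/include LIBRARY_PATH=/usr/local/lib npm install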
For years now I have had to install a custom branch of browserid-verifier in local dev (https://github.com/mozilla/fxa-local-dev/blob/master/_scripts/install_all.sh#L18) that has a custom branch of local-verify: vladikoff/browserid-local-verify@bd8ed64#diff-6545ee45f6c9535564ff7b051bd099a7R24
We really need to make this configurable because it's a bit of a mess.
Ref: https://github.com/mozilla/browserid-verifier/blob/master/package.json#L20
Support for ass (the code coverage tool) was removed in #36, which breaks the automated tests for code coverage. We need to restore this in a manner that doesn't interfere with building for production.
Because we don't have one like we do for fxa-auth and sync storage.
Just getting this on the radar.
Each textual error message should have a corresponding machine-readable errno that we can report in the summary log line. It probably makes sense to include it in the verifier response to the client as well.
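A minimal sketch of such a mapping; the numbers and key names are illustrative assumptions, not an agreed-upon scheme:

// Map each textual error to a stable machine-readable errno.
const ERRORS = {
  ASSERTION_EXPIRED:   { errno: 101, message: 'assertion has expired' },
  CERTIFICATE_EXPIRED: { errno: 102, message: 'certificate expired' },
  MALFORMED_SIGNATURE: { errno: 103, message: 'malformed signature' }
};

// Usable both in the summary log line and in the response to the client.
function failureResponse(key) {
  const err = ERRORS[key];
  return { status: 'failure', reason: err.message, errno: err.errno };
}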
Compute cluster seems inappropriate given the tiny compute cost of assertion verification and the relatively high cost of request processing. Once we have a load generation tool, we should consider using the built-in socket sharing in node.js to run a configurable number of verifier processes, all sharing the same socket and using node.js's round-robin implementation.
This would mean no shared state between processes, and the complexity of this feature would affect our ability to do application-level caching.
We should go slow here.
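For reference, a minimal sketch of that built-in socket sharing via Node's cluster module (the VERIFIER_PROCESSES variable and the port are assumptions):

const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isMaster) {
  // A configurable number of verifier processes, defaulting to one per CPU.
  const n = Number(process.env.VERIFIER_PROCESSES) || os.cpus().length;
  for (let i = 0; i < n; i++) {
    cluster.fork();
  }
} else {
  // Every worker shares the same listening socket; the master distributes
  // connections round-robin on most platforms.
  http.createServer(function (req, res) {
    res.end('handled by worker ' + process.pid + '\n');
  }).listen(8000);
}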
It's the new hotness; we should add it like so: mozilla/fxa-profile-server#74
To whomever is owning this repo now...
I need to be able to create and add labels to this repo.
Thanks.
Trying to validate an assertion from persona.org with the hosted FxA verifier gives this error:
2014/02/27 23:57:05 [1]handler:Index: Persona Auth Failed {error: bad support document for 'login.persona.org': support document missing required property: 'authentication', buffer: map[status:failure reason:bad support document for 'login.persona.org': support document missing required property: 'authentication']}
The support document for persona.org is indeed missing these fields, so perhaps it's technically against the spec, but we shouldn't need those fields just to verify the assertion. We should fix this situation one way or another: either update the support document on persona.org, or special-case it in the verifier to not require those fields.
ping @fmarier
Right now the app logs are probably showing more information than is really needed:
/media/ephemeral0/fxa-browserid-verifier/fxa-browserid-verifier-8000-out.log has INFO in it, as does this one:
/media/ephemeral0/fxa-browserid-verifier/fxa-browserid-verifier-8001-out.log
This is a bad smell and I'm concerned what else is lurking.
curl -H 'Content-Type: application/json' -d '{ a }' https://verifier.accounts.firefox.com/v2
SyntaxError: Unexpected token a
at Object.parse (native)
at /data/fxa-browserid-verifier/node_modules/express/node_modules/connect/lib/middleware/json.js:85:25
at IncomingMessage.onEnd (/data/fxa-browserid-verifier/node_modules/express/node_modules/connect/node_modules/raw-body/index.js:57:7)
at IncomingMessage.g (events.js:180:16)
at IncomingMessage.EventEmitter.emit (events.js:92:17)
at _stream_readable.js:920:16
at process._tickDomainCallback
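That raw stack trace leaks implementation details to the client. A minimal sketch of a cleaner response, assuming Express error-handling middleware registered after the json body parser:

// Catch body-parser SyntaxErrors and return a clean 400 instead of a stack trace.
app.use(function (err, req, res, next) {
  if (err instanceof SyntaxError) {
    return res.status(400).json({ status: 'failure', reason: 'malformed request body' });
  }
  next(err);
});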
When the verifier does the CPU-intensive verification of an assertion, please spawn a child process to do it. The number of child processes should be limited to the number of real CPUs on the system.
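A minimal sketch using the compute-cluster module, which the "compute cluster error" log lines later in this dump suggest was the direction taken; the worker module path is hypothetical:

const ComputeCluster = require('compute-cluster');
const os = require('os');

// One pool of child processes, capped at the number of real CPUs,
// does the expensive crypto outside the main event loop.
const cc = new ComputeCluster({
  module: './lib/verify-worker.js',
  max_processes: os.cpus().length
});

function verifyInChild(assertion, audience, cb) {
  cc.enqueue({ assertion: assertion, audience: audience }, cb);
}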
We are finding that with a single instance, 5 agents or even 2 agents per 20 users can cause 503s on the host.
For the most recent release, we found that 20 users and 1 agent prevented 503s.
This seems too late, but we might want to scale this better going forward given the number and size of instances in Stage.
As noted in https://bugzilla.mozilla.org/show_bug.cgi?id=1044532 (private bug, sorry world) we are seeing an unusual number of "malformed signature" errors on our production verifier. We should log more details about these in order to track down what's causing the bug.
This may involve digging into jwcrypto to throw more detailed error messages, but I wanted to get it on the books at this level for tracking purposes.
The current verifier API is extremely open: GET or POST is supported, sloppy audience specification is supported, and it will accept both new- and old-style assertions. You can interface with it at / or /verify.
We should consider implementing a /v2 endpoint that is much more tightly defined. Perhaps it only accepts JSON, perhaps it requires complete origins as the audience, perhaps it does not support some of the legacy APIs of the current endpoint. We could also figure out a reasonable code and test structure and alias /v1 to the current API.
We should document the server from a deployment perspective. How you run it, how you configure it.
I am a bit suspicious of my recent upgrade from Mac OS 10.8.5 to 10.9 (and XCode 4.x to 5.1.1), but this is what I see...
https://jbonacci.pastebin.mozilla.org/5159635
I will further debug later on Mac and on Linux...
This is the combined load test defined here:
https://github.com/mozilla-services/server-syncstorage/tree/master/loadtest
And executed roughly as follows:
make megabench SERVER_URL=https://token.stage.mozaws.net
Standard settings for this shortened test:
users = 20
duration = 900
agents = 5
I found the following in /media/ephemeral0/nginx/logs/fxa-browserid-verifier.access.log
1400019877.304 "54.237.136.38" "POST /v2 HTTP/1.1" 503 60 "-" "python-requests/2.2.1 CPython/2.6.6 Linux/2.6.32-431.11.2.el6.x86_64" 0.002 0.002
And looking here: /media/ephemeral0/fxa-browserid-verifier/verifier_out.log
{"op":"bid.v2","name":"bid.v2","time":"2014-05-13T22:28:37.328Z","pid":2326,"v":1,"hostname":"ip-10-187-17-59","message":"service_failure"}
{"op":"bid.v2","name":"bid.v2","time":"2014-05-13T22:28:37.328Z","pid":2326,"v":1,"hostname":"ip-10-187-17-59","message":"verify { result: 'failure',\n reason: 'compute cluster error: cannot enqueue work: maximum backlog exceeded (40)',\n rp: 'https://token.stage.mozaws.net' }"}
We documented this once before (at least), but I cannot find the issue or Bugzilla ticket...
re:
rfkelly> so, this is the verifier being overloaded by the combined TS+sync loadtest
Something to research and document going forward...
The verifier should have a configurable fallback. For Persona purposes, this fallback should be login.persona.org.
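For illustration, that could be a single config knob; the surrounding layout is an assumption:

{
  "fallback": "login.persona.org"
}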
We should record verification requests to a metrics.json file that matches the current verifier's format. This will allow us to preserve our current metrics dashboard.
docker build failed on Debian Stretch; see paste: https://paste.debian.net/1040699/
The verifier should be deployable out of the box on awsbox.
This milestone is complete when we have a hosted verifier on awsbox with a REST API that is compatible with Persona's current verifier.
We should implement a load generation tool to stress the verifier library.
Over in mozilla/fxa-auth-server#1064 we taught the fxa-auth-server how to serve multiple public keys at once, in order to support graceful key rotation. We need to teach browserid-verifier to use these multiple keys when verifying assertions.
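A minimal sketch of the idea; the function names are illustrative, not jwcrypto's actual API:

// Try each advertised public key until one verifies the assertion.
function verifyWithAnyKey(assertion, publicKeys, verifyFn) {
  for (let i = 0; i < publicKeys.length; i++) {
    try {
      return verifyFn(assertion, publicKeys[i]);  // verified claims on success
    } catch (e) {
      // Signature did not match this key; fall through to the next one.
    }
  }
  throw new Error('no advertised public key verified the assertion');
}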
Just like https://accounts.firefox.com/ver.json
I want to test that the docker container automatically builds and pushes tags. There are some changes to master since 0.3.0. Is it ok to release a 0.3.1 with the latest changes and the docker stuff?
Why do we have log names that look like this?
(Stage env)
fxa-browserid-verifier-8000-err.log
fxa-browserid-verifier-8000-out.log
fxa-browserid-verifier-8001-err.log
fxa-browserid-verifier-8001-out.log
What is the significance of 8000/8001?
I was going to rev this, but it's still on node 4. Node 4 support goes away in April 2018. Should we look at upgrading this to node 6, or is it possible that we won't be using this by then?
cc @rfk
Looks like a downstream issue with toobusy... but we should try to make this Node 0.12 (current stable) compatible. Actually, @shane-tomlinson just reported that this isn't working on Node 0.10.32 either.
Screenshot: https://cloudup.com/cOktBSWibQA
We should port all current API tests from the persona repository to ensure the same level of testing and API conformance.
This is a user-facing production service, we should have the same kinds of dependency management and sanity-checking and what-not as our other nodejs services.
/cc @pdehaan who seems to know about these things, care to take a look and tell us what we're missing in this repo?
See the following:
mozilla-services/loads#220
mozilla-services/loads#221
This requires a change to the Makefile, and maybe a change to the config/megabench.ini file...