This repository has been migrated to https://github.com/mozilla/fxa/tree/master/packages/browserid-verifier
Please file issues and open pull requests against https://github.com/mozilla/fxa
DEPRECATED - Migrated to https://github.com/mozilla/fxa
To help us make operational decisions about the standalone version of this service hosted at https://verifier.accounts.firefox.com, we need to better understand who is using it and why. Let's ensure we have logging to capture:
the audience field, to tell us who is using the service
the issuer field from the assertion, to give us an idea about why
I can do the logging, but @jrgm @mostlygeek is the stand-alone FxA verifier plugged into our logging infra in a way that would enable easy analysis of these values? e.g. do the application logs feed into Kibana somewhere?
(From https://github.com/mozilla-services/puppet-config/pull/384#issuecomment-41078194)
I deployed a single tokenserver canary into production with the container. It appears that the CPU usage is much higher than the other production servers:
In stage testing it was the verifier container that was using up the bulk of the CPU.
I suspect this may be an issue where node 4 requires a different compile flag for browserid-crypto.
My internal test suite shows different error messages from the v1 verifier and v2 verifier. I think the v2 verifier is wrong. I'm using the following endpoints for v1 and v2:
public static String VERIFIER_URL_10 = "https://verifier.login.persona.org/verify";
public static String VERIFIER_URL_20 = "https://verifier.accounts.firefox.com/v2";
A representative failing test is at [1], and looks like:
@Test
public void testCertificateExpired() throws Exception {
    // Certificate is issued 2ms in the past and expired 1ms after issuance.
    long ciat = System.currentTimeMillis() - 2;
    long cexp = ciat + 1;
    // Assertion is issued now and is long-lived (default duration).
    long aiat = System.currentTimeMillis();
    long aexp = aiat + JSONWebTokenUtils.DEFAULT_ASSERTION_DURATION_IN_MILLISECONDS;
    String certificate = mockMyIdTokenFactory.createMockMyIDCertificate(publicKey, TEST_USERNAME, ciat, cexp);
    String assertion = JSONWebTokenUtils.createAssertion(privateKey, certificate, TEST_AUDIENCE, JSONWebTokenUtils.DEFAULT_ASSERTION_ISSUER, aiat, aexp);
    // The v1 verifier reported "assertion has expired" here; v2 now says "certificate expired".
    assertVerifyFailure(TEST_AUDIENCE, assertion, "assertion has expired");
}
You can see that the certificate is issued in the past, is short-lived, and expires immediately; but the assertion is long-lived and includes the time of verification. I'm getting "certificate expired" instead of "assertion has expired" like I used to.
A minor issue, but it's already really hard to debug these things. Let's not make them harder!
This is htop output from two servers in the tokenserver cluster. The top panel is running 0.4.0 and the bottom panel is running 0.5.1. I filtered to only the worker.js processes, and after watching for a little while, 0.5.1 consistently uses ~5% more CPU.
Here's the canary server compared to a couple others in the cluster.
Just a tracker issue to figure out the use/feasibility...
Because we start a verifier process for each test, jscoverage, which keeps its global state in-process, is tricky to leverage. We should figure out a simple, non-intrusive means of getting code coverage numbers.
We should implement application level metrics reporting at least on parity with what the current verifier reports.
Just as with mozilla IdP and our other projects, we should focus on the right level of metrics for key interesting events. The goal is a dashboard where we can understand at a glance if the system is healthy.
We should follow the conventions in browserid, mozilla-idp, and sideshow and implement a simple healthcheck middleware that is registered before toobusy (the health check is the first middleware), as sketched below.
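A minimal sketch of that ordering, assuming Express and the toobusy-js package (the /__heartbeat__ path is an assumption, not an agreed-upon route):

const express = require('express');
const toobusy = require('toobusy-js');
const app = express();

// Health check comes first so load shedding can never block it.
app.get('/__heartbeat__', function (req, res) {
  res.json({ status: 'ok' });
});

// toobusy runs second and sheds load on every other route.
app.use(function (req, res, next) {
  if (toobusy()) {
    res.status(503).send('server is too busy');
  } else {
    next();
  }
});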
Each HTTP request should be logged to the intel logger.
It would be great if the verifier logged one summary line per request, similar to fxa-auth-server: mozilla/fxa-auth-server#541
In particular, I'll propose these fields:
op ("verify.summary")
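For illustration only, such a summary line might look like the following; every field other than op is an assumption, not part of the proposal:

{"op":"verify.summary","time":"2014-05-13T22:28:37.328Z","audience":"https://token.stage.mozaws.net","issuer":"api.accounts.firefox.com","result":"okay","rt_ms":12}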
During some TokenServer/Verifier Stage testing this morning, we ran across this:
https://verifier.stage.mozaws.net/v2 returns 404
https://verifier.stage.mozaws.net returns 404
https://verifier.accounts.firefox.com/v2 returns 404
https://verifier.accounts.firefox.com/ returns 404
Per @rfk, we should return a 405 for these cases:
14:31 < rfkelly> jbonacci re: verifier above, pls file a bug to make it return a 405 rather than a 404
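A minimal sketch of the fix, assuming Express (which the stack traces elsewhere in this dump suggest the verifier uses):

// Answer non-POST requests on the verify endpoints with 405 instead of 404.
['/', '/v2'].forEach(function (path) {
  app.all(path, function (req, res, next) {
    if (req.method !== 'POST') {
      res.set('Allow', 'POST');
      return res.status(405).send('method not allowed');
    }
    next();
  });
});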
This is a throwaway branch; I'm just using it to generate an RPM.
We should document the REST API. Perhaps we should document the v1 endpoint as deprecated, and guide folks toward the v2 API.
Longer term issue. Currently in persona we cache well-knowns at the squid (outbound http proxy) layer. We might consider caching at the application level. The complexity and benefits of this should be carefully weighed.
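If we do cache at the application level, a minimal sketch might be an in-memory TTL cache (the names and TTL value are illustrative assumptions):

// Cache fetched well-known support documents for a short TTL.
const supportDocCache = new Map();
const TTL_MS = 5 * 60 * 1000;

function getSupportDoc(domain, fetchFn, cb) {
  const hit = supportDocCache.get(domain);
  if (hit && Date.now() - hit.fetchedAt < TTL_MS) {
    return cb(null, hit.doc);
  }
  fetchFn(domain, function (err, doc) {
    if (err) return cb(err);
    supportDocCache.set(domain, { doc: doc, fetchedAt: Date.now() });
    cb(null, doc);
  });
}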
@6a68 Could you please change the repo description to be "BrowserID verifier hosted at verifier.accounts.firefox.com"
IMO, there are too many footguns trapping FxA-only reliers (i.e., reliers who only want to accept FxA assertions) when verifying and extracting information from FxA-generated assertions. The Mozilla FxA Sync token server (https://github.com/mozilla-services/tokenserver) is an example of such a service. Here are two pitfalls I can think of:
The relier needs to check that the assertion is really backed by an authority it trusts. For example, the token server configures itself with a list of allowedIssuers and checks that the issuer of the assertion is on that list: https://github.com/mozilla-services/tokenserver/blob/b55e9533835258f16bf1611521a9efc11aa2afbc/tokenserver/verifiers.py#L90
Another issue is that the FxA uid is buried in a non-routable email (i.e., <uid>@accounts.firefox.com), which is returned as email to the caller. To extract the proper FxA uid, the relier needs to parse the returned email.
I don't expect these checks to be reliably employed as the number and type of reliers scales.
A simple mitigation would be to provide an FxA-only verifier endpoint, which only verifies FxA assertions, and extracts and returns the user information in a more meaningful format.
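For illustration, the two checks might look like this on the relier side; the response shape and allowed issuer are assumptions based on the token server code linked above:

const ALLOWED_ISSUERS = ['api.accounts.firefox.com'];

function extractFxAUid(verifierResponse) {
  if (verifierResponse.status !== 'okay') {
    throw new Error('assertion did not verify');
  }
  // Pitfall 1: only trust assertions backed by a known authority.
  if (ALLOWED_ISSUERS.indexOf(verifierResponse.issuer) === -1) {
    throw new Error('untrusted issuer: ' + verifierResponse.issuer);
  }
  // Pitfall 2: the uid is buried in a non-routable email address.
  const match = /^([0-9a-f]+)@/.exec(verifierResponse.email);
  if (!match) {
    throw new Error('unexpected email format: ' + verifierResponse.email);
  }
  return match[1];
}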
Hi,
I am trying to compile this on FreeBSD. I am getting an error reporting that gmp.h is not found. FreeBSD stores this file in /usr/local/include/gmp.h. Where would I modify the include path to include /usr/local/include?
Thanks
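One possible workaround, offered as an assumption rather than a verified fix: the gmp binding is compiled by node-gyp with the system compiler, and both gcc and clang honor the CPATH and LIBRARY_PATH environment variables, so something like this may pick up the FreeBSD header location:

CPATH=/usr/local/include LIBRARY_PATH=/usr/local/lib npm install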
For years now I have had to install a custom branch of browserid-verifier in local dev (https://github.com/mozilla/fxa-local-dev/blob/master/_scripts/install_all.sh#L18) that has a custom branch of local-verify: vladikoff/browserid-local-verify@bd8ed64#diff-6545ee45f6c9535564ff7b051bd099a7R24
We really need to make this configurable because it's a bit of a mess.
Ref: https://github.com/mozilla/browserid-verifier/blob/master/package.json#L20
Support for ass (the code coverage tool) was removed in #36, which breaks the automated tests for code coverage. We need to restore this in a manner that doesn't interfere with building for production.
Because we don't have one like we do for fxa-auth and sync storage.
Just getting this on the radar.
Each textual error message should have a corresponding machine-readable errno that we can report in the summary log line. It probably makes sense to include it in the verifier response to the client as well.
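A minimal sketch of such a mapping; the numbers and key names are illustrative assumptions, not an agreed-upon scheme:

// Map each textual error to a stable machine-readable errno.
const ERRORS = {
  ASSERTION_EXPIRED:   { errno: 101, message: 'assertion has expired' },
  CERTIFICATE_EXPIRED: { errno: 102, message: 'certificate expired' },
  MALFORMED_SIGNATURE: { errno: 103, message: 'malformed signature' }
};

// Usable both in the summary log line and in the response to the client.
function failureResponse(key) {
  const err = ERRORS[key];
  return { status: 'failure', reason: err.message, errno: err.errno };
}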
Compute cluster seems inappropriate given the tiny compute cost of assertion verification and the relatively high cost of request processing. Once we have a load generation tool, we should consider using the built-in socket sharing in node.js to run a configurable number of verifier processes, all sharing the same socket and using node.js's round-robin implementation.
This would mean no shared state between processes, and the complexity of this feature would affect our ability to do application-level caching.
We should go slow here.
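For reference, a minimal sketch of that built-in socket sharing via Node's cluster module (the VERIFIER_PROCESSES variable and the port are assumptions):

const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isMaster) {
  // A configurable number of verifier processes, defaulting to one per CPU.
  const n = Number(process.env.VERIFIER_PROCESSES) || os.cpus().length;
  for (let i = 0; i < n; i++) {
    cluster.fork();
  }
} else {
  // Every worker shares the same listening socket; the master distributes
  // connections round-robin on most platforms.
  http.createServer(function (req, res) {
    res.end('handled by worker ' + process.pid + '\n');
  }).listen(8000);
}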
It's the new hotness; we should add it like so: mozilla/fxa-profile-server#74
To whomever is owning this repo now...
I need to be able to create and add labels to this repo.
Thanks.
Trying to validate an assertion from persona.org with the hosted FxA verifier gives this error:
2014/02/27 23:57:05 [1]handler:Index: Persona Auth Failed {error: bad support document for 'login.persona.org': support document missing required property: 'authentication', buffer: map[status:failure reason:bad support document for 'login.persona.org': support document missing required property: 'authentication']}
The support document for persona.org is indeed missing these fields, so perhaps it's technically against the spec, but we shouldn't need those fields just to verify the assertion. We should fix this situation one way or another: either update the support document on persona.org, or special-case it in the verifier to not require those fields.
ping @fmarier
Right now the app logs are probably showing more information than is really needed:
/media/ephemeral0/fxa-browserid-verifier/fxa-browserid-verifier-8000-out.log has INFO in it, as does this one:
/media/ephemeral0/fxa-browserid-verifier/fxa-browserid-verifier-8001-out.log
This is a bad smell and I'm concerned what else is lurking.
curl -H 'Content-Type: application/json' -d '{ a }' https://verifier.accounts.firefox.com/v2
SyntaxError: Unexpected token a
at Object.parse (native)
at /data/fxa-browserid-verifier/node_modules/express/node_modules/connect/lib/middleware/json.js:85:25
at IncomingMessage.onEnd (/data/fxa-browserid-verifier/node_modules/express/node_modules/connect/node_modules/raw-body/index.js:57:7)
at IncomingMessage.g (events.js:180:16)
at IncomingMessage.EventEmitter.emit (events.js:92:17)
at _stream_readable.js:920:16
at process._tickDomainCallback
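That raw stack trace leaks implementation details to the client. A minimal sketch of a cleaner response, assuming Express error-handling middleware registered after the json body parser:

// Catch body-parser SyntaxErrors and return a clean 400 instead of a stack trace.
app.use(function (err, req, res, next) {
  if (err instanceof SyntaxError) {
    return res.status(400).json({ status: 'failure', reason: 'malformed request body' });
  }
  next(err);
});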
When the verifier does the CPU-intensive verification of an assertion, please spawn a child process to do it. The number of child processes should be limited to the number of real CPUs on the system.
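A minimal sketch using the compute-cluster module, which the "compute cluster error" log lines later in this dump suggest was the direction taken; the worker module path is hypothetical:

const ComputeCluster = require('compute-cluster');
const os = require('os');

// One pool of child processes, capped at the number of real CPUs,
// does the expensive crypto outside the main event loop.
const cc = new ComputeCluster({
  module: './lib/verify-worker.js',
  max_processes: os.cpus().length
});

function verifyInChild(assertion, audience, cb) {
  cc.enqueue({ assertion: assertion, audience: audience }, cb);
}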
We are finding that with a single instance, 5 agents or even 2 agents per 20 users can cause 503s on the host.
For the most recent release, we found that 20 users and 1 agent prevented 503s.
This seems too late, but we might want to scale this better going forward given the number and size of instances in Stage.
As noted in https://bugzilla.mozilla.org/show_bug.cgi?id=1044532 (private bug, sorry world) we are seeing an unusual number of "malformed signature" errors on our production verifier. We should log more details about these in order to track down what's causing the bug.
This may involve digging into jwcrypto to throw more detailed error messages, but I wanted to get it on the books at this level for tracking purposes.
The current verifier API is extremely open: GET or POST is supported, sloppy audience specification is supported, and it will accept both new- and old-style assertions. You can interface with it at / or /verify.
We should consider implementing a /v2 endpoint that is much more tightly defined. Perhaps it only accepts JSON, perhaps it requires complete origins as the audience, perhaps it does not support some of the legacy APIs of the current endpoint. We could also figure out a reasonable code and test structure and alias /v1 to the current API.
We should document the server from a deployment perspective. How you run it, how you configure it.
I am a bit suspicious of my recent upgrade from Mac OS 10.8.5 to 10.9 (and XCode 4.x to 5.1.1), but this is what I see...
https://jbonacci.pastebin.mozilla.org/5159635
I will further debug later on Mac and on Linux...
This is the combined load test defined here:
https://github.com/mozilla-services/server-syncstorage/tree/master/loadtest
And executed roughly as follows:
make megabench SERVER_URL=https://token.stage.mozaws.net
Standard settings for this shortened test:
users = 20
duration = 900
agents = 5
I found the following in /media/ephemeral0/nginx/logs/fxa-browserid-verifier.access.log
1400019877.304 "54.237.136.38" "POST /v2 HTTP/1.1" 503 60 "-" "python-requests/2.2.1 CPython/2.6.6 Linux/2.6.32-431.11.2.el6.x86_64" 0.002 0.002
And looking here: /media/ephemeral0/fxa-browserid-verifier/verifier_out.log
{"op":"bid.v2","name":"bid.v2","time":"2014-05-13T22:28:37.328Z","pid":2326,"v":1,"hostname":"ip-10-187-17-59","message":"service_failure"}
{"op":"bid.v2","name":"bid.v2","time":"2014-05-13T22:28:37.328Z","pid":2326,"v":1,"hostname":"ip-10-187-17-59","message":"verify { result: 'failure',\n reason: 'compute cluster error: cannot enqueue work: maximum backlog exceeded (40)',\n rp: 'https://token.stage.mozaws.net' }"}
We documented this once before (at least), but I cannot find the issue or Bugzilla ticket...
re:
rfkelly> so, this is the verifier being overloaded by the combined TS+sync loadtest
Something to research and document going forward...
The verifier should have a configurable fallback. For Persona purposes, this fallback should be login.persona.org.
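For illustration, that could be a single config knob; the surrounding layout is an assumption:

{
  "fallback": "login.persona.org"
}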
We should record verification requests to a metrics.json file that matches the current verifier's format. This will allow us to preserve our current metrics dashboard.
docker build failed on Debian Stretch; see paste: https://paste.debian.net/1040699/
The verifier should be deployable out of the box on awsbox.
This milestone is complete when we have a hosted verifier on awsbox with a REST API that is compatible with Persona's current verifier.
We should implement a load generation tool to stress the verifier library.
Over in mozilla/fxa-auth-server#1064 we taught the fxa-auth-server how to serve multiple public keys at once, in order to support graceful key rotation. We need to teach browserid-verifier to use these multiple keys when verifying assertions.
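A minimal sketch of the idea; the function names are illustrative, not jwcrypto's actual API:

// Try each advertised public key until one verifies the assertion.
function verifyWithAnyKey(assertion, publicKeys, verifyFn) {
  for (let i = 0; i < publicKeys.length; i++) {
    try {
      return verifyFn(assertion, publicKeys[i]);  // verified claims on success
    } catch (e) {
      // Signature did not match this key; fall through to the next one.
    }
  }
  throw new Error('no advertised public key verified the assertion');
}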
Just like https://accounts.firefox.com/ver.json
I want to test that the docker container automatically builds and pushes tags. There are some changes to master since 0.3.0. Is it ok to release a 0.3.1 with the latest changes and the docker stuff?
Why do we have log names that look like this?
(Stage env)
fxa-browserid-verifier-8000-err.log
fxa-browserid-verifier-8000-out.log
fxa-browserid-verifier-8001-err.log
fxa-browserid-verifier-8001-out.log
What is the significance of 8000/8001?
I was going to rev this, but it's still on node 4. Node 4 support goes away in April 2018. Should we look at upgrading this to node 6, or is it possible that we won't be using this by then?
cc @rfk
Looks like a downstream issue with toobusy... but we should try to make this Node 0.12 (current stable) compatible. Actually, @shane-tomlinson just reported that this isn't working on Node 0.10.32 either.
Screenshot: https://cloudup.com/cOktBSWibQA
We should port all current API tests from the persona repository to ensure the same level of testing and API conformance.
This is a user-facing production service, we should have the same kinds of dependency management and sanity-checking and what-not as our other nodejs services.
/cc @pdehaan who seems to know about these things, care to take a look and tell us what we're missing in this repo?
See the following:
mozilla-services/loads#220
mozilla-services/loads#221
This requires a change to the Makefile, and maybe a change to the config/megabench.ini file...