Code Monkey home page Code Monkey logo

ping-centre's Introduction

ping-centre

Build Status Coverage Status

A client for easily collecting events and metrics.

Install

$ npm install ping-centre

Usage

Node

  1. npm install ping-centre --save to get the ping-centre dependencies
  2. Import as follows: const PingCentre = require("ping-centre");

Firefox Addon

  1. Run npm run bundle to output bundles to dist/
  2. Place the ping-centre.addon.min.js bundle in the addon directory
  3. Import as follows: const PingCentre = require("./ping-centre.addon.min");

Browser

  1. Run npm run bundle to output bundles to dist/
  2. Place the ping-centre.min.js bundle in your project directory
  3. Import as follows: <script type="text/javascript" src="ping-centre.min.js" charset="utf-8"></script>
  4. PingCentre will be available as a global

From there, the following code can be executed in any environment:

// create a ping-centre object
const pc = new PingCentre("some_topic_foo", "some_cient_id_123");

// create the payload
const payload = {
  "event_type": "ping_session"
};

// send the payload asynchronously
pc.sendPing(payload);

// validate the payload asynchronously
pc.validate(payload);

When testing your app with Ping Centre, your data will be sent to a staging server by default. To send your data to a production server, set the NODE_ENV environment variable to production.

Overview

Ping-centre consists of three main parts: the clients, the data pipeline, and the dashboard.

The clients are responsible for collecting the events and forwarding them to Onyx - the entrance of the data pipeline. Besides Onyx, the data pipeline employes a Disco cluster to run the ETL jobs, which in turn persist the outcome to AWS Redshift. Through re:dash dashboard, the user can access the data warehouse, slice and dice the datasets via SQL queries.

Behind the scenes, a ping-centre client is simply a wrapper around the HTTP POST request. Therefore, it could be implemented in any programming language. And this repo implements it in Javascript.

Topics

As ease-of-use is the primary goal of the client, the client user does not need to specify the telemetry destination, i.e. the endpoint of the Onyx. Instead, the user just specifies the topic of the payload. In fact, Onyx merely exposes a single endpoint and multiplexes all the topics onto that endpoint. The ETL task runner Infernyx will demultiplex the inputs and process each topic separately.

Schemas

For each topic, the user is going to provide a schema to describe the associated payload. As the reference of table schema in Redshift, this schema could also be used by the ETL jobs to conduct the data extraction, cleaning, and transforming.

We use joi-browser to define the schemas for the Javascript client. By convention, all schemas are saved in the schemas directory with the same name of the topics. In each schema, the user specifies following attributes in the schema for each topic:

  • Field name
  • Field modifiers
    • type
    • required or optional
    • length if applicable
    • enum values, e.g. ['click', 'search', 'delete']
    • see Joi for more details
  • Other ETL requirements are attached as comments if applicable

Here is an example:

const Joi = require("joi-browser");

const schema = Joi.object().keys({
    // a required string field with no more than 128 characters
    client_id: Joi.string().max(128).required(),
    // a required javascript timestamp with milliseconds
    received_at: Joi.date().timestamp().required(),
    // an required enum string field
    event: Joi.any().valid(['add', 'delete', 'search']).required(),
    // an optional positive integer field
    value: Joi.number().integer().positive().optional(),
}).options({allowUnknown: true});  // allow other non-specified fields

/*
 * ETL processing
 *
 * 1. Truncate the milliseconds of the 'received_at', e.g. 147743323232 -> 147743323
 * 2. Rename the 'value' field to 'latency' in the database
 * 3. Capitalize the 'event' field
 */

module.exports = schema;

ping-centre's People

Contributors

gasolin avatar jaredhirsch avatar mozilla-github-standards avatar ncloudioj avatar pdehaan avatar rlr avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

ping-centre's Issues

Update example.js?

Here's the current behavior if I try running ./example.js locally (with a couple minor tweaks):

Input:

// example.js

const topic = process.env.PING_CENTRE_TOPIC;
const PingCentre = require("./src/ping-centre");
const pingClient = new PingCentre(topic, "clientID", "https://onyx_tiles.stage.mozaws.net/v3/links/activity-stream");

pingClient.validate({
  event_type: "search",
  value: 3
});

Output:

$ PING_CENTRE_TOPIC=commonSchema node example

(node:1492) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 1): Error: Invalid payload {"event_type":"search","value":3}.

$ node --version # v6.9.5
$ npm --version # 3.10.10

Not super helpful failure message, but if I sprinkle some .then().catch() magic, things get a bit better:

Input:

const topic = process.env.PING_CENTRE_TOPIC;
const PingCentre = require("./src/ping-centre");
const pingClient = new PingCentre(topic, "clientID", "https://onyx_tiles.stage.mozaws.net/v3/links/activity-stream");

pingClient.validate({
  event_type: "search",
  value: 3
})
.then(data => console.log(data))
.catch(err => console.error(err));

Output:

$ PING_CENTRE_TOPIC=commonSchema node example

{ Error: Invalid payload {"event_type":"search","value":3}.
    at Joi.validate (/Users/pdehaan/dev/github/mozilla/ping-centre/src/ping-centre.js:28:25)
    at _class._validateWithOptions (/Users/pdehaan/dev/github/mozilla/ping-centre/node_modules/joi-browser/dist/joi-browser.js:5163:21)
    at _class.root.validate (/Users/pdehaan/dev/github/mozilla/ping-centre/node_modules/joi-browser/dist/joi-browser.js:197:24)
    at Promise (/Users/pdehaan/dev/github/mozilla/ping-centre/src/ping-centre.js:26:11)
    at PingCentre.validate (/Users/pdehaan/dev/github/mozilla/ping-centre/src/ping-centre.js:25:12)
    at Object.<anonymous> (/Users/pdehaan/dev/github/mozilla/ping-centre/example2.js:5:12)
    at Module._compile (module.js:570:32)
    at Object.Module._extensions..js (module.js:579:10)
    at Module.load (module.js:487:32)
    at tryModuleLoad (module.js:446:12)
  details: [ '"client_id" is required', '"topic" is required' ],
  payload: { event_type: 'search', value: 3 } }

So it looks like I still need to pass a client_id and topic to .validate(), even though I specified them in the constructor.

Adding those two missing props seems to work and validates as expected:

Input

const topic = process.env.PING_CENTRE_TOPIC;
const PingCentre = require("./src/ping-centre");
const pingClient = new PingCentre(topic, "clientID", "https://onyx_tiles.stage.mozaws.net/v3/links/activity-stream");

pingClient.validate({
  event_type: "search",
  value: 3,
  client_id: 'clientID',
  topic
})
.then(data => console.log(data))
.catch(err => console.error(err));

Output:

$ PING_CENTRE_TOPIC=commonSchema node example

{ event_type: 'search',
  value: 3,
  client_id: 'clientID',
  topic: 'commonSchema' }

SUCCESS! Now that my example ping validates, if I change the .validate() to .sendPing(), it implodes with another error:

Input:

const topic = process.env.PING_CENTRE_TOPIC;
const PingCentre = require("./src/ping-centre");
const pingClient = new PingCentre(topic, "clientID", "https://onyx_tiles.stage.mozaws.net/v3/links/ping-centre");

pingClient.sendPing({
  event_type: "search",
  value: 3,
  client_id: 'clientID',
  topic
})
.then(data => console.log(data))
.catch(err => console.error(err));

Output:

$ PING_CENTRE_TOPIC=commonSchema node example

Ping failure with response code: 400
Error: Ping failure with response code: 400
    at fetch.then.response (/Users/pdehaan/dev/github/mozilla/ping-centre/src/ping-centre.js:53:17)
    at process._tickCallback (internal/process/next_tick.js:103:7)

consider joi-browser instead of joi

It looks like joi-browser provides a custom build of joi that is ES5-compatible and smaller (excluding the hoek crypto library by default).

I'm not sure if it's worth switching, but might be something to consider.

Consider using navigator.sendBeacon()?

@6a68 mentioned about there is a built-in utility function navigator.sendBeacon available.

I think it's worth investigating that if it's suitable for Ping-centre. Note that this is a web api other the addon SDK, that means we could use it directly on the content side as well as the addon side through the hidden window.

Unused node-fetch dependency?

I'm guessing node-fetch is unused in package.json, since we're using isometric-fetch.

$ depcheck

Unused dependencies
* node-fetch

Unused devDependencies
* mocha

Missing dependencies
* ping-centre

The mocha and ping-centre warnings are red herrings. I think mocha gets called from a Makefile, and the confusing [self referential] ping-centre reference comes from the example.js file.

Making "payload" and "variants" work with redshift for TestPilotTest

In the testpilottest.js, both "payload" and "variants" are defined as object fields in the schema. Unfortunately, Redshift doesn't support the nested object in the schema. To work around this limitation, we could consider using following options:

  • Investigate the JSON support of Redshift at here
  • Introducing the experiment_id to track the A/B test for each Testpilot test, this is already being used by "Activity Stream", and could replace the "variants" field
  • Learn from the universal telemetry to handle this kind of payload

Consider removing the 'config' package

It looks like ping-centre is using node-config to grab the 'endpoint' config value, based on NODE_ENV.

Unfortunately, the node-config package causes problems with an ES6 build chain, because it uses conditional requires, and those are explicitly disallowed in ES6.

I'm not sure what the best option is here. Maybe another library? Maybe try to get the node-config fix landed?

SDK fetch polyfill problem

When the Firefox SDK loads code, it's loaded into an environment in which there's no window, no self, and no global. This throws off pretty much every fetch polyfill detection approach, which means ping-centre doesn't load properly.

For instance, in isomorphic-fetch, the browser-compatible version uses whatwg-fetch, which in turn assumes that the global object is available as either self or this. In the case of an SDK addon, it attaches its fetch to the wrong object, and the polyfill doesn't work.

So, this is one way to get fetch into the global context in an SDK addon:

require('chrome').Cu.importGlobalProperties(["fetch"]);

I'm working on a PR that tries the SDK approach before loading any fetch libraries. (Another approach would be to have a build step that creates an SDK-compatible ping-centre.js and a web/node compatible ping-centre.js, but I noticed you've got a Makefile in this repo, and I don't totally know how I'd do that with make.) Hopefully will have the PR out within a day or so.

Audit files in npm package

DISCLAIMER: This is for the current [email protected] on npm.

$ cat ./node_modules/ping-centre/package.json | grep _id
  "_id": "[email protected]",

Steps to reproduce:

  1. Create a new, empty node project.
  2. npm i ping-centre@latest -S
  3. tree node_modules/ping-centre

Actual results:

node_modules/ping-centre
├── LICENSE
├── Makefile
├── README.md
├── example.js
├── package.json
├── ping-centre.js
├── schemas/
│   ├── commonSchema.js
│   └── sample.js
└── test/
    ├── common.js
    └── ping-centre.js

2 directories, 10 files
$ ls -lashR node_modules/ping-centre/

   1.6K Nov 28 07:04 .eslintrc
    22B Nov 28 07:04 .npmignore
    16K Dec  1 09:55 LICENSE
    63B Nov 28 07:04 Makefile
   3.4K Dec  1 09:48 README.md
   532B Nov 28 07:04 example.js
   2.5K Feb  6 21:29 package.json
   1.5K Dec  1 09:48 ping-centre.js
   249B Nov 28 07:04 schemas/commonSchema.js
   221B Dec  1 09:48 schemas/sample.js
   212B Nov 28 07:04 test/common.js
   2.2K Dec  1 09:48 test/ping-centre.js

We could probably not publish the following to npm:

  • .eslintrc (1.6K)
  • Makefile (63B)
  • README.md (3.4K)
  • example.js (532B)
  • test/common.js (212B)
  • test/ping-centre.js (2.2K)

Meh. Save about 8-9K.

One vs. Multiple Events per Ping

I noticed that the new test pilot ping sends an array of events rather than 1 at a time. The common schema in this case does not make sense because it expects one event type per ping.

Let's have a second common schema that takes this into account and let's agree that the event type field will have a common name. Right now we've got event_type in the common schema vs. event in the test pilot schema

Consider shrinking the "raw" field of "TestPilot" pings

We've observed that the vast majority of "raw" rows in the "testpilot" table are populated by "0" in the Redshift. The raw logs showed that most of the "raw" fileds are large JSON blobs (in 10KBs). Unfortunately, the ping-centre data processing (Disco) has a hard limit (8KBs) for the key/value size, and the system inappropriately set those overflowed fields to zero values.

This bug has made the "raw" field inaccurate, and will be fixed soon.

On the other hand, I'd recommend to consider shrinking this field for following benefits:

  • To save space in Redshift, big JSON blobs will grow the database quickly
  • To improve the query performance. Despite the "raw" field isn't deemed to be frequently used, keeping it compact can result in a better query speed
  • To limit the scope of data collection, there are a lot of user specif fields in the payload. Including all of them might introduce privacy concern as a result

@6a68 Shall we consider auditing this "raw" field? We'd like to hear your thoughts.

Report all validation errors vs abort early?

Currently it looks like if some data fails validation, we abort early and only show the first ValidationError.

That means that we get some errors like the following:

Joi Handles Various Cases
Invalid payload {"topic":"sample","client_id":"47304764-f5b1-4037-81b0-425293963629"}: ValidationError: child "event_type" fails because ["event_type" is required].

Joi#validate() lets you pass an optional options object, which lets you specify abortEarly param which:

abortEarly - when true, stops validation on the first error, otherwise returns all the errors found. Defaults to true.

IIUC, we can pass {abortEarly: false} in PingCentre.validate() to get all validation errors;

Joi Handles Various Cases
Invalid payload {"topic":"sample","client_id":"e0867855-e0ee-4fe2-90f7-29924696f39b"}: ValidationError: child "event_type" fails because ["event_type" is required]. child "value" fails because ["value" is required].

Create a Promise inside of send ping and reject on parse failure

https://github.com/mozilla/ping-centre/blob/master/ping-centre.js#L33

Here we pass the data into the validator and pass a callback in to handle the validation error case however this does not block the calling scope from moving immediately on to fetch so this will not prevent us from fetching with invalid data. We need to create a Promise and then either reject or resolve on it and then in the then case move on to fetch and resolve the outer promise using the success or failure of the fetch promise.

Support custom field queries for Test Pilot Snooze Tabs experiment

The Snooze Tabs experiment defines its own Redshift schema with custom fields not found in the Test Pilot ping-centre schema, specifically, snooze_time and snooze_time_type.

Could a separate Redshift table be created that matches the documented schema? And, could the existing pings be inserted into that table?

I'm not yet sure if it would also be good to add the raw_ping field to the linked Redshift schema. Some queries might be more useful if they could be sorted by Firefox version. @lmorchard any thoughts? The raw_ping field contains the entire Telemetry ping, serialized as JSON.

Aboarding firefox-onboarding

The schema is still under discussion, we are likely going to have two topics for "firefox-onboarding", i.e. "firefox-onboarding-sessions" and "firefox-onboarding-events", respectively.

Alternative choices for joi

Since joi isn't really designed to work in the browser (see issue 528 on the joi issue tracker, for example), here are some possible alternatives. They vary quite a bit in terms of power so it depends on what you need.

I did a quick test for build sizes are with webpack, with production optimizations on:

joi-browser           162kb
yup                    77kb
js-schema              16kb
react-schema         5.49kb

Add support for batching ping events

For activity stream on iOS, we generate a bad/undesired event whenever we encounter a highlight or top site that doesn't have an image or favicon. Currently ping centre only supports sending pings one at a time causing the panel to make potentially many network calls for each bad event it encounters.

Can we add an API on the service side to accept an array of event data instead of a singular event?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.