
semantic-metrics


This project contains modifications to the dropwizard metrics project.

The primary addition is a replacement for MetricRegistry that allows metric names to contain tags through MetricId.

Usage

The following are the interfaces and classes that have to be used from this package in order for MetricId to be used.

You will find these types in com.spotify.metrics.core.

  • SemanticMetricRegistry — Replacement for MetricRegistry.
  • MetricId — Replacement for string-based metric names.
  • SemanticMetricFilter — Replacement for MetricFilter.
  • SemanticMetricRegistryListener — Replacement for MetricRegistryListener.
  • SemanticMetricSet — Replacement for MetricSet.

Care must be taken not to use the upstream MetricRegistry, because it does not support the use of MetricId. To make this easier, all of the replacement classes follow the Semantic* naming convention.

As a consequence, pre-existing plugins for Codahale metrics will not work.
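
For example, here is a minimal sketch of creating and using a tagged metric with the replacement registry (the tag names and values are illustrative; reporter setup, shown later in this README, is omitted):

import com.codahale.metrics.Meter;
import com.spotify.metrics.core.MetricId;
import com.spotify.metrics.core.SemanticMetricRegistry;

public class QuickStart {
    public static void main(String[] args) {
        final SemanticMetricRegistry registry = new SemanticMetricRegistry();

        // A MetricId is a set of tags rather than a dotted string name.
        final MetricId requests = MetricId.build()
            .tagged("what", "incoming-requests", "unit", "request");

        // Metrics are created (or looked up) on the registry by MetricId.
        final Meter meter = registry.meter(requests);
        meter.mark();
    }
}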

Installation

Add a dependency in Maven:

<dependency>
  <groupId>com.spotify.metrics</groupId>
  <artifactId>semantic-metrics-core</artifactId>
  <version>${semantic-metrics.version}</version>
</dependency>

Provided Plugins

This project provides a set of plugins.

See and run the examples.

Considerations

MetricIdCache

If you find yourself in a situation where you create many instances of MetricId (e.g. when reporting metrics) and profiling or benchmarks show a significant amount of time spent constructing MetricId instances, consider making use of a MetricIdCache.

The following is an example integrating with Guava.

// GuavaCache.java

public final class GuavaCache<T> implements MetricIdCache.Cache<T> {
    final Cache<T, MetricId> cache = CacheBuilder.newBuilder().expireAfterAccess(6, TimeUnit.HOURS)
            .build();

    private final MetricIdCache.Loader<T> loader;

    public GuavaCache(Loader<T> loader) {
        this.loader = loader;
    }

    @Override
    public MetricId get(final MetricId base, final T key) throws ExecutionException {
        return cache.get(key, new Callable<MetricId>() {
            @Override
            public MetricId call() throws Exception {
                return loader.load(base, key);
            }
        });
    }

    @Override
    public void invalidate(T key) {
        cache.invalidate(key);
    }

    @Override
    public void invalidateAll() {
        cache.invalidateAll();
    }

    public static MetricIdCache.Any setup() {
        return MetricIdCache.builder().cacheBuilder(new MetricIdCache.CacheBuilder() {
            @Override
            public <T> MetricIdCache.Cache<T> build(final Loader<T> loader) {
                return new GuavaCache<T>(loader);
            }
        });
    }
}
// MyApplicationStatistics.java

public class MyApplicationStatistics {
    private final MetricIdCache.Typed<String> endpoint = GuavaCache.setup()
        .loader(new MetricIdCache.Loader<String>() {
            @Override
            public MetricId load(MetricId base, String endpoint) {
                return base.tagged("endpoint", endpoint);
            }
        });

    private final MetricIdCache<String> requests = endpoint
        .metricId(MetricId.build().tagged("what", "endpoint-requests", "unit", "request"))
        .build();

    private final MetricIdCache<String> errors = endpoint
        .metricId(MetricId.build().tagged("what", "endpoint-errors", "unit", "error"))
        .build();

    private final SemanticMetricRegistry registry;

    public MyApplicationStatistics(SemanticMetricRegistry registry) {
        this.registry = registry;
    }

    public void reportRequest(String endpoint) {
        registry.meter(requests.get(endpoint)).mark();
    }

    public void reportError(String endpoint) {
        registry.meter(errors.get(endpoint)).mark();
    }
}

Don't assume that semantic-metrics will be around forever

Avoid performing deep integration of semantic-metrics into your library or application. Deep integration will prevent you, and third parties, from integrating your code with different metric collectors.

As an alternative you should build a tree of interfaces that your application uses to report metrics (e.g. my-service-statistics), and use these to build an implementation using semantic metrics (my-service-semantic-statistics).

This pattern greatly simplifies integrating your application with more than one metric collector, or ditching semantic-metrics when it becomes superseded by something better.

At configuration time your application can decide which implementation to use by simply providing an instance of the statistics API that suits its requirements.

Example

Build an interface describing all the things that your application reports.

public interface MyApplicationStatistics {
    /**
     * Report that a single request has been received by the application.
     */
    void reportRequest();
}

Provide a semantic-metrics implementation.

public class SemanticMyApplicationStatistics implements MyApplicationStatistics {
    private final SemanticMetricRegistry registry;

    private final Meter request;

    public SemanticMyApplicationStatistics(SemanticMetricRegistry registry) {
        this.registry = registry;
        this.request = registry.meter(MetricId.build().tagged(
            "what", "requests", "unit", "request"));
    }

    @Override
    public void reportRequest() {
        request.mark();
    }
}

Now a user of your framework/application can do something like the following to bootstrap your application.

public class Entry {
    public static void main(String[] argv) {
        final SemanticMetricRegistry registry = new SemanticMetricRegistry();
        final MyApplicationStatistics statistics = new SemanticMyApplicationStatistics(registry);
        /* your application */
        final MyApplication app = MyApplication.builder().statistics(statistics).build();

        final FastForwardReporter reporter = FastForwardReporter.forRegistry(registry).build();

        reporter.start();
        app.start();

        app.join();
        reporter.stopWithFlush();
        System.exit(0);
    }
}

Metric Types

There are different metric types that can be used depending on what we want to measure, e.g. queue length or request time.

Gauge

A gauge is an instantaneous measurement of a value, for example the number of pending jobs in a queue.

registry.register(metric.tagged("what", "job-queue-length"), new Gauge<Integer>() {
    @Override
    public Integer getValue() {
        // fetch the queue length the way you like
        final int queueLength = 10;
        // obviously this is gonna keep reporting 10, but you know ;)
        return queueLength;
    }
});

In addition to the tags that are specified (e.g., "what" in this example), FfwdReporter adds the following tags to each Gauge data point:

  • metric_type: gauge

Counter

A counter is just a gauge for an AtomicLong instance. You can increment or decrement its value.

For example, we want a more efficient way of measuring the number of pending jobs in a queue.

final Counter counter = registry.counter(metric.tagged("what", "job-count"));
// Somewhere in your code where you are adding new jobs to the queue you increment the counter as well
counter.inc();
// Somewhere in your code the job is going to be removed from the queue you decrement the counter
counter.dec();

In addition to the tags that are specified (e.g., "what" in this example), FfwdReporter adds the following tags to each Counter data point:

  • metric_type: counter

Meter

A meter measures the rate of events over time (e.g., "requests per second"). In addition to the mean rate, meters also track 1- and 5-minute moving averages.

For example, we have an endpoint and want to measure how frequently it receives requests.

Meter meter = registry.meter(metric.tagged("what", "incoming-requests").tagged("endpoint", "/v1/list"));
// Now a request comes and it's time to mark the meter
meter.mark();

In addition to the tags that are specified (e.g., "what" and "endpoint" in this example), FfwdReporter adds the following tags to each Meter data point:

  • metric_type: meter
  • unit: <unit>/s, where <unit> is what was originally specified as the "unit" tag during declaration. If missing, the value is set to "n/s". For example, if you originally specify .tagged("unit", "request") on a Meter, FfwdReporter emits Meter data points with "unit":"request/s".
  • stat: 1m or 5m, the size of the time bucket of the calculated moving average of this data point (1 minute or 5 minutes).

NOTE: Meter also reports the meter counter value to allow platforms to derive rates using the monotonically increasing count instead of only aggregating the rate computed by the meter itself. It is useful for applications to be able to report both count and rate using a meter.

Deriving Meter

A deriving meter takes the derivative of a value that is expected to be monotonically increasing.

A typical use case is to get the rate of change of a counter of the total number of events.

This implementation ignores updates that decrease the counter value. The rationale is that the counter is expected to be monotonically increasing between infrequent resets (when a process has been restarted, for example). Thus, negative values should only happen on restart and should be safe to discard.

DerivingMeter derivingMeter = registry.derivingMeter(metric.tagged("what", "incoming-requests").tagged("endpoint", "/v1/list"));
derivingMeter.mark();

In addition to the tags that are specified (e.g., "what" and "endpoint" in this example), FfwdReporter adds the following tags to each DerivingMeter data point:

  • metric_type: deriving_meter
  • unit: <unit>/s, where <unit> is set to what is specified during declaration. For example, if you specify .tagged("unit", "request") on a DerivingMeter, FfwdReporter emits DerivingMeter data points with "unit":"request/s". Default: "n/s".
  • stat: 1m or 5m, the size of the time bucket of the calculated moving average of this data point (1 minute or 5 minutes).

Histogram

A histogram measures the statistical distribution of values in a stream of data. It measures the minimum, maximum, mean, median, and standard deviation, as well as the 75th and 99th percentiles.

For example this histogram will measure the size of responses in bytes.

Histogram histogram = registry.histogram(metric.tagged("what", "response-size").tagged("endpoint", "/v1/content"));
// fetch the size of the response
final long responseSize = getResponseSize(response);
histogram.update(responseSize);

In addition to the tags that are specified (e.g., "what" and "endpoint" in this example), FfwdReporter adds the following tags to each Histogram data point:

  • metric_type: histogram
  • stat: min, max, mean, median, stddev, p75, p99
      • min: the lowest value in the snapshot
      • max: the highest value in the snapshot
      • mean: the arithmetic mean of the values in the snapshot
      • median: the median value in the distribution
      • stddev: the standard deviation of the values in the snapshot
      • p75: the value at the 75th percentile in the distribution
      • p99: the value at the 99th percentile in the distribution

Note that added custom percentiles will show up in the stat tag.

Histogram with TTL

HistogramWithTtl changes the behavior of the default Codahale histogram when the update rate is low. If the update rate goes below a certain threshold for a certain time, all samples received during that time are used instead of the random sample used by the default histogram implementation. When update rates are above the threshold, the default implementation is used.

What problem does it solve?

The default histogram implementation uses a random sampling algorithm with exponentially decaying probabilities over time. This works well if update rates are approximately 10 requests per second or above. When rates go below that, the metrics, especially p99 and above, tend to flatline because the values are not replaced often enough. We solve this by using a different implementation whenever the update rate goes below 10 RPS. This gives much more dynamic percentile measurements during low update rates. When update rates go above the threshold we switch to the default implementation.

This was authored by Johan Buratti.
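
A minimal sketch of wiring this up by hand, assuming the ReservoirWithTtl class referenced elsewhere in this project can be constructed with defaults (the no-argument constructor used here is an assumption; check the actual constructors and any SemanticMetricBuilder support before relying on this):

import com.codahale.metrics.Histogram;
import com.spotify.metrics.core.MetricId;
import com.spotify.metrics.core.ReservoirWithTtl;
import com.spotify.metrics.core.SemanticMetricRegistry;

public class TtlHistogramExample {
    public static void main(String[] args) {
        final SemanticMetricRegistry registry = new SemanticMetricRegistry();
        final MetricId metric = MetricId.build().tagged("what", "response-size");

        // Wrap the TTL-aware reservoir in a plain codahale Histogram and register it.
        // ReservoirWithTtl() with no arguments is assumed here for brevity.
        final Histogram histogram = new Histogram(new ReservoirWithTtl());
        registry.register(metric, histogram);

        histogram.update(123);
    }
}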

Distribution [DO NOT USE]

Distributions are no longer supported. The code to create them and the Heroic code to query them still exists, however they are being retired and no further adoption should occur.

Heroic is being retired in favor of open-source alternatives, and this distribution implementation will not be portable to the future TSDB/query interface. Since only a few services with a few metrics had experimented with distributions, the choice was made to halt adoption now, to reduce the pain of conversion to a proper histogram later.

For historical reference only

DO NOT USE

Distribution is a simple interface that allows users to record measurements to compute rank statistics on data distribution, not just a local source.

Every implementation should produce a serialized data sketch in a ByteBuffer as the metric point value.

Unlike traditional histograms, distribution doesn't require a predefined percentile value. Data recorded can be used upstream to compute any percentile.

Distribution doesn't require any binning configuration. Just get an instance through SemanticMetricBuilder and record data.

Distribution is a good choice if you care about percentile accuracy in a distributed environment and you want to rely on P99 to set SLOs.

For example, this distribution will measure the size of messages in bytes.

Distribution distribution = registry.distribution(metric.tagged("what", "distribution-message-size", "unit", Units.BYTE));
// fetch the size of the message
int size = getMessageSize(response);
distribution.record(size);

In addition to the tags that are specified (e.g., "what" and "unit" in this example), FfwdReporter adds the following tags to each Distribution data point:

  • metric_type: distribution
  • tdigeststat: P50, P75, P99
      • P50: the value at the 50th percentile in the distribution
      • P75: the value at the 75th percentile in the distribution
      • P99: the value at the 99th percentile in the distribution

What problem does it solve?

  • Accurate Aggregated Histogram Data

This can record data and send data sketches. A sketch of a dataset is a small data structure that lets you approximate certain characteristics of the original dataset. Sketches are used to compute rank based statistics such as percentile. Sketches are mergeable and can be used to compute any percentile on the entire data distribution.

  • Support Sophisticated Data-point Values

With distributions we are able to support sophisticated data point values, such as the OpenCensus metric distribution.

Authored by Adele Okoubo.

Timer

A timer measures both the rate that a particular piece of code is called and the distribution of its duration.

For example we want to measure the rate and handling duration of incoming requests.

Timer timer = registry.timer(metric.tagged("what", "incoming-request-time").tagged("endpoint", "/v1/get_stuff"));
// Do this before starting to do the thing. This creates a measurement context object that you can pass around.
final Context context = timer.time();
doStuff();
// Tell the context that it's done. This will register the duration and counts one occurrence.
context.stop();

In addition to the tags that are specified (e.g., "what" and "endpoint" in this example), FfwdReporter adds the following tags to each Timer data point:

  • metric_type: timer
  • unit: ns

NOTE: Timer is really just a combination of a Histogram and a Meter, so apart from the tags above, a combination of both Histogram and Meter tags will be included.

Why Semantic Metrics?

When dealing with thousands of similar timeseries over thousands of hosts, classification becomes a big issue.

Classical systems organize metric names as strings, containing a lot of information about the metric in question.

You will often see things like webserver.host.example.com.df.used./.

The same metric expressed as a set of tags could look like this:

{"role": "webserver", "host": "host.example.com", "what": "disk-used",
 "mountpoint": "/"}

This system of classification at the host machine greatly simplifies any metrics pipeline. When transported with a stable serialization method (like JSON), it does not matter if we add additional tags or decide to change the order in which the timeseries happens to be designated.

We can also easily index this timeseries by its tag using a system like ElasticSearch and ask it interesting questions about which timeseries are available.

If used with a metrics backend that supports efficient aggregation and filtering across tags, you gain a flexible and intuitive pipeline that is powerful and agnostic about what it sends, all the way from the service being monitored to your metrics GUI.

Contributing

This project adheres to the Open Code of Conduct. By participating, you are expected to honor this code.

  1. Fork semantic-metrics from GitHub and clone your fork.
  2. Hack.
  3. Push the branch back to GitHub.
  4. Send a pull request to our upstream repo.

Releasing

Releasing is done via the maven-release-plugin and nexus-staging-plugin which are configured via the release profile. Deploys are staged in oss.sonatype.org before being deployed to Maven Central. Check out the maven-release-plugin docs and the nexus-staging-plugin docs for more information.

To release, first run:

mvn -P release release:prepare

You will be prompted for the release version and the next development version. On success, follow with:

mvn -P release release:perform

When you have finished these steps, please "Draft a new release" on GitHub and list the included PRs (aside from changes to documentation).


semantic-metrics's Issues

Split packages between core and api modules

I'm trying to use this library in a project that uses JPMS, but run into issues because the package com.spotify.metrics.core is split between two modules, core and api.

[INFO] -------------------------------------------------------------
[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] error: the unnamed module reads package com.spotify.metrics.core from both semantic.metrics.api and semantic.metrics.core
[ERROR] error: module jsr305 reads package com.spotify.metrics.core from both semantic.metrics.api and semantic.metrics.core
[ERROR] error: module auto.value.annotations reads package com.spotify.metrics.core from both semantic.metrics.api and semantic.metrics.core
[ERROR] error: module awaitility reads package com.spotify.metrics.core from both semantic.metrics.api and semantic.metrics.core
[ERROR] error: module slf4j.api reads package com.spotify.metrics.core from both semantic.metrics.api and semantic.metrics.core
[ERROR] error: module com.google.common reads package com.spotify.metrics.core from both semantic.metrics.api and semantic.metrics.core
[ERROR] error: module okio reads package com.spotify.metrics.core from both semantic.metrics.api and semantic.metrics.core
[ERROR] error: module protobuf.java reads package com.spotify.metrics.core from both semantic.metrics.api and semantic.metrics.core
[ERROR] error: module metrics.core reads package com.spotify.metrics.core from both semantic.metrics.api and semantic.metrics.core
[ERROR] error: module semantic.metrics.api reads package com.spotify.metrics.core from both semantic.metrics.api and semantic.metrics.core
[ERROR] error: module com.spotify.apollo.hermes reads package com.spotify.metrics.core from both semantic.metrics.api and semantic.metrics.core
[ERROR] error: module com.spotify.hermes reads package com.spotify.metrics.core from both semantic.metrics.api and semantic.metrics.core
[ERROR] error: module semantic.metrics.core reads package com.spotify.metrics.core from both semantic.metrics.api and semantic.metrics.core
[ERROR] error: module com.spotify.apollo reads package com.spotify.metrics.core from both semantic.metrics.api and semantic.metrics.core
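
For illustration, a sketch of one way a consumer module-info.java can hit this; the module names are the automatic module names from the error output, and both of them contain the package com.spotify.metrics.core:

// module-info.java (illustrative only)
module my.application {
    // Requiring both automatic modules (directly or transitively) is rejected,
    // because each of them contains the package com.spotify.metrics.core.
    requires semantic.metrics.api;
    requires semantic.metrics.core;
}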

Race condition in LockFreeExponentiallyDecayingReservoir

While using the version 1.0.7 in a backend, we noticed this error crop up

java.util.NoSuchElementException: null
at java.util.concurrent.ConcurrentSkipListMap.firstKey (ConcurrentSkipListMap.java:1858)
at com.spotify.metrics.core.LockFreeExponentiallyDecayingReservoir$State.update (LockFreeExponentiallyDecayingReservoir.java:207)
at com.spotify.metrics.core.LockFreeExponentiallyDecayingReservoir$State.update (LockFreeExponentiallyDecayingReservoir.java:198)
at com.spotify.metrics.core.LockFreeExponentiallyDecayingReservoir$State.access$200 (LockFreeExponentiallyDecayingReservoir.java:160)
at com.spotify.metrics.core.LockFreeExponentiallyDecayingReservoir.update (LockFreeExponentiallyDecayingReservoir.java:100)
at com.spotify.metrics.core.LockFreeExponentiallyDecayingReservoir.update (LockFreeExponentiallyDecayingReservoir.java:88)
at com.spotify.metrics.core.ReservoirWithTtl.update (ReservoirWithTtl.java:123)
at com.codahale.metrics.Histogram.update (Histogram.java:41)

It looks like this can occur if the State's ConcurrentSkipListMap is reset via a backfill in one thread at the same time as an update on that state happens in another thread. Granted, it's a small window, but it looks to be possible with the current implementation.

Upgrade the metrics libraries

The metrics libraries (com.codahale.metrics:metrics-core:3.0.2 and com.codahale.metrics:metrics-jvm:3.0.2) are more than 4 years old and are not maintained anymore. Also, their groupId has changed since then (it is now set to io.dropwizard.metrics). For these reasons, I think they must be upgraded to more recent versions. I will create a PR with the required changes.
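
For reference, a dependency on the newer coordinates would look roughly like this in a pom.xml (the version property is a placeholder):

<dependency>
  <groupId>io.dropwizard.metrics</groupId>
  <artifactId>metrics-core</artifactId>
  <version>${dropwizard-metrics.version}</version>
</dependency>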

histogram percentiles

does histogram support p90, p95, p98? Do I have to initialize the Histogram a certain way to activate these additional percentiles?

  • doc says "In addition to minimum, maximum, mean, etc., it also measures median, 75th, 90th, 95th, 98th, 99th, and 99.9th percentiles."
  • but then doc says: "min, max, mean, median, stddev, p75, p99"

Distribution Documentation

We recently refactored semantic-metrics to support Distributions. We need to update the documentation to reflect these changes.

DOD:

  • Update semantic-metrics documentation
  • Document Distribution Implementation

FastForwardReporter appends the 'prefix' twice to the `count` metric of Meters

I am using FastForwardReporter v1.1.1 via Apollo's metrics module (https://github.com/spotify/apollo/blob/1.x/modules/metrics/src/main/java/com/spotify/apollo/metrics/MetricsModule.java), which builds a reporter with prefix of MetricId.build("apollo") (plus some tags).

If I run ffwd locally to see what metrics are sent by a locally-running service to ffwd, I see that some metrics are sent with key=apollo and some are sent with key=apollo.apollo, and I think the latter is caused by logic in FastForwardReporter itself.

# launch ffwd
$ docker run --name=local-ffwd --rm -d -p 19091:19091/udp -p 19000:19000 -p 8080:8080 spotify/ffwd:latest
# check the logs
$ docker logs local-ffwd 2>&1 | grep "what=rpc-started"
16:36:57.930 [ffwd-scheduler-0] INFO  com.spotify.ffwd.debug.DebugPluginSink - M#141: Metric(key=apollo, value=Value.DoubleValue(value=0.1433062621147579), timestamp=1612802212648, tags={grpc-type=unary, target-service=remote-config-resolver, stat=1m, grpc-service=grpc.health.v1.Health, service-framework-version=1.17.4, grpc-method=Check, service-framework=apollo, unit=rpc/s, what=rpc-started, application=salem-api, grpc-component=client, metric_type=meter, host=82820c54a43e}, resource={})
16:36:57.930 [ffwd-scheduler-0] INFO  com.spotify.ffwd.debug.DebugPluginSink - M#142: Metric(key=apollo, value=Value.DoubleValue(value=0.18710139700632353), timestamp=1612802212649, tags={grpc-type=unary, target-service=remote-config-resolver, stat=5m, grpc-service=grpc.health.v1.Health, service-framework-version=1.17.4, grpc-method=Check, service-framework=apollo, unit=rpc/s, what=rpc-started, application=salem-api, grpc-component=client, metric_type=meter, host=82820c54a43e}, resource={})
16:36:57.930 [ffwd-scheduler-0] INFO  com.spotify.ffwd.debug.DebugPluginSink - M#143: Metric(key=apollo.apollo, value=Value.DoubleValue(value=1.0), timestamp=1612802212652, tags={grpc-type=unary, target-service=remote-config-resolver, unit=rpc, what=rpc-started, application=salem-api, grpc-service=grpc.health.v1.Health, service-framework-version=1.17.4, grpc-component=client, metric_type=counter, host=82820c54a43e, grpc-method=Check, service-framework=apollo}, resource={})

Note that the same what is reported under key=apollo and key=apollo.apollo, while the application code does not set any key at all for this MetricId.

I think I know what causes this:

On each tick when report(..) is called, it will iterate over all gauges, counters, histograms, meters, timers, etc:

reportMetered(MetricId, Meter) will overwrite the MetricId argument in order to prepend this.prefix to the supplied MetricId, and then translate the MetricId into a com.spotify.ffwd.Metric instance:

private void reportMetered(MetricId key, Meter value) {
    key = MetricId.join(prefix, key);
    final Metric m = FastForward
        .metric(key.getKey())
        .attributes(key.getTags())
        .attribute(METRIC_TYPE, "meter");
    reportMetered(m, value);
    reportCounter(key, value);
}

since a Meter is also a Counter, the last line of reportMetered(MetricId, Meter) calls reportCounter(MetricId, Meter), but with the overwritten MetricId (which had this.prefix prepended onto it), which repeats the same prepend operation before translating the MetricId into a com.spotify.ffwd.Metric instance:

private void reportCounter(MetricId key, Counting value) {
    key = MetricId.join(prefix, key);
    final Metric m = FastForward
        .metric(key.getKey())
        .attributes(key.getTags())
        .attribute(METRIC_TYPE, "counter");

This will result in the FastForwardReporter's prefix field being prepended twice into the metric's ID. reportTimer has the same behavior (update: I was wrong on this).

An easy fix here would be for reportMeter and reportTimer to pass the original MetricId, not the overwritten one, into reportCounter.
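
A minimal sketch of that fix, based on the snippet above rather than the actual patch: build the ffwd Metric from a prefixed copy of the id and pass the original, un-prefixed id on to reportCounter.

private void reportMetered(MetricId key, Meter value) {
    // Keep the prefixed id local instead of overwriting the argument.
    final MetricId prefixed = MetricId.join(prefix, key);
    final Metric m = FastForward
        .metric(prefixed.getKey())
        .attributes(prefixed.getTags())
        .attribute(METRIC_TYPE, "meter");
    reportMetered(m, value);
    // reportCounter applies the prefix itself, so it gets the original id.
    reportCounter(key, value);
}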

Add Distribution Support to Semantic Metric FastForward http Reporter

Heroic histogram data is currently computed locally.
It is practically impossible to aggregate percentile.
We are adding distribution to heroic to address that issue. Distribution will create data sketch that will be used upstream to compute percentile on the entire data distribution.

This task will add distribution support to FastForwardHttpReporter reporter.

DoD:
FastForwardHttpReporter should be able to report distribution metrics.

Dependencies:
1. Distribution support in JSON Metric Model

FastForwardReporter reports metered stats for timers with unit ns/s

For each Timer, FastForwardReporter will report the histogram stats with unit ns, which is correct, but also the metered stats 1m and 5m with the unit ns/s, which is not.

See: https://github.com/spotify/semantic-metrics/blob/master/ffwd-reporter/src/main/java/com/spotify/metrics/ffwd/FastForwardReporter.java#L284

If the metered stats should be reported for timers, they should have a proper unit. A reason to omit reporting metered stats for timers is that it makes it possible to define the MetricId more precisely.

Add Distribution Support to SemanticAggregatorMetric

Heroic histogram data is currently computed locally.
It is practically impossible to aggregate percentile.
We are adding distribution to heroic to address that issue. Distribution will create data sketch that will be used upstream to compute percentile on the entire data distribution.

This task will add distribution support to Semantic Metric Aggregator.

DoD: Ensure that distribution is supported in remote metric.

Code Pointer:
https://github.com/spotify/semantic-metrics/blob/master/remote/src/main/java/com/spotify/metrics/remote/SemanticAggregatorMetricRegistry.java

(Discuss) Abstract the FastForward client used in FastForwardReporter

public Builder fastForward(FastForward client) {
  this.client = client;
  return this;
}

The current setup for FastForwardReporter requires a user to pass in a real FastForward client, which is not easily unit testable -- in a production environment the FastForward client would be already running within the container, but in a local unit testing environment this can be pretty cumbersome to set up.

It would be simpler if we had some kind of interface or abstract class to allow the user to sub in different implementations of the FastForward client -- the only method it would have to implement is public void send(Metric metric) { }.

Quick code sketch of what I'm thinking:

public abstract class FastForwardClient {
  public abstract void send(Metric metric) throws IOException;

  // Default implementation
  public static FastForwardClient create(String host, int port) throws UnknownHostException, SocketException {
    return new FastForwardClient() {
      private final FastForward underlying = FastForward.setup(host, port);
      
      @Override
      public void send(Metric metric) throws IOException {
        underlying.send(metric);
      }
    };
  }
}

Then the FastForwardReporter's builder would look like:

public Builder fastForward(FastForwardClient client) {
  this.client = client;
  return this;
}

public FastForwardReporter build() throws IOException {
  final FastForwardClient client = this.client != null ? this.client : FastForwardClient.create(host, port);
  ...
}

And in a unit-testing context we could easily sub in something like this that's easy to validate:

public class StubbedFastForwardClient extends FastForwardClient {
  private final List<Metric> collectedMetrics;

  public StubbedFastForwardClient(List<Metric> collectedMetrics) {
    this.collectedMetrics = collectedMetrics;
  }

  @Override
  public void send(Metric metric) throws IOException {
    collectedMetrics.add(metric);
  }

  public List<Metric> getCollectedMetrics() {
    return collectedMetrics;
  }
}

lmk what you think! I'm happy to make the PR myself if this seems reasonable.

(Not sure if this issue belongs more in ffwd-client-java... it would probably be better to fix it there, but it's also a bigger/more impactful change to make.)

Move builds to Circle CI

Spotify is slowly moving over to Circle CI as the primary open source build tool. We should see if it makes sense for semantic-metrics, and if so migrate over.

SemanticMetricRegistry API docs incorrect

Methods like histogram, meter, etc., are described like "Creates a new {@link Histogram} and registers it under the given name." This is incorrect, I believe, because if a Histogram already exists with the same ID, the existing one will be returned.
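
For illustration, a small sketch of the behavior described here (not taken from the project's tests); the second call returns the already-registered instance rather than creating a new one:

final SemanticMetricRegistry registry = new SemanticMetricRegistry();
final MetricId id = MetricId.build().tagged("what", "response-size");

final Histogram first = registry.histogram(id);
final Histogram second = registry.histogram(id);

// Despite the "creates a new Histogram" wording in the Javadoc,
// both calls yield the same Histogram instance.
assert first == second;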

Add Distribution Support to Semantic-metrics.

We are adding distribution support to Heroic so we can provide stat on actual data distribution as oppose to local source stat.
Semantic Metrics is an adaptation of com.codahale.metrics so we are essentially extending that library.
We will be adding these two interfaces.

public interface Distribution extends Metric, Counting {

    /**
     * Record value from Min.Double to Max.Double.
     * @param val
     */
    void record(double val);

    /**
     * Return serialized distribution and flush.
     * @return
     */
    ByteBuffer getValueAndFlush();

}


public interface MetricRegistryListener extends com.codahale.metrics.MetricRegistryListener { /* add code with distribution support */ }

The actual implementation of Distribution will be a Tdigest wrapper.
