Comments (3)

56quarters commented on July 17, 2024

Thank you for the write up! I don't use statsd myself these days so this will likely need to be implemented by the community.

Some notes on the potential implementation.

  • I would prefer that sampling was enabled or disabled at the client level, not individual methods. This way, there are no changes required to the Cadence API besides an extra method on the StatsdClientBuilder - .sample_rate(f64) or similar.
  • I'm not sure which types of metrics should be sampled: Just counters? Counters and timers? Feedback welcome here!
  • Code that implements sampling shouldn't slow down anything if not enabled.
  • If an RNG is used (kinda seems like it has to be), it should be possible to seed it in order to make tests easier to write and reason about.
  • Sampling should probably be behind a feature flag since it will introduce a dependency on the rand crate but I don't feel strongly about this.
  • There should be a limit on the values used for sample rate. I don't want the sample rate included in the metric string to end up looking like @0.100000000001 or something. Not sure how to do this without truncating the float à la println!("{:.2}", 0.01);. Maybe we can introduce some sort of SampleRate type that knows how many digits to use to display itself based on the total argument below.
    let r = SampleRate::from(samples_per, total);
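
One way the SampleRate idea above could look, as a rough sketch only. Nothing here exists in Cadence: the type, its fields, and the constructor name `new` (used instead of `from` so it can reject invalid input) are all assumptions. The rate is stored as a ratio, and the number of decimal digits printed is derived from `total`, so 1 in 10 prints as 0.1 and 1 in 100 prints as 0.01.

```rust
use std::fmt;

/// Hypothetical type, not part of Cadence: a sample rate stored as
/// "samples_per out of total" so it can print itself with bounded precision.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SampleRate {
    samples_per: u32,
    total: u32,
}

impl SampleRate {
    /// Returns None unless 0 < samples_per <= total.
    pub fn new(samples_per: u32, total: u32) -> Option<Self> {
        if samples_per == 0 || samples_per > total {
            return None;
        }
        Some(SampleRate { samples_per, total })
    }

    /// The rate as a float in (0.0, 1.0].
    pub fn as_f64(&self) -> f64 {
        f64::from(self.samples_per) / f64::from(self.total)
    }

    /// Number of digits after the decimal point: the smallest d such that
    /// 10^d >= total (total = 10 gives 1, total = 100 gives 2).
    fn digits(&self) -> usize {
        let mut digits = 0;
        let mut power: u64 = 1;
        while power < u64::from(self.total) {
            power *= 10;
            digits += 1;
        }
        digits
    }
}

impl fmt::Display for SampleRate {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Fixed precision avoids output like 0.100000000001.
        write!(f, "{:.prec$}", self.as_f64(), prec = self.digits())
    }
}
```

This works cleanly when `total` is a power of ten. For other totals (1 in 4, say) the printed value is rounded to the derived precision, so a real implementation might want to restrict `total` or pick the precision differently.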

pmembrey commented on July 17, 2024

Thanks for the feedback! 😊

  • The sampling really needs to be done at the method level, I think. For example, if I am counting new connections, I might sample that at 0.1, but for individual network reads / writes, I might want to sample at 0.01 or even 0.001. Some things you wouldn't want to sample at all, either because they happen infrequently enough or because you really want the most accurate picture possible. Libraries that support this tend to have _timing and _timing_with_sample_rate, where _timing just calls _timing_with_sample_rate with a rate of 1.0.
  • Counters and timers are the only things I've used with a sample rate. I don't think it makes sense for a Gauge.
  • Generally the sampling logic is short circuited. If the sample rate is 1.0, then it will always be sampled - there's no need to make a call to rand() or similar.
  • Great point, we'd definitely want to be able to seed that for testing
  • I think it's fine to put it behind a feature flag, but would prefer that feature was on by default. I think most people would be okay with the rand crate dependency and it gives the "least surprise". Those that don't want it and know they don't, can easily turn it off by disabling the feature flag.
  • I'm not sure if the spec defines a max sample rate. We'd definitely want to do something akin to format! to make sure we get nice sane rates. I don't think it's the total sample rate that's the problem, but more how to prevent the precision issues (at least in what we send to statsd or display)
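
To make the points above concrete, here is a minimal sketch of per-method sampling with a short circuit and a seedable RNG. It is not Cadence code: `SketchClient`, `count`, and `count_with_sample_rate` are hypothetical names, and the small xorshift64* generator is only a dependency-free stand-in for what the rand crate would provide. Lines are collected in a Vec instead of being sent over UDP.

```rust
/// Tiny seedable xorshift64* generator, used here only as a stand-in for
/// the `rand` crate so the sketch has no dependencies.
pub struct XorShift64 {
    state: u64,
}

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // The xorshift state must never be zero, so swap in a fixed constant.
        let state = if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed };
        XorShift64 { state }
    }

    /// Returns a value in [0.0, 1.0).
    pub fn next_f64(&mut self) -> f64 {
        let mut x = self.state;
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        self.state = x;
        let r = x.wrapping_mul(0x2545_F491_4F6C_DD1D);
        // Keep the top 53 bits, which is what an f64 mantissa can hold.
        (r >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Hypothetical client that records metric lines instead of sending them.
pub struct SketchClient {
    rng: XorShift64,
    pub sent: Vec<String>,
}

impl SketchClient {
    /// The seed makes sampling decisions reproducible in tests.
    pub fn new(seed: u64) -> Self {
        SketchClient { rng: XorShift64::new(seed), sent: Vec::new() }
    }

    /// Unsampled counter: delegates with a rate of 1.0.
    pub fn count(&mut self, key: &str, value: i64) {
        self.count_with_sample_rate(key, value, 1.0);
    }

    pub fn count_with_sample_rate(&mut self, key: &str, value: i64, rate: f64) {
        if rate >= 1.0 {
            // Short circuit: no RNG call and no rate suffix.
            self.sent.push(format!("{}:{}|c", key, value));
        } else if self.rng.next_f64() < rate {
            // Rate formatting is deliberately naive here; bounding its
            // precision is the separate concern discussed above.
            self.sent.push(format!("{}:{}|c|@{}", key, value, rate));
        }
    }
}
```

With this shape, a call such as `client.count_with_sample_rate("net.read", 1, 0.01)` can sit next to an unsampled `client.count("conn.new", 1)`, and two clients built with the same seed make identical sampling decisions.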

ianks commented on July 17, 2024

Very interested in this feature!