

Support being able to sample metrics

pmembrey commented on July 17, 2024

Support being able to sample metrics

from cadence.

56quarters commented on July 17, 2024

Thank you for the write-up! I don't use statsd myself these days, so this will likely need to be implemented by the community.

Some notes on the potential implementation.

1. I would prefer that sampling be enabled or disabled at the client level, not on individual methods. This way, no changes to the Cadence API are required besides an extra method on the `StatsdClientBuilder`: `.sample_rate(f64)` or similar.
2. I'm not sure which types of metrics should be sampled: just counters? Counters and timers? Feedback welcome here!
3. Code that implements sampling shouldn't slow anything down when sampling is not enabled.
4. If an RNG is used (kinda seems like it has to be), it should be possible to seed it in order to make tests easier to write and reason about.
5. Sampling should probably be behind a feature flag, since it will introduce a dependency on the rand crate, but I don't feel strongly about this.
6. There should be a limit on the values used for the sample rate. I don't want the sample rate included in the metric string to end up looking like `@0.100000000001` or something. Not sure how to do this without truncating the float à la `println!("{:.2}", 0.01);`. Maybe we can introduce some sort of `SampleRate` type that knows how many digits to use to display itself, based on the `total` argument below.
```
let r = SampleRate::from(samples_per, total);
```
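To make the `SampleRate` idea a bit more concrete, here is a rough sketch of how such a type might pick its display precision from `total`. Nothing here exists in Cadence: the `new` constructor (used instead of the two-argument `from` above), the `Option` return, and the `decimal_places` helper are all assumptions made for illustration. Rates whose `total` is not a power of ten get rounded to the chosen number of places.

```rust
use std::fmt;

/// Hypothetical type (not part of Cadence): a rate of `samples_per` out of `total`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SampleRate {
    samples_per: u32,
    total: u32,
}

impl SampleRate {
    /// Assumed constructor name; returns `None` unless 0 < samples_per <= total.
    pub fn new(samples_per: u32, total: u32) -> Option<Self> {
        if samples_per == 0 || samples_per > total {
            None
        } else {
            Some(SampleRate { samples_per, total })
        }
    }

    /// The rate as a float in the range (0.0, 1.0].
    pub fn as_f64(&self) -> f64 {
        f64::from(self.samples_per) / f64::from(self.total)
    }

    /// Decimal places to print: the number of decimal digits in `total - 1`,
    /// so a total of 10 gives 1 place, 100 gives 2, and 1000 gives 3.
    fn decimal_places(&self) -> usize {
        let mut n = self.total - 1; // total >= 1 is guaranteed by `new`
        let mut places = 0;
        while n > 0 {
            places += 1;
            n /= 10;
        }
        places
    }
}

impl fmt::Display for SampleRate {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // `{:.*}` takes the precision first, then the value to format.
        write!(f, "{:.*}", self.decimal_places(), self.as_f64())
    }
}
```

With this sketch, `SampleRate::new(1, 100)` displays as `0.01` and `SampleRate::new(1, 1)` displays as `1`, so the suffix in the metric string stays short regardless of float representation.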


pmembrey commented on July 17, 2024

Thanks for the feedback! 😊

1. The sampling really needs to be done at the method level, I think. For example, if I am counting new connections, I might sample that at 0.1, but for individual network reads / writes, I might want to sample at 0.01 or even 0.001. Some things you wouldn't want to sample at all if they happen infrequently enough, or if you really want the most accurate picture possible. Libraries that support this tend to have `_timing` and `_timing_with_sample_rate`, where `_timing` just calls `_timing_with_sample_rate` with a rate of 1.0.
2. Counters and timers are the only things I've used with a sample rate. I don't think it makes sense for a Gauge.
3. Generally the sampling logic is short-circuited. If the sample rate is 1.0, then the metric will always be sent, so there's no need to make a call to rand() or similar.
4. Great point, we'd definitely want to be able to seed that for testing.
5. I think it's fine to put it behind a feature flag, but I would prefer that the feature be on by default. I think most people would be okay with the rand crate dependency, and it gives the "least surprise". Those who don't want it, and know they don't, can easily turn it off by disabling the feature flag.
6. I'm not sure if the spec defines a max sample rate. We'd definitely want to do something akin to format! to make sure we get nice sane rates. I don't think it's the total sample rate that's the problem, but more how to prevent the precision issues (at least in what we send to statsd or display).
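The method-level approach, the short circuit at 1.0, and the seedable RNG could fit together roughly as in the sketch below. This is only an illustration under stated assumptions: `SampledClient`, `Sampler`, `XorShift64`, and `format_count` are hypothetical names, not Cadence APIs, and the small xorshift generator is a stand-in for a seedable RNG from the rand crate so that the example needs only the standard library. Instead of writing to a socket, the methods return the line that would be sent, or `None` when the sample is dropped.

```rust
/// Tiny xorshift64* generator; a stand-in for a seedable RNG from the `rand` crate.
pub struct XorShift64 {
    state: u64,
}

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // The state must never be zero, so substitute a fixed constant.
        XorShift64 { state: if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed } }
    }

    /// Next pseudo-random value in [0.0, 1.0).
    pub fn next_f64(&mut self) -> f64 {
        let mut x = self.state;
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        self.state = x;
        let r = x.wrapping_mul(0x2545_F491_4F6C_DD1D);
        // Keep the top 53 bits so the result fits an f64 mantissa exactly.
        (r >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Hypothetical helper that decides whether a given metric is sent.
pub struct Sampler {
    rng: XorShift64,
}

impl Sampler {
    pub fn with_seed(seed: u64) -> Self {
        Sampler { rng: XorShift64::new(seed) }
    }

    pub fn should_send(&mut self, rate: f64) -> bool {
        if rate >= 1.0 {
            return true; // short circuit: the RNG is never touched
        }
        if rate <= 0.0 {
            return false;
        }
        self.rng.next_f64() < rate
    }
}

/// Formats a statsd counter line, appending `|@rate` only when sampling applies.
pub fn format_count(key: &str, value: i64, rate: f64) -> String {
    if rate >= 1.0 {
        format!("{}:{}|c", key, value)
    } else {
        format!("{}:{}|c|@{}", key, value, rate)
    }
}

/// Hypothetical client, not the Cadence API.
pub struct SampledClient {
    sampler: Sampler,
}

impl SampledClient {
    pub fn with_seed(seed: u64) -> Self {
        SampledClient { sampler: Sampler::with_seed(seed) }
    }

    /// Unsampled variant: delegates with a rate of 1.0.
    pub fn count(&mut self, key: &str, value: i64) -> Option<String> {
        self.count_with_sample_rate(key, value, 1.0)
    }

    /// Returns the line to send, or `None` if this sample was dropped.
    pub fn count_with_sample_rate(&mut self, key: &str, value: i64, rate: f64) -> Option<String> {
        if self.sampler.should_send(rate) {
            Some(format_count(key, value, rate))
        } else {
            None
        }
    }
}
```

Two clients built with the same seed make identical keep/drop decisions, which is what makes tests easy to reason about. Note that `format_count` prints the raw `f64`, so a rate computed with float arithmetic could still show the precision problem discussed above; a dedicated rate type or a fixed-precision format would be needed to rule that out.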


ianks commented on July 17, 2024

Very interested in this feature!
