Comments (5)

heuristicus commented on May 27, 2024

This sounds like a good idea, but is there any particular advantage to working with /diagnostics_agg rather than /diagnostics? I guess the main thing would be fewer callbacks, and the filtering that you define through the analyzers would be applied so you only see the diagnostics you're interested in.

At the same time, it seems to me that the main purpose of /diagnostics_agg right now is to provide information to a GUI, which is why the update rate is slow.

I guess what I'm wondering about is whether the purpose of the aggregator is to provide the most up-to-date aggregated diagnostics at any point in time, or just periodic snapshots.

from diagnostics.

mikepurvis commented on May 27, 2024

The main consumer of diagnostics_agg is the GUI, but I think its purpose is to supply a single diagnostic snapshot which doesn't require opening a socket to every node which produces diagnostics.

Even for the GUI case though, IMO a new ERROR case should be made visible as soon as possible, rather than waiting as much as a second for the next report to arrive.

heuristicus commented on May 27, 2024

Definitely agree with making errors visible ASAP.

What about changes to the messages attached to the diagnostics? Does it matter if the cause of the error is different, but the level is still the same?

For example:

level: 2
name: ''
message: something went wrong...
hardware_id: ''
values: 
  - 
    key: ''
    value: ''
---
level: 2
name: ''
message: something else went wrong...
hardware_id: ''
values: 
  - 
    key: ''
    value: ''
---

Should both of these messages cause an instant update, or just the first one? Does that also apply to when the level is OK?

From my perspective, I don't think it makes sense to update instantly on changed messages in the OK or WARN levels, but it might if the diagnostic is in the ERROR level. It's a pretty small change to make though, in terms of implementation - just a check on an additional field of the message. Doing this might cause updates to be too frequent, for example if your message contains some floating point values which change a lot.
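
A minimal sketch of that rule in plain Python, with no ROS dependency. The `Status` tuple and the function name are hypothetical; the level numbers match diagnostic_msgs/DiagnosticStatus (OK=0, WARN=1, ERROR=2). Treating the first report of an item as a change is an assumption on my part, not something decided in this thread.

```python
from typing import NamedTuple, Optional

# Numeric levels as defined in diagnostic_msgs/DiagnosticStatus.
OK, WARN, ERROR = 0, 1, 2


class Status(NamedTuple):
    """Hypothetical stand-in for the two fields being compared."""
    level: int
    message: str


def needs_immediate_update(prev: Optional[Status], curr: Status) -> bool:
    """Decide whether a new status should trigger an immediate publish."""
    if prev is None:
        # Assumption: the first report of an item counts as a change.
        return True
    if curr.level != prev.level:
        # Any level transition is published right away.
        return True
    # Same level: only a changed message at ERROR triggers an update,
    # so noisy OK/WARN messages do not cause extra publishes.
    return curr.level == ERROR and curr.message != prev.message
```

With the two example statuses above, the second one (same level 2, different message) would trigger an update, while the same pair at WARN would not. A message that embeds rapidly changing floats would still publish on every report while in ERROR, which is the frequency concern mentioned above.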

mikepurvis commented on May 27, 2024

That seems reasonable to me.

trainman419 commented on May 27, 2024

From personal experience, I'd suggest a few adjustments to this:

  • Publishing /diagnostics_agg on change doesn't reduce the latency if the diagnostic_updater library is limiting changes on /diagnostics to 1 Hz, so to implement this properly, both would need to be updated to publish immediately when items go from OK to not OK
  • Industrial users will want to know the end-to-end latency between a node transitioning to an error state and the diagnostics_agg topic reflecting that
  • Most industrial systems that have a periodic update (1 Hz) and an immediate update on change also limit the maximum publishing rate on change, so that a status item that is oscillating doesn't overwhelm the system. I think this is a good idea here as well
  • I think different users will want different notification behaviors on WARN vs ERROR; it might make sense to parameterize this in the final implementation
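
To make the rate-limit and parameterization points concrete, here is a rough sketch in plain Python with no ROS dependency. The class name `PublishGate` and its parameters (`period`, `min_interval`, `immediate_level`) are hypothetical and are not part of diagnostic_updater or diagnostic_aggregator; the default values are placeholders.

```python
OK, WARN, ERROR = 0, 1, 2  # levels as in diagnostic_msgs/DiagnosticStatus


class PublishGate:
    """Periodic publishing plus rate-limited publishing on change."""

    def __init__(self, period=1.0, min_interval=0.1, immediate_level=ERROR):
        self.period = period                    # regular publish period, seconds
        self.min_interval = min_interval        # minimum gap between on-change publishes
        self.immediate_level = immediate_level  # lowest level that may publish early
        self._last_publish = None

    def should_publish(self, now, changed, level):
        """Return True if a message should go out at time `now` (seconds)."""
        if self._last_publish is None:
            allowed = True  # nothing published yet
        else:
            elapsed = now - self._last_publish
            if elapsed >= self.period:
                allowed = True  # regular periodic update is due
            else:
                # Early publish only for a change at or above the configured
                # level (STALE=3 would also qualify), and only if the rate
                # limit permits it.
                allowed = (changed
                           and level >= self.immediate_level
                           and elapsed >= self.min_interval)
        if allowed:
            self._last_publish = now
        return allowed
```

Setting `immediate_level=WARN` would give early notification on warnings as well. A change suppressed by `min_interval` is not lost in this sketch, it is just delayed until the next allowed publish, so an oscillating item is capped at one message per `min_interval`.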
