
Comments (9)

nricheton commented on September 4, 2024

Hello,

Following a discussion w/ Jörg & Santos, here is my input on monitoring.

Overview :

  • Monitoring should be free and effortless, and should be set up on day 1.
  • A core set of monitoring should be available from the dev environment to production.
  • You should not need to ask a developer or an expert to retrieve or interpret the metrics/monitoring data. This information should be clear, include a first layer of analysis, and either state that the situation is OK or point directly to the issues.

Some examples of questions that a monitoring solution should answer immediately.
(Immediate = display a web page, now)

  • Is the right version deployed ?
  • Was it able to start successfully ?
  • From the app/module/service point of view, is connectivity OK ?
  • From the app/module/service point of view, is configuration OK ?
  • From the app/module/service point of view, is performance OK ?
  • From the app/module/service point of view, is memory OK ?
  • From the app/module/service point of view, is data OK ?
  • From the app/module/service point of view, is the number of runtime business or technical errors acceptable ?
  • If not, what should I do ?
  • What is the average response time of X (service, operation, data request, business rule) ?
  • What is the percentage/number of technical errors of X (service, operation, data request, business rule) ?
  • What is the percentage/number of business errors of X (service, operation, data request, business rule) ?
  • Is caching efficient in X (service, operation, data request, business rule) ?
  • Is there a background/async process running, and when will it finish ?
  • Is there a background/async process scheduled, and does scheduling work ?
  • Is there any issue in background processing, and which business data is causing it ?
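As a rough illustration of what "a first layer of analysis" could mean in code, here is a minimal sketch in plain Java. All names (`Level`, `CheckResult`, `StatusReport`) are hypothetical and are not taken from appstatus or devon4j. The only point is that every check carries a verdict plus a suggested action, so nobody has to interpret raw numbers.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical verdict levels, ordered from best to worst. */
enum Level { OK, WARN, ERROR }

/** One interpreted check: a verdict, an explanation and what to do about it. */
final class CheckResult {
    final String name;
    final Level level;
    final String description;
    final String resolution;

    CheckResult(String name, Level level, String description, String resolution) {
        this.name = name;
        this.level = level;
        this.description = description;
        this.resolution = resolution;
    }

    /** Convenience factory for a passing check, which needs no resolution hint. */
    static CheckResult ok(String name, String description) {
        return new CheckResult(name, Level.OK, description, "");
    }
}

/** Aggregates checks so that "is everything OK ?" has an immediate answer. */
final class StatusReport {
    private final List<CheckResult> results = new ArrayList<>();

    void add(CheckResult result) {
        results.add(result);
    }

    /** The overall level is the worst level among all recorded checks. */
    Level overall() {
        Level worst = Level.OK;
        for (CheckResult r : results) {
            if (r.level.compareTo(worst) > 0) {
                worst = r.level;
            }
        }
        return worst;
    }

    /** Lists only the checks that need attention, each with its suggested resolution. */
    List<String> issues() {
        List<String> lines = new ArrayList<>();
        for (CheckResult r : results) {
            if (r.level != Level.OK) {
                lines.add(r.level + " " + r.name + ": " + r.description + " -> " + r.resolution);
            }
        }
        return lines;
    }
}
```

A status page would then show `overall()` at the top and the lines from `issues()` below it, which covers both "is it OK ?" and "if not, what should I do ?".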

All these answers are priceless in production, but also in development/testing environments, where they are a clear indicator of the issues to expect in the next stage.

In several projects, we have made huge improvements in quality and efficiency by having these metrics and looking at them every day. Even non-technical people can point out the code that is causing issues and the impacted features.

Several tools exist to set up this kind of monitoring.

I really think that devon should provide tooling out of the box and ready-to-use accelerators to provide additional analysis value for common problems.

One effort to provide this kind of monitoring has been appstatus :
https://github.com/appstatus/appstatus
http://appstatus.sourceforge.net

  • Provides answers to all the questions above
  • Integrates with standard monitoring tools
  • Provides performance logging at no cost
  • Implements "explain first->log after" instead of "log first->interpret after" for reporting status in background jobs
  • Integrates with spring and AOP (does not depend on an external tool)
  • Integrates with spring cache
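One possible reading of "explain first->log after" is that a background job reports structured progress as it runs, and any log line is derived from that structure. The following is a sketch under that assumption only. It does not use the appstatus API, and `JobProgress` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical progress tracker for one background job (not thread-safe, illustration only). */
final class JobProgress {
    private final String jobName;
    private final long totalItems;
    private final long startMillis;
    private long processedItems;
    private final List<String> rejectedItems = new ArrayList<>();

    JobProgress(String jobName, long totalItems, long startMillis) {
        this.jobName = jobName;
        this.totalItems = totalItems;
        this.startMillis = startMillis;
    }

    /** Records one successfully processed item. */
    void itemDone() {
        processedItems++;
    }

    /** Records a rejected item with its business reason, so a status page can name the culprit. */
    void itemRejected(String itemId, String reason) {
        processedItems++;
        rejectedItems.add(itemId + ": " + reason);
    }

    /** Linear estimate of the remaining time in milliseconds, or -1 while nothing was processed. */
    long estimatedRemainingMillis(long nowMillis) {
        if (processedItems == 0) {
            return -1;
        }
        long elapsed = nowMillis - startMillis;
        long remainingItems = totalItems - processedItems;
        return elapsed * remainingItems / processedItems;
    }

    /** Returns a copy of the rejected items with their reasons. */
    List<String> rejectedItems() {
        return new ArrayList<>(rejectedItems);
    }

    /** One human-readable line; a log statement could simply print this. */
    String summary(long nowMillis) {
        long remaining = estimatedRemainingMillis(nowMillis);
        String eta = remaining < 0 ? "remaining time unknown" : "about " + (remaining / 1000) + " s remaining";
        return jobName + ": " + processedItems + "/" + totalItems + " items, "
                + rejectedItems.size() + " rejected, " + eta;
    }
}
```

With this shape, "when will it finish ?" and "which business data is causing the issue ?" are answered from the same object that the job updates while it runs.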

It is used by many projects in different IT companies.

Other alternatives :
https://github.com/javamelody/javamelody
https://www.appdynamics.fr/java/
https://www.zabbix.com/features

Again, low-level metrics have little value. We need interpreted metrics at the business level (operations, rules, data retrieval, user-perceived response time, ...), available from the developer environment to the production environment. (And this probably should NOT be an option when creating a new devon application :-) )
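To make "interpreted metrics at the business level" a bit more concrete, here is a minimal sketch that records response time and error rate per named operation, using only the JDK. `OperationStats` and `OperationMonitor` are hypothetical names. A real project would more likely hook this in via AOP or an existing metrics library, which the sketch does not attempt.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Hypothetical per-operation counters; LongAdder keeps concurrent updates cheap. */
final class OperationStats {
    private final LongAdder calls = new LongAdder();
    private final LongAdder errors = new LongAdder();
    private final LongAdder totalMillis = new LongAdder();

    void record(long durationMillis, boolean failed) {
        calls.increment();
        totalMillis.add(durationMillis);
        if (failed) {
            errors.increment();
        }
    }

    long calls() {
        return calls.sum();
    }

    /** Average response time in milliseconds, or 0 if the operation was never called. */
    double averageMillis() {
        long n = calls.sum();
        return n == 0 ? 0.0 : (double) totalMillis.sum() / n;
    }

    /** Share of failed calls in percent, or 0 if the operation was never called. */
    double errorPercentage() {
        long n = calls.sum();
        return n == 0 ? 0.0 : 100.0 * errors.sum() / n;
    }
}

/** Times calls by business operation name, e.g. "findCustomer". */
final class OperationMonitor {
    private final Map<String, OperationStats> statsByOperation = new ConcurrentHashMap<>();

    /** Runs the call, then records its duration and whether it threw an exception. */
    <T> T measure(String operation, Supplier<T> call) {
        long start = System.nanoTime();
        boolean failed = true;
        try {
            T result = call.get();
            failed = false;
            return result;
        } finally {
            long millis = (System.nanoTime() - start) / 1_000_000;
            stats(operation).record(millis, failed);
        }
    }

    OperationStats stats(String operation) {
        return statsByOperation.computeIfAbsent(operation, key -> new OperationStats());
    }
}
```

The sketch treats every exception as an error and does not separate business errors from technical ones. Doing so would need a convention for exception types, which the thread does not define.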

Feedback is welcome !

Nicolas

from devon4j.

hohwille commented on September 4, 2024

@nricheton thanks for your wonderful input.
I added AppDynamics and Zabbix to our guide:
https://github.com/devonfw-wiki/devon4j/wiki/guide-apm

Also we will have a look at appstatus.

However, we have to be careful with what we integrate by default. In one of my customer projects we used to integrate JavaMelody into all apps; then some CVE vulnerabilities were reported for it and we were forced to remove it. Maybe those issues have been resolved in the meantime. In any case, we should investigate your requirements and decide what we want to integrate as first choice and ship out of the box, and what to offer just as an option for projects that need more.

Being able to report the release version is of course very simple and does not come with any risk. Health status (e.g. with spring actuator) should also come OOTB.
For monitoring OS-level stuff there are tons of solutions already out there, and IMHO they should not be built into the app itself (we do not need a Java solution to observe CPU, memory or disk). SNMP also already exists as an established protocol. We should IMHO also think of complex IT landscapes and microservices, where an app does not really need to ship a UI for monitoring. Assume you have multiple redundant nodes of an app in a cluster with load balancing. What use is a browser UI showing the CPU usage of the current app if I get assigned to a random node by the load balancer and have no direct access to the node itself? So instead we need to provide services that offer the monitoring data, and look for state-of-the-art monitoring systems that integrate with all apps and all their nodes across the entire IT landscape, present a complete dashboard and trigger alarms if something goes wrong.
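The idea of "services that offer the monitoring data" instead of a UI can be sketched with the HTTP server that ships with the JDK. This is an illustration only, not the spring actuator API. The class name `StatusEndpoint`, the path `/admin/status` and the JSON field names are assumptions made for the example. The version is read from the JAR manifest via `Package.getImplementationVersion()`, which returns `null` when the class is not loaded from a JAR that declares it.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

/** Hypothetical machine-readable status service meant to be polled by an external monitoring system. */
final class StatusEndpoint {
    /** Assumed path; the thread only asks for a strict, common URL scheme without naming one. */
    static final String PATH = "/admin/status";

    /** Builds a small JSON document with the release version and JVM heap figures. */
    static String statusJson() {
        Package pkg = StatusEndpoint.class.getPackage();
        String version = pkg == null ? null : pkg.getImplementationVersion();
        Runtime rt = Runtime.getRuntime();
        long usedHeap = rt.totalMemory() - rt.freeMemory();
        return "{\"version\":\"" + (version == null ? "unknown" : version) + "\","
                + "\"heapUsedBytes\":" + usedHeap + ","
                + "\"heapMaxBytes\":" + rt.maxMemory() + "}";
    }

    /** Starts the service on the loopback interface only; port 0 picks a free port. */
    static HttpServer start(int port) throws IOException {
        InetSocketAddress address = new InetSocketAddress(InetAddress.getLoopbackAddress(), port);
        HttpServer server = HttpServer.create(address, 0);
        server.createContext(PATH, exchange -> {
            byte[] body = statusJson().getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }
}
```

Binding to the loopback address is just one way to keep the data off the public network in this sketch. In a cluster, the monitoring system would need to reach each node over an admin network instead.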

Another aspect is OWASP Sensitive Data Exposure. Detailed monitoring data should therefore not be available to the outside world (end-users, internet) but stay secret within the admin-plane. Along the same lines, we should also define strict standards, e.g. for the URL path scheme of monitoring services, to simplify setup and avoid complex individual configurations.

hohwille commented on September 4, 2024

See also here:
https://github.com/devonfw/devon4j/wiki/guide-apm
(considering JavaMelody)

sjimenez77 commented on September 4, 2024

Some time ago we already created a demo and a cookbook entry in the devonfw guide for the integration of Spring Boot Admin: https://github.com/devonfw/devon/wiki/Spring-boot-admin-Integration-with-devon4j. The document is probably outdated, but could be a starting point.

My point here is that we should preserve the still-valid cookbook entries in the wikis of the respective stacks before removing the devonfw guide as it is today.

nricheton commented on September 4, 2024

Hi @hohwille

Thanks for your feedback !

On CVE risk, I would say that all Devon components (and all projects in general) have CVEs in their history. Except for projects that leave important CVEs unfixed for a long time, we should not treat the existence of a CVE report as a reason not to integrate valuable components.

On OS-level monitoring, I fully agree with you that dedicated, existing solutions should be used.
However, a first level of checks can be integrated into solutions. Here are some reasons :

  • Development or testing environments are often not properly monitored for various reasons (cost, complexity; there are lots of real-world examples), so basic feedback such as checking free disk space or network share mounts saves days of work.
  • JVM memory can be monitored at no cost in Java apps, especially in development phases.
  • Some checks, like verifying that your app is linked to the right data, can prevent a disaster. For instance, checking that a module with test configuration is connected to test data (and not production data).
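The three kinds of checks listed above can be sketched with the JDK alone. This is an illustration under assumptions: `EnvironmentChecks`, the thresholds and the idea of recognising production data by a marker string in the JDBC URL are all hypothetical, and a real project would choose its own rule for telling production from test data.

```java
import java.io.File;

/** Hypothetical first-level checks that an app can run on itself in any environment. */
final class EnvironmentChecks {

    /** Free space check; getUsableSpace() returns 0 if the path does not exist, e.g. a missing mount. */
    static String checkFreeDisk(File directory, long minFreeBytes) {
        long usable = directory.getUsableSpace();
        if (usable < minFreeBytes) {
            return "ERROR disk: only " + usable + " bytes usable in " + directory
                    + ", at least " + minFreeBytes + " expected";
        }
        return "OK disk: " + usable + " bytes usable in " + directory;
    }

    /** Heap check based on the figures the JVM already exposes via Runtime. */
    static String checkHeap(double maxUsedRatio) {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        double ratio = (double) used / rt.maxMemory();
        long percent = Math.round(ratio * 100);
        if (ratio > maxUsedRatio) {
            return "WARN heap: " + percent + "% of the maximum heap is in use";
        }
        return "OK heap: " + percent + "% of the maximum heap is in use";
    }

    /** Guards against a non-production environment pointing at production data. */
    static String checkDataSource(String environment, String jdbcUrl, String productionMarker) {
        boolean pointsToProduction = jdbcUrl.contains(productionMarker);
        boolean isProduction = "production".equals(environment);
        if (pointsToProduction && !isProduction) {
            return "ERROR data: environment '" + environment + "' is connected to production data ("
                    + jdbcUrl + ")";
        }
        return "OK data: environment '" + environment + "' uses " + jdbcUrl;
    }
}
```

For example, `EnvironmentChecks.checkDataSource("test", "jdbc:postgresql://prod-db/app", "prod-db")` yields an `ERROR` line, which is exactly the kind of disaster the last bullet describes.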

On data availability : I agree that the data should not be available to the public or to internet users. It should be reserved for the people responsible for operations, as with any monitoring tool.

Web pages in modules are mostly for the early stages of feature development; after that, data should be aggregated into a common monitoring interface (any solution).

I would be happy to show you next week how appstatus handles these ideas and how it allows exporting the data for proper aggregation. And to discuss real-world examples !

Nicolas

hohwille commented on September 4, 2024

I fully support making progress in this area. I also assume we will spend a slot at the next DA meeting discussing this. However, as we broadened the scope of this issue and some aspects are not yet completely clear, I removed the milestone. Otherwise we would block the release planned for next month. If people come up with PRs to solve this issue, I am more than happy to replan it for 3.1.0, but at the moment I cannot see how I could solve it by then...

hohwille commented on September 4, 2024

@nricheton thanks for your feedback.
I do agree that additional features like memory or disk-space checks are great to have if they come without big effort or complex dependencies. My only concern was that we should not waste our time scanning all mounted devices, observing their disk space, sending alerts, etc. inside Java if there are already tons of OS-level tools doing all this.
To be more pragmatic, I would like to start with spring-boot-actuator and maybe also spring-boot-admin. Then we collect the list of features we get with them, see what gaps remain, choose additional tools and move on until we have covered what we think is crucial.

hohwille commented on September 4, 2024

Do we have a key person who could drive the development of this issue? IMHO this is not just a 1-2 hour task but will need some attention and continuity. I do not have the time at the moment but would love to see some action rather than just talk. I am still happy to assist and support this with some code snippets or reviews...

hohwille commented on September 4, 2024

So JavaMelody even has a spring-boot-starter, so you may only need to add a dependency and you are done.
Glowroot can also be added in a similarly easy way.
Then there are solutions like spring actuator to provide app-specific sensors to be integrated with existing monitoring tools such as CheckMk/Icinga/Nagios/etc.

So is there anybody left who initially raised demands for this topic - maybe @nricheton ?
What is left to do and what is the way to go?

  • Just add some more documentation?
  • Or create a demo app based on devon4j with the monitoring configured and in place (in one isolated commit after the initial app from the template)?
  • Something else?

As a lesson learned, we should move away from such generic issues: either the issue is crystal clear about what is to be done, or we need a real driver who actively works on it.
