
collectd-fast-jmx's Introduction

Important Note: 1.0.0 has a new package...

In the process of transferring this project to E-gineering, LLC (thanks, guys!) we've moved the Maven coordinates a bit in order to start pushing artifacts to Maven Central.

Note that all references to org.collectd.FastJMX have been changed to com.e_gineering.collectd.FastJMX, and the file name for jars has changed as well.

On the bright side, you can now pull pre-built artifacts of FastJMX out of the interwebs, from the Maven Central repository!

Or, for folks who want to use this some other way, here are the Maven coordinates:

<dependency>
	<groupId>com.e-gineering</groupId>
	<artifactId>collectd-fast-jmx</artifactId>
	<version>1.0.0</version>
</dependency>

FastJMX - Low-latency JMX collectd plugin

The default GenericJMX plugin from collectd is great for basic collection of small numbers of metrics, but if you need to collect many metrics from one or more hosts, the latency to read the metrics can quickly exceed your interval time, and that's no fun. If you want to remotely collect metrics from multiple hosts you can forget about having short intervals, and some of the configuration settings aren't exactly obvious. Example: What do you mean I have to include the hostname? I gave you the serviceUrl!

Introducing FastJMX!

FastJMX does things differently than the GenericJMX plugin, but it does so in a manner that's configuration-compatible with the original plugin. (You read that right. There are just a few small tweaks to an existing configuration and FastJMX will take over.)

  • FastJMX discovers all the matching beans when it first connects, then sets up listeners to the remote server so we get callbacks when any beans are added or removed from the server. This lets us identify all the permutations of the beans we need to read outside of the read() loop, which reduces read() latency, as well as internal GC stress and memory pressure.
  • Reconnections are attempted with increasing backoff sleep durations, again outside of the read loop, so that metrics continue to be collected from connections that have not failed.
  • Each attribute read from an MBean is its own potential thread. The java.util.concurrent packages (introduced in JDK 1.5) are used to pool threads, enforce interval timeouts on the read cycle, and make sure the queue is clear at the end of each read() invocation, eliminating backlogged (lagged) metric reporting. If a metric isn't polled in a timely manner, it's a dropped read.
  • Each read() cycle is timeslot protected (synchronized to the interval configured in collectd) so that old values and current values are never intermixed.
  • Each <Value> can define a custom PluginName, allowing segmentation of reported metrics into different plugin buckets rather than everything being reported as "GenericJMX" or "FastJMX".
  • The port can be appended to the hostname using IncludePortInHostname. This is very helpful in separating data from multiple JVM instances on the same host without needing to specify an InstancePrefix on the <Connection>.
  • Hostnames are automatically detected from the serviceURL. If the serviceURL is a complex type, like service:jmx:rmi:///jndi/rmi://hostname:port/jmxrmi, FastJMX will still properly parse the hostname and port. The Hostname property (part of the standard GenericJMX configuration) value is still respected if present.
  • FastJMX doesn't require connections be defined after the beans. <MBean> (or <MXBean>, or just <Bean>) and <Connection> blocks can come in any order.
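
The timeboxed read cycle described in the list above can be sketched with ExecutorService.invokeAll and a timeout. This is a minimal illustration, not FastJMX's actual implementation: the class name TimeboxedReadSketch, the readCycle method, and the throwaway fixed-size pool are all assumptions made for the example. Reads that have not finished when the interval expires are cancelled and counted as dropped.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names are illustrative and not taken from the FastJMX source.
final class TimeboxedReadSketch {

    /**
     * Runs every read in parallel and waits at most intervalMillis.
     * Returns the counts {successful, cancelled, failed}.
     */
    static int[] readCycle(List<Callable<Number>> reads, long intervalMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, reads.size()));
        int successful = 0;
        int cancelled = 0;
        int failed = 0;
        try {
            // invokeAll returns once all tasks finish or the timeout expires;
            // tasks that are still running at that point are cancelled.
            List<Future<Number>> results =
                    pool.invokeAll(reads, intervalMillis, TimeUnit.MILLISECONDS);
            for (Future<Number> result : results) {
                if (result.isCancelled()) {
                    cancelled++; // a dropped read: it missed the interval
                } else {
                    try {
                        result.get();
                        successful++;
                    } catch (ExecutionException e) {
                        failed++; // the read itself threw an exception
                    }
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("read cycle interrupted", e);
        } finally {
            pool.shutdownNow(); // nothing is left queued for the next cycle
        }
        return new int[] {successful, cancelled, failed};
    }
}
```

Because nothing survives past the end of the cycle, a slow attribute can only cost its own value for that interval; it cannot delay the values reported in the next one.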

So how much faster is it?

In real-world collection scenarios, large volume remote collections from multiple hosts over a VPN improved from ~2500ms to collect (with GenericJMX) to ~120ms.

If you really want to know what FastJMX is doing, add CollectInternal true to the plugin configuration. This tells FastJMX to dispatch internal metrics (success, failure, error, latency, thread pool size) to collectd.

Configuration

Migrate from GenericJMX by...

  • Adding the path to the fast-jmx jar in JVMARG
  • Including LoadPlugin "com.e_gineering.collectd.FastJMX" in the <Plugin "java"> block.

Additional FastJMX Options:

  • Remove the hostname from the <Connection> blocks. FastJMX will do its best to detect it from the JMX URI if you don't include it. If parsing has an issue, you'll see a message in the log.
  • Connections are handled asynchronously by default, but you can force synchronous handling by adding Synchronous true to a <Connection> block. If the URL contains remoting-jmx (which is interpreted as JBoss Remoting), the synchronous wrapper is enabled automatically.
  • Single-attribute <Value> blocks can use the syntax <Value "attributeName">. See the <MBean "classes"> example below.
  • Include PluginName declarations in a <Value> block to change the plugin name it's reported as. Useful for grouping different MBeans as if they came from different applications, or subsystems.
  • Use <MBean> or <MXBean> or <Bean>.
  • Composite and Table can be used interchangeably within a <Value> block, and can be omitted (defaults to false).
  • MaxThreads changes the maximum number of threads allowed (default: 512).
  • CollectInternal enables internal metrics FastJMX uses to be reported back to Collectd.
  • TTL can be used on a Connection to force a reconnect after <value> seconds have elapsed. This can be handy if your server isn't correctly maintaining MBeans after redeployments. Keep in mind the value is in seconds, so '43200' = 12 hours.
  • FastJMX can now traverse TabularData to pull out CompositeData values as tables, or track independent values.
LoadPlugin java
<Plugin "java">
  JVMARG "-Djava.class.path=/path/to/collectd-api.jar:/path/to/collectd-fast-jmx.jar"
  
  LoadPlugin "com.e_gineering.collectd.FastJMX"

  <Plugin "FastJMX">

    MaxThreads 256
    CollectInternal true
  
    <MBean "classes">
      ObjectName "java.lang:type=ClassLoading"

      <Value "LoadedClassCount">
        Type "gauge"
        InstancePrefix "loaded_classes"
        PluginName "JVM"
      </Value>
    </MBean>

    # Time spent by the JVM compiling or optimizing.
    <MBean "compilation">
      ObjectName "java.lang:type=Compilation"

      <Value "TotalCompilationTime">
        Type "total_time_in_ms"
        InstancePrefix "compilation_time"
        PluginName "JVM"
      </Value>
    </MBean>

    # Garbage collector information
    <MBean "garbage_collector">
      ObjectName "java.lang:type=GarbageCollector,*"
      InstancePrefix "gc-"
      InstanceFrom "name"

      <Value "CollectionTime">
        Type "total_time_in_ms"
        InstancePrefix "collection_time"
        PluginName "JVM"
      </Value>
      
      # Reads the Par Eden Space data as a composite table
      <Value "LastGcInfo.memoryUsageAfterGc.Par Eden Space">
        Type "java_memory"
        Composite true
        InstancePrefix "pool-eden-after"
        PluginName "JVM"
      </Value>
      
      # Reads only the "used" portion of the Par Eden Space
      <Value "LastGcInfo.memoryUsageAfterGc.Par Eden Space.used">
        Type "java_memory"
        InstancePrefix "pool-eden-after-used"
        PluginName "JVM"
      </Value>
    </MBean>

    # Memory usage by memory pool.
    <MBean "memory_pool">
      ObjectName "java.lang:type=MemoryPool,*"
      InstancePrefix "memory_pool-"
      InstanceFrom "name"

      <Value "Usage">
        Type "java_memory"
        Composite true
        PluginName "JVM"
      </Value>
    </MBean>


    <Connection>
      ServiceURL "service:jmx:rmi:///jndi/rmi://host1:8098/jmxrmi"
      IncludePortInHostname true
      Collect "classes"
      Collect "compilation"
      Collect "garbage_collector"
      Collect "memory_pool"
    </Connection>
    <Connection>
      ServiceURL "service:jmx:rmi:///jndi/rmi://host1:8198/jmxrmi"
      IncludePortInHostname true
      Collect "classes"
      Collect "compilation"
      Collect "garbage_collector"
      Collect "memory_pool"
    </Connection>
    <Connection>
      ServiceURL "service:jmx:rmi:///jndi/rmi://host2:8398/jmxrmi"
      IncludePortInHostname true
      Collect "classes"
      Collect "compilation"
      Collect "garbage_collector"
      Collect "memory_pool"
      # Force the connection to reset every 4 hours.
      TTL 14400
    </Connection>

  </Plugin>
</Plugin>
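
The <Connection> blocks above omit Hostname and rely on detection from the ServiceURL. One way such detection could work is sketched below. This is an assumption-laden illustration rather than FastJMX's real parser: the class ServiceUrlHostSketch, its methods, and the separator argument are hypothetical, and IPv6 literals are deliberately not handled. It takes the last "//host:port" pair in the URL, which covers both the plain and the nested JNDI forms.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; FastJMX's own parsing may differ.
final class ServiceUrlHostSketch {

    // "//" followed by a host (no slashes, colons or brackets), a colon, and a numeric port.
    private static final Pattern HOST_PORT = Pattern.compile("//([^/:\\[\\]]+):(\\d+)");

    /** Returns {host, port} from the last "//host:port" in the URL, or null if there is none. */
    static String[] detect(String serviceUrl) {
        Matcher matcher = HOST_PORT.matcher(serviceUrl);
        String[] found = null;
        while (matcher.find()) {
            // Keep overwriting so the innermost (last) host:port pair wins.
            found = new String[] {matcher.group(1), matcher.group(2)};
        }
        return found;
    }

    /**
     * Builds the host name to report. When includePort is true the port is
     * appended using the given separator (the separator FastJMX really uses
     * is not stated here, so it is a parameter).
     */
    static String reportedHost(String serviceUrl, boolean includePort, String separator) {
        String[] hostAndPort = detect(serviceUrl);
        if (hostAndPort == null) {
            return null;
        }
        return includePort ? hostAndPort[0] + separator + hostAndPort[1] : hostAndPort[0];
    }
}
```

With this sketch, the two host1 connections above would yield distinct reported names once the port is included, which is the effect IncludePortInHostname is meant to have.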

Internal Metrics

FastJMX collects some internal metrics that it uses to estimate an efficient pool size. If you enable internal metric collection (see above configuration options) and have the following types defined in types.db, the data will be submitted to collectd.

fastjmx_cycle      value:GAUGE:0:U
fastjmx_latency    value:GAUGE:0:U

Once you've got collectd keeping your data, you may find these Collection3 graph configurations useful...

<Type fastjmx_cycle>
  Module GenericStacked
  DataSources value
  RRDTitle "FastJMX Reads ({plugin_instance})"
  RRDFormat "%6.1lf"
  DSName "cancelled Incomplete "
  DSName "  success Success    "
  DSName "   failed Failed     "
  DSName "   weight Weight     "
  Order success cancelled failed weight
  Color failed ff0000
  Color cancelled ffb000
  Color success 00e000
  Color weight 0000ff
  Stacking on
</Type>
<Type fastjmx_latency>
  Module GenericStacked
  DataSources value
  RRDTitle "FastJMX Latency ({plugin_instance})"
  RRDFormat "%6.1lf"
  DSName "interval Interval" 
  DSName "duration Latency "
  Order interval duration 
  Color duration ffb000
  Color interval 00e000
  Stacking off
</Type>

JBoss EAP 6.x, AS 7.x

The JBoss remoting JMX provider has been tested with EAP 6.x, and should work properly with AS 7.x as well. As part of getting this to work, some 'workarounds' are included in the FastJMX code-base, which may also apply to other JMX protocol providers. In the case of the JBoss jmx remoting, appropriate bugs and feature requests have been filed.

JBoss EAP 6 Classpath

Here's an example JVMArg that works with jboss-eap-6.1

<Plugin java>
	JVMArg "-Djava.class.path=/usr/share/collectd/java/collectd-api.jar:/usr/lib/jvm/java-7-oracle/lib/jconsole.jar:/usr/lib/jvm/java-7-oracle/lib/tools.jar:/opt/appserver/jboss-eap-6.1/modules/system/layers/base/org/jboss/remoting-jmx/main/remoting-jmx-1.1.0.Final-redhat-1.jar:/opt/appserver/jboss-eap-6.1/modules/system/layers/base/org/jboss/remoting3/main/jboss-remoting-3.2.16.GA-redhat-1.jar:/opt/appserver/jboss-eap-6.1/modules/system/layers/base/org/jboss/logging/main/jboss-logging-3.1.2.GA-redhat-1.jar:/opt/appserver/jboss-eap-6.1/modules/system/layers/base/org/jboss/xnio/main/xnio-api-3.0.7.GA-redhat-1.jar:/opt/appserver/jboss-eap-6.1/modules/system/layers/base/org/jboss/xnio/nio/main/xnio-nio-3.0.7.GA-redhat-1.jar:/opt/appserver/jboss-eap-6.1/modules/system/layers/base/org/jboss/sasl/main/jboss-sasl-1.0.3.Final-redhat-1.jar:/opt/appserver/jboss-eap-6.1/modules/system/layers/base/org/jboss/marshalling/main/jboss-marshalling-1.3.18.GA-redhat-1.jar:/opt/appserver/jboss-eap-6.1/modules/system/layers/base/org/jboss/marshalling/river/main/jboss-marshalling-river-1.3.18.GA-redhat-1.jar:/opt/appserver/jboss-eap-6.1/modules/system/layers/base/org/jboss/as/cli/main/jboss-as-cli-7.2.1.Final-redhat-10.jar:/opt/appserver/jboss-eap-6.1/modules/system/layers/base/org/jboss/staxmapper/main/staxmapper-1.1.0.Final-redhat-2.jar:/opt/appserver/jboss-eap-6.1/modules/system/layers/base/org/jboss/as/protocol/main/jboss-as-protocol-7.2.1.Final-redhat-10.jar:/opt/appserver/jboss-eap-6.1/modules/system/layers/base/org/jboss/dmr/main/jboss-dmr-1.1.6.Final-redhat-1.jar:/opt/appserver/jboss-eap-6.1/modules/system/layers/base/org/jboss/as/controller-client/main/jboss-as-controller-client-7.2.1.Final-redhat-10.jar:/opt/appserver/jboss-eap-6.1/modules/system/layers/base/org/jboss/threads/main/jboss-threads-2.1.0.Final-redhat-1.jar:/usr/share/collectd/java/collectd-fast-jmx-1.0-SNAPSHOT.jar"
	LoadPlugin "com.e_gineering.collectd.FastJMX"
	
	...
	
</Plugin>

To connect as an administrator you shouldn't need to change anything in the JBoss configuration. The following Connection block should work in this scenario.

<Connection>
	ServiceURL "service:jmx:remoting-jmx://yourhostname:9999"
	Username "admin"
	Password "aR3allyStrongP@sswordThatOthersCanSee"
	ttl 300
	IncludePortInHostname false
	Collect "classes"
	...

</Connection>

To connect as a normal application user, and expose JMX over the 'remoting' port in EAP 6.1, your domain (or standalone) configuration should include use-management-endpoint="false", like so:

<subsystem xmlns="urn:jboss:domain:jmx:1.2">
  <expose-resolved-model/>
  <expose-expression-model/>
  <remoting-connector use-management-endpoint="false"/>
</subsystem>

This changes the port from 9999 (the default management port) to 4447 (the remoting port) and requires an application user rather than an administration user.

You can add the application user using the 'add-user' script from the JBoss Bin dir:

$JBOSS_HOME/bin/add-user.sh --silent -a --user jmx --password <yourpasshere>

Then in your Connection block, you can use:

<Connection>
	ServiceURL "service:jmx:remoting-jmx://yourhostname:4447"
	Username "jmx"
	Password "!amUnprivi1eged"
	ttl 300
	IncludePortInHostname false
	Collect "classes"
	...
</Connection>

This way, the collectd configuration exposes only a non-privileged username and password.

WARNING: This unprivileged user will still be able to invoke MBeans via JMX.

Debugging & Troubleshooting

There are a couple of additional configuration options worth noting, which are helpful if you're troubleshooting an issue.

  • LogLevel sets the plugin's internal Java log level. By default this is 'INFO', meaning any log message generated internally that's INFO or greater will be logged to collectd at the appropriate (corresponding) collectd log level.
  • ForceLoggingTo lets you override the normal behavior of mapping Java log levels to collectd log levels, and forces all Java log output to be logged at this collectd level.

So under normal operation, things logged in java as SEVERE are logged at ERROR in Collectd, etc.

Setting ForceLoggingTo "INFO" will make all Java logging output log in Collectd at INFO.

If your normal Collectd configuration sets the collectd log level to WARNING, but you want to get 'INFO' from the FastJMX plugin, you can do this:

<Plugin "FastJMX">
   LogLevel "INFO"
   ForceLoggingTo "WARNING"

   ...
</Plugin>

If you'd like to see FINE logging from FastJMX use:

<Plugin "FastJMX">
   LogLevel "FINE"
   ForceLoggingTo "WARNING"
</Plugin>

Basically, you're telling the Java logger to write any messages >= FINE, and to write those messages as collectd WARNING messages. This gives a little more control over the verbosity of this single plugin.
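
The interplay of LogLevel and ForceLoggingTo can be sketched as a small routing function. Only the SEVERE to ERROR mapping is stated above; the rest of the default mapping below is an assumption, as are the class and method names. The sketch is meant to show the order of the two decisions (filter first, then pick the collectd level), not the plugin's actual code.

```java
import java.util.logging.Level;

// Hypothetical sketch of the two-step decision; not taken from the FastJMX source.
final class LogRoutingSketch {

    /** Default Java-to-collectd mapping. Everything except SEVERE -> ERROR is assumed. */
    static String toCollectd(Level javaLevel) {
        int value = javaLevel.intValue();
        if (value >= Level.SEVERE.intValue()) {
            return "ERROR";
        }
        if (value >= Level.WARNING.intValue()) {
            return "WARNING";
        }
        if (value >= Level.INFO.intValue()) {
            return "INFO";
        }
        return "DEBUG";
    }

    /**
     * Returns the collectd level a message is written at, or null when the
     * message is below LogLevel and therefore never leaves the Java logger.
     * forceLoggingTo may be null, meaning "use the default mapping".
     */
    static String route(Level message, Level logLevel, String forceLoggingTo) {
        if (message.intValue() < logLevel.intValue()) {
            return null; // filtered by LogLevel
        }
        return forceLoggingTo != null ? forceLoggingTo : toCollectd(message);
    }
}
```

For example, route(Level.FINE, Level.FINE, "WARNING") yields "WARNING", which is why the FINE configuration above shows up even when collectd itself only logs WARNING and higher.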

collectd-fast-jmx's People

Contributors

bvarner, rahulaga, pmoranga

collectd-fast-jmx's Issues

Log verbosity

Enhancement request. To have cleaner logs, it would be good to stop logging success after one successful collection. For example:
[2014-09-25 18:20:47] FastJMX plugin: [failed:0, canceled:0, successful:11] in 8ms with: 2 threads

Enhance AttributePermutation to lookup a value in a TabularData node, similar to CompositeData lookup

I do have a patch for the current code (downloaded 2015-06-03), but I didn't fork your repo, so I won't be able to submit this as a pull request.

The AttributePermutation class has a call() method that does a great job of drilling down into mbean attributes using intuitive dot-separated notation, parsing node names out of the attributePath.

  • If AttributePermutation is confronted by a CompositeData, it does a compositeValue.get(node). For example, the LastGcInfo attribute of the java.lang:type=GarbageCollector mbean is a CompositeData, so /etc/collectd.d/java.conf can map Attribute "LastGcInfo.duration" as a .
  • The patch below enhances AttributePermutation to handle a TabularData node. For example, the LastGcInfo.memoryUsageAfterGc node of the java.lang:type=GarbageCollector mbean is a table exposing each heap (and nonheap) memory pool using the name of its java.lang:type=MemoryPool as the TabularData key. Each of those memory pools is a CompositeData value, which is understood by the existing FastJMX code once the TabularData node lookup is handled. This enhancement lets /etc/collectd.d/java.conf map either Attribute "LastGcInfo.memoryUsageBeforeGc.Par Eden Space" as a Type "memory" table (Table true), or drill down to the Attribute "LastGcInfo.memoryUsageBeforeGc.Par Eden Space.used" as an individual (Table false) Type "memory".

I hope the github markdown works here :). I do this crazy vi to build the patch because of the tab indentation, which I can't get to work in a here doc.

cd /usr/local/src/collectd-fast-jmx-20150603

vi enhance-lookupAttribute.patch  # Lots of tabs in the patch -- omit the ------ lines
------
--- ../collectd-fast-jmx-master/src/main/java/org/collectd/AttributePermutation.java    2015-06-03 09:48:21.000000000 -0400
+++ src/main/java/org/collectd/AttributePermutation.java    2015-07-12 23:39:58.000000000 -0400
@@ -10,11 +10,13 @@
 import javax.management.ObjectName;
 import javax.management.openmbean.CompositeData;
 import javax.management.openmbean.OpenType;
+import javax.management.openmbean.TabularData;
 import java.io.IOException;
 import java.math.BigDecimal;
 import java.math.BigInteger;
 import java.util.ArrayList;
 import java.util.List;
+import java.util.Collection;
 import java.util.Map;
 import java.util.Set;
 import java.util.concurrent.Callable;
@@ -223,6 +225,56 @@
                            value = compositeValue.get(node);
                        } else if (value instanceof OpenType) {
                            throw new UnsupportedOperationException("Handling of OpenType " + ((OpenType) value).getTypeName() + " is not yet implemented.");
+                       } else if (value instanceof TabularData) {
+                           // A java.lang:type=GarbageCollector mbean
+                           // has an LastGcInfo attribute that is a
+                           // CompositeData value, which interesting
+                           // sub-value called memoryUsageAfterGc and
+                           // memoryUsageBeforeGc, which expose the
+                           // javax.management.openmbean.TabularData
+                           // interface. Each table exposes each heap
+                           // and nonheap memory pool using the name
+                           // of its java.lang:type=MemoryPool as the
+                           // TabularData key. The memory pool is a
+                           // CompositeData value that can be further
+                           // examined with Attribute dot notation.
+                           //   #mbean = java.lang:type=GarbageCollector,name=ParNew
+                           //   LastGcInfo = { 
+                           //     memoryUsageAfterGc = {
+                           //       ( Par Survivor Space ) = {
+                           //         key = Par Survivor Space;
+                           //         value = {
+                           //           committed = 8716288;
+                           //           init = 8716288;
+                           //           max = 8716288;
+                           //           used = 79040;
+                           //          };
+                           //        };
+                           //       ...
+                           //      };
+                           //    };
+                           //
+                           // This java.conf entry would capture the state
+                           // of the "Par Eden Space" memory pool after the
+                           // most recent garbage collection invation:
+                           //    <MBean "java/garbage_collector">
+                           //      ObjectName "java.lang:type=GarbageCollector,*"
+                           //      <Value>
+                           //        Attribute "LastGcInfo.memoryUsageAfterGc.Par Eden Space"
+                           //        Table true
+                           //        Type "memory" # value:GAUGE:0:281474976710656
+                           //        InstancePrefix "pool-eden-after"
+                           //      </Value>
+                           //    </MBean>
+
+                           TabularData tabularData = (TabularData) value;
+                           Collection<CompositeData> table =
+                               (Collection<CompositeData>)tabularData.values();
+                           for (CompositeData compositeData : table) {
+                               if (node.equals(compositeData.get("key"))) {
+                                   value = compositeData.get("value");
+                               }
+                           }
                        } else if (value != null) {
                            // Try to traverse via Reflection.
                            value = value.getClass().getDeclaredField(node).get(value);
@@ -294,6 +346,7 @@
            //
            consecutiveNotFounds++;
        } catch (Exception ex) {
+                   logger.warning("catch: " + ex);
            throw ex;
        } finally {
            lastRunDuration = System.nanoTime() - start;
------

patch -p0 <enhance-lookupAttribute.patch
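
The lookup that the patch adds can be tried in isolation with the standard javax.management.openmbean classes. The following is a standalone sketch under stated assumptions, not the AttributePermutation code itself: the class TabularLookupSketch and its methods are hypothetical, the sample data only mimics the LastGcInfo shape shown in the patch comment, and unlike the patch it returns null when no row matches. Splitting on dots also assumes that pool names contain no dots.

```java
import javax.management.openmbean.CompositeData;
import javax.management.openmbean.CompositeDataSupport;
import javax.management.openmbean.CompositeType;
import javax.management.openmbean.OpenDataException;
import javax.management.openmbean.OpenType;
import javax.management.openmbean.SimpleType;
import javax.management.openmbean.TabularData;
import javax.management.openmbean.TabularDataSupport;
import javax.management.openmbean.TabularType;

// Hypothetical sketch; it mirrors the idea of the patch, not its exact code.
final class TabularLookupSketch {

    /** Follows one path node into a CompositeData or a key/value style TabularData. */
    static Object descend(Object value, String node) {
        if (value instanceof CompositeData) {
            return ((CompositeData) value).get(node);
        }
        if (value instanceof TabularData) {
            // Each row is a CompositeData with "key" and "value" items.
            for (Object row : ((TabularData) value).values()) {
                CompositeData entry = (CompositeData) row;
                if (node.equals(entry.get("key"))) {
                    return entry.get("value");
                }
            }
        }
        return null;
    }

    /** Resolves a dot-separated path such as "memoryUsageAfterGc.Par Survivor Space.used". */
    static Object resolve(Object root, String path) {
        Object value = root;
        for (String node : path.split("\\.")) {
            value = descend(value, node);
        }
        return value;
    }

    /** Builds sample data shaped like LastGcInfo, holding a single memory pool row. */
    static CompositeData sampleLastGcInfo() {
        try {
            String[] usageItems = {"committed", "init", "max", "used"};
            CompositeType usageType = new CompositeType("MemoryUsage", "memory usage",
                    usageItems, usageItems,
                    new OpenType<?>[] {SimpleType.LONG, SimpleType.LONG, SimpleType.LONG, SimpleType.LONG});
            String[] rowItems = {"key", "value"};
            CompositeType rowType = new CompositeType("row", "pool entry", rowItems, rowItems,
                    new OpenType<?>[] {SimpleType.STRING, usageType});
            TabularType tableType = new TabularType("table", "usage by pool", rowType,
                    new String[] {"key"});
            TabularDataSupport table = new TabularDataSupport(tableType);
            CompositeData usage = new CompositeDataSupport(usageType, usageItems,
                    new Object[] {8716288L, 8716288L, 8716288L, 79040L});
            table.put(new CompositeDataSupport(rowType, rowItems,
                    new Object[] {"Par Survivor Space", usage}));
            String[] infoItems = {"memoryUsageAfterGc"};
            CompositeType infoType = new CompositeType("GcInfo", "last gc info",
                    infoItems, infoItems, new OpenType<?>[] {tableType});
            return new CompositeDataSupport(infoType, infoItems, new Object[] {table});
        } catch (OpenDataException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Resolving "memoryUsageAfterGc.Par Survivor Space" against the sample returns the whole pool as a CompositeData (the Table true case), while appending ".used" drills down to a single Long (the Table false case).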

Some JMX providers don't support async connection notifications.

Include a configuration option to force synchronously obtaining an MBeanServerConnection during a connect() cycle.

Oracle RMI's connect() is expensive and backgrounded (async)
JBoss remoting is fairly lightweight and synchronous. Further, they do not implement JMXConnectionNotification at all....

Random silent socket failures with two localhost <Connection> definitions

GENERAL DESCRIPTION:
When I have two <Connection> definitions, they can silently fail, both at the same time, at random times. All of my <Connection> definitions are for JVM processes running on the localhost, obviously with different JMX ports. I am using the write_graphite plugin to output my metrics to a Graphite instance running on a different server. I also have a number of input plugins other than FastJMX, but I doubt they are involved in the problem. I can run this same collectd configuration with GenericJMX without any problems.

REPRO INSTRUCTIONS:
I found two ways to repro this problem:

(1) I first noticed the problem when I had two <Connection> sections, and was testing whether a connection would re-establish if I killed its JVM and restarted it. Yes, that connection was closed down and recreated as expected. However, the other connection simply stopped reporting: nothing in /var/log/messages and nothing in my Graphite. This was happening consistently, and is what I mean by "silently fail".

(2) Then I tried a longevity test, bringing up the two JVM processes first, and letting both sockets run indefinitely. Within a few hours, both sockets failed silently, apparently at the same time. This was not associated with a JVM restart, since those processes continued to run on the localhost, and I could still connect to their JMX ports using a jmxterm connection, through which I could still browse their MBeans.

MORE OBSERVATIONS (and faster repro for (2) longevity testing):
Then I instrumented my src/main/java/org/collectd/*.java code using "netstat -an" calls, allowing me to see the underlying socket connections at any point during the collectd run (basically, a "print debug" technique). I can tell the failure always happened between one FastJMX.read() call and the next Connection.ConnectTask.run() call, which are about 5 seconds apart in my configuration:

  • A "netstat -an" right at the end of the FastJMX.read() call shows that each <Connection> still has an ESTABLISHED socket.
  • A "netstat -an" right at the top of Connection.ConnectTask.run() will show every socket in TIME_WAIT. Of course, it's only TIME_WAIT when the sockets failed silently -- usually there isn't a failure.

This has the appearance of being a race condition, but there is only one FastJMX thread running, so I can only speculate. I do know that when I sprinkled a dozen or so of those "netstat -an" calls in the *.java code, the failures would happen very quickly. It usually takes less than one minute after the "service collectd start" for the sockets to fail into TIME_WAIT.

CODE THAT CAN REPRODUCE THE PROBLEM:
This is the instrumentation method I added to print out "netstat -an" output. My two JVM processes are serving JMX on ports 10001 and 10002:

import java.io.BufferedReader;
import java.io.InputStreamReader;

        private void reportNetstat(String label)
        {
          final String cmd = "netstat -an";
          try {
              // Run netstat
              Process process = Runtime.getRuntime().exec(cmd);
              process.waitFor();
              BufferedReader reader =
                   new BufferedReader(new InputStreamReader(process.getInputStream()));
              String line = "";
              while ((line = reader.readLine())!= null) {
                if (line.contains(":10001      ESTABLISHED")
                 || line.contains(":10002      ESTABLISHED")
                 || line.contains(":10001      TIME_WAIT")
                 || line.contains(":10002      TIME_WAIT")
                ) {
                  logger.info("NETSTAT " + label + ": " + line);
                }
              }
          } catch (Exception e) {
              e.printStackTrace(System.err);
          }
        }

This is a trivial Java program that can be used as the JVM processes for the <Connection> blocks to watch. Obviously, anything will do, but if you don't have something handy, use this (I cribbed it from somewhere, with some modifications):

cd /tmp/mina

cat >>TCPServer.java <<EOF
import java.io.*;
import java.net.*;
class TCPServer {
    public static void main(String argv[]) throws Exception       {
         String clientSentence;
         String capitalizedSentence;
         int portNumber = Integer.valueOf(argv[0]);
         ServerSocket welcomeSocket = new ServerSocket(portNumber);
         while(true)          {
             Socket connectionSocket = welcomeSocket.accept();
            BufferedReader inFromClient =                new BufferedReader(new InputStreamReader(connectionSocket.getInputStream()));
            DataOutputStream outToClient = new DataOutputStream(connectionSocket.getOutputStream());
            clientSentence = inFromClient.readLine();
            System.out.println("Received: " + clientSentence);
            capitalizedSentence = clientSentence.toUpperCase() + '\n';
            outToClient.writeBytes(capitalizedSentence);
         }
       }
 }
EOF

javac *java

java -Dcom.sun.management.jmxremote.port=10001 \
     -Dcom.sun.management.jmxremote.local.only=false \
     -Dcom.sun.management.jmxremote.authenticate=false \
     -Dcom.sun.management.jmxremote.ssl=false \
     -classpath . TCPServer 20001

That JVM listens for JMX connections on 10001 and TELNET on 20001. I run two of those in different terminal shell sessions, using JMX ports 10001 and 10002.

Not picking up custom specified hostname.

I've used the Hostname parameter in the Connection block, but the host name is still being picked up from the service URI, even though the description clearly says: "The Hostname property (part of the standard GenericJMX configuration) value is still respected if present."

JBoss: Failed to collect JVM statistics

This bug might be related to #6, also with JBoss 4.2:

Once we tell JBoss to use the platform mbean server (setting JAVA_OPTS="$JAVA_OPTS -Djboss.platform.mbeanserver -Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl") FastJMX fails to collect basic JVM statistics, printing the following error messages:

[2015-07-09 18:49:36] FastJMX Plugin: Failed to collect: java.lang:type=MemoryPool,name=PS Survivor  Space@service:jmx:rmi:///jndi/rmi://127.0.0.1:8855/jmxrmi InstanceNotFound consecutive count=5

(and similar messages for the GC stats).

As soon as we connect via the JConsole to view the MBean, everything starts to work and FastJMX is able to collect the stats as expected.

Was anyone able to get this setup working? We would like to collect JVM and JBoss stats. Many thanks!

Needs a license

Hard to justify use in most environments without a license.

Can you please add a license file to the root of the project?

Dependency org.collectd is wired improperly ?

Hi Bryan,
I am seeing the following in the pom.xml of egineering-llc/collectd-fast-jmx

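<dependency>
	<groupId>org.collectd</groupId>
	<artifactId>collectd-api</artifactId>
	<version>1.0</version>
	<scope>system</scope>
	<systemPath>${basedir}/lib/collectd-api.jar</systemPath>
</dependency>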

When I add the collectd-fast-jmx dependency to my project, I get the following error because of the above:

Could not find artifact org.collectd:collectd-api:jar:1.0 at specified path /Users/bvarner/Documents/work/eg/collectd-fast-jmx/lib/collectd-api.jar

When I comment out the scope and systemPath parameters above, the build gets through fine.
Can you please explain why the dependency needs to be pushed into a systemPath rather than being pulled from Maven Central itself?

Please suggest how I should move forward in my case.

jmxmp support

Is it possible to connect using jmxmp instead of rmi?

My jmx.conf:


 LoadPlugin java
 <Plugin "java">
 JVMARG "-Djava.class.path=/usr/share/collectd/java/collectd-api.jar:/usr/share/collectd/java/collectd-fast-jmx-1.0.0.jar:/usr/share/collectd/java/jconsole.jar:/usr/share/collectd/java/jmxremote_optional.jar"
 LoadPlugin "com.e_gineering.collectd.FastJMX"

 <Plugin "FastJMX">
 MaxThreads 256
 CollectInternal true
LogLevel "FINE"
ForceLoggingTo "WARNING"
# Memory usage by memory pool.
<MBean "memory_pool">
  ObjectName "java.lang:type=MemoryPool,*"
  InstancePrefix "memory_pool-"
  InstanceFrom "name"

  <Value "Usage">
    Type "java_memory"
    Composite true
    PluginName "JVM"
  </Value>
</MBean>

<Connection>
  ServiceURL "service:jmx:jmxmp://localhost:31301"
  IncludePortInHostname true
  Username "foo"
  Password "bar"
  Collect "memory_pool"
</Connection>

Error message:

[2017-02-12 14:34:53] FastJMX Plugin: Invoking javax.management.remote.jmxmp.JMXMPConnector.connect() on Connect-service:jmx:jmxmp://localhost:31301
[2017-02-12 14:34:53] FastJMX Plugin: Could not connect to : service:jmx:jmxmp://10.32.14.31:31301 exception message: The client does not require any profile but the server mandates one

Better handling of misconfiguration / missing types.

As explained by rbartl in issue #7:

I got the Unexpected Throwable: java.lang.NullPointerException error in my fresh setup.
After adding some debugging, I found the cause was a missing java_memory line in the types.db file.

types.db
java_memory value:GAUGE:0:U

A NullPointerException in the config method was preventing the executor field from being set.
A try/catch block around the whole config block, to surface these config errors, would be a good idea.
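
One way to surface this kind of misconfiguration early would be to check every configured Type against the set of known types before any collection starts, and to report each gap by name instead of failing later with a NullPointerException. The following is a minimal, self-contained Java sketch of that idea, not FastJMX's actual code: the class ConfigValidationSketch, the method missingTypes, and the plain Set standing in for the contents of types.db are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of FastJMX or the collectd API.
final class ConfigValidationSketch {

    // Returns one readable message for each configured type that is not known.
    static List<String> missingTypes(List<String> configuredTypes, Set<String> knownTypes) {
        List<String> problems = new ArrayList<>();
        for (String type : configuredTypes) {
            if (!knownTypes.contains(type)) {
                problems.add("Type \"" + type + "\" is not defined in types.db");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Assumed stand-in for the type names loaded from types.db.
        Set<String> known = new HashSet<>(Arrays.asList("gauge", "memory"));
        for (String problem : missingTypes(Arrays.asList("gauge", "java_memory"), known)) {
            System.err.println(problem);
        }
    }
}
```

A plugin could log these messages from its config callback and skip the affected Value blocks, so that one missing type does not prevent the rest of the configuration from loading.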

Support patterns with placeholders to specify output metric name

At the moment it is impossible to specify a custom name for a metric based on the ObjectName keys and domain.
We can only specify a prefix and a list of keys, which are joined with dashes. That is not flexible enough for me. Many monitoring systems have their own naming conventions, and I would like to be able to fit the output name to my monitoring system.

Say I want to send my JMX metrics to Graphite. Graphite supports a hierarchy of names (with a dot as the separator) and tags: https://graphite.readthedocs.io/en/latest/tags.html
Consider the following Cassandra JMX metric:
org.apache.cassandra.metrics:type=(ColumnFamily|IndexColumnFamily),keyspace=(Keyspace name),scope=(ColumnFamily Name),name=(Metric name)
In Graphite I want to end up with something like this:
cassandra.write-latency;keyspace=k1;scope=s1;type=t1
Tags in Graphite give me many useful features, but I cannot produce names like this today.

I suggest adding a new optional config property called "InstancePattern".
With this property I could get the result I want like so:
InstancePattern "${domain}.write-latency;keyspace=${keyspace};scope=${scope};type=${type}"

If you like this idea, let me know and I will implement and test it.
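
To make the proposal concrete, here is a rough Java sketch of how such placeholder expansion might work on top of javax.management.ObjectName. It is only an illustration of the idea: "InstancePattern" does not exist in FastJMX today, and the class InstancePatternSketch and its expand methods are hypothetical names. The sketch assumes ${domain} maps to the ObjectName domain and every other ${key} maps to the key property of the same name, so a key property literally called "domain" would be shadowed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical sketch of the proposed "InstancePattern" option; not existing FastJMX code.
final class InstancePatternSketch {

    // Matches ${key} placeholders.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces ${domain} with the ObjectName domain and ${key} with that key property.
    // Placeholders with no matching key property are left in the output unchanged.
    static String expand(String pattern, ObjectName name) {
        Matcher m = PLACEHOLDER.matcher(pattern);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String key = m.group(1);
            String value = "domain".equals(key) ? name.getDomain() : name.getKeyProperty(key);
            if (value == null) {
                value = m.group(0);
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Convenience overload that parses the ObjectName from its string form.
    static String expand(String pattern, String objectName) {
        try {
            return expand(pattern, new ObjectName(objectName));
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Invalid ObjectName: " + objectName, e);
        }
    }

    public static void main(String[] args) {
        String name = "org.apache.cassandra.metrics:type=ColumnFamily,keyspace=k1,scope=s1,name=WriteLatency";
        System.out.println(expand("${domain}.write-latency;keyspace=${keyspace};scope=${scope};type=${type}", name));
    }
}
```

With the example ObjectName in main, the expanded name is org.apache.cassandra.metrics.write-latency;keyspace=k1;scope=s1;type=ColumnFamily. Mapping the domain to a shorter prefix such as "cassandra" would need an extra rule that this sketch does not cover.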

java.lang.NoClassDefFoundError: com/e_gineering/collectd/FastJMX

Here is my config. Thanks for any help!

LoadPlugin java

<Plugin "java">
  JVMARG "-Djava.class.path=/usr/share/collectd/java/collectd-api.jar:/usr/share/collectd/java/collectd-fast-jmx.jar"

  LoadPlugin "com.e_gineering.collectd.FastJMX"

  <Plugin "FastJMX">

    MaxThreads 256
    CollectInternal true

    <MBean "classes">
      ObjectName "java.lang:type=ClassLoading"

      <Value "LoadedClassCount">
        Type "gauge"
        InstancePrefix "loaded_classes"
        PluginName "JVM"
      </Value>
    </MBean>
...................
 collectd[19249]: java plugin: cjni_config_load_plugin: FindClass (com/e_gineering/collectd/FastJMX) failed.
Jan 26 20:14:19  collectd: Exception in thread "main" java.lang.NoClassDefFoundError: com/e_gineering/collectd/FastJMX
Jan 26 20:14:19  collectd: Caused by: java.lang.ClassNotFoundException: com.e_gineering.collectd.FastJMX
Jan 26 20:14:19  collectd: at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
Jan 26 20:14:19  collectd: at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
Jan 26 20:14:19  collectd: at java.security.AccessController.doPrivileged(Native Method)
Jan 26 20:14:19  collectd[19249]: java plugin: Configuration block for `FastJMX' found, but no such configuration callback has been registered. Please make sure, the `LoadPlugin' lines precede the `Plugin' blocks.
Jan 26 20:14:19  collectd: at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
Jan 26 20:14:19  collectd: at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
Jan 26 20:14:19  collectd: at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
Jan 26 20:14:19  collectd: at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
Jan 26 20:14:19  collectd[19249]: Initialization complete, entering read-loop.

allowing override of hostname

Is there a mechanism to override the hostname if we WANT to do so? I left the Hostname option in, but it still used localhost.

We need this because SignalFX tags metrics by overriding the hostname, like so:

Host "hostname[hostHasService=activemq]"

Support wildcards in Collect option (enhancement request)

It would be nice to be able to use wildcards when listing which MBeans to collect, to avoid having to type each one manually. It would be especially helpful to me, since I generate the collectd config from other sources of information. For example, I would like to be able to write Collect "cassandra-*" instead of listing every one of the dozens of metrics.
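
As a rough illustration of what this request could mean in practice, the Java sketch below resolves a Collect value containing "*" against the names of the configured MBean blocks. It is a sketch under assumptions, not FastJMX behaviour: the class CollectWildcardSketch and its methods are hypothetical, and only "*" (any run of characters) is treated as a wildcard.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch of wildcard support for Collect; not existing FastJMX code.
final class CollectWildcardSketch {

    // Converts a glob where '*' matches any run of characters into a regex.
    // Everything else is quoted so that it matches literally.
    static Pattern toPattern(String glob) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') {
                if (literal.length() > 0) {
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(".*");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        return Pattern.compile(regex.toString());
    }

    // Returns the configured MBean block names matched by one Collect value, in order.
    static List<String> resolve(String collect, List<String> mbeanNames) {
        Pattern pattern = toPattern(collect);
        List<String> matches = new ArrayList<>();
        for (String name : mbeanNames) {
            if (pattern.matcher(name).matches()) {
                matches.add(name);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Example MBean block names; these are made up for the demonstration.
        List<String> names = Arrays.asList("cassandra-read", "memory", "cassandra-write");
        System.out.println(resolve("cassandra-*", names));
    }
}
```

A Collect value without "*" still matches only the block with exactly that name, so existing configurations would keep their meaning under this scheme.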

InstancePrefix in <Value> not being used

The InstancePrefix configuration directive within the <Value> block doesn't seem to be used. Below is my example config; I expected to see metrics like:

*.cass-test-1.jvm.heap.

and what I got was
*.cass-test-1.jvm.memory-

A quick test of changing the Type shows that the 'memory-' is coming from the type. Maybe the type is mistakenly being added to the metric name instead of the prefix?

LoadPlugin java

<Plugin java>
  JVMARG "-Djava.class.path=/usr/share/collectd/java/collectd-api.jar:/usr/share/collectd/java/collectd-fast-jmx-1.0.0.jar"

  LoadPlugin "com.e_gineering.collectd.FastJMX"

  <Plugin FastJMX>
    LogLevel "FINE"
    MaxThreads 256

    <MBean "memory">
      ObjectName "java.lang:type=Memory"

      <Value>
        Attribute "HeapMemoryUsage"
        Type "memory"
        Composite true
        InstancePrefix "heap"
        PluginName "jvm"
      </Value>
      <Value>
        Attribute "NonHeapMemoryUsage"
        Type "memory"
        Composite true
        InstancePrefix "non-heap"
        PluginName "jvm"
      </Value>
    </MBean>

    <Connection "cass-test-1">
      ServiceURL "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi"
      Collect "memory"
    </Connection>

  </Plugin>
</Plugin>

Failed to dispatch data from a service connected using "rmi" when there is also a service connected using "remoting-jmx"

Observed Behaviour
When trying to collect JVM data using "rmi" and "http-remoting-jmx" at the same time, we noticed that ONLY data from the connection built using "remoting-jmx" is collected. For every connection type, we can see that the connection is invoked, receives notifications, has its notification listeners added, and finds the expected instance in the service.

Below is an example of our FastJMX config.

LoadPlugin java
<Plugin "java">
  JVMARG "-Djava.class.path=/usr/share/collectd/java/collectd-api.jar:/usr/share/collectd/java/fast-jmx.jar:/usr/share/collectd/java/jboss-client.jar"
  LoadPlugin "org.collectd.FastJMX"
  <Plugin "FastJMX">
    LogLevel "FINEST"
    ForceLoggingTo "INFO"
    CollectInternal true

    <MBean "Service1_ClassLoading">
      ObjectName "java.lang:type=ClassLoading"
      InstancePrefix "ClassLoading"
      <Value>
        Type "gauge"
        Table false
        InstancePrefix "ClassesTotalLoadedClassCount"
        Attribute "TotalLoadedClassCount"
        PluginName 'Service1'
      </Value>
      <Value>
        Type "gauge"
        Table false
        InstancePrefix "ClassesUnloadedClassCount"
        Attribute "UnloadedClassCount"
        PluginName 'Service1'
      </Value>
    </MBean>

    <Connection>
      ServiceURL "service:jmx:rmi:///jndi/rmi://{{CONTAINER_HOST_ADDRESS}}:9000/jmxrmi"
      Synchronous true
      Host "{{HOST_NAME}}"
      InstancePrefix 'JVM'
      Collect "Service1_ClassLoading"
    </Connection>

    <MBean "Service2_ClassLoading">
      ObjectName "java.lang:type=ClassLoading"
      InstancePrefix "ClassLoading"
      <Value>
        Type "gauge"
        Table false
        InstancePrefix "ClassesTotalLoadedClassCount"
        Attribute "TotalLoadedClassCount"
        PluginName 'Service2'
      </Value>
      <Value>
        Type "gauge"
        Table false
        InstancePrefix "ClassesUnloadedClassCount"
        Attribute "UnloadedClassCount"
        PluginName 'Service2'
      </Value>
    </MBean>

    <Connection>
      ServiceURL  "service:jmx:http-remoting-jmx://{{CONTAINER_HOST_ADDRESS}}:{{JMX_PORT}}"
      User "admin"
      Password "admin"
      Host "{{HOST_NAME}}"
      InstancePrefix 'JVM'
      Collect "Service2_ClassLoading"
    </Connection>
  </Plugin>
</Plugin>

One server can block polling for others

First, thanks for this. We are using it happily in production and it has solved an issue we had with the GenericJMX plugin.

Now, we have one FastJMX plugin definition with multiple MBean entries (memory, threading, GC, ...) and multiple Connection blocks.
Our monitoring shows FastJMX occasionally dropping out for all defined connections, which seems to be related to one of the monitored machines running into an OutOfMemory error. During the dropout, collectd happily collects data for unrelated plugins.
Restarting said machine revives FastJMX and magically brings back graphs for all other connections as well.
Until the restart, the collectd logs contain "FastJMX Plugin: Failed to collect 98 of 98 samples within read interval with 2 threads".

All machines are running JRuby on either JVM7 or 8. We are using 1.0.0 from collectd-fast-jmx.

Expected behaviour would be for all unrelated connections to continue.
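
The expected behaviour can be illustrated with a small, self-contained Java sketch in which each connection's read runs as its own task under a shared deadline, so a host that hangs is skipped while the others still report. This is not how FastJMX schedules its reads internally; the class IsolatedPollingSketch, the pollAll method, and the use of one thread per connection are assumptions chosen to keep the example short.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names and structure are assumptions, not FastJMX internals.
final class IsolatedPollingSketch {

    // Runs one reader per connection and returns results only for readers that
    // finished within the timeout. Slow or failing readers are skipped.
    static Map<String, String> pollAll(Map<String, Callable<String>> readers, long timeoutMillis) {
        List<String> names = new ArrayList<>(readers.keySet());
        List<Callable<String>> tasks = new ArrayList<>();
        for (String name : names) {
            tasks.add(readers.get(name));
        }
        Map<String, String> results = new LinkedHashMap<>();
        // One thread per task, so a stuck reader cannot starve the others.
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, tasks.size()));
        try {
            // invokeAll cancels every task that is still running when the timeout expires.
            List<Future<String>> futures = pool.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS);
            for (int i = 0; i < futures.size(); i++) {
                Future<String> future = futures.get(i);
                if (future.isCancelled()) {
                    continue; // this reader timed out
                }
                try {
                    results.put(names.get(i), future.get());
                } catch (ExecutionException e) {
                    // this reader threw an exception; skip it and keep the rest
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, Callable<String>> readers = new LinkedHashMap<>();
        readers.put("stuck-host", () -> { Thread.sleep(5000); return "late"; });
        readers.put("healthy-host", () -> "ok");
        System.out.println(pollAll(readers, 500));
    }
}
```

In main, the reader for "stuck-host" sleeps past the 500 ms deadline and is cancelled, while the value for "healthy-host" is still returned.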

Reconnect failure to collect stats

On JBoss 4.2.3 GA
Connection string: service:jmx:rmi:///jndi/rmi://HOST:PORT/jmxrmi

I believe the reconnection does happen, as I stop seeing the log message saying "Scheduling reconnect..."
However, the error messages in the log are about beans not being found:

[2014-09-25 16:47:27] FastJMX plugin: Failed javax.management.InstanceNotFoundException: java.lang:type=MemoryPool,name=CMS Perm Gen is not registered.
[2014-09-25 16:47:27] FastJMX plugin: Failed javax.management.InstanceNotFoundException: java.lang:type=MemoryPool,name=Par Survivor Space is not registered.
[2014-09-25 16:47:27] FastJMX plugin: Failed javax.management.InstanceNotFoundException: java.lang:type=MemoryPool,name=Code Cache is not registered.

What I noticed was that after I connected to JMX via jconsole and just looked at those beans, the errors magically went away and I started seeing successful collection!

I will continue to monitor how things work after the next re-deploy.

Collectable Permutations not being removed for long-offline hosts...

After running collectd in a VM, suspending the VM overnight, and resuming it in the morning, the collectable permutations from the night before had not been removed after the reconnect period, and when connections were re-established (by reconnecting), new permutations were added.

Null Pointer Exception

After a while FastJMX crashes with the following null pointer exception:

Jul 10, 2015 9:57:45 AM org.collectd.Connection close
WARNING: Exception closing JMXConnection: error during JRMP connection establishment; nested exception is:
        java.net.SocketTimeoutException: Read timed out
Jul 10, 2015 9:57:50 AM org.collectd.SelfTuningCollectionExecutor push
WARNING: Failed to collect 13 of 13 samples within read interval with 1 threads.
Jul 10, 2015 9:57:50 AM org.collectd.Connection$ReconnectTask run
SEVERE: Failure to close for TTL reconnect to: service:jmx:rmi:///jndi/rmi://HOSTNAME:PORT/jmxrmi
Jul 10, 2015 9:57:50 AM org.collectd.Connection$ReconnectTask run
INFO: Error or TTL Expiration for service:jmx:rmi:///jndi/rmi://HOSTNAME:PORT/jmxrmi forcing reconnect..
Exception in thread "Connect-service:jmx:rmi:///jndi/rmi://HOSTNAME:PORT/jmxrmi" java.lang.NullPointerException
        at org.collectd.Connection$ReconnectTask.run(Connection.java:202)
        at java.util.TimerThread.mainLoop(Timer.java:555)
        at java.util.TimerThread.run(Timer.java:505)
