
akka-sensors's Introduction

Minimalist Akka Observability


Non-intrusive native Prometheus collectors for Akka internals, negligible performance overhead, suitable for production use.

  • Are you running (or about to run) Akka in production, full-throttle, and want to see what happens inside? Did your load tests produce some ask timeouts? thread starvation? threads behaving non-reactively? old code doing nasty blocking I/O?

  • Would it be nice to use Cinnamon Telemetry, but a Lightbend subscription is out of reach?

  • Does the overhead created by Kamon look unacceptable, especially when running full-throttle?

  • Already familiar with Prometheus/Grafana observability stack?

If you answer 'yes' to most of the questions above, Akka Sensors may be the right choice for you:

  • A comprehensive feature set to make the internals of your Akka application visible, in any environment, including high-load production.

  • It is OSS/free, as in MIT license, and uses explicit, very lightweight instrumentation - yet is a treasure trove for a busy observability engineer.

  • Won't inflate CPU costs when running in a public cloud.

  • Easy Demo/Evaluation setup included: Akka with Cassandra persistence, Prometheus server and Grafana dashboards.

Actor dashboard: Actors

Dispatcher dashboard: Dispatchers

Features

Dispatchers

  • time of runnable waiting in queue (histogram)
  • time of runnable run (histogram)
  • implementation-specific ForkJoinPool and ThreadPool stats (gauges)
  • thread states, as seen from JMX ThreadInfo (histogram, updated every thread-state-snapshot-period seconds)
  • active worker threads (histogram, updated on each runnable)
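To illustrate what the queue-wait and run timings above mean, here is a minimal sketch (not the library's actual implementation) of wrapping a task so both intervals can be measured; Sensors records them into Prometheus histograms instead of returning them:

```scala
import java.util.concurrent.{Callable, Executors}

object InstrumentedRunSketch {
  /** Runs `task` on a single-thread executor and returns
    * (queueWaitNanos, runNanos): the two timings published as histograms. */
  def timeOne(task: Runnable): (Long, Long) = {
    val executor  = Executors.newSingleThreadExecutor()
    val submitted = System.nanoTime() // taken when the task is queued
    val timed = new Callable[(Long, Long)] {
      def call(): (Long, Long) = {
        val started = System.nanoTime() // taken when a worker picks it up
        task.run()                      // the actual work
        (started - submitted, System.nanoTime() - started)
      }
    }
    try executor.submit(timed).get()
    finally executor.shutdown()
  }
}
```

In the real instrumented executor the wrapping happens transparently at task submission, so application code never sees it.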

Thread watcher

  • thread watcher, keeping an eye on threads that run suspiciously long and reporting their stacktraces - to help you find blocking code quickly
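As a rough illustration of the idea (a sketch, assuming nothing about the library's internals), a watcher can capture a long-running thread's stack trace via the JDK's Thread.getStackTrace, which is where blocking calls show up:

```scala
object ThreadWatcherSketch {
  /** Captures the stack trace of `thread` as "Class.method" strings,
    * the way a watcher would report a suspiciously long run. */
  def sample(thread: Thread): List[String] =
    thread.getStackTrace.toList.map(e => s"${e.getClassName}.${e.getMethodName}")
}
```

Sampling a thread stuck in Thread.sleep, for example, would surface java.lang.Thread.sleep at the top of the reported trace.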

Basic actor stats

  • number of actors (gauge)
  • time of actor 'receive' run (histogram)
  • actor activity time (histogram)
  • unhandled messages (count)
  • exceptions (count)

Persistent actor stats

  • recovery time (histogram)
  • number of recovery events (histogram)
  • persist time (histogram)
  • recovery failures (counter)
  • persist failures (counter)

Cluster

  • cluster events, per type/member (counter)

Cassandra

Instrumented Cassandra session provider, exposing Cassandra client metrics collection.

  • requests
  • traffic in/out
  • timeouts

Java Virtual Machine (from Prometheus default collectors)

  • number of instances
  • start time / uptime
  • JVM version
  • memory pools
  • garbage collector

Demo setup

We assume you have Docker and docker-compose up and running.

Prepare sample app:

sbt "compile; project app; docker:publishLocal"

Start observability stack:

docker-compose -f examples/observability/docker-compose.yml up

Send some events:

for z in {1..100}; do curl -X POST http://localhost:8080/api/ping-fj/$z/100; done
for z in {101..200}; do curl -X POST http://localhost:8080/api/ping-tp/$z/100; done
for z in {3001..3300}; do curl -X POST http://localhost:8080/api/ping-persistence/$z/300 ; done

Open Grafana at http://localhost:3000.

Go to http://localhost:3000/plugins/sensors-prometheus-app, click Enable. Sensors' bundled dashboards will be imported.

Usage

SBT dependency

libraryDependencies ++= 
  Seq(
     "nl.pragmasoft.sensors" %% "sensors-core" % "0.2.2",
     "nl.pragmasoft.sensors" %% "sensors-cassandra" % "0.2.2"
  )

Prometheus exporter

If you already have a Prometheus exporter in your application, CollectorRegistry.defaultRegistry will be used by default. For finer control, override AkkaSensors.prometheusRegistry.

For an example of an HTTP exporter service, check the MetricService implementation in the example application (app) module.

Application configuration

Override the dispatcher's type and executor with Sensors' instrumented implementations, and add akka.sensors.AkkaSensorsExtension to extensions.

akka {

  actor {

    # main/global/default dispatcher

    default-dispatcher {
      type = "akka.sensors.dispatch.InstrumentedDispatcherConfigurator"
      executor = "akka.sensors.dispatch.InstrumentedExecutor"

      instrumented-executor {
        delegate = "fork-join-executor" 
        measure-runs = true
        watch-long-runs = true
        watch-check-interval = 1s
        watch-too-long-run = 3s
      }
    }

    # some other dispatcher used in your app

    default-blocking-io-dispatcher {
      type = "akka.sensors.dispatch.InstrumentedDispatcherConfigurator"
      executor = "akka.sensors.dispatch.InstrumentedExecutor"

      instrumented-executor {
        delegate = "thread-pool-executor"
        measure-runs = true
        watch-long-runs = false
      }
    }
  }

  extensions = [
    akka.sensors.AkkaSensorsExtension
  ]
}

Using explicit/inline executor definition

akka {
  persistence {
    cassandra {
      default-dispatcher {
        type = "akka.sensors.dispatch.InstrumentedDispatcherConfigurator"
        executor = "akka.sensors.dispatch.InstrumentedExecutor"

        instrumented-executor {
          delegate = "fork-join-executor"
          measure-runs = true
          watch-long-runs = false
        }

        fork-join-executor {
          parallelism-min = 6
          parallelism-factor = 1
          parallelism-max = 6
        }
      }
    }
  }
}

Actors (classic)

// Non-persistent actors
class MyImportantActor extends Actor with ActorMetrics {

  // This becomes the label 'actor'; the default is the simple class name,
  // but you may segment it further.
  // Just make sure the cardinality stays sane (<100).
  override protected def actorTag: String = ...

  ... // your implementation
}
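The actorTag comment above warns about label cardinality. As an illustration, here is a minimal sketch (a hypothetical helper, not part of the library) that folds unbounded entity ids into a bounded set of tags:

```scala
object ActorTagSketch {
  val Buckets = 16 // hypothetical cap, well under the advised <100

  /** Derives a bounded actor tag from an unbounded entity id by hashing
    * it into a fixed number of buckets, keeping label cardinality sane. */
  def tagFor(entityId: String): String =
    s"MyImportantActor-${math.abs(entityId.hashCode % Buckets)}"
}
```

With this approach, millions of distinct entity ids still produce at most 16 distinct 'actor' label values.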

// Persistent actors
class MyImportantPersistentActor extends Actor with PersistentActorMetrics {
  ...


Actors (typed)

val behavior = BehaviorMetrics[Command]("ActorLabel") // basic actor metrics
    .withReceiveTimeoutMetrics(TimeoutCmd) // counts received timeout commands
    .withPersistenceMetrics // if the inner behavior is event-sourced, persistence metrics will be collected
    .setup { ctx: ActorContext[Command] =>
      ... // your implementation
    }

Internal parameters

Some parameters of the Sensors library itself that you may want to tune:

akka.sensors {
  thread-state-snapshot-period = 5s
  cluster-watch-enabled = false
}

Additional metrics

For anything additional to measure in actors, extend *ActorMetrics in your own trait.

trait CustomActorMetrics extends ActorMetrics with MetricsBuilders {

  val importantEvents: Counter = counter
    .name("important_events_total")
    .help("Important events")
    .labelNames("actor")
    .register(metrics.registry)

}

Why is Codahale used alongside Prometheus?

We would prefer 100% Prometheus; however, the Datastax OSS Cassandra driver doesn't support Prometheus collectors. Prometheus is our preferred main metrics engine, hence we bridge metrics from Codahale via JMX. This won't be needed anymore once the Datastax driver supports Prometheus natively.
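For illustration, the JMX side of such a bridge boils down to reading attributes from the platform MBean server. A minimal sketch using only the JDK (it reads a built-in JVM MBean here; the driver's Codahale metrics are exposed the same way, just under different object names):

```scala
import java.lang.management.ManagementFactory
import javax.management.ObjectName

object JmxReadSketch {
  /** Reads a single attribute from the platform MBean server,
    * the mechanism used to pull JMX-exposed metrics into Prometheus. */
  def readAttribute(objectName: String, attribute: String): AnyRef = {
    val server = ManagementFactory.getPlatformMBeanServer
    server.getAttribute(new ObjectName(objectName), attribute)
  }
}
```

For example, JmxReadSketch.readAttribute("java.lang:type=Threading", "ThreadCount") returns the JVM's current live thread count; a bridge polls such attributes and republishes them as Prometheus gauges.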

akka-sensors's People

Contributors

ashpakau, brzzbr, coffius, irevive, jacum, scala-steward


akka-sensors's Issues

Q: Become monitoring backend agnostic

Hello, thank you for your library, very interesting.
I have a suggestion to make it agnostic to the monitoring backend, e.g. so one could plug in statsd or Datadog/New Relic instead of Prometheus.

snakeyaml version in example conflicts with Cassandra

When running the example from the docker image, the following exception prevents startup:

app-1         | 17:50:50.795 [main] INFO  o.a.c.config.YamlConfigurationLoader - Configuration location: file:/tmp/cassandra/cassandra-server.yaml
app-1         | Exception in thread "main" java.lang.NoSuchMethodError: org.yaml.snakeyaml.constructor.Constructor.<init>(Ljava/lang/Class;)V
app-1         | 	at org.apache.cassandra.config.YamlConfigurationLoader$CustomConstructor.<init>(YamlConfigurationLoader.java:139)
app-1         | 	at org.apache.cassandra.config.YamlConfigurationLoader.loadConfig(YamlConfigurationLoader.java:120)
app-1         | 	at org.apache.cassandra.config.YamlConfigurationLoader.loadConfig(YamlConfigurationLoader.java:101)
app-1         | 	at org.apache.cassandra.config.DatabaseDescriptor.loadConfig(DatabaseDescriptor.java:276)
app-1         | 	at org.apache.cassandra.config.DatabaseDescriptor.daemonInitialization(DatabaseDescriptor.java:152)
app-1         | 	at org.apache.cassandra.config.DatabaseDescriptor.daemonInitialization(DatabaseDescriptor.java:137)
app-1         | 	at org.cassandraunit.utils.EmbeddedCassandraServerHelper.startEmbeddedCassandra(EmbeddedCassandraServerHelper.java:145)
app-1         | 	at org.cassandraunit.utils.EmbeddedCassandraServerHelper.startEmbeddedCassandra(EmbeddedCassandraServerHelper.java:108)
app-1         | 	at org.cassandraunit.utils.EmbeddedCassandraServerHelper.startEmbeddedCassandra(EmbeddedCassandraServerHelper.java:92)
app-1         | 	at nl.pragmasoft.app.Main$.<clinit>(Main.scala:16)
app-1         | 	at nl.pragmasoft.app.Main.main(Main.scala)

I think this is because the hardcoded snakeyaml for prometheus at

https://github.com/jacum/akka-sensors/blob/master/project/Dependencies.scala#L44

conflicts with the snakeyaml 1.1 in Cassandra.

Scala 2.13: `ScalaRunnableWrapper` uses `akka.dispatch.Batchable`

Hello! Thanks for the awesome library.

I have a question regarding ScalaRunnableWrapper for Scala 2.13:

import akka.dispatch.Batchable
import akka.sensors.dispatch.DispatcherInstrumentationWrapper.Run
import scala.PartialFunction.condOpt

object ScalaRunnableWrapper {
  def unapply(runnable: Runnable): Option[Run => Runnable] =
    condOpt(runnable) {
      case runnable: Batchable => new OverrideBatchable(runnable, _)
    }

  class OverrideBatchable(self: Runnable, r: Run) extends Batchable with Runnable {
    def run(): Unit = r(() => self.run())
    def isBatchable: Boolean = true

According to the code, the match expects the akka.dispatch.Batchable type. Then, AkkaRunnableWrapper relies on akka.dispatch.Batchable too.

case runnable: Batchable => new BatchableWrapper(runnable, _)

If I get it right, the case ScalaRunnableWrapper(runnable) => runnable(r) will never match.

def apply(runnableParam: Runnable, r: Run): Runnable =
  runnableParam match {
    case AkkaRunnableWrapper(runnable)  => runnable(r)
    case ScalaRunnableWrapper(runnable) => runnable(r)
    case runnable                       => new Default(runnable, r)
  }

I assume ScalaRunnableWrapper should import scala.concurrent.Batchable instead?
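For reference, a minimal sketch (assuming Scala 2.13, where scala.concurrent.Batchable is available) of matching on the standard library's marker trait, as the issue suggests:

```scala
import scala.PartialFunction.condOpt
import scala.concurrent.Batchable

object ScalaBatchableSketch {
  /** Recognises runnables marked as batchable by the Scala 2.13
    * standard library, which a match on akka.dispatch.Batchable misses. */
  def isScalaBatchable(runnable: Runnable): Boolean =
    condOpt(runnable) { case b: Batchable => b }.isDefined
}
```

A plain Runnable would not match, while one mixing in scala.concurrent.Batchable would, which is exactly the distinction the wrapper needs to make.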
