Code Monkey home page Code Monkey logo

Comments (7)

nathanielc avatar nathanielc commented on August 26, 2024

@elvarb Can you share you Kapacitor config for how you setup Morgoth?

What error are you getting?

from morgoth.

elvarb avatar elvarb commented on August 26, 2024
[udf]
# Configuration for UDFs (User Defined Functions)
[udf.functions]
    [udf.functions.morgoth]
      prog = "/opt/morgoth/morgoth"
      timeout = "10s"

This is the config.

I'm getting no errors at all.

when I do kapacitor show cpu_morgoth I get this, so something is happening at least

DOT:
digraph cpu_morgoth {
graph [throughput="0.00 points/s"];

stream0 [avg_exec_time_ns="0" ];
stream0 -> from1 [processed="3577"];

from1 [avg_exec_time_ns="87.5µs" ];
from1 -> window2 [processed="3577"];

window2 [avg_exec_time_ns="45.85µs" ];
window2 -> default3 [processed="168"];

default3 [avg_exec_time_ns="66.4µs" fields_defaulted="2796" tags_defaulted="0" ];
default3 -> morgoth4 [processed="168"];

morgoth4 [avg_exec_time_ns="156.498µs" ];
morgoth4 -> alert5 [processed="2"];

alert5 [alerts_triggered="0" avg_exec_time_ns="0" crits_triggered="0" infos_triggered="0" oks_triggered="0" warns_triggered="0" ];
}

from morgoth.

nathanielc avatar nathanielc commented on August 26, 2024

Add the -log-level as an arg to the morgoth command.

[udf]
# Configuration for UDFs (User Defined Functions)
[udf.functions]
    [udf.functions.morgoth]
      prog = "/opt/morgoth/morgoth"
      args = ["-log-level", "debug"]
      timeout = "10s"

Then the logs will show up with the Kapacitor logs.

from morgoth.

elvarb avatar elvarb commented on August 26, 2024

I screwed up the args before, had the line like this
args = ["-log-level debug"]

Now I'm getting debug information in the log.

After letting Kapacitor and morgoth run on a cpu metric for a long long time I got 22 info hits where info was defined as
.info(lambda: TRUE)

Even thousands of events were sent to Morgoth it only returned 22 events as suspicious. The Anomaly score was 0.9 of those events. Is this normal or should Morgoth be sending everything it finds down the chain?

stream0 [avg_exec_time_ns="0" ];
stream0 -> from1 [processed="350"];

from1 [avg_exec_time_ns="6.7µs" ];
from1 -> window2 [processed="350"];

window2 [avg_exec_time_ns="4.7µs" ];
window2 -> default3 [processed="6"];

default3 [avg_exec_time_ns="0" fields_defaulted="177" tags_defaulted="0" ];
default3 -> morgoth4 [processed="6"];

morgoth4 [avg_exec_time_ns="2.8µs" ];
morgoth4 -> alert5 [processed="0"];

alert5 [alerts_triggered="0" avg_exec_time_ns="0" crits_triggered="0" infos_triggered="0" oks_triggered="0" warns_triggered="0" ];

from morgoth.

elvarb avatar elvarb commented on August 26, 2024

I'm monitoring multiple cpu metrics and this is the debug information in the logs

[cpu_morgoth:morgoth4] 2016/09/05 14:06:04 I!P 2016/09/05 14:06:04 D! *sigma.Sigma anomalous? false support: 0.500000
[cpu_morgoth:morgoth4] 2016/09/05 14:07:04 I!P 2016/09/05 14:07:04 D! *sigma.Sigma anomalous? false support: 1.000000
[cpu_morgoth:morgoth4] 2016/09/05 14:07:04 I!P 2016/09/05 14:07:04 D! *sigma.Sigma anomalous? false support: 1.000000
[cpu_morgoth:morgoth4] 2016/09/05 14:07:04 I!P 2016/09/05 14:07:04 D! *sigma.Sigma anomalous? false support: 1.000000
[cpu_morgoth:morgoth4] 2016/09/05 14:08:04 I!P 2016/09/05 14:08:04 D! *sigma.Sigma anomalous? false support: 1.000000
[cpu_morgoth:morgoth4] 2016/09/05 14:08:04 I!P 2016/09/05 14:08:04 D! *sigma.Sigma anomalous? false support: 1.000000
[cpu_morgoth:morgoth4] 2016/09/05 14:12:04 I!P 2016/09/05 14:12:04 D! *sigma.Sigma anomalous? false support: 1.000000
[cpu_morgoth:morgoth4] 2016/09/05 14:12:04 I!P 2016/09/05 14:12:04 D! *sigma.Sigma anomalous? false support: 1.000000
[cpu_morgoth:morgoth4] 2016/09/05 14:12:04 I!P 2016/09/05 14:12:04 D! *sigma.Sigma anomalous? false support: 1.000000
[cpu_morgoth:morgoth4] 2016/09/05 14:12:04 I!P 2016/09/05 14:12:04 D! *sigma.Sigma anomalous? false support: 1.000000
[cpu_morgoth:morgoth4] 2016/09/05 14:12:04 I!P 2016/09/05 14:12:04 D! *sigma.Sigma anomalous? false support: 1.000000
[cpu_morgoth:morgoth4] 2016/09/05 14:12:04 I!P 2016/09/05 14:12:04 D! *sigma.Sigma anomalous? false support: 0.666667
[cpu_morgoth:morgoth4] 2016/09/05 14:16:04 I!P 2016/09/05 14:16:04 D! *sigma.Sigma anomalous? false support: 1.000000
[cpu_morgoth:morgoth4] 2016/09/05 14:17:04 I!P 2016/09/05 14:17:04 D! *sigma.Sigma anomalous? false support: 1.000000

The timestamps there do not match the pattern of the metrics its processing, so what triggers a debug log write?

What do the numbers mean at the end?

from morgoth.

nathanielc avatar nathanielc commented on August 26, 2024

@elvarb Answers to your questions below...

Even thousands of events were sent to Morgoth it only returned 22 events as suspicious. The Anomaly score was 0.9 of those events. Is this normal or should Morgoth be sending everything it finds down the chain?

Morgoth only sends the anomalous events down the chain. You can use join to join the original data back up with the anomalous score if need be. As for only getting 22 events out of thousands that is initially at least a good sign that Morgoth is not spewing false positives ;)

The timestamps there do not match the pattern of the metrics its processing, so what triggers a debug log write?

The timestamps on the logs are from the system time. Kapacitor can process points for any time period. How far off are the timestamps? If you are sending the data in real time then they should be close but may not match up exactly.

What do the numbers mean at the end?

Support is from the Lossy Counting algorithm, it is essentially saying how frequently this kind of window has been seen before. A larger number means it is more frequent aka normal, and a smaller number means it is less frequent and potentially anomalous. The default minimum support is 0.05, meaning if a window has a support greater than 0.05 it is considered normal otherwise it is anomalous. The support can be interpreted as a percentage and ranges from 0-1. For example a minimum support of 0.05 means events less than 5% frequent are anomalous.

The minimum support can be set via the minSupport property:

var cpu = stream
    |from()
        .measurement('win_cpu')
        .groupBy('host','domain')
    |window()
        .period(5m)
        .every(5m)
        .align()
    |default()
        .field('% Processor Time', 0.0)
  @morgoth()
    .field('% Processor Time')
    .scoreField('anomalyScore')
    .sigma(3.5)
    // set minimum support
   .minSupport(0.1)

cpu
    |alert()
        .info(lambda: 'anomalyScore' > 0.0)
        .warn(lambda: 'anomalyScore' > 1.0)
        .crit(lambda: 'anomalyScore' > 3.0)
        .log('/tmp/cpu_morgoth.log')

Hope that helps.

from morgoth.

elvarb avatar elvarb commented on August 26, 2024

I understand this much better now!

The modified tick script looks like this

var cpu = stream
    |from()
        .measurement('win_cpu')
        .where(lambda: "% Processor Time" >= 0.0)
        .groupBy(time(1m), 'host', 'domain')
    |window()
        .period(5m)
        .every(1m)
    @morgoth()
        .field('% Processor Time')
        .scoreField('anomalyScore')
        .minSupport(0.1)
        .sigma(2.0)

cpu
    |alert()
        .info(lambda: TRUE)
        .warn(lambda: 'anomalyScore' > 1.0)
        .crit(lambda: 'anomalyScore' > 3.0)
        .log('/tmp/cpu_morgoth.log')

The major changes I did was to add a where line to only select data if that metric value is used. The default setting I had before was throwing the detection way off. Also there is a groupby time setting because the metrics are only coming in every minute.

Setting minsupport meant that more would be passed through Morgoth.

The stats now make more sense also

stream0 [avg_exec_time_ns="0" ];
stream0 -> from1 [processed="4289"];

from1 [avg_exec_time_ns="73.125µs" ];
from1 -> window2 [processed="857"];

window2 [avg_exec_time_ns="31.212µs" ];
window2 -> morgoth3 [processed="351"];

morgoth3 [avg_exec_time_ns="278.18µs" ];
morgoth3 -> alert4 [processed="27"];

Next step for me is to write the anomaly score back to InfluxDB so its easier for me to see the relationship between the metrics and the score. Thanks for your help.

from morgoth.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.