
mext's Introduction

The Modern ML Monitoring Mess: Failure Modes in Extending Prometheus

Accompanying blog post here.

This WIP project (Monitoring Extension) aims to benchmark open-source ML monitoring tools. Tools in the benchmark include:

  • Prometheus

ML Task and Pipeline Architecture

Data source

We use the NYC taxicab data, which has been migrated from a bucket of flat files to an AWS RDS instance via the TTB project.

Feedback lag

To simulate lag that a real-world system might experience, we inject a delay sampled from a Gaussian distribution. (TODO: shreyashankar)
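A minimal sketch of how such a delay could be injected, using only the standard library. The function names, the mean and standard deviation, and the choice to clip negative draws at zero are all illustrative assumptions; the project does not specify them.

```python
import random
from datetime import datetime, timedelta

def sample_feedback_delay(rng, mean_seconds=3600.0, std_seconds=900.0):
    """Draw a feedback delay from a Gaussian, clipped at zero (parameters are assumed)."""
    # A Gaussian can produce negative values; feedback cannot arrive before the prediction.
    return timedelta(seconds=max(0.0, rng.gauss(mean_seconds, std_seconds)))

def feedback_time(output_time, rng):
    """Timestamp at which the label for a prediction becomes visible."""
    return output_time + sample_feedback_delay(rng)

rng = random.Random(0)  # seeded so runs are repeatable
predicted_at = datetime(2020, 2, 1, 12, 0, 0)
label_visible_at = feedback_time(predicted_at, rng)
print(label_visible_at - predicted_at)
```

Clipping at zero slightly distorts the distribution when the mean is small relative to the standard deviation; a truncated or log-normal distribution would be an alternative in that case.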

Pipeline

The ML task is to predict whether a passenger will give a taxi driver a sizeable tip (10% or more). Pipeline components are defined in the components folder and are called in train.py to train a model on Jan 2020 data. The inference code is in inference/main.py, which runs the model on data in 2-day increments from Feb 1 2020 to May 31 2020.
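The 2-day inference schedule can be sketched as a generator of date windows. This is an illustration rather than the code in inference/main.py; the helper name date_windows is hypothetical, and it assumes May 31 is included, so the exclusive end date is Jun 1 2020.

```python
from datetime import date, timedelta

def date_windows(start, end, step_days=2):
    """Yield (window_start, window_end) pairs covering [start, end); window_end is exclusive."""
    step = timedelta(days=step_days)
    lo = start
    while lo < end:
        hi = min(lo + step, end)  # the last window may be shorter than step_days
        yield lo, hi
        lo = hi

# Feb 1 2020 through May 31 2020 inclusive, i.e. an exclusive end of Jun 1 2020.
windows = list(date_windows(date(2020, 2, 1), date(2020, 6, 1)))
for window_start, window_end in windows[:3]:
    print(window_start, "to", window_end)
```

Because the range spans 121 days, the final window covers a single day under these assumptions.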

Prometheus Extension

We use 2 Gauge Metrics -- one for outputs, and one for feedback -- and aggregate them in PromQL to compute accuracy. These Metrics are defined in lib/prometheus_ml_ext.py. Read the accompanying blog post for more details.

mext's People

Contributors

shreyashankar

mext's Issues

Idea: Design for monitoring interface

Interface Layer

Users will create Tasks and Metrics (M:M relationship). An example of a Task might be high_tip_prediction. An example of a Metric might be accuracy. The user's code might look like this to define a Metric:

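class Accuracy(Metric): # Metric is an mltrace abstract class that has abstract method compute
  def __init__(self, window_size, compute_frequency):
    super().__init__(name="accuracy", window_size=window_size, compute_frequency=compute_frequency)
    # Running counts, kept across calls to compute
    self.numerator = 0.0
    self.denominator = 0.0

  def compute(self, new_outputs, new_feedback):
    # Compare element-wise so this works for plain lists as well as NumPy arrays
    self.numerator += sum(o == f for o, f in zip(new_outputs, new_feedback))
    self.denominator += len(new_outputs)
    # Return the running accuracy, or None before any outputs have arrived
    return self.numerator / self.denominator if self.denominator else None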

To define a Task:

t = Task("high_tip_prediction")
t.registerMetric(Accuracy(30, 1))
t.registerMetric(Accuracy(7, 1))
""" prediction code """
t.logOutput(prediction, output_id)
t.logFeedback(label, output_id)
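To make the proposed interface concrete, here is a small in-memory sketch of how Task and Metric might fit together. It is not mltrace's actual API: the Metric base class, the pairing of outputs and feedback by output_id, and the latest dictionary are assumptions for illustration, and window_size and compute_frequency are stored but not enforced.

```python
from abc import ABC, abstractmethod

class Metric(ABC):
    """Stand-in for the proposed abstract base class (hypothetical)."""
    def __init__(self, name, window_size, compute_frequency):
        self.name = name
        self.window_size = window_size              # stored but not enforced here
        self.compute_frequency = compute_frequency  # stored but not enforced here

    @abstractmethod
    def compute(self, new_outputs, new_feedback):
        ...

class Accuracy(Metric):
    def __init__(self, window_size, compute_frequency):
        super().__init__("accuracy", window_size, compute_frequency)
        self.numerator = 0.0
        self.denominator = 0.0

    def compute(self, new_outputs, new_feedback):
        self.numerator += sum(o == f for o, f in zip(new_outputs, new_feedback))
        self.denominator += len(new_outputs)
        return self.numerator / self.denominator if self.denominator else None

class Task:
    def __init__(self, name):
        self.name = name
        self.metrics = []
        self.outputs = {}   # output_id -> prediction still waiting for feedback
        self.feedback = {}  # output_id -> label that arrived before its prediction
        self.latest = {}    # (metric name, window_size) -> last computed value

    def registerMetric(self, metric):
        self.metrics.append(metric)

    def logOutput(self, value, output_id):
        self.outputs[output_id] = value
        self._match(output_id)

    def logFeedback(self, value, output_id):
        self.feedback[output_id] = value
        self._match(output_id)

    def _match(self, output_id):
        # Update metrics only once both halves of a pair are present.
        if output_id in self.outputs and output_id in self.feedback:
            out, fb = self.outputs.pop(output_id), self.feedback.pop(output_id)
            for m in self.metrics:
                self.latest[(m.name, m.window_size)] = m.compute([out], [fb])

t = Task("high_tip_prediction")
t.registerMetric(Accuracy(30, 1))
t.logOutput(1, "ride-1")
t.logOutput(0, "ride-2")
t.logFeedback(1, "ride-1")  # correct prediction
t.logFeedback(1, "ride-2")  # incorrect prediction
print(t.latest[("accuracy", 30)])
```

Matching on output_id means feedback can arrive before or after its prediction, which matters once feedback lag is simulated.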

Execution Layer

Each metric will have a new materialized view. When a materialized view is updated, we will run the user's metric computation function and output the value to the metric_history_table.

Storage Layer

On our end, we have 2 tables: one for outputs and one for feedback. The schema for the outputs table is as follows (the feedback table has the same schema):

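from sqlalchemy import Column, DateTime, Index, Numeric, String, Table, text

# Base is assumed to be the project's SQLAlchemy declarative base
output_table = Table(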
    "outputs",
    Base.metadata,
    Column("timestamp", DateTime),
    Column("identifier", String),
    Column("task_name", String),
    Column("value", Numeric),
    Index("outputs_ts_name_asc", "timestamp", "task_name"),
    Index(
        "outputs_ts_name_desc",
        text("timestamp DESC"),
        "task_name",
    ),
)

We want to create triggers on these tables that create materialized views to hold the relevant history for each metric. Then we need triggers that actually compute the metrics. We probably need a table to store metric values, with the following schema:

metric_history_table = Table(
    "metric_history",
    Base.metadata,
    Column("timestamp", DateTime),
    Column("metric_name", String),
    Column("task_name", String),
    Column("value", Numeric),
    Index("ts_task_metric_asc", "timestamp", "task_name", "metric_name"),
    Index(
        "ts_task_metric_desc",
        text("timestamp DESC"),
        "task_name",
        "metric_name"
    ),
)
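As a rough end-to-end sketch of how the three tables could interact, the example below uses an in-memory SQLite database and a plain function call in place of the triggers and materialized views described above. The function name record_accuracy, the sample rows, and the use of ISO-8601 strings for timestamps are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE outputs (timestamp TEXT, identifier TEXT, task_name TEXT, value NUMERIC);
    CREATE TABLE feedback (timestamp TEXT, identifier TEXT, task_name TEXT, value NUMERIC);
    CREATE TABLE metric_history (timestamp TEXT, metric_name TEXT, task_name TEXT, value NUMERIC);
""")

def record_accuracy(conn, task_name, as_of):
    """Join outputs to feedback on identifier and append one accuracy row to metric_history."""
    row = conn.execute(
        """
        SELECT AVG(o.value = f.value)
        FROM outputs AS o
        JOIN feedback AS f
          ON o.identifier = f.identifier AND o.task_name = f.task_name
        WHERE o.task_name = ? AND f.timestamp <= ?
        """,
        (task_name, as_of),
    ).fetchone()
    accuracy = row[0]  # None when no output has feedback yet
    if accuracy is not None:
        conn.execute(
            "INSERT INTO metric_history VALUES (?, ?, ?, ?)",
            (as_of, "accuracy", task_name, accuracy),
        )
    return accuracy

task = "high_tip_prediction"
conn.executemany("INSERT INTO outputs VALUES (?, ?, ?, ?)", [
    ("2020-02-01T10:00:00", "ride-1", task, 1),
    ("2020-02-01T10:05:00", "ride-2", task, 0),
    ("2020-02-01T10:10:00", "ride-3", task, 1),  # no feedback yet, so excluded by the join
])
conn.executemany("INSERT INTO feedback VALUES (?, ?, ?, ?)", [
    ("2020-02-01T11:00:00", "ride-1", task, 1),
    ("2020-02-01T11:30:00", "ride-2", task, 1),
])
accuracy = record_accuracy(conn, task, "2020-02-01T12:00:00")
print(accuracy)
```

The inner join drops outputs whose feedback has not arrived, so the metric reflects only labeled predictions; a trigger-based version would run the same kind of query whenever the underlying view changes.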
