
apollo-prometheus-exporter's Introduction

Apollo Prometheus Exporter


Plugin for Apollo Server to export metrics in Prometheus format.

It uses prom-client under the hood to export the metrics.

Since Apollo Server released a new major version, a new version (v2.x.y) of the exporter has been launched. Apollo Server v2 is still supported in v1.x.y. The two versions will be kept feature-matched as much as possible.

Metrics

Name Description Type
apollo_server_starting The last timestamp when Apollo Server was starting. Gauge
apollo_server_closing The last timestamp when Apollo Server was closing. Gauge
apollo_query_started The amount of received queries. Counter
apollo_query_failed The amount of queries that failed. Counter
apollo_query_parse_started The amount of queries for which parsing has started. Counter
apollo_query_parse_failed The amount of queries for which parsing has failed. Counter
apollo_query_validation_started The amount of queries for which validation has started. Counter
apollo_query_validation_failed The amount of queries for which validation has failed. Counter
apollo_query_resolved The amount of queries which could be resolved. Counter
apollo_query_execution_started The amount of queries for which execution has started. Counter
apollo_query_execution_failed The amount of queries for which execution has failed. Counter
apollo_query_duration The total duration of a query. Histogram
apollo_query_field_resolution_duration The total duration for resolving fields. Histogram

For the default metrics, please refer to the prom-client default metrics documentation.

Usage

  1. Install prom-client and @bmatei/apollo-prometheus-exporter

    npm install prom-client @bmatei/apollo-prometheus-exporter
  2. Create an instance of the plugin

    const app = express();
    
    const prometheusExporterPlugin = createPrometheusExporterPlugin({ app });
  3. Add the plugin to ApolloServer

    const server = new ApolloServer({
      plugins: [prometheusExporterPlugin]
    });
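
Putting the three steps together, a minimal end-to-end sketch (assuming Express and apollo-server-express v3; typeDefs and resolvers stand in for your own schema):

    const express = require('express');
    const { ApolloServer } = require('apollo-server-express');
    const { createPrometheusExporterPlugin } = require('@bmatei/apollo-prometheus-exporter');

    async function start() {
      const app = express();

      // The plugin wires the /metrics endpoint into this Express app.
      const prometheusExporterPlugin = createPrometheusExporterPlugin({ app });

      const server = new ApolloServer({
        typeDefs,
        resolvers,
        plugins: [prometheusExporterPlugin]
      });

      await server.start();
      server.applyMiddleware({ app });

      app.listen(4000);
    }

    start();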

For a complete working example, please have a look at the example project in this repository.

Options

Name Description Type Default Value
app Express instance. For the moment it is used for defining the metrics endpoint. It is mandatory unless metricsEndpoint is set to false. Express undefined
defaultLabels An object containing default labels to be sent with each metric. Object {}
defaultMetrics Flag to enable/disable the default metrics registered by prom-client. Boolean true
defaultMetricsOptions Configuration object for the default metrics. DefaultMetricsCollectorConfiguration {}
durationHistogramBuckets A list of durations that should be used by histograms. number[] [0.001, 0.005, 0.015, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 1, 5, 10]
hostnameLabel Flag to enable/disable hostname label. Boolean true
hostnameLabelName The name of the hostname label. String hostname
metricsEndpoint Flag to enable/disable the metrics endpoint. If you disable this, you can use the registerPrometheusMetricsEndpoint method to enable the metrics endpoint. Boolean true
metricsEndpointPath HTTP path where the metrics will be published. String "/metrics"
register Prometheus client registry to be used by Apollo Metrics. By default, it is also used by the default metrics. Registry register
skipMetrics A key-value map that controls if a metric is enabled or disabled. SkipMetricsMap {}
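
For illustration, a sketch combining a few of these options (all names as documented above; the registry and label values are arbitrary):

    const { Registry } = require('prom-client');

    const register = new Registry();

    const prometheusExporterPlugin = createPrometheusExporterPlugin({
      app,
      register,                               // use a dedicated registry
      defaultLabels: { service: 'listing' },  // sent with every metric
      hostnameLabel: false,                   // drop the hostname label
      durationHistogramBuckets: [0.05, 0.1, 0.5, 1, 5]
    });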

Thanks

apollo-prometheus-exporter's People

Contributors

anothergitprofile, bfmatei, dependabot[bot], rgeyer, tomwilkie


apollo-prometheus-exporter's Issues

Error: metric has already been registered

I have an Express server with express-prom-bundle and apollo-prometheus-exporter installed and set up, and when I start my server, it crashes.

This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). The promise rejected with the reason:
Error: A metric with the name process_cpu_user_seconds_total has already been registered.

I guess the reason is that express-prom-bundle has registered the process_cpu_user_seconds_total metric, and apollo-prometheus-exporter also tries to register it, so it crashes.
(FYI, I found that both express-prom-bundle and apollo-prometheus-exporter call collectDefaultMetrics, which is implemented in prom-client.)

Would it be reasonable to catch and ignore the error when a metric is re-registered? Or is there anything else we can do?
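
One workaround, since the plugin documents a defaultMetrics flag: let express-prom-bundle keep collecting the default process metrics and disable them in this plugin, so collectDefaultMetrics only runs once. A sketch:

const prometheusExporterPlugin = createPrometheusExporterPlugin({
  app,
  defaultMetrics: false // express-prom-bundle already registers these
});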

Break down request duration by request?

So it appears that apollo_query_duration_bucket is supposed to include a label called operationName carrying the name of the individual query or mutation being called, but it doesn't appear to be working for me. It's not clear whether this is a bug, simply not implemented, or impossible to implement. I'm glad to fix or add it if it's just something broken or unimplemented. Without a breakdown by individual query/mutation name, the metrics won't be all that valuable to us.

The automated release is failing 🚨

🚨 The automated release from the master branch failed. 🚨

I recommend you give this issue a high priority, so other packages depending on you could benefit from your bug fixes and new features.

You can find below the list of errors reported by semantic-release. Each one of them has to be resolved in order to automatically publish your package. I’m sure you can resolve this 💪.

Errors are usually caused by a misconfiguration or an authentication problem. With each error reported below you will find explanation and guidance to help you to resolve it.

Once all the errors are resolved, semantic-release will release your package the next time you push a commit to the master branch. You can also manually restart the failed CI job that runs semantic-release.

If you are not sure how to resolve this, here are some links that can help you:

If those don’t help, or if this issue is reporting something you think isn’t right, you can always ask the humans behind semantic-release.


Missing package.json file.

A package.json file at the root of your project is required to release on npm.

Please follow the npm guideline to create a valid package.json file.


Good luck with your project ✨

Your semantic-release bot 📦🚀

Allow custom metrics?

I would be interested in implementing the ability to pass in additional metrics that could then be exposed via the /metrics endpoint beyond the built-in ones.
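
Since the plugin serves whatever lives in its registry, this may already be possible by registering your own prom-client metrics against the same registry (a sketch using the default global registry, which the plugin uses unless the register option is set):

const { Counter } = require('prom-client');

// Lands in the default registry, so it shows up on /metrics
// alongside the apollo_* metrics.
const jobsProcessed = new Counter({
  name: 'jobs_processed_total',
  help: 'Number of background jobs processed.'
});

jobsProcessed.inc();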

Plugin metrics not being recorded/returned

I followed the installation procedure, which was simple. I created the plugin with the Express app and assigned it to the ApolloServer configuration on creation. I ran a query on the server to record some metrics, then went to the /metrics endpoint of the server and got back a list of metrics. However, the metrics belonging to this plugin don't post any values, just the HELP and TYPE declarations.

Initialization:

let plugins = [loggerPlugin, httpResponsePlugin];
...
if(process.env.GRAPHQL_METRICS_ENABLED === '1') plugins.push(createPrometheusExporterPlugin({app}))
const server = new ApolloServer({
...
  plugins: [plugins],
...
})

Metrics returned:

...

# HELP nodejs_gc_duration_seconds Garbage collection duration by kind, one of major, minor, incremental or weakcb.
# TYPE nodejs_gc_duration_seconds histogram
nodejs_gc_duration_seconds_bucket{le="0.001",kind="minor",hostname="EREBUS"} 0
nodejs_gc_duration_seconds_bucket{le="0.01",kind="minor",hostname="EREBUS"} 7
nodejs_gc_duration_seconds_bucket{le="0.1",kind="minor",hostname="EREBUS"} 7
nodejs_gc_duration_seconds_bucket{le="1",kind="minor",hostname="EREBUS"} 7
nodejs_gc_duration_seconds_bucket{le="2",kind="minor",hostname="EREBUS"} 7
nodejs_gc_duration_seconds_bucket{le="5",kind="minor",hostname="EREBUS"} 7
nodejs_gc_duration_seconds_bucket{le="+Inf",kind="minor",hostname="EREBUS"} 7
nodejs_gc_duration_seconds_sum{kind="minor",hostname="EREBUS"} 0.009743800001218916
nodejs_gc_duration_seconds_count{kind="minor",hostname="EREBUS"} 7

# HELP apollo_server_starting The last timestamp when Apollo Server was starting.
# TYPE apollo_server_starting gauge

# HELP apollo_server_closing The last timestamp when Apollo Server was closing.
# TYPE apollo_server_closing gauge

# HELP apollo_query_started The amount of received queries.
# TYPE apollo_query_started counter

# HELP apollo_query_parse_started The amount of queries for which parsing has started.
# TYPE apollo_query_parse_started counter

# HELP apollo_query_parse_failed The amount of queries for which parsing has failed.
# TYPE apollo_query_parse_failed counter

# HELP apollo_query_validation_started The amount of queries for which validation has started.
# TYPE apollo_query_validation_started counter

# HELP apollo_query_validation_failed The amount of queries for which validation has failed.
# TYPE apollo_query_validation_failed counter

# HELP apollo_query_resolved The amount of queries which could be resolved.
# TYPE apollo_query_resolved counter

# HELP apollo_query_execution_started The amount of queries for which execution has started.
# TYPE apollo_query_execution_started counter

# HELP apollo_query_execution_failed The amount of queries for which execution has failed.
# TYPE apollo_query_execution_failed counter

# HELP apollo_query_failed The amount of queries that failed.
# TYPE apollo_query_failed counter

# HELP apollo_query_duration The total duration of a query.
# TYPE apollo_query_duration histogram

# HELP apollo_query_field_resolution_duration The total duration for resolving fields.
# TYPE apollo_query_field_resolution_duration histogram

I would expect the metrics to include values (even zeros) before any query has been executed, but even after a query was run there are still no values being returned. Is there something about the configuration that I missed?
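
One detail in the initialization above stands out, independent of the plugin: plugins is already an array, so plugins: [plugins] passes a nested array to ApolloServer, and the plugin instances inside it may never be invoked. A hedged guess at the fix:

const server = new ApolloServer({
  // pass the array itself, not an array wrapping the array
  plugins,
})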

[question] how to ignore specific error?

Hi there,
thanks for this work.

Is there a way to filter out some error from the query_failed metric? Let's say wrong auth credentials, login expired for example and so on....

Thanks

Metrics on the number errors returned

A default response contains, besides its data, a list of zero or more errors (see the example below). I'm looking to get insight into these errors and was wondering whether one of the current metrics already exposes this data.

I tried to see if any of the already exposed metrics expose this data, but they don't seem to. Or am I overlooking something?

[screenshot: example GraphQL response containing data alongside an errors array]

Support to Node version 12

Hey man, your lib is awesome.

I'm trying to use it in my project, but the nullish coalescing operator (??) and the optional chaining operator (?.) don't seem to work in Node 12.

I made some modifications to the code (basically I replaced these operators) and now it seems to work. Do you intend to support Node 12?

How to hide `/metrics` from the public internet

How does one hide the /metrics from the public internet? And once it's hidden, what's the usual practice for Grafana/Grafana Cloud to scrape this hidden/protected endpoint?

I'm asking because I figured out that exposing /metrics to the world is unacceptable (unless I'm missing something obvious).
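
One common pattern (a sketch, not specific to this plugin) is to serve /metrics from a second Express app bound to an internal interface or port that only the scraper can reach, while the GraphQL app stays public; the port numbers here are arbitrary:

const express = require('express');

const publicApp = express();   // serves GraphQL, exposed to the internet
const internalApp = express(); // serves /metrics, internal only

const prometheusExporterPlugin = createPrometheusExporterPlugin({ app: internalApp });

// ... attach Apollo Server (with the plugin) to publicApp as usual ...

publicApp.listen(4000);
internalApp.listen(9464, '127.0.0.1'); // reachable only from the host/network

For Grafana Cloud, the usual practice is to run an agent (e.g. Grafana Agent) inside the network that scrapes the internal endpoint and remote-writes the samples out.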

prom-client 13?

prom-client 12 was superseded by 13 rather quickly, with no further updates coming on the 12.x line. Any chance of upgrading to support prom-client 13.x.x?

Support prom-client v15

The peer dependency prom-client recently released version 15.0.0; however, when installing it I get

npm ERR! Could not resolve dependency:
npm ERR! peer prom-client@"^12.0.0 || ^13.0.0 || ^14.0.0" from @bmatei/[email protected]
npm ERR! node_modules/@bmatei/apollo-prometheus-exporter
npm ERR!   @bmatei/apollo-prometheus-exporter@"^3.0.0" from the root project

I didn't test for full compatibility yet but if it's not a breaking change, it might be sufficient to just add || ^15.0.0.
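
Until the peer range is widened, one stopgap (untested for compatibility, as noted above) is to bypass npm's peer check:

npm install prom-client@15 --legacy-peer-deps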

durationHistogramBuckets doesn't work

I tried to use durationHistogramBuckets as explained in the README.md, but it didn't work. So I tried durationHistogramsBuckets instead, and it worked. Is that correct?
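
For what it's worth, the generateContext snippet quoted in a later issue also spells the field durationHistogramsBuckets (with the extra s), so this looks like a typo in the README's Options table rather than in your code.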

Optional `QUERY_DURATION` metric in failed queries

Hey man, I'm working on an application with many instances (about 200) and I'm using your plugin (it's very, very good, btw). I'm trying to decrease the amount and cardinality of metrics (today about 2.5M) to gain more performance.

I found this piece of code:

didEncounterErrors(context) {
  const requestEndDate = Date.now();
  
  actionMetric(MetricsNames.QUERY_FAILED, getLabelsFromContext(context));
  
  actionMetric(
    MetricsNames.QUERY_DURATION,
    {
      ...getLabelsFromContext(context),
      success: 'false'
    },
    requestEndDate - requestStartDate
  );
},

Here, we are metering MetricsNames.QUERY_DURATION for failed queries too. Can we make this a custom parameter? The query duration metric has no use for me in failed queries (it's only useful for successful ones).
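
If skipMetrics behaves as documented (and the generateContext defaults of () => false suggest that returning true skips the observation), this might already be expressible without a new parameter; a sketch:

const prometheusExporterPlugin = createPrometheusExporterPlugin({
  app,
  skipMetrics: {
    // skip duration observations for failed queries only
    [MetricsNames.QUERY_DURATION]: (labels) => labels.success === 'false'
  }
});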

Apollo logs not being flushed

We are having trouble where our logs are not being collected for extended periods of time, and we believe that flushing the logs whenever the metrics endpoint is called might help alleviate the issue. I'm aware this isn't exactly an issue with the plugin, but I'm not sure where else to ask. Thank you!

TL;DR: how can we get logs to flush (or any code to run) after the /metrics endpoint is called?

Trying to use skipMetrics, but it doesn't work

Hi, I'm trying to use skipMetrics to skip the success=false label on the apollo_query_duration metric. I tried the following:

import { createPrometheusExporterPlugin, MetricsNames } from '@bmatei/apollo-prometheus-exporter'

[...]

const prometheusExporterPlugin = createPrometheusExporterPlugin({
  durationHistogramsBuckets: [0.5, 2],
  app,
  skipMetrics: {
    [MetricsNames.QUERY_DURATION]: (labels) => {labels.success === 'false'}
  },
})

Should I import something else? What am I doing wrong?
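
One likely culprit, regardless of the plugin: the arrow function (labels) => {labels.success === 'false'} has a block body with no return statement, so it always returns undefined (falsy) and nothing is ever skipped. Dropping the braces (or adding return) yields the predicate you probably intended:

skipMetrics: {
  // expression body: actually returns the boolean
  [MetricsNames.QUERY_DURATION]: (labels) => labels.success === 'false'
}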

The automated release is failing 🚨

🚨 The automated release from the master branch failed. 🚨

Same semantic-release failure as reported above: "Missing package.json file." A package.json file at the root of the project is required to release on npm.

Your semantic-release bot 📦🚀

Possible perf_hooks memory leak detected

Hello, we just tested your library and we're comparing it against the one in the thanks section by including both in a single app. The Dotellie one is copied and modified a little (labelValues -> LabelValues, because it uses prom-client 11.5.3, which is incompatible with the 12.0.0 you're using). Upon doing so, we encounter this warning:

(node:17) Warning: Possible perf_hooks memory leak detected. There are 169 entries in the Performance Timeline. Use the clear methods to remove entries that are no longer needed or set performance.maxEntries equal to a higher value (currently the maxEntries is 150).

Is this safe? Should we just follow the message to increase performance.maxEntries? If yes, what's the expected safe value? If not, then what's your suggestion?

What can I do to get the apollo_query_execution_failed metric counted?

I was looking at the part of the code that increases the apollo_query_execution_failed metric:

executionDidStart(context) {
  actionMetric(MetricsNames.QUERY_EXECUTION_STARTED, getLabelsFromContext(context));

  return {
    willResolveField(field) {
      const fieldResolveStart = Date.now();

      return () => {
        const fieldResolveEnd = Date.now();

        actionMetric(
          MetricsNames.QUERY_FIELD_RESOLUTION_DURATION,
          {
            ...getLabelsFromContext(context),
            ...getLabelsFromFieldResolver(field)
          },
          fieldResolveEnd - fieldResolveStart
        );
      };
    },
    executionDidEnd(err) {
      if (err) {
        actionMetric(MetricsNames.QUERY_EXECUTION_FAILED, getLabelsFromContext(context));
      }
    }
  };
}

Should the executionDidEnd hook work as expected? I noticed it comes after the return; is that correct? If so, could you show me an example where this metric will be counted?
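
For context: this shape follows the Apollo Server plugin API, in which executionDidStart may return an object whose willResolveField and executionDidEnd members are end hooks. So executionDidEnd sitting inside the returned object is intentional; Apollo invokes it once execution finishes, passing err only when execution threw, which is when the counter increments.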

Support clustering mode

I can add multiple workers, but it exposes only one worker's metrics at a time.
How can I support cluster mode using this plugin?
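
prom-client ships an AggregatorRegistry for this case: workers report their metrics to the primary process, which serves the aggregate. A rough sketch (assuming the plugin's metrics end up in prom-client's default registry; wiring a custom register may need extra care):

const cluster = require('cluster');
const express = require('express');
const { AggregatorRegistry } = require('prom-client');

if (cluster.isPrimary) { // cluster.isMaster on Node < 16
  const aggregatorRegistry = new AggregatorRegistry();
  const metricsApp = express();

  // Serves metrics aggregated across all workers.
  metricsApp.get('/cluster_metrics', async (_req, res) => {
    res.set('Content-Type', aggregatorRegistry.contentType);
    res.end(await aggregatorRegistry.clusterMetrics());
  });
  metricsApp.listen(9464); // arbitrary internal port

  for (let i = 0; i < 4; i++) cluster.fork();
} else {
  // ... start the Apollo/Express app with the exporter plugin here ...
}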

Default metrics are not being registered in the registry passed in register option

If I create a plugin with:

const myRegistry = new Registry()

const prometheusExporterPlugin = createPrometheusExporterPlugin({ app, register: myRegistry });

the default metrics are not registered in the registry passed in the options (given the option descriptions in the README, they should be).

It seems this is being caused by a conflict between generateContext and toggleDefaultMetrics:

export function generateContext<C = BaseContext, S = Source, A = Args>(
  options: PluginOptions<C, S, A>
): Context<C, S, A> {
  const context: Context<C, S, A> = {
    app: options.app as Express,
    defaultLabels: {},
    defaultMetrics: true,
    disabledMetrics: [],
    durationHistogramsBuckets: [0.001, 0.005, 0.015, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 1, 5, 10],
    hostnameLabel: true,
    hostnameLabelName: 'hostname',
    metricsEndpoint: true,
    metricsEndpointPath: '/metrics',
    register,
    ...options,
    skipMetrics: {
      [MetricsNames.SERVER_STARTING]: () => false,
      [MetricsNames.SERVER_CLOSING]: () => false,
      [MetricsNames.QUERY_STARTED]: () => false,
      [MetricsNames.QUERY_FAILED]: () => false,
      [MetricsNames.QUERY_PARSE_STARTED]: () => false,
      [MetricsNames.QUERY_PARSE_FAILED]: () => false,
      [MetricsNames.QUERY_VALIDATION_STARTED]: () => false,
      [MetricsNames.QUERY_VALIDATION_FAILED]: () => false,
      [MetricsNames.QUERY_RESOLVED]: () => false,
      [MetricsNames.QUERY_EXECUTION_STARTED]: () => false,
      [MetricsNames.QUERY_EXECUTION_FAILED]: () => false,
      [MetricsNames.QUERY_DURATION]: () => false,
      [MetricsNames.QUERY_FIELD_RESOLUTION_DURATION]: () => false,
      ...(options.skipMetrics ?? {})
    },
    defaultMetricsOptions: {
      register,
      ...(options.defaultMetricsOptions ?? {})
    }
  };

the register reference used in defaultMetricsOptions points to the register imported from

import { DefaultMetricsCollectorConfiguration, LabelValues, register, Registry } from 'prom-client';

and then when toggleDefaultMetrics is called

export function toggleDefaultMetrics<C = AppContext, S = Source, A = Args>(
  register: Registry,
  { defaultMetrics, defaultMetricsOptions }: Context<C, S, A>
) {
  if (defaultMetrics) {
    collectDefaultMetrics({
      register,
      ...defaultMetricsOptions
    });
  }
}

the spread of defaultMetricsOptions overwrites the register parameter, replacing the registry passed in the options of createPrometheusExporterPlugin with the global one.

Is that expected? If it is not, I am willing to open a PR to fix this behavior.
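
Until this is fixed, a workaround consistent with the code above would be to pass the registry inside defaultMetricsOptions as well, since the user-supplied object is spread last and wins:

const prometheusExporterPlugin = createPrometheusExporterPlugin({
  app,
  register: myRegistry,
  // explicitly override the global register captured as the default
  defaultMetricsOptions: { register: myRegistry }
});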

defaultLabels & hostnameLabel not functional

Hello,

I have been trying to pass options to createPrometheusExporterPlugin, but they don't seem to work. Can you please have a look at what could be going wrong?

        const service_name: string = `${process.env.NODE_ENV}-listing`;
        const hostname: string = process.env.HOST_NAME;
        const prometheusExporterPlugin = createPrometheusExporterPlugin({
            app: app,
            defaultLabels: {
                env: process.env.NODE_ENV,
                service: service_name
            },
            hostnameLabel: true,
            hostnameLabelName: hostname,
        });

Prometheus error:

expected equal, got "INVALID"
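
A hedged guess at the cause: hostnameLabelName is the name of the label, not its value, and here it is set to the machine's hostname, which (if it contains dots or dashes) is not a valid Prometheus label name and would trip the scraper with exactly this kind of parse error. The plugin derives the label value itself, so the override can probably just be dropped:

const prometheusExporterPlugin = createPrometheusExporterPlugin({
    app: app,
    defaultLabels: {
        env: process.env.NODE_ENV,
        service: service_name
    },
    hostnameLabel: true // label name stays 'hostname'; the value is filled in automatically
});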

Customize `durationHistogramsBuckets`

The default value of the histogram duration buckets (durationHistogramsBuckets) has too many intervals; because of this, we are generating a huge number of metrics and would need a huge infrastructure to handle them. After some tests, we figured out a way to solve the problem: customizing durationHistogramsBuckets in createPrometheusExporterPlugin. What do you think about that? Is this possible?
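
For what it's worth, the Options table above already documents such an option (durationHistogramBuckets there, durationHistogramsBuckets in the generateContext code), so coarser buckets should be expressible today; a sketch:

const prometheusExporterPlugin = createPrometheusExporterPlugin({
  app,
  // far fewer buckets -> far fewer series per label combination
  durationHistogramsBuckets: [0.05, 0.5, 2]
});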

Koa

Is it possible to use this with Koa instead of Express?
