bfmatei / apollo-prometheus-exporter Goto Github PK

View Code? Open in Web Editor NEW

55.0 4.0 28.0 639 KB

Plugin for Apollo Server to export metrics in Prometheus format

License: MIT License

Shell 0.18% TypeScript 95.23% Makefile 2.05% Jsonnet 1.40% JavaScript 1.13%

apollo-prometheus-exporter's Introduction

Apollo Prometheus Exporter

Plugin for Apollo Server to export metrics in Prometheus format.

It uses prom-client under the hood the export the metrics.

Since Apollo Server released a new major version, a new version (v2.x.y) of the exporter has been launched. Apollo Server v2 is still supported in v1.x.y. The two versions will be features-matched as much as possible.

Metrics

Name	Description	Type
`apollo_server_starting`	The last timestamp when Apollo Server was starting.	Gauge
`apollo_server_closing`	The last timestamp when Apollo Server was closing.	Gauge
`apollo_query_started`	The amount of received queries.	Counter
`apollo_query_failed`	The amount of queries that failed.	Counter
`apollo_query_parse_started`	The amount of queries for which parsing has started.	Counter
`apollo_query_parse_failed`	The amount of queries for which parsing has failed.	Counter
`apollo_query_validation_started`	The amount of queries for which validation has started.	Counter
`apollo_query_validation_failed`	The amount of queries for which validation has failed.	Counter
`apollo_query_resolved`	The amount of queries which could be resolved.	Counter
`apollo_query_execution_started`	The amount of queries for which execution has started.	Counter
`apollo_query_execution_failed`	The amount of queries for which execution has failed.	Counter
`apollo_query_duration`	The total duration of a query.	Histogram
`apollo_query_field_resolution_duration`	The total duration for resolving fields.	Histogram

For default metrics, please refer to prom-client default metrics.

Usage

Install prom-client and @bmatei/apollo-prometheus-exporter

npm install prom-client @bmatei/apollo-prometheus-exporter

Create an instance of the plugin

const app = express();

const prometheusExporterPlugin = createPrometheusExporterPlugin({ app });

Add the plugin to ApolloServer

const server = new ApolloServer({
  plugins: [prometheusExporterPlugin]
});

For a complete working example, please have a look over the example project in this repository.

Options

Name	Description	Type	Default Value
`app`	Express instance. For the moment it is used for defining the metrics endpoint. It is mandatory unless `metricsEndpoint` is set to false.	`Express`	`undefined`
`defaultLabels`	An object containing default labels to be sent with each metric.	`Object`	`{}`
`defaultMetrics`	Flag to enable/disable the default metrics registered by `prom-client`.	`Boolean`	`true`
`defaultMetricsOptions`	Configuration object for the default metrics.	`DefaultMetricsCollectorConfiguration`	`{}`
`durationHistogramBuckets`	A list of durations that should be used by histograms.	`number[]`	`[0.001, 0.005, 0.015, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 1, 5, 10]`
`hostnameLabel`	Flag to enable/disable `hostname` label.	`Boolean`	`true`
`hostnameLabelName`	The name of the `hostname` label.	`String`	`hostname`
`metricsEndpoint`	Flag to enable/disable the metrics endpoint. If you disable this, you can use the `registerPrometheusMetricsEndpoint` method to enable the metrics endpoint.	`Boolean`	`true`
`metricsEndpointPath`	HTTP path where the metrics will be published.	`String`	`"/metrics"`
`register`	Prometheus client registry to be used by Apollo Metrics. By default, it is also used by the default metrics.	`Registry`	`register`
`skipMetrics`	A key-value map that controls if a metric is enabled or disabled.	`SkipMetricsMap`	`{}`

Thanks

apollo-prometheus-exporter's People

Contributors

Stargazers

Watchers

apollo-prometheus-exporter's Issues

error: metric have already been registered

I have an express server with express-prom-bundle and apollo-prometheus-exporter installed and set up, And when I start my server, it crashed.

This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). The promise rejected with the reason:
Error: A metric with the name process_cpu_user_seconds_total has already been registered.

I guess the reason is that express-prom-bundle has registered the process_cpu_user_seconds_total metric, and apollo-prometheus-exporter also try to register this metric, so it crashed.
(FYI, I found express-prom-bundle and apollo-prometheus-exporter called collectDefaultMetrics which is implement in prom-client)

Is it suitable to rescue and ignore the error when we re-register a metric? Or is there anything we can do?

Break down request duration by request?

So it sort of appears to me that the apollo_query_duration_bucket is supposed to include something called operationName that should be the name of the individual query or mutation being called. But it doesn't appear to be working for me. Not clear whether this is a bug it's just not implemented or it's impossible to implement. Glad to fix/add it if it's just something that's broken or unimplemented. The metrics w/o a breakdown of individual mutation/query names won't be all that valuable to us otherwise.

The automated release is failing 🚨

🚨 The automated release from the `master` branch failed. 🚨

I recommend you give this issue a high priority, so other packages depending on you could benefit from your bug fixes and new features.

You can find below the list of errors reported by semantic-release. Each one of them has to be resolved in order to automatically publish your package. I’m sure you can resolve this 💪.

Errors are usually caused by a misconfiguration or an authentication problem. With each error reported below you will find explanation and guidance to help you to resolve it.

Once all the errors are resolved, semantic-release will release your package the next time you push a commit to the master branch. You can also manually restart the failed CI job that runs semantic-release.

If you are not sure how to resolve this, here is some links that can help you:

If those don’t help, or if this issue is reporting something you think isn’t right, you can always ask the humans behind semantic-release.

Missing `package.json` file.

A package.json file at the root of your project is required to release on npm.

Please follow the npm guideline to create a valid package.json file.

Good luck with your project ✨

Your semantic-release bot 📦🚀

Allow custom metrics?

Would be interested in implementing the ability to pass in additional metrics that could then be exposed via the /metrics endpoint beyond the built in ones.

Plugin metrics not being recorded/returned

Did the installation procedure, simple. Created the plugin with the express app and assigned to the ApolloServer configuration on creation. Ran a query on the server to record some metrics. Went to the /metrics endpoint of the server and got back a list of metrics, however the metrics labeled for this plugin do not actually post any values, just the help and type declarations.

Initialization:

let plugins = [loggerPlugin, httpResponsePlugin];
...
if(process.env.GRAPHQL_METRICS_ENABLED === '1') plugins.push(createPrometheusExporterPlugin({app}))
const server = new ApolloServer({
...
  plugins: [plugins],
...
})

Metrics returned:

...

# HELP nodejs_gc_duration_seconds Garbage collection duration by kind, one of major, minor, incremental or weakcb.
# TYPE nodejs_gc_duration_seconds histogram
nodejs_gc_duration_seconds_bucket{le="0.001",kind="minor",hostname="EREBUS"} 0
nodejs_gc_duration_seconds_bucket{le="0.01",kind="minor",hostname="EREBUS"} 7
nodejs_gc_duration_seconds_bucket{le="0.1",kind="minor",hostname="EREBUS"} 7
nodejs_gc_duration_seconds_bucket{le="1",kind="minor",hostname="EREBUS"} 7
nodejs_gc_duration_seconds_bucket{le="2",kind="minor",hostname="EREBUS"} 7
nodejs_gc_duration_seconds_bucket{le="5",kind="minor",hostname="EREBUS"} 7
nodejs_gc_duration_seconds_bucket{le="+Inf",kind="minor",hostname="EREBUS"} 7
nodejs_gc_duration_seconds_sum{kind="minor",hostname="EREBUS"} 0.009743800001218916
nodejs_gc_duration_seconds_count{kind="minor",hostname="EREBUS"} 7

# HELP apollo_server_starting The last timestamp when Apollo Server was starting.
# TYPE apollo_server_starting gauge

# HELP apollo_server_closing The last timestamp when Apollo Server was closing.
# TYPE apollo_server_closing gauge

# HELP apollo_query_started The amount of received queries.
# TYPE apollo_query_started counter

# HELP apollo_query_parse_started The amount of queries for which parsing has started.
# TYPE apollo_query_parse_started counter

# HELP apollo_query_parse_failed The amount of queries for which parsing has failed.
# TYPE apollo_query_parse_failed counter

# HELP apollo_query_validation_started The amount of queries for which validation has started.
# TYPE apollo_query_validation_started counter

# HELP apollo_query_validation_failed The amount of queries for which validation has failed.
# TYPE apollo_query_validation_failed counter

# HELP apollo_query_resolved The amount of queries which could be resolved.
# TYPE apollo_query_resolved counter

# HELP apollo_query_execution_started The amount of queries for which execution has started.
# TYPE apollo_query_execution_started counter

# HELP apollo_query_execution_failed The amount of queries for which execution has failed.
# TYPE apollo_query_execution_failed counter

# HELP apollo_query_failed The amount of queries that failed.
# TYPE apollo_query_failed counter

# HELP apollo_query_duration The total duration of a query.
# TYPE apollo_query_duration histogram

# HELP apollo_query_field_resolution_duration The total duration for resolving fields.
# TYPE apollo_query_field_resolution_duration histogram

I would expect the metrics to include values even if no query has been executed yet for metrics to be collected. But even after a query was done there are still no values being returned. Is there something about the configuration that I missed?

[question] how to ignore specific error?

Hi there,
thanks for this work.

Is there a way to filter out some error from the query_failed metric? Let's say wrong auth credentials, login expired for example and so on....

Thanks

Metrics on the number errors returned

A default response contains besides it's data a list of 0 or more errors (see example below). Im looking to get insights into these errors and was wondering if one of the current metrics already exposes this data.

I tried to see if any of the already exposed metrics expose this data but they dont seem to, or am i overlooking something?

Support to Node version 12

Hey man, your lib is awesome.

I'm trying to use it in my project, but nullish coalescing operator (??) and optional chaining operator (? .) seems not work in Node 12.

I made some modifications in the code (basically I changed these operators) and now seems work. Do you intend to support Node 12?

apollo-server 3 support

Hi there,

is there any plan to support apollo-server v3?

Thanks

Nest.js implementation

Its possible run this exporter in a nest.js project?

Support for metrics in the `didResolveSource` event

Hi there,

Is it possible to add metrics support for the didResolveSource event (https://www.apollographql.com/docs/apollo-server/integrations/plugins-event-reference/#didresolvesource) in the same way that other events are supported

How to unexpose `/metrics` to public internet

How does one hide the /metrics from the public internet? And once it's hidden, what's the usual practice for Grafana/Grafana Cloud to scrape this hidden/protected endpoint?

I'm asking because I figured out that exposing /metrics to the world is unacceptable (unless I'm missing something obvious).

prom-client 13?

prom-client 12 was moved to 13 rather quickly with no updates coming on that line. Any chance of upgrading to support prom-client 13.x.x?

Support prom-client v15

The peer-dependency prom-client recently released version 15.0.0, however when installing that I get

npm ERR! Could not resolve dependency:
npm ERR! peer prom-client@"^12.0.0 || ^13.0.0 || ^14.0.0" from @bmatei/[email protected]
npm ERR! node_modules/@bmatei/apollo-prometheus-exporter
npm ERR!   @bmatei/apollo-prometheus-exporter@"^3.0.0" from the root project

I didn't test for full compatibility yet but if it's not a breaking change, it might be sufficient to just add || ^15.0.0.

durationHistogramBuckets doesn`t work

I tried to use durationHistogramBuckets as explain on the README.md , but it didn't works. So, I tried as durationHistogramsBuckets and it worked. Is it correct?

Optional `QUERY_DURATION` metric in failed queries

Hey man, I'm working in an application with many instances (about 200) and I'm using your plugin (is very very good btw). I'm trying to decrease the amount and cardinality of metrics (today is about 2.5M) to gain more performance.

I found these piece of code:

didEncounterErrors(context) {
  const requestEndDate = Date.now();
  
  actionMetric(MetricsNames.QUERY_FAILED, getLabelsFromContext(context));
  
  actionMetric(
    MetricsNames.QUERY_DURATION,
    {
      ...getLabelsFromContext(context),
      success: 'false'
    },
    requestEndDate - requestStartDate
  );
},

Here, we are metering MetricsNames.QUERY_DURATION in failed queries. Can we make this a custom parameter? Because queries duration metric has no use for me in failed queries (it's useful only in success queries).

didEndHook is not a function error

Have this error for any >=1.1.0 version.
My server version is 2.9.7.

Apollo logs not being flushed

We are having trouble where we logs are not being collected for extended periods of time, and we believe that flushing the logs anytime the metrics endpoint is called in the plugin, that this may help alleviate the issue. I'm aware this isn't exactly an issue, but i'm not sure where else I should ask. Thank you!

Tldr How can we get logs to flush (or any code to run) after the /metrics endpoint is called?

Support prom-client v14?

prom-client v14 was released but this does not appear to support it yet.

Support apolo server v4

Support the new Apollo Server V4
https://github.com/apollographql/apollo-server/blob/main/packages/server/CHANGELOG.md

Trying use skipMetrics, but it doesn't work

Hi, I'm trying to use skipMetrics to skip the label success=false from apollo_query_duration metrics. I tried as bellow:

import { createPrometheusExporterPlugin, MetricsNames} from '@bmatei/apollo-prometheus-exporter'
[...] 

const prometheusExporterPlugin = createPrometheusExporterPlugin({
      durationHistogramsBuckets: [0.5, 2],
      app,
     skipMetrics: { 
        [MetricsNames.QUERY_DURATION]: (labels) => {labels.success === 'false'}
         },
    })

Should I import something else? What am I doing wrong?

The automated release is failing 🚨

🚨 The automated release from the `master` branch failed. 🚨

I recommend you give this issue a high priority, so other packages depending on you could benefit from your bug fixes and new features.

You can find below the list of errors reported by semantic-release. Each one of them has to be resolved in order to automatically publish your package. I’m sure you can resolve this 💪.

Errors are usually caused by a misconfiguration or an authentication problem. With each error reported below you will find explanation and guidance to help you to resolve it.

If you are not sure how to resolve this, here is some links that can help you:

If those don’t help, or if this issue is reporting something you think isn’t right, you can always ask the humans behind semantic-release.

Missing `package.json` file.

A package.json file at the root of your project is required to release on npm.

Please follow the npm guideline to create a valid package.json file.

Good luck with your project ✨

Your semantic-release bot 📦🚀

Possible perf_hooks memory leak detected

Hello, we just tested your library and we're comparing it against the one in the thanks section by including both in a single app. Dotellie one is copied and modified a little (labelValues -> LabelValues) because it uses prom-client 11.5.3 which is incompatible with 12.0.0 that you're using), upon which we encounter this warning:

(node:17) Warning: Possible perf_hooks memory leak detected. There are 169 entries in the Performance Timeline. Use the clear methods to remove entries that are no longer needed or set performance.maxEntries equal to a higher value (currently the maxEntries is 150).

Is this safe? Should we just follow the message to increase performance.maxEntries? If yes, what's the expected safe value? If not, then what's your suggestion?

What can I do to the apollo_query_execution_failed metric be counted?

I was looking this part of code that increase the metric apollo_query_execution_failed:

executionDidStart(context) {
  actionMetric(MetricsNames.QUERY_EXECUTION_STARTED, getLabelsFromContext(context));

  return {
    willResolveField(field) {
      const fieldResolveStart = Date.now();

      return () => {
        const fieldResolveEnd = Date.now();

        actionMetric(
          MetricsNames.QUERY_FIELD_RESOLUTION_DURATION,
          {
            ...getLabelsFromContext(context),
            ...getLabelsFromFieldResolver(field)
          },
          fieldResolveEnd - fieldResolveStart
        );
      };
    },
    executionDidEnd(err) {
      if (err) {
        actionMetric(MetricsNames.QUERY_EXECUTION_FAILED, getLabelsFromContext(context));
      }
    }
  };
}

The hook executionDidEnd should supposedly work? I noted it is after return, is it correct? If yes, could you show me an example where this metrics will be count?

Support clustering mode

I can add multiple worker but it expose only one worker metrics at a time.
How can I support the cluster mode using this plugin?

Default metrics are not being registered in the registry passed in register option

If I create a plugin with:

const myRegistry = new Registry()

const prometheusExporterPlugin = createPrometheusExporterPlugin({ app, register: myRegistry });

the default metrics are not registered in the registry passed in the options (considering the descriptions of the options in the README this should happen).

It seems this is being caused by a conflict between generateContext and toggleDefaultMetrics:

export function generateContext<C = BaseContext, S = Source, A = Args>(
  options: PluginOptions<C, S, A>
): Context<C, S, A> {
  const context: Context<C, S, A> = {
    app: options.app as Express,
    defaultLabels: {},
    defaultMetrics: true,
    disabledMetrics: [],
    durationHistogramsBuckets: [0.001, 0.005, 0.015, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 1, 5, 10],
    hostnameLabel: true,
    hostnameLabelName: 'hostname',
    metricsEndpoint: true,
    metricsEndpointPath: '/metrics',
    register,
    ...options,
    skipMetrics: {
      [MetricsNames.SERVER_STARTING]: () => false,
      [MetricsNames.SERVER_CLOSING]: () => false,
      [MetricsNames.QUERY_STARTED]: () => false,
      [MetricsNames.QUERY_FAILED]: () => false,
      [MetricsNames.QUERY_PARSE_STARTED]: () => false,
      [MetricsNames.QUERY_PARSE_FAILED]: () => false,
      [MetricsNames.QUERY_VALIDATION_STARTED]: () => false,
      [MetricsNames.QUERY_VALIDATION_FAILED]: () => false,
      [MetricsNames.QUERY_RESOLVED]: () => false,
      [MetricsNames.QUERY_EXECUTION_STARTED]: () => false,
      [MetricsNames.QUERY_EXECUTION_FAILED]: () => false,
      [MetricsNames.QUERY_DURATION]: () => false,
      [MetricsNames.QUERY_FIELD_RESOLUTION_DURATION]: () => false,
      ...(options.skipMetrics ?? {})
    },
    defaultMetricsOptions: {
      register,
      ...(options.defaultMetricsOptions ?? {})
    }
  };

the register reference used in defaultMetricsOptions points to the register imported from

import { DefaultMetricsCollectorConfiguration, LabelValues, register, Registry } from 'prom-client';

and then when toggleDefaultMetrics is called

export function toggleDefaultMetrics<C = AppContext, S = Source, A = Args>(
  register: Registry,
  { defaultMetrics, defaultMetricsOptions }: Context<C, S, A>
) {
  if (defaultMetrics) {
    collectDefaultMetrics({
      register,
      ...defaultMetricsOptions
    });
  }
}

the destructuring of defaultMetricsOptions overwrites the register, which points to the register passed in the options of createPrometheusExporterPlugin.

Is that expected? If it is not, I am willing to open a PR to fix this behavior.

defaultLabels & hostnameLabel not functional

Hello,

I have been trying to add options to PrometheusExporterPlugin but they don't seem to work. Can you please have a look on what could be going wrong.

        const service_name: string = `${process.env.NODE_ENV}-listing`;
        const hostname: string = process.env.HOST_NAME;
        const prometheusExporterPlugin = createPrometheusExporterPlugin({
            app: app,
            defaultLabels: {
                env: process.env.NODE_ENV,
                service: service_name
            },
            hostnameLabel: true,
            hostnameLabelName: hostname,
        });

Prometheus error:

expected equal, got "INVALID"

Customize `durationHistogramsBuckets`

The default value of the histogram duration buckets (durationHistogramsBuckets) has to many intervals, because of this we are generating a big amount of metrics and we would need a huge infrastructure to handle with this. After some tests, we figure out a way to solve the problem: to customize the durationHistogramsBuckets in createPrometheusExporterPlugin. What do you think about that? This is possible?

Koa

Is it possible to use this with Koa instead of Express?

bfmatei / apollo-prometheus-exporter Goto Github PK

apollo-prometheus-exporter's Introduction

Apollo Prometheus Exporter

Metrics

Usage

Options

Thanks

apollo-prometheus-exporter's People

Contributors

Stargazers

Watchers

Forkers

apollo-prometheus-exporter's Issues

🚨 The automated release from the master branch failed. 🚨

Missing package.json file.

🚨 The automated release from the master branch failed. 🚨

Missing package.json file.

Recommend Projects

Recommend Topics

Recommend Org

🚨 The automated release from the `master` branch failed. 🚨

Missing `package.json` file.

🚨 The automated release from the `master` branch failed. 🚨

Missing `package.json` file.