miekg / caddy-prometheus
This project forked from captncraig/caddy-stats
Prometheus metrics middleware for caddy
License: Apache License 2.0
I am running this using the Docker image zzrot/alpine-caddy.
This is my Caddyfile:
https://dashboard.mydomain.com {
    prometheus
    tls /certs/dashboard.mydomain.com.crt /certs/dashboard.mydomain.com.key
    proxy / node1:4000 node2:4000 {
        proxy_header Host {host}
        proxy_header X-Real-IP {remote}
        proxy_header X-Forwarded-Proto {scheme}
        websocket
        except /css /fonts /js /img
    }
    root /node-app/public
    log stdout
    errors stderr
}
Everything works as expected except for prometheus.
When I run curl http://localhost:9180/metrics, I get curl: (52) Empty reply from server.
Also, if I replace localhost with the actual IP of the host (I have port 9180 exposed), I get connection refused. Am I missing anything?
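One likely cause (my assumption, not confirmed in the thread): with a bare prometheus directive, the metrics listener binds to localhost:9180 inside the container, so it is unreachable from the host even with the port published. Binding all interfaces should fix it:

```
https://dashboard.mydomain.com {
    prometheus 0.0.0.0:9180
    # ...rest of the Caddyfile unchanged...
}
```

With the port already exposed, curl against the host IP should then connect.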
It should not be "1" and "2".
For IPv4 it should be "4" and for IPv6 it should be "6".
That way, everybody understands those values. As everywhere else, even the IP header itself uses the right value in its four-bit version field.
When I use proxy in combination with prometheus monitoring, I always get status=0.
Expected: Caddy reports the correct status code (e.g. 200 or 404) to prometheus.
Actual: Caddy always reports status code 0.
My config file is like this:
https://site.url.example {
    tls [email protected]
    prometheus 0.0.0.0:9180
    proxy / localhost:3000 {
        websocket
        transparent
    }
}
Need to figure out why and see if it is fixable.
On top of the hostname, we would like to segregate our metrics by path.
If that's fine with you, I can open a PR.
Caddy's import path (and Go module name) has changed from github.com/mholt/caddy to github.com/caddyserver/caddy.
Unfortunately, Go modules are not yet mature enough to handle a change like this (see https://golang.org/issue/26904; "haven't implemented that part yet", but it is high on the priority list for Go 1.14), which caught me off-guard. Using Go modules' replace feature didn't act the way I expected, either. Caddy now fails to build with plugins until they update their import paths.
I've hacked a fix into the build server, so downloading Caddy with your plugin from our website should continue working without any changes on your part, for now. However, please take a moment and update your import paths, and do a new deploy on the website, because the workaround involves ignoring module checksums and performing a delicate recursive search-and-replace.
I'm terribly sorry about this. I did a number of tests and dry-runs to ensure the change would be smooth, but apparently some unknown combination of GOPATH, Go modules' lack of maturity, and other hidden variables in the system or environment must have covered up something I missed.
This bash script should make it easy (run it from your project's top-level directory):
find . -name '*.go' | while read -r f; do
    sed -i.bak 's/\/mholt\/caddy/\/caddyserver\/caddy/g' "$f" && rm "$f.bak"
done
We use this script in the build server as part of the temporary workaround.
Let me know if you have any questions! Sorry again for the inconvenience.
Could you possibly add the user agent to the exported data?
I tracked down a bug that is caused by #25, which introduced this change:
status, err := next.ServeHTTP(rw, r)
if status == 0 {
    status = rw.Status()
}
Setting the status code can cause unexpected behavior, because other plugins interpret status code 0 as an already-written response, which is expected when using the proxy plugin.
This bug was exposed when a Java consumer of Caddy with prometheus enabled threw this exception: org.springframework.web.client.ResourceAccessException: I/O error on PUT request for <url>: <url> failed to respond.
Changing the above to this snippet resolves the issue, as it never sets status; stat is used later in responseStatus.WithLabelValues:
status, err := next.ServeHTTP(rw, r)
// proxies return a status code of 0, but the actual status is available on rw
stat := status
if stat == 0 {
    stat = rw.Status()
}
...
-l caddy.address=example.com \
-l caddy.targetport=8080 \
-l 'caddy.proxy=/socket backend:8080/socket' \
-l 'caddy.proxy./socket backend:8080/socket=websocket' \
...
results in:
example.com {
    proxy / 172.17.0.1:8080 {
        /socket backend:8080/socket websocket
    }
}
while I was going for:
example.com {
    proxy / 172.17.0.1:8080
    proxy /socket backend:8080/socket {
        websocket
    }
}
running caddy 0.9.0
$ curl -I mycaddyserver:2015/php/breakme.phphp
HTTP/1.1 500 Internal Server Error
Content-Type: text/html
Server: Caddy
Status: 500 Internal Server Error
X-Powered-By: PHP/5.5.9-1ubuntu4.17
Date: Wed, 27 Jul 2016 09:11:30 GMT
/metrics only shows:
# HELP caddy_http_response_status_count_total Counter of response status codes.
# TYPE caddy_http_response_status_count_total counter
caddy_http_response_status_count_total{host="mycaddyserver",status="200"} 6
Caddyfile:
http://0.0.0.0:2015
fastcgi /php 127.0.0.1:9000 php
prometheus 127.0.0.1:2020
It always gets set to "1" regardless of whether the request is over IPv4 or IPv6.
Caddy 0.11.4.
My Caddyfile:
root@caddy ~# cat /etc/caddy/Caddyfile
import vhosts/*
prometheus
On restart:
Feb 27 21:22:36 caddy caddy[18493]: 2019/02/27 21:22:36 [INFO] [prometheus] acme: Obtaining bundled SAN certificate
Feb 27 21:22:37 caddy systemd[1]: caddy.service: Main process exited, code=exited, status=1/FAILURE
Feb 27 21:22:37 caddy systemd[1]: caddy.service: Unit entered failed state.
Feb 27 21:22:37 caddy systemd[1]: caddy.service: Failed with result 'exit-code'.
On reload:
Feb 27 21:20:53 caddy caddy[18132]: 2019/02/27 21:20:53 [INFO] SIGUSR1: Reloading
Feb 27 21:20:53 caddy caddy[18132]: 2019/02/27 21:20:53 [INFO] Reloading
Feb 27 21:20:53 caddy caddy[18132]: 2019/02/27 21:20:53 [INFO][FileStorage:/etc/ssl/caddy] Started certificate maintenance routine
Feb 27 21:20:53 caddy systemd[1]: Reloaded Caddy HTTP/2 web server.
Feb 27 21:20:53 caddy caddy[18132]: 2019/02/27 21:20:53 [INFO] [prometheus] acme: Obtaining bundled SAN certificate
Feb 27 21:20:53 caddy caddy[18132]: 2019/02/27 21:20:53 [ERROR] Restart failed: [prometheus] failed to obtain certificate: acme: error: 400 :: POST :: https://acme-v02.api.letsencrypt.org/acme/new-order :: urn:ietf:params:acme:error:malformed :: Error creating new order :: DNS name does not have enough labels, url:
Feb 27 21:20:53 caddy caddy[18132]: 2019/02/27 21:20:53 [ERROR] SIGUSR1: starting with listener file descriptors: [prometheus] failed to obtain certificate: acme: error: 400 :: POST :: https://acme-v02.api.letsencrypt.org/acme/new-order :: urn:ietf:params:acme:error:malformed :: Error creating new order :: DNS name does not have enough labels, url:
Even when I enable prometheus, and I have the prometheus extension built in, I get a 404 when I go to the /metrics URL. Am I doing something wrong?
Here is my Caddyfile:
mydomain.com {
    prometheus
    root /var/www/lab3
}
Obviously, if we wanted Caddy to serve up another /metrics endpoint there'd be a conflict, but it would be really useful if we only needed to expose a single port from Caddy for certain use cases.
There is a way to test prometheus metrics. Add tests.
See prometheus/client_golang#58 and
https://github.com/prometheus/haproxy_exporter/blob/master/haproxy_exporter_test.go#L39
Metrics are running and I can see status codes, but only 200 and 0? #wtf
I don't see 404 and the others...
Hey,
I'd really like to use this plugin! It's exactly what I need.
I downloaded caddy from https://caddyserver.com/download/build?os=linux&arch=amd64&features=prometheus and built a Docker image with it.
The Caddyfile includes only hosts like this:
foo.example.com {
    tls [email protected]
    prometheus 0.0.0.0:9180
    proxy / foo.example.rancher.internal:1234 {
        transparent
    }
}
The metrics are exposed and also scraped. Nevertheless, there are no metrics with a caddy_ prefix.
Let me know if you need more info.
Thanks.
From the error log: 26/May/2018:23:46:16 +0000 [PANIC /] caddyhttp/proxy/reverseproxy.go:352 - *metrics.timedResponseWriter is not a hijacker
I think that can be fixed by implementing the Hijacker interface: https://golang.org/src/net/http/server.go#L174
It looks like it could work similarly to WriteHeader: call w.didWrite() and then delegate to the underlying writer's Hijack.
The https://github.com/xuqingfeng/caddy-rate-limit plugin sends 429 responses when rate limits are hit, but as best as I can tell, the prometheus plugin doesn't report them. The middleware return value is here: https://github.com/xuqingfeng/caddy-rate-limit/blob/master/ratelimit.go#L84
And it seems like https://github.com/miekg/caddy-prometheus/blob/master/handler.go#L25-L35 ought to handle it correctly, but for reasons unknown, no "429" appears anywhere in our metrics even though they do appear in our logs and we've been able to successfully trigger them manually and observe that there is indeed a 429 response coming across the wire.
hi,
Your prometheus plugin needs to be published on the new caddy site to be available for 0.10. Without this, the plugin cannot be downloaded for the 0.10 version of caddy.
A freshly downloaded Prometheus adds without problem to a freshly downloaded Grafana. However, Prometheus exported by the Caddy plugin does not, with a 404 error that appears to be related to a URL reference in the Prometheus metric that doesn't exist in the version of the plugin.
To replicate, build a Caddy with Prometheus, configure, and run it. Download Grafana and run it; then try to add the Caddy Prometheus as a data source (either proxy or direct results in the same; direct needs more complex configuration because of CORS). "Save and test" will result in an "Unknown error" message; viewing the console shows that Grafana is attempting to access the following URL (proxied through to Prometheus), which is resulting in a 404:
https://HOSTNAME/grafana/api/datasources/proxy/1/api/v1/label/__name__/values
with the result
{data: "404 page not found↵", status: 404, config: Object, statusText: ""}
which appears to be prometheus/prometheus#925.
Is the API interface enabled in caddy-prometheus? Should this work, or is it PEBKAC?
See subject.
We don't export any of the Go runtime stats from the caddy binary (number of goroutines, etc.).
There is no check that prometheus <address> is a valid address.
See captncraig#1
When you kill -SIGUSR1 the caddy process, the monitoring loses its handler (or something); monitoring is broken from that moment on.
Try to hook into the metadata for the SSL stuff and export that.
Hi, could you provide a Grafana dashboard with your metrics? Thanks for making this plugin greater!
I have no idea how to get the response size so we can export this.
Currently we have the request duration. It would be nice to have the request latency as well. The closest proxy for this that I can think of is the time to the first Write() call on the response writer.
Hello!
First, thank you for this plugin, it's awesome!
Now to the problem/question.
In the readme there is:
With caddyext you'll need to put this module early in the chain, so that the duration histogram actually makes sense. I've put it at number 0.
There is no caddyext anymore, afaik. :) (just forks of it)
Is there some simple way to move the prometheus directive up in the chain now, or should I fork the fork and fix imports, etc.? :)
Thanks
Edit:
According to issue #17, the problem was fixed in core by moving prometheus before proxy.
I'm still facing the exact same problem, which means that I don't see any metrics for hosts which are proxied. :)
Output of /metrics:
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 1.0748e-05
go_gc_duration_seconds{quantile="0.25"} 1.2134e-05
go_gc_duration_seconds{quantile="0.5"} 1.3446e-05
go_gc_duration_seconds{quantile="0.75"} 1.6894e-05
go_gc_duration_seconds{quantile="1"} 0.000164812
go_gc_duration_seconds_sum 0.000705874
go_gc_duration_seconds_count 25
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 37
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.13.8"} 1
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 3.549424e+06
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 6.4820432e+07
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 1.459338e+06
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
go_memstats_frees_total 319219
# HELP go_memstats_gc_cpu_fraction The fraction of this program's available CPU time used by the GC since the program started.
# TYPE go_memstats_gc_cpu_fraction gauge
go_memstats_gc_cpu_fraction 9.333195774204041e-05
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge
go_memstats_gc_sys_bytes 2.381824e+06
# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
# TYPE go_memstats_heap_alloc_bytes gauge
go_memstats_heap_alloc_bytes 3.549424e+06
# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
# TYPE go_memstats_heap_idle_bytes gauge
go_memstats_heap_idle_bytes 6.12352e+07
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge
go_memstats_heap_inuse_bytes 5.152768e+06
# HELP go_memstats_heap_objects Number of allocated objects.
# TYPE go_memstats_heap_objects gauge
go_memstats_heap_objects 12593
# HELP go_memstats_heap_released_bytes Number of heap bytes released to OS.
# TYPE go_memstats_heap_released_bytes gauge
go_memstats_heap_released_bytes 6.1202432e+07
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge
go_memstats_heap_sys_bytes 6.6387968e+07
# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
# TYPE go_memstats_last_gc_time_seconds gauge
go_memstats_last_gc_time_seconds 1.5831777421662006e+09
# HELP go_memstats_lookups_total Total number of pointer lookups.
# TYPE go_memstats_lookups_total counter
go_memstats_lookups_total 0
# HELP go_memstats_mallocs_total Total number of mallocs.
# TYPE go_memstats_mallocs_total counter
go_memstats_mallocs_total 331812
# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
# TYPE go_memstats_mcache_inuse_bytes gauge
go_memstats_mcache_inuse_bytes 1736
# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
# TYPE go_memstats_mcache_sys_bytes gauge
go_memstats_mcache_sys_bytes 16384
# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
# TYPE go_memstats_mspan_inuse_bytes gauge
go_memstats_mspan_inuse_bytes 51952
# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
# TYPE go_memstats_mspan_sys_bytes gauge
go_memstats_mspan_sys_bytes 81920
# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
# TYPE go_memstats_next_gc_bytes gauge
go_memstats_next_gc_bytes 4.315904e+06
# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
# TYPE go_memstats_other_sys_bytes gauge
go_memstats_other_sys_bytes 451694
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge
go_memstats_stack_inuse_bytes 720896
# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge
go_memstats_stack_sys_bytes 720896
# HELP go_memstats_sys_bytes Number of bytes obtained from system.
# TYPE go_memstats_sys_bytes gauge
go_memstats_sys_bytes 7.1500024e+07
# HELP go_threads Number of OS threads created.
# TYPE go_threads gauge
go_threads 8
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 2.04
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 13
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 2.7377664e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.58317718462e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.2511232e+08
# HELP process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE process_virtual_memory_max_bytes gauge
process_virtual_memory_max_bytes -1
Caddyfile:
example.com {
    prometheus :9180
    proxyprotocol 10.0.0.0/8
    proxy / example_host:port {
        transparent
    }
    tls {
        wildcard
    }
}
Edit 2:
It started working on localhost (docker-compose), but when I enable the prometheus plugin in develop (docker-swarm) I get
[PANIC] runtime error: invalid memory address or nil pointer dereference
for every request. :)
Hi guys,
What about adding a route POST /metrics to use the same prometheus exporter for Caddy and a microservice?
Most of my services using Caddy are coded in PHP, and, as you probably know, we cannot natively export metrics from a PHP app without statsd or collectd.
Having one binary that starts a reverse proxy, FPM, and a prometheus exporter for my app looks very interesting!
ping @mholt