httpsnoop's Introduction

httpsnoop

Package httpsnoop provides an easy way to capture http related metrics (i.e. response time, bytes written, and http status code) from your application's http.Handlers.

Doing this requires non-trivial wrapping of the http.ResponseWriter interface, which is also exposed for users interested in a more low-level API.

Usage Example

// myH is your app's http handler, perhaps a http.ServeMux or similar.
var myH http.Handler
// wrappedH wraps myH in order to log every request.
wrappedH := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	m := httpsnoop.CaptureMetrics(myH, w, r)
	log.Printf(
		"%s %s (code=%d dt=%s written=%d)",
		r.Method,
		r.URL,
		m.Code,
		m.Duration,
		m.Written,
	)
})
http.ListenAndServe(":8080", wrappedH)

Why this package exists

Instrumenting an application's http.Handler is surprisingly difficult.

However, if you google for e.g. "capture ResponseWriter status code" you'll find lots of advice and code examples that suggest it is a fairly trivial undertaking. Unfortunately everything I've seen so far has a high chance of breaking your application.

The main problem is that a http.ResponseWriter often implements additional interfaces such as http.Flusher, http.CloseNotifier, http.Hijacker, http.Pusher, and io.ReaderFrom. So the naive approach of just wrapping http.ResponseWriter in your own struct that also implements the http.ResponseWriter interface will hide the additional interfaces mentioned above. This has a high chance of introducing subtle bugs into any non-trivial application.

Another approach I've seen people take is to return a struct that implements all of the interfaces above. However, that's also problematic, because it's difficult to fake some of these interfaces' behaviors when the underlying http.ResponseWriter doesn't have an implementation. It's also dangerous, because an application may choose to operate differently, merely because it detects the presence of these additional interfaces.

This package solves this problem by checking which additional interfaces a http.ResponseWriter implements, returning a wrapped version implementing the exact same set of interfaces.

Additionally this package properly handles edge cases such as WriteHeader not being called, or called more than once, as well as concurrent calls to http.ResponseWriter methods, and even calls happening after the wrapped ServeHTTP has already returned.

Unfortunately this package is not perfect either. It's possible that it is still missing some interfaces provided by the Go core (let me know if you find one), and it won't work for applications adding their own interfaces into the mix. You can however use httpsnoop.Unwrap(w) to access the underlying http.ResponseWriter and type-assert the result to its other interfaces.

However, hopefully the explanation above has sufficiently scared you off rolling your own solution to this problem. httpsnoop may still break your application, but at least it tries to avoid it as much as possible.

Anyway, the real problem here is that smuggling additional interfaces inside http.ResponseWriter is a problematic design choice, but it probably goes as deep as the Go language specification itself. But that's okay, I still prefer Go over the alternatives ;).

Performance

BenchmarkBaseline-8      	   20000	     94912 ns/op
BenchmarkCaptureMetrics-8	   20000	     95461 ns/op

As you can see, using CaptureMetrics on a vanilla http.Handler introduces an overhead of ~500 ns per http request on my machine. However, the margin of error appears to be larger than that, therefore it should be reasonable to assume that the overhead introduced by CaptureMetrics is absolutely negligible.

License

MIT

httpsnoop's People

Contributors

alexandear, costela, ebati, felixge, mitar, pda

httpsnoop's Issues

New Release?

It looks like the last release was quite a while ago, and since then #23/#24 have been implemented.

Could you release these for easy consumption, or is there something "missing"?

Start concrete comparison list against other current implementations

I really like the educational benefit this project has for the community. I think it has the potential to teach router library and middleware authors how to make their own implementations more robust.
Thus, I would like to suggest adding a comparison list to the readme that outlines the current "imperfections"/differences of other major router libraries.

I think a good start would be the https://github.com/pressley/chi router and their https://github.com/pressly/chi/blob/master/middleware/wrap_writer.go middleware responsewriter. It always looked very complete and modern to me with their support for 1.8 and the http.CloseNotifier, http.Flusher, http.Hijacker, http.Pusher, and io.ReaderFrom interfaces. But now I have become sceptical.

I understand this list has potential for blame-gaming but I believe written in an informative and as-neutral-as-possible style it will actually help router and middleware authors to reach higher-quality implementations.

Storing values along with the wrapped ResponseWriter for later access

Thanks for bringing up the common pitfall of naively embedding http.ResponseWriter and for maintaining this package as the solution until the standard library updates its APIs. While trying to use this library in our application, I found that a common reason for embedding http.ResponseWriter is to compute and store values in middlewares so that they are accessible in handlers or other middlewares later. With httpsnoop.Wrap, this is impossible because there is no way to store additional values (a closure would be the only way, but the lifecycle of the variables is bounded by the outer function). Do you have any suggestions for how we can achieve this?

The only approach I can think of now is to introduce a value store, possibly similar to Context.Value, in the hook that can be accessible in the lifecycle of the request. The API might be

if values, ok := w.(httpsnoop.Values); ok {
  // values could be map[string]interface{}, url.Values, or custom types.
}

Do you think that would be a good addition to the existing APIs?

Ability to access the wrapped ResponseWriter

Thanks for the library.

It would be useful to be able to explicitly access the underlying ResponseWriter.

For example New Relic's go-agent implements their newrelic.Transaction as an extension of http.ResponseWriter. I'd like to write middleware which type-asserts txn, ok := w.(newrelic.Transaction) in order to e.g. txn.AddAttribute(). But when w is already wrapped in httpsnoop the newrelic.Transaction type-assertion fails. I guess this is a case you've described in the README as “and it won't work for applications adding their own interfaces into the mix”.

However I'd be happy to do something like this:

// httpsnoop.Wrapper is the interface proposed in this issue.
if snoop, ok := w.(httpsnoop.Wrapper); ok {
  if txn, ok := snoop.Unwrap().(newrelic.Transaction); ok {
    txn.AddAttribute("foo", "bar")
  }
}

This would require a method on httpsnoop's type rw e.g. rw.Unwrap() or rw.ResponseWriter(), and an interface containing that method (although the caller could be responsible for declaring the interface).

(The httpsnoop assertion may need to be recursive in case it's wrapped multiple times; e.g. a request that has passed through a logging middleware and a metrics middleware that each use httpsnoop. But that's up to the caller).

Perhaps the patch would be something like this?

--- a/codegen/main.go
+++ b/codegen/main.go
@@ -127,6 +127,14 @@ type rw struct {
        w http.ResponseWriter
        h Hooks
 }
+
+type Wrapper interface {
+       Unwrap() http.ResponseWriter
+}
+
+func (w *rw) Unwrap() http.ResponseWriter {
+       return w.w
+}
 `)
        for _, iface := range ifaces {
                for _, fn := range iface.Funcs {

Support 1xx headers

Go 1.19 gained support for 103 Early Hints (golang/go#42597). The HTTP server implementation changed to allow 1xx headers, so a common way to send them is now to set headers and call WriteHeader(http.StatusEarlyHints). The handler can then set more headers and continue with the regular response, either by calling WriteHeader again or by just calling Write.

From a quick look at the code here, I think CaptureMetrics might not work correctly, as it assumes WriteHeader is called only once. I can make a PR to fix that.

But I am not sure about any other breakage which might be happening because of that change. Anything?

Doesn't play well with HTTP GET when request body is present

Hello! Thanks for writing a good module.

I know hardly anyone uses HTTP GET with a body in the wild, but it is possible to send a GET request with body data. And, from what I accidentally noticed, when your library is plugged into the request path, the request fails with this error:

2020/09/04 21:52:31 httputil: ReverseProxy read error during body copy: stream error: stream ID 1; PROTOCOL_ERROR

Here's what I do:

url, err := url.Parse("https://www.google.com")
proxy := logRequestHandler(httputil.NewSingleHostReverseProxy(url))

Try GET <host>/favicon.ico with any body and it errors. logRequestHandler is:

func logRequestHandler(h http.Handler) http.Handler {
	fn := func(w http.ResponseWriter, r *http.Request) {
		m := httpsnoop.CaptureMetrics(h, w, r)
		log.Println("RESPONSE SIZE: " + strconv.FormatInt(m.Written, 10))
	}
	return http.HandlerFunc(fn)
}

The error doesn't happen if I remove "logRequestHandler()". The other part of the program if you're curious to test my setup:

func main() {
	port := ":80"
	http.HandleFunc("/", handler())
	log.Println("Listening on " + port)
	log.Fatal(http.ListenAndServe(port, nil))
}

func handler() func(http.ResponseWriter, *http.Request) {
	return func(res http.ResponseWriter, req *http.Request) {
		url, err := url.Parse("https://www.google.com")
		if err != nil {
			panic(err)
		}

		// Remove logRequestHandler() and the error will be gone
		proxy := logRequestHandler(httputil.NewSingleHostReverseProxy(url))

		req.URL.Host = url.Host
		req.Host = url.Host
		req.URL.Scheme = url.Scheme
		req.Header.Set("X-Forwarded-Host", url.Host)
		proxy.ServeHTTP(res, req)
	}
}

Hope this helps to make this library even better! :)

Create a new release that includes the new Unwrapper API

Currently we can only use the Unwrapper API by explicitly specifying the commit hash in go.mod. Is the API intentionally not included in the newest release? If not, this is just a friendly reminder that a new release is needed to publish #12 (and later PRs) :).

runtime.newStack getting called from Wrap?

We're looking at CPU profiles of our servers, and noticed that there's a considerable amount of runtime.newStack getting attributed to httpsnoop.Wrap:
[Screenshot of the CPU profile, 2024-02-06]

I cannot find any explicit code doing that though, do you have any pointers here that could help us track this down (and ultimately eliminate)?

CaptureMetrics will crash the entire program if a handler panics

Foreword: Hey there, I really like this package (and I really like not needing to maintain my own).

Because you're running the http handler in a new goroutine, if the handler panics it will take down the whole program. The standard http server actually recovers from panics in handlers, so crashing completely is pretty bad.

The easiest solution is just to move to locks. The code gets simpler and faster.

Little program to reproduce:

package main

import (
    "log"
    "net/http"
    "net/http/httptest"
    "github.com/felixge/httpsnoop"
)

func main() {
    badHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        panic("oh no!")
    })

    server1 := httptest.NewServer(badHandler)
    defer server1.Close()

    server2 := httptest.NewServer(metricsMiddleware(badHandler))
    defer server2.Close()

    // This should cause the server to print out a panic traceback, but not to actually die.
    http.Get(server1.URL)
    log.Println("Program continued executing!!!")

    http.Get(server2.URL)
    log.Println("Program died so I won't get printed :(")
}

func metricsMiddleware(h http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        metrics := httpsnoop.CaptureMetrics(h, w, r)
        log.Printf("response code was: %d", metrics.Code)
    })
}

Thank You for this ....

This is a very good and lightweight way to get metrics from handlers without having to resort to a whole framework. Thank you! I'm using this to grab metrics and populate Prometheus collectors.

thank you thank you for this

Not possible to get w.(http.Pusher)

Hi, I am using an HTTP/2 server (go1.8) with your library. But it seems the ResponseWriter is changed, so I cannot get w.(http.Pusher) and use the HTTP/2 Push feature.

avoid multiple wrapping

Since it's possible to have multiple uses of httpsnoop in a project, coming from different imported modules (e.g. when using server-timing), it's also possible to end up wrapping a ResponseWriter multiple times. This is not ideal because it has a noticeable performance impact:

cpu: Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz
BenchmarkBaseline
BenchmarkBaseline-8                     745539247                1.477 ns/op           0 B/op          0 allocs/op
BenchmarkCaptureMetrics
BenchmarkCaptureMetrics-8                3790958               296.2 ns/op           225 B/op          7 allocs/op
BenchmarkCaptureMetricsTwice
BenchmarkCaptureMetricsTwice-8           1912039               581.3 ns/op           450 B/op         14 allocs/op

(these benchmarks were made after switching httptest.NewServer for direct h.ServeHTTP calls, in order to avoid the unrelated overhead of the server; see #20 )

Ideally we'd return the same Metrics instance when re-wrapping an already wrapped ResponseWriter.

In case we explicitly want to measure at two different places in the middleware-chain, we'd still be able to Unwrap before re-wrapping.

WDYT?

In metrics, do not use 200 as default code

When handling context cancellation, it can happen that the handler is never executed (or at least never gets as far as writing headers). It would be great if I could still use this package to log metrics. And it works. But the issue is that I get 200 as the status code. I would prefer the status code to be 0, because then it would be easy to filter those log entries and see which requests were cancelled.
