I have an issue in em-http-request[0] that the client calls

a special normalize method? about addressable (CLOSED)

sporkmonger commented on June 7, 2024

a special normalize method?

from addressable.

Comments (5)

sporkmonger commented on June 7, 2024

The client should not be calling normalize, but this is a mistake in the client, not the URI parser. However, if it's been like this for a while, resolving it will likely cause breaking changes in any projects that depend on em-http-request and have been relying on this behavior, i.e., the 'edge cases' Ilya was referring to.

If Ilya needs convincing, point him at OAuth and ask him if he's ever tried using em-http-request in conjunction with an auth mechanism that signs parts of the URI. If you pre-normalize like this, the signatures won't match and the request will fail in ways that are nearly impossible to debug.

However, the particular case you've given is not an example of a semantic change. All URI-aware software should treat those two as equivalent. The main problem here is simply that if I give an HTTP client a URI, I expect it to make a request against exactly the byte-for-byte data I give it. Pre-normalizing is the kind of magic Ruby is known and sometimes reviled for, and we shouldn't be making a habit of that.

igrigorik commented on June 7, 2024

Bob, the normalize! call in em-http is a fairly recent addition. Perhaps I misunderstood the utility / semantics of it? I assumed the same behavior as the built-in URI lib...

ruby-1.9.2-p0 > require 'uri'
ruby-1.9.2-p0 > u = URI.parse('http://example.com/path?a=%28%2B%29')
ruby-1.9.2-p0 > u.normalize
=> #<URI::HTTP:0x00000101959fa0 URL:http://example.com/path?a=%28%2B%29>
ruby-1.9.2-p0 > u.normalize.to_s
"http://example.com/path?a=%28%2B%29"
ruby-1.9.2-p0 > u.query
"a=%28%2B%29"

ruby-1.9.2-p0 > require 'addressable/uri'
ruby-1.9.2-p0 > a = Addressable::URI.parse('http://example.com/path?a=%28%2B%29')
ruby-1.9.2-p0 > a.normalize!
=> #<Addressable::URI:0x80e588c0 URI:http://example.com/path?a=(+)>
ruby-1.9.2-p0 > a.query
"a=%28%2B%29"

I'm guessing you're following the normalization spec? [1] If that's the case, this is a tricky one. In theory, the URIs should be the same; in practice (due to server implementations) they are not. At the same time, the last thing I want to do is reimplement parts of Addressable in em-http.

It seems like saying "client shouldn't call normalize" defeats the purpose of the lib? Having said that, it's a catch-22 because that's what the spec says you should do. Ugh!

Any suggestions for how to deal with this?

[1] http://labs.apache.org/webarch/uri/rfc/rfc3986.html#normalize-encoding

sporkmonger commented on June 7, 2024

Addressable performs encoding normalization as per the spec, yes. It also performs all the other normalization steps given, like path segment normalization and so on. The problem is not that Addressable's normalization is non-conformant. The problem is that an HTTP client must not perform normalization prior to sending the request. Nowhere does any spec require a generalized HTTP client perform normalization prior to sending the request. That's always something that should be done manually.

Normalization can and often does result in a new identifier. It's a process that attempts to produce a new URI that points to the same resource as the original URI. From the spec: "Implementations may use logic based on the definitions provided by this specification to reduce the probability of false negatives." In other words, any time you perform normalization, you run the risk of a false negative; i.e., a new URI that points to the wrong resource.

In the case of an HTTP client, it's critical that the client makes a request against precisely the same URI it was given. Because of the way HTTP splits the URI in half and only passes the request URI section to the server, it's OK to normalize the scheme and authority piece. But a client should not attempt to normalize the path or query components unless explicitly requested to do so.
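A minimal sketch of that split, using only Ruby's standard `uri` library rather than Addressable. The helper name `normalize_scheme_and_authority` is hypothetical and not part of em-http-request or Addressable; it lowercases the scheme and host and passes the path and query through untouched.

```ruby
require 'uri'

# Hypothetical helper: normalize only the parts of the URI that are not
# sent to the server as the request URI. Path and query are left as given.
def normalize_scheme_and_authority(uri_string)
  uri = URI.parse(uri_string)
  uri.scheme = uri.scheme.downcase if uri.scheme
  uri.host = uri.host.downcase if uri.host
  uri.to_s
end

puts normalize_scheme_and_authority('HTTP://Example.COM/path?a=%28%2B%29')
# prints http://example.com/path?a=%28%2B%29
```

The percent-encoded query survives unchanged, so whatever the caller supplied is what goes on the wire.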

And as I pointed out to the other guy, OAuth 1.0 is a perfect example of why this is so important. If you were to sign a request prior to passing it through to the HTTP client, and then the client performed normalization, the signatures would no longer match. On top of that, the problem would be nearly impossible to debug, because it would work for almost all requests, and the signature would fail only when you encoded something that was not already in canonical form.
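The failure mode can be illustrated with a deliberately simplified signing scheme. This is not OAuth 1.0's actual signature base string; it is a plain HMAC over the whole URI, with a made-up secret and a hypothetical `sign` helper, using the two URI forms that appear earlier in this thread.

```ruby
require 'openssl'

# Assumption: a made-up shared secret, for illustration only.
SECRET = 'hypothetical-shared-secret'

# Hypothetical helper: sign the exact bytes of the URI string.
def sign(uri_string)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('SHA256'), SECRET, uri_string)
end

as_signed_by_caller  = 'http://example.com/path?a=%28%2B%29'
as_sent_after_normal = 'http://example.com/path?a=(+)'
already_canonical    = 'http://example.com/path?a=1'

# The caller signs one string, the server verifies another.
puts sign(as_signed_by_caller) == sign(as_sent_after_normal)  # prints false

# A URI that normalization leaves unchanged still verifies, which is
# why the bug only shows up for some requests.
puts sign(already_canonical) == sign(already_canonical)       # prints true
```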

sporkmonger commented on June 7, 2024

Also, in the particular example you gave, both implementations are quite possibly wrong. The most correct normalization may actually be http://example.com/path?a=(%2B), depending on the context.

However, to be clear, this one could probably be argued two different ways according to two different specifications (RFC 3986 vs HTML 4.01). Which pretty much should make this the perfect example of why you don't want to normalize here.
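One way to see the ambiguity is to decode the query value under both readings. This is a sketch, not a statement about what either library does internally: `percent_decode` is a hypothetical helper that only undoes `%XX` escapes, while `URI.decode_www_form_component` from the standard library applies form-encoding rules, where a literal `+` means a space.

```ruby
require 'uri'

# Hypothetical helper: undo %XX escapes only, leaving '+' alone.
def percent_decode(value)
  value.gsub(/%([0-9A-Fa-f]{2})/) { Regexp.last_match(1).hex.chr }
end

original   = '%28%2B%29'  # query value as given in the thread
unescaped  = '(+)'        # value after full percent-decoding
partial    = '(%2B)'      # value that keeps the plus sign escaped

[original, unescaped, partial].each do |value|
  puts "#{value.ljust(10)} percent: #{percent_decode(value).inspect}  " \
       "form: #{URI.decode_www_form_component(value).inspect}"
end
```

Under plain percent-decoding all three forms yield `(+)`, but under form decoding the fully unescaped `(+)` yields `( )`, so a server that parses the query as form data sees a different value.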

igrigorik commented on June 7, 2024

Bob, that makes sense. Let me take a pass over the code in em-http. Should be able to remove the normalize! call without too much trouble, since it's localized to a single location. Just have to make sure that the requests are dispatched correctly in a few edge cases.
