
Comments (13)

n8han avatar n8han commented on August 17, 2024 2

On 10/22/2012 09:19 AM, Sam Stainsby wrote:

I looked at this further today because I'm using the CouchDB HTTP API,
where, for example, you can make a request to create a database really neatly:

baseRequest.PUT / databaseName

.. or so I thought until I realised '/' characters are legal in a
database name, so they need to be encoded.

Oh man. I really wish they had not done that!

Admittedly, I'm a dispatch
newbie, but so far I haven't seen any idiomatic way of doing this
neatly. eg. if I pre-encode the databaseName part, it gets double
encoded ("a%2Fname" -> a%252Fname), which is still wrong. Is there even
any solution that doesn't involve rendering the URL back to a string
and then appending the path component?

The latest release is using java.net.URI to construct and deconstruct
URL paths, which itself doesn't support the behavior Couch requires.

In general, not taking the opportunity to fully
sanitize URL paths could have some fairly serious security consequences
for the unwary, much like security holes from unsanitized file paths.

I might wander back into the code - I already looked earlier today at
the last change. It's a shame that Java/Scala doesn't have a built-in
encodeURIComponent method like JavaScript. I'm not sure of the
veracity of John Topley's solution to that problem here:
http://stackoverflow.com/questions/607176/, which can be reforged in
Scala thus:

Yeah I've come across a few of those in my past googles. Instead of
doing that I'd rather just write an encoder from scratch. It's not that
difficult and should perform better than this, plus we'll know exactly
what it's doing for all characters.

In fact I started doing that for the last release (I hope I kept the
branch), before I learned that java.net.URI comes so close to doing what
we need. It's just for slash encoding that it lets us down, and
unfortunately there's no workaround for it.

Since this makes it impossible to write a good interface to Couch we'll
need to come up with a solution.

Nathan

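import java.net.URLEncoder

object encodeURIComponent {
  def apply(s: String): String = {
    // URLEncoder does form encoding, so undo the parts that differ
    // from JavaScript's encodeURIComponent.
    URLEncoder.encode(s, "UTF-8").
      replaceAll("\\+", "%20").
      replaceAll("%21", "!").
      replaceAll("%27", "'").
      replaceAll("%28", "(").
      replaceAll("%29", ")").
      replaceAll("%7E", "~")
  }
}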

... which is what I used to pre-encode the database paths in my experiments.


from reboot.

arosien avatar arosien commented on August 17, 2024

See http://stackoverflow.com/questions/724043/http-url-address-encoding-in-java

adamdecaf avatar adamdecaf commented on August 17, 2024

Yea, it looks like you're right. So, it would need to be something like: adamdecaf@4e239c7

Also, we don't need to be escaping the URL because async does it for us.

https://github.com/sonatype/async-http-client/blob/master/api/src/main/java/com/ning/http/client/RequestBuilderBase.java#L327

arosien avatar arosien commented on August 17, 2024

Thanks! I had to revert to 0.8.5 because of this bug.

On Fri, Sep 14, 2012 at 9:45 AM, Adam Shannon wrote:

Yea, it looks like you're right. So, it would need to be something like:
adamdecaf@4e239c7

Also, we don't need to be escaping the URL because async does it for us.

https://github.com/sonatype/async-http-client/blob/master/api/src/main/java/com/ning/http/client/RequestBuilderBase.java#L327


n8han avatar n8han commented on August 17, 2024

Also, we don't need to be escaping the URL because async does it for us.

In my testing, it does not. I requested http://ja.wikipedia.org/wiki/メインページ and the client sent an invalid GET with that non-ASCII path.

n8han avatar n8han commented on August 17, 2024

As far as I can tell, java.net.URI will not help us encode; it just throws exceptions when path elements that include spaces or other invalid characters are submitted.

I'm looking around and not finding anything (on the core classpath) that will encode path elements correctly. We can handle unicode characters (like my example above) with URI#toASCIIString. But getting req / "a & b" to do the right thing may not be feasible.
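Both behaviours described here can be reproduced with the JDK alone. The following is a minimal sketch, assuming only the single-argument `java.net.URI` constructor and no Dispatch code; the value names are illustrative.

```scala
import java.net.{URI, URISyntaxException}
import scala.util.Try

// The single-argument constructor parses the string as-is and rejects
// illegal characters such as a space instead of encoding them.
val withSpace = Try(new URI("http://localhost/a & b"))

// Non-ASCII characters are accepted by the parser, and toASCIIString
// percent-encodes them as UTF-8 bytes.
val wiki = new URI("http://ja.wikipedia.org/wiki/メインページ")
val ascii = wiki.toASCIIString
```

Here `withSpace` is a `Failure` wrapping a `URISyntaxException`, while `ascii` holds the same percent-encoded form that a browser would send for that page.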

adamdecaf avatar adamdecaf commented on August 17, 2024

This seems to handle a lot more. We'd just have to get every url to this setup.

scala> val url = """http://ja.wikipedia.org/wiki/メインページ"""
url: java.lang.String = http://ja.wikipedia.org/wiki/メインページ

scala> new URI("http", url.drop(5), null)
res5: java.net.URI = http://ja.wikipedia.org/wiki/メインページ

scala> res5.toASCIIString
res6: java.lang.String = http://ja.wikipedia.org/wiki/%E3%83%A1%E3%82%A4%E3%83%B3%E3%83%9A%E3%83%BC%E3%82%B8

scala> new URI("http", "//www.someurl.com/path name/?query=par am", "#he llo").toASCIIString
res7: java.lang.String = http://www.someurl.com/path%20name/?query=par%20am#%23he%20llo

scala> new URI("http", "example.com/a & b?index", null)
res8: java.net.URI = http:example.com/a%20&%20b?index

n8han avatar n8han commented on August 17, 2024

Oh, nice. I wasn't aware of the behavior shown in your res7 and 8. I guess I abandon the URL encoder I started last night.

stainsby avatar stainsby commented on August 17, 2024

I'm a bit confused as to whether slashes are supposed to be encoded or not. For example:

scala> url("http://localhost/") / "foo/bar" build
res9: com.ning.http.client.Request = http://localhost/foo/bar   GET

Is that the desired behaviour? If you were passing in a parameter, say something like this:

  url("http://localhost/resource/") / resourceId

I wouldn't expect slashes in resourceId to add path components.

n8han avatar n8han commented on August 17, 2024

I'm not convinced either that it is ideal but it's the only way that
java.net.URI wants to behave.

If anyone wants to reimplement the functionality it provides (mostly
path URL encoding, which is distinct from the query-string URL encoding
that URLEncoder provides), feel free.
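The difference between the two kinds of encoding can be seen with the JDK alone. This is a small sketch, assuming the four-argument `java.net.URI` constructor (scheme, host, path, fragment) stands in for the path encoding mentioned above; the input string is made up for illustration.

```scala
import java.net.{URI, URLEncoder}

val input = "a b+c"

// Form/query-string encoding: a space becomes '+' and a literal '+'
// becomes %2B.
val formEncoded = URLEncoder.encode(input, "UTF-8")

// Path encoding through the multi-argument URI constructor: a space
// becomes %20 and '+' is left alone because it is legal in a path.
val pathEncoded = new URI(null, null, "/" + input, null).getRawPath
```

`formEncoded` is `a+b%2Bc` and `pathEncoded` is `/a%20b+c`, so feeding URLEncoder output into a path would change the meaning of both characters.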

On 10/22/2012 12:59 AM, Sam Stainsby wrote:

I'm a bit confused as to whether slashes are supposed to be encoded or
not. For example:

scala> url("http://localhost/") / "foo/bar" build
res9: com.ning.http.client.Request = http://localhost/foo/bar GET

Is that the desired behaviour? If you were passing in a parameter, say
something like this:

  url("http://localhost/resource/") / resourceId

I wouldn't expect slashes in resourceId to add path components.


stainsby avatar stainsby commented on August 17, 2024

I looked at this further today because I'm using the CouchDB HTTP API, where, for example, you can make a request to create a database really neatly:

baseRequest.PUT / databaseName

.. or so I thought until I realised '/' characters are legal in a database name, so they need to be encoded. Admittedly, I'm a dispatch newbie, but so far I haven't seen any idiomatic way of doing this neatly. eg. if I pre-encode the databaseName part, it gets double encoded ("a%2Fname" -> a%252Fname), which is still wrong. Is there even any solution that doesn't involve rendering the URL back to a string and then appending the path component?

In general, not taking the opportunity to fully sanitize URL paths could have some fairly serious security consequences for the unwary, much like security holes from unsanitized file paths.

I might wander back into the code - I already looked earlier today at the last change. It's a shame that Java/Scala doesn't have a built-in encodeURIComponent method like JavaScript. I'm not sure of the veracity of John Topley's solution to that problem here: http://stackoverflow.com/questions/607176/, which can be reforged in Scala thus:

object encodeURIComponent {
  def apply (s: String): String = {
    URLEncoder.encode(s, "UTF-8").
     replaceAll("\\+", "%20").
     replaceAll("\\%21", "!").
     replaceAll("\\%27", "'").
     replaceAll("\\%28", "(").
     replaceAll("\\%29", ")").
     replaceAll("\\%7E", "~")
  }
}

... which is what I used to pre-encode the database paths in my experiments.
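The double encoding described above can be reproduced without Dispatch. This is a sketch under the assumption that the path ends up in one of the multi-argument `java.net.URI` constructors, which always quote the `%` character; the host and value names are illustrative.

```scala
import java.net.{URI, URLEncoder}

val databaseName = "a/name"

// Pre-encoding the name by hand turns the slash into %2F...
val preEncoded = URLEncoder.encode(databaseName, "UTF-8")

// ...but the multi-argument URI constructors always quote '%',
// so the escape itself gets escaped.
val doubled = new URI("http", "localhost", "/" + preEncoded, null).toASCIIString

// Passing the raw name instead keeps the slash as a path separator.
val split = new URI("http", "localhost", "/" + databaseName, null).toASCIIString

// The single-argument constructor parses an already-encoded string and
// leaves existing escapes alone: the "render to a string and append"
// workaround.
val reparsed = new URI("http://localhost/" + preEncoded)
```

`doubled` ends in `/a%252Fname` and `split` ends in `/a/name`, while `reparsed.getRawPath` keeps `/a%2Fname`. Neither multi-argument call produces the single `%2F` that CouchDB needs.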

stainsby avatar stainsby commented on August 17, 2024

Heh, I put this gist up at almost the same time as your commit: https://gist.github.com/3936749 .. Note that '+' is also valid in a couch database name ("A database must be named with all lowercase letters (a-z), digits (0-9), or any of the _$()+-/ characters" http://wiki.apache.org/couchdb/HTTP_database_API). Hence I also do some dancing around with '+' chars. It seems to work.

n8han avatar n8han commented on August 17, 2024

Yeah, at the moment we don't need query string encoding because the underlying library handles it, correctly as far as I am aware. So a + in the input should pass through untouched, while the standard ASCII space becomes %20.

scala> url("http://localhost/b/") / "a/b c+é" build
res3: com.ning.http.client.Request = http://localhost/b/a%2Fb%20c+%E9   GET

I think this is, finally, correct
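For reference, a from-scratch path-segment encoder of the kind discussed in this thread could look roughly like the sketch below. It is not Dispatch's implementation: the object name `PathSegmentEncoder` is hypothetical, the safe-character set is taken from the `pchar` rule in RFC 3986, and it assumes UTF-8, so `é` comes out as `%C3%A9` rather than the `%E9` shown in the output above.

```scala
import java.nio.charset.StandardCharsets

// Hypothetical helper for illustration, not part of Dispatch.
object PathSegmentEncoder {
  // Punctuation that RFC 3986 allows unescaped inside a path segment:
  // the unreserved marks, the sub-delims, ':' and '@'.
  private val safePunctuation = "-._~!$&'()*+,;=:@"

  private def isSafe(c: Char): Boolean =
    (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
      (c >= '0' && c <= '9') || safePunctuation.indexOf(c) >= 0

  // Percent-encodes every UTF-8 byte that is not a safe ASCII character,
  // so '/' becomes %2F, a space becomes %20 and '+' passes through.
  def encode(segment: String): String =
    segment.getBytes(StandardCharsets.UTF_8).map { b =>
      val unsigned = b & 0xff
      if (isSafe(unsigned.toChar)) unsigned.toChar.toString
      else "%%%02X".format(unsigned)
    }.mkString
}
```

With this sketch, `PathSegmentEncoder.encode("a/b c+é")` gives `a%2Fb%20c+%C3%A9`, and a pre-encoded input such as `a%2Fname` is encoded again to `a%252Fname`, so callers would pass raw segments only.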
