Comments (22)
rust-postgres matches libpq's connection string syntax, so I'd prefer to not be forced to switch it to punycode: http://www.postgresql.org/docs/9.2/static/libpq-connect.html#LIBPQ-CONNSTRING
from rust-url.
Percent-encoding works. It’s / that is black-listed in hosts/domains, per spec. Url::parse("http://%2F") returns Err(InvalidDomainCharacter). I suppose we could add a UrlParser flag to disable that black list, although serialization will be inconsistent unless we do percent-encoding there that is otherwise unnecessary.
A flag seems like it could be reasonable. If you don't want that kind of hackery to end up in rust-url, I can always pull the old liburl into rust-postgres.
This is a real usage of URLs that people are doing out there, I’ll file an issue with the spec about this blacklist.
Having a blacklist rule about the content of the host portion might better be considered a requirement of the http, https, and ftp schemes (and perhaps others), rather than of all relative URLs.
Any update on this?
@SimonSapin I don't understand how this is a problem. "postgres" is not a relative scheme per the URL Standard so nothing is blacklisted.
@annevk rust-url provides a mechanism to override what is considered a relative scheme, for this kind of use case. Sure, non-relative is less restrictive, but it’s also not very useful, you get a single string.
I wish this URL standard explained what a relative scheme means, rather than just listing 7 schemes.
It seems to be a huge oversight if the URL standard, in practical terms, simply marginalizes all other schemes.
@SimonSapin such a solution does not seem to scale very well. @mikedilger I'm interested in figuring out if we can find an alternative that would satisfy everyone. Please file a bug on the URL Standard.
@alexcrichton I think the basic issue here is that the first %2F should not be encoded. Thus the parser does not know where the path starts and the %2F is taken as part of the host name. RFC 3986 2.4 says "When a URI is dereferenced, the components ... must be ... separated before the percent-encoded octets ... can be safely decoded,...". What is the context that created this particular URL? Do the path components really need to be percent encoded?
@galsondor it comes from PostgreSQL's connection strings - see the bottom of section 31.1.1.2 here: http://www.postgresql.org/docs/9.4/static/libpq-connect.html#LIBPQ-CONNSTRING
@galsondor an example URL would be this: postgresql://postgres@%2Fvar%2Flib%2Fpostgresql/dbname
The %2F encodings are to keep those slashes from having "URL" meaning, and pass them through, as they have "postgres" meaning.
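The idea discussed here (separate the URL components first, decode percent-escapes only afterwards) can be sketched with the standard library alone. This is a rough illustration, not rust-url's or rust-postgres's actual code: `split_url` and `percent_decode` are hypothetical helpers, and the splitting assumes the simple `scheme://[user@]host/path` shape of the example URL, with no port, password, query, or fragment handling.

```rust
/// Hypothetical helper: split "scheme://[user@]host/path" into
/// (user, raw_host, path) WITHOUT decoding any percent-escapes.
fn split_url(url: &str) -> Option<(Option<&str>, &str, &str)> {
    let (_scheme, rest) = url.split_once("://")?;
    // The first literal '/' ends the authority; an encoded "%2F" does not.
    let (authority, path) = match rest.find('/') {
        Some(idx) => (&rest[..idx], &rest[idx + 1..]),
        None => (rest, ""),
    };
    let (user, host) = match authority.rsplit_once('@') {
        Some((u, h)) => (Some(u), h),
        None => (None, authority),
    };
    Some((user, host, path))
}

/// Hypothetical helper: decode %XX escapes. Returns None on a malformed
/// escape or if the decoded bytes are not valid UTF-8.
fn percent_decode(input: &str) -> Option<String> {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' {
            let hex = input.get(i + 1..i + 3)?;
            if !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
                return None;
            }
            out.push(u8::from_str_radix(hex, 16).ok()?);
            i += 3;
        } else {
            out.push(bytes[i]);
            i += 1;
        }
    }
    String::from_utf8(out).ok()
}
```

Under these assumptions, splitting the example URL above yields the raw host `%2Fvar%2Flib%2Fpostgresql` and the path `dbname`; decoding the host afterwards gives `/var/lib/postgresql`, so the encoded slashes never act as URL separators.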
@sfackler and @mikedilger, got it; thanks for the references. RFC 3986 2.4 also says that percent encodings should remain encoded until dereferenced, which in this case is probably the PostgreSQL client's job. Since the point of rust-url is to parse (presumably in order to dereference), perhaps this URL should remain a string until passed to the client. (Obviously there is a lot I don't know about your application; please forgive me if I am off base here.)
In sfackler's instance (rust-postgres) he is parsing it to determine how to setup the connection to the PostgreSQL server. I should hope a URL library could help him out, as the rest of the URL is formed very much like any other URL.
It comes down to the WhatWG's definition of a URL, and in particular the hierarchy they chose for breaking a URL into components. That hierarchy is, IMHO, not quite right, and this is a very good example of why.
@mikedilger what do you mean by hierarchy? The problem here is that, per spec, percent-encoding does not apply to host names.
@SimonSapin Percent encoding can be used in host names. Per RFC 3986: "host = IP-literal / IPv4address / reg-name" and "reg-name = *( unreserved / pct-encoded / sub-delims )". I think the issue is that "%2F..." is not a host name in the traditional, DNS sense. Postgres "netlocs" can be a path to a configuration directory on the local host. Thus, Postgres URLs can have two paths: the "netloc", which is in the host, and the "dbname", which is in the path.
The issue of parsing percent encoding in a relative scheme is, yes, an interesting issue, but I think it's moot because postgres is not a relative scheme (according to how that is apparently defined).
I'm referring to the hierarchy of the url components, what boundaries do you break down first, and then which of those breaks down further. Under WhatWG and this library (vs. the old rust url library) the notion of a "Relative Scheme" arises and includes (username,password,host,port,default_port,path), and as postgresql is not listed as a relative scheme (as that is how relative scheme seems to be defined), it is therefore a non-relative scheme, and therefore there is no parsing assistance available for any of those 6 components, you just get NonRelative(String) and are on your own.
Sorry, I've taken this off topic (although there is overlap). I may be confusing issues somewhat with #2.
I found something that avoids the problem altogether: PostgreSQL also allows components through query parameters:
postgres:///<db name>?host=/tmp
This doesn't immediately help because this is still an invalid URL (note the empty host in ///), but I just checked that the parameter takes precedence over the real host of the connection URL. For example, this will connect to the socket in /tmp:
postgres://dummy/<db name>?host=/tmp
With this, I think this is not a problem anymore!
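As a rough illustration of the precedence rule described in this comment, a client could resolve the connection target along these lines. This is only a sketch: `effective_host` is a hypothetical helper, not part of rust-url or rust-postgres, and it does no percent-decoding and ignores fragments and ports. `mydb` below stands in for the `<db name>` placeholder.

```rust
/// Hypothetical helper: return the `host` query parameter if one is
/// present, otherwise fall back to the host in the URL's authority.
fn effective_host(url: &str) -> Option<&str> {
    let (_scheme, rest) = url.split_once("://")?;
    let (before_query, query) = match rest.split_once('?') {
        Some((b, q)) => (b, Some(q)),
        None => (rest, None),
    };
    // A `host=` query parameter overrides the authority host.
    if let Some(q) = query {
        for pair in q.split('&') {
            if let Some(("host", value)) = pair.split_once('=') {
                return Some(value);
            }
        }
    }
    // No override: take everything before the first '/' and drop "user@".
    let authority = before_query.split('/').next()?;
    let host = authority.rsplit_once('@').map_or(authority, |(_, h)| h);
    Some(host)
}
```

With this sketch, `postgres://dummy/mydb?host=/tmp` resolves to `/tmp`, while `postgres://dummy/mydb` resolves to `dummy`.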
@sfackler, is @nox’s suggestion acceptable? If not, we can add an opt-in flag to change the parsing behavior, though I’d rather do it after #176 lands to avoid yet another rebase.
I am closing this.