Comments (6)
This is unlikely to change the result, but I did note in RFC 2368 that the comma should apparently be encoded. See http://www.ietf.org/rfc/rfc2368.txt and in particular the example at the end of page 1.
From: Alexander [mailto:[email protected]]
Sent: 07 August 2015 17:38
To: mganss/HtmlSanitizer [email protected]
Subject: [HtmlSanitizer] Throws exception on multiple recipients in a email. (#41)
Sanitize the following HTML with the mailto: scheme enabled:
Actual:
System.UriFormatException : Invalid URI: The hostname could not be parsed.
at System.Uri.CreateHostStringHelper(String str, UInt16 idx, UInt16 end, ref Flags flags, ref String scopeId)
at System.Uri.CreateHostString()
at System.Uri.GetComponentsHelper(UriComponents uriComponents, UriFormat uriFormat)
at System.Uri.GetComponents(UriComponents components, UriFormat format)
at System.Uri.get_AbsoluteUri()
at Ganss.XSS.HtmlSanitizer.SanitizeUrl(String url, String baseUrl)
at Ganss.XSS.HtmlSanitizer.Sanitize(String html, String baseUrl, IOutputFormatter outputFormatter)
Expected:
No exception is thrown.
from htmlsanitizer.
I think HtmlSanitizer should "eat" the exception?
@Jawvig seems like System.Uri is not very consistent. For example, I could not reproduce the same behaviour for the http: scheme.
So, I agree with @304NotModified. Ignoring the exception and stripping the "invalid" href is probably the way to go.
Some info about the (annoying) system uri. See #8
http://blogs.msdn.com/b/ncl/archive/2010/02/23/system-uri-f-a-q.aspx
Ps @mganss great covery results!
@Jawvig RFC 2368 was superseded by RFC 6068 which allows unencoded commas:
mailtoURI = "mailto:" [ to ] [ hfields ]
to = addr-spec *("," addr-spec )
...
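To make the grammar above concrete, here is a rough Python sketch (not .NET, and not HtmlSanitizer code) of how the "to" part can be read under RFC 6068. The function name mailto_recipients is made up for illustration. It assumes that a literal comma always separates recipients and that a comma belonging to an address arrives percent-encoded, so it splits first and decodes afterwards.

```python
from urllib.parse import urlsplit, unquote


def mailto_recipients(uri: str) -> list[str]:
    """Return the recipients in the 'to' part of a mailto URI.

    Hypothetical helper for illustration only. It does not validate
    each addr-spec; it only shows the comma-separated structure.
    """
    parts = urlsplit(uri)
    if parts.scheme.lower() != "mailto":
        raise ValueError("not a mailto URI: " + uri)
    # Split on literal commas before percent-decoding, so an encoded
    # comma (%2C) inside an address is not treated as a separator.
    return [unquote(addr) for addr in parts.path.split(",") if addr]


if __name__ == "__main__":
    print(mailto_recipients("mailto:a@example.com,b@example.com?subject=Hi"))
```

The hfields part (everything after "?") ends up in the query component and is ignored here.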
In .NET 4.5 and above no exception is thrown because Uri.TryCreate() returns false and the URI is stripped. But of course this means that System.Uri is currently not compliant with RFC 6068, and Microsoft doesn't seem to intend to fix it soon: https://connect.microsoft.com/VisualStudio/feedback/details/794758/system-uri-incorrectly-rejects-mailto-uris
I agree that the best way to deal with this issue is to catch the exception and strip the URI even though it's legal. If someone insisted on keeping these kinds of URIs, they'd have to handle RemovingAttribute and check the URI themselves.
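The catch-and-strip idea can be sketched roughly as follows. This is language-neutral Python, not the library's actual C# code: sanitize_href, strict_parse and keep_anyway are all hypothetical names. strict_parse only stands in for a parser that rejects comma-separated mailto recipients, and keep_anyway plays the role that a RemovingAttribute handler would play for callers who want to keep such URIs.

```python
from typing import Callable, Optional


def strict_parse(url: str) -> str:
    """Stand-in for a strict URI parser (an assumption, not System.Uri).

    It rejects mailto URIs that contain a comma, mimicking the
    behaviour described in this issue.
    """
    if url.lower().startswith("mailto:") and "," in url:
        raise ValueError("Invalid URI: The hostname could not be parsed.")
    return url


def sanitize_href(
    href: str,
    parse: Callable[[str], str],
    keep_anyway: Optional[Callable[[str], bool]] = None,
) -> Optional[str]:
    """Return the parsed href, or None if it should be stripped."""
    try:
        return parse(href)
    except ValueError:
        # The parser rejected the URI. Strip it unless the caller's
        # hook has checked the URI itself and decided to keep it.
        if keep_anyway is not None and keep_anyway(href):
            return href
        return None


if __name__ == "__main__":
    two = "mailto:a@example.com,b@example.com"
    print(sanitize_href(two, strict_parse))
    print(sanitize_href(two, strict_parse, keep_anyway=lambda h: h.startswith("mailto:")))
```

By default the rejected URI is dropped without raising. Only a caller that opts in through the hook gets it back unchanged.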
@304NotModified Thanks 😄