Comments (25)
Bizarre. This is an issue with CsQuery. None of these result in a hang:
<div x=y/>
<svg x=y />
<svg x="y"/>
<xyz x=y/>
This does:
<svg x=y/>
I'll open an issue in CsQuery, although I'm afraid this might not get fixed because CsQuery is abandoned (see #34).
from htmlsanitizer.
Agreed, this is a CsQuery bug. I dug into it further and confirmed that CsQuery goes into an infinite loop for this input.
For the time being I have fixed that bug in my code base :)
Opened jamietre/CsQuery#194
Is this a security hole? I think we should work around CsQuery for this.
Definitely DoS potential here. I tried to track down the problem in CsQuery. The hang occurs here: https://github.com/jamietre/HtmlParserSharp/blob/862e09cabdb6ebf75bb9556f07b1660a0902c882/HtmlParserSharp/Core/TreeBuilder.cs#L1163-L1167
I didn't get any further, though. Can you see what's going on there?
The original validator.nu HTML parser code had a `break eofloop;` there (up until Aug 14, 2013, when they added features and ripped that whole part out):
https://github.com/validator/htmlparser/blob/8417f703507931965b19e9540229ad6c56bb448d/src/nu/validator/htmlparser/impl/TreeBuilder.java#L1357
That would translate into `goto breakEofloop;` instead of `goto continueEofloop;`.
Not sure how to proceed now since both CsQuery and HtmlParserSharp seem to be abandoned 😕
I am not sure if it is a good fix or not.
The way I fixed it was by replacing `goto continueEofloop;` with `goto breakEofloop;`.
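To illustrate why swapping the jump target matters, here is a minimal Java sketch of a labeled loop. It is a toy model, not the actual TreeBuilder code: the class name `EofLoopDemo`, the `run` method, and the `MAX_ITERATIONS` cap are all made up for the demonstration, and the cap exists only so the "continue" variant stops instead of hanging.

```java
// Toy model of a labeled end-of-file loop. This is NOT the real
// TreeBuilder code; every name here is hypothetical.
final class EofLoopDemo {
    // Demo-only safety cap so the example itself cannot hang.
    static final int MAX_ITERATIONS = 1_000;

    /**
     * Runs a labeled loop whose body never changes any state.
     * Returns the number of iterations executed before the loop exited.
     */
    static int run(boolean useBreak) {
        int iterations = 0;
        eofloop:
        for (;;) {
            iterations++;
            if (iterations >= MAX_ITERATIONS) {
                // Stands in for "would spin forever" in a real parser.
                break eofloop;
            }
            // Stand-in for the branch discussed above: nothing in this
            // branch mutates state, so re-entering the loop repeats it.
            if (useBreak) {
                break eofloop;    // leave the loop, as in `break eofloop;`
            } else {
                continue eofloop; // re-enter with identical state
            }
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println("break:    " + run(true));
        System.out.println("continue: " + run(false));
    }
}
```

With `useBreak` set, the loop exits on its first pass. Without it, the loop only stops because of the artificial cap, which is the behaviour the C# `goto continueEofloop;` appears to have produced for the failing input.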
Opened jamietre/HtmlParserSharp#4
I've included HtmlSanitizer via NuGet. What do I have to do to get the fix you pushed to HtmlParserSharp into our project? Do we need to ditch NuGet and download and install manually from GitHub?
I've asked @jamietre to make a new NuGet release of CsQuery. Currently, you'd have to build CsQuery yourself to have this fix included.
Thanks. Does removing SVG from the allowed tag list prevent this from being a problem?
No. The svg tag isn't even in the default white list. This is a parser bug that's triggered before disallowed elements are removed.
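A rough sketch of the ordering described here, in Java: the parser stage sees every tag in the input, and the allow list is only consulted afterwards. The class `ParseThenFilterDemo`, both method names, and the regex-based "parser" are hypothetical stand-ins and have nothing to do with how CsQuery or HtmlSanitizer are actually implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy two-stage pipeline: parse first, filter second. Hypothetical names.
final class ParseThenFilterDemo {
    // Matches the name of an opening tag, e.g. "svg" in "<svg x=y/>".
    private static final Pattern OPEN_TAG =
            Pattern.compile("<([a-zA-Z][a-zA-Z0-9]*)");

    // Stage 1 (toy parser): collects every opening tag name.
    // It has no knowledge of the allow list.
    static List<String> parseTagNames(String html) {
        List<String> names = new ArrayList<>();
        Matcher m = OPEN_TAG.matcher(html);
        while (m.find()) {
            names.add(m.group(1).toLowerCase());
        }
        return names;
    }

    // Stage 2 (toy filter): runs only after parsing has finished.
    static List<String> keepAllowed(List<String> parsed, Set<String> allowed) {
        List<String> kept = new ArrayList<>();
        for (String name : parsed) {
            if (allowed.contains(name)) {
                kept.add(name);
            }
        }
        return kept;
    }
}
```

Because `parseTagNames` has to process `svg` before `keepAllowed` ever runs, a hang inside the parser stage cannot be avoided by changing the allow list.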
OK, thanks. Maybe I'll just wait for the AngleSharp version to be pushed to NuGet. Any idea when it will be stable enough for that?
For the record: This problem will most likely also occur with the math element because, like svg, this element has a different namespace in XHTML than HTML elements and that's where the parser bug sits.
I hope I can make a beta release with AngleSharp in the next few days.
Nice, I'll test it for you when you do.
There are still a few tests failing due to a minor issue which will be fixed in AngleSharp 0.9 (due out shortly).
@mganss, any news on the AngleSharp part?
All tests are passing. I just pushed v3.0-beta.
Nice!
PS: the coverage has dropped a bit, and I think you should exclude the tests themselves ;)
https://coveralls.io/builds/4001780
Back up to ~100% after last commit 😄
I think tests should be covered as well, because what good is a test that is not executed (fully)?
Good point!
One downside, it "pollutes" the average code coverage IMO.
Good point, too 😄
Closing this because we have moved to AngleSharp.