Comments (25)

mganss commented on August 16, 2024

Bizarre. This is an issue with CsQuery. None of these result in a hang:

<div x=y/>
<svg x=y />
<svg x="y"/>
<xyz x=y/>

This does:

<svg x=y/>

I'll open an issue in CsQuery, although I'm afraid this might not get fixed because CsQuery is abandoned (see #34).

from htmlsanitizer.

samrules commented on August 16, 2024

Agreed, this is a CsQuery bug. I dug into it further and confirmed that the bug is indeed in CsQuery: it goes into an infinite loop for this input.

For the time being I have fixed that bug in my code base :)

mganss commented on August 16, 2024

Opened jamietre/CsQuery#194

304NotModified commented on August 16, 2024

Is this a security hole? I think we should work around CsQuery for this.

mganss commented on August 16, 2024

There is definitely DoS potential here. I tried to track down the problem in CsQuery. The hang occurs here: https://github.com/jamietre/HtmlParserSharp/blob/862e09cabdb6ebf75bb9556f07b1660a0902c882/HtmlParserSharp/Core/TreeBuilder.cs#L1163-L1167

I didn't get any further, though. Can you see what's going on there?

mganss commented on August 16, 2024

The original validator.nu HTML parser code had a break eofloop; there (up until Aug 14, 2013 when they added features and ripped that whole part out):
https://github.com/validator/htmlparser/blob/8417f703507931965b19e9540229ad6c56bb448d/src/nu/validator/htmlparser/impl/TreeBuilder.java#L1357

That would translate into goto breakEofloop; instead of goto continueEofloop;
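
To illustrate the difference, here is a minimal Java sketch of a labeled loop. It is not the actual TreeBuilder code: the class name, the method, and the iteration guard are made up for this example, and it only assumes that the parser state does not change in the affected branch.

```java
class EofLoopSketch {
    // Hypothetical stand-in for the end-of-file handling loop.
    // Returns the number of iterations, or -1 if the safety guard tripped.
    static int runEof(boolean useBreak, int maxIterations) {
        int iterations = 0;
        eofloop:
        for (;;) {
            iterations++;
            if (iterations >= maxIterations) {
                // Guard added only so this sketch terminates in the "hang" case.
                return -1;
            }
            // Assumption: nothing in this branch changes the loop state.
            if (useBreak) {
                break eofloop;    // leaves the loop, as in the older validator.nu code
            } else {
                continue eofloop; // re-enters the loop with the same state, forever
            }
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(runEof(true, 1000));
        System.out.println(runEof(false, 1000));
    }
}
```

With the break the loop exits on the first pass. With the continue it only stops because of the artificial guard, which the real parser does not have.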

Not sure how to proceed now since both CsQuery and HtmlParserSharp seem to be abandoned 😕

samrules commented on August 16, 2024

I am not sure whether it is a good fix.
I fixed it by replacing goto continueEofloop; with goto breakEofloop;

mganss commented on August 16, 2024

Opened jamietre/HtmlParserSharp#4

dmeagor commented on August 16, 2024

I've included HtmlSanitizer via NuGet. What do I have to do to get the fix you pushed to HtmlParserSharp into our project? Do we need to ditch NuGet and download and install manually from GitHub?

mganss commented on August 16, 2024

I've asked @jamietre to make a new NuGet release of CsQuery. Currently, you'd have to build CsQuery yourself to have this fix included.

dmeagor commented on August 16, 2024

Thanks. Does removing SVG from the allowed tag list prevent this from being a problem?

mganss commented on August 16, 2024

No. The svg tag isn't even in the default white list. This is a parser bug that's triggered before disallowed elements are removed.
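
The ordering can be sketched with a toy pipeline in Java. This is not the HtmlSanitizer or CsQuery API: the class, both methods, and the regex-based "parser" are hypothetical and exist only to show that the parse step sees the full input before the allow list is consulted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParseThenFilterSketch {
    // Toy "parser": collects the names of all opening tags in the input.
    static List<String> parse(String html) {
        Matcher m = Pattern.compile("<([a-zA-Z][a-zA-Z0-9]*)").matcher(html);
        List<String> tags = new ArrayList<>();
        while (m.find()) {
            tags.add(m.group(1).toLowerCase());
        }
        return tags;
    }

    // Step 1 parses everything; step 2 applies the allow list to the result.
    // seenByParser records what reached the parser, whatever the allow list says.
    static List<String> sanitize(String html, Set<String> allowed, List<String> seenByParser) {
        List<String> tags = parse(html);
        seenByParser.addAll(tags);
        List<String> kept = new ArrayList<>();
        for (String tag : tags) {
            if (allowed.contains(tag)) {
                kept.add(tag);
            }
        }
        return kept;
    }
}
```

Even with only p allowed, an input containing an svg tag still passes through parse first, so a hang inside the parser is reached before any filtering happens.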

dmeagor commented on August 16, 2024

OK, thanks. Maybe I'll just wait for the AngleSharp version to be pushed to NuGet. Any idea when it will be stable enough for that?

mganss commented on August 16, 2024

For the record: This problem will most likely also occur with the math element because, like svg, this element has a different namespace in XHTML than HTML elements and that's where the parser bug sits.

mganss commented on August 16, 2024

I hope I can make a beta release with AngleSharp in the next few days.

dmeagor commented on August 16, 2024

Nice, I'll test it for you when you do.

mganss commented on August 16, 2024

There are still a few tests failing due to a minor issue which will be fixed in AngleSharp 0.9 (due out shortly).

304NotModified commented on August 16, 2024

@mganss, any news on the AngleSharp part?

mganss commented on August 16, 2024

All tests are passing. I just pushed v3.0-beta.

304NotModified commented on August 16, 2024

Nice!

304NotModified commented on August 16, 2024

PS: the coverage has dropped a bit, and I think you should exclude the tests themselves ;)

https://coveralls.io/builds/4001780

mganss commented on August 16, 2024

Back up to ~100% after last commit 😄
I think tests should be covered as well, because what good is a test that is not executed (fully)?

304NotModified commented on August 16, 2024

Good point!

One downside: it "pollutes" the average code coverage, IMO.

mganss commented on August 16, 2024

Good point, too 😄

mganss commented on August 16, 2024

Closing this because we have moved to AngleSharp.
