
Comments (18)

messense commented on June 19, 2024

Neither feature parity with Bleach nor full ammonia API is the goal for nh3 at the moment, but I'm happy to add more knobs for projects that want to migrate from bleach to nh3.

from nh3.

xmo-odoo commented on June 19, 2024

Others will likely opine, but FWIW, looking at our current use of bleach, Ammonia seems to support most of what we need (though some of it by hand-rolling through the ultra-generic attribute_filter, which may not be so useful). However, nh3 does not currently expose the following:

  • attribute_filter, obviously: we would need this to filter / clean up @style (like Bleach's CSSSanitizer), something which Ammonia does not currently seem to support directly (rust-ammonia/ammonia#179)
  • also the same for a finer / more flexible handling of data: urls which is not currently in ammonia (rust-ammonia/ammonia#154)
  • clean_content_tags which allows removing the tag and all of its content (the default whitelist removes the tag itself and "unwraps" its content I believe)

We also customise the serialization compared to the bleach default (which is just html5lib's as far as I can tell), but html5ever doesn't seem to have tuning knobs there (or at least not any which is relevant to what we configured) so there's definitely nothing you could do.

FWIW for the first two items nh3 might be able to provide bespoke whitelists from which it'd compose the relevant attribute_filter internally, but that may or may not be desirable (and for the first it would add a dependency on something like cssparser to implement a rust-level sanitizer).
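For the data: URL item, a rough sketch of what a hand-rolled callback could look like. It assumes the callback receives (tag, attribute, value) and returns either the value to keep or None to drop the attribute, which is how Ammonia documents attribute_filter. The MIME whitelist, the URL_ATTRIBUTES set and the helper names are made up for illustration, and it is an assumption that "data" also has to be listed in url_schemes for such URLs to survive cleaning at all.

```python
# Hypothetical policy: only inline images of these MIME types are accepted.
ALLOWED_DATA_MIME_TYPES = {"image/png", "image/jpeg", "image/gif"}
# Hypothetical list of attributes whose values are treated as URLs here.
URL_ATTRIBUTES = {"href", "src"}


def filter_data_urls(tag, attribute, value):
    """Return the value to keep it, or None to drop the attribute."""
    if attribute not in URL_ATTRIBUTES:
        return value  # not a URL attribute: leave it alone
    stripped = value.strip()
    if not stripped.lower().startswith("data:"):
        return value  # ordinary URL: left to the url_schemes check
    if tag != "img" or attribute != "src":
        return None  # data: URLs are only tolerated on <img src>
    # "data:image/png;base64,AAAA" -> "image/png"
    header = stripped[len("data:"):].split(",", 1)[0]
    mime_type = header.split(";", 1)[0].strip().lower()
    return value if mime_type in ALLOWED_DATA_MIME_TYPES else None


def clean(html):
    import nh3  # third-party; imported lazily so the filter is testable alone

    return nh3.clean(
        html,
        url_schemes={"http", "https", "mailto", "data"},  # assumed necessary
        attribute_filter=filter_data_urls,
    )
```

The filter fails closed: anything that looks like a data: URL but does not match the whitelist is dropped rather than rewritten.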

messense commented on June 19, 2024

Added attribute_filter in #11.

messense commented on June 19, 2024

Added clean_content_tags in #12

xmo-odoo commented on June 19, 2024

❤️

camflan commented on June 19, 2024

Bleach offers the ability to pass callbacks which can modify/augment/remove attributes from each element. Is there a chance to get something similar here? One of our use-cases is to add target="_blank" to some anchor elements

messense commented on June 19, 2024

Bleach offers the ability to pass callbacks which can modify/augment/remove attributes from each element. Is there a chance to get something similar here?

@camflan Added tag_attribute_values for this in v0.2.11.

See also a2ec808

MikeVL commented on June 19, 2024

Bleach offers the ability to pass callbacks which can modify/augment/remove attributes from each element. Is there a chance to get something similar here?

@camflan Added tag_attribute_values for this in v0.2.11.

It's not entirely clear how to add target="_blank" to an a element only if its href value is https://github.com, for example.

And is it possible to rename an element, for example h1 -> h2?

xmo-odoo commented on June 19, 2024

It's not entirely clear how to add target="_blank" to an a element only if its href value is https://github.com, for example.

And is it possible to rename an element, for example h1 -> h2?

I don't think you can do either of those via Ammonia (and thus nh3). That goes especially for the second request: it's not a general-purpose HTML-rewriting device.

You can see all the operations Ammonia supports at ammonia::Builder, and nh3 handles a subset of those.

Your first request would I think be rust-ammonia/ammonia#163.

jasperfirecai2 commented on June 19, 2024

Bleach offers the ability to pass callbacks which can modify/augment/remove attributes from each element. Is there a chance to get something similar here?

@camflan Added tag_attribute_values for this in v0.2.11.

See also a2ec808

Would it be possible to allow a * entry to this map just like is possible in the attributes arg?
from docs:

attributes (dict[str, set[str]], optional) – Sets the HTML attributes that are allowed on specific tags, * key means the attributes are allowed on any tag.

I'm trying to sanitize data from CKEditor5, and multiple tags can, for example, contain style="text-align:right;". I cannot, however, allow all values of style, as that would make it vulnerable to XSS.

xmo-odoo commented on June 19, 2024

Would it be possible to allow a * entry to this map just like is possible in the attributes arg? from docs:

nh3 is a thin layer over ammonia so it's limited to what ammonia provides.

For attributes, messense dispatches to generic_attributes or tag_attributes based on the * key rather than replicate the two parameters.

But there's no such generic_attribute_values, so messense would have to create a custom attribute_filter, which may then have to be reconciled with a user-provided attribute_filter, plus unexpectedly changing the concurrency characteristics of the call, and possibly having different performance characteristics. Maybe not something to hide from the user.

Unless you can get Ammonia to add a generic_attribute_values, I'd suggest just using attribute_filter directly.
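As a minimal sketch of that suggestion for the style use case: the callback below only keeps whitelisted CSS declarations. The ALLOWED_STYLES table and the function names are hypothetical, and the callback shape (tag, attribute, value) returning a value or None is assumed from Ammonia's attribute_filter documentation. This is not a real CSS parser, just a naive split on ";" and ":" that rejects anything it does not recognise.

```python
# Hypothetical whitelist: CSS property -> set of accepted values.
ALLOWED_STYLES = {
    "text-align": {"left", "right", "center", "justify"},
}


def filter_style(tag, attribute, value):
    """Return the value to keep (possibly rewritten), or None to drop it."""
    if attribute != "style":
        return value  # leave every other attribute untouched
    kept = []
    for declaration in value.split(";"):
        prop, sep, val = declaration.partition(":")
        prop, val = prop.strip().lower(), val.strip().lower()
        # Keep a declaration only if both property and value are whitelisted.
        if sep and val in ALLOWED_STYLES.get(prop, ()):
            kept.append(f"{prop}:{val}")
    # Drop the attribute entirely when nothing survived.
    return ";".join(kept) or None


def clean(html):
    import nh3  # third-party; imported lazily so the filter is testable alone

    return nh3.clean(
        html,
        attributes={"*": {"style"}},  # allow style on any tag...
        attribute_filter=filter_style,  # ...but only whitelisted declarations
    )
```

Because values are compared against an exact whitelist, a declaration such as background:url(...) is dropped even when the naive split cuts it in the wrong place.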

jasperfirecai2 commented on June 19, 2024

For attributes, messense dispatches to generic_attributes or tag_attributes based on the * key rather than replicate the two parameters.

Ah, right, I missed that in the rust docs.

unexpectedly changing the concurrency characteristics of the call, and possibly having different performance characteristics. Maybe not something to hide from the user.

Luckily performance is not an issue for me, so I'll do it myself.

I'd suggest just using attribute_filter directly.

That is fair enough. I assume I'd implement that by checking if attribute == "style" and value within a custom whitelist?

Does this filter apply before or after cleaning?

Thank you for the quick reply

Edit: I've resorted to just implementing a loop to update the tag_attribute_values with a tag whitelist

    for tag in tag_whitelist:
        tag_attribute_values.update([
            (tag, tag_attribute_values['*']),
        ])
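A self-contained variant of the same idea, for anyone who wants to copy it. It assumes tag_attribute_values has the shape dict[str, dict[str, set[str]]]; the helper name expand_wildcard is made up. Unlike the loop above, it merges the "*" entry into any existing per-tag entry instead of overwriting it, and it leaves the "*" key out of the result.

```python
def expand_wildcard(tag_attribute_values, tags):
    """Copy the '*' entry onto each tag in `tags`, merging with per-tag rules."""
    # Deep-copy the per-tag entries so the input mapping is not mutated.
    expanded = {
        tag: {attr: set(values) for attr, values in attrs.items()}
        for tag, attrs in tag_attribute_values.items()
        if tag != "*"
    }
    shared = tag_attribute_values.get("*", {})
    for tag in tags:
        per_tag = expanded.setdefault(tag, {})
        for attr, values in shared.items():
            per_tag.setdefault(attr, set()).update(values)
    return expanded


rules = {
    "*": {"style": {"text-align:right;"}},
    "p": {"class": {"lead"}},
}
result = expand_wildcard(rules, ["p", "h1"])
```

The result can then be passed as tag_attribute_values to nh3.clean.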

frwickst commented on June 19, 2024

One thing from bleach that I'm missing (or maybe I'm just missing it in the docs) is Bleach's strip argument. It removes tags which are not in the allowed list. This is not the same as clean_content_tags, since that is an explicit list of tags to remove; I'm looking for something that would just remove all tags that are not allowed.

xmo-odoo commented on June 19, 2024

One thing from bleach that I'm missing (or maybe I'm just missing it in the docs) is Bleach's strip argument. It removes tags which are not in the allowed list. This is not the same as clean_content_tags, since that is an explicit list of tags to remove; I'm looking for something that would just remove all tags that are not allowed.

Isn't that just tags? Unlike bleach, ammonia always just removes the blacklisted tags, it doesn't escape them:

>>> bleach.clean('<div><foo>xxx</foo></div>', tags={'div'})
'<div>&lt;foo&gt;xxx&lt;/foo&gt;</div>'
>>> bleach.clean('<div><foo>xxx</foo></div>', tags={'div'}, strip=True)
'<div>xxx</div>'
>>> nh3.clean('<div><foo>xxx</foo></div>', tags={'div'})
'<div>xxx</div>'

clean_content_tags is used to remove not just the tag but the tag's contents as well, which is why it's opt-in, and additional to tags (you can't clean_content_tags a whitelisted tag).

ThiefMaster commented on June 19, 2024

Unlike bleach, ammonia always just removes the blacklisted tags, it doesn't escape them:

That makes it unusable for use cases where you want to treat the string as plaintext, where people may be writing stuff that looks like HTML (i.e. it has < and >) but isn't HTML... Being able to choose between stripping and escaping would be very much appreciated.
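For the narrow case where no markup at all should survive, one possible workaround (not an nh3 or Ammonia feature, and not equivalent to bleach escaping only the disallowed tags) is to skip the sanitizer and escape the whole string with the standard library. The helper name as_plain_text is made up.

```python
import html


def as_plain_text(user_input: str) -> str:
    """Escape &, <, > and quotes so the input renders literally in HTML."""
    return html.escape(user_input, quote=True)


escaped = as_plain_text("if a < b and c > d: <foo>")
```

Here escaped is "if a &lt; b and c &gt; d: &lt;foo&gt;", so the angle brackets are displayed rather than stripped.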

messense commented on June 19, 2024

Being able to choose between stripping and escaping would be very much appreciated.

See rust-ammonia/ammonia#145
