Comments (18)
Neither feature parity with Bleach nor the full ammonia API is the goal for nh3 at the moment, but I'm happy to add more knobs for projects that want to migrate from bleach to nh3.
from nh3.
Others will likely opine but FWIW, looking at our current use of bleach, Ammonia seems to support most of what we need (though some of it by hand-rolling through the ultra generic attribute_filter, which may not be so useful); however, nh3 does not currently expose it:
- attribute_filter, obviously: we would need this to filter / clean up @style (like Bleach's CSSSanitizer), something which Ammonia does not currently seem to support directly (rust-ammonia/ammonia#179)
- also the same for a finer / more flexible handling of data: urls, which is not currently in ammonia (rust-ammonia/ammonia#154)
- clean_content_tags, which allows removing the tag and all of its content (the default whitelist removes the tag itself and "unwraps" its content, I believe)
We also customise the serialization compared to the bleach default (which is just html5lib's as far as I can tell), but html5ever doesn't seem to have tuning knobs there (or at least not any which are relevant to what we configured), so there's definitely nothing you could do.
FWIW for the first two items nh3 might be able to provide bespoke whitelists from which it'd compose the relevant attribute_filter internally, but that may or may not be desirable (and for the first it would add a dependency on something like cssparser to implement a rust-level sanitizer).
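For the data: case, a user-side filter could look roughly like the sketch below. It assumes the callback shape nh3 documents for attribute_filter (tag, attribute, value in; the value to keep, or None to drop the attribute, out). The names data_url_filter and ALLOWED_DATA_MIME_TYPES are hypothetical, and whether the sanitizer's own URL scheme check runs before the callback is not confirmed here, so the data scheme is assumed to be allowed separately.

```python
from typing import Optional

# Hypothetical whitelist of MIME types accepted inside data: URLs.
ALLOWED_DATA_MIME_TYPES = {"image/png", "image/jpeg", "image/gif"}


def data_url_filter(tag: str, attribute: str, value: str) -> Optional[str]:
    """Keep non-data values untouched; keep data: URLs only for whitelisted MIME types."""
    candidate = value.strip()
    if not candidate.lower().startswith("data:"):
        return value  # not a data: URL, nothing to decide here
    # A data: URL is "data:<mime>[;params],<payload>"; everything before the comma is the header.
    header, comma, _payload = candidate[len("data:"):].partition(",")
    if not comma:
        return None  # malformed data: URL, drop the attribute
    mime_type = header.split(";", 1)[0].strip().lower()
    return value if mime_type in ALLOWED_DATA_MIME_TYPES else None


# Assumed usage (not executed here, requires nh3):
#   nh3.clean(html, attribute_filter=data_url_filter)
```

The filter fails closed: anything that starts with data: but does not parse into a whitelisted MIME type is removed.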
from nh3.
Added attribute_filter in #11.
from nh3.
Added clean_content_tags in #12.
from nh3.
❤️
from nh3.
Bleach offers the ability to pass callbacks which can modify/augment/remove attributes from each element. Is there a chance to get something similar here? One of our use-cases is to add target="_blank" to some anchor elements.
from nh3.
> Bleach offers the ability to pass callbacks which can modify/augment/remove attributes from each element. Is there a chance to get something similar here?
@camflan Added tag_attribute_values for this in v0.2.11.
See also a2ec808
from nh3.
> > Bleach offers the ability to pass callbacks which can modify/augment/remove attributes from each element. Is there a chance to get something similar here?
> @camflan Added tag_attribute_values for this in v0.2.11.
It's not entirely clear how to add target="_blank" to an a element only if its href value is https://github.com, for example.
And is it possible to rename an element, for example h1 -> h2?
from nh3.
> It's not entirely clear how to add target="_blank" to an a element only if its href value is https://github.com, for example.
> And is it possible to rename an element, for example h1 -> h2?
I don't think you can do either of those via Ammonia (and thus nh3). Especially for the second request: it's not a general-purpose HTML-rewriting device.
You can see all the operations Ammonia supports at ammonia::Builder, and nh3 handles a subset of those.
Your first request would, I think, be rust-ammonia/ammonia#163.
from nh3.
> > Bleach offers the ability to pass callbacks which can modify/augment/remove attributes from each element. Is there a chance to get something similar here?
> @camflan Added tag_attribute_values for this in v0.2.11.
> See also a2ec808
Would it be possible to allow a * entry to this map just like is possible in the attributes arg?
From the docs:
> attributes (dict[str, set[str]], optional) – Sets the HTML attributes that are allowed on specific tags, * key means the attributes are allowed on any tag.
I'm trying to sanitize data from CKEditor5, and multiple tags can for example contain style="text-align:right;". I cannot however allow all values of style, as that would make it vulnerable to XSS.
from nh3.
> Would it be possible to allow a * entry to this map just like is possible in the attributes arg? from docs:
nh3 is a thin layer over ammonia so it's limited to what ammonia provides.
For attributes, messense dispatches to generic_attributes or tag_attributes based on the * key rather than replicate the two parameters.
But there's no such generic_attribute_values, so messense would have to create a custom attribute_filter, which may then have to be reconciled with a user-provided attribute_filter, plus unexpectedly changing the concurrency characteristics of the call, and possibly having different performance characteristics. Maybe not something to hide from the user.
Unless you can get Ammonia to add a generic_attribute_values, I'd suggest just using attribute_filter directly.
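The reconciliation mentioned above can also be done in user code. A minimal sketch, assuming the callback shape nh3 documents for attribute_filter (tag, attribute, value in; the value to keep, or None to drop the attribute, out). The names generic_value_filter, chain_filters, GENERIC_VALUES and strip_title_whitespace are hypothetical and not part of nh3 or Ammonia.

```python
from typing import AbstractSet, Callable, Mapping, Optional

AttributeFilter = Callable[[str, str, str], Optional[str]]


def generic_value_filter(allowed: Mapping[str, AbstractSet[str]]) -> AttributeFilter:
    """Emulate a '*' entry: constrain an attribute's value on every tag."""
    def check(tag: str, attribute: str, value: str) -> Optional[str]:
        permitted = allowed.get(attribute)
        if permitted is None:
            return value  # attribute is not constrained by this filter
        return value if value in permitted else None
    return check


def chain_filters(*filters: AttributeFilter) -> AttributeFilter:
    """Run filters in order; the first None drops the attribute."""
    def combined(tag: str, attribute: str, value: str) -> Optional[str]:
        current = value
        for attribute_filter in filters:
            result = attribute_filter(tag, attribute, current)
            if result is None:
                return None
            current = result
        return current
    return combined


# Hypothetical generic whitelist and a second, unrelated user filter.
GENERIC_VALUES = {"style": {"text-align:right;", "text-align:center;"}}


def strip_title_whitespace(tag: str, attribute: str, value: str) -> Optional[str]:
    return value.strip() if attribute == "title" else value


combined_filter = chain_filters(generic_value_filter(GENERIC_VALUES), strip_title_whitespace)

# Assumed usage (not executed here, requires nh3):
#   nh3.clean(html, attributes={"*": {"style", "title"}}, attribute_filter=combined_filter)
```

Keeping the composition explicit leaves the Python callback, and its cost, visible to the caller rather than hidden inside the library.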
from nh3.
> For attributes, messense dispatches to generic_attributes or tag_attributes based on the * key rather than replicate the two parameters.
Ah, right, I missed that in the Rust docs.
> unexpectedly changing the concurrency characteristics of the call, and possibly having different performance characteristics. Maybe not something to hide from the user.
Luckily performance is not an issue for me, so I'll do it myself.
> I'd suggest just using attribute_filter directly.
That is fair enough. I assume I'd implement that by checking if attribute == "style" and value is within a custom whitelist?
Does this filter apply before or after cleaning?
Thank you for the quick reply.
Edit: I've resorted to just implementing a loop to update the tag_attribute_values with a tag whitelist:
    # Copy the '*' entry to every whitelisted tag, since the map has no wildcard key.
    for tag in tag_whitelist:
        tag_attribute_values.update([
            (tag, tag_attribute_values['*']),
        ])
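For reference, the attribute_filter route asked about above might look like the following sketch. It assumes the callback shape nh3 documents (tag, attribute, value in; the value to keep, or None to drop the attribute, out). The names style_filter and ALLOWED_STYLE_DECLARATIONS are hypothetical, and the CSS handling is deliberately naive: it is a whitelist of exact declarations, not a CSS parser.

```python
from typing import Optional

# Hypothetical whitelist of (property, value) pairs, all lower-case.
ALLOWED_STYLE_DECLARATIONS = {
    ("text-align", "left"),
    ("text-align", "right"),
    ("text-align", "center"),
}


def style_filter(tag: str, attribute: str, value: str) -> Optional[str]:
    """Keep a style attribute only if every declaration in it is whitelisted."""
    if attribute != "style":
        return value  # other attributes pass through unchanged
    kept = []
    for declaration in value.split(";"):
        if not declaration.strip():
            continue  # skip empty pieces such as the one after a trailing ';'
        prop, colon, val = declaration.partition(":")
        pair = (prop.strip().lower(), val.strip().lower())
        if not colon or pair not in ALLOWED_STYLE_DECLARATIONS:
            return None  # fail closed: one unknown declaration drops the attribute
        kept.append(f"{pair[0]}:{pair[1]}")
    return ";".join(kept) if kept else None


# Assumed usage (not executed here, requires nh3):
#   nh3.clean(html, attributes={"*": {"style"}}, attribute_filter=style_filter)
```

Splitting on ';' mis-parses values that legitimately contain a semicolon, but such pieces never match the whitelist, so the attribute is dropped rather than passed through.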
from nh3.
One thing from bleach that I'm missing (or maybe I'm just missing it in the docs) is the strip argument from Bleach. It removes tags which are not in the allowed list. This is not the same as clean_content_tags, since that is an explicit list of the tags to remove; I'm looking for something that would just remove all tags that are not allowed.
from nh3.
> One thing from bleach that I'm missing (or maybe I'm just missing it in the docs) is the strip argument from Bleach. It removes tags which are not in the allowed list. This is not the same as clean_content_tags, since that is an explicit list of the tags to remove; I'm looking for something that would just remove all tags that are not allowed.
Isn't that just tags? Unlike bleach, ammonia always just removes the tags that are not whitelisted, it doesn't escape them:
    >>> bleach.clean('<div><foo>xxx</foo></div>', tags={'div'})
    '<div>&lt;foo&gt;xxx&lt;/foo&gt;</div>'
    >>> bleach.clean('<div><foo>xxx</foo></div>', tags={'div'}, strip=True)
    '<div>xxx</div>'
    >>> nh3.clean('<div><foo>xxx</foo></div>', tags={'div'})
    '<div>xxx</div>'
clean_content_tags is used to remove not just the tag but the tag's contents as well, which is why it's opt-in, and additional to tags (you can't clean_content_tags a whitelisted tag).
from nh3.
> Unlike bleach, ammonia always just removes the tags that are not whitelisted, it doesn't escape them:
That makes it unusable for use cases where you want to treat the string as plaintext, where people may be writing stuff that looks like HTML (i.e. it has < and >) but isn't HTML... Being able to choose between stripping and escaping would be very much appreciated.
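For the pure plaintext case, escaping can be done without a sanitizer at all. A minimal sketch using only the Python standard library; escape_as_plain_text is a hypothetical helper name, and this covers only input that should contain no markup whatsoever, not the mixed "allow some tags, escape the rest" behaviour of bleach.

```python
import html


def escape_as_plain_text(user_input: str) -> str:
    """Turn &, < and > into entities so the text is displayed literally."""
    return html.escape(user_input, quote=False)


print(escape_as_plain_text("if a < b and b > c, use <foo>"))
# prints: if a &lt; b and b &gt; c, use &lt;foo&gt;
```

Stripping would silently delete the <foo> in that input, which is the data loss described above.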
from nh3.
> Being able to choose between stripping and escaping would be very much appreciated.
from nh3.