Comments (4)
The CharIndices iterator is also used by:
ByteSlice::to_lowercase_into
ByteSlice::to_uppercase_into
They are not impacted by this bug, but I think adding a comment like this would be a good idea:
// This branch coalesces invalid UTF-8 bytes and legitimate
// occurrences of the Unicode replacement character. This is
// acceptable since the replacement character does not lowercase
// into a different character.
if ch == '\u{FFFD}' {
from bstr.
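The claim in the suggested comment can be checked with the standard library alone. The sketch below is a minimal illustration and does not use bstr; `lowercases_to_itself` and `decode_lossy` are hypothetical helper names introduced here, and `String::from_utf8_lossy` stands in for any decoder that substitutes U+FFFD for invalid bytes.

```rust
use std::char::REPLACEMENT_CHARACTER;

/// Hypothetical helper: returns true if `ch` lowercases to exactly itself.
fn lowercases_to_itself(ch: char) -> bool {
    let mut lower = ch.to_lowercase();
    lower.next() == Some(ch) && lower.next().is_none()
}

/// Hypothetical helper: lossy decoding replaces each invalid sequence with
/// U+FFFD, so the result no longer says whether the input held invalid bytes
/// or a legitimately encoded replacement character.
fn decode_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

/// Convenience check tying the two together for the replacement character.
fn replacement_is_case_stable() -> bool {
    lowercases_to_itself(REPLACEMENT_CHARACTER)
}
```

Because the replacement character maps to itself, a case-mapping routine that treats both situations the same way would produce the same character either way, which is the reasoning the comment records.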
The current implementation of Debug will always output the replacement character as three byte escapes. Is this intended?
I would say probably not. It's a good point. I think something like your approach would work well: check if the bytes correspond to the UTF-8 encoding of the replacement codepoint, and if so, escape it like any other codepoint. Otherwise, write the raw bytes.
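A rough sketch of that approach, using only the standard library rather than bstr's internals, might look like the following. `debug_escape` is a hypothetical function name, and the `\xNN` format for raw bytes is an assumption chosen for illustration; bstr's actual Debug implementation may differ in its details.

```rust
use std::fmt::Write;
use std::str::from_utf8;

/// Hypothetical helper, not part of bstr's API: escapes a byte string for
/// debugging. Valid codepoints (including a legitimately encoded U+FFFD) go
/// through `char::escape_debug`; only invalid bytes become `\xNN` escapes.
fn debug_escape(mut bytes: &[u8]) -> String {
    let mut out = String::new();
    while !bytes.is_empty() {
        match from_utf8(bytes) {
            Ok(valid) => {
                // The remainder is entirely valid UTF-8.
                out.extend(valid.chars().flat_map(char::escape_debug));
                break;
            }
            Err(err) => {
                // Everything before `valid_up_to` is known to be valid.
                let (valid, rest) = bytes.split_at(err.valid_up_to());
                let valid = from_utf8(valid).expect("prefix is valid UTF-8");
                out.extend(valid.chars().flat_map(char::escape_debug));
                // `error_len` is None when the input ends mid-sequence; in
                // that case all remaining bytes are written raw.
                let bad_len = err.error_len().unwrap_or(rest.len());
                for b in &rest[..bad_len] {
                    write!(out, "\\x{:02X}", b).unwrap();
                }
                bytes = &rest[bad_len..];
            }
        }
    }
    out
}
```

With this shape, the bytes `EF BF BD` are treated as one codepoint, while a stray `FF` is written as a byte escape, so the two cases stay distinguishable in the output.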
As an aside, I now know that \uFFFD is the replacement character after diving through the bstr source, but there is a core::char::REPLACEMENT_CHARACTER constant that I think would be nicer to use instead. I'm not sure if using it has MSRV implications.
@lopopolo Yeah, using the constant is probably the better thing to do. If you work with Unicode and UTF-8 as much as I do, then \uFFFD just becomes as natural as breathing! Hah. But yeah, a named constant would be clearer. It would be fine to use, since that constant was introduced in Rust 1.9. bstr's MSRV is much higher than that.
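For illustration, a check written against the named constant could look like the sketch below. It uses the `std::char` path, which re-exports the `core::char` constant mentioned above; `is_encoded_replacement` is a hypothetical helper name and not something taken from bstr.

```rust
use std::char::REPLACEMENT_CHARACTER;

/// Hypothetical helper: true if `bytes` is exactly the UTF-8 encoding of
/// the replacement character (the three bytes EF BF BD).
fn is_encoded_replacement(bytes: &[u8]) -> bool {
    // A 4-byte buffer is enough for any single codepoint.
    let mut buf = [0u8; 4];
    bytes == REPLACEMENT_CHARACTER.encode_utf8(&mut buf).as_bytes()
}
```

The named constant makes the intent readable without the reader needing to recognize the `\u{FFFD}` literal.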