Comments (3)
This seems like similar rationale to why some of the other unicode things are in here, but I wanted to know if a PR implementing this would be accepted before doing anything.
Yes, indeed. I believe what you've outlined is a good fit for bstr.
So I think the API would look something like `fn unicode_width(&self, cjk: bool) -> usize`. It's worth noting that the `unicode-width` crate exposes separate functions, but having one be the default feels a little non-ideal, and this reduces the API surface anyway. An enum would also work, but the only name I can think of for the non-CJK variant is, well, `NonCjk` (to be clear: `Latin` wouldn't be accurate). So a `bool` punts on having to make that decision, even though it might be considered slightly opaque.
Yeah, I think using a `bool` here is probably the worst possible choice. :-( I'd really like to avoid it. What about `unicode_width_noncjk` and `unicode_width_cjk`?
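For illustration only, the two-method shape could be sketched as below. This is not bstr's API: the trait name `UnicodeWidthExt` and the helper `toy_char_width` are hypothetical, and the width table covers only a handful of code point ranges, so it is nowhere near a spec-compliant East Asian Width implementation. Invalid UTF-8 is handled with the standard library's lossy decoding as a stand-in.

```rust
/// Hypothetical extension trait (not part of bstr) showing the
/// two-method shape suggested above, instead of a `cjk: bool` flag.
trait UnicodeWidthExt {
    /// Width where East Asian Ambiguous characters count as 1 column.
    fn unicode_width_noncjk(&self) -> usize;
    /// Width where East Asian Ambiguous characters count as 2 columns.
    fn unicode_width_cjk(&self) -> usize;
}

/// Toy width lookup. ASSUMPTION: only a few ranges are classified;
/// a real implementation would use the full Unicode data tables.
fn toy_char_width(c: char, cjk: bool) -> usize {
    match c as u32 {
        // C0 controls and DEL: treated as zero width in this toy.
        0x00..=0x1F | 0x7F => 0,
        // Hiragana, CJK Unified Ideographs, fullwidth ASCII variants: wide.
        0x3041..=0x3096 | 0x4E00..=0x9FFF | 0xFF01..=0xFF60 => 2,
        // A few East Asian Ambiguous characters (¡ and Greek α..ρ).
        0x00A1 | 0x03B1..=0x03C1 => {
            if cjk {
                2
            } else {
                1
            }
        }
        // Everything else: assumed narrow.
        _ => 1,
    }
}

impl UnicodeWidthExt for [u8] {
    fn unicode_width_noncjk(&self) -> usize {
        // Invalid UTF-8 becomes U+FFFD via lossy decoding.
        String::from_utf8_lossy(self)
            .chars()
            .map(|c| toy_char_width(c, false))
            .sum()
    }

    fn unicode_width_cjk(&self) -> usize {
        String::from_utf8_lossy(self)
            .chars()
            .map(|c| toy_char_width(c, true))
            .sum()
    }
}
```

With this toy table, `"α".as_bytes().unicode_width_noncjk()` is 1 and `"α".as_bytes().unicode_width_cjk()` is 2, and the call site says which mode is in use without a bare `true` or `false`.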
For an update: I've determined that what I needed was a bit more subtle than unicode width (well, I had known this, but I was hoping that it wouldn't be too hard to massage the unicode-width output into something that worked well in practice).
What I actually need/needed seems to be somewhere between a modified unicode width of the char for single-char graphemes, and 2 for any graphemes which are emoji. Additionally, I needed to allow for the chars (emoji) which had their width change in Unicode version 9. This works okay but is very ad-hoc, and still falls over for characters like ﷽ (admittedly, there may be no good way of handling cases like that), and some others.
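A rough sketch of that kind of heuristic, under loud assumptions: the input is one already-segmented grapheme cluster (segmentation is not shown), the emoji test and the width table are toy range checks rather than real Unicode data, and the fallback for multi-char non-emoji graphemes (use the first char's width) is a guess, since the comment above does not specify it. The Unicode 9 width changes are not modeled. All function names are hypothetical.

```rust
/// Toy emoji test. ASSUMPTION: a few emoji blocks plus the emoji
/// variation selector U+FE0F; real emoji detection needs Unicode data.
fn looks_like_emoji(grapheme: &str) -> bool {
    grapheme.chars().any(|c| {
        matches!(
            c as u32,
            0x1F300..=0x1F5FF | 0x1F600..=0x1F64F | 0x1F900..=0x1F9FF | 0xFE0F
        )
    })
}

/// Toy per-char width. ASSUMPTION: only a few wide ranges are known.
fn toy_char_width(c: char) -> usize {
    match c as u32 {
        // C0 controls and DEL: zero width in this toy.
        0x00..=0x1F | 0x7F => 0,
        // Hiragana, CJK Unified Ideographs, fullwidth ASCII variants.
        0x3041..=0x3096 | 0x4E00..=0x9FFF | 0xFF01..=0xFF60 => 2,
        _ => 1,
    }
}

/// Heuristic display width of a single grapheme cluster.
/// `grapheme` is assumed to be exactly one cluster.
fn heuristic_grapheme_width(grapheme: &str) -> usize {
    // Any grapheme that looks like emoji is treated as 2 columns.
    if looks_like_emoji(grapheme) {
        return 2;
    }
    let mut chars = grapheme.chars();
    match (chars.next(), chars.next()) {
        (None, _) => 0,
        // Single-char grapheme: use the per-char width.
        (Some(c), None) => toy_char_width(c),
        // Multi-char, non-emoji grapheme: ASSUMPTION, not from the
        // comment above; fall back to the width of the first char.
        (Some(c), Some(_)) => toy_char_width(c),
    }
}
```

As in the comment above, a character like ﷽ (U+FDFD) still comes out wrong here: the toy table reports 1 column for it.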
Anyway, I don't have an algo that I'm that happy with yet (just one that worked well enough for me to move on), and even if I did, something that ad-hoc and heuristic-driven probably isn't a great fit for a lib like bstr in the first place.
(All that said, a spec-compliant implementation of the Unicode East Asian Width computation probably still would be an okay fit for bstr -- it's just not a thing I'm inclined to spend effort on.)
So this is a long-winded way of saying I'm probably not going to be the one to fix this, and so I'm closing it for now. Feel free to reopen if you'd rather wait for someone else to do it.
Agreed! Thank you. If someone else wants a spec-compliant version, then I would still be happy to accept that for bstr.