Comments (8)
Good find. I had test cases checking that Unicode substring search finds matches, but no rejection tests for patterns that should not match. The bug was a small oversight and has been fixed in 34553f0.
I included this in #33 just in time for the new release
from nucleo.
I'm actually getting the opposite problem now: it's not matching even though it should.
from nucleo.
hmm yeah that is an orthogonal bug, fixed in c7893db
from nucleo.
Now I'm getting a crash:
    let needle = Utf32String::from("adi");
    let haystack =
        Utf32String::from("At the Road’s End - Seeming - SOL: A Self-Banishment Ritual");
    let mut matcher = Matcher::new(Config::DEFAULT);
    assert_ne!(
        matcher.substring_match(haystack.slice(..), needle.slice(..)),
        None
    );
range end index 60 out of range for slice of length 59
stack backtrace:
0: rust_begin_unwind
at /rustc/79e9716c980570bfd1f666e3b16ac583f0168962/library/std/src/panicking.rs:597:5
1: core::panicking::panic_fmt
at /rustc/79e9716c980570bfd1f666e3b16ac583f0168962/library/core/src/panicking.rs:72:14
2: core::slice::index::slice_end_index_len_fail_rt
at /rustc/79e9716c980570bfd1f666e3b16ac583f0168962/library/core/src/slice/index.rs:76:5
3: core::slice::index::slice_end_index_len_fail
at /rustc/79e9716c980570bfd1f666e3b16ac583f0168962/library/core/src/slice/index.rs:68:9
4: <core::ops::range::Range<usize> as core::slice::index::SliceIndex<[T]>>::index
at /rustc/79e9716c980570bfd1f666e3b16ac583f0168962/library/core/src/slice/index.rs:411:13
5: core::slice::index::<impl core::ops::index::Index<I> for [T]>::index
at /rustc/79e9716c980570bfd1f666e3b16ac583f0168962/library/core/src/slice/index.rs:18:9
6: nucleo_matcher::exact::<impl nucleo_matcher::Matcher>::substring_match_non_ascii
at /home/vector/.cargo/git/checkouts/nucleo-fe29e1ee969779b0/c7893db/matcher/src/exact.rs:248:28
7: nucleo_matcher::Matcher::substring_match_impl
at /home/vector/.cargo/git/checkouts/nucleo-fe29e1ee969779b0/c7893db/matcher/src/lib.rs:485:17
8: nucleo_matcher::Matcher::substring_match
at /home/vector/.cargo/git/checkouts/nucleo-fe29e1ee969779b0/c7893db/matcher/src/lib.rs:422:9
from nucleo.
Seems like the index generation is also really glitchy:
from nucleo.
For this crash, it looks like the problem is in the code just above the change in the commit that Pascal referenced: where it enumerates to the end of the haystack, it should enumerate only to the end minus the length of the needle.
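To illustrate the bound I mean, here is a minimal sketch of a naive substring search over `char` slices. It is not nucleo's actual code: `find_substring` is a hypothetical helper, and it compares characters exactly, with none of the case folding or normalization the real matcher does. The point is only that the last valid start position is the haystack length minus the needle length.

```rust
/// Naive substring search over `char` slices.
/// Hypothetical helper for illustration only; not part of nucleo.
fn find_substring(haystack: &[char], needle: &[char]) -> Option<usize> {
    if needle.is_empty() {
        return Some(0);
    }
    if needle.len() > haystack.len() {
        return None;
    }
    // The last valid start is `haystack.len() - needle.len()`. Iterating any
    // further makes `start + needle.len()` exceed the slice length, which is
    // the kind of "range end index 60 out of range for slice of length 59"
    // panic shown in the backtrace.
    let last_start = haystack.len() - needle.len();
    (0..=last_start).find(|&start| haystack[start..start + needle.len()] == *needle)
}
```

With the 59-character haystack from the crash report and a 3-character needle, the loop stops at start index 56, so the widest slice taken is `56..59` and the search returns `None` for a needle that is not present instead of panicking.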
from nucleo.
Yeah, I have a fix for that, but I don't want to rush another fix (although it is yet another existing bug; I guess I just need to write more unicode-unicode substring tests. I never type Unicode, so I rarely run into these).
from nucleo.
Yeah, I don't type Unicode usually but I have a music collection with a lot of foreign stuff or weird smart quotes and I'm writing a music player so I've been hitting the ascii-needle/unicode haystack codepath a lot.
from nucleo.