
Comments (6)

drahnr commented on May 22, 2024

These are very good points!

I'd like to address a couple of questions here.

* Should the `quirks` be applied recursively so essentially for `"'2x'"` `checker` would simplify it to `2`.

Technically that sounds like a good idea, but it requires additional work (using a mutable deque that is fed the generated suggestions and processed element by element). So for now, I'd say get the recursion-free version done, and then in a second step use the existing patterns to refactor the suggestion processing logic.
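To make the deque idea a bit more concrete, here is a rough sketch of what such a worklist could look like. Everything in it (`strip_quotes`, `strip_multiplier`, `apply_quirks`, `simplify`) is a hypothetical stand-in written with the standard library only; none of it is the actual cargo-spellcheck code.

```rust
use std::collections::VecDeque;

/// Hypothetical quirk: strip one layer of matching enclosing quotes.
fn strip_quotes(word: &str) -> Option<String> {
    for q in ['\'', '"'] {
        if word.len() >= 2 && word.starts_with(q) && word.ends_with(q) {
            // Both quote characters are ASCII, so byte slicing is safe here.
            return Some(word[1..word.len() - 1].to_string());
        }
    }
    None
}

/// Hypothetical quirk: `2x` -> `2` (ASCII digits followed by a single `x`).
fn strip_multiplier(word: &str) -> Option<String> {
    let digits = word.strip_suffix('x')?;
    if !digits.is_empty() && digits.chars().all(|c| c.is_ascii_digit()) {
        Some(digits.to_string())
    } else {
        None
    }
}

/// Apply the first matching quirk. An empty result means "no quirk applies".
/// Every produced token is strictly shorter than the input, so repeated
/// application always terminates.
fn apply_quirks(word: &str) -> Vec<String> {
    if let Some(inner) = strip_quotes(word) {
        return vec![inner];
    }
    if word.contains('-') {
        // A `&str -> Vec<&str>` style rule: split dashed words into parts.
        return word
            .split('-')
            .filter(|part| !part.is_empty())
            .map(String::from)
            .collect();
    }
    if let Some(digits) = strip_multiplier(word) {
        return vec![digits];
    }
    Vec::new()
}

/// Feed generated tokens back into a deque until no quirk applies anymore.
fn simplify(word: &str) -> Vec<String> {
    let mut queue = VecDeque::from([word.to_string()]);
    let mut done = Vec::new();
    while let Some(current) = queue.pop_front() {
        let produced = apply_quirks(&current);
        if produced.is_empty() {
            done.push(current);
        } else {
            queue.extend(produced);
        }
    }
    done
}
```

With these stand-in rules, `simplify("\"'2x'\"")` reduces to the single token `2`. The queue is processed front to back, so for inputs that split into several parts the order of the resulting tokens is not necessarily the order in the original word.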

* Should sub-parts of words be simplified as well? Example `2x-something` so it would check basically `2` and `something`.

I think the 2x quirk should be expanded a bit to something like `^[0-9]+(?:[,.e][0-9]+)?(?:-.+)?$` for the particular token, so here we would just expand the notion of the 2x pattern.

As I see it, there are basically 2 types of rules here: ones which produce `&str -> &str` and ones which produce `&str -> Vec<&str>`.

I understand it like this: `fn tokenize` splits the chunks into words, and those tokens are then checked against the dictionary. If that yields a suggestion or detects a mistake, we call something like `fn quirks(..) -> Vec<Suggestion<_>>`, which can internally handle all the quirks described earlier (non-recursive for now) and returns n suggestions. Returning suggestions here has the advantage that not much context needs to be fed into the fn, and it can do more complex things than just reduction.
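Roughly, the flow I have in mind could be sketched like this. The `Suggestion` struct, the whitespace `tokenize`, the `HashSet` dictionary and the quote-stripping inside `quirks` are all simplified assumptions for illustration, not the real types or the hunspell backend.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified suggestion type (the real one carries more context).
#[derive(Debug, Clone, PartialEq, Eq)]
struct Suggestion {
    token: String,
    replacements: Vec<String>,
}

/// Stand-in tokenizer: split a chunk into words on whitespace.
fn tokenize(chunk: &str) -> Vec<&str> {
    chunk.split_whitespace().collect()
}

/// Called only for tokens the dictionary rejected. Non-recursive: it strips
/// enclosing quotes once and re-checks. No suggestion is returned when the
/// reduced token is known; otherwise the token is reported as a mistake.
fn quirks(token: &str, dictionary: &HashSet<&str>) -> Vec<Suggestion> {
    let reduced = token.trim_matches(|c| c == '\'' || c == '"');
    if dictionary.contains(reduced) {
        Vec::new()
    } else {
        vec![Suggestion {
            token: token.to_string(),
            // A real backend would fill in replacement candidates here.
            replacements: Vec::new(),
        }]
    }
}

/// Tokenize, check each token, and fall back to `quirks` on a miss.
fn check(chunk: &str, dictionary: &HashSet<&str>) -> Vec<Suggestion> {
    let mut suggestions = Vec::new();
    for token in tokenize(chunk) {
        if dictionary.contains(token) {
            continue;
        }
        suggestions.extend(quirks(token, dictionary));
    }
    suggestions
}
```

In this sketch `quirks` only needs the token and the dictionary, which is the "not much context" point: the caller does not have to know which quirk fired or how many suggestions come back.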

What do you think?

from cargo-spellcheck.

drahnr commented on May 22, 2024

This should be quite self-contained within `checker/hunspell.rs`, `main.rs` and `config.rs`.

CC @laysauchoa


zhiburt commented on May 22, 2024

Hey @drahnr I've tried it out and as always didn't succeed 😞

I'd like to address a couple of questions here.

  • Should the quirks be applied recursively so essentially for "'2x'" checker would simplify it to 2.
  • Should sub-parts of words be simplified as well? Example 2x-something so it would check basically 2 and something.

As I see it, there are basically 2 types of rules here: ones which produce `&str -> &str` and ones which produce `&str -> Vec<&str>`.


drahnr commented on May 22, 2024

@zhiburt take a look at #90 - it implements the first step (more aligned to your proposal), repeated matching should be impl'd as step 2


drahnr commented on May 22, 2024

0.4.0-alpha.1 just hit the road. It includes a hunspell-specific backend quirk: `regex_transform: [ "re1", ... ]` specifies a bunch of regexes which are applied to individual words where possible, e.g. to remove an enclosing `'`. The capture groups are then checked against the dictionary.
Note that this only solves half of the issue: the dashed suggestions for concatenated words cannot be resolved this way.
Example: `testcase` in a text would be suggested to be `test-case`, and we would like an option to avoid that kind of meaningless suggestion.


drahnr commented on May 22, 2024

Not entirely closed.

