ahocorasick's People

Contributors

anknown

ahocorasick's Issues

Feature request: case-insensitive matching

Currently, matching is always case-sensitive. The machine could of course be built from a lowercased dictionary and the input text lowercased before searching, but it would likely be more efficient for case-insensitive matching to happen in the matching logic itself, especially when returnImmediately=true.
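
For reference, the lowercase workaround mentioned above might look roughly like the sketch below. It is written in Go with only the standard library and does not call this library: `foldRunes` and `findFold` are hypothetical helpers, and a naive scan stands in for the automaton. Lowercasing one rune at a time keeps the rune count unchanged, so match offsets in the folded text are also valid in the original text. Note that `unicode.ToLower` is simple lowercasing, not full Unicode case folding.

```go
package main

import (
	"fmt"
	"unicode"
)

// foldRunes lowercases s one rune at a time. The result has exactly as many
// runes as s, so rune offsets found in it apply to the original text too.
func foldRunes(s string) []rune {
	rs := []rune(s)
	for i, r := range rs {
		rs[i] = unicode.ToLower(r)
	}
	return rs
}

// runesEqual reports whether a and b hold the same runes.
func runesEqual(a, b []rune) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// findFold is a hypothetical helper, not part of the library. It returns, for
// each pattern, the rune offsets of its case-insensitive occurrences in text.
// The naive scan is a stand-in for the real matching machine.
func findFold(text string, patterns []string) map[string][]int {
	t := foldRunes(text)
	out := make(map[string][]int)
	for _, p := range patterns {
		fp := foldRunes(p)
		if len(fp) == 0 {
			continue
		}
		for i := 0; i+len(fp) <= len(t); i++ {
			if runesEqual(t[i:i+len(fp)], fp) {
				out[p] = append(out[p], i)
			}
		}
	}
	return out
}

func main() {
	hits := findFold("He said HELLO, then hello again.", []string{"Hello", "THEN"})
	fmt.Println(hits["Hello"], hits["THEN"])
}
```

The cost this request wants to avoid is visible here: the whole input is copied and folded before any matching starts, even if the first match occurs near the beginning.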

License?

Hi, what is the license for this fine code?
Thanks!

Bug Report: The algorithm fails to match keywords accurately in marginal cases.

I recently tried to use this module, but being the paranoid individual that I am, I decided to run a thorough test against the algorithm to check its correctness.

My tests revealed that the algorithm matched keywords incorrectly in some edge cases.

I am willing to contribute the tests that I implemented for this algorithm, and also help figure out where/why it fails.

I am currently using the cloudflare/ahocorasick library because it is accurate 100% of the time, but I really do wish to see the bug in this library fixed because of its major edge in efficiency.
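
One way to check a matcher for this kind of bug is a differential test against a brute-force reference. The sketch below is not the reporter's test suite, which is not shown here. The `Match` type and the `naiveSearch` and `sameMatches` helpers are hypothetical names, and the output of the implementation under test would have to be converted into `[]Match` before comparing.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Match records one occurrence: which pattern matched and at which byte offset.
type Match struct {
	Pattern string
	Pos     int
}

// naiveSearch is the brute-force reference. It reports every occurrence of
// every pattern, including overlapping ones.
func naiveSearch(text string, patterns []string) []Match {
	var out []Match
	for _, p := range patterns {
		if p == "" {
			continue
		}
		for i := 0; i+len(p) <= len(text); i++ {
			if strings.HasPrefix(text[i:], p) {
				out = append(out, Match{Pattern: p, Pos: i})
			}
		}
	}
	return out
}

// sortMatches orders matches by position, then by pattern.
func sortMatches(ms []Match) {
	sort.Slice(ms, func(i, j int) bool {
		if ms[i].Pos != ms[j].Pos {
			return ms[i].Pos < ms[j].Pos
		}
		return ms[i].Pattern < ms[j].Pattern
	})
}

// sameMatches compares two result sets, ignoring order. It sorts copies so
// the caller's slices are left untouched.
func sameMatches(a, b []Match) bool {
	if len(a) != len(b) {
		return false
	}
	a = append([]Match(nil), a...)
	b = append([]Match(nil), b...)
	sortMatches(a)
	sortMatches(b)
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	patterns := []string{"he", "she", "his", "hers"}
	text := "ushers"
	want := naiveSearch(text, patterns)
	// In a real test, got would come from the implementation under test.
	// Here the reference is compared with itself to show the wiring only.
	got := naiveSearch(text, patterns)
	fmt.Println(sameMatches(got, want), len(want))
}
```

Running the same comparison over many random texts and dictionaries, and printing the first input on which the two result sets differ, would give a small reproducible case to attach to a report like this one.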

Matching whole words in the middle of a longer string

Hi, I have seen issue #4.

But ExactSearch only seems to compare the whole input against a single dictionary word. It doesn't find a whole word in the middle of a longer string, delimited by word boundaries, which is what the original issue asked for.

I.e., with ExactSearch, "abc" will NOT match "abcde abc zabc" at all, but it will match if the string is exactly "abc" (so it basically acts like a map).
With MultiPatternSearch, on the other hand, "abc" will match 3 times.

It would be good to have an option that matches inside an arbitrarily long string, but only where there is a word boundary on both sides (e.g. whitespace or the start or end of the line next to the match). I'd be happy to add a specific boundary character between words if it helps.

Hope that makes sense!
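
A minimal sketch of the requested behaviour, assuming a "word boundary" means whitespace or the start or end of the text: take every substring match and keep only those with a boundary on both sides. `isBoundary` and `wholeWordMatches` are hypothetical helpers, not library functions, and the naive scan stands in for the library's multi-pattern search.

```go
package main

import (
	"fmt"
	"unicode"
)

// isBoundary reports whether index i lies outside text or holds whitespace.
func isBoundary(text []rune, i int) bool {
	return i < 0 || i >= len(text) || unicode.IsSpace(text[i])
}

// wholeWordMatches is a hypothetical helper. It returns the rune offsets at
// which word occurs in text with a boundary immediately before and after it.
func wholeWordMatches(text, word []rune) []int {
	var out []int
	if len(word) == 0 {
		return out
	}
	for i := 0; i+len(word) <= len(text); i++ {
		if string(text[i:i+len(word)]) != string(word) {
			continue // not a substring match at this offset
		}
		if isBoundary(text, i-1) && isBoundary(text, i+len(word)) {
			out = append(out, i)
		}
	}
	return out
}

func main() {
	// "abc" occurs at offsets 0, 6 and 11, but only offset 6 is a whole word.
	fmt.Println(wholeWordMatches([]rune("abcde abc zabc"), []rune("abc"))) // prints [6]
}
```

The same boundary check could be applied as a filter over the positions reported by a multi-pattern search, which avoids changing the automaton itself.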

You must give appropriate credit to reference projects

Hi anknown, or should I call you 韩诗楠?

Thank you for your effort porting Aho-Corasick with DoubleArrayTrie to Go. But please give appropriate credit to the other projects you benefited from.

Your benchmark is copied from my project. Why didn't you even mention it?

I haven't read your code yet, but we (including the pioneers I mention in my projects) are all implementing almost the same algorithm. Did you borrow any ideas from them? If you did, you should give them credit.

In a worldwide community like GitHub, we Chinese should show more respect for copyright. Otherwise we are always the joke.

Thank you!
