fuzzy's Issues

SpellCheck and SpellCheckSuggestions gives different results

The output from SpellCheck and SpellCheckSuggestions differs.

model.SpellCheck("lisence") => "liens"
model.SpellCheckSuggestions("lisence", 1) => ["licence"]
model.CheckKnown("lisense", "license") => true

Other examples, written as input => SpellCheck result || [SpellCheckSuggestions result]:

purpel => pure || [parcel]
natior => nor || [nation]

The mismatch is not completely consistent: in some cases the two functions agree.

The model was trained from a pre-collected corpus with a count for each word, using model.SetCount(word, count, true), but the behaviour seems to be the same when training from SampleEnglish().

Test:

func TestSuggestionsVsSpelling(t *testing.T) {
	model := NewModel()
	model.Train(SampleEnglish())
	cases := []string{
		"lisence",
		"purpel",
		"blidn",
		"teh",
	}
	for _, word := range cases {
		checked := model.SpellCheck(word)
		suggestions := model.SpellCheckSuggestions(word, 1)
		if len(suggestions) == 1 && suggestions[0] != checked {
			t.Errorf("first suggestion '%s', does not equal SpellCheck '%s'", suggestions[0], checked)
		}
	}
}

Improve Levenshtein algo

If you need speed, replace your Levenshtein distance code with the version below. Explanation:

  • Go passes parameters by value, not by reference, so every call copies its arguments; this version takes pointers instead.
  • Reusing a single slice for both dimensions avoids allocating two.
  • Inlining the min function avoids the cost of a call per cell.

func LevenshteinDistance(a, b *string) int {
	la := len(*a)
	lb := len(*b)
	// d holds one row of the distance matrix and is reused for every row.
	d := make([]int, la+1)
	var lastdiag, olddiag, temp int

	for i := 1; i <= la; i++ {
		d[i] = i
	}
	for i := 1; i <= lb; i++ {
		d[0] = i
		lastdiag = i - 1
		for j := 1; j <= la; j++ {
			olddiag = d[j]
			// Inlined min over the deletion, insertion and substitution costs.
			min := d[j] + 1
			if d[j-1]+1 < min {
				min = d[j-1] + 1
			}
			// The strings are compared byte by byte.
			if (*a)[j-1] == (*b)[i-1] {
				temp = 0
			} else {
				temp = 1
			}
			if lastdiag+temp < min {
				min = lastdiag + temp
			}
			d[j] = min
			lastdiag = olddiag
		}
	}
	return d[la]
}
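
For comparison, here is a minimal sketch of the same single-row technique that compares runes rather than bytes, so that a multi-byte character counts as one edit. The name levenshteinRunes is hypothetical and not part of fuzzy; it takes plain string values, and it has not been benchmarked against the version above.

```go
package main

import "fmt"

// levenshteinRunes is a hypothetical helper, not part of fuzzy.
// It keeps a single row of the distance matrix, like the code above,
// but works on runes so multi-byte characters count as one symbol.
func levenshteinRunes(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	row := make([]int, len(ra)+1)
	for j := range row {
		row[j] = j // distance between a[:j] and the empty string
	}
	for i := 1; i <= len(rb); i++ {
		diag := row[0] // previous row, previous column
		row[0] = i
		for j := 1; j <= len(ra); j++ {
			up := row[j] // previous row, same column
			cost := 1
			if ra[j-1] == rb[i-1] {
				cost = 0
			}
			best := up + 1
			if v := row[j-1] + 1; v < best {
				best = v
			}
			if v := diag + cost; v < best {
				best = v
			}
			row[j] = best
			diag = up
		}
	}
	return row[len(ra)]
}

func main() {
	fmt.Println(levenshteinRunes("kitten", "sitting")) // 3
	fmt.Println(levenshteinRunes("héllo", "hello"))    // 1
}
```

A byte-based comparison would report 2 for the second pair, because "é" occupies two bytes in UTF-8.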

slang

Hi,

You say that your algorithm has an accuracy of 68%. Do you know what accuracy other libraries achieve?

Is it able to correct slang words?

Gerald

add public SuggestPotential(input string, exhaustive bool)

Thanks for the awesome fuzzy library.

For my use case I want to be able to provide different suggestion ranking criteria than is implemented in fuzzy.best(). I can use Suggestions(), but I lose the information in *Potential.

What do you think of exposing SuggestPotential()?

SuggestPotential() would need to obtain the lock, and the fields in Potential would also need to be made public, but I don't see anything complex that would need to change.
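
To illustrate the use case, here is a rough sketch of caller-side ranking, assuming candidates were available with exported fields. The Potential type below is a local stand-in, not fuzzy's actual type; the field names Term, Score and Leven, their meanings, and the sample values are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// Potential is a local stand-in for the library's candidate type.
// Field names and meanings are assumptions for illustration only.
type Potential struct {
	Term  string
	Score int // assumed: corpus frequency of the term
	Leven int // assumed: edit distance from the input
}

// rankByFrequency is a hypothetical custom ranking: highest Score first,
// ties broken by the smallest edit distance.
func rankByFrequency(ps []Potential) []string {
	sorted := append([]Potential(nil), ps...) // leave the caller's slice untouched
	sort.SliceStable(sorted, func(i, j int) bool {
		if sorted[i].Score != sorted[j].Score {
			return sorted[i].Score > sorted[j].Score
		}
		return sorted[i].Leven < sorted[j].Leven
	})
	terms := make([]string, len(sorted))
	for i, p := range sorted {
		terms[i] = p.Term
	}
	return terms
}

func main() {
	// Made-up candidates for the input "lisence".
	candidates := []Potential{
		{Term: "liens", Score: 3, Leven: 2},
		{Term: "licence", Score: 9, Leven: 1},
		{Term: "silence", Score: 9, Leven: 2},
	}
	fmt.Println(rankByFrequency(candidates)) // [licence silence liens]
}
```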

Build failure

$ go build
# github.com/sajari/fuzzy
../../../github.com/sajari/fuzzy/fuzzy.go:128: undefined: UseAutocomplete

Did you try compiling this before merging the latest PR?

Support common words

Hi, I'm currently developing a CLI tool. I want to use this project to spell-check its help messages.

Is there a built-in trained dictionary, so that I do not have to update the list of words I pass to Train() every time I update a help message?

Thank you

SpellCheck for sentences

Can this package be used on whole sentences, as in the example below?
Ex: Input = "hi i am fro India" (the "m" is missing from "from")
Expected output = "hi i am from India" (the missing "m" is added back)
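
A minimal sketch of one way to approach this: split the sentence on whitespace and run a per-word corrector over each token. The function correctSentence and the stub corrector are hypothetical; with fuzzy, model.SpellCheck could presumably be passed as the corrector, but that has not been tested here.

```go
package main

import (
	"fmt"
	"strings"
)

// correctSentence is a hypothetical helper: it applies correct to every
// whitespace-separated word and joins the results with single spaces.
func correctSentence(sentence string, correct func(string) string) string {
	words := strings.Fields(sentence)
	for i, w := range words {
		words[i] = correct(w)
	}
	return strings.Join(words, " ")
}

func main() {
	// Stub corrector standing in for a trained model.
	fixes := map[string]string{"fro": "from"}
	stub := func(w string) string {
		if f, ok := fixes[w]; ok {
			return f
		}
		return w
	}
	fmt.Println(correctSentence("hi i am fro India", stub)) // hi i am from India
}
```

Because each word is corrected in isolation, there is no sentence context: a real model might just as well turn "fro" into "for". Punctuation attached to words is not handled either.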

Go test case for spell suggestions on `bigge` fails

Hi everybody,

I get an error when I run the following commands while building a Docker image:

FROM golang:latest

RUN go get -t github.com/sajari/fuzzy
RUN cd ${GOPATH}/src/github.com/sajari/fuzzy && go test

The failure concerns the "Double char delete 2nd closest" case for the word bigge:

--- FAIL: TestSpellingSuggestions (0.00s)
    fuzzy_test.go:78: Spell check suggestions, Double char delete 2nd closest
Spell test1 count: 270, Correct: 193, Incorrect: 77, Ratio: 0.714815, Total time: 6.1401ms

Spell test2 count: 400, Correct: 270, Incorrect: 130, Ratio: 0.675000, Total time: 11.0152ms

FAIL
exit status 1
FAIL    github.com/sajari/fuzzy 4.069s
The command '/bin/sh -c cd ${GOPATH}/src/github.com/sajari/fuzzy && go test' returned a non-zero code: 1

Could you help me out?
Thanks in advance!
