glench / fuzzyset.js
fuzzyset.js - A fuzzy string set for javascript
Home Page: http://glench.github.io/fuzzyset.js/
License: Other
The README.md file says this is licensed under a BSD license, but doesn't state which one.
For me to use this I must follow the license requirements, but there are different BSD licenses.
I believe all BSD licenses require the license to be included with the code.
Could you please clarify which BSD license is used here?
And technically the license should be included in this project.
When I do a get("some value")
I sometimes get an empty array back.
Is it intentional that it returns an empty array when there is no match? I know it can also return null in that case.
This does not seem like consistent behaviour to me, so I wanted to check whether it was intended before making a PR to fix it.
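Until the behaviour is settled, a caller could normalize both "no match" shapes on their own side. This is a minimal sketch that assumes only what the report describes (that `get` may return either `null` or an empty array). `getMatches` is a hypothetical helper, not part of fuzzyset.js, and the stub objects stand in for a real set.

```javascript
// Hypothetical helper: treat null, undefined, and [] as the same "no match" result.
function getMatches(set, query) {
  const result = set.get(query);
  return Array.isArray(result) ? result : [];
}

// Stand-ins for a real set, covering the return shapes described above.
const returnsNull = { get: () => null };
const returnsEmpty = { get: () => [] };
const returnsHit = { get: () => [[0.9, "some value"]] };

console.log(getMatches(returnsNull, "x").length);  // 0
console.log(getMatches(returnsEmpty, "x").length); // 0
console.log(getMatches(returnsHit, "x").length);   // 1
```

With this wrapper the calling code only ever has to check `length`.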
Hi and thanks for a great lib!
I have found that [email protected] and above works great in vanilla Node apps, but seems to stop working when webpacked. 1.0.5 works for me in both scenarios.
I guess this is not done on purpose :)
Maybe this is a noob question, but I do not know how to handle it.
Normal use is:
a = FuzzySet();
Loop:
a.add("some text");
End loop
a.get("sme tekst");
But to speed things up, I'd like to load all items once and avoid repeating this for several later searches, and just call 'a.get()' later. Is there any way to store this object in memory?
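Within a single running process, one option is to build the set once and hand the same instance to every later search. This is a minimal sketch that assumes nothing about fuzzyset.js itself. `buildOnce` is a hypothetical helper, and the builder below returns a plain object as a stand-in for the `FuzzySet()` plus `add()` loop.

```javascript
// Hypothetical helper: run an expensive builder once and reuse its result.
function buildOnce(build) {
  let cached;
  let built = false;
  return function getInstance() {
    if (!built) {
      cached = build();
      built = true;
    }
    return cached;
  };
}

// Stand-in for the add() loop; a real app would create the set and add items here.
let buildCount = 0;
const getIndex = buildOnce(() => {
  buildCount += 1;
  return { items: ["some text"] };
});

getIndex();
getIndex();
console.log(buildCount); // 1
```

This only keeps the object alive for the lifetime of the process or page. It does not persist the index across restarts or page loads.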
Hi there!
I'm working on a little app that uses a relatively large local database (200k entries, but small enough to fit on a client machine for the purposes of my project). It takes about 5s to build the index, and I was wondering if it would be possible for me to build the index once, export it to some serialized format, and then bundle it with the app so I don't have to spend so much time building the index when the user loads the page?
If this is possible, how would I do this?
Thanks!
Hi
I have something like this
optionsDic = FuzzySet();
constructor(props) {
super(props);
//options = ["tire"]
props.filter.options.forEach(option => {
this.optionsDic.add(option)
});
}
const results = this.optionsDic.get("t", .01);
Yet I don't see any results until I type something like "ti", which gets me to a 0.50 weight. It is as if the 0.33 weight is still being enforced.
Using "fuzzyset.js": "0.0.7"
I would like to find similar product names in a database with 120,000 rows.
Exporting everything to a JavaScript object will not be possible: SQL will not support that many characters.
Regards
After I installed via npm, there is an extra folder called MathJax-master under fuzzyset.js.
GUMI-235:fuzzyset.js barbayardashzeveg$ ls -l
total 40
drwxr-xr-x 19 barbayardashzeveg staff 608 Aug 8 16:35 MathJax-master
-rwxr-xr-x 1 barbayardashzeveg staff 4641 Jul 12 21:16 README.rst
-rw-r--r-- 1 barbayardashzeveg staff 47 Dec 27 2012 index.js
drwxr-xr-x 3 barbayardashzeveg staff 96 Aug 8 16:35 lib
-rw-r--r-- 1 barbayardashzeveg staff 1432 Aug 8 16:35 package.json
-rw-r--r-- 1 barbayardashzeveg staff 750 Jul 10 2013 test.html
Dependency
"dependencies": {
"ascii-table": "0.0.9",
"fuzzyset.js": "0.0.6"
}
Hello Glench!
You have a great app, unfortunately this app does not have a logo yet, may I donate a logo for your app?
First of all, great work !! I compared like 5-6 libraries for fuzzy search based on my requirements but none came closer to your library.
The only thing that concerned me was those libraries had like multiple releases. Are you planning to release a stable version any time soon ? I just think it is much better to use and maintain code based on release than on a commit.
Have you tested with WordNet, for example? It has 150,000 entries.
your last lines look like so:
if (typeof module !== 'undefined' && module.exports) {
module.exports = FuzzySet;
root.FuzzySet = FuzzySet;
} else {
root.FuzzySet = FuzzySet;
}
For me the point of using CommonJS is at least to some extent isolation of components. If I wanted the module to be exported globally I could do so in my own code like so:
window.FuzzySet = require('fuzzyset.js');
I'm just wondering if there's any particular reason you decided to double export.
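One way to read the suggestion is to export through `module.exports` when CommonJS is available and fall back to a global only otherwise. The sketch below is an illustration, not the library's code. `exportFuzzySet`, `mod`, and `root` are hypothetical names, passed in as arguments so the snippet can run on its own.

```javascript
// Hypothetical export helper: attach to exactly one target, never both.
function exportFuzzySet(FuzzySet, mod, root) {
  if (mod && mod.exports) {
    mod.exports = FuzzySet;   // CommonJS: no global side effect
  } else {
    root.FuzzySet = FuzzySet; // plain <script> include: expose a global
  }
}

// Stand-ins for the real constructor, module object, and global object.
const FakeFuzzySet = function () {};
const fakeModule = { exports: {} };
const fakeRoot = {};

exportFuzzySet(FakeFuzzySet, fakeModule, fakeRoot);
console.log(fakeModule.exports === FakeFuzzySet); // true
console.log("FuzzySet" in fakeRoot);              // false
```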
also performance tests
fuzzyset.useLevenshtein = useLevenshtein || true;
That's how the variable is set, correct me if I'm wrong but that won't allow you to pass a false value.
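The pitfall can be reproduced in isolation. The sketch below is not the library's code. `withOr` and `withUndefinedCheck` are hypothetical functions that contrast the `|| true` pattern with an explicit check for an omitted argument.

```javascript
// `false || true` evaluates to true, so an explicit false is lost.
function withOr(useLevenshtein) {
  return useLevenshtein || true;
}

// Only fall back to true when the argument was actually omitted.
function withUndefinedCheck(useLevenshtein) {
  return typeof useLevenshtein === "undefined" ? true : useLevenshtein;
}

console.log(withOr(false));             // true
console.log(withUndefinedCheck(false)); // false
console.log(withUndefinedCheck());      // true
```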
I'm also trying to understand why the results differ from the Python version. This JavaScript version doesn't return any match for the following, but the Python one does:
f = FuzzySet()
f.add(u'Conor Hedley')
print f.get(u'directory manager')
>>> [(0.23529411764705888, u'Conor Hedley')]
Are you able to publish the latest to npm? I'm looking for the fix to getting multiple results above a certain threshold #9
fuzzy_data = FuzzySet()
fuzzy_data.add('something')
result = fuzzy_data.get('someing', 0.21)
Whatever I type there in third line as minimum score it has no influence on result. Am I doing something wrong?
Hi, we would like to add your lib to https://cdnjs.com, and I want to confirm the BSD license identifier with you.
Thank you very much!
There should be a way to search within an array of objects.
Hello,
first thank you for your wonderful library !
I am writing an app with TypeScript and importing it like this:
import { FuzzySet } from 'fuzzyset';
but when instantiating it,
this._fuzzySet = FuzzySet();
I get this error:
ERROR TypeError: Object(...) is not a function
If I am correct, to allow your library to be TypeScript compatible, you can just update the final line to
module.exports.FuzzySet = FuzzySet;
(it may change the way to use it in vanilla JS)
Best regards !
Hi
I popped this list into your interactive example
Side Shift
Condition
Drive
SubCategory
Mast Type
SubCategory
Fuel
Description
Year
Brand
Model
Capacity (LB)
Mast Height: Lowered (Inch)
Mast Height: Raised (Inch)
Serial #
Hour Meter
Location
Model #
Stock #
Height (FT)
Max Fwd Reach (FT)
Sweep Path (Inch)
Platform Wdith (Inch)
Platform Length (Inch)
and used "Equipment Id", 0.5, and no match was found. That is great and what I expected, but when I do it in my code
const dic = FuzzySet(["Side Shift",
"Condition",
"Drive",
"SubCategory",
"Mast Type",
"SubCategory",
"Fuel",
"Description",
"Year",
"Brand",
"Model",
"Capacity (LB)",
"Mast Height: Lowered (Inch)",
"Mast Height: Raised (Inch)",
"Serial #",
"Hour Meter",
"Location",
"Model #",
"Stock #",
"Height (FT)",
"Max Fwd Reach (FT)",
"Sweep Path (Inch)",
"Platform Wdith (Inch)",
"Platform Length (Inch)"], 3, 2);
const fuzzyMatches = dic.get("Equipment Id", 0.5);
Result comes back: [0.33333333333333337, "Hour Meter"]
I know the .get function will return more than 1 result if the match values are the same, and I'm sorry if this is something you are currently working on (since I saw differences in the published and current code just in the lines that builds the results), but the .get function could use an optional parameter to allow more than one result with different scores to be returned.
I used the same match word the whole time, yet when I call it over and over, sometimes it works and sometimes it doesn't.
What's weird is that if I restart the script, sometimes all calls work and sometimes none do, but even within one test cycle, as seen above, calling it again gets a different answer.
Note:
Nope is fired when a.get(***) returns null
passed is when it returns anything
See #36
Currently, the function returns just the exact match, if there is one. An option to also get the lower-scored matches would be very helpful.
Current workaround: Add "unused" character like space to the end of the search value, so there will never be exact matches
I was trying to use the library directly on the web and ran into an issue:
Getting Unexpected Token Export
I had to change it to this instead:
<script type="module" src="dist/fuzzyset.js"></script>
Once lemmatization is performed and we get the best match in the defined set of reference strings, is there a way to correct the input string to match the reference text?
For example,
If my reference string is "pick number" and I am trying to match it with "click number three", then I should be able to correct the word "click" to "pick" and get the corrected statement "pick number three".
Maybe I misunderstand the search, but I think it does not work as it should. I use fuzisearch.js to compare URLs with each other.
If someone mistypes a URL like:
http://jwillmer.github.io/jekyllDecent/features
he is redirected to an error page and I add all urls from my website:
{
"title": "Theme Installation and Usage",
"url": "http://jwillmer.github.io/jekyllDecent/blog/readme/Readme",
},
{
"title": "Theme Features",
"url": "http://jwillmer.github.io/jekyllDecent/blog/features/Features",
},
{
"title": "YAML Custom Features",
"url": "http://jwillmer.github.io/jekyllDecent/blog/features/YAML-Features",
},
{
"title": "This post demonstrates post content styles",
"url": "http://jwillmer.github.io/jekyllDecent/blog/features/Content",
}
to fuzisearch.js, and as the search result I only get one URL back:
http://jwillmer.github.io/jekyllDecent/blog/features/Content
Shouldn’t there be at least three urls that have some kind of match?
Searching "Con" in the demo does not return "Connecticut" (instead, the highest ranked result is "Oregon"). Searching "A" returns "Iowa" rather than the states that start with A. Searching "K" only returns one of the two K states.
Surely it should prioritize exact matches and word beginnings over random characters spread throughout the string that happen to be in the correct order.
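A caller-side workaround could re-rank results so that entries starting with the query come first. This is a sketch only, assuming results shaped as `[score, value]` pairs like those shown in other reports here. `preferPrefix` is a hypothetical helper and the scores are made up for illustration.

```javascript
// Hypothetical re-ranker: entries that start with the query move to the front.
function preferPrefix(results, query) {
  const q = query.toLowerCase();
  const startsWithQuery = ([, value]) => value.toLowerCase().startsWith(q);
  // Array.prototype.sort is stable in modern engines (ES2019+),
  // so the original score order is kept within each group.
  return [...results].sort(
    (a, b) => Number(startsWithQuery(b)) - Number(startsWithQuery(a))
  );
}

// Made-up scores, only to show the reordering.
const ranked = preferPrefix([[0.5, "Oregon"], [0.4, "Connecticut"]], "Con");
console.log(ranked[0][1]); // Connecticut
```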
This is not an issue, just a question where else to look. Any idea how this could be easily used inside a mongodb find (instead of their regex solutions)?
I installed typings from @types/[email protected] (i.e., https://www.npmjs.com/package/@types/fuzzyset.js) and changed my import statement to import FuzzySet from 'fuzzyset.js'; in my .ts file, and I'm getting this error:
TypeError: fuzzyset_js_1.default is not a function
at PostcodeService.checkAndSetValue (postcode.service.ts:112)
at PostcodeService.updatePropertyFields (postcode.service.ts:153)
at eval (postcode.service.ts:69)
at ZoneDelegate.invoke (zone.js:388)
at Object.onInvoke (core.js:4733)
at ZoneDelegate.invoke (zone.js:387)
at Zone.run (zone.js:138)
at eval (zone.js:858)
at ZoneDelegate.invokeTask (zone.js:421)
at Object.onInvokeTask (core.js:4724)
Here're the related packages versions:
"fuzzyset.js": "0.0.1"
"@types/fuzzyset.js": "0.0.0"
"zone.js": "0.8.20"
"core-js": "2.5.3"
"typescript": "2.7.1"
I am not sure if I understand the normalization step. In the docs, it says that
fuzzyset will first normalize the string by removing non word characters except for spaces and commas and force everything to be lowercase.
But the regexp /[^a-zA-Z0-9\u00C0-\u00FF, ]+/ only replaces the first match. Entering, for instance, abc^d*E%fwhat into the online debugger gives the following matchDict:
-a | (0,1)
ab | (0,1)
bc | (0,1)
cd | (0,1)
d* | (0,1)
*e | (0,1)
e% | (0,1)
%f | (0,1)
fw | (0,1)
wh | (0,1)
ha | (0,1)
at | (0,1)
t- | (0,1)
-ab | (0,1)
abc | (0,1)
bcd | (0,1)
cd* | (0,1)
d*e | (0,1)
*e% | (0,1)
e%f | (0,1)
%fw | (0,1)
fwh | (0,1)
wha | (0,1)
hat | (0,1)
at- | (0,1)
Is this the intended result?
Another example:
'dsfX1!3 0df,x 00#4.**wat'.replace(/[^a-zA-Z0-9\u00C0-\u00FF, ]+/, '');
// gives
"dsfX13 0df,x 00#4.**wat"
whereas
'dsfX1!3 0df,x 00#4.**wat'.replace(/[^a-zA-Z0-9\u00C0-\u00FF, ]/g, '');
// would produce
"dsfX13 0df,x 004wat"
Let's say I had an array
['btc', 'ltc']
and a person searched 'tc'. I want btc to come before ltc. How do we do that?
Thanks in advance
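If both entries come back with the same score, one caller-side option is to break ties by each value's position in the original array. This is a sketch under that assumption. `sortWithTieBreak` is a hypothetical helper, results are assumed to be `[score, value]` pairs, and the scores below are made up.

```javascript
// Hypothetical helper: order by score (highest first), then by position in the source list.
function sortWithTieBreak(results, sourceOrder) {
  const position = new Map(sourceOrder.map((value, index) => [value, index]));
  return [...results].sort(
    (a, b) => b[0] - a[0] || position.get(a[1]) - position.get(b[1])
  );
}

// Made-up equal scores, only to show the tie-break.
const sorted = sortWithTieBreak([[0.5, "ltc"], [0.5, "btc"]], ["btc", "ltc"]);
console.log(sorted.map(([, value]) => value).join(", ")); // btc, ltc
```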
Would it be possible to include the text of the actual license in the repo, or include the specific license identifier in package.json? Thanks!
"Unable to find a readme for [email protected]"
First of all, I must thank you for this nice repo!
I would, however, like to advise you to keep a better structure for your versioning and hopefully also use releases.
When people like me (and the other 1.1k users who starred it) see that there is an update to this repo's version, it is really hard to figure out what has been done and what we need to do to keep our systems working as intended. I am not sure if you are aware of the term "Semantic Versioning", but in general, it is a common convention that developers rely on to understand a package's changes.
For example, you just changed this repo's version from 0.0.91 to 1.0.0 (e1e20ea).
Under semantic versioning, this would mean that you made a change to the core API that is incompatible with earlier versions, so those of us who use your package would have to rewrite some code if we were affected by that change.
As you can understand, this is a bit confusing for a lot of developers, like myself.
Therefore, I would suggest that you look into semantic versioning and hopefully also use GitHub releases, where you can write specific information about each particular release.
Thanks! 😄
When using browserify to make use of the CommonJS exports, I get a weird error on line 190: "Uncaught TypeError: undefined is not a function". The function the error refers to is fuzzyset._normalizeStr.
I believe it has something to do with your use of the "this" keyword. When I open it in a debugger, "this" refers to window, not the fuzzyset instance.
If I however try to instantiate an instance in the console using a line like this
var a = window.FuzzySet();
a.add('a');
it works fine, and the this keyword refers to the fuzzyset instance.
Any ideas what might be going on?
I get this error on my React project:
Unhandled Rejection (TypeError): FuzzySet is not a function
I've installed it using this command: npm install --save fuzzyset
I've imported it: const FuzzySet = require('fuzzyset')
const a = FuzzySet(); //It seems to fail on this line
a.add("michael axiak");
a.get("micael asiak");
Please help. Thank you!