Comments (3)
Here are some ideas that I found in issues in the original JavaScript repository:
- Run the estimation multiple times: first with the original input, and then once per common separator, such as whitespace, "-", "_" or ".". For this approach there were two ideas for generating the guesses:
1.1 Take the lower score
1.2 Take the average
- Remove the separators and add a guess count per separator used, so the separators are still counted as additional characters that improve the password.
from zxcvbn.
Hey there, just wanted to chime in since I already "fixed" this in nbvcxz. The fact that separators didn't get a specific "match" type in zxcvbn is pretty dumb: it increases the entropy way more than it should, as you noticed, and it also produces more "strange" matches.
So in my implementation, I decided to add a specific separator matcher. https://github.com/GoSimpleLLC/nbvcxz/blob/master/src/main/java/me/gosimple/nbvcxz/matching/SeparatorMatcher.java
That just tries to identify the most common non-alphanumeric character in the password, treats it as the "separator", and then returns a "match" for each occurrence of it.
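A minimal sketch of that idea in TypeScript, for illustration only. It is not the nbvcxz SeparatorMatcher and not a zxcvbn matcher: the `SeparatorMatch` shape and the `findSeparatorMatches` name are hypothetical, only ASCII letters and digits count as alphanumeric here, and ties go to whichever candidate appears first in the password.

```typescript
// Hypothetical match shape for this sketch; indices are inclusive.
interface SeparatorMatch {
  token: string
  start: number
  end: number
}

// Pick the most frequent non-alphanumeric character and report
// every occurrence of it as a separator match.
function findSeparatorMatches(password: string): SeparatorMatch[] {
  // Count each non-alphanumeric character (ASCII-only assumption).
  const counts: Record<string, number> = {}
  for (let i = 0; i < password.length; i++) {
    const ch = password[i]
    if (!/[A-Za-z0-9]/.test(ch)) {
      counts[ch] = (counts[ch] || 0) + 1
    }
  }

  // Scan in password order so the first-seen character wins ties.
  let separator = ''
  let best = 0
  for (let i = 0; i < password.length; i++) {
    const count = counts[password[i]] || 0
    if (count > best) {
      separator = password[i]
      best = count
    }
  }
  if (best === 0) {
    return []
  }

  // Emit one match per occurrence of the chosen separator.
  const matches: SeparatorMatch[] = []
  for (let i = 0; i < password.length; i++) {
    if (password[i] === separator) {
      matches.push({ token: separator, start: i, end: i })
    }
  }
  return matches
}
```

For "buy by beer" this reports the space at indices 3 and 6, which lines up with the two SeparatorMatch entries in the nbvcxz output further down.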
For zxcvbn, if I remember right it tried to find a match for each part of the password, and only if it couldn't find a match for that part would it then add a brute force match.
With that in mind, I realized that the matching algorithm zxcvbn uses is... well, it's broken. It gets things "right" most of the time, but it does not find an optimal combination of matches in quite a few cases. That is because it always tries to find the lowest entropy for each range of the password, but once it has found the lowest-entropy match for that region, it moves on to the next region to find its lowest-entropy matches. There are a ton of cases where taking a slightly higher-entropy match for one part of the password lets you get a vastly lower-entropy match for another part. That wasn't possible unless brute force matches for each section were included with the rest of the "matches" we found.
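To make the "optimal combination" point concrete, here is one generic way to search for the lowest total entropy over non-overlapping matches that cover the whole password: a small dynamic program over prefix lengths. This is a sketch under stated assumptions, not the code of findBestCombination in nbvcxz or of zxcvbn. The `CandidateMatch` shape, the `bestCombination` name, and all entropy numbers in the example are made up for illustration. It relies on a brute force match existing for every single character, so that some cover always exists.

```typescript
// Hypothetical match shape for this sketch; indices are inclusive.
interface CandidateMatch {
  label: string
  start: number
  end: number
  entropy: number
}

// best[i] holds the lowest total entropy found for the first i characters.
function bestCombination(
  length: number,
  matches: CandidateMatch[],
): { entropy: number; path: CandidateMatch[] } {
  const best: number[] = [0]
  const via: (CandidateMatch | null)[] = [null]
  for (let i = 1; i <= length; i++) {
    best.push(Infinity)
    via.push(null)
    for (const m of matches) {
      if (m.end !== i - 1) continue // only matches ending at character i - 1
      const total = best[m.start] + m.entropy
      if (total < best[i]) {
        best[i] = total
        via[i] = m
      }
    }
  }

  // Walk back from the end to recover the chosen matches in order.
  const path: CandidateMatch[] = []
  let i = length
  while (i > 0) {
    const m = via[i]
    if (m === null) throw new Error('no full cover; add brute force matches')
    path.unshift(m)
    i = m.start
  }
  return { entropy: best[length], path }
}

// Made-up example: picking the cheapest match for the first region (A)
// forces an expensive match for the rest (B), while C + D is cheaper overall.
const exampleMatches: CandidateMatch[] = [
  { label: 'A', start: 0, end: 1, entropy: 2 },
  { label: 'B', start: 2, end: 3, entropy: 10 },
  { label: 'C', start: 0, end: 0, entropy: 3 },
  { label: 'D', start: 1, end: 3, entropy: 4 },
  // One brute force match per character (entropy 5 is an arbitrary value).
  { label: 'bf0', start: 0, end: 0, entropy: 5 },
  { label: 'bf1', start: 1, end: 1, entropy: 5 },
  { label: 'bf2', start: 2, end: 2, entropy: 5 },
  { label: 'bf3', start: 3, end: 3, entropy: 5 },
]
const exampleResult = bestCombination(4, exampleMatches)
console.log(exampleResult.entropy) // 7 with these made-up numbers (C + D)
```

In the example, A followed by B would cost 12, while C followed by D costs 7, which is the kind of case described above.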
So, FYI, even with all of that, it still didn't totally fix the issue with zxcvbn's algorithm for finding the best match combination.
In here:
https://github.com/GoSimpleLLC/nbvcxz/blob/master/src/main/java/me/gosimple/nbvcxz/Nbvcxz.java
The findBestCombination method is the new algorithm I implemented, while findGoodEnoughCombination is the original port of the algorithm. As you can see, I still use the original, as it is much faster and lets me short-circuit some more expensive code.
So if you just add a separator match without any other changes, it may or may not fix the issue you are seeing here. I just wanted to bring the other stuff to your attention, as it took me many weeks to track down and come up with fixes for them originally.
Here is the example output for that same password run through nbvcxz:
java -jar .\nbvcxz-1.5.0.jar
Commands: estimate password (e); generate password (g); quit (q)
Please enter your command:
e
Please enter the password to estimate:
buy by beer
----------------------------------------------------------
Time to calculate: 9 ms
Password: buy by beer
Entropy: 31.717160968028406
Your password does not meet the minimum strength requirement.
Warning: This is a very common password.
Suggestion: Add another word or two. Uncommon words are better.
Time to crack: ONLINE_THROTTLED: 56 years
Time to crack: ONLINE_UNTHROTTLED: 1 years
Time to crack: OFFLINE_BCRYPT_14: 19 days
Time to crack: OFFLINE_BCRYPT_12: 4 days
Time to crack: OFFLINE_BCRYPT_10: 1 days
Time to crack: OFFLINE_BCRYPT_8: 7 hours
Time to crack: OFFLINE_BCRYPT_5: 56 minutes
Time to crack: OFFLINE_SHA512: instant
Time to crack: OFFLINE_SHA1: instant
Time to crack: OFFLINE_MD5: instant
-----------------------------------
Match Type: DictionaryMatch
Entropy: 6.584962500721156
Token: buy
Start Index: 0
End Index: 2
Length: 3
Dictionary: english
Dictionary Value: buy
Rank: 96
Length: 3
Leet Substitutions: false
Reversed: false
Distance: 0
-----------------------------------
Match Type: SeparatorMatch
Entropy: 3.3219280948873626
Token:
Start Index: 3
End Index: 3
Length: 1
-----------------------------------
Match Type: BruteForceMatch
Entropy: 4.700439718141093
Token: b
Start Index: 4
End Index: 4
Length: 1
-----------------------------------
Match Type: BruteForceMatch
Entropy: 4.700439718141093
Token: y
Start Index: 5
End Index: 5
Length: 1
-----------------------------------
Match Type: SeparatorMatch
Entropy: 3.3219280948873626
Token:
Start Index: 6
End Index: 6
Length: 1
-----------------------------------
Match Type: DictionaryMatch
Entropy: 9.087462841250339
Token: beer
Start Index: 7
End Index: 10
Length: 4
Dictionary: passwords
Dictionary Value: beer
Rank: 544
Length: 4
Leet Substitutions: false
Reversed: false
Distance: 0
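As a quick arithmetic check on the output above: the reported total entropy equals the sum of the six per-match entropies, and the two dictionary entropies equal log2 of the reported rank. The separator and brute force values equal log2(10) and log2(26) respectively. Reading those as cardinalities is an inference from the numbers, not something confirmed from the nbvcxz source. A small sketch in TypeScript:

```typescript
// Per-match entropies copied from the output above.
const matchEntropies: number[] = [
  6.584962500721156, // DictionaryMatch "buy"
  3.3219280948873626, // SeparatorMatch
  4.700439718141093, // BruteForceMatch "b"
  4.700439718141093, // BruteForceMatch "y"
  3.3219280948873626, // SeparatorMatch
  9.087462841250339, // DictionaryMatch "beer"
]

// Base-2 logarithm written with Math.log so it runs on older targets too.
const log2 = (x: number): number => Math.log(x) / Math.LN2

const totalEntropy = matchEntropies.reduce((sum, e) => sum + e, 0)
console.log(totalEntropy) // approximately 31.7172, the reported "Entropy" value
console.log(log2(96)) // approximately 6.5850, entropy of "buy" (Rank: 96)
console.log(log2(544)) // approximately 9.0875, entropy of "beer" (Rank: 544)
console.log(log2(10), log2(26)) // approximately 3.3219 and 4.7004
```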
from zxcvbn.
Many thanks 👍
That will definitely help me a lot!
If I'm going to work on this issue, I would try every approach to check how they perform. But I like the idea of having a SeparatorMatcher.
I guess I will check out your findBestCombination too, but I think it will be quite a lot of work to bring it into this code base :D
from zxcvbn.