probable-wordlists's Introduction

Please Help Support the Project by seeding these files

Torrent Note: Before 21 Jun 2017, each wordlist variety was included in 4 compression formats: .7z, -LZMA.zip, .tar.gz and .tar.xz. Now there are only .7z and .tar.gz files, because the redundant files somewhat diluted seeding performance. If you wish, you can still download the .tar.xz and -LZMA.zip files from the Mega.nz links, but future releases will include only .7z and .tar.gz.

To Cloners and Zip Downloaders:

This repository does not contain code, but links to a group of lists. A clone or zip download is possible, but may not be necessary to get the files you need.

Check out the Password Trend Analysis - and learn!

I visualized the trends of passwords that appeared 10 times or more in the analysis. The charts contain immediately actionable advice on how to make your passwords more unique.

Probable Wordlists

Wordlists sorted by probability originally created for password generation and testing - make sure your passwords aren't popular!

Do you know what the world's most common passwords are? Do you know what they look like? You'll want to avoid them to be secure!

Methodology - The Why and How

While I was able to locate a few password wordlists that were sorted by popularity, the vast majority of lists, especially the larger ones, were sorted alphabetically. This seems like a major practical flaw! Suppose the most common password is password (it is actually the 2nd most common, after 123456) and we check a given password against an alphabetically sorted English dictionary. We would have to slog from aardvark through passover to get to password. I don't know off the top of my head just how commonly aardvark is used as a password - but we could be wasting a lot of time by not starting with the most common password on our list!
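To make the cost concrete, here is a minimal Python sketch that counts how many candidates a top-to-bottom scan tests before reaching a target. The two toy lists and the `guesses_until_found` helper are invented for illustration; they are not data or code from this project.

```python
def guesses_until_found(wordlist, target):
    """Return how many entries a top-to-bottom scan tests before hitting target.

    Returns None if the target is not in the list.
    """
    for position, candidate in enumerate(wordlist, start=1):
        if candidate == target:
            return position
    return None


# Toy data, invented for illustration only: the same six entries in two orders.
alphabetical = ["123456", "aardvark", "abacus", "monkey", "passover", "password"]
by_popularity = ["123456", "password", "monkey", "abacus", "passover", "aardvark"]

print(guesses_until_found(alphabetical, "password"))   # 6
print(guesses_until_found(by_popularity, "password"))  # 2
```

On a six-entry list the difference is trivial, but the same gap grows with the size of the list.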

I went to SecLists, Weakpass, and Hashes.org and downloaded nearly every wordlist containing real passwords I could find. These lists were huge, and I ended up with over 80 GB of actual, human-generated and used passwords. These were split up among over 350 files of varying length, sorting scheme, character encoding, origin and other properties. I sorted these files, removed duplicates from within the files themselves, and prepared to join them all together.

Some of these lists were composed of the other lists, and some were exact duplicates. I took care to remove any exact duplicate files - we didn't need to have any avoidable false positives. If a password was found across multiple files, I considered this to be an approximation of its popularity. If an entry was found in 5 files, it wasn't too popular. If an entry could be found in 300 files, it was very popular. Using Unix commands, I concatenated all the files into one giant file representing keys to over 4 billion secret areas on the web, and sorted them by number of appearances in the single file. From this, I was able to create a large wordlist sorted by popularity, not the alphabet. I've included all of the items that appeared at least twice in analysis.
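The counting step described above can be sketched in Python. The author used Unix commands, so this is a swapped-in illustration rather than the original pipeline. The function name `rank_by_file_appearances`, the `min_count` parameter, and the sample data are all assumptions.

```python
from collections import Counter


def rank_by_file_appearances(files, min_count=2):
    """Rank passwords by the number of source lists that contain them.

    `files` is an iterable of iterables of lines, one per source list.
    Each list is deduplicated first, so a password counts once per list.
    """
    counts = Counter()
    for lines in files:
        unique = {line.rstrip("\r\n") for line in lines}
        unique.discard("")  # ignore blank lines
        counts.update(unique)
    # Most frequent first; ties broken alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [password for password, count in ranked if count >= min_count]


# Three tiny in-memory "files", invented for illustration.
sample_files = [
    ["123456\n", "password\n", "aardvark\n"],
    ["123456\n", "password\n", "123456\n"],
    ["123456\n", "qwerty\n"],
]
print(rank_by_file_appearances(sample_files))  # ['123456', 'password']
```

An in-memory `Counter` is only practical for small inputs. At the scale described here (over 80 GB), a disk-based sort-and-count approach would be needed instead.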

Real-Passwords

These are REAL passwords. Every once in a while, a popular site has a high-profile security leak and passwords are released freely across the internet. Some of these passwords can be found on aggregator sites where they are separated from usernames to protect the unfortunate victim.

The files in this folder come from https://github.com/danielmiessler/SecLists, https://weakpass.com/ and https://hashes.org/

NOTE THAT UNTIL REV 2.0, ALL NON-ASCII CHARACTERS HAVE BEEN REMOVED

  • A more inclusive, and thus, more accurate list is in the works.

NOTE THAT DUE TO THE NEWLINE DUPLICATES ISSUE, 'WPA-Length' LISTS MAY INCLUDE LINES OF 7 CHARACTERS

  • Files in the WPA-List Folders in this repo have been CLEANED of lines under 8 characters.
  • However, the files found in Megalinks or Torrents have NOT BEEN CLEANED of lines under 8 characters
  • This will be fixed in Rev 2.0
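Until then, lines under 8 characters in the uncleaned downloads can be filtered out locally. Below is a minimal Python sketch, assuming one password per line; the `drop_short_lines` name and the sample data are hypothetical. It strips the line ending before measuring, so a trailing newline is never counted toward the length.

```python
def drop_short_lines(lines, min_length=8):
    """Yield entries with at least min_length characters, without line endings."""
    for line in lines:
        entry = line.rstrip("\r\n")
        if len(entry) >= min_length:
            yield entry


# Sample input, invented for illustration.
sample = ["1234567\n", "12345678\n", "password\r\n", "short\n", "longenough"]
print(list(drop_short_lines(sample)))  # ['12345678', 'password', 'longenough']
```

Because the function is a generator, it can be fed an open file object and written out line by line without loading the whole list into memory.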

Lists sorted by popularity will include "probable" in the filename

Dictionary-Style Lists

Wordlists including dictionaries, encyclopedic lists and miscellaneous lists. These do not contain entries that were labeled as passwords in their sources.

Tasklist and Plans

Rev 2.0 Plan

  • Include truly accurate WPA-Length sorting
  • More sources (This is what is taking the most time)
  • Bigger sources
  • Non-ASCII Sources
  • Specialized lists compiled from sources themselves
  • Totally Recompile wordlists for improved accuracy, no duplicates from the get-go.
  • Include Counts for some files

Undetermined Future Plans

  • Come up with poetic name for 'things that'd be cool that I'm not sure when I'll do'
  • Create list of "pure" common passwords for use with rule-based cracking

Attributions

People Are Talking About Probable-Wordlists?!

Note that the author is not affiliated with or officially endorsing the visiting of any of the links below.

I found most (if not all) of these mentions by simply searching for the project in various search engines.

Thanks for the shout-outs!

Projects that use Probable-Wordlists

Note that the author is not affiliated with or officially endorsing any of the projects below.

The author cannot guarantee the security or efficacy of these applications - use at your own risk.

Check any project's code before running, and ALWAYS EXERCISE EXTREME CAUTION when entering in a password.

This is true for all applications downloaded off the internet - not just those written by the hardworking members of the GitHub community who write open-source code for no profit.

Disclaimer and License

  • These lists are for LAWFUL, ETHICAL AND EDUCATIONAL PURPOSES ONLY.
  • The files contained in this repository are released "as is" without warranty, support, or guarantee of effectiveness.
  • However, I am open to hearing about any issues found within these files and will be actively maintaining this repository for the foreseeable future. If you find anything noteworthy, let me know and I'll see what I can do about it.

The author did not steal, phish, deceive or hack in any way to get hold of these passwords. All lines in these files were obtained through freely available means.

The author's intent for this project is to provide information on insecure passwords in order to increase overall password security. The lists will show you what passwords are the most common, what patterns are the most common, and what you should avoid when making your own passwords.

License: CC BY-SA 4.0

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

You are free to:

Share

  • Copy and redistribute the material in any medium or format

Adapt

  • Remix, transform, and build upon the material for any purpose, even commercially.

The licensor cannot revoke these freedoms as long as you follow the license terms.

Under the following terms:

Attribution

  • You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

ShareAlike

  • If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.

No additional restrictions

  • You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.

Notices:

  • You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation.
  • No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material.

Enjoy!
