Fastbloom

A lightweight but fast Bloom filter written in Python (2.7).

Introduction

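The filter is built on mmap and uses the MurMur and Spooky hash functions as its base hashes. Double hashing reduces the number of hash computations per key to two: the remaining hash values are derived from those two base hashes.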
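As a rough illustration of double hashing, the sketch below derives every bit position from two base hash values as (h1 + i * h2) mod m. It is not fastbloom's actual code: the names `base_hashes`, `bit_positions` and `SketchBloomFilter` are hypothetical, the two base hashes come from a single SHA-256 digest as a stand-in for MurMur and Spooky, and a plain `bytearray` stands in for the mmap-backed bit array.

```python
import hashlib
import struct


def base_hashes(key):
    """Derive two 64-bit integers from one SHA-256 digest.

    Stand-in for the MurMur and Spooky base hashes (assumption, not fastbloom's code).
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    h1, h2 = struct.unpack(">QQ", digest[:16])
    return h1, h2 | 1  # keep the stride odd so it is never zero


def bit_positions(key, num_hashes, num_bits):
    """Simulate num_hashes hash functions as (h1 + i * h2) mod num_bits."""
    h1, h2 = base_hashes(key)
    return [(h1 + i * h2) % num_bits for i in range(num_hashes)]


class SketchBloomFilter(object):
    """Hypothetical in-memory filter; fastbloom itself is based on mmap."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def add(self, key):
        # Set one bit per simulated hash function.
        for pos in bit_positions(key, self.num_hashes, self.num_bits):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # A key may be present only if every one of its bits is set.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in bit_positions(key, self.num_hashes, self.num_bits)
        )


filter_ = SketchBloomFilter(num_bits=1 << 16, num_hashes=7)
filter_.add('www.google.com')
print('www.google.com' in filter_)  # True: an added key is always found
```

Only two hash computations happen per key regardless of `num_hashes`, which is the point of the technique.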

Requirements

Install boost and boost-python

  • Ubuntu

sudo apt-get install libboost-all-dev

  • CentOS

sudo yum install boost-devel

  • OS X

brew install boost boost-python

Install pyhash

sudo pip install pyhash

Install fastbloom

sudo pip install fastbloom

Examples

>>> from fastbloom import BloomFilter
>>> filter_ = BloomFilter(10000, 0.001) # expected input size 10000, error rate 0.1%
>>> filter_.add('www.google.com')
>>> 'www.google.com' in filter_
True
>>> 'www.github.com' in filter_
False

Benchmark

With input_size set to 1000000 and accepted_error_rate set to 0.01%:

  • Memory consumption: 4 MB
  • Add operation: 7322.42 ops
  • Check operation: 24788.79 ops
  • Actual false positive rate: 0.0000

Run benchmark.py to see the actual figures on your own machine.
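For a rough sanity check of the memory figure, the textbook Bloom filter sizing formulas can be evaluated for the same parameters. This is a sketch under assumptions: `optimal_size` is a hypothetical helper, and whether fastbloom sizes its bit array with exactly these formulas is not stated here.

```python
import math


def optimal_size(input_size, error_rate):
    """Textbook sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes."""
    num_bits = int(math.ceil(-input_size * math.log(error_rate) / (math.log(2) ** 2)))
    num_hashes = max(1, int(round(float(num_bits) / input_size * math.log(2))))
    return num_bits, num_hashes


# Same parameters as the benchmark: 1000000 items, 0.01% error rate.
num_bits, num_hashes = optimal_size(1000000, 0.0001)
print("bits: %d, bytes: %d, hashes: %d" % (num_bits, (num_bits + 7) // 8, num_hashes))
```

The formulas give roughly 2.4 MB of bits and 13 hash values, the same order of magnitude as the 4 MB reported above. The reason for the difference is not documented here.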

Todo

  • add a faster Bloom filter
  • add save and restore functions (as files)
  • write a scalable Bloom filter

Contributors

  • preytaren
