Code Monkey home page Code Monkey logo

arithmetic-hash's Introduction

Arithmetic Hash

This project was developed upon Justin's idea.

Evaluation

Background: SHA-1

A good hash function should map the expected inputs as evenly as possible over its output range. - Wikipedia/Hash_Function

Hash algorithm we use for bittorrent is SHA-1, which succeeds to map the encoded outputs in absolute evenly with given range. But this is earned by losing the ability to decode and distinguish the original key. (This project does not consider about the 'security' side of SHA-1)

Improvement Direction

If we can get the original key from hashed output, we will be able to easily search through all the info_hash (hashed keys) to find the file we want to download. For this purpose the Arithmetic Hashing is developed. (1) To hash keys in known algorithm in the way that we can decode, and (2) to map the encoded outputs evenly to its output range.

The basic method is like this, suppose that our key is "apple" and our output range is [0, 1]. By expected probability if the probability of a being the first letter is 10%, than the 'apple' goes in 0 to 0.1 section. And if the probability of bs appearance as second letter after ais 5%, apple goes in [x, x+0.005]. So on, we can specify the position apple shold go in.

Recent Work

Recently, one paper jumped into this idea with 3-order Markov Chain. (I will link the paper as soon as I get it back) But in our hypothesis, 3-order Markov Chain would have descending performance as the key (query) gets longer than one simple word. That's why we are trying to solve this issue with RNN(LSTM).

Results

You can see the experiment flow, results, and the evaluation from this link of the jupyter notebook.

standard deviation results from brown corpus (1 word long keys)

standard deviation results from 2gram corpus (equal or more than 2 words long keys)

with keys with short length, Markov Chain showed better performance, but RNN becomes better as the key length increases. The results were as we desired, and encoding-decoding experiments also went well.

arithmetic-hash's People

Contributors

rokrokss avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.