
base62's People

Contributors

dhimmel, joelnb, kurtmckee, suminb, yeonghoey


base62's Issues

Ignoring leading zero bytes

Hello,

first of all, thank you for this library. I am using it to encode 16-byte blocks, and I have noticed that leading bytes equal to 0x00 are ignored during encoding. This is due to the conversion to an integer that the library does internally. I believe this behavior is incorrect, because without knowing the length of the input block you cannot reconstruct (decode) the original input from the output. In encryption (and many other areas), all bytes matter, including leading zero bytes.

I'll give an example using base64, which does this correctly:

from base64 import b64decode, b64encode

encoded = b64encode(b'\x00\x00\x01').decode()
print(encoded)
decoded = b64decode(encoded)
print(decoded)

This code yields:

AAAB
b'\x00\x00\x01'

Now your library:

import base62

encoded = base62.encodebytes(b'\x00\x00\x01')
print(encoded)
decoded = base62.decodebytes(encoded)
print(decoded)

Yields:

1
b'\x01'

As you can see, the decoded output does not equal the input (the two leading zero bytes are missing).
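One way to make the round trip lossless can be sketched as follows. This is only an illustration, not the library's implementation: the function names are hypothetical, the alphabet order (digits, uppercase, lowercase) is an assumption, and the scheme of writing each leading zero byte as one leading `0` character is borrowed from how base58 encoders commonly handle the same problem.

```python
# Assumed alphabet order; the zero digit is ALPHABET[0].
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
BASE = len(ALPHABET)


def encode_bytes_keep_zeros(data: bytes) -> str:
    """Encode bytes, writing one zero digit per leading 0x00 byte."""
    n_zeros = len(data) - len(data.lstrip(b"\x00"))
    value = int.from_bytes(data, "big")
    digits = []
    while value > 0:
        value, rem = divmod(value, BASE)
        digits.append(ALPHABET[rem])
    return ALPHABET[0] * n_zeros + "".join(reversed(digits))


def decode_bytes_keep_zeros(text: str) -> bytes:
    """Invert encode_bytes_keep_zeros, restoring leading 0x00 bytes."""
    n_zeros = len(text) - len(text.lstrip(ALPHABET[0]))
    value = 0
    for ch in text[n_zeros:]:
        value = value * BASE + ALPHABET.index(ch)
    body = value.to_bytes((value.bit_length() + 7) // 8, "big")
    return b"\x00" * n_zeros + body


encoded = encode_bytes_keep_zeros(b"\x00\x00\x01")
print(encoded)                           # 001
print(decode_bytes_keep_zeros(encoded))  # b'\x00\x00\x01'
```

Because the encoding of a non-zero integer never starts with the zero digit, the count of leading `0` characters is unambiguous under this scheme.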

my bad

I ran this code and it returned an incorrect result:

import base62

hex_id = "18815C41CB3F4E98AB4C55183553E82B"
base62.encodebytes(bytes.fromhex(hex_id)).swapcase()
# returns `KeVTaiPBuBS3akxitQdAf`, which is wrong

It should be `0KeVTaiPBuBS3akxitQdAf`.
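One possible reading of this report is that the expected value is zero-padded to a fixed width: any 16-byte value fits in 22 base62 digits, and this particular value needs only 21. The sketch below illustrates that idea under stated assumptions. The function names are hypothetical, the alphabet order is an assumption, and none of this is the library's own code.

```python
# Assumed alphabet order; the zero digit is ALPHABET[0].
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
BASE = len(ALPHABET)


def encode_int(value: int) -> str:
    """Plain base62 encoding of a non-negative integer."""
    if value == 0:
        return ALPHABET[0]
    digits = []
    while value > 0:
        value, rem = divmod(value, BASE)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def fixed_width(n_bytes: int) -> int:
    """Smallest digit count that can hold every value of n_bytes bytes."""
    limit = 256 ** n_bytes - 1
    width = 1
    while BASE ** width <= limit:
        width += 1
    return width


def encode_fixed(data: bytes) -> str:
    """Encode bytes and left-pad with the zero digit to a fixed width."""
    raw = encode_int(int.from_bytes(data, "big"))
    return raw.rjust(fixed_width(len(data)), ALPHABET[0])


hex_id = "18815C41CB3F4E98AB4C55183553E82B"
padded = encode_fixed(bytes.fromhex(hex_id))
print(len(padded))  # 22
```

With a fixed width, the decoder can recover the input length from the string length alone, at the cost of a few extra characters for small values.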

Advertise Python 3 compatibility

Please consider adding Python 3 trove classifiers (e.g. Programming Language :: Python :: 3) to this package to advertise Python 3 compatibility. This will also require releasing a new version on PyPI.
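As a rough illustration of the request, assuming the package is built with setuptools, the classifiers could be collected in a list like the one below. The `CLASSIFIERS` name is hypothetical; the list would be passed as the `classifiers` argument of `setup()` in `setup.py`.

```python
# Hypothetical variable name; pass this list as setup(classifiers=CLASSIFIERS).
CLASSIFIERS = [
    "Programming Language :: Python",
    "Programming Language :: Python :: 3",
]
```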

Make as a C module (with Cython)

Our preliminary test shows promising results. Building base62 with Cython results in great improvement in performance.

Test code

import random

import base62


for _ in range(1000000):
    value = random.randint(0, 0xffffffff) 
    encoded = base62.encode(value)
    assert base62.decode(encoded) == value

Test with native Python (3.5)

(pybase62) ➜  base62 git:(develop) ✗ time python test.py
python test.py  8.41s user 0.00s system 99% cpu 8.420 total

Test with Cython

(pybase62) ➜  base62 git:(develop) ✗ time python test.py
python test.py  6.63s user 0.00s system 99% cpu 6.636 total

That is a speed improvement of approximately 21%. More thorough tests will be conducted in the near future.
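The 21% figure follows from the two user times reported above, taken as the reduction relative to the native run:

```python
native_seconds = 8.41  # user time, native Python run
cython_seconds = 6.63  # user time, Cython build

# Relative reduction in run time compared with the native run.
improvement = (native_seconds - cython_seconds) / native_seconds
print(f"{improvement:.1%}")  # 21.2%
```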

Wheels to PyPI

Thank you for the great package, @suminb!

Would you consider publishing wheels beyond the source on PyPI for faster installation? This is, in particular, relevant for projects with long lists of dependencies with small packages like this one. 😅

Question with regard to `0z`

Someone sent me an email asking this question:

I see your base62 script at github https://github.com/suminb/base62/blob/develop/base62.py, it's a awesome project and save me lots of time. But I have a question about the code

def decode(b):
    """Decodes a base62 encoded value ``b``."""

    if b.startswith('0z'):
        b = b[2:]

    l, i, v = len(b), 0, 0
    for x in b:
        v += _value(x) * (BASE ** (l - (i + 1)))
        i += 1

    return v

About the above code, what does the if b.startswith('0z') do ? why there need to do this? I don't understand. Thanks.

This is a great question, in fact, and I would like to share this question along with my answer with everyone else.

The prefix 0z is something that I used to use in my old code. It was a decade ago or so. At that time I wanted to use different encodings within a single project. For example,

  • 123456 (decimal)
  • 0xf10a (hexadecimal)
  • 0zExid (base62)

were all in use. The check for the prefix 0z is there merely for backward compatibility and is not necessary in ordinary cases. If you ask base62 to encode an integer, it will not add the prefix 0z. Likewise, if you ask it to decode a base62 string without the prefix, it will work fine.
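The prefix handling can be sketched in a self-contained form as follows. The alphabet order (digits, uppercase, lowercase) is an assumption, and `decode_with_prefix` is a hypothetical name used for illustration, not the library's function.

```python
# Assumed alphabet order; the zero digit is ALPHABET[0].
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
BASE = len(ALPHABET)


def decode_with_prefix(text: str) -> int:
    """Decode a base62 string, ignoring an optional legacy '0z' prefix."""
    if text.startswith("0z"):
        text = text[2:]
    value = 0
    for ch in text:
        value = value * BASE + ALPHABET.index(ch)
    return value


# The prefixed and unprefixed forms decode to the same integer.
print(decode_with_prefix("0zExid") == decode_with_prefix("Exid"))  # True
```

Under this alphabet, the encoding of a positive integer never begins with the zero digit, so stripping the prefix does not collide with ordinary encoder output.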

P.S. It is generally okay to send me an email, but I highly recommend posting your question on GitHub for two reasons:

  1. Our discussion becomes sharable and searchable, and everyone benefits from it.
  2. Someone else may be able to answer quicker than me.

Python 3 compatibility

Changing n /= BASE to n //= BASE makes this work with Python 3, without sacrificing backward compatibility.
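The reason is that `/` on integers returns a float in Python 3, while `//` performs floor division in both Python 2 and Python 3. A minimal sketch of an encoding loop using `//=` is shown below; the alphabet order and the `encode_int` name are assumptions for illustration, not the library's code.

```python
# Assumed alphabet order; the zero digit is ALPHABET[0].
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
BASE = len(ALPHABET)


def encode_int(n: int) -> str:
    """Encode a non-negative integer as a base62 string."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        digits.append(ALPHABET[n % BASE])
        n //= BASE  # floor division keeps n an int on Python 2 and 3
    return "".join(reversed(digits))


print(7 / 2)           # 3.5 on Python 3
print(7 // 2)          # 3
print(encode_int(62))  # 10
```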
