
base41's Introduction

base41

Base41 encoding can be viewed as a simplified form of Ascii85 encoding. It is simple to code and understand, safe to embed in strings (such as JSON or the string literals of most programming languages), uses minimal resources, and is reasonably efficient (50% overhead: every 2 bytes become 3 characters), making it a good candidate for embedded or otherwise constrained systems and applications.
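To make the 50% overhead concrete, here is a minimal Python sketch of a 2-bytes-to-3-characters round trip. It is not the reference implementation: the character offset of 41 (so output runs from `)` to `Q`), the little-endian reading of each byte pair, and the function names are assumptions of this sketch, and it only handles an even number of bytes. The specification remains the authority on all of these details.

```python
BASE = 41
OFFSET = 41  # assumption: digit 0 maps to ')' (ASCII 41), digit 40 to 'Q'


def base41_encode(data: bytes) -> str:
    """Encode an even-length byte string, 2 bytes -> 3 characters."""
    if len(data) % 2:
        raise ValueError("this sketch only handles an even number of bytes")
    out = []
    for i in range(0, len(data), 2):
        # Assumption: each byte pair is read as a little-endian 16-bit value.
        x = data[i] | (data[i + 1] << 8)
        for _ in range(3):  # 41**3 = 68921 >= 65536, so 3 digits suffice
            x, digit = divmod(x, BASE)
            out.append(chr(digit + OFFSET))  # no alphabet table needed
    return "".join(out)


def base41_decode(text: str) -> bytes:
    """Inverse of base41_encode for this sketch."""
    if len(text) % 3:
        raise ValueError("this sketch expects a multiple of 3 characters")
    out = bytearray()
    for i in range(0, len(text), 3):
        digits = [ord(c) - OFFSET for c in text[i:i + 3]]
        if any(not 0 <= d < BASE for d in digits):
            raise ValueError("character outside the assumed Base41 range")
        x = digits[0] + BASE * digits[1] + BASE * BASE * digits[2]
        if x > 0xFFFF:
            raise ValueError("triplet does not fit in 16 bits")
        out += bytes((x & 0xFF, x >> 8))
    return bytes(out)


payload = b"\x01\x02\x03\x04"
encoded = base41_encode(payload)
print(len(payload), "bytes ->", len(encoded), "characters")
print(base41_decode(encoded) == payload)
```

Under these assumptions the output never contains a double quote, a single quote, or a backslash, which is what makes it easy to drop into a string literal.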

See the Base41 Encoding Specification for a detailed treatment of this encoding. There is also a specification in Serbian.

This repository has implementations in several languages. All code is under the liberal MIT license. The reference implementation is in C. Other languages may follow.

Misc

Invention, patents, whatever

AFAIK, I invented this encoding. If I haven't, great, let the glory go to the other guy.

But, if I did invent it, I hereby place it in the public domain as of 2014-05-06 (the actual date of invention, in ISO 8601 format). That should make any patent claims for it invalid. Don't laugh, I've seen software patents that are much sillier.

Paper on a similar encoding with the same name

Botta and Cavagnino published a paper in 2022 about a slightly different encoding with the same name. The differences are:

  1. They have a "hand-picked" alphabet, making it possible for the Base41 encoded string to be part of a URL. Interesting, but we avoided having an "alphabet table" by design, to reduce memory footprint.
  2. Their encoding is not "byte-oriented" but "bit-oriented". That is, the data length need not be 8 x N bits. Again, interesting, but it complicates things for a somewhat rare use case (well, I guess for them it was not so rare).

Thus, we can think of that as a variant/dialect of the same encoding, which one would use if the differences mentioned above favour it in some application.

They were kind enough to mention this code repo as "prior art" of sorts. It's an interesting read, with a different focus and presentation style than our specification. Here's the full reference to the paper:

Botta, Marco, and Davide Cavagnino. "Base41: A proposal for printable encoding of bit strings." Engineering Reports (2022): e12606.

Age before beauty

No, I was not 41 when I invented it. :)

base41's Issues

Information about QR Code use for documentation

Base41 could be the optimal format to store binary data in URIs optimized for QR Codes.

QR Codes in byte mode can encode all the characters allowed in URIs (RFC 3986): ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=
However, the more compact alphanumeric mode really only encodes 45 characters: 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ␣$%*+-./: (where ␣ stands for a space).
Lowercase letters are not in that set; they force byte mode, which costs 8 bits per character instead of 11 bits per pair of characters. This is why the HCERT-designed (WHO Electronic Health Certificate) Base45 uses that exact alphabet of 45 characters.
But they really did it in a hurry and apparently were not familiar with binary-to-text encoding optimizations or QR Code software: the codes they generate require a specific scanning library and cannot be used to reliably link directly to web sites or apps.

They could have achieved the same density with a Base41 encoding, which would also have made it possible to avoid dangerous characters: primarily %, which many generic QR Code readers expect to introduce percent-encoding, and +, which can be an escaped space. Both may be decoded and processed unreliably by QR scanning utilities designed to expect URLs.
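The "same density" claim comes down to simple arithmetic: both bases fit any 16-bit value into three characters, and 41 is the smallest base that does. A small Python check, where the helper name `chars_per_byte_pair` is made up for this illustration:

```python
def chars_per_byte_pair(base: int) -> int:
    """Smallest number of base-`base` digits that can hold any 16-bit value."""
    count, capacity = 0, 1
    while capacity < 2 ** 16:
        capacity *= base
        count += 1
    return count


print(chars_per_byte_pair(45))  # 3, since 45**3 = 91125 >= 65536
print(chars_per_byte_pair(41))  # 3, since 41**3 = 68921 >= 65536
print(chars_per_byte_pair(40))  # 4, since 40**3 = 64000 <  65536
```

So dropping from 45 to 41 characters frees four characters of the alphabet without costing any output length.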

With a 41-character alphabet, it is possible to stay within the URI character set (unreserved + reserved, not the more limited unreserved set used for URI components): 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ$*-.: This keeps / available as a field separator or as padding between several Base41-encoded values, and avoids + completely.
While not completely safe in a query string component (depending on how you handle those), it is safe to embed in a URI or URL, so a web site URL or an app URI can carry binary data appended after its registered pseudo-protocol.
I believe this makes it perfect both for trackable web site links (with some binary data attached) and for app shortcut URIs with state information appended in binary (for example, a music player could encode a playlist UUID in binary to launch that specific playlist).
This seems like a good way to generate shorter web and app links optimized for QR Codes, which in turn results in smaller QR Codes.
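As a rough illustration of the idea, the following Python sketch encodes a playlist UUID with the 41-character alphabet proposed above and appends it to a link. Everything beyond the alphabet itself is an assumption made for the example: the function name `encode_uri_base41`, the little-endian byte order, the least-significant-digit-first output, and the `HTTPS://EXAMPLE.COM/P/` prefix are all hypothetical, and odd-length input is simply rejected.

```python
import uuid

# The 41-character alphabet proposed in this issue.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ$*-.:"
# The 45 characters of QR Code alphanumeric mode (including the space).
QR_ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def encode_uri_base41(data: bytes) -> str:
    """Encode an even-length byte string, 2 bytes -> 3 alphabet characters."""
    if len(data) % 2:
        raise ValueError("this sketch only handles an even number of bytes")
    out = []
    for i in range(0, len(data), 2):
        # Assumption: little-endian pairs, least significant digit first.
        x = int.from_bytes(data[i:i + 2], "little")
        for _ in range(3):
            x, digit = divmod(x, 41)
            out.append(ALPHABET[digit])
    return "".join(out)


# Hypothetical playlist identifier and link prefix, for illustration only.
playlist_id = uuid.UUID("12345678-1234-5678-1234-567812345678")
link = "HTTPS://EXAMPLE.COM/P/" + encode_uri_base41(playlist_id.bytes)

print(link)
print(len(encode_uri_base41(playlist_id.bytes)))  # 24 characters for 16 bytes
print(set(link) <= QR_ALPHANUMERIC)  # whole link fits alphanumeric mode
```

Note that, unlike the table-free design of the encoding in this repository, this variant needs a lookup table, which is the trade-off for choosing the characters by hand.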

In short, I believe Base41 may be the optimal format for generating short URLs and URIs for QR Code links that can be read directly by a phone's code scanning utility.
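To get a feel for the savings, here is a back-of-the-envelope Python comparison of the QR payload bits needed for a 16-byte UUID under different text encodings. It relies on two standard QR Code facts: alphanumeric mode packs each pair of characters into 11 bits (6 bits for a trailing single character), and byte mode uses 8 bits per character. Mode indicators, character counts, the rest of the link, and error correction are ignored, and the helper names are made up for this sketch.

```python
import math


def alphanumeric_bits(n_chars: int) -> int:
    """Payload bits in QR alphanumeric mode: 11 per pair, 6 for a leftover."""
    return 11 * (n_chars // 2) + 6 * (n_chars % 2)


def byte_mode_bits(n_chars: int) -> int:
    """Payload bits in QR byte mode: 8 per character."""
    return 8 * n_chars


n_bytes = 16  # size of a UUID

base41_chars = 3 * n_bytes // 2               # 24 characters
hex_chars = 2 * n_bytes                       # 32 characters
base64url_chars = math.ceil(n_bytes * 4 / 3)  # 22 characters, unpadded

print("Base41, alphanumeric mode:   ", alphanumeric_bits(base41_chars))  # 132
print("Uppercase hex, alphanumeric: ", alphanumeric_bits(hex_chars))     # 176
print("base64url, byte mode:        ", byte_mode_bits(base64url_chars))  # 176
print("Lowercase hex, byte mode:    ", byte_mode_bits(hex_chars))        # 256
```

For comparison, the raw UUID is 128 bits, so under these simplifications Base41 in alphanumeric mode adds only about 3% on top of the binary size.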
