base41's Introduction

base41

Base41 encoding can be viewed as a simplified form of Ascii85 encoding. It is simple to code and understand, safe to embed in strings (such as JSON or the string literals of most programming languages), uses minimal resources and is reasonably efficient (50% overhead), making it a good candidate for use in embedded or otherwise constrained systems and applications.
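The 50% overhead follows from simple arithmetic: 41 is the smallest base whose cube covers a 16-bit value (41^3 = 68921 ≥ 65536, while 40^3 = 64000 is too small), so every 2 bytes fit in 3 characters. A minimal sketch of that idea in C follows. The function names, the offset of 41 (characters `)` through `Q`), the byte order, and the digit order are assumptions made for illustration; the Base41 Encoding Specification and the reference implementation are authoritative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a digit becomes a character by adding a fixed offset,
   so no alphabet table is needed. The value 41 is illustrative. */
#define B41_OFFSET 41

/* Encode one pair of bytes as three base-41 digits.
   Assumed order: lo is the low byte, least significant digit first. */
static void b41_encode_pair(uint8_t lo, uint8_t hi, char out[3])
{
    assert(out != NULL);
    uint32_t x = (uint32_t)lo | ((uint32_t)hi << 8);   /* 0 .. 65535 */
    out[0] = (char)(x % 41 + B41_OFFSET);
    x /= 41;
    out[1] = (char)(x % 41 + B41_OFFSET);
    out[2] = (char)(x / 41 + B41_OFFSET);
}

/* Decode three characters back into a pair of bytes.
   Returns 0 on success, -1 if a character is out of range or the
   triple encodes a value that does not fit in 16 bits. */
static int b41_decode_triple(const char in[3], uint8_t *lo, uint8_t *hi)
{
    assert(in != NULL && lo != NULL && hi != NULL);
    uint32_t d[3];
    for (int i = 0; i < 3; ++i) {
        int v = in[i] - B41_OFFSET;
        if (v < 0 || v > 40) {
            return -1;
        }
        d[i] = (uint32_t)v;
    }
    uint32_t x = d[0] + 41u * d[1] + 41u * 41u * d[2];
    if (x > 65535u) {
        return -1;
    }
    *lo = (uint8_t)(x & 0xFFu);
    *hi = (uint8_t)(x >> 8);
    return 0;
}
```

With this offset, none of the output characters is a double quote or a backslash, which is what makes the result safe inside string literals. Handling of an odd trailing byte is left out here because it depends on the specification.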

See Base41 Encoding Specification for a detailed treatment of this encoding. A specification in Serbian is also available.

This repository has implementations in several languages. All code is under the liberal MIT license. The reference implementation is in C. Other languages may follow.

Misc

Invention, patents, whatever

AFAIK, I invented this encoding. If I didn't, great, let the glory go to the other guy.

But, if I did invent it, I hereby place it in the public domain as of 2014-05-06 (the actual date of invention, in ISO 8601 format). That should make any patent claims for it invalid. Don't laugh, I've seen software patents that are much sillier.

Paper on a similar encoding with the same name

Botta and Cavagnino published a paper in 2022 about a slightly different encoding with the same name. The differences are:

They have a "hand-picked" alphabet, making it possible for the Base41 encoded string to be part of a URL. Interesting, but we avoided having an "alphabet table" by design, to reduce the memory footprint.
Their encoding is not "byte-oriented" but "bit-oriented". That is, the data need not be a whole number of bytes (8 x N bits). Again, interesting, but it complicates things for a somewhat rare use case (well, I guess for them it was not so rare).

Thus, we can think of that as a variant/dialect of the same encoding, which one would use if the differences mentioned above favour it in some application.

They were kind enough to mention this code repo as "prior art" of sorts. It's an interesting read, with a different focus and presentation style than our specification. Here's the full reference to the paper:

Botta, Marco, and Davide Cavagnino. "Base41: A proposal for printable encoding of bit strings." Engineering Reports (2022): e12606.

Age before beauty

No, I was not 41 when I invented it. :)

base41's Issues

Information about QR Code use for documentation

Base41 could be the optimal format to store binary data in URIs optimized for QR Codes.

QR Codes can encode all the characters allowed in URIs (RFC 3986): ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=
However, the QR Code alphanumeric mode really only encodes 45 characters: 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ␣$%*+-./: (where ␣ denotes a space).
Lowercase letters are outside that set, so text containing them has to use byte mode, which costs 8 bits per character instead of 5.5. This is why the HCERT-designed (WHO Electronic Health Certificate) Base45 uses that exact alphabet of 45 characters.
But they really did it in a hurry and apparently were not familiar with binary-to-text encoding optimizations or QR Code software: the codes they generate require a specific scanning library and cannot be used to reliably link directly to web sites or apps.

They could have achieved the same density with a Base41 encoding, and that would have made it possible to avoid dangerous characters, primarily the %, which many generic QR Code readers expect to introduce a percent-encoded sequence, and the +, which can be an escaped space. Both may be unreliably decoded and processed by some QR scanning utilities designed to expect URLs.
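The density argument can be checked with a little arithmetic. QR alphanumeric mode packs two characters into 11 bits (a leftover single character takes 6), while byte mode spends 8 bits per character. The sketch below compares a 3-characters-per-2-bytes encoding (Base45 or Base41) in alphanumeric mode against unpadded Base64 in byte mode. The helper names are hypothetical, and the counts cover payload data bits only, ignoring mode indicators, character counts, error correction, and the rest of the URL.

```c
#include <assert.h>
#include <stddef.h>

/* Data bits for n characters in QR alphanumeric mode:
   11 bits per pair of characters, 6 bits for a leftover single one. */
static size_t qr_alnum_bits(size_t n_chars)
{
    return 11 * (n_chars / 2) + 6 * (n_chars % 2);
}

/* Data bits for n characters in QR byte mode: 8 bits each. */
static size_t qr_byte_bits(size_t n_chars)
{
    return 8 * n_chars;
}

/* Characters produced by a 3-for-2 encoding such as Base45 or Base41.
   Only even lengths are handled, since the treatment of an odd
   trailing byte depends on the particular specification. */
static size_t three_for_two_chars(size_t n_bytes)
{
    assert(n_bytes % 2 == 0);
    return n_bytes / 2 * 3;
}

/* Characters produced by Base64 without '=' padding: ceil(4n / 3). */
static size_t base64_unpadded_chars(size_t n_bytes)
{
    return (4 * n_bytes + 2) / 3;
}
```

For a 16-byte value such as a UUID, `qr_alnum_bits(three_for_two_chars(16))` gives 132 bits, against 176 bits for `qr_byte_bits(base64_unpadded_chars(16))` and 128 bits for the raw data.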

With a 41-character alphabet, it is possible to stay within the URI character set (unreserved + reserved, not the more limited unreserved set used for URI components): 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ$*-.:, keeping the / available as a field separator or as padding when appending several Base41-encoded values, and avoiding the + completely.
While not completely safe for a query string component (depending on how you handle those), it is safe for embedding in a URI or URL, which makes it possible to have a URI that specifies a web site, or an app URI with binary data appended to its registered pseudo-protocol.
I believe this makes it perfect both for trackable web site links (with some binary data attached) and for app shortcut URIs with the state information appended in binary (for example, a music player could have a playlist UUID encoded in binary to launch that specific playlist).
This seems like a good way to generate shorter web and app links optimized for QR codes, which in turn results in smaller QR Codes.
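A minimal sketch of how the 41-character alphabet proposed above could be used, in C. This is not the repository's encoding, which avoids an alphabet table by design; it is a table-based variant for illustration. The function names, the byte order, and the digit order are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The 41 characters proposed in this issue; all of them belong to the
   QR alphanumeric set, and none is '%', '+', '/' or a space. */
static const char B41_URI_ALPHABET[] =
    "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ$*-.:";

/* Encode one pair of bytes as three characters of the alphabet.
   Assumed order: lo is the low byte, least significant digit first. */
static void b41uri_encode_pair(uint8_t lo, uint8_t hi, char out[3])
{
    assert(out != NULL);
    uint32_t x = (uint32_t)lo | ((uint32_t)hi << 8);   /* 0 .. 65535 */
    out[0] = B41_URI_ALPHABET[x % 41];
    x /= 41;
    out[1] = B41_URI_ALPHABET[x % 41];
    out[2] = B41_URI_ALPHABET[x / 41];
}

/* Decode three characters back into a pair of bytes.
   Returns 0 on success, -1 on a character outside the alphabet or a
   triple whose value does not fit in 16 bits. */
static int b41uri_decode_triple(const char in[3], uint8_t *lo, uint8_t *hi)
{
    assert(in != NULL && lo != NULL && hi != NULL);
    uint32_t d[3];
    for (int i = 0; i < 3; ++i) {
        /* strchr would also match the terminating NUL, so reject it. */
        const char *p = (in[i] != '\0') ? strchr(B41_URI_ALPHABET, in[i]) : NULL;
        if (p == NULL) {
            return -1;
        }
        d[i] = (uint32_t)(p - B41_URI_ALPHABET);
    }
    uint32_t x = d[0] + 41u * d[1] + 41u * 41u * d[2];
    if (x > 65535u) {
        return -1;
    }
    *lo = (uint8_t)(x & 0xFFu);
    *hi = (uint8_t)(x >> 8);
    return 0;
}
```

A 16-byte UUID would be encoded as eight such pairs, giving 24 characters that could be appended to a URI, with / left free as a separator between values.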

In short, I believe Base41 may be the optimal format to generate short URLs and URIs for use in QR Code links that can be read directly by phones' code scanning utilities.
