Code Monkey home page Code Monkey logo

spec's Introduction



ulid


Universally Unique Lexicographically Sortable Identifier

UUID can be suboptimal for many uses-cases because:

  • It isn't the most character efficient way of encoding 128 bits of randomness
  • UUID v1/v2 is impractical in many environments, as it requires access to a unique, stable MAC address
  • UUID v3/v5 requires a unique seed and produces randomly distributed IDs, which can cause fragmentation in many data structures
  • UUID v4 provides no other information than randomness which can cause fragmentation in many data structures

Instead, herein is proposed ULID:

ulid() // 01ARZ3NDEKTSV4RRFFQ69G5FAV
  • 128-bit compatibility with UUID
  • 1.21e+24 unique ULIDs per millisecond
  • Lexicographically sortable!
  • Canonically encoded as a 26 character string, as opposed to the 36 character UUID
  • Uses Crockford's base32 for better efficiency and readability (5 bits per character)
  • Case insensitive
  • No special characters (URL safe)
  • Monotonic sort order (correctly detects and handles the same millisecond)

Implementations in other languages

From ourselves and the community!

Language Author Binary Implementation
C++ suyash
Clojure theikkila
Objective-C ricardopereira
Crystal SuperPaintman
Dart isoos
Delphi matinusso
D enckse
D (dub) extrawurst
Erlang savonarola
Elixir Homepolish
Elixir merongivian
Elixir omisego
Elixir (Ecto) dcuddeback
F# lucasschejtman
Go oklog
Haskell steven777400
Java huxi
Java azam
Java Lewiscowles1986
JavaScript ulid
JavaScript aarondcohen
Julia ararslan
Lua Tieske
.NET RobThree
.NET fvilers
Nim adelq
Perl 5 bk
PHP Lewiscowles1986
PHP robinvdvleuten
PowerShell PetterBomban
Python mdipierro
Python ahawker
Python mdomke
Ruby rafaelsales
Rust mmacedoeu
Rust dylanhart
SQL-Microsoft rmalayter
Swift simonwhitehouse
Tcl dbohdan

Specification

Below is the current specification of ULID as implemented in this repository.

Note: the binary format has not been implemented in JavaScript as of yet.

 01AN4Z07BY      79KA1307SR9X4MV3

|----------|    |----------------|
 Timestamp          Randomness
   48bits             80bits

Components

Timestamp

  • 48 bit integer
  • UNIX-time in milliseconds
  • Won't run out of space till the year 10889 AD.

Randomness

  • 80 bits
  • Cryptographically secure source of randomness, if possible

Sorting

The left-most character must be sorted first, and the right-most character sorted last (lexical order). The default ASCII character set must be used. Within the same millisecond, sort order is not guaranteed

Canonical String Representation

ttttttttttrrrrrrrrrrrrrrrr

where
t is Timestamp (10 characters)
r is Randomness (16 characters)

Encoding

Crockford's Base32 is used as shown. This alphabet excludes the letters I, L, O, and U to avoid confusion and abuse.

0123456789ABCDEFGHJKMNPQRSTVWXYZ

Monotonicity

When generating a ULID within the same millisecond, we can provide some guarantees regarding sort order. Namely, if the same millisecond is detected, the random component is incremented by 1 bit in the least significant bit position (with carrying). For example:

import { monotonicFactory } from 'ulid'

const ulid = monotonicFactory()

// Assume that these calls occur within the same millisecond
ulid() // 01BX5ZZKBKACTAV9WEVGEMMVRZ
ulid() // 01BX5ZZKBKACTAV9WEVGEMMVS0

If, in the extremely unlikely event that, you manage to generate more than 280 ULIDs within the same millisecond, or cause the random component to overflow with less, the generation will fail.

import { monotonicFactory } from 'ulid'

const ulid = monotonicFactory()

// Assume that these calls occur within the same millisecond
ulid() // 01BX5ZZKBKACTAV9WEVGEMMVRY
ulid() // 01BX5ZZKBKACTAV9WEVGEMMVRZ
ulid() // 01BX5ZZKBKACTAV9WEVGEMMVS0
ulid() // 01BX5ZZKBKACTAV9WEVGEMMVS1
...
ulid() // 01BX5ZZKBKZZZZZZZZZZZZZZZX
ulid() // 01BX5ZZKBKZZZZZZZZZZZZZZZY
ulid() // 01BX5ZZKBKZZZZZZZZZZZZZZZZ
ulid() // throw new Error()!

Overflow Errors when Parsing Base32 Strings

Technically, a 26-character Base32 encoded string can contain 130 bits of information, whereas a ULID must only contain 128 bits. Therefore, the largest valid ULID encoded in Base32 is 7ZZZZZZZZZZZZZZZZZZZZZZZZZ, which corresponds to an epoch time of 281474976710655 or 2 ^ 48 - 1.

Any attempt to decode or encode a ULID larger than this should be rejected by all implementations, to prevent overflow bugs.

Binary Layout and Byte Order

The components are encoded as 16 octets. Each component is encoded with the Most Significant Byte first (network byte order).

0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      32_bit_uint_time_high                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     16_bit_uint_time_low      |       16_bit_uint_random      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       32_bit_uint_random                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       32_bit_uint_random                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Prior Art

Partly inspired by:

spec's People

Contributors

aarondcohen avatar alizain avatar darccio avatar dcuddeback avatar erikreedstrom avatar huxi avatar jpwilliams avatar unnawut avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.