go-art's Introduction

                                        __      
                                       /\ \__   
   __     ___               __     _ __\ \ ,_\  
 /'_ `\  / __`\  _______  /'__`\  /\`'__\ \ \/  
/\ \L\ \/\ \L\ \/\______\/\ \L\.\_\ \ \/ \ \ \_ 
\ \____ \ \____/\/______/\ \__/.\_\\ \_\  \ \__\
 \/___L\ \/___/           \/__/\/_/ \/_/   \/__/
   /\____/                                      
   \_/__/                                       

an adaptive radix tree implementation in go

what

An Adaptive Radix Tree is an indexing data structure similar to a traditional radix tree, except that its internal nodes grow and shrink adaptively as keys are inserted and removed.

Adaptive Radix Trees have many interesting attributes that could be seen as improvements on other indexing data structures like Hash Maps or other Prefix Trees, such as:

  • Worst-case search complexity of O(k), where k is the length of the key.
  • They don't have to be rebuilt due to excessive inserts.
  • The structure of their inner nodes is space-efficient when compared to traditional Prefix Trees.
  • They provide prefix compression, a technique where each inner node specifies how far to 'fast-forward' in the search key before traversing to the next child.
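
As a rough illustration of the prefix compression described in the last point, the sketch below stores a shared prefix on an inner node and skips over it during a lookup. The names `innerNode` and `skipPrefix` are made up for this example and are not part of go-art's API.

```go
package main

import (
	"bytes"
	"fmt"
)

// innerNode is a hypothetical inner node that carries a compressed prefix.
type innerNode struct {
	prefix []byte
}

// skipPrefix checks that key matches n.prefix starting at depth. On a match
// it returns the depth just past the prefix (the "fast-forward"); otherwise
// it returns the unchanged depth and false.
func skipPrefix(n *innerNode, key []byte, depth int) (int, bool) {
	if depth > len(key) || !bytes.HasPrefix(key[depth:], n.prefix) {
		return depth, false
	}
	return depth + len(n.prefix), true
}

func main() {
	n := &innerNode{prefix: []byte("art ")}
	depth, ok := skipPrefix(n, []byte("art trees"), 0)
	fmt.Println(depth, ok) // 4 true
}
```

A single comparison against the stored prefix replaces what would otherwise be one node hop per byte.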

usage

Include go-art in your Go packages with:

import "github.com/kellydunn/go-art"

Go nuts:

// Make an ART Tree
tree := art.NewArtTree()

// Insert some stuff
tree.Insert([]byte("art trees"), []byte("are rad"))

// Search for a key, and get the resultant value
res := tree.Search([]byte("art trees"))

// Inspect your result!
fmt.Printf("%s\n", res) // "are rad"

documentation

Check out the documentation on godoc.org: http://godoc.org/github.com/kellydunn/go-art

implementation details

  • It's currently unclear whether Go supports SIMD instructions, so Node16s use binary search for lookups instead of the SIMD comparison originally specified.
  • Search is currently implemented in the pessimistic variation as described in the specification linked below.
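
A minimal sketch of the binary-search lookup mentioned in the first point, assuming a Node16 keeps its key bytes in sorted order. `findChildIndex` is a hypothetical helper written for this example, not go-art's actual function.

```go
package main

import (
	"fmt"
	"sort"
)

// findChildIndex returns the position of b in the sorted key bytes of a
// hypothetical Node16, or -1 if b is not present.
func findChildIndex(keys []byte, b byte) int {
	// sort.Search finds the first index whose key is >= b.
	i := sort.Search(len(keys), func(i int) bool { return keys[i] >= b })
	if i < len(keys) && keys[i] == b {
		return i
	}
	return -1
}

func main() {
	keys := []byte{'a', 'c', 'r', 't'}
	fmt.Println(findChildIndex(keys, 'r')) // 2
	fmt.Println(findChildIndex(keys, 'x')) // -1
}
```

With at most 16 keys, this takes no more than a handful of comparisons per node.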

performance

Worst-case scenarios for basic operations are:

Search    Insert    Removal
O(k)      O(k)+c    O(k)+c

  • k is the length of the key that we wish to insert. With prefix compression, this can be faster than hashing the key, since hash functions are themselves O(k) operations.
  • c is the number of children at the parent node of insertion or removal. This accounts for the growing or shrinking of the inner node. In the worst case, this number is 48: the maximum number of children to move when transitioning between the two largest types of inner node.
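
To make the c term concrete, here is a simplified sketch of growing a full inner node into the next larger type. The `node` type and `grow` function are hypothetical; they ignore how each real node type indexes its children and only count how many child pointers get copied.

```go
package main

import "fmt"

// capacities lists the inner node sizes of an Adaptive Radix Tree.
var capacities = []int{4, 16, 48, 256}

// node is a hypothetical, simplified inner node that only tracks its
// capacity and child slots.
type node struct {
	capacity int
	children []*node
}

// grow returns a node of the next larger capacity along with the number
// of children that had to be copied, which is the "c" term above.
func grow(n *node) (*node, int) {
	for _, c := range capacities {
		if c > n.capacity {
			bigger := &node{capacity: c, children: make([]*node, 0, c)}
			bigger.children = append(bigger.children, n.children...)
			return bigger, len(n.children)
		}
	}
	return n, 0 // already the largest node type
}

func main() {
	full := &node{capacity: 48, children: make([]*node, 48)}
	bigger, moved := grow(full)
	fmt.Println(bigger.capacity, moved) // 256 48
}
```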

related works

go-art's Issues

Add tree size function

It would be great to have a tree size function to return the number of nodes in the tree.
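
One way such a function could look is a recursive walk that counts every node it reaches. This is only a sketch: the `node` type is a stand-in invented for the example and does not mirror go-art's `ArtNode`.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a tree node; unused child slots are nil.
type node struct {
	children []*node
}

// countNodes returns the number of nodes in the subtree rooted at n,
// including n itself. A nil node counts as zero.
func countNodes(n *node) int {
	if n == nil {
		return 0
	}
	total := 1
	for _, c := range n.children {
		total += countNodes(c)
	}
	return total
}

func main() {
	inner := &node{children: []*node{&node{}}}
	root := &node{children: []*node{&node{}, nil, inner}}
	fmt.Println(countNodes(root)) // 4
}
```

A walk like this costs time proportional to the tree size on every call; keeping a counter that is updated on insert and removal would be the cheaper alternative.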

Crash after inserting >100 keys

package main

import (
    "encoding/binary"
    "math/rand"

    "github.com/kellydunn/go-art"
)

func main() {
    rand.Seed(42)
    key := make([]byte, 8)
    tree := art.NewArtTree()
    for i := 0; i < 135; i++ {
        binary.BigEndian.PutUint64(key, uint64(rand.Int63()))
        tree.Insert(key, key)
    }
}
$ go run bug.go 
panic: runtime error: index out of range

goroutine 1 [running]:
github.com/kellydunn/go-art.(*ArtTree).insertHelper(0xc20800a1f0, 0xc20803f170, 0xc20805c3b0, 0xc20800aca0, 0x9, 0x10, 0x483380, 0xc20801f5c0, 0x1)
    /home/tv/go/src/github.com/kellydunn/go-art/art_tree.go:115 +0x619
github.com/kellydunn/go-art.(*ArtTree).insertHelper(0xc20800a1f0, 0xc20803e090, 0xc20800a1f0, 0xc20800aca0, 0x9, 0x10, 0x483380, 0xc20801f5c0, 0x0)
    /home/tv/go/src/github.com/kellydunn/go-art/art_tree.go:170 +0xde7
github.com/kellydunn/go-art.(*ArtTree).Insert(0xc20800a1f0, 0xc20800aca0, 0x9, 0x10, 0x483380, 0xc20801f5c0)
    /home/tv/go/src/github.com/kellydunn/go-art/art_tree.go:62 +0xa9
main.main()
    /home/tv/tmp/max.go:16 +0x154

goroutine 2 [runnable]:
runtime.forcegchelper()
    /home/tv/src/go1.4/src/runtime/proc.go:90
runtime.goexit()
    /home/tv/src/go1.4/src/runtime/asm_amd64.s:2232 +0x1

goroutine 3 [runnable]:
runtime.bgsweep()
    /home/tv/src/go1.4/src/runtime/mgc0.go:82
runtime.goexit()
    /home/tv/src/go1.4/src/runtime/asm_amd64.s:2232 +0x1

goroutine 4 [runnable]:
runtime.runfinq()
    /home/tv/src/go1.4/src/runtime/malloc.go:712
runtime.goexit()
    /home/tv/src/go1.4/src/runtime/asm_amd64.s:2232 +0x1
exit status 2

Inserted Key needs to be preserved and accessible by the current Node

As requested by a few users, it seems like it would be pretty useful to access the inserted key of the current node. Currently, this isn't possible because the path compression implementation alters the key so it's null terminated to differentiate between an internal node and a leaf node.

Suggested implementation:

  • Add another field to each Leaf Node so that it remembers the actual key it was inserted with.
  • Provide a Getter function that returns the original Key of the current node.
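
A minimal sketch of the two suggestions above, assuming the internal key is the inserted key plus a null terminator. The `leaf` type, `newLeaf`, and `Key` are hypothetical names for illustration only.

```go
package main

import "fmt"

// leaf is a hypothetical leaf node. storedKey is the null-terminated form
// used internally; originalKey is the key exactly as the caller passed it.
type leaf struct {
	storedKey   []byte
	originalKey []byte
	value       interface{}
}

// newLeaf copies key twice: once untouched, once with a terminator appended.
// Copying keeps later mutations of the caller's slice out of the tree.
func newLeaf(key []byte, value interface{}) *leaf {
	original := append([]byte(nil), key...)
	stored := append(append([]byte(nil), key...), 0)
	return &leaf{storedKey: stored, originalKey: original, value: value}
}

// Key returns a copy of the key exactly as it was inserted.
func (l *leaf) Key() []byte {
	return append([]byte(nil), l.originalKey...)
}

func main() {
	l := newLeaf([]byte("art trees"), "are rad")
	fmt.Printf("%q\n", l.Key()) // "art trees"
}
```

The cost is one extra slice per leaf, which trades memory for the ability to round-trip keys.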

No method to get key from a node?

This makes e.g. iterating the tree a bit pointless; you can't even round-trip a map[string]string to ART and back (without kludges).

Trim Size of Node members

anachronistic [3:18 PM] 
@kellydunn looking at `ArtNode` (thinking about allocations, etc.) you can shave 8 bytes off regardless of platform by reordering the struct members

anachronistic [3:19 PM] 
type ArtNode struct {
    keys      []byte
    prefix    []byte
    children  []*ArtNode
    prefixLen int
    size      uint8
    nodeType  uint8
    key       []byte
    keySize   uint64
    value     interface{}
} // 72 on 32-bit, 136 on 64-bit

Coupled with the findings at github.com/tv42/benchmark-ordered-map, this lib could dramatically reduce the number of bytes allocated for searching and inserting.

Acceptance criteria:

  • Change order of node members for optimum spacing
  • Re-run benchmarks to ensure space savings
  • Add some sort of test or benchmarking to ensure optimum size of nodes in future.

License?

Interesting implementation of an interesting concept, but what is the license? Can I use it?

Go module support

It would be great if this repo could support go modules (and semantically versioned git tags).
Thanks,
Ben

Unicode strings with zero-byte values inside the key cannot be inserted

As discovered by @nick-codes, it looks like inserting a unicode key into a new ArtTree panics with an index out of range error.

Here's some code to reproduce the error:

// Tests a Unicode insertion.
func TestArtTreeInsertWithInternalZeroByte(t *testing.T) {
    tree := NewArtTree()

    // 'a' followed by unicode accent character.
    accent := []byte{0x61, 0x00, 0x60}
    tree.Insert([]byte("a"), "a")
    tree.Insert(accent, accent) // store []byte so the res.([]byte) assertion below holds

    res := tree.Search(accent)
    if res == nil {
        t.Errorf("Could not find Leaf Node with expected key: '%s'", accent)
    } else {
        if !bytes.Equal(res.([]byte), accent) {
            t.Error("Unexpected search result.")
        }
    }
}

Acceptance criteria:

  • A user can insert a unicode string with an empty byte value inside of it.
