
bolthold's People

Contributors

calmh, ftknox, heavyhorst, madewithlinux, nv4re, rmg, timshannon, wolveix

bolthold's Issues

usage examples

I have been using storm, but this looks pretty good.

Is there an example of it being used in a project yet?

I am also thinking of using this with "go generate"

Discussion: code generation

Hi,

Very cool project, certainly there is a need for high level object functions on top of Bolt.

I saw this conversation on Reddit: https://www.reddit.com/r/golang/comments/5dfzm7/bolthold_an_embeddable_nosql_store_for_go_types/da5uc53/ .

For me, code generation is a way to avoid reflection and interface{}. E.g. the Where("field") concept seems weak compared to a codegen'd Where_MyType_MyField() replacement; it helps catch errors at compile time.

Possibly this is too great a departure from the package's current behaviour?

Add Count method

It seems to be a pretty common operation, so a helper func to count records, without needing to return them and hold them in memory, may be useful.

count, err := store.Count(bolthold.Where("field").Eq("Whatever"))

consider Find(<*T>) as well as Find(<*[]T>)

Hi,

can you consider improving Find so it can also handle single structs?

It would return an error if the input is a struct and no rows are found, unlike with a slice input.

Right now I work around it with something like this:

func (m ModelAware) FindOne(out interface{}, q *bolthold.Query) error {
	rslice := reflect.New(reflect.SliceOf(reflect.TypeOf(out).Elem()))
	slice := rslice.Interface()
	err := m.Store.Find(slice, q.Limit(1))
	if err != nil {
		return err
	}
	if rslice.Elem().Len() < 1 { // rslice is a *[]T, so Len lives behind Elem
		return notFound(out)
	}
	reflect.ValueOf(out).Elem().Set(rslice.Elem().Index(0)) // set through the pointer
	return nil
}

I suspect it generates a few needless allocations.

Create Query via JSON

I just started playing with bolthold this evening and it was quick and easy to setup. Really digging it so far.

Maybe this is already supported, but is there any way to construct a *Query via JSON? It would make for a nice search capability on a REST endpoint I'm considering, since it would allow a query to be passed as JSON and evaluated.

I could come up with my own mechanism, but it would take some work to map onto the fluent aspects of the current Query, given its hidden internals.

bolthold.Get does not populate ID field with boltholdKey tag

I may be misunderstanding the documentation for boltholdKey, but the following example demonstrates my confusion. When using a uint64 ID field with the boltholdKey struct tag, the ID field is not populated by bolthold.Get.

package main

import (
    "os"
    "testing"

    "github.com/timshannon/bolthold"
)

type User struct {
    ID   uint64 `json:"id" boltholdKey:"id"`
    User string `json:"user"`
}

func TestInsertGet(t *testing.T) {
    os.Remove("test.db")

    db, err := bolthold.Open("test.db", 0600, nil)
    if err != nil {
        t.Fatalf("%+v\n", err)
    }

    u1 := User{
        User: "john.doe",
    }

    err = db.Insert(bolthold.NextSequence(), &u1)
    if err != nil {
        t.Fatalf("%+v\n", err)
    }
    t.Logf("u1: %+v\n", u1)

    var u2 User
    err = db.Get(u1.ID, &u2)
    if err != nil {
        t.Fatalf("%+v\n", err)
    }
    t.Logf("u2: %+v\n", u2)
}

This produces the following output:

=== RUN   TestInsertGet
--- PASS: TestInsertGet (0.00s)
    main_test.go:32: u1: {ID:1 User:john.doe}
    main_test.go:39: u2: {ID:0 User:john.doe}
PASS
ok      github.com/bemasher/bolthold    0.282s

Why is the ID field on u2 not being populated by db.Get?

consider WTx / RTx

hi,

If it is not inadvisable to have two functions such as

func (s Storer) Wtx(fn func(Storer)error)error{...}
func (s Storer) Rtx(fn func(Storer)error)error{...}

I would like to suggest it for addition.

Right now I have some supplementary code such as:

package model

import (
	"github.com/timshannon/bolthold"
	"go.etcd.io/bbolt"
)

type Storer interface {
	Begin(writable bool) (Storer, Tx)
	WTx(fn func(Storer) error) error
	MustWTx(fn func(Storer) error)
	RTx(fn func(Storer) error) error

	FindAggregate(data interface{}, query *bolthold.Query, groupBy ...string) ([]*bolthold.AggregateResult, error)
	Find(data interface{}, query *bolthold.Query) error
	Insert(key, data interface{}) error
	Update(key, data interface{}) error
	Upsert(key, data interface{}) error
	Delete(key, data interface{}) error
	DeleteMatching(data interface{}, query *bolthold.Query) error
	Get(key, result interface{}) error
}

type txStore struct {
	store *bolthold.Store
	tx    *bbolt.Tx
}

func (m txStore) Begin(writable bool) (Storer, Tx) {
	return m, m.tx
}
func (m txStore) WTx(fn func(Storer) error) error {
	return fn(m)
}
func (m txStore) MustWTx(fn func(Storer) error) {
	err := m.WTx(fn)
	if err != nil {
		panic(err)
	}
}
func (m txStore) RTx(fn func(Storer) error) error {
	return fn(m)
}

func (t txStore) Get(key, result interface{}) error {
	return t.store.TxGet(t.tx, key, result)
}
func (t txStore) FindAggregate(data interface{}, query *bolthold.Query, groupBy ...string) ([]*bolthold.AggregateResult, error) {
	return t.store.TxFindAggregate(t.tx, data, query, groupBy...)
}
func (t txStore) Find(data interface{}, query *bolthold.Query) error {
	return t.store.TxFind(t.tx, data, query)
}
func (t txStore) Insert(key, data interface{}) error {
	return t.store.TxInsert(t.tx, key, data)
}
func (t txStore) Update(key interface{}, data interface{}) error {
	return t.store.TxUpdate(t.tx, key, data)
}
func (t txStore) Upsert(key interface{}, data interface{}) error {
	return t.store.TxUpsert(t.tx, key, data)
}
func (t txStore) Delete(key interface{}, data interface{}) error {
	return t.store.TxDelete(t.tx, key, data)
}
func (t txStore) DeleteMatching(data interface{}, query *bolthold.Query) error {
	return t.store.TxDeleteMatching(t.tx, data, query)
}

type Tx interface {
	Commit() error
	Rollback() error
}

type Store struct {
	store *bolthold.Store
}

func (m Store) Begin(writable bool) (Storer, Tx) {
	tx, err := m.store.Bolt().Begin(writable)
	if err != nil {
		panic(err)
	}
	mm := &txStore{
		store: m.store,
		tx:    tx,
	}
	return mm, tx
}
func (m Store) WTx(fn func(Storer) error) error {
	mm, tx := m.Begin(true)
	err := fn(mm)
	if err != nil {
		tx.Rollback() // don't commit a failed write
		return err
	}
	return tx.Commit()
}
func (m Store) MustWTx(fn func(Storer) error) {
	err := m.WTx(fn)
	if err != nil {
		panic(err)
	}
}
func (m Store) RTx(fn func(Storer) error) error {
	mm, tx := m.Begin(false)
	defer tx.Rollback() // read-only transactions must be rolled back, not committed
	return fn(mm)
}

func (m Store) Get(key, value interface{}) error {
	return m.store.Get(key, value)
}
func (m Store) FindAggregate(data interface{}, query *bolthold.Query, groupBy ...string) ([]*bolthold.AggregateResult, error) {
	return m.store.FindAggregate(data, query, groupBy...)
}
func (m Store) Find(data interface{}, query *bolthold.Query) error {
	return m.store.Find(data, query)
}
func (m Store) Insert(key, data interface{}) error {
	return m.store.Insert(key, data)
}
func (m Store) Update(id interface{}, data interface{}) error {
	return m.store.Update(id, data)
}
func (m Store) Upsert(id interface{}, data interface{}) error {
	return m.store.Upsert(id, data)
}
func (m Store) Delete(id interface{}, data interface{}) error {
	return m.store.Delete(id, data)
}
func (m Store) DeleteMatching(data interface{}, query *bolthold.Query) error {
	return m.store.DeleteMatching(data, query)
}

so I can isolate code within a tx by simply wrapping it; it also ensures that I do not open a transaction within another tx, and the API is simpler and cleaner.

order by or sort?

I found query.skip() and query.limit() methods, but is there any way to sort the results?

let's say I have a blog and I want to show the 10 most recent posts matching some tags. Should I fetch all posts and sort them? Wouldn't that defeat the purpose of skip and limit (at least for pagination)?

any suggestion?

plans to move to bbolt

As you may or may not be aware, the original boltdb repo has been archived, and the community has moved over to the coreos maintained version, bbolt (https://github.com/coreos/bbolt) which includes some additional bug fixes, and continued maintenance. I'd recommend taking a look at it and potentially moving over to it, as it should be backwards compatible.

best practices using boltholdKey tag

Hi, I'm using bolthold to store time series samples, so using time.Time as key. As the sample type includes a Time field, I have the boltholdKey tag on that field.

Is there any way to keep the data from being stored in the Value struct as well as the key? Is a gob:"-" tag recommended on all fields that have the boltholdKey tag?

improving docs by adding examples

Hi!

I was reading the latest changes; great. However, I have felt since the beginning that the documentation lacks clarity.

I would like to propose a PR to add examples like this one: https://golang.org/src/io/io_test.go#L350
They would be attached to methods, and easy to access from godoc.

I'm not sure which godoc engine runs https://godoc.org/github.com/timshannon/bolthold, or whether it is the same one as https://golang.org/pkg/bytes/, but I believe it will be updated, so even if the examples do not display today, they will tomorrow.

Index doesn't get properly updated

When I change the value of an indexed field and search for the old value, I still get results.

package main

import (
	"fmt"
	"log"

	"github.com/timshannon/bolthold"
)

type Person struct {
	Name     string
	Division string `boltholdIndex:"Division"`
}

var p1 = Person{"Hermann", "Division1"}

func main() {
	store, err := bolthold.Open("hold.db", 0666, nil)
	if err != nil {
		log.Fatalln(err)
	}

	store.Insert(p1.Name, p1)

	p1.Division = "test123"
	store.Update(p1.Name, p1)

	var result []Person
	err = store.Find(&result, bolthold.Where("Division").Eq("Division1"))
	fmt.Println(result)
}

The result is:

go run test.go
[{Hermann test123}]

consider query.Like() ?

Hi,

Have you considered a Like(thresholdComputer, threshold) criterion that would filter items given a threshold of resemblance?

Where resemblance could be a Levenshtein distance, or something more complex.

(I'm trying to figure out how I will handle text search, for the bigger picture.)

Nested Structs and Where

Say you have the following structure:

type Repo struct {
  Name string
  Contact ContactPerson
}

type ContactPerson struct {
  Name string
}

The following doesn't seem to work: bolthold.Where("Contact.Name").Eq("some-name"), given I've inserted a bunch of Repo values in this example.

Is this not supported?

move query/criterion out so we can reuse it against a regular SQL db?

Do you have any thoughts about that? I see you have already done badger and boltdb versions; they are similar (?), yet they use different base source code.

I'm interested in using a regular SQL db the way I use bolt.
In my opinion, the current query API is definitely capable of this.

I still see some advantages in using a SQL db; it's definitely more mature, yet I really like the way I work with bolt.

Do you have any thoughts or ideas to share about this ?

Add Skip and Limit

Other databases have options of skip and limit to help with pagination. I'm thinking I'll handle this by adding a Row field selector similar to how you can compare against the Key with bolthold.Key()

So a query may look like this:

bolthold.Where(bolthold.Row()).Ge(0).And(bolthold.Row()).Lt(100)

It allows you to skip and limit.

panic with invalid page type

panic: invalid page type: 0: 4
panic: invalid freelist page: 0, page type is meta [recovered]
panic: invalid freelist page: 0, page type is meta
goroutine 228 [running]:
github.com/PPIO/go-ppio/util/sys.Go.func1.1()
/home/workspace/go/src/github.com/PPIO/go-ppio/util/sys/panic.go:55 +0x76
panic(0xd27bc0, 0xc42093a9c0)
/usr/lib/go/src/runtime/panic.go:502 +0x229
go.etcd.io/bbolt.(*freelist).read(0xc4202ef230, 0x7fa4f8655000)
/home/workspace/go/src/go.etcd.io/bbolt/freelist.go:237 +0x2fd
go.etcd.io/bbolt.(*freelist).reload(0xc4202ef230, 0x7fa4f8655000)
/home/workspace/go/src/go.etcd.io/bbolt/freelist.go:297 +0x50
go.etcd.io/bbolt.(*Tx).rollback(0xc4201880e0)
/home/workspace/go/src/go.etcd.io/bbolt/tx.go:267 +0xd0
go.etcd.io/bbolt.(*DB).Update.func1(0xc4201880e0)
/home/workspace/go/src/go.etcd.io/bbolt/db.go:662 +0x3e
panic(0xd27bc0, 0xc42093a9a0)
/usr/lib/go/src/runtime/panic.go:502 +0x229
go.etcd.io/bbolt.(*Cursor).search(0xc4201195f8, 0xc42091dee0, 0xb, 0x10, 0x4)
/home/workspace/go/src/go.etcd.io/bbolt/cursor.go:250 +0x388
go.etcd.io/bbolt.(*Cursor).seek(0xc4201195f8, 0xc42091dee0, 0xb, 0x10, 0x0, 0x0, 0xe27560, 0xc4209122e0, 0xc420119520, 0xc42095afc0, ...)
/home/workspace/go/src/go.etcd.io/bbolt/cursor.go:159 +0xa5
go.etcd.io/bbolt.(*Bucket).CreateBucket(0xc4201880f8, 0xc42091dee0, 0xb, 0x10, 0xc42091deeb, 0x5, 0xcb6400)
/home/workspace/go/src/go.etcd.io/bbolt/bucket.go:165 +0xfa
go.etcd.io/bbolt.(*Bucket).CreateBucketIfNotExists(0xc4201880f8, 0xc42091dee0, 0xb, 0x10, 0xb, 0x10, 0xc42091dee0)
/home/workspace/go/src/go.etcd.io/bbolt/bucket.go:199 +0x4d
go.etcd.io/bbolt.(*Tx).CreateBucketIfNotExists(0xc4201880e0, 0xc42091dee0, 0xb, 0x10, 0xb, 0x10, 0x2)
/home/workspace/go/src/go.etcd.io/bbolt/tx.go:115 +0x4f
github.com/timshannon/bolthold.(*Store).TxInsert(0xc4200c4630, 0xc4201880e0, 0xd27bc0, 0xc42093a990, 0xe27560, 0xc4201baa00, 0x0, 0xc4201198f0)
/home/workspace/go/src/github.com/timshannon/bolthold/put.go:50 +0xc3
github.com/timshannon/bolthold.(*Store).Insert.func1(0xc4201880e0, 0x1009490, 0xc4201880e0)
/home/workspace/go/src/github.com/timshannon/bolthold/put.go:38 +0x58
go.etcd.io/bbolt.(*DB).Update(0xc4203023c0, 0xc420119958, 0x0, 0x0)
/home/workspace/go/src/go.etcd.io/bbolt/db.go:670 +0x90
github.com/timshannon/bolthold.(*Store).Insert(0xc4200c4630, 0xd27bc0, 0xc42093a990, 0xe27560, 0xc4201baa00, 0x0, 0x0)
/home/workspace/go/src/github.com/timshannon/bolthold/put.go:37 +0x8a

Two Structs, separated the db file or join in one file

I have two or more structs. Should I use separate db files (opening each of them at init) or keep everything in one file?

type Item struct {
	ID      int64
	Name    string `boltholdIndex:"Name"`
	Command string `boltholdIndex:"Command"`
	Created int64
	Updated int64
}
type Profile struct {
	ID      int64
	Name    string `boltholdIndex:"Name"`
	Phone   string `boltholdIndex:"Phone"`
	Created int64
	Updated int64
}

struct tag to fill in key/id on results

First off, loving the library. Thank you! I noticed quickly, however, that when utilizing Find() (and similar methods), the returned list has no identification of where it came from, or what its origin key is. More specifically, if I use Find() to retrieve a list of matching values, I cannot determine what the original key was.

Reviewing query.go, it seems like this would be a fairly trivial addition, however what are your thoughts on this? For example, the simplest way one may add this is by filling in a matching struct tag.

type SomeEntry struct {
	ID string `boltholdKey`
	// Could be interchangeable to byte as well, maybe? And, field name
	// shouldn't matter either.
	Key   byte `boltholdKey`
	Field value
}

When running the query, which loops through the boltdb key/values, it could fill in the ID field if it wasn't already specified after it was unmarshaled from gob.

There are only a few potential downsides if you could call them that, being:

  • The number of checks necessary to loop through the fields to see if the struct tag exists, but it's rare to ever see structs with a large number of fields.
  • When the struct is being saved, the key field would be stored in bolt as well, unless the end user also added a gob:"-" style tag. I don't think there is any way around this.
  • Probably something else I missed. It's early in the morning for me, hah.

The only way I can see of getting around this is to store the key within the struct itself when it's being saved, but this seems non-idiomatic to me. This is what I am doing currently.

panic on bolthold.Where("Something").In("something") (only with index)

package main

import (
	"fmt"
	"log"

	"github.com/timshannon/bolthold"
)

type Person struct {
	Name     string
	Division string `boltholdIndex:"Division"`
}

var p1 = Person{"Hermann", "Division1"}

func main() {
	store, err := bolthold.Open("hold.db", 0666, nil)
	if err != nil {
		log.Fatalln(err)
	}

	store.Insert(p1.Name, p1)

	p1.Division = "test123"
	store.Update(p1.Name, p1)

	var result []Person
	err = store.Find(&result, bolthold.Where("Division").In("test123"))
	fmt.Println(result)
}
This panics with:
panic: reflect: New(nil)

goroutine 1 [running]:
reflect.New(0x0, 0x0, 0x1, 0x7f882f332000, 0x0)
	/usr/local/go/src/reflect/value.go:2130 +0x114
github.com/timshannon/bolthold.(*Criterion).test(0xc4200b25c0, 0x517400, 0xc4200b66c0, 0x40c701, 0x517400, 0xc4200b66c0, 0xc4200aba80)
	/home/rkaufmann/go/src/github.com/timshannon/bolthold/query.go:333 +0xce
github.com/timshannon/bolthold.matchesAllCriteria(0xc42000c4c8, 0x1, 0x1, 0x517400, 0xc4200b66c0, 0x1, 0xb, 0x7f882f2ce103, 0x27)
	/home/rkaufmann/go/src/github.com/timshannon/bolthold/query.go:397 +0x62
github.com/timshannon/bolthold.newIterator.func3(0xc4200aba00, 0xc4200b6680, 0x0, 0x0, 0x0, 0x506184, 0xc)
	/home/rkaufmann/go/src/github.com/timshannon/bolthold/index.go:250 +0x193
github.com/timshannon/bolthold.(*iterator).Next(0xc420012320, 0x0, 0x0, 0x606000, 0x0, 0x0, 0xc4200abc40)
	/home/rkaufmann/go/src/github.com/timshannon/bolthold/index.go:333 +0x188
github.com/timshannon/bolthold.runQuery(0xc420098380, 0x511220, 0xc4200b65e0, 0xc420018400, 0x0, 0x0, 0x0, 0x0, 0xc4200abd68, 0xc4200abd68, ...)
	/home/rkaufmann/go/src/github.com/timshannon/bolthold/query.go:497 +0x28b
github.com/timshannon/bolthold.findQuery(0xc420098380, 0x510160, 0xc4200b65c0, 0xc420018400, 0x1, 0xc420098380)
	/home/rkaufmann/go/src/github.com/timshannon/bolthold/query.go:602 +0x2f4
github.com/timshannon/bolthold.(*Store).TxFind(0xc42000c0d8, 0xc420098380, 0x510160, 0xc4200b65c0, 0xc420018400, 0xc4200abe20, 0xc420016180)
	/home/rkaufmann/go/src/github.com/timshannon/bolthold/get.go:58 +0x49
github.com/timshannon/bolthold.(*Store).Find.func1(0xc420098380, 0x553520, 0xc420098380)
	/home/rkaufmann/go/src/github.com/timshannon/bolthold/get.go:52 +0x4f
github.com/boltdb/bolt.(*DB).View(0xc42009e000, 0xc4200abe90, 0x0, 0x0)
	/home/rkaufmann/go/src/github.com/boltdb/bolt/db.go:629 +0x9f
github.com/timshannon/bolthold.(*Store).Find(0xc42000c0d8, 0x510160, 0xc4200b65c0, 0xc420018400, 0xc420018400, 0x0)
	/home/rkaufmann/go/src/github.com/timshannon/bolthold/get.go:53 +0x98
main.main()
	/home/rkaufmann/bolt.go:29 +0x309
exit status 2

Add Query optimizer

Select the best index of all fields in a query by finding which index is most unique / requires the least amount of reads.

That index would be determined by one of the following:

  • Find the index with the most rows, or more specifically, the fewest keys per row.
  • Or try all of the indexes simultaneously and see which one comes back first

Given Go's excellent concurrency control, I'm leaning towards the second option, but we'll do both and see which one has better benchmarks.

consider an alternative query.Where("").Match()

Hi,

instead of MatchFunc(func(ra *RecordAccess) (bool, error)), can you consider an alternative Match(func(<T>) bool)?

Where <T> is user-defined, and you just have to check type compatibility before invoking the given function via reflect.

That would be a really cool improvement to the API.
The current MatchFunc implementation has a smelly taste, IMVHO.

I sketched an example impl here https://play.golang.org/p/t-VvzLu6wC8

Any possible support nested bucket?

Currently, using Upsert or Insert to insert or update a value is convenient and requires no bucket creation, but some cases need nested buckets, which bolthold does not currently support.

Is there any plan to support nested buckets?

bolthold.Key

hi,

what is the deal with bolthold.Key? A few times I hit difficulties and confusion because I used the raw field name instead of this special key.

Is it possible to make this totally optional?

Encoder and decoder are set globally

When specifying another encoder or decoder when opening a bolthold database, the encoder or decoder is changed globally over the project. This is a problem if you use multiple bolthold databases at the same time. For example, we use bolthold in our main application and one of the libraries we use in our tests also uses bolthold. The library however uses another encoder and decoder than we do.

A consequence of this is that as soon as the library opens its database and changes the decoder, the database in our main application can no longer be used. It gives an error that information cannot be encoded or decoded anymore.

In the following code you can see that the encoder and decoder are set globally over the project:

bolthold/encode.go

Lines 18 to 19 in 525de81

var encode EncodeFunc
var decode DecodeFunc

nested structs and find

There seems to be an issue with nested structs that are pointers; example of breaking code here: https://play.golang.org/p/0LWo8-91p2Z

I have applied the following patch to query.go and it seems to be working. I am not very familiar with the reflect package, so I am unsure if this is the best way to fix it.

--- query.go	2019-03-14 14:15:46.000000000 -0600
+++ ../query.fix.go	2019-03-14 14:17:04.000000000 -0600
@@ -778,11 +778,7 @@
 		func(r *record) error {
 			var rowValue reflect.Value
 
-			if elType.Kind() == reflect.Ptr {
-				rowValue = r.value
-			} else {
-				rowValue = r.value.Elem()
-			}
+			rowValue = r.value.Elem()
 
 			if keyType != nil {
 				err := decode(r.key, rowValue.FieldByName(keyField).Addr().Interface())
@@ -791,7 +787,11 @@
 				}
 			}
 
-			sliceVal = reflect.Append(sliceVal, rowValue)
+			if elType.Kind() == reflect.Ptr {
+				sliceVal = reflect.Append(sliceVal, rowValue.Addr())
+			} else {
+				sliceVal = reflect.Append(sliceVal, rowValue)
+			}
 
 			return nil
 		})

Aggregates

With the addition of SubQueries in #2, having aggregates would be very powerful.
Aggregate functions will need to end a query chain, and so must always come at the end.

Something like:

bolthold.Where("Fieldname").Eq(value).GroupBy("Fieldname")

// reductions
bolthold.Where("Fieldname").Eq(value).Avg("Fieldname")
bolthold.Where("Fieldname").Eq(value).Min("Fieldname")
bolthold.Where("Fieldname").Eq(value).Max("Fieldname")
bolthold.Where("Fieldname").Eq(value).Count("Fieldname")

//Or group + reductions
bolthold.Where("Fieldname").Eq(value).GroupBy("Fieldname").Avg("Fieldname")
bolthold.Where("Fieldname").Eq(value).GroupBy("Fieldname").Min("Fieldname")
bolthold.Where("Fieldname").Eq(value).GroupBy("Fieldname").Max("Fieldname")
bolthold.Where("Fieldname").Eq(value).GroupBy("Fieldname").Count("Fieldname")

The Avg, Min, and Max aggregates will have to be on numeric fields, or alternatively specify a function in the aggregate that returns a float64.

bolthold.Where("Fieldname").Eq(value).Max(func(field interface{}) float64 {
   // return a float representing the numerical value of this field being aggregated
})

The result of all grouped aggregates will have to be a new type:

type AggregateResult struct {
  Group interface{}  //Field value grouped by
  // List of records in the grouping. With Min and Max, the reduction will be a single record, with the min/max in the value field.
  // With Avg and Count, the reduction will be empty, and only the value field populated.
  Reduction []interface{}
  Value interface{}
}

This is a first draft on a plan, and I'm sure I'm missing some things, or some things will need to change when I implement it.

improve bolthold.Where / query.And

Hi,

some considerations regarding the query API

The current scheme using bolthold.Where / Query.And over a pointer is not, in my opinion, the best strategy.

In this silly example, the user has to deal with repetitive code:

var q *bolthold.Query
if title != "" {
	if q == nil {
		q = bolthold.Where("Title").Eq(title)
	} else {
		q = q.And("Title").Eq(title)
	}
}
if otherthing != "" {
	if q == nil {
		q = bolthold.Where("Otherthing").Eq(otherthing)
	} else {
		q = q.And("Otherthing").Eq(otherthing)
	}
}
if q == nil {
	q = &bolthold.Query{}
}
q.Limit(10)
store.Find(some{}, q)

Something is broken in this code; it could be factored out. A much better out-of-the-box version could look like:

var q bolthold.Query
if title != "" {
	q.And("Title").Eq(title) // or q.Where
}
if otherthing != "" {
	q.And("Otherthing").Eq(otherthing) // or q.Where
}
q.Limit(10)
store.Find(some{}, q)

The And/Where mechanics are simplified: in the example, they behave exactly the same way, no matter which was called first.
Because q is declared as a value type, there is no need for new or anything else.

Since q is a value type, the Find method's query parameter becomes a value type as well, so writing a criterion-less query is painful:

var q bolthold.Query
store.Find(some{}, q) // vs nil previously

A solution is to adjust the Find method to take a variadic query parameter, so the user can write:

store.Find(some{})

Am I missing something?

Improve handling of missing indexes

Similar to #9, when performing an index query for a type that doesn't yet have any records, the API returns an error because the index doesn't exist.

Given that the cause is partly that there are no records to find, would it make more sense for Find() to return a nil error with no results, or an ErrNotFound error?

The former seems more logical to me, so I'm going to open a PR for that shortly.

Is bolthold practical for time series?

Hi, I'm considering using bolthold for time series data. Ideally, with a time series database, you can filter data by a time range and quickly pull out the data you want without having to scan the entire bucket. If I were building on boltdb directly, I could do something like create separate buckets for each week of the year. Do you have any thoughts on whether it is practical to store time series data using bolthold?

The application for this is simple setups where we are storing data from a handful of sensors over the course of years on an embedded Linux system, where we can't justify something heavier like influxdb. I think there is also an application for some http://simpleiot.org installations where someone wants to set up an instance to log data from a handful of sensors without messing with a separate DB.

Instead of using the entire reflect library, consider using a version of the library with the debugging features removed

I was considering using this or the badger version of your library, but you use the reflect library, a decent portion of which is debugging code. However, there are "lite" versions of that library that have much of it cut out. There is a good chance you can find a version of the library with the pieces you need for your aggregate functionality without having the unsafe stuff included.

Just a consideration, feel free to close this if you disagree.

ForEach

Run a function for each matching record of a query without having to hold the entire dataset in memory.

Badger support

Love this. Any plans for badger support? Anything you know of that would make it difficult to port? Thx.

Add Not() query criteria modifier

Per the discussion in #48.

It'd be nice to have a general negation operator that could be put in front of any other criterion.

Where("field").Not().In(val1, val2, val3)
Where("field").Not().MatchFunc(func(ra *RecordAccess) (bool, error))
Where("field").Not().IsNil()

Etc

Improve MatchFunc

The MatchFunc criterion is pretty powerful, as it lets you express whatever criteria you want in Go code; however, it could be made much more powerful with a few changes:

Current MatchFunc definition

type MatchFunc func(field interface{}) (bool, error)

Changed MatchFunc definition

type MatchFunc func(records *RecordAccess) (bool, error)

type RecordAccess struct {}

func (r *RecordAccess) Field() interface{} {
  //returns currently selected criterion field
}

func (r *RecordAccess) Record() interface{} {
  //returns the current record in its entirety
}

func (r *RecordAccess) SubQuery(result interface{}, query *Query) error {
   // Run a subquery to compare against the current record  or field
}

coreos/bbolt code changes break bolthold

Hi Tim,

Great work you have done here,

There is an issue that occurs due to recent code changes by the coreos/bbolt team.

The change means the bolt reference is no longer correct and causes errors when building, like below.

To fix it in my version, I made a change like this:

import (
bolt "github.com/coreos/bbolt"
)

The errors i encountered earlier are listed below:

github.com/timshannon/bolthold
./aggregate.go:12:2: imported and not used: "github.com/coreos/bbolt"
./delete.go:10:2: imported and not used: "github.com/coreos/bbolt"
./get.go:10:2: imported and not used: "github.com/coreos/bbolt"
./index.go:12:2: imported and not used: "github.com/coreos/bbolt"
./put.go:11:2: imported and not used: "github.com/coreos/bbolt"
./query.go:15:2: imported and not used: "github.com/coreos/bbolt"
./store.go:12:2: imported and not used: "github.com/coreos/bbolt"
./store.go:17:6: undefined: bolt
./store.go:25:2: bolt is not a package

A bug about MatchFunc with indexed field

Hi,

This awesome project is very helpful for me. Thanks a lot.

When I added an index tag to Name, I got a bug:

=== RUN TestMatchFunc
--- FAIL: TestMatchFunc (0.03s)
bolthold_test.go:27: gob: decoding into local type *bolthold.MatchFunc, received remote type string

test code:

func TestMatchFunc(t *testing.T) {
	store, err := bolthold.Open("test.db", 0666, nil)
	if err != nil {
		t.Error(err)
	}
	defer store.Close()
	if err := store.Insert("key", &Item{
		Name:    "Test Name",
		Created: time.Now(),
	}); err != nil {
		t.Error(err)
	}
	result := []Item{}
	if err := store.Find(&result, bolthold.Where("Name").MatchFunc(func(ra *bolthold.RecordAccess) (bool, error) {
		t.Logf("%T,%T", ra.Field(), ra.Record())
		return ra.Field() == "Test Name", nil
	})); err != nil {
		t.Error(err)
	} else {
		t.Log(result)
	}
}

type Item struct {
	Name    string    `boltholdIndex:"Name"`
	Created time.Time `boltholdIndex:"Created"`
}

Alternative key namespace designs

Looking at the structure of the storage, it seems an entire struct is encoded into one binary blob and saved by entity + id (along with any indexes). For example, taking the simple example code here, the resulting database is:

BUCKET "Person"
  KEY    "\n\f\x00\aHermann" = "*\xff\x81\x03\x01\x01\x06Person\x01\xff\x82\x00\x01\x02\x01\x04Name\x01\f\x00\x01\bDivision\x01\f\x00\x00\x00\x15\xff\x82\x01\aHermann\x01\atest123\x00"
BUCKET "_index:Person:Division"
  KEY    "\n\f\x00\atest123" = "\x15\xff\x83\x02\x01\x01\akeyList\x01\xff\x84\x00\x01\n\x00\x00\x10\xff\x84\x00\x01\v\n\f\x00\aHermann"

Looking at the object storage and hydration code, it is clear that having all the fields in one place is easy to reason about. With indexes on every field used for lookup, you can also quickly search by value.

I'm curious if you ever thought about designing the layout to be more "flat" to help speed up range/prefix/suffix scans without requiring indexes on all search columns.

person:name:1 = Hermann
person:name:2 = Sam
person:division:1 = 31
person:division:2 = 23
etc...

b.Cursor().Seek([]byte("person:name:")) ....

After scanning the values of each key matching the prefix/range, one problem is knowing beforehand all the columns that exist on a given struct (including nested structs) so that you can reassemble the whole object. (That might require storing the struct's reflected field metadata on first save.)

A possible solution could be namespacing by ID instead of field. Assuming you could figure out how to split keys up to pull out names correctly (the split on ":" in this simple example isn't very safe).

person:1:name = Hermann
person:1:division = 31
person:2:name = Sam
person:2:division = 23
etc...

Querying Nested Slices?

Apologies if this is obvious, but I couldn't find it documented anywhere and none of the available query operators seemed to cover the use case of querying nested slices. Is there a recommended approach for querying a nested slice? For example, how would I find all ItemTests that include a Tags slice containing "sausage"?

data := []ItemTest{
	ItemTest{
		Key:  1,
		ID:   32,
		Name: "pizza",
		Tags: []string{"cheese", "sausage"},
	},
	ItemTest{
		Key:  1,
		ID:   32,
		Name: "pizza",
		Tags: []string{"cheese", "sausage", "pepperoni"},
	},
	ItemTest{
		Key:  1,
		ID:   32,
		Name: "pizza",
		Tags: []string{"cheese", "pepperoni"},
	},
}

Add sorting / ordering

.sortBy("fieldname")
or
.orderBy("fieldname")

Skip and Limit will have to be handled properly in combination with sorting.
Add documentation noting that the fastest sort is always on the Key field.

Crashes when getting an object of a type not previously registered.

That is,

type Foo ...

var result Foo
db.Get("someKey", &result)

results in

panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x141679d]

goroutine 23 [running]:
testing.tRunner.func1(0xc420122340)
	/usr/local/go/src/testing/testing.go:621 +0x554
panic(0x1557dc0, 0x184c690)
	/usr/local/go/src/runtime/panic.go:489 +0x2ee
github.com/boltdb/bolt.(*Bucket).Get(0x0, 0xc4200d4b10, 0x11, 0x40, 0x0, 0x20, 0x0)
	/Users/jb/src/github.com/boltdb/bolt/bucket.go:257 +0x5d
github.com/timshannon/bolthold.(*Store).TxGet(0xc42008a188, 0xc420123450, 0x1538de0, 0xc4201541a0, 0x151b680, 0xc420121ee0, 0x1420bee, 0xc4200e3c70)
	/Users/jb/src/github.com/timshannon/bolthold/get.go:33 +0x18b
github.com/timshannon/bolthold.(*Store).Get.func1(0xc420123450, 0x15e4268, 0xc420123450)
	/Users/jb/src/github.com/timshannon/bolthold/get.go:19 +0x74
github.com/boltdb/bolt.(*DB).View(0xc4200776c0, 0xc4200e3cf0, 0x0, 0x0)
	/Users/jb/src/github.com/boltdb/bolt/db.go:578 +0xc7
github.com/timshannon/bolthold.(*Store).Get(0xc42008a188, 0x1538de0, 0xc4201541a0, 0x151b680, 0xc420121ee0, 0x0, 0xc4200e3d90)
	/Users/jb/src/github.com/timshannon/bolthold/get.go:20 +0x106
...

apparently because the bucket doesn't exist when the get happens. (I'll try to fix this.)

How to recycle space

I have insufficient free space on my device. I deleted all log records from the store (roughly 5 GB of them), but the space was not reclaimed.
