zshipko / rust-kv
An embedded key/value store for Rust
License: ISC License
Hi, I have an issue where it is not possible to open multiple readers at the same time (LMDB(BadRslot)). Is this intended, or am I doing something wrong? I thought LMDB followed the single writer/multiple readers model?
pub fn from_disk<T: AsRef<Path>>(path: T) -> Store {
    let mut cfg = Config::default(path);
    cfg.set_map_size(50_000_000_000);
    cfg.set_max_readers(128);
    cfg.bucket("metadata", None);
    let store = kv::Store::new(cfg).unwrap();
    {
        let a = store.read_txn().unwrap();
        // second read handle not possible
        let b = store.read_txn().unwrap();
    }
    Store { store: Some(store) }
}
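For reference, the single writer/multiple readers semantics the reporter expects are the same ones std's `RwLock` provides. A minimal std-only sketch (not using kv or LMDB) of what holding two simultaneous read handles should look like:

```rust
use std::sync::RwLock;

fn main() {
    let store = RwLock::new(vec![("metadata", "value")]);

    // Two read guards may be held at the same time under a
    // single-writer/multiple-readers model...
    let a = store.read().unwrap();
    let b = store.read().unwrap();
    assert_eq!(a[0].1, "value");
    assert_eq!(b[0].1, "value");
    drop(a);
    drop(b);

    // ...while a writer requires exclusive access.
    let mut w = store.write().unwrap();
    w.push(("other", "data"));
    assert_eq!(w.len(), 2);
}
```

Note that LMDB additionally ties read transactions to reader slots, which is where errors like BadRslot can come from; the sketch only illustrates the expected concurrency model, not LMDB's slot bookkeeping.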
but there is no flag to enable the compression feature
Hello,
I was a bit confused about the purpose of Config::temporary ... The config value seems entirely unused?
Also, for the "flush frequency": does that flush happen in the background as well, in a separate thread or on some kind of schedule, or only when actually modifying the store?
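I don't know how kv actually implements its flush frequency, but the "separate thread on a schedule" variant the question describes is commonly built from a channel with a receive timeout. A hypothetical std-only sketch (none of these names are from kv):

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn main() {
    let (tx, rx) = mpsc::channel::<&'static str>();

    // Background flusher: wakes either on a write notification or on a
    // timeout tick, i.e. "a separate thread on some kind of schedule".
    let flusher = thread::spawn(move || {
        let mut flushed = Vec::new();
        loop {
            match rx.recv_timeout(Duration::from_millis(10)) {
                // A write arrived: record ("flush") it.
                Ok(item) => flushed.push(item),
                // No writes within the interval: a periodic tick would
                // flush any buffered state here.
                Err(mpsc::RecvTimeoutError::Timeout) => continue,
                // All senders dropped: shut down.
                Err(mpsc::RecvTimeoutError::Disconnected) => break,
            }
        }
        flushed
    });

    tx.send("write-1").unwrap();
    tx.send("write-2").unwrap();
    drop(tx); // closing the channel stops the flusher

    assert_eq!(flusher.join().unwrap(), vec!["write-1", "write-2"]);
}
```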
Hi, awesome work with kv.
I was looking for a simplified interface to sled, so this is cool.
Performance:
Can you expand on why you chose Sled, did you do any performance benchmarks against Rocks or LMDB before this?
Anything you uncovered?
Custom Encoding / Serialization:
Any pathway for me to integrate a custom serde encoder/decoder?
I have a scheme I want to use that isn't listed, for encoding / decoding keys.
Awesome work, happy holidays.
Would you care to explain why one would choose to use kv over using sled directly? I'm new to the Rust ecosystem and at first glance it looks like kv is just an unnecessary layer on top of sled ... but I'm sure there is a good reason for its existence. So, could you please put the explanation in the project README?
I am bulk loading about 130m items of variable size. Initially it was running at 12k/sec, but as time goes on it is down to 7k/sec. This is on a Macbook Pro, SSD storage. Changing the write path to just put in a big binary file is like lightning, so the slowdown is from kv.
Would be cool to allow a transaction to span multiple buckets. This would make it possible to atomically move an element from one bucket to another. sled already has support for this: https://docs.rs/sled/0.34.6/sled/transaction/trait.Transactional.html#foreign-impls
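The "atomic move between two buckets" use case can be illustrated with a std-only sketch, where one lock held over both maps plays the role of a multi-bucket transaction (this is not sled's Transactional API, just the semantics being requested):

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Moving an entry between two "buckets" atomically: because a single lock
// covers both maps, either the removal and the insertion both happen, or
// neither does.
fn move_entry(
    buckets: &Mutex<(HashMap<String, String>, HashMap<String, String>)>,
    key: &str,
) -> bool {
    let mut guard = buckets.lock().unwrap();
    let (src, dst) = &mut *guard;
    match src.remove(key) {
        Some(value) => {
            dst.insert(key.to_string(), value);
            true
        }
        None => false,
    }
}

fn main() {
    let buckets = Mutex::new((HashMap::new(), HashMap::new()));
    buckets
        .lock()
        .unwrap()
        .0
        .insert("id".to_string(), "42".to_string());

    assert!(move_entry(&buckets, "id"));
    let guard = buckets.lock().unwrap();
    assert!(!guard.0.contains_key("id"));
    assert_eq!(guard.1.get("id").map(String::as_str), Some("42"));
}
```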
Methods on Bucket that receive a Key parameter should accept a reference to the key. This would allow using the same key in multiple operations without cloning. sled accepts an AsRef<[u8]> for the underlying operations. By doing the same in kv, it can support references without changing the API for existing code.
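The suggested API shape can be sketched std-only (the `Bucket` type here is a stand-in backed by a `BTreeMap`, not kv's actual type): taking `impl AsRef<[u8]>` lets callers pass `&str`, `String`, `&[u8]`, or `Vec<u8>` and reuse the same key without cloning.

```rust
use std::collections::BTreeMap;

// Hypothetical bucket keyed by raw bytes, mirroring the suggestion.
struct Bucket {
    inner: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl Bucket {
    fn set(&mut self, key: impl AsRef<[u8]>, value: impl AsRef<[u8]>) {
        self.inner
            .insert(key.as_ref().to_vec(), value.as_ref().to_vec());
    }

    fn get(&self, key: impl AsRef<[u8]>) -> Option<&[u8]> {
        self.inner.get(key.as_ref()).map(Vec::as_slice)
    }
}

fn main() {
    let mut bucket = Bucket { inner: BTreeMap::new() };
    let key = String::from("user:1");

    // The same key works for several operations, by reference, no clone.
    bucket.set(&key, "alice");
    assert_eq!(bucket.get(&key), Some(b"alice".as_slice()));
    // A plain &str works through the same signature.
    assert_eq!(bucket.get("user:1"), Some(b"alice".as_slice()));
}
```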
Hello!
Coincidentally, I have also been working on a wrapper around LMDB.
I've spent some time working on things that I don't yet see represented here:
- failure
- bincode/serde, so it can handle storage of almost any serializable object.

Naturally I'd like to avoid wasted effort in any direction, and either merge some features, use existing libraries as layers, or clarify design decisions.
What's your plan for the future of your library? Do you see it as relatively stable as is (that is, you're already meeting your own needs), or are you hoping to grow it into something more substantial?
Hey there,
I get a segmentation fault when I return the Iterator, but not the corresponding Cursor, like so:
pub fn publications(&self) -> impl Iterator<Item = (u64, Publication)> {
    self.txn.read_cursor(&self.bucket).unwrap().iter()
        .map(|(a, b)| (u64::from(a), b.inner().unwrap().to_serde()))
}
This results in a Segmentation Fault. My solution was to wrap the Cursor in a separate Iterator struct and implement the next function manually.
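The workaround described, a wrapper struct that owns its data source and implements `next` manually, can be sketched std-only. A `Vec` iterator stands in for the cursor; the point is that the wrapper keeps the underlying source alive for as long as the iteration runs, instead of returning an iterator whose source is dropped when the function returns:

```rust
// Hypothetical wrapper mirroring the described fix: it *owns* the source.
struct Publications {
    // Stand-in for the cursor: the wrapper holds the data itself.
    items: std::vec::IntoIter<(u64, String)>,
}

impl Iterator for Publications {
    type Item = (u64, String);

    // The manually implemented `next` just delegates to the owned source.
    fn next(&mut self) -> Option<Self::Item> {
        self.items.next()
    }
}

fn publications() -> Publications {
    let rows = vec![(1_u64, "first".to_string()), (2, "second".to_string())];
    Publications { items: rows.into_iter() }
}

fn main() {
    let all: Vec<_> = publications().collect();
    assert_eq!(
        all,
        vec![(1, "first".to_string()), (2, "second".to_string())]
    );
}
```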
I used &str as the key and Vec as the value. The size of the value is 1GB.
I tried a couple of cases below and noticed that the generated blob files were always kept on disk.
Case 1: Set a test k-v pair with the above types; get by key; modify the vector; set it back with the original key. As a result, in the "blobs" folder, two files, instead of one, were left, occupying around 2GB of disk space.
Case 2: Set the k-v pair; remove by key; call "contains" to make sure the key has been removed from the store. But on disk, under the "blobs" folder, a 1GB file still resides.
Could you please let me know how to delete the old blob file automatically after the corresponding value is removed or replaced? Thanks.
This seems, at least for K = &str, to make it impossible to iterate over a cursor.
To reproduce, add the following to the main example in src/lib.rs:
{
    // This time a readonly transaction
    let txn = store.read_txn()?;
    // Getting a value is easy once everything is set up
    let curs = txn.read_cursor(&bucket)?;
    let all: Vec<(String, String)> =
        curs.map(|(k, v)| (k.to_string(), v.to_string())).collect();
}
and the resulting error is
error[E0599]: no method named `map` found for type `kv::cursor::Cursor<'_, &str, &str>` in the current scope
--> src/bin/foo.rs:44:18
|
44 | curs.map(|(k, v)| (k.to_string(), v.to_string())).collect();
| ^^^ method not found in `kv::cursor::Cursor<'_, &str, &str>`
|
= note: the method `map` exists but the following trait bounds were not satisfied:
`&mut kv::cursor::Cursor<'_, &str, &str> : std::iter::Iterator`
due to this trait bound
impl<'a, K: Key, V: Value<'a>> Iterator for Iter<'a, K, V>
where
    K: From<&'a [u8]>,
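The bound above is the crux: the Iterator impl only exists when the key type can be rebuilt from a raw byte slice via `From<&[u8]>`, which `&str` does not provide. To illustrate the shape of the requirement (this is a hypothetical newtype, not a type from kv), a key wrapper that does satisfy such a bound:

```rust
// A key type that can be constructed from raw bytes, satisfying a
// `K: From<&[u8]>`-style bound that a plain `&str` cannot.
#[derive(Debug, PartialEq)]
struct StrKey(String);

impl<'a> From<&'a [u8]> for StrKey {
    fn from(bytes: &'a [u8]) -> Self {
        // Lossy conversion keeps the example total; a real key type
        // might return a Result instead.
        StrKey(String::from_utf8_lossy(bytes).into_owned())
    }
}

// Generic over any key buildable from bytes, like the Iterator impl above.
fn decode_keys<'a, K: From<&'a [u8]>>(raw: &'a [&'a [u8]]) -> Vec<K> {
    raw.iter().map(|bytes| K::from(*bytes)).collect()
}

fn main() {
    let raw: Vec<&[u8]> = vec![b"alpha", b"beta"];
    let keys: Vec<StrKey> = decode_keys(&raw);
    assert_eq!(keys, vec![StrKey("alpha".into()), StrKey("beta".into())]);
}
```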
I'm building a PoC using kv; so far it's been great, but I'm running into a potential concurrency issue when two 'calls' try to update the same key. Currently they both succeed, but I would rather have the later one fail. This could be done if compare_and_swap was exposed: the second call would fail, since the expected value is already different.
I might want to try to add it if you like the idea, but that won't be for weeks. It's not a big problem, especially as this is a PoC.
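The semantics being requested can be shown with a std-only sketch over a `HashMap` (a stand-in, not kv's or sled's API): an update only succeeds if the current value still matches the expected one, so the second of two racing updates fails instead of silently winning.

```rust
use std::collections::HashMap;

// Compare-and-swap over a plain map: succeed only when the current value
// equals `expected`, so stale writers are rejected.
fn compare_and_swap(
    map: &mut HashMap<String, String>,
    key: &str,
    expected: Option<&str>,
    new: &str,
) -> bool {
    if map.get(key).map(String::as_str) == expected {
        map.insert(key.to_string(), new.to_string());
        true
    } else {
        false
    }
}

fn main() {
    let mut map = HashMap::new();
    map.insert("job".to_string(), "pending".to_string());

    // First caller succeeds: the value is still "pending".
    assert!(compare_and_swap(&mut map, "job", Some("pending"), "running"));
    // Second caller fails: it expected "pending" but the value changed.
    assert!(!compare_and_swap(&mut map, "job", Some("pending"), "running"));
    assert_eq!(map.get("job").map(String::as_str), Some("running"));
}
```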
match store.bucket::<Raw, Json>(None) {
    Ok(bucket) => {
    }
}
version 0.22.0
version 0.21.1
I'm currently using this crate in a project, and I've run into an issue with how you've designed Serde integration, and I also have a solution that may be interesting to you.
I wanted to serialize data stored in LMDB with bincode, but firstly, there's no documentation for how to do that, so I read through the source and found that you had a custom Data type for using bincode as an encoding. This isn't great, since the custom type doesn't provide good mechanisms for interacting with custom data structures.
The way I worked around this was to implement Encoding for a custom type that wraps a generic. Here is a general overview of how it works:
pub struct BincodeEncoding<T>(pub T);

impl<T: DeserializeOwned + Serialize> Encoding for BincodeEncoding<T> {
    fn encode_to<W: Write>(&self, w: &mut W) -> Result<(), Error> {
        bincode::serialize_into(w, &self.0).map_err(|e| {
            // do required error mapping
        })
    }

    fn decode_from<R: Read>(r: &mut R) -> Result<Self, Error> {
        bincode::deserialize_from(r).map(BincodeEncoding).map_err(|e| {
            // do required error mapping
        })
    }
}
This allows for serialization and deserialization of user-provided types in LMDB, and can be easily replicated for other serde-compatible crates.
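The generic-wrapper pattern itself can be demonstrated std-only, with `Display`/`FromStr` standing in for `Serialize`/`DeserializeOwned` (all names here are hypothetical, not kv's): one wrapper type turns "one hand-written encoding per type" into "one encoding for any type satisfying a trait bound".

```rust
use std::fmt::Display;
use std::str::FromStr;

// Wrapper over a generic T, mirroring the BincodeEncoding<T> idea above.
struct Encoded<T>(T);

impl<T: Display + FromStr> Encoded<T> {
    // Encode via the trait bound, not via knowledge of the concrete type.
    fn encode(&self) -> Vec<u8> {
        self.0.to_string().into_bytes()
    }

    // Decode back into any T that can parse itself from text.
    fn decode(bytes: &[u8]) -> Option<Self> {
        let s = std::str::from_utf8(bytes).ok()?;
        s.parse().ok().map(Encoded)
    }
}

fn main() {
    // The same wrapper handles any eligible user-provided type.
    let stored = Encoded(42_u32).encode();
    let loaded: Encoded<u32> = Encoded::decode(&stored).unwrap();
    assert_eq!(loaded.0, 42);
}
```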
Bincode support in the error module is accidentally mapped to the msgpack-value feature instead of the bincode-value feature.
See #12