
rust-kv's People

Contributors

ahxxm, calmofthestorm, colvin, dcnick3, dependabot-preview[bot], dependabot[bot], edef1c, franklx, jeremybanks, xla, zshipko


rust-kv's Issues

Open multiple readers simultaneously

Hi, I have an issue where it is not possible to open multiple readers at the same time (LMDB(BadRslot)). Is this intended, or am I doing something wrong? I thought LMDB followed the single-writer/multiple-reader model?

    pub fn from_disk<T: AsRef<Path>>(path: T) -> Store {
        let mut cfg = Config::default(path);
        cfg.set_map_size(50000000000);
        cfg.set_max_readers(128);
        cfg.bucket("metadata", None);

        let store = kv::Store::new(cfg).unwrap();

        {
            let a = store.read_txn().unwrap();
            // second read handle not possible
            let b = store.read_txn().unwrap();
        }

        Store { store: Some(store) }
    }

Config::temporary unused

Hello,

I was a bit confused about the purpose of Config::temporary... The config value seems entirely unused?

Also, for the "flush frequency", does that flush happen in the background as well? In a separate thread or on some kind of schedule, or only when actually modifying the store?

Questions: Performance Review and Custom Serde Serialization Schemes

Hi, awesome work with kv,

I was looking for a simplified interface to sled, so this is cool.

Performance:
Can you expand on why you chose sled? Did you do any performance benchmarks against RocksDB or LMDB before this?

Anything you uncovered?

Custom Encoding / Serialization:
Any pathway for me to integrate a custom serde encoder/decoder?

I have a scheme I want to use that isn't listed, for encoding / decoding keys.
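For what it's worth, the encode/decode hook kv exposes (its Encoding trait, as quoted elsewhere in this tracker) can be sketched like this. The trait is redefined locally with std::io::Error so the example compiles on its own, and BigEndianU64 is an invented example scheme, not anything shipped by kv:

```rust
use std::io::{Read, Write};

// Local stand-in for kv's `Encoding` trait (the real one lives in kv and
// uses kv's own error type); defined here so the sketch is self-contained.
trait Encoding: Sized {
    fn encode_to<W: Write>(&self, w: &mut W) -> std::io::Result<()>;
    fn decode_from<R: Read>(r: &mut R) -> std::io::Result<Self>;
}

// An example custom key scheme: store a u64 as fixed-width big-endian
// bytes, so lexicographic byte order matches numeric order.
struct BigEndianU64(u64);

impl Encoding for BigEndianU64 {
    fn encode_to<W: Write>(&self, w: &mut W) -> std::io::Result<()> {
        w.write_all(&self.0.to_be_bytes())
    }

    fn decode_from<R: Read>(r: &mut R) -> std::io::Result<Self> {
        let mut buf = [0u8; 8];
        r.read_exact(&mut buf)?;
        Ok(BigEndianU64(u64::from_be_bytes(buf)))
    }
}

fn main() {
    let mut bytes = Vec::new();
    BigEndianU64(42).encode_to(&mut bytes).unwrap();
    let decoded = BigEndianU64::decode_from(&mut bytes.as_slice()).unwrap();
    assert_eq!(decoded.0, 42);
    println!("round-tripped: {}", decoded.0);
}
```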

Awesome work, happy holidays.

Why one would choose `kv` over using `sled` directly?

Would you care to explain why one would choose kv over using sled directly? I'm new to the Rust ecosystem, and at first glance kv looks like just an unnecessary layer on top of sled... but I'm sure there is a good reason for its existence. So, could you please put the explanation of "why?" in the project README?

Loading gets slow

I am bulk loading about 130M items of variable size. Initially it was running at 12k items/sec, but as time goes on it has dropped to 7k/sec. This is on a MacBook Pro with SSD storage. Changing the write path to simply append to a big binary file is lightning fast, so the slowdown comes from kv.

Passing keys by reference

Methods on Bucket that receive a Key parameter should accept a reference to the key. This would allow the same key to be used in multiple operations without cloning.

sled accepts an AsRef<[u8]> for the underlying operations. By doing the same in kv, it can support references without changing the API for existing code.
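As a sketch of what this would look like, here is a toy bucket over a HashMap whose methods take K: AsRef<[u8]>; the Bucket type and method names are illustrative stand-ins, not kv's actual API:

```rust
use std::collections::HashMap;

// Hypothetical bucket-like wrapper; names are illustrative, not kv's API.
struct Bucket {
    map: HashMap<Vec<u8>, Vec<u8>>,
}

impl Bucket {
    // Accepting AsRef<[u8]> lets callers pass &str, String, &String,
    // Vec<u8>, &[u8], etc. -- including references, so no clone is forced.
    fn set<K: AsRef<[u8]>, V: AsRef<[u8]>>(&mut self, key: K, value: V) {
        self.map.insert(key.as_ref().to_vec(), value.as_ref().to_vec());
    }

    fn get<K: AsRef<[u8]>>(&self, key: K) -> Option<&[u8]> {
        self.map.get(key.as_ref()).map(|v| v.as_slice())
    }
}

fn main() {
    let mut bucket = Bucket { map: HashMap::new() };
    let key = String::from("user:1");

    // `&key` satisfies AsRef<[u8]>, so the same key can be passed
    // by reference to several operations without cloning it.
    bucket.set(&key, "alice");
    assert_eq!(bucket.get(&key), Some("alice".as_bytes()));
    bucket.set(&key, "bob");
    assert_eq!(bucket.get(&key), Some("bob".as_bytes()));
    println!("key still usable: {}", key);
}
```

Because &T implements AsRef<[u8]> whenever T does, existing call sites that pass owned keys keep compiling unchanged.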

Future direction?

Hello!

Coincidentally, I have also been working on a wrapper around LMDB.

I've spent some time working on things that I don't yet see represented here:

  • Correctly restrict to one handle per process via a 'manager'.
  • Integer-keyed LMDB stores.
  • Key-typed generic stores, so if you want a store keyed off URIs or integers only, Rust will enforce your constraint.
  • More detailed error wrapping via failure.
  • Encoding and decoding of values via bincode/serde, so it can handle storage of almost any serializable object.
  • Tagging of types so we know which deserialization to apply.

Naturally I'd like to avoid wasted effort in any direction, and either merge some features, use existing libraries as layers, or clarify design decisions.

What's your plan for the future of your library? Do you see it as relatively stable as is (that is, you're already meeting your own needs), or are you hoping to grow it into something more substantial?

Segmentation fault with impl Iterator

Hey there,
I get a segmentation fault when I return the Iterator, but not the corresponding Cursor, like so:

    pub fn publications(&self) -> impl Iterator<Item = (u64, Publication)> {
        self.txn.read_cursor(&self.bucket).unwrap().iter()
            .map(|(a, b)| (u64::from(a), b.inner().unwrap().to_serde()))
    }

This results in a segmentation fault. My workaround was to wrap the Cursor in a separate iterator struct and implement the next function manually.
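That workaround pattern can be sketched std-only like this; Cursor and the (u64, String) item type are stand-ins for the real kv cursor and Publication types. The point is that the wrapper struct owns the cursor, so the cursor lives exactly as long as the iterator instead of being a temporary:

```rust
// Stand-in for a kv cursor: yields items one at a time via next_item.
struct Cursor {
    data: Vec<(u64, String)>,
    pos: usize,
}

impl Cursor {
    fn next_item(&mut self) -> Option<(u64, String)> {
        let item = self.data.get(self.pos).cloned();
        self.pos += 1;
        item
    }
}

// The wrapper OWNS the cursor, keeping it alive for the iterator's lifetime.
struct Publications {
    cursor: Cursor,
}

impl Iterator for Publications {
    type Item = (u64, String);

    fn next(&mut self) -> Option<Self::Item> {
        self.cursor.next_item()
    }
}

fn publications(cursor: Cursor) -> impl Iterator<Item = (u64, String)> {
    Publications { cursor }
}

fn main() {
    let cursor = Cursor {
        data: vec![(1, "a".into()), (2, "b".into())],
        pos: 0,
    };
    let items: Vec<_> = publications(cursor).collect();
    assert_eq!(items.len(), 2);
    println!("{:?}", items);
}
```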

blobs reside on disk after the key-value pair is removed

I used &str as the key and Vec as the value; the value is 1 GB in size.
I tried a couple of cases below and noticed that the generated blob files were always kept on disk.

Case 1: Set a test k-v pair with the above types; get by key; modify the vector; set it back with the original key. As a result, two files, instead of one, were left in the "blobs" folder, occupying around 2 GB of disk space.

Case 2: Set the k-v pair; remove by key; call "contains" to make sure the key has been removed from the store. But on disk, under the "blobs" folder, a 1 GB file still resides.

Could you please let me know how to delete the old blob file automatically after the corresponding value is removed or replaced? Thanks.

cursor iterator only works for K: From<&[u8]>

This seems, at least for K = &str, to make it impossible to iterate over a cursor.

To reproduce, add the following to the main example in src/lib.rs:

    {
        // This time a readonly transaction
        let txn = store.read_txn()?;

        // Getting a value is easy once everything is set up
        let curs = txn.read_cursor(&bucket)?;
        let all: Vec<(String, String)> =
            curs.map(|(k, v)| (k.to_string(), v.to_string())).collect();
    }

and the resulting error is

error[E0599]: no method named `map` found for type `kv::cursor::Cursor<'_, &str, &str>` in the current scope
  --> src/bin/foo.rs:44:18
   |
44 |             curs.map(|(k, v)| (k.to_string(), v.to_string())).collect();
   |                  ^^^ method not found in `kv::cursor::Cursor<'_, &str, &str>`
   |
   = note: the method `map` exists but the following trait bounds were not satisfied:
           `&mut kv::cursor::Cursor<'_, &str, &str> : std::iter::Iterator`

due to this trait bound

    impl<'a, K: Key, V: Value<'a>> Iterator for Iter<'a, K, V>
    where
        K: From<&'a [u8]>,
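
Since &str does not implement From<&[u8]>, one way around the bound (sketched here with plain byte-slice pairs standing in for what the cursor yields) is to iterate the raw bytes and do the UTF-8 conversion explicitly with std::str::from_utf8:

```rust
fn main() {
    // Stand-in for what a cursor over a byte-keyed bucket yields.
    let raw: Vec<(&[u8], &[u8])> = vec![
        (b"key1".as_slice(), b"value1".as_slice()),
        (b"key2".as_slice(), b"value2".as_slice()),
    ];

    // Convert bytes to strings explicitly instead of relying on
    // a (nonexistent) `&str: From<&[u8]>` impl.
    let all: Vec<(String, String)> = raw
        .into_iter()
        .map(|(k, v)| {
            (
                std::str::from_utf8(k).unwrap().to_string(),
                std::str::from_utf8(v).unwrap().to_string(),
            )
        })
        .collect();

    assert_eq!(all[0], ("key1".to_string(), "value1".to_string()));
    println!("{:?}", all);
}
```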

Expose something similar to the `compare_and_swap` from sled to the bucket

I'm building a PoC using kv, and so far it's been great, but I'm running into a potential concurrency issue when two 'calls' try to update the same key. Currently they both succeed, but I would rather have the later one fail. This could be done if compare_and_swap were exposed: the second call would fail, since the expected value is already different.

I might try to add it if you like the idea, but that won't be for weeks. It's not a big problem, especially as this is a PoC.
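For illustration, the desired semantics can be sketched std-only over a HashMap. sled itself exposes compare_and_swap(key, old, new); the Bucket wrapper below is only a hypothetical shape for a kv-level API, not kv's actual interface:

```rust
use std::collections::HashMap;

// Toy stand-in for a bucket; demonstrates compare-and-swap semantics only.
struct Bucket {
    map: HashMap<Vec<u8>, Vec<u8>>,
}

impl Bucket {
    /// Set `key` to `new` only if the current value equals `expected`
    /// (`None` meaning "key absent"). On mismatch, returns the actual
    /// current value instead of silently overwriting.
    fn compare_and_swap(
        &mut self,
        key: &[u8],
        expected: Option<&[u8]>,
        new: Vec<u8>,
    ) -> Result<(), Option<Vec<u8>>> {
        let current = self.map.get(key).map(|v| v.as_slice());
        if current == expected {
            self.map.insert(key.to_vec(), new);
            Ok(())
        } else {
            Err(current.map(|v| v.to_vec()))
        }
    }
}

fn main() {
    let mut bucket = Bucket { map: HashMap::new() };

    // First caller: key absent, expectation `None` holds, swap succeeds.
    assert!(bucket.compare_and_swap(b"job", None, b"v1".to_vec()).is_ok());

    // Second caller still expects `None`, but the value is now "v1": it
    // fails instead of overwriting -- the behavior the issue asks for.
    let err = bucket.compare_and_swap(b"job", None, b"v2".to_vec());
    assert_eq!(err, Err(Some(b"v1".to_vec())));
}
```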

kv::Json not found

    match store.bucket::<Raw, Json>(None) {
        Ok(bucket) => {

        }
    }

version 0.22.0
version 0.21.1

Rethink Serde Integration

I'm currently using this crate in a project, and I've run into an issue with how you've designed Serde integration, and I also have a solution that may be interesting to you.

I wanted to serialize data stored in LMDB with bincode, but there is no documentation for how to do that, so I read through the source and found that you had a custom Data type for using bincode as an encoding. This isn't great, since the custom type doesn't provide good mechanisms for interacting with custom data structures.

The way I worked around this was to implement Encoding for a custom type that wraps a generic. Here is a general overview of how it works:

    pub struct BincodeEncoding<T>(pub T);

    impl<T: DeserializeOwned + Serialize> Encoding for BincodeEncoding<T> {
        fn encode_to<W: Write>(&self, w: &mut W) -> Result<(), Error> {
            bincode::serialize_into(w, &self.0).map_err(|e| {
                // do required error mapping
            })
        }

        fn decode_from<R: Read>(r: &mut R) -> Result<Self, Error> {
            bincode::deserialize_from(r).map(BincodeEncoding).map_err(|e| {
                // do required error mapping
            })
        }
    }

This allows for serialization and deserialization of user-provided types in LMDB, and can be easily replicated for other serde-compatible crates.

Bincode support is broken

Bincode support in the error module is accidentally mapped to the msgpack-value feature instead of the bincode-value feature.

See #12
