zshipko / rust-kv
An embedded key/value store for Rust
License: ISC License
Hi, I have an issue where it is not possible to open multiple readers at the same time (LMDB(BadRslot)). Is this intended, or am I doing something wrong? I thought LMDB followed the single writer/multiple readers model?
pub fn from_disk<T: AsRef<Path>>(path: T) -> Store {
    let mut cfg = Config::default(path);
    cfg.set_map_size(50_000_000_000);
    cfg.set_max_readers(128);
    cfg.bucket("metadata", None);
    let store = kv::Store::new(cfg).unwrap();
    {
        let a = store.read_txn().unwrap();
        // second read handle not possible
        let b = store.read_txn().unwrap();
    }
    Store { store: Some(store) }
}
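For reference, the single writer/multiple readers semantics the reporter expects are the same ones std's `RwLock` provides. A minimal std-only sketch (not using kv or LMDB) of what holding two simultaneous read handles should look like:

```rust
use std::sync::RwLock;

fn main() {
    let store = RwLock::new(vec![("metadata", "value")]);

    // Two read guards may be held at the same time under a
    // single-writer/multiple-readers model...
    let a = store.read().unwrap();
    let b = store.read().unwrap();
    assert_eq!(a[0].1, "value");
    assert_eq!(b[0].1, "value");
    drop(a);
    drop(b);

    // ...while a writer requires exclusive access.
    let mut w = store.write().unwrap();
    w.push(("other", "data"));
    assert_eq!(w.len(), 2);
}
```

Note that LMDB additionally ties read transactions to reader slots, which is where errors like BadRslot can come from; the sketch only illustrates the expected concurrency model, not LMDB's slot bookkeeping.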
but there is no flag to enable the compression feature
Hello,
I was a bit confused about the purpose of Config::temporary ... The config value seems entirely unused?
Also, for the "flush frequency": does that flush happen in the background as well, in a separate thread or on some kind of schedule, or only when actually modifying the store?
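I don't know how kv actually implements its flush frequency, but the "separate thread on a schedule" variant the question describes is commonly built from a channel with a receive timeout. A hypothetical std-only sketch (none of these names are from kv):

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn main() {
    let (tx, rx) = mpsc::channel::<&'static str>();

    // Background flusher: wakes either on a write notification or on a
    // timeout tick, i.e. "a separate thread on some kind of schedule".
    let flusher = thread::spawn(move || {
        let mut flushed = Vec::new();
        loop {
            match rx.recv_timeout(Duration::from_millis(10)) {
                // A write arrived: record ("flush") it.
                Ok(item) => flushed.push(item),
                // No writes within the interval: a periodic tick would
                // flush any buffered state here.
                Err(mpsc::RecvTimeoutError::Timeout) => continue,
                // All senders dropped: shut down.
                Err(mpsc::RecvTimeoutError::Disconnected) => break,
            }
        }
        flushed
    });

    tx.send("write-1").unwrap();
    tx.send("write-2").unwrap();
    drop(tx); // closing the channel stops the flusher

    assert_eq!(flusher.join().unwrap(), vec!["write-1", "write-2"]);
}
```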
Hi, awesome work with kv.
I was looking for a simplified interface to sled, so this is cool.
Performance:
Can you expand on why you chose Sled, did you do any performance benchmarks against Rocks or LMDB before this?
Anything you uncovered?
Custom Encoding / Serialization:
Any pathway for me to integrate a custom serde encoder/decoder?
I have a scheme I want to use that isn't listed, for encoding / decoding keys.
Awesome work, happy holidays.
Would you care to explain why one would choose to use kv over using sled directly? I'm new to the Rust ecosystem and at first glance it looks like kv is just an unnecessary layer on top of sled ... but I'm sure there is a good reason for its existence. So, could you please put the explanation in the project README?
I am bulk loading about 130m items of variable size. Initially it was running at 12k/sec, but as time goes on it is down to 7k/sec. This is on a Macbook Pro, SSD storage. Changing the write path to just put in a big binary file is like lightning, so the slowdown is from kv.
Would be cool to allow a transaction to span multiple buckets. This would make it possible to atomically move an element from one bucket to another. sled already has support for this: https://docs.rs/sled/0.34.6/sled/transaction/trait.Transactional.html#foreign-impls
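The "atomic move between two buckets" use case can be illustrated with a std-only sketch, where one lock held over both maps plays the role of a multi-bucket transaction (this is not sled's Transactional API, just the semantics being requested):

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Moving an entry between two "buckets" atomically: because a single lock
// covers both maps, either the removal and the insertion both happen, or
// neither does.
fn move_entry(
    buckets: &Mutex<(HashMap<String, String>, HashMap<String, String>)>,
    key: &str,
) -> bool {
    let mut guard = buckets.lock().unwrap();
    let (src, dst) = &mut *guard;
    match src.remove(key) {
        Some(value) => {
            dst.insert(key.to_string(), value);
            true
        }
        None => false,
    }
}

fn main() {
    let buckets = Mutex::new((HashMap::new(), HashMap::new()));
    buckets
        .lock()
        .unwrap()
        .0
        .insert("id".to_string(), "42".to_string());

    assert!(move_entry(&buckets, "id"));
    let guard = buckets.lock().unwrap();
    assert!(!guard.0.contains_key("id"));
    assert_eq!(guard.1.get("id").map(String::as_str), Some("42"));
}
```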
Methods on Bucket that receive a Key parameter should accept a reference to the key. This would allow using the same key in multiple operations without cloning. sled accepts an AsRef<[u8]> for the underlying operations. By doing the same in kv, it can support references without changing the API for existing code.
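The suggested API shape can be sketched std-only (the `Bucket` type here is a stand-in backed by a `BTreeMap`, not kv's actual type): taking `impl AsRef<[u8]>` lets callers pass `&str`, `String`, `&[u8]`, or `Vec<u8>` and reuse the same key without cloning.

```rust
use std::collections::BTreeMap;

// Hypothetical bucket keyed by raw bytes, mirroring the suggestion.
struct Bucket {
    inner: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl Bucket {
    fn set(&mut self, key: impl AsRef<[u8]>, value: impl AsRef<[u8]>) {
        self.inner
            .insert(key.as_ref().to_vec(), value.as_ref().to_vec());
    }

    fn get(&self, key: impl AsRef<[u8]>) -> Option<&[u8]> {
        self.inner.get(key.as_ref()).map(Vec::as_slice)
    }
}

fn main() {
    let mut bucket = Bucket { inner: BTreeMap::new() };
    let key = String::from("user:1");

    // The same key works for several operations, by reference, no clone.
    bucket.set(&key, "alice");
    assert_eq!(bucket.get(&key), Some(b"alice".as_slice()));
    // A plain &str works through the same signature.
    assert_eq!(bucket.get("user:1"), Some(b"alice".as_slice()));
}
```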
Hello!
Coincidentally, I have also been working on a wrapper around LMDB.
I've spent some time working on things that I don't yet see represented here:
- failure
- bincode/serde, so it can handle storage of almost any serializable object.

Naturally I'd like to avoid wasted effort in any direction, and either merge some features, use existing libraries as layers, or clarify design decisions.
What's your plan for the future of your library? Do you see it as relatively stable as is (that is, you're already meeting your own needs), or are you hoping to grow it into something more substantial?
Hey there,
I get a segmentation fault when I return the Iterator, but not the corresponding Cursor, like so:
pub fn publications(&self) -> impl Iterator<Item = (u64, Publication)> {
    self.txn.read_cursor(&self.bucket).unwrap().iter()
        .map(|(a, b)| (u64::from(a), b.inner().unwrap().to_serde()))
}
This results in a Segmentation Fault. My solution was to wrap the Cursor in a separate Iterator struct and implement the next function manually.
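The workaround described, a wrapper struct that owns its data source and implements `next` manually, can be sketched std-only. A `Vec` iterator stands in for the cursor; the point is that the wrapper keeps the underlying source alive for as long as the iteration runs, instead of returning an iterator whose source is dropped when the function returns:

```rust
// Hypothetical wrapper mirroring the described fix: it *owns* the source.
struct Publications {
    // Stand-in for the cursor: the wrapper holds the data itself.
    items: std::vec::IntoIter<(u64, String)>,
}

impl Iterator for Publications {
    type Item = (u64, String);

    // The manually implemented `next` just delegates to the owned source.
    fn next(&mut self) -> Option<Self::Item> {
        self.items.next()
    }
}

fn publications() -> Publications {
    let rows = vec![(1_u64, "first".to_string()), (2, "second".to_string())];
    Publications { items: rows.into_iter() }
}

fn main() {
    let all: Vec<_> = publications().collect();
    assert_eq!(
        all,
        vec![(1, "first".to_string()), (2, "second".to_string())]
    );
}
```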
I used &str as the key and Vec as the value. The size of the value is 1GB.
I tried a couple of cases below and noticed that the generated blob files were always kept on disk.
Case 1: Set a test k-v pair with the above types; get by key; modify the vector; set it back with the original key. As a result, in the "blobs" folder, two files, instead of one, were left, occupying around 2GB of disk space.
Case 2: Set the k-v pair; remove by key; call "contains" to make sure the key has been removed from the store. But on disk, under the "blobs" folder, a 1GB file still resides.
Could you please let me know how to delete the old blob file automatically after the corresponding value is removed or replaced? Thanks.
This seems, at least for K = &str, to make it impossible to iterate over a cursor.
To reproduce, add the following to the main example in src/lib.rs:
{
    // This time a readonly transaction
    let txn = store.read_txn()?;
    // Getting a value is easy once everything is set up
    let curs = txn.read_cursor(&bucket)?;
    let all: Vec<(String, String)> =
        curs.map(|(k, v)| (k.to_string(), v.to_string())).collect();
}
and the resulting error is
error[E0599]: no method named `map` found for type `kv::cursor::Cursor<'_, &str, &str>` in the current scope
--> src/bin/foo.rs:44:18
|
44 | curs.map(|(k, v)| (k.to_string(), v.to_string())).collect();
| ^^^ method not found in `kv::cursor::Cursor<'_, &str, &str>`
|
= note: the method `map` exists but the following trait bounds were not satisfied:
`&mut kv::cursor::Cursor<'_, &str, &str> : std::iter::Iterator`
due to this trait bound
impl<'a, K: Key, V: Value<'a>> Iterator for Iter<'a, K, V>
where
    K: From<&'a [u8]>,
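The bound above is the crux: the Iterator impl only exists when the key type can be rebuilt from a raw byte slice via `From<&[u8]>`, which `&str` does not provide. To illustrate the shape of the requirement (this is a hypothetical newtype, not a type from kv), a key wrapper that does satisfy such a bound:

```rust
// A key type that can be constructed from raw bytes, satisfying a
// `K: From<&[u8]>`-style bound that a plain `&str` cannot.
#[derive(Debug, PartialEq)]
struct StrKey(String);

impl<'a> From<&'a [u8]> for StrKey {
    fn from(bytes: &'a [u8]) -> Self {
        // Lossy conversion keeps the example total; a real key type
        // might return a Result instead.
        StrKey(String::from_utf8_lossy(bytes).into_owned())
    }
}

// Generic over any key buildable from bytes, like the Iterator impl above.
fn decode_keys<'a, K: From<&'a [u8]>>(raw: &'a [&'a [u8]]) -> Vec<K> {
    raw.iter().map(|bytes| K::from(*bytes)).collect()
}

fn main() {
    let raw: Vec<&[u8]> = vec![b"alpha", b"beta"];
    let keys: Vec<StrKey> = decode_keys(&raw);
    assert_eq!(keys, vec![StrKey("alpha".into()), StrKey("beta".into())]);
}
```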
I'm building a PoC using kv; so far it's been great, but I'm running into a potential concurrency issue when two 'calls' try to update the same key. Currently they both succeed, but I would rather have the later one fail. This could be done if compare_and_swap was exposed: the second call would fail, since the expected value is already different.
I might want to try to add it if you like the idea, but that won't be for weeks. It's not a big problem, especially as this is a PoC.
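The semantics being requested can be shown with a std-only sketch over a `HashMap` (a stand-in, not kv's or sled's API): an update only succeeds if the current value still matches the expected one, so the second of two racing updates fails instead of silently winning.

```rust
use std::collections::HashMap;

// Compare-and-swap over a plain map: succeed only when the current value
// equals `expected`, so stale writers are rejected.
fn compare_and_swap(
    map: &mut HashMap<String, String>,
    key: &str,
    expected: Option<&str>,
    new: &str,
) -> bool {
    if map.get(key).map(String::as_str) == expected {
        map.insert(key.to_string(), new.to_string());
        true
    } else {
        false
    }
}

fn main() {
    let mut map = HashMap::new();
    map.insert("job".to_string(), "pending".to_string());

    // First caller succeeds: the value is still "pending".
    assert!(compare_and_swap(&mut map, "job", Some("pending"), "running"));
    // Second caller fails: it expected "pending" but the value changed.
    assert!(!compare_and_swap(&mut map, "job", Some("pending"), "running"));
    assert_eq!(map.get("job").map(String::as_str), Some("running"));
}
```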
match store.bucket::<Raw, Json>(None) {
    Ok(bucket) => {
    }
}
version 0.22.0
version 0.21.1
I'm currently using this crate in a project, and I've run into an issue with how you've designed Serde integration, and I also have a solution that may be interesting to you.
I wanted to serialize data stored in LMDB with bincode, but firstly, there's no documentation for how to do that, so I read through the source and found that you had a custom Data type for using bincode as an encoding. This isn't great, since the custom type doesn't provide good mechanisms for interacting with custom data structures.
The way I worked around this was to implement Encoding for a custom type that wraps a generic. Here is a general overview of how it works:
pub struct BincodeEncoding<T>(pub T);

impl<T: DeserializeOwned + Serialize> Encoding for BincodeEncoding<T> {
    fn encode_to<W: Write>(&self, w: &mut W) -> Result<(), Error> {
        bincode::serialize_into(w, &self.0).map_err(|e| {
            // do required error mapping
        })
    }

    fn decode_from<R: Read>(r: &mut R) -> Result<Self, Error> {
        bincode::deserialize_from(r).map(BincodeEncoding).map_err(|e| {
            // do required error mapping
        })
    }
}
This allows for serialization and deserialization of user-provided types in LMDB, and can be easily replicated for other serde-compatible crates.
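The generic-wrapper pattern itself can be demonstrated std-only, with `Display`/`FromStr` standing in for `Serialize`/`DeserializeOwned` (all names here are hypothetical, not kv's): one wrapper type turns "one hand-written encoding per type" into "one encoding for any type satisfying a trait bound".

```rust
use std::fmt::Display;
use std::str::FromStr;

// Wrapper over a generic T, mirroring the BincodeEncoding<T> idea above.
struct Encoded<T>(T);

impl<T: Display + FromStr> Encoded<T> {
    // Encode via the trait bound, not via knowledge of the concrete type.
    fn encode(&self) -> Vec<u8> {
        self.0.to_string().into_bytes()
    }

    // Decode back into any T that can parse itself from text.
    fn decode(bytes: &[u8]) -> Option<Self> {
        let s = std::str::from_utf8(bytes).ok()?;
        s.parse().ok().map(Encoded)
    }
}

fn main() {
    // The same wrapper handles any eligible user-provided type.
    let stored = Encoded(42_u32).encode();
    let loaded: Encoded<u32> = Encoded::decode(&stored).unwrap();
    assert_eq!(loaded.0, 42);
}
```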
Bincode support in the error module is accidentally mapped to the msgpack-value feature instead of the bincode-value feature.
See #12