Comments (11)
rocksdb-rs is really a great project. From my perspective, it has nearly the same functionality as agatedb, apart from some design goals. For example, agatedb has key-value separation and a user-specified epoch built in, while RocksDB and rocksdb-rs don't.
Firstly, I'd like to answer some questions regarding the agatedb project.
agatedb does not support snapshot isolation. It means that we must do a lot of work to change TiKV.
It indeed supports SSI. Every txn will create a snapshot on the current LSM tree, and agatedb should do conflict detection when writing (though this part is not implemented yet).
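To illustrate the kind of write-time conflict detection meant here, a minimal sketch follows. It is not agatedb code (that part is not implemented yet); `Oracle`, `Txn`, and their methods are hypothetical names, and the scheme shown is a simple optimistic check: a txn aborts if any key it read was committed by someone else after its snapshot was taken.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical oracle: hands out timestamps and remembers the last
/// commit timestamp of every key (an assumption for illustration).
#[derive(Default)]
pub struct Oracle {
    next_ts: u64,
    last_commit: HashMap<Vec<u8>, u64>,
}

/// Hypothetical transaction: a snapshot timestamp plus the keys it
/// has read and written so far.
pub struct Txn {
    read_ts: u64,
    reads: HashSet<Vec<u8>>,
    writes: HashSet<Vec<u8>>,
}

impl Txn {
    /// Record that this txn read `key` from its snapshot.
    pub fn read(&mut self, key: &[u8]) {
        self.reads.insert(key.to_vec());
    }

    /// Record that this txn intends to write `key`.
    pub fn write(&mut self, key: &[u8]) {
        self.writes.insert(key.to_vec());
    }
}

impl Oracle {
    /// Start a txn whose snapshot is the latest committed timestamp.
    pub fn begin(&self) -> Txn {
        Txn {
            read_ts: self.next_ts,
            reads: HashSet::new(),
            writes: HashSet::new(),
        }
    }

    /// Try to commit. Returns the commit timestamp, or `None` if a key
    /// this txn read was committed after its snapshot (a conflict).
    pub fn commit(&mut self, txn: &Txn) -> Option<u64> {
        let conflict = txn.reads.iter().any(|key| {
            self.last_commit
                .get(key)
                .map_or(false, |&ts| ts > txn.read_ts)
        });
        if conflict {
            return None;
        }
        self.next_ts += 1;
        let commit_ts = self.next_ts;
        for key in &txn.writes {
            self.last_commit.insert(key.clone(), commit_ts);
        }
        Some(commit_ts)
    }
}
```

With two txns started on the same snapshot, if the first reads and writes a key and commits, the second one's commit is rejected when it has read that same key. A txn started afterwards sees the new snapshot and commits normally.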
agatedb does not implement the table format like rocksdb.
That's true, and changing the SST format should be very simple. agatedb stores value pointer as a normal value in LSM, so any SST format should be okay for agatedb.
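As a rough illustration of why the SST format does not matter here, the sketch below encodes a value pointer into plain bytes, so the table only ever sees an opaque value. The `ValuePointer` fields and the 12-byte little-endian layout are assumptions for illustration, not agatedb's actual on-disk format.

```rust
/// Hypothetical value pointer: says where the real value lives in a
/// separate value log (field names and widths are assumptions).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ValuePointer {
    pub file_id: u32,
    pub offset: u32,
    pub len: u32,
}

impl ValuePointer {
    /// Size of the encoded pointer in bytes (three little-endian u32s).
    pub const ENCODED_LEN: usize = 12;

    /// Encode the pointer as plain bytes; the result can be stored as
    /// an ordinary value in any table format.
    pub fn encode(&self) -> [u8; Self::ENCODED_LEN] {
        let mut buf = [0u8; Self::ENCODED_LEN];
        buf[0..4].copy_from_slice(&self.file_id.to_le_bytes());
        buf[4..8].copy_from_slice(&self.offset.to_le_bytes());
        buf[8..12].copy_from_slice(&self.len.to_le_bytes());
        buf
    }

    /// Decode a pointer read back from the table. Returns `None` if
    /// the value does not have the expected length.
    pub fn decode(bytes: &[u8]) -> Option<Self> {
        if bytes.len() != Self::ENCODED_LEN {
            return None;
        }
        // Read one little-endian u32 starting at byte index `i`.
        let read_u32 = |i: usize| {
            u32::from_le_bytes([bytes[i], bytes[i + 1], bytes[i + 2], bytes[i + 3]])
        };
        Some(ValuePointer {
            file_id: read_u32(0),
            offset: read_u32(4),
            len: read_u32(8),
        })
    }
}
```

The LSM side stores `vp.encode()` under the user key like any other value, and a read calls `ValuePointer::decode` on the stored bytes before fetching the real value from the value log.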
agatedb does not support asynchronous IO or an asynchronous interface. It opens all files with mmap and does not support a process-managed block cache the way rocksdb does.
This is also true. mmap doesn't sound like a good way to run a storage engine.
I'd also be happy to welcome a new experimental storage engine project to the TiKV organization, but I have some concerns regarding this transfer...
- How will the development process go in the TiKV org? It looks like we need more developers working on rocksdb-rs to keep this project actively maintained and used.
- Does the code on the main branch get adequate review? As you already know, the complete agatedb source code is still on the develop branch and has not been merged into master. When we were building yet another storage engine, we found a lot of bugs in agatedb, like tikv/agatedb#115 and tikv/agatedb#114, which I did not spot at the time I was developing it. I think we should ensure there's no significant bug and get adequate review for rocksdb-rs.
- How does rocksdb-rs perform compared with titan or rocksdb? There was a very simple benchmark tool for agatedb called agate_bench, and an internal doc about its performance comparison. I think developers might also be interested in benchmarking and using rocksdb-rs, and we should provide a quick start so they can use rocksdb-rs in their libraries or run benchmarks.
- The name rocksdb-rs seems confusing at first sight. Some others might think of it as a Rust binding of RocksDB. What about having a new name for that before transferring, like tirocks or rustrocks?
from community.
What's the relation between tikv/agatedb and rocksdb-rs? Can you say more about the motivation for building a new rocksdb implementation instead of building on agatedb?
from community.
OK. I thought about creating my project based on agatedb before. But I found some critical issues preventing me from doing this.
- agatedb does not support snapshot isolation. It means that we must do a lot of work to change TiKV.
- agatedb does not implement the table format like rocksdb. It means that if we upgrade TiKV from rocksdb to agatedb we must rewrite all the data, and it will take a lot of time (maybe several hours) before TiKV can serve requests. This also means a very high upgrade risk, because in the event of a data file corruption, users cannot easily fall back to the older version (from agatedb to rocksdb).
- agatedb does not support asynchronous IO or an asynchronous interface. It opens all files with mmap and does not support a process-managed block cache the way rocksdb does.
These issues showed me that what I need is far from agatedb, so I had to implement a new one. I did copy the memtable (skiplist) code from agatedb, though, and I think it is the only part that works for me...
from community.
agatedb does not support snapshot isolation.
agatedb is supposed to inherit all the features badger supports, so it may not support it at the moment, but it's supposed to support it in the end.
agatedb does not implement the table format like rocksdb
Indeed. This is the trade-off between compatibility and new features. But once we can divide the LSM tree by region, it may ease the pain of upgrading and downgrading. For example, we could use different engines for different regions.
agatedb does not support asynchronous IO or asynchronous interface.
I think this is similar to the first statement. Actually, as explained in the README of agatedb, the motivation for writing a new engine in Rust is to explore asynchronous support and unify thread management. It's part of its goal.
However, as one of the authors of agatedb and a TiKV maintainer, I'm personally happy to add this project as another experimental project in the TiKV org to explore further possibilities, as long as it's actively worked on.
/cc @skyzh @zz-jason What are your opinions?
/cc @tikv/maintainers
from community.
Indeed. This is the trade-off between compatibility and new features. But once we can divide the LSM tree by region, it may ease the pain of upgrading and downgrading. For example, we could use different engines for different regions.
I'm not against new data formats. I just think it's a better choice to stay compatible with the old format at first, and then we can upgrade the format step by step. As you mentioned, we can store most regions in the old format and some of them in a new format. In cloud environments, mmap is not supported by most shared storage, which means you have to keep the data in memory. Obviously, we cannot keep all data in memory.
As of now, all interfaces of this project are synchronous, which means that if I want to use an asynchronous interface, I need to rewrite almost all the code. It seems that no one cares about the goal of agatedb....
I don't think the badger format is essential for cloud storage. But we can also implement the badger format in rocksdb-rs. I think the most important thing is to support asynchronous IO so that the high latency of cloud disks (or S3 storage) won't block the read thread.
from community.
It indeed supports SSI. Every txn will create a snapshot on the current LSM tree, and agatedb should do conflict detection when writing (though this part is not implemented yet).
I do not think it is a good idea to support SSI in a KV engine, because the transaction model may not be compatible with the distributed transaction model of TiKV. So I suggest implementing a KV engine without transactions first, and then we can improve it together with the transaction developers.
How will the development process go in the TiKV org? It looks like we need more developers working on rocksdb-rs to keep this project actively maintained and used.
Good point. I have finished the basic iterator, seek, and write functions. I need someone to help me finish the block cache, compression, and some other functions. You'll find that this project has quite a few features done.
How does rocksdb-rs perform compared with titan or rocksdb? There was a very simple benchmark tool for agatedb called agate_bench, and an internal doc about its performance comparison. I think developers might also be interested in benchmarking and using rocksdb-rs, and we should provide a quick start so they can use rocksdb-rs in their libraries or run benchmarks.
I have benchmarked single-threaded write performance and found that it reaches only 65% of RocksDB's. Most of the write time is spent in the skiplist, and the skiplist of rocksdb-rs is copied from agatedb....
The name rocksdb-rs seems confusing at first sight. Some others might think of it as a Rust binding of RocksDB. What about having a new name for that before transferring, like tirocks or rustrocks?
TiKV will name another project tirocks, so maybe we can think of another name.
from community.
Does the code on the main branch get adequate review? As you already know, the complete agatedb source code is still on the develop branch and has not been merged into master. When we were building yet another storage engine, we found a lot of bugs in agatedb, like tikv/agatedb#115 and tikv/agatedb#114, which I did not spot at the time I was developing it. I think we should ensure there's no significant bug and get adequate review for rocksdb-rs.
This is why I don't want to introduce a complex transaction model in the storage engine, unless it is designed together with TiDB's distributed transactions as a whole. As a first step, we only need to design an engine with the same interface as rocksdb, and then we can support more functions according to TiDB's distributed transactions.
from community.
To make sure there's no significant bug, I will add more unit tests for this project in future development.
from community.
I have no objection to (nor clear acceptance of) this proposal. Any ideas from other members?
from community.
I would like to also hear from @sunxiaoguang @zhangjinpeng1987
To make sure there's no significant bug, I will add more unit tests for this project in future development.
In addition, if it's going to be a serious project, changes need to be reviewed and get at least two approvals before landing on its own master.
from community.
@zhangjinpeng1987 @sunxiaoguang Any suggestions?
from community.