Code Monkey home page Code Monkey logo

Comments (11)

skyzh avatar skyzh commented on September 28, 2024 2

rocksdb-rs is really a great project. From my perspective, it has nearly the same functionalities as agatedb, except some design goals. For example, agatedb has key-value separation and user-specified epoch built-in, while RocksDB and rocksdb-rs doesn't.

Firstly, I'd like to answer some questions regarding the agatedb project.

agatedb does not support snapshot isolation. It means that we must do a lot of job to change TiKV.

It indeed supports SSI. Every txn will create a snapshot on the current LSM tree, and agatedb should do conflict detection when writing (though this part is not implemented yet).

agatedb does not implement the table format like rocksdb.

That's true, and changing the SST format should be very simple. agatedb stores value pointer as a normal value in LSM, so any SST format should be okay for agatedb.

agatedb does not support asynchronous IO or asynchronous interface. It opens all files by mmap and does not support process manage block-cache just like rocksdb.

This is also true. mmap doesn't sound like a good way to run a storage engine.

I'm also happy if we could welcome a new experimental storage engine project to the TiKV organization, but I have some concerns regarding this transfer...

  • How will the development process go on in TiKV org? Looks like we need more developers working on rocksdb-rs, so as to get this project actively maintained and being used.
  • Does the code on main branch get adequate review? As you have already known, the complete agatedb source code is still on the develop branch, without being merged into master. When we was building yet another storage engine, we've found a lot of bugs in agatedb like tikv/agatedb#115 tikv/agatedb#114, which was not spotted by me at the time I was developing it. I think we should ensure there's no significant bug and get adequate review for rocksdb-rs.
  • How does rocksdb-rs perform compared with titan or rocksdb? There was a very simple benchmark tool for agatedb called agate_bench, and an internal doc about its performance comparison. I think developers might also be interested in benchmarking and using rocksdb-rs, and we should provide quick start for them to use rocksdb-rs in their libraries or doing benchmarks.
  • The name rocksdb-rs seems confusing at first sight. Some others might think of it as a Rust binding of RocksDB. What about having a new name for that before transferring, like tirocks or rustrocks?

from community.

BusyJay avatar BusyJay commented on September 28, 2024

What's the relations between tikv/agatedb and rocksdb-rs? Can you describe more on the motivation of building a new rocksdb implementation instead of agatedb?

from community.

Little-Wallace avatar Little-Wallace commented on September 28, 2024

OK. I thought about creating my project based on agatedb before. But I found some critical issues preventing me from doing this.

  • agatedb does not support snapshot isolation. It means that we must do a lot of job to change TiKV.
  • agatedb does not implement the table format like rocksdb. It means that if we upgrade TiKV from rocksdb to agatedb we must overwrite all the data and it will cost a lot of time (maybe several hours) before TiKV could serve requests. And this also means a very high upgrade risk, because in the event of a data file corruption, users cannot easily fall back to an older version (from agatedb to rocksdb).
  • agatedb does not support asynchronous IO or asynchronous interface. It opens all files by mmap and does not support process manage block-cache just like rocksdb.

These issues reminds me that what I need is far away from agatedb. so I have to implement a new one. But I copy the code of memtable(skiplist) from agatedb and I think it is the only thing works for me...

from community.

BusyJay avatar BusyJay commented on September 28, 2024

agatedb does not support snapshot isolation.

agatedb is supposed to inherit all features badger supports, so it may support it at the moment, but it's supposed to support it in the end.

agatedb does not implement the table format like rocksdb

Indeed. This is the trade off between support compatibility and new features. But after we can divide LSM tree for regions, it may ease the pain for upgrading and downgrading. For example, using different engines for different regions.

agatedb does not support asynchronous IO or asynchronous interface.

I think this is similar to the first statement. Actually as explained the README of agatedb, the desire of writing a new engine in Rust is to explorer asynchronous support and unify thread menagements. It's part of its goal.

However, as one of the author of agatedb and TiKV maintainer, I personally happy to add this project as another experimental project in TiKV org to explorer further possibility as long as it's actively working on.

/cc @skyzh @zz-jason What's your opinions?

/cc @tikv/maintainers

from community.

Little-Wallace avatar Little-Wallace commented on September 28, 2024

Indeed. This is the trade off between support compatibility and new features. But after we can divide LSM tree for regions, it may ease the pain for upgrading and downgrading. For example, using different engines for different regions

I'm not against new data formats. I just think it's a better choice to support compatibility with the old format at first and we can upgrade the format step by step. As you mentioned, we can store most of regions in a old format and store some of them in a new format. For cloud environment, mmap is not supported by most shared storage and it means that you have to store data in memory. Obviously, we can not store all data in memory.

As of now, all interfaces of this project are synchronous, which means that if I want to use asynchronous interface, I need to rewrite almost all the code. It seems that no one cares the goal of agatedb....

I don't think the badger format is essential for cloud storage. But we can also implement the badger format in the rocksdb-rs. I think the most important thing is to support asynchronous IO so that the high latency of cloud disk (or S3 storage) won't block read thread.

from community.

Little-Wallace avatar Little-Wallace commented on September 28, 2024

It indeed supports SSI. Every txn will create a snapshot on the current LSM tree, and agatedb should do conflict detection when writing (though this part is not implemented yet).

I do not think it is a good idea to support SSI in a kv engine. Because the transaction model may not be compatibility with the distributed transaction model of TiKV. So I suggest to implement a no transaction KV engine and then we can improvement it with developer of transaction developers.

How will the development process go on in TiKV org? Looks like we need more developers working on rocksdb-rs, so as to get this project actively maintained and being used.

Good point. I have finished the basic iterator and seek and write function. I need someone help me finished blockcache and compression and some other function. You'll find that this project has quite a few features done.

How does rocksdb-rs perform compared with titan or rocksdb? There was a very simple benchmark tool for agatedb called agate_bench, and an internal doc about its performance comparison. I think developers might also be interested in benchmarking and using rocksdb-rs, and we should provide quick start for them to use rocksdb-rs in their libraries or doing benchmarks.

I have benchmarked the write performance by single thread and I found that it can only reach 65% of RocksDB. Most of the write time comes from skiplist and the skiplist of rocksdb-rs is copied from agatedb....

The name rocksdb-rs seems confusing at first sight. Some others might think of it as a Rust binding of RocksDB. What about having a new name for that before transferring, like tirocks or rustrocks?

TiKV will name another project tirocks and maybe we can rethink another name.

from community.

Little-Wallace avatar Little-Wallace commented on September 28, 2024

Does the code on main branch get adequate review? As you have already known, the complete agatedb source code is still on the develop branch, without being merged into master. When we was building yet another storage engine, we've found a lot of bugs in agatedb like tikv/agatedb#115 tikv/agatedb#114, which was not spotted by me at the time I was developing it. I think we should ensure there's no significant bug and get adequate review for rocksdb-rs.

This is why I don't want to introduce a complex transaction model in the storage engine, unless it is designed from the distributed transaction of TiDB as a whole. In the first step, we only need to design an engine with the same interface usage as rocksdb, and then we can support more functions according to TiDB's distributed transactions

from community.

Little-Wallace avatar Little-Wallace commented on September 28, 2024

To make sure there's no significant bug , I will add more unit test for this project in the future development.

from community.

skyzh avatar skyzh commented on September 28, 2024

I have no objection (nor clear accept) for this proposal. Any ideas from other members?

from community.

BusyJay avatar BusyJay commented on September 28, 2024

I would like to also hear from @sunxiaoguang @zhangjinpeng1987

To make sure there's no significant bug , I will add more unit test for this project in the future development.

In addition, if it's going to be a serious project, changes needs to be reviewed and get at least two approvals before landing in its own master.

from community.

Little-Wallace avatar Little-Wallace commented on September 28, 2024

@zhangjinpeng1987 @sunxiaoguang Any suggestion?

from community.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.