openacid / celeritasdb


A redis compatible database.

Home Page: https://blog.openacid.com/

License: Apache License 2.0

Rust 88.00% Python 5.15% Shell 6.57% Makefile 0.17% Dockerfile 0.12%

rust db distributed wan global-consistency


celeritasdb's Issues

Use binary search for ClusterInfo.group, or build a range dict

Use binary search here, or build a range dict.

Originally posted by @liubaohai in #223
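
A minimal sketch of the binary-search idea, assuming each group owns a half-open key range and the groups are kept sorted and non-overlapping. `Group`, its fields and `find_group` are hypothetical names for illustration, not the actual `ClusterInfo` API.

```rust
/// Hypothetical group descriptor: serves keys in the half-open range [start, end).
#[derive(Debug, PartialEq)]
pub struct Group {
    pub start: String,
    pub end: String,
    pub name: String,
}

/// Find the group serving `key` with a binary search.
/// Assumes `groups` is sorted by `start` and the ranges do not overlap.
pub fn find_group<'a>(groups: &'a [Group], key: &str) -> Option<&'a Group> {
    // Index of the first group whose start is greater than `key`.
    let idx = groups.partition_point(|g| g.start.as_str() <= key);
    if idx == 0 {
        return None; // `key` sorts before every range
    }
    let g = &groups[idx - 1];
    if key < g.end.as_str() {
        Some(g)
    } else {
        None // `key` falls in a gap or after the last range
    }
}
```

With ranges [a, g), [g, p) and [p, z), a lookup of "hello" lands in the second group, while a key that sorts outside every range returns `None`.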

rewrite API-server and replication server with async tonic server

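Li Shulong's tasks

Read and understand the communication part of the EPaxos paper
Replica interface documentation:
(1) Mapping between the protocol and the interfaces, 12.05
(2) Data structure definitions, 12.05

Li Shulong's tasks

1. EPaxos paper: read and understand the protocol part
2. Replica interface documentation:
(1) Mapping between the protocol and the interfaces, 12.15
(2) Data structure definitions, 12.15

Try to keep to the deadlines above.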

A question about fast-quorum = N-1

Assume we have N=3 replicas, thus:
quorum = ⌊N/2⌋ + 1 = 2
and fast-quorum = N-1 = 2

R1 received a request and initialized an instance R1.a. R2 received a request R2.b, and R3 received a request R3.c.
R1 sent PreAccept to R2 and R3.
Then R2 will store "R1.a depends on R2.b".
Then R3 will store "R1.a depends on R3.c".

As you mentioned in #16 :

Yes, allEqual is still necessary even if you don't use seq. That's necessary for correct recovery of a command committed on the fast path. However, allCommitted is no longer necessary.

Now R1 could commit either "R1.a depends on R2.b" or "R1.a depends on R3.c", depending on whose reply it received.
Then R1 is powered off.
The recovery process will see two values, "R1.a depends on R2.b" and "R1.a depends on R3.c". Either of them could have been committed by R1, because, together with R1 (the leader of R1.a), either R2 or R3 could have formed a fast-quorum.

So my question is: how does the recovery process choose a value to send in the Accept request?
In my opinion, the only way to prevent this from happening is to allow fast-commit only when all dependent instances are committed. But you said this is not necessary, which confuses me.

According to the recovery algorithm, a recovering replica R may see two values that both achieve ⌊N/2⌋ replies when N=3.
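
The arithmetic used in this question can be written down as a small sketch. It assumes the majority quorum is ⌊N/2⌋ + 1 and takes the fast-quorum size N-1 from the question itself; the function names are hypothetical, and this is not a statement of how the project computes its quorums.

```rust
/// Majority quorum: floor(N/2) + 1.
pub fn quorum(n: usize) -> usize {
    n / 2 + 1
}

/// Fast quorum as assumed in the question above: N - 1.
/// `saturating_sub` keeps the result at 0 for n == 0 instead of underflowing.
pub fn fast_quorum(n: usize) -> usize {
    n.saturating_sub(1)
}
```

For N=3 both sizes are 2, so {R1, R2} and {R1, R3} are each large enough, which is exactly the situation the question describes.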

Define an Engine trait that provides abstract access methods

open()
add_instance()
update_instance()
...
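
One possible shape for such a trait, as a sketch only. The method names `open`, `add_instance` and `update_instance` come from the list above; `get_instance`, the `Instance` fields, the tuple `InstanceId`, `EngineError` and the in-memory `MemEngine` are assumptions added to make the example self-contained.

```rust
use std::collections::BTreeMap;

/// Hypothetical instance id: (replica id, index within that replica).
pub type InstanceId = (i64, i64);

/// Hypothetical, heavily simplified instance.
#[derive(Debug, Clone, PartialEq)]
pub struct Instance {
    pub id: InstanceId,
    pub committed: bool,
}

#[derive(Debug, PartialEq)]
pub enum EngineError {
    AlreadyExists,
    NotFound,
}

/// Abstract storage access; callers depend on this trait, not on a backend.
pub trait Engine {
    fn add_instance(&mut self, inst: Instance) -> Result<(), EngineError>;
    fn update_instance(&mut self, inst: Instance) -> Result<(), EngineError>;
    /// Not in the original list; added here so the sketch can be exercised.
    fn get_instance(&self, id: InstanceId) -> Option<Instance>;
}

/// Pure in-memory backend, e.g. for tests.
#[derive(Default)]
pub struct MemEngine {
    instances: BTreeMap<InstanceId, Instance>,
}

impl MemEngine {
    /// Counterpart of `open()`: an in-memory store has nothing to open.
    pub fn open() -> Self {
        Self::default()
    }
}

impl Engine for MemEngine {
    fn add_instance(&mut self, inst: Instance) -> Result<(), EngineError> {
        if self.instances.contains_key(&inst.id) {
            return Err(EngineError::AlreadyExists);
        }
        self.instances.insert(inst.id, inst);
        Ok(())
    }

    fn update_instance(&mut self, inst: Instance) -> Result<(), EngineError> {
        match self.instances.get_mut(&inst.id) {
            Some(slot) => {
                *slot = inst; // overwrite the stored instance
                Ok(())
            }
            None => Err(EngineError::NotFound),
        }
    }

    fn get_instance(&self, id: InstanceId) -> Option<Instance> {
        self.instances.get(&id).cloned()
    }
}
```

Code that handles requests can then take `&mut dyn Engine` (or a generic `E: Engine`) and be tested against `MemEngine` without a real storage backend.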

handle-accept-reply

handle-fast-accept-reply

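2019-12-27 discussion topics

Topics:

The snapshot feature is closely related to the epaxos crate, but so far snapshot has been a separate crate. Should it stay separate, or be merged into the epaxos crate as a mod?

liubaohai

Should seq be removed? Which approach should be used?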

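How about renaming instance_num to instance_offset?

Because "num" reads like a total count.
"offset" denotes the position of this instance under a given leader in the instanceSpace, analogous to an array index.

Candidate names:

instance_offset
offset
instOffset
instIndex
index
idx
position
pos

PS. What is the naming style for variables in Rust? instID or inst_id?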

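Directory structure

Requirements

The project may need to build multiple binaries (a server, a mgr, and so on).
Each component may need to be split into several repos, and third-party repos may need to be pulled in as source (or as forked and modified source).

Design

A top-level workspace that lists the binaries and libs below as members.

Repos pulled in as source are cloned in manually (not sure whether Rust has a good package-management approach for this; git-submodule can do it, but its commands take some learning).

/Cargo.toml:

[workspace]

members = [
    "cele",
    "mgr",
]

Directory layout: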

celeritas/
▾ cele/
  ▾ src/
      main.rs
    Cargo.toml
▾ mgr/
  ▾ src/
      main.rs
    Cargo.toml
  Cargo.toml

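Responsibilities of the snapshot module

1. Interaction with the outside

a. An executed command needs to write the keys and values it modified into the snapshot;

b. When a client reads a key, the lookup goes through the snapshot;

c. Committed logs need to be written to the snapshot;

d. Logs of commands that have not been executed yet must all be readable from outside; logs of commands that have already been executed must not be readable from outside at all;

2. Internal design

a. Key-value storage and log storage use 2 independent modules, accessed through 2 different sets of interfaces.

b. To satisfy requirement 1.d, the module must record which of the stored logs have been executed, and this record must stay consistent with the execute module. This is done when the execute module writes the keys it updates: in the same transaction it also records the executed logs associated with those keys.

c. Log storage needs a cleanup mechanism (for example, a purge once a day), and only executed logs may be purged.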
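
The log-side requirements (1.d, 2.b, 2.c) could be sketched as follows. `LogStore`, `LogId` and all method names are hypothetical, and the sketch leaves out the transactional coupling with the key-value store that 2.b calls for.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical log identifier.
pub type LogId = u64;

/// In-memory log store that tracks which logs have been executed.
#[derive(Default)]
pub struct LogStore {
    logs: BTreeMap<LogId, Vec<u8>>,
    executed: BTreeSet<LogId>,
}

impl LogStore {
    /// Store a committed log.
    pub fn append(&mut self, id: LogId, log: Vec<u8>) {
        self.logs.insert(id, log);
    }

    /// Record that the given logs have been executed (requirement 2.b).
    pub fn mark_executed(&mut self, ids: &[LogId]) {
        for id in ids {
            if self.logs.contains_key(id) {
                self.executed.insert(*id);
            }
        }
    }

    /// Only logs that are not yet executed are visible (requirement 1.d).
    pub fn get(&self, id: LogId) -> Option<&[u8]> {
        if self.executed.contains(&id) {
            return None;
        }
        self.logs.get(&id).map(|v| v.as_slice())
    }

    /// Remove executed logs only (requirement 2.c); returns how many were removed.
    pub fn purge_executed(&mut self) -> usize {
        let executed = std::mem::take(&mut self.executed);
        let before = self.logs.len();
        self.logs.retain(|id, _| !executed.contains(id));
        before - self.logs.len()
    }
}
```

A periodic job could call `purge_executed()`; logs that have not been executed are never touched by it.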

clean up after test: remove tmp db created: `components/epaxos/test.db/`

require: handle commit

No network request handling is involved. Input: the commit Request. Returns: the commit Reply.

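Replica initialization: cluster configuration

Is your feature request related to a problem? Please describe.
A yaml file provides the cluster configuration, mainly the ip and port of each node.

Describe the solution you'd like
Wrap a Rust yaml library and expose a simple interface for getting and setting configuration.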

Describe alternatives you've considered
none
Additional context
none

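Mock Engine: a pure in-memory implementation, used as the underlying storage for tests of request handling

One more thing is needed before the next development step: mock Engine with a pure in-memory implementation, to serve as the underlying storage for tests of a few request handlers. Does anyone have time to do it?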

replica: implement the exec part

Describe the solution you'd like

Find the set of smallest instances of every leader: S
If there are any a → b relations (a.final_deps ⊃ b.final_deps) in S, replace S with: S = S - {y | y ∈ S and ∃x ∈ S: y → x}
Execute all instances in S in instance-id order
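
The selection step above can be sketched as below. It reads a → b as "a.final_deps is a strict superset of b.final_deps" and drops every instance that depends on another member of S. The `Instance` struct, the tuple `InstanceId` and the function names are hypothetical; how the smallest instances are collected per leader is left out.

```rust
use std::collections::BTreeSet;

/// Hypothetical instance id: (leader id, index); tuples sort by leader, then index.
pub type InstanceId = (i64, i64);

#[derive(Debug, Clone)]
pub struct Instance {
    pub id: InstanceId,
    pub final_deps: BTreeSet<InstanceId>,
}

/// `a -> b`: a.final_deps is a strict superset of b.final_deps.
pub fn depends_on(a: &Instance, b: &Instance) -> bool {
    a.final_deps.len() > b.final_deps.len() && a.final_deps.is_superset(&b.final_deps)
}

/// Given S (the smallest instance of every leader), remove every y that
/// depends on some x in S, then return the remaining ids in id order.
pub fn next_to_execute(smallest: &[Instance]) -> Vec<InstanceId> {
    let mut ids: Vec<InstanceId> = smallest
        .iter()
        .filter(|y| !smallest.iter().any(|x| depends_on(y, x)))
        .map(|y| y.id)
        .collect();
    ids.sort();
    ids
}
```

An instance whose deps equal those of another instance is kept, since the relation is strict; only instances that have seen strictly more are postponed.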

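Yipu's tasks

Understand the EPaxos paper, using efficient/epaxos as the reference implementation - 2019/12/15
Define the in-memory data format of the log - 2019/12/15
Define the module's public Rust interface - 2019/12/15
Implement the module according to the EPaxos protocol
Depends on @wenbobuaa's persistent log structure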

handle-commit-reply

HandlePreaccept mock test.

handle-fast-accept

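Rust gRPC investigation

Is your feature request related to a problem? Please describe.
We want a simpler way to handle communication between nodes.
The candidates are socket and gRPC; pick the better of the two.

Describe the solution you'd like
If gRPC supports long-lived connections in a simpler way, choose gRPC.

Connections should be easy to establish
Complete or leftover data on a connection should be easy to discard, so that it does not affect the next RPC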

Describe alternatives you've considered
socket

Additional context

A question about execution order

At some moment, part of the instance space on a replica looks like this:

|       .->d 
|      /   | 
b<-------->c 
a<----'    | 
|          | 
------------

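a and d are 2 interfering instances that depend on each other.
b and c are 2 interfering instances that depend on each other.

For example, a and d set x, while b and c set y.

Judging from the EPaxos description and the code, the overall order of these 4 instances does not seem to be guaranteed.
Only the order within the SCC {a, d} and within the SCC {b, c} is guaranteed.

So which of the two groups, {a, d} or {b, c}, executes first simply depends on which one commits first?
See, for example, the execution logic starting at line 429 of epaxos.go.

This feels a bit off. According to the paper, both execution orders are possible: ad then bc, or bc then ad.
But the real-time order could have been a, c, b, d.
That means d may execute before c, which seems somewhat unreasonable, although it is not actually incorrect.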

handle-fast-accept: test: req.initial_deps is None

In this case it should respond with an Error.

Currently it just calls unwrap() on it and would therefore panic, which is not expected.

deps_committed should be checked too.
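
A sketch of the non-panicking check, assuming both fields are `Option`s on the request. `FastAcceptRequest` here is a simplified stand-in for the real request type, and `HandlerError` and `check_fast_accept` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
pub enum HandlerError {
    /// A required field was absent from the request.
    MissingField(&'static str),
}

/// Hypothetical, simplified request; field types are assumptions.
pub struct FastAcceptRequest {
    pub initial_deps: Option<Vec<i64>>,
    pub deps_committed: Option<Vec<bool>>,
}

/// Validate the request instead of calling `unwrap()`:
/// a missing field becomes an error the handler can send back.
pub fn check_fast_accept(req: &FastAcceptRequest) -> Result<(&[i64], &[bool]), HandlerError> {
    let deps = req
        .initial_deps
        .as_deref()
        .ok_or(HandlerError::MissingField("initial_deps"))?;
    let committed = req
        .deps_committed
        .as_deref()
        .ok_or(HandlerError::MissingField("deps_committed"))?;
    Ok((deps, committed))
}
```

The handler can map `HandlerError` onto whatever error the reply message carries, so a malformed request never brings the replica down.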

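Liu Baohai's tasks

Understand the EPaxos paper; tentative deadline 2019-12-16
Read the EPaxos protocol implementation in efficient/epaxos; tentative deadline 2019-12-16
Implement exec; depends on @pengsven's log data structure definitions and @lishulong's Replica data structure definitions
k/v persistence; depends on @wenbobuaa's interface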

Auto-generate the string names of an enum

Is the requested refactoring related to a known/potential problem? Please describe.

Describe what to do

Automate the step of converting enum variants to strings with generated code:

impl DBCF {
    pub fn as_str<'a>(&self) -> &'a str {
        match self {
            DBCF::Default => "default",
            DBCF::Instance => "instance",
            DBCF::Config => "config",
            DBCF::Conflict => "conflict",
        }
    }

    fn all<'a>() -> Vec<&'a str> {
        vec![
            DBCF::Default.as_str(),
            DBCF::Instance.as_str(),
            DBCF::Config.as_str(),
            DBCF::Conflict.as_str(),
        ]
    }
}

Following this approach, use a macro to do it.
https://stackoverflow.com/questions/32710187/get-enum-as-string
Just realized this is probably how quick-error is implemented!
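
One way such a macro could look, as a sketch using plain `macro_rules!`. The macro name `str_enum` is hypothetical; the variant names and strings are the ones from the hand-written `DBCF` impl above.

```rust
/// Declare an enum together with `as_str()` and `all()`,
/// so the variant-to-string mapping is written only once.
macro_rules! str_enum {
    ($name:ident { $($variant:ident => $s:expr),+ $(,)? }) => {
        #[derive(Debug, Clone, Copy, PartialEq, Eq)]
        pub enum $name {
            $($variant),+
        }

        impl $name {
            /// String name of this variant.
            pub fn as_str(&self) -> &'static str {
                match self {
                    $($name::$variant => $s),+
                }
            }

            /// String names of all variants, in declaration order.
            pub fn all() -> Vec<&'static str> {
                vec![$($name::$variant.as_str()),+]
            }
        }
    };
}

str_enum!(DBCF {
    Default => "default",
    Instance => "instance",
    Config => "config",
    Conflict => "conflict",
});
```

Adding a new column family then means adding one line to the `str_enum!` invocation; `as_str()` and `all()` stay in sync automatically.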

module Replication

Replication is responsible for forwarding an instance to other replicas in order to achieve a fast-quorum or a quorum.

Weekly Digest (5 April, 2020 - 12 April, 2020)

Here's the Weekly Digest for openacid/celeritasdb:

ISSUES

This week, 10 issues were created. Of these, 9 issues have been closed and 1 issue is still open.

OPEN ISSUES

💚 #234 Refactor server data, by drmingdrmer

CLOSED ISSUES

❤️ #233 remove transaction trait, add write batch and deletels, refine some m…, by liubaohai
❤️ #232 Improve server, by drmingdrmer
❤️ #231 feat: server: add clusterconf to specify a cluster it serves for., by drmingdrmer
❤️ #230 Get fast commit deps, by drmingdrmer
❤️ #229 fix: bug of fast-quorum: it must be greater than quorum, by drmingdrmer
❤️ #228 feat: Status.get_accept_deps(): to get deps for accept-request, by drmingdrmer
❤️ #227 fix: if fast-quorum is 0, then n==1. then fast-commit can always be done, by drmingdrmer
❤️ #226 change: Status now has an instance in it instead of a reference, by drmingdrmer
❤️ #225 feat: replica: add get_max_instance_ids(), which would be used when i…, by drmingdrmer

PULL REQUESTS

This week, 10 pull requests were proposed. Of these, 0 pull requests have been merged and 1 is still open.

OPEN PRs

💚 #234 Refactor server data, by drmingdrmer

CONTRIBUTORS

This week, 2 users have contributed to this repository.
They are drmingdrmer, and liubaohai.

STARGAZERS

This week, no user has starred this repository.

COMMITS

This week, there have been 14 commits in the repository.
These are:
🛠️ remove transaction trait, add write batch and deletels, refine some methods by liubaohai
🛠️ feat: add ServerData::new() and get_local_replica_for_key() to find out what group and replica serves a key by drmingdrmer
🛠️ feat: ClusterInfo::from_str() to load conf directly from yaml in str by drmingdrmer
🛠️ feat: server: add clusterconf to specify a cluster it serves for. (2411e37) by drmingdrmer

feat: CeleServer as server level error type.
TODO: take a better name for it.
feat: Server should be able to be cloned and shared across coroutines. Thus core data is put into an Arc struct.
feat: Server::new() takes a parameter Storage.
feat: Server initialize cluster info, replicas etc.
refactor: extract protocol impl out of server. Adds RedisApi struct.
🛠️ test: set key range to [a, z] for test cluster by drmingdrmer
🛠️ feat: Storage need to be marked with Send and Sync in order to share between threads by drmingdrmer
🛠️ feat: Status.get_fast_commit_deps(): to get deps for fast-commit by drmingdrmer
🛠️ fix: bug of fast-quorum: it must be greater than quorum by drmingdrmer
🛠️ feat: Status.get_accept_deps(): to get deps for accept-request by drmingdrmer
🛠️ doc: update pr template by drmingdrmer
🛠️ fix: if fast-quorum is 0, then n==1. then fast-commit can always be done by drmingdrmer
🛠️ change: Status now has an instance in it instead of a reference (https://github.com/openacid/celeritasdb/commit/853f1b026263e0ccb75c075b9abae055bea2ac64) by drmingdrmer

There is an issue of rust: a fake modify-a-borrowed problem:

struct Inst { a: i32, }
struct St<'a> { inst: &'a Inst, }
fn borrow_it(s: &mut St) { }

fn main() {
    let mut i = Inst{a:3};
    let mut s = St{ inst: &i, };
    i.a = 5;
    borrow_it(&mut s);
}

error[E0506]: cannot assign to `i.a` because it is borrowed
  --> refmut.rs:16:5
   |
14 |         inst: &i,
   |               -- borrow of `i.a` occurs here
15 |     };
16 |     i.a = 5;
   |     ^^^^^^^ assignment to borrowed `i.a` occurs here
17 |     borrow_it(&mut s);
   |               ------ borrow later used here

error: aborting due to previous error

🛠️ feat: replica: add get_max_instance_ids(), which would be used when initiating instance (https://github.com/openacid/celeritasdb/commit/cbd707fc3f52ad74b17e151af173be5b67d7af84) by drmingdrmer
🛠️ feat: do not allow InstanceId.idx to be negative (https://github.com/openacid/celeritasdb/commit/21f6ee5d532381a96b55e59adc1607c144d4590f) by drmingdrmer

RELEASES

This week, no releases were published.

That's all for this week, please watch 👀 and star ⭐ openacid/celeritasdb (https://github.com/openacid/celeritasdb) to receive next weekly updates. 😃

Engine (snapshot) needs to provide transactions

Is your feature request related to a problem? Please describe.

Describe the solution you'd like
Needed when the executor writes key-values.

Interface:

Engine.begin() -> tx: begin a transaction;
tx.get_kv(key)
tx.get_kv_for_update(key): the returned key cannot be modified by other transactions;
tx.set_kvs(keys, values, instances)
tx.commit(): commit the transaction;
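
A rough in-memory sketch of the begin/commit flow, under a strong simplifying assumption: a transaction holds one global lock for its whole lifetime, so every read is effectively "for update" and transactions never run concurrently. `MemEngine`, `Tx` and the single-key `set_kv` are hypothetical; the `instances` argument of `set_kvs` is omitted.

```rust
use std::collections::BTreeMap;
use std::sync::{Mutex, MutexGuard};

type Map = BTreeMap<Vec<u8>, Vec<u8>>;

/// In-memory engine whose data is protected by a single mutex.
pub struct MemEngine {
    kv: Mutex<Map>,
}

/// A transaction: holds the lock and buffers writes until `commit()`.
pub struct Tx<'a> {
    guard: MutexGuard<'a, Map>,
    pending: Map,
}

impl MemEngine {
    pub fn new() -> Self {
        MemEngine { kv: Mutex::new(Map::new()) }
    }

    /// Begin a transaction; blocks while another transaction is open.
    pub fn begin(&self) -> Tx<'_> {
        Tx { guard: self.kv.lock().unwrap(), pending: Map::new() }
    }
}

impl<'a> Tx<'a> {
    /// Read a key, seeing this transaction's own uncommitted writes first.
    pub fn get_kv(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.pending.get(key).or_else(|| self.guard.get(key)).cloned()
    }

    /// Buffer a write; nothing is visible to others until `commit()`.
    pub fn set_kv(&mut self, key: &[u8], value: &[u8]) {
        self.pending.insert(key.to_vec(), value.to_vec());
    }

    /// Apply all buffered writes, then release the lock.
    pub fn commit(mut self) {
        let pending = std::mem::take(&mut self.pending);
        self.guard.extend(pending);
    }
}
```

Dropping a `Tx` without calling `commit()` discards its buffered writes, which gives a simple rollback. A real engine would more likely rely on the transaction support of its storage backend instead of a global lock.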

Describe alternatives you've considered

Additional context

test: linear probing

the link: https://en.wikipedia.org/wiki/Linear_probing

impl From<A:Into<Replica>,B:Into<i64>> for InstanceId
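
The title is shorthand rather than valid Rust. One way to express it is a conversion from a pair, sketched below; the `InstanceId` fields and the `ReplicaId` alias (standing in for `Replica` in the title) are assumptions.

```rust
/// Hypothetical replica identifier.
pub type ReplicaId = i64;

/// Hypothetical instance id: a replica plus an index on that replica.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct InstanceId {
    pub replica_id: ReplicaId,
    pub idx: i64,
}

/// Build an InstanceId from any pair whose parts convert into the field types.
impl<A: Into<ReplicaId>, B: Into<i64>> From<(A, B)> for InstanceId {
    fn from(t: (A, B)) -> Self {
        InstanceId {
            replica_id: t.0.into(),
            idx: t.1.into(),
        }
    }
}
```

This lets tests write `InstanceId::from((1, 2))` or `(1i32, 2u8).into()` without spelling out the struct fields.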

Li Shulong's tasks

Engine (snapshot) needs to provide an instance access interface

Is your feature request related to a problem? Please describe.

Describe the solution you'd like
Interfaces to provide:

set_instance(instance): store an instance, with instance.instance_id as the key and the serialized instance ([u8]) as the value;

update_instance(instance): overwrite the value stored under an existing key instance.instance_id with the given instance;

get_instance(instance_id): return the instance for an instance_id, or an error;

scan_instance(replica_id): return an iterator for a replica_id, used to traverse all instances on that replica;
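
A sketch of how `scan_instance` could be served by an ordered map: if keys sort by replica first and index second, all instances of one replica form a contiguous key range. `InstanceStore`, the `InstanceId` layout and the `String` error type are assumptions; values are already-serialized bytes, and `update_instance` is left out.

```rust
use std::collections::btree_map::Range;
use std::collections::BTreeMap;

pub type ReplicaId = i64;

/// Field order matters: the derived `Ord` compares replica_id first, then idx.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct InstanceId {
    pub replica_id: ReplicaId,
    pub idx: i64,
}

/// Ordered in-memory store: instance id -> serialized instance.
#[derive(Default)]
pub struct InstanceStore {
    instances: BTreeMap<InstanceId, Vec<u8>>,
}

impl InstanceStore {
    /// Store serialized instance bytes under their id.
    pub fn set_instance(&mut self, id: InstanceId, bytes: Vec<u8>) {
        self.instances.insert(id, bytes);
    }

    /// Return the stored bytes, or an error if the id is unknown.
    pub fn get_instance(&self, id: InstanceId) -> Result<&[u8], String> {
        self.instances
            .get(&id)
            .map(|v| v.as_slice())
            .ok_or_else(|| format!("instance not found: {:?}", id))
    }

    /// Iterate over all instances of one replica, in index order.
    pub fn scan_instance(&self, replica_id: ReplicaId) -> Range<'_, InstanceId, Vec<u8>> {
        let lo = InstanceId { replica_id, idx: i64::MIN };
        let hi = InstanceId { replica_id, idx: i64::MAX };
        self.instances.range(lo..=hi)
    }
}
```

With a key-ordered on-disk store the same idea would presumably become a prefix or range scan over encoded keys.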

Describe alternatives you've considered

Additional context
Depends on serialization of instances.

Add a hello world

A hello world program in Rust.
Add a Travis automated-test configuration: refer to the travis.yml in slim and define the automated test script. Travis CI results can be seen at https://travis-ci.org/openacid/celeritasdb.

Engine (snapshot) needs to provide a key-value access interface

Is your feature request related to a problem? Please describe.

Describe the solution you'd like
Interfaces to provide:

set_kv(keys, values, instances): write key-values into the snapshot and update the related instances at the same time.

get_kv(key): read a value from the snapshot.

Describe alternatives you've considered

Additional context

require: handle accept

This function involves no network transport. Input: an Accept Request. Returns: an Accept Reply.

protobuf message header

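Define a message header that can be serialized/deserialized directly.

A protobuf message does not describe its own size, so a fixed-size header is needed to do that.

The header contains:

version
header size
body size (the size of the actual serialized protobuf message)

We could first look at how others do it.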
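
A minimal sketch of such a header, assuming three big-endian `u32` fields in the order listed above (12 bytes in total). The field widths, byte order and all names are assumptions, not a format the project has settled on.

```rust
/// Assumed format version and fixed header length in bytes.
pub const HEADER_VERSION: u32 = 1;
pub const HEADER_SIZE: usize = 12;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MsgHeader {
    pub version: u32,
    pub header_size: u32,
    /// Length of the serialized protobuf message that follows the header.
    pub body_size: u32,
}

impl MsgHeader {
    /// Header describing `body`; assumes the body is smaller than 4 GiB.
    pub fn for_body(body: &[u8]) -> Self {
        MsgHeader {
            version: HEADER_VERSION,
            header_size: HEADER_SIZE as u32,
            body_size: body.len() as u32,
        }
    }

    /// Serialize into exactly HEADER_SIZE bytes, big-endian.
    pub fn encode(&self) -> [u8; HEADER_SIZE] {
        let mut buf = [0u8; HEADER_SIZE];
        buf[0..4].copy_from_slice(&self.version.to_be_bytes());
        buf[4..8].copy_from_slice(&self.header_size.to_be_bytes());
        buf[8..12].copy_from_slice(&self.body_size.to_be_bytes());
        buf
    }

    /// Parse a header from the start of `buf`; None if too few bytes.
    pub fn decode(buf: &[u8]) -> Option<Self> {
        if buf.len() < HEADER_SIZE {
            return None;
        }
        let u32_at = |i: usize| u32::from_be_bytes([buf[i], buf[i + 1], buf[i + 2], buf[i + 3]]);
        Some(MsgHeader {
            version: u32_at(0),
            header_size: u32_at(4),
            body_size: u32_at(8),
        })
    }
}

/// Prefix an already-serialized body with its header.
pub fn frame(body: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER_SIZE + body.len());
    out.extend_from_slice(&MsgHeader::for_body(body).encode());
    out.extend_from_slice(body);
    out
}
```

A reader first reads `HEADER_SIZE` bytes, decodes the header, then reads exactly `body_size` more bytes and hands them to the protobuf decoder. Carrying `header_size` in the header leaves room for a later version to grow the header.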

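This part probably still needs one more step: applying the committed flags specified in the request to the local replica. Alternatively, leave a TODO for now and skip this optimization...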

Originally posted by @drmingdrmer in #182

ReplicaPeer needs a connect method

Is your feature request related to a problem? Please describe.

Describe the solution you'd like

Describe alternatives you've considered

Additional context

travis-ci needs a 3-node environment to run integration tests

The integration test inserts a value and reads it back from all 3 nodes

define SMR

open() opens an underlying storage for the logs and the snapshot.

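12.08 wenbo's tasks

Study the EPaxos execution process in detail and write a detailed requirements description for the snapshot module. Submit before Tuesday evening; @pengsven and @liubaohai need to review and comment.
Based on item 1, produce a first version of the Rust interface definitions, including interface comments and examples. Submit before Thursday evening.
Learn Rust as needed to support item 2.
Learn RocksDB, mainly the Rust interface and how it is used in practice.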

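xp's tasks

Put together a working redis server to serve as the framework that things get filled into; currently digging through the code of github.com/seppo0010/rsedis.
- A working version exists now; next is some code cleanup to make the server framework clearer. It will be submitted around Tuesday or Wednesday this week.
- At least @lishulong and @liubaohai need to look at the code structure.
- To be able to deploy and test, parsing of the redis get/set commands still needs to be completed so that the actual arguments are available.
Learning Rust like crazy. Probably until 12-14.

It looks like the redis get/set command parsing and encoding modules need to be built first. We'll see who is free then.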

Use Vec<u8> instead of String for internal calls

Is the requested refactoring related to a known/potential problem? Please describe.

Describe what to do

Describe alternatives you've considered

Additional context

Affect other component or side effect

Replace the redis-response generation part with the redis crate.

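Definitions of several basic EPaxos data structures

Basic data structures

Common data structures referenced by multiple modules in the system need to be discussed together, and their implementation details agreed on.

What needs to be defined

At the moment the following basic structures need to be defined:

/// Command is a single command, representing an instruction from a client. An example:
pub enum Command {
    Set(String, String),
    Get(String),
    Del(String),
}

/// InstanceStatus represents the state of an instance at each stage of its lifecycle. An example:
/// This example is defined with an `enum`; whether serializing it to protobuf causes any problems is to be discussed.
pub enum InstanceStatus {
    PreAccepted,
    Accepted,
    Committed,
    Executed,
    Purged,
}

/// InstanceID uniquely identifies an instance on one replica. An example:
/// Whether using a number here causes any problems is to be discussed.
pub type InstanceID = i64;

/// ReplicaID uniquely identifies a replica. An example:
/// Whether using a String here causes any problems is to be discussed.
pub type ReplicaID = String;

/// CommandSeq is the version of a Command within one round of EPaxos agreement. An example:
/// Whether using a number here causes any problems is to be discussed.
pub type CommandSeq = i64;

/// Example definition of Instance:
/// instance_id + replica_id uniquely identify an instance in the whole system
pub struct Instance {
    pub instance_id: InstanceID,
    pub replica_id: ReplicaID,
    pub cmds: Vec<Command>,
    pub seq: CommandSeq,
    pub deps: Vec<(InstanceID, ReplicaID)>,
    pub status: InstanceStatus,
}

There may be more to add~

Where

In which module are these basic data types defined? Which component of the project does that correspond to? (Or how should it be organized...)
And how do other components use them?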

impl Ord for InstanceID

Implement the Ord trait for InstanceID so that any 2 InstanceIDs can be compared
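
A hand-written `Ord` could look like the sketch below. It assumes an `InstanceID` made of a replica id and an index, ordered by replica first and index second; both the struct layout and that ordering rule are assumptions, since the issue does not say how ids of different replicas should compare.

```rust
use std::cmp::Ordering;

/// Hypothetical struct form of InstanceID.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct InstanceID {
    pub replica_id: i64,
    pub idx: i64,
}

impl Ord for InstanceID {
    /// Compare by replica id first, then by index within the replica.
    fn cmp(&self, other: &Self) -> Ordering {
        self.replica_id
            .cmp(&other.replica_id)
            .then(self.idx.cmp(&other.idx))
    }
}

impl PartialOrd for InstanceID {
    /// Delegate to `cmp` so both orderings always agree.
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
```

With this in place, `InstanceID` values can be sorted and used as keys in ordered collections such as `BTreeMap`. When the fields are declared in comparison order, `#[derive(PartialOrd, Ord)]` produces the same result.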

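snapshot.StatusEngine trait should not define the two functions `set_max_inst` and `set_max_exec_inst`

Describe what to do

set_max_instance is always used together with set_instance, and it sets an internal state, so it should not be callable from outside;
likewise, set_max_exec_inst is tied to update_instance.

So the proposal is to move the logic of these two functions into set_instance and update_instance, and to expose only read interfaces rather than separate external setters.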

require: Init instance

Implement Init instance

Input: cmds

create an instance with all necessary fields filled.
