apache / incubator-pegasus
Apache Pegasus - A horizontally scalable, strongly consistent and high-performance key-value store
Home Page: https://pegasus.apache.org/
License: Apache License 2.0
>>> use temp
OK
>>> full_scan
partition: all
hash_key_filter_type: no_filter
sort_key_filter_type: no_filter
batch_size: 100
max_count: 2147483647
timout_ms: 5000
detailed: false
no_value: false
"a" : "m_1" => "a"
"a" : "m_2" => "a"
"a" : "m_3" => "a"
"a" : "m_4" => "a"
"a" : "m_5" => "a"
"a" : "n_1" => "b"
"a" : "n_2" => "b"
"a" : "n_3" => "b"
8 key-value pairs got.
>>> full_scan --batch_size 10 -s prefix -y m
partition: all
hash_key_filter_type: no_filter
sort_key_filter_type: prefix
sort_key_filter_pattern: "m"
batch_size: 10
max_count: 2147483647
timout_ms: 5000
detailed: false
no_value: false
"a" : "m_1" => "a"
"a" : "m_2" => "a"
"a" : "m_3" => "a"
"a" : "m_4" => "a"
"a" : "m_5" => "a"
5 key-value pairs got.
>>> full_scan --batch_size 3 -s prefix -y m
partition: all
hash_key_filter_type: no_filter
sort_key_filter_type: prefix
sort_key_filter_pattern: "m"
batch_size: 3
max_count: 2147483647
timout_ms: 5000
detailed: false
no_value: false
"a" : "m_1" => "a"
"a" : "m_2" => "a"
"a" : "m_3" => "a"
"a" : "m_4" => "a"
"a" : "m_5" => "a"
"a" : "n_1" => "b"
"a" : "n_2" => "b"
"a" : "n_3" => "b"
8 key-value pairs got.
>>>
Is the version released on GitHub the same as the version used internally at Xiaomi?
Build environment:
Machine: Linux version 3.2.0-61-generic (buildd@roseapple) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #93-Ubuntu SMP Fri May 2 21:31:50 UTC 2014
GCC: 4.8.4
CMake: 2.8.12.2
Error message:
skip build fmtlib
skip build Poco
-DPOCO_INCLUDE=/data/bigdata/pegasus/pegasus/rdsn/thirdparty/output/include -DPOCO_LIB=/data/bigdata/pegasus/pegasus/rdsn/thirdparty/output/lib -DGTEST_INCLUDE=/data/bigdata/pegasus/pegasus/rdsn/thirdparty/output/include -DGTEST_LIB=/data/bigdata/pegasus/pegasus/rdsn/thirdparty/output/lib -DCMAKE_POSITION_INDEPENDENT_CODE=ON
-- Configuring done
-- Generating done
-- Build files have been written to: /data/bigdata/pegasus/pegasus/rdsn/thirdparty/build/fds
[ 78%] Built target galaxy-fds-sdk-cpp
[ 84%] Built target sample
Linking CXX executable testrunner
/usr/bin/ld: cannot find -lgtest
/usr/bin/ld: cannot find -lgtest_main
collect2: error: ld returned 1 exit status
make[2]: *** [test/testrunner] Error 1
make[1]: *** [test/CMakeFiles/testrunner.dir/all] Error 2
make: *** [all] Error 2
build fds failed
ERROR: build rdsn failed
Back when Pegasus did not yet support the INCR operator, we still defined RPC_RRDB_RRDB_INCR (refer to https://github.com/XiaoMi/pegasus/blob/v1.9.2/src/server/pegasus_server_impl.cpp#L30) to keep compatibility with v1.4.x.
If we send an RPC_RRDB_RRDB_INCR rpc using Pegasus Java Client 1.9.0 to a Pegasus Server <= 1.9.2, the pegasus server will coredump:
D2018-07-15 21:27:23.725 (1531661243725307352 0295) replica.io-thrd.00661: network.cpp:619:on_server_session_accepted(): server session accepted, remote_client = x.x.x.x:xxxxx, current_count = 5
F2018-07-15 21:27:23.727 (1531661243727366961 02aa) replica.default3.030002940001000e: pegasus_server_impl.cpp:86:handle_request(): assertion expression: false
F2018-07-15 21:27:23.727 (1531661243727404626 02aa) replica.default3.030002940001000e: pegasus_server_impl.cpp:86:handle_request(): recv message with unhandled rpc name RPC_RRDB_RRDB_INCR from x.x.x.x:xxxxx, trace_id = 0000000000000000
That is: if an rpc code is defined but not handled, the server will core. That is not robust enough.
Refer to storage_serverlet.h:82.
Only scan hash keys when doing a full scan.
When compiling with the toolchain on the build machine, I found that if the environment variable
export LIBRARY_PATH="$DSN_THIRDPARTY_ROOT/lib"
is set, then the following statement in CMakeLists.txt does not take effect:
link_directories(${DSN_THIRDPARTY_ROOT}/lib)
As a result, ${DSN_THIRDPARTY_ROOT}/lib is not among the -L paths at link time, and linking fails because the libraries cannot be found, or the wrong libraries are found.
A possibly related link: https://public.kitware.com/Bug/view.php?id=16074
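Assuming the cause is CMake treating directories listed in LIBRARY_PATH as implicit link directories and filtering them from its -L list (as the linked bug report describes), one workaround is to stop relying on link_directories() and link by absolute path instead. The fragment below is a sketch; GTEST_LIB is a hypothetical cache variable name, and testrunner is the target that failed to link above:

```cmake
# Workaround sketch: find_library() returns a full path, so linking no
# longer depends on -L ordering or on link_directories() taking effect.
find_library(GTEST_LIB gtest PATHS ${DSN_THIRDPARTY_ROOT}/lib NO_DEFAULT_PATH)
find_library(GTEST_MAIN_LIB gtest_main PATHS ${DSN_THIRDPARTY_ROOT}/lib NO_DEFAULT_PATH)
target_link_libraries(testrunner ${GTEST_LIB} ${GTEST_MAIN_LIB})
```

Alternatively, unsetting LIBRARY_PATH in the build environment before invoking cmake should also restore the original link_directories() behavior.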
Hi, I'm getting started with Pegasus by reading the PacificA consensus paper.
In the paper, a primary/secondary data node can suspect that its peer has become faulty,
and then report this to the configuration manager.
After the configuration manager removes the faulty node, the replica count of the
replication group is reduced.
In Pegasus's implementation, does the configuration manager automatically pick a new node
and add it to the replication group, or is this done via some tools?
Thanks in advance.
Pegasus travis test fails occasionally:
D2018-06-10 13:30:02.390 (1528637402390718359 3e4e) mimic.io-thrd.15950: client session created, remote_server = 127.0.0.1:34601, current_count = 1
sleep 1 second to wait complete...
D2018-06-10 13:30:02.390 (1528637402390939619 3e59) mimic.io-thrd.15961: client session connected, remote_server = 127.0.0.1:34601, current_count = 1
new app_id = 2
sleep 10s to wait app become healthy...
partition[0] is unhealthy, coz primary is invalid...
(the two lines above repeat about 12 times)
/home/travis/build/XiaoMi/pegasus/src/test/function_test/test_restore.cpp:318: Failure
Value of: restore()
Actual: false
Expected: true
But if you rebuild it, it usually succeeds.
As the title says.
Now we have read/write QPS counters, but that is not enough, because the size of each request is not taken into account.
In the ad-CTR A/B setup, we need to use ./script/pegasus_set_usage_scenario.sh to set a table's usage scenario.
There are now two clusters, c3srv-adb and c4srv-adb, in the c3 and c4 datacenters respectively. When data needs to be loaded, the workflow starts two tasks at the same time to set the usage_scenario of the two clusters.
Since the two tasks run on the same machine and use the shell from the same pegasus tools directory, only one cluster ever gets set successfully.
The reason: when ./run.sh shell --cluster=xxx runs, it generates config-shell.ini in the current directory. If multiple shells run at the same time, they all generate config-shell.ini, and the files overwrite each other.
and update the wiki doc.
to improve write performance.
Just like Hive HBase Integration and Hive Mongo Integration, we can integrate Pegasus into Hive, so that users can run SQL queries on Pegasus, like Using Hive to interact with HBase.
Sounds cool, doesn't it?
We can add random fault injection on pegasus_write_service::impl::db_write
to test the condition where rocksdb fails.
As the title says; the whole process still feels rather complicated. Thanks!
The physical deletion logic for expired tables is currently too complex and depends on too many things; it needs to be simplified.
See "physical deletion of expired table data".
Currently, if we want to test learning in the kill test, we can only kill the replica server process, which is not friendly for memory-leak testing. Perhaps we'd better add a command that kills one partition rather than the whole replica server process.
Building Pegasus now takes more than 30 minutes (the Travis CI timeout is 50 minutes); we need to speed it up.
When we run "cluster_info" in the shell, the "meta_servers" list is a static value that shows the meta servers configured when the cluster was initialized:
>>> cluster_info
meta_servers : 10.112.3.11:30601,10.112.3.10:30601
primary_meta_server : 10.112.3.11:30601
zookeeper_hosts : 10.112.3.11:2181,10.112.3.10:2181,10.112.2.33:2181
zookeeper_root : /pegasus/c3tst-sample
meta_function_level : freezed
When a new meta server is added to the cluster dynamically, the "meta_servers" list won't change.
We should resolve this.
I saw your article on WeChat mentioning that partition scheduling uses a network-flow algorithm. I happen to have worked on a min-cost-flow algorithm for this problem before; its objectives include spreading replicas across racks and minimizing the total amount of migration. Feel free to take a look if you're interested.
This appears many times in unit tests:
W2018-07-18 10:16:05.669 (1531880165669313326 617f) mimic.io-thrd.24959: io_getevents returns -4, you probably want to try on another machine:-(
Currently, the only way to quickly clean the data of a table is to drop the table and then create a new one. Perhaps we can support a "truncate table" command to let users clean a table quickly.
Two packages not mentioned in the docs are needed:
1. zlib-devel
Required at compile time; the build fails without it.
2. nmap-ncat
Without it, the ./run.sh start_onebox step hits:
./scripts/start_zk.sh: line 62: nc: command not found
The build has now completed, and I'm continuing my newcomer experiments.
Thanks to all the contributors.
When a node has been down for quite a long time and then comes back, how is its data synchronized?
now the priority queue's dequeue() is:
T dequeue_impl(/*out*/ long &ct, bool pop = true)
{
    if (_count == 0) {
        ct = 0;
        return nullptr;
    }
    ct = --_count;
    int index = priority_count - 1;
    for (; index >= 0; index--) {
        if (_items[index].size() > 0) {
            break;
        }
    }
    assert(index >= 0); // must find something
    auto c = _items[index].front();
    _items[index].pop();
    return c;
}
If the HIGH priority queue is always non-empty, tasks in the COMMON/LOW queues may be starved.
We can refer to the implementation of nfs_client_impl.
We now use lots of assert() calls in our code, and we do not define the NDEBUG macro even when compiling in release mode.
To improve:
System: Ubuntu 16.04
➜ thirdparty git:(dc3a3ee) ✗ ./build-thirdparty.sh
+++ dirname ./build-thirdparty.sh
++ cd .
++ pwd
-- Thrift version: 0.9.3 (0.9.3)
-- Thrift package version: 0.9.3
-- Build configuration Summary
-- Build Thrift compiler: OFF
-- Build with unit tests: OFF
-- Build examples: OFF
-- Build Thrift libraries: ON
-- Language libraries:
-- Build C++ library: OFF
-- - Boost headers missing
-- Build C (GLib) library: OFF
-- - Disabled by via WITH_C_GLIB=OFF
-- Build Java library: OFF
-- - Disabled by via WITH_JAVA=OFF
-- - Ant missing
-- Build Python library: OFF
-- - Disabled by via WITH_PYTHON=OFF
-- Library features:
-- Build shared libraries: OFF
-- Build static libraries: ON
-- Build with ZLIB support: ON
-- Build with libevent support: OFF
-- Build with Qt4 support: OFF
-- Build with Qt5 support: OFF
-- Build with OpenSSL support: OFF
-- Build with Boost thread support: OFF
-- Build with C++ std::thread support: OFF
-- Configuring done
-- Generating done
-- Build files have been written to: /home/listar/Code/pegasus/rdsn/thirdparty/build/thrift-0.9.3
Found the cause: the Boost version was too old. Two versions coexisted on the system; one of them, 1.44, was too old. After removing it, everything was OK.
By default, Pegasus supports only ascending order when scanning data.
In some cases, we want to scan the data in reverse (descending) order.
I saw "#pragma once" in header files; it would be better to use "#ifndef XXX #define XXX ... #endif".
As the issue title asks: does Pegasus support distributed transactions, so that I can use it like the following code?
pegasus.beginTransaction();
pegasus.put("key",value);
v = pegasus.get("key");
v++;
pegasus.put("key",v);
pegasus.commit();
// multi_set with value="99": after the set, reading the value back gives a wrong result.
The output seen via run.sh shell is:
multi_get_range test_key sortkey sortkez
hash_key: "test_key"
start_sort_key: "sortkey"
start_inclusive: true
stop_sort_key: "sortkez"
stop_inclusive: false
sort_key_filter_type: no_filter
max_count: -1
no_value: false
reverse: false
"test_key" : "sortkey_0" => "V\x7F"
"test_key" : "sortkey_1" => "V\x7F"
"test_key" : "sortkey_2" => "V\x7F"
"test_key" : "sortkey_3" => "V\x7F"
"test_key" : "sortkey_4" => "V\x7F"
"test_key" : "sortkey_5" => "V\x7F"
"test_key" : "sortkey_6" => "V\x7F"
Below is the multi_set code:
int main(int argc, const char *argv[])
{
    if (!pegasus_client_factory::initialize("config.ini")) {
        fprintf(stderr, "ERROR: init pegasus failed\n");
        return -1;
    }
    if (argc < 3) {
        fprintf(stderr, "USAGE: %s <cluster-name> <app-name>\n", argv[0]);
        return -1;
    }
    int run_key_count = 2;
    if (argc == 4) {
        run_key_count = atoi(argv[3]);
    }
    // set
    pegasus_client *client = pegasus_client_factory::get_client(argv[1], argv[2]);
    std::string hashKey = "test_key";
    std::map<std::string, std::string> kvs;
    for (int j = 0; j < 7; ++j) {
        std::string sortKey = "sortkey_" + std::to_string(j);
        kvs[sortKey] = "99";
        printf("test:key:%s,value:%s\n", sortKey.c_str(), kvs[sortKey].c_str());
    }
    int ret = client->multi_set(hashKey, kvs);
    if (ret != PERR_OK) {
        return -1;
    }
    struct pegasus_client::multi_get_options optA;
    std::map<std::string, std::string> values;
    ret = client->multi_get(hashKey, "sortkey", "sortkez", optA, values);
    if (ret != PERR_OK && ret != PERR_INCOMPLETE) {
        return -1;
    }
    for (std::map<std::string, std::string>::iterator it = values.begin(); it != values.end(); ++it) {
        std::string newValue = "99";
        if (0 != strcmp(newValue.c_str(), it->second.c_str())) {
            fprintf(stdout, "ERROR: multi_get value headKey:%s, sortKey:%s, value:%s != value:%s\n",
                    hashKey.c_str(), it->first.c_str(),
                    it->second.c_str(), newValue.c_str());
            return -1;
        }
        fprintf(stdout, "hashkey:%s, sortkey:%s, value:%s\n", hashKey.c_str(), it->first.c_str(), it->second.c_str());
        // del
        ret = client->del(hashKey, it->first);
        if (ret != PERR_OK) {
            fprintf(stderr, "ERROR: del failed, error=%s\n", client->get_error_string(ret));
            return -1;
        }
    }
    return 0;
}
Environment
CentOS 7.3.1611
kernel 3.10.0-514.el7.x86_64
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16)
cmake 2.8.12.2
boost 1.53.0 Release 27.el7
Reference:
https://github.com/XiaoMi/pegasus/blob/master/docs/installation.md
1. Install development packages
yum -y install cmake boost-devel libaio-devel snappy-devel bzip2-devel
readline-devel
2. clone
3. build
The log is as follows:
ln: failed to create symbolic link ‘/root/pegasus/DSN_ROOT’: File exists
INFO: start build rdsn...
CLEAR=NO
BUILD_TYPE=debug
SERIALIZE_TYPE=
GIT_SOURCE=github
ONLY_BUILD=YES
RUN_VERBOSE=NO
WARNING_ALL=NO
ENABLE_GCOV=NO
Use system boost
CMAKE_OPTIONS= -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++
-DCMAKE_BUILD_TYPE=Debug -DDSN_GIT_SOURCE=github
MAKE_OPTIONS= -j8
#############################################################################
fatal: Not a git repository: ../.git/modules/rdsn
Currently, several scripts assume the minos client is in the dir "/home/work/pegasus/infra/minos/client", like:
We'd better fix this by making the variable "minos_client_dir" configurable.
Besides, minos 2.0 should also be supported in these scripts.
Though Pegasus already provides the bulk_load usage scenario on tables for faster write speed, it still uses the set or multiSet interface to insert data one by one. That is definitely not fast enough to load a very large amount of data (typically billions of rows) into Pegasus in a short time.
Maybe we can look for a better way, considering:
Then the idea is:
including:
and collectorapp.pegasusapp.stat.storage_count#all
2018/03/15 15:24
Pegasus Server 1.7.0 (9a7a067) Release
CentOS release 6.3 (Final)
work@c3-hadoop-ssd-tst-st04
/home/work/coresave/issue-13
#0 0x000000376e4328a5 in raise () from /lib64/libc.so.6
#1 0x000000376e434085 in abort () from /lib64/libc.so.6
#2 0x000000376e46ffe7 in __libc_message () from /lib64/libc.so.6
#3 0x000000376e475916 in malloc_printerr () from /lib64/libc.so.6
#4 0x00007f59df3ebe24 in deallocate (this=<optimized out>, __p=<optimized out>) at /home/work/qinzuoyan/Pegasus/toolchain/output/include/c++/4.8.2/ext/new_allocator.h:110
#5 _M_deallocate (this=<optimized out>, __n=<optimized out>, __p=<optimized out>) at /home/work/qinzuoyan/Pegasus/toolchain/output/include/c++/4.8.2/bits/stl_vector.h:174
#6 ~_Vector_base (this=0x7f5665200abc, __in_chrg=<optimized out>) at /home/work/qinzuoyan/Pegasus/toolchain/output/include/c++/4.8.2/bits/stl_vector.h:160
#7 ~vector (this=0x7f5665200abc, __in_chrg=<optimized out>) at /home/work/qinzuoyan/Pegasus/toolchain/output/include/c++/4.8.2/bits/stl_vector.h:416
#8 dsn::aio_task::~aio_task (this=0x7f56652009e4, __in_chrg=<optimized out>) at /home/work/qinzuoyan/Pegasus/pegasus/rdsn/src/core/core/task.cpp:682
#9 0x00007f59df3ebe89 in dsn::aio_task::~aio_task (this=0x7f56652009e4, __in_chrg=<optimized out>) at /home/work/qinzuoyan/Pegasus/pegasus/rdsn/src/core/core/task.cpp:682
#10 0x00007f59df3ed6ea in release_ref (this=0x7f56652009e4) at /home/work/qinzuoyan/Pegasus/pegasus/rdsn/include/dsn/utility/autoref_ptr.h:76
#11 dsn::task::exec_internal (this=this@entry=0x7f56652009e4) at /home/work/qinzuoyan/Pegasus/pegasus/rdsn/src/core/core/task.cpp:242
#12 0x00007f59df47e3fd in dsn::task_worker::loop (this=0x12926f0) at /home/work/qinzuoyan/Pegasus/pegasus/rdsn/src/core/core/task_worker.cpp:323
#13 0x00007f59df47e5c9 in dsn::task_worker::run_internal (this=0x12926f0) at /home/work/qinzuoyan/Pegasus/pegasus/rdsn/src/core/core/task_worker.cpp:302
#14 0x00007f59dd528600 in std::(anonymous namespace)::execute_native_thread_routine (__p=<optimized out>)
at /home/qinzuoyan/git.xiaomi/pegasus/toolchain/objdir/../gcc-4.8.2/libstdc++-v3/src/c++11/thread.cc:84
#15 0x000000376e807851 in start_thread () from /lib64/libpthread.so.0
#16 0x000000376e4e811d in clone () from /lib64/libc.so.6
std::vector<dsn_file_buffer_t> _unmerged_write_buffers;
<dsn::ref_counter> = {
_vptr.ref_counter = 0x7f59df77b1f0 <vtable for dsn::aio_task+16>,
_magic = 3735928559,
_counter = {
<std::__atomic_base<long>> = {
_M_i = 0
}, <No data fields>}
},
_unmerged_write_buffers = {
<std::_Vector_base<dsn_file_buffer_t, std::allocator<dsn_file_buffer_t> >> = {
_M_impl = {
<std::allocator<dsn_file_buffer_t>> = {
<__gnu_cxx::new_allocator<dsn_file_buffer_t>> = {<No data fields>}, <No data fields>},
members of std::_Vector_base<dsn_file_buffer_t, std::allocator<dsn_file_buffer_t> >::_Vector_impl:
_M_start = 0x7f56ff82fe20,
_M_finish = 0x7f56ff82fe50,
_M_end_of_storage = 0x7f56ff82fe60
}
}, <No data fields>},
(gdb) pvector this._unmerged_write_buffers
elem[0]: $2 = {
buffer = 0x0,
size = 0
}
elem[1]: $3 = {
buffer = 0x0,
size = 0
}
elem[2]: $4 = {
buffer = 0x7f56642154b0,
size = 0
}
Vector size = 3
Vector capacity = 4
Element type = std::_Vector_base<dsn_file_buffer_t, std::allocator<dsn_file_buffer_t> >::pointer
Init pegasus succeed
LevelDB: version 4.0
Date: Thu Nov 9 08:13:07 2017
CPU: 2 * Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
CPUCache: 46080 KB
Keys: 16 bytes each
Values: 100 bytes each (100 bytes after compression)
Entries: 100000
Prefix: 0 bytes
Keys per prefix: 0
RawSize: 11.1 MB (estimated)
FileSize: 11.1 MB (estimated)
Writes per second: 0
Compression: NoCompression
Memtablerep: skip_list
Perf Level: 0
WARNING: Optimization is disabled: benchmarks unnecessarily slow
WARNING: Assertions are enabled; benchmarks unnecessarily slow
Thread Count Runtime QPS AvgLat P99Lat
1 10000 11.763 850 1176 2425
2 20000 20.256 987 2018 4953
3 30000 29.534 1015 2938 7817
4 40000 38.809 1030 3765 12794
just like: Benchmark on v1.8.0
Add one item:
(5) Read-only data: 12 clients * 100 threads