utsaslab / recipe
RECIPE: high-performance, concurrent indexes for persistent memory (SOSP 2019)
License: Apache License 2.0
Exposed by crashing after freeing the hash table in clht_gc_free.
Lines 239 to 242 in fc508dd
pmemobj_free sets the PMEMoid object to NULL when freeing objects. For hashtable->table_off, however, the offset is never set to null, and so a crash can cause a double-free to occur.
gdb --args ./example 20 20
> break clht_gc.c:241
> run
> quit
# Then, re-run
./example 20 0
Will output something like:
Simple Example of P-CLHT
operation,n,ops/s
Throughput: load, inf ,ops/us
Throughput: run, inf ,ops/us
<libpmemobj>: <1> [palloc.c:295 palloc_heap_action_exec] assertion failure: 0
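The failure mode in this report can be modeled without PMDK. The following is a minimal sketch under stated assumptions: `ToyHeap`, `ToyRoot`, `release_table`, and `rerun_gc` are hypothetical names for illustration, not RECIPE or libpmemobj APIs. It contrasts freeing the table while the stored offset still names it with clearing the offset first.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Toy stand-in for a persistent heap (hypothetical, not libpmemobj).
struct ToyHeap {
    std::set<uint64_t> live;   // offsets that are currently allocated
    int double_frees = 0;      // how often an already-freed offset was freed

    void alloc(uint64_t off) { live.insert(off); }
    void free(uint64_t off) {
        if (live.erase(off) == 0) ++double_frees;
    }
};

// Toy stand-in for the persistent structure that stores the table offset.
struct ToyRoot {
    uint64_t table_off = 0;    // 0 plays the role of "null"
};

// Releases the table. With crash == true the function returns after the
// first step, which models a crash at that point.
void release_table(ToyHeap& heap, ToyRoot& root, bool clear_first, bool crash) {
    assert(root.table_off != 0);
    uint64_t victim = root.table_off;
    if (clear_first) {
        root.table_off = 0;    // real code would also flush and fence this store
        if (crash) return;     // leaks the table, but cannot double-free it
        heap.free(victim);
    } else {
        heap.free(victim);
        if (crash) return;     // leaves a dangling offset behind
        root.table_off = 0;
    }
}

// Models the next execution, which frees whatever the offset still names.
void rerun_gc(ToyHeap& heap, ToyRoot& root) {
    if (root.table_off != 0) {
        heap.free(root.table_off);
        root.table_off = 0;
    }
}
```

In this toy model, clearing first trades the double-free for a possible leak after a crash. The report notes that pmemobj_free avoids the problem for PMEMoid objects by setting them to NULL as part of the free.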
It looks like there is a segmentation fault caused by './build/ycsb art a randint uniform 4'.
I followed the build and config procedure described in README.md, up to 'Persistent Memory environment'.
Machine config:
CPU: AMD Ryzen Threadripper 2990WX 32-Core Processor
DRAM: 8*16G DDR4
DRAM emulated persistent memory: 50G, mounted with ext4-dax file system
OS: Ubuntu 18.04.3 LTS, with linux-5.1.0+ kernel
root@RECIPE# cat ./scripts/set_vmmalloc.sh
export VMMALLOC_POOL_SIZE=$((16*1024*1024*1024))
export VMMALLOC_POOL_DIR="/mnt/pmem"
root@RECIPE# source ./scripts/set_vmmalloc.sh
root@RECIPE# LD_PRELOAD="../pmdk/src/nondebug/libvmmalloc.so.1" ./build/ycsb art a randint uniform 4
art, workloada, randint, uniform, threads 4
Loaded 0 keys
Segmentation fault (core dumped)
root@RECIPE# gdb ./build/ycsb core
warning: Error reading shared library list entry at 0x7f2ecdc39b00
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./build/ycsb art a randint uniform 4'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00005629a6dc8967 in ART_ROWEX::N4::change(unsigned char, ART_ROWEX::N*) ()
[Current thread is 1 (Thread 0x7f32d0800800 (LWP 108852))]
(gdb) bt
#0 0x00005629a6dc8967 in ART_ROWEX::N4::change(unsigned char, ART_ROWEX::N*) ()
#1 0x00005629a6dcd717 in ART_ROWEX::Tree::insert(Key const*, ART::ThreadInfo&) ()
#2 0x00005629a6d73cc0 in tbb::interface9::internal::start_for<tbb::blocked_range<unsigned long>, ycsb_load_run_randint(int, int, int, int, int, std::vector<unsigned long, std::allocator<unsigned long> >&, std::vector<unsigned long, std::allocator<unsigned long> >&, std::vector<int, std::allocator<int> >&, std::vector<int, std::allocator<int> >&)::{lambda(tbb::blocked_range<unsigned long> const&)#1}, tbb::auto_partitioner const>::execute() ()
#3 0x00007f32cff9bb46 in ?? () from /usr/lib/x86_64-linux-gnu/libtbb.so.2
#4 0x00007f32cff98790 in ?? () from /usr/lib/x86_64-linux-gnu/libtbb.so.2
#5 0x00005629a6d82db6 in ycsb_load_run_randint(int, int, int, int, int, std::vector<unsigned long, std::allocator<unsigned long> >&, std::vector<unsigned long, std::allocator<unsigned long> >&, std::vector<int, std::allocator<int> >&, std::vector<int, std::allocator<int> >&) ()
#6 0x00005629a6d700c6 in main ()
I've read your code a bit and tried to modify example.cpp to use string keys.
I use the function void masstree::put(char *key, uint64_t value) to insert string keys into P-Masstree.
However, it does not work correctly if a key overlaps with a previously inserted key.
For example, if we first insert key1: abcdefghijklmnopqrstuvwxyz and then key2: abcdefghijklmnopqrstuvwxy, a later lookup of key1 using void *masstree::get(char *key) returns an empty value.
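One way to reason about this report is to look at how the two keys would be cut into 8-byte slices. This is a minimal sketch that assumes P-Masstree slices keys 8 bytes at a time, as the original Masstree design does; `split_slices` and `shared_full_slices` are hypothetical helpers, not part of the P-Masstree API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Cuts a key into consecutive slices of at most 8 bytes.
std::vector<std::string> split_slices(const std::string& key) {
    std::vector<std::string> slices;
    for (std::size_t pos = 0; pos < key.size(); pos += 8) {
        slices.push_back(key.substr(pos, 8));
    }
    return slices;
}

// Counts the leading slices that are identical in both keys.
std::size_t shared_full_slices(const std::string& a, const std::string& b) {
    std::vector<std::string> sa = split_slices(a);
    std::vector<std::string> sb = split_slices(b);
    std::size_t n = 0;
    while (n < sa.size() && n < sb.size() && sa[n] == sb[n]) ++n;
    return n;
}
```

For the two keys in the report, the first three slices match and the keys differ only in the last slice ("yz" versus "y"), so the handling of that final, partially filled slice is a plausible place to start looking.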
Excuse me, I have read your paper, and I have some questions about your test:
Current implementations only ensure the lowest level of isolation (Read Uncommitted) for some read operations such as scan, negative lookup, and verification for value existence, since they are based on normal CASs or temporal stores coupled with cache line flush instructions. However, it is not the fundamental limitation of RECIPE conversions. You can easily extend them, following RECIPE conversions, to guarantee the higher level of isolation (Read Committed) by replacing each final commit stores (such as pointer swap) coupled with cache line flushes with non-temporal stores coupled with memory fence for lock-based implementations including P-CLHT, P-HOT, P-ART, and P-Masstree. For lock-free implementations such as P-Bwtree, you can either add additional flushes only after loads to final commit stores or replace volatile CASs coupled with cache line flush instructions with alternative software-based atomic-persistent primitives such as either Link-and-Persist (paper, code) or PSwCAS (paper, code).
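The "non-temporal store coupled with memory fence" suggested for lock-based indexes could look roughly like the following. This is a minimal sketch, not the RECIPE implementation: `commit_store` is a hypothetical helper, the slot stands in for a pointer-sized field, and the non-x86 branch is only a placeholder so the example compiles elsewhere.

```cpp
#include <atomic>
#include <cassert>
#if defined(__x86_64__) || defined(_M_X64)
#include <emmintrin.h>  // _mm_stream_si64; pulls in _mm_sfence as well
#endif

// Publishes `value` into `*slot` as the final commit store.
void commit_store(long long* slot, long long value) {
#if defined(__x86_64__) || defined(_M_X64)
    _mm_stream_si64(slot, value);  // non-temporal store, bypasses the cache
    _mm_sfence();                  // orders the store before later stores
#else
    // Placeholder for other targets: a plain store plus a full fence.
    // This does NOT bypass the cache and is here only for portability.
    *slot = value;
    std::atomic_thread_fence(std::memory_order_seq_cst);
#endif
}
```

A caller would prepare and flush the new node first, then call `commit_store(&parent_slot, new_node_bits)` as the last step, so the published value does not sit in a cache line waiting for a separate flush.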
Excuse me, I have read your paper and code, and they are very interesting. But I have a question about CLFLUSH_OPT/CLWB.
Below is your implementation:
inline void clflush(char *data, int len, bool front, bool back)
{
    volatile char *ptr = (char *)((unsigned long)data & ~(cache_line_size - 1));
    if (front)
        mfence();
    for (; ptr < data+len; ptr += cache_line_size){
        unsigned long etsc = read_tsc() +
            (unsigned long)(write_latency_in_ns * cpu_freq_mhz/1000);
#ifdef CLFLUSH
        asm volatile("clflush %0" : "+m" (*(volatile char *)ptr));
#elif CLFLUSH_OPT
        asm volatile(".byte 0x66; clflush %0" : "+m" (*(volatile char *)(ptr)));
#elif CLWB
        asm volatile(".byte 0x66; xsaveopt %0" : "+m" (*(volatile char *)(ptr)));
#endif
        while (read_tsc() < etsc) cpu_pause();
    }
    if (back)
        mfence();
}
In fact, CLFLUSH_OPT and CLWB can be reordered. For some indexes, should mfence() be added between cache lines instead of only at the end? For example, the FAST & FAIR paper clearly mentions the need to add mfence() at cache line boundaries. The original implementation of FAST & FAIR used only CLFLUSH, which does not cause problems because CLFLUSH is not reordered.
So will using CLFLUSH_OPT/CLWB cause a bug?
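The two fencing strategies in this question can be compared side by side. This is a minimal sketch, not the RECIPE code: `flush_range` is a hypothetical helper, the 64-byte line size is an assumption, and the flush and fence primitives are passed in by the caller so the loop itself stays portable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::uintptr_t kCacheLine = 64;  // assumed cache line size

// Flushes every cache line overlapping [data, data + len).
// `flush(addr)` and `fence()` are supplied by the caller, so the same loop
// can wrap clflush, clflushopt, or clwb.
template <class Flush, class Fence>
void flush_range(const char* data, std::size_t len, bool fence_each_line,
                 Flush flush, Fence fence) {
    std::uintptr_t p = reinterpret_cast<std::uintptr_t>(data) & ~(kCacheLine - 1);
    std::uintptr_t end = reinterpret_cast<std::uintptr_t>(data) + len;
    for (; p < end; p += kCacheLine) {
        flush(reinterpret_cast<const char*>(p));
        if (fence_each_line) fence();  // this line is ordered before the next
    }
    if (!fence_each_line) fence();     // one fence: lines may complete in any order
}
```

On x86-64, a caller could pass `_mm_clflushopt` (which needs the matching compiler flag) and `_mm_sfence`. As I read Intel's instruction reference, CLFLUSH executions are ordered with respect to each other, while CLFLUSHOPT and CLWB on different lines are ordered only by fences, so `fence_each_line` matters exactly when an index relies on line-by-line persist order.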
Exposed by crashing after acquiring a lock from clht_put.
RECIPE/P-CLHT/include/clht_lb_res.h
Lines 306 to 312 in fc508dd
gdb --args ./example 20 1
> break clht_lb_res.h:311
> run
> next
> p *lock
# should print "$1 = 1 '\001'"
> quit
# Then, re-run
./example 20 1
The second execution should run indefinitely, waiting on acquiring the lock.
I see your comments here about locking assumptions:
RECIPE/P-CLHT/include/clht_lb_res.h
Lines 162 to 164 in fc508dd
Does this mean this is a known issue, or does clht_lock_initialization just need to be added to clht_create? I ask because it seems that clht_lock_initialization is called in other places, just not in the recovery procedure.
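The idea of re-initializing locks during recovery can be sketched on a toy table. This is a minimal sketch under stated assumptions: `ToyBucket` and `reset_stale_locks` are hypothetical, and nothing here claims to match what clht_lock_initialization actually does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy bucket with a one-byte lock, loosely modeled on the lock value
// printed in the gdb session (1 = held).
struct ToyBucket {
    uint8_t lock = 0;   // 0 = free, 1 = held
    uint64_t key = 0;
};

// Hypothetical recovery step: no thread survives a crash, so any lock
// found held in persistent memory is stale and can be released.
// Returns the number of locks that were reset.
std::size_t reset_stale_locks(std::vector<ToyBucket>& table) {
    std::size_t cleared = 0;
    for (ToyBucket& bucket : table) {
        if (bucket.lock != 0) {
            bucket.lock = 0;   // real code would also persist this store
            ++cleared;
        }
    }
    return cleared;
}
```

Running such a pass once when an existing table is reopened, before any worker thread starts, would keep a second execution from waiting forever on a lock whose owner no longer exists.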
Currently, I only see the scan method on masstree. An iterator on masstree may need methods such as seek(key), next, seekfirst, and seeklast. Is there any plan for this?
In linear_search_range of /third-party/FAST_FAIR/btree.h, count() should be current->count().
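Why the receiver of count() matters when a scan walks across sibling nodes can be shown on a toy structure. This is a minimal sketch, not the FAST_FAIR page layout: `ToyNode` and its `range` method are hypothetical, and the `use_current` switch exists only to put both behaviors next to each other.

```cpp
#include <cassert>
#include <vector>

// Toy leaf node with a sibling pointer (hypothetical layout).
struct ToyNode {
    std::vector<int> keys;        // sorted keys stored in this node
    ToyNode* sibling = nullptr;   // next node to the right

    int count() const { return static_cast<int>(keys.size()); }

    // Collects keys in (min, max), starting here and walking right.
    // use_current == false takes the loop bound from `this`, mirroring the
    // reported bug; true takes it from the node that is being scanned.
    std::vector<int> range(int min, int max, bool use_current) const {
        std::vector<int> out;
        for (const ToyNode* current = this; current != nullptr;
             current = current->sibling) {
            int n = use_current ? current->count() : count();
            if (n > current->count()) n = current->count();  // keep the toy in bounds
            for (int i = 0; i < n; ++i) {
                int k = current->keys[i];
                if (k > min && k < max) out.push_back(k);
            }
        }
        return out;
    }
};
```

With a first node holding one key and a sibling holding three, the `this`-based bound scans only one entry of the sibling and silently drops the rest, while the `current`-based bound returns all four keys.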
I first tried running YCSB CLHT on DRAM, which worked. Then, I ran it with libvmmalloc as LD_PRELOAD, and observed a segmentation fault occasionally happening (in a nondeterministic way), both with 16 and 32 threads. I ran YCSB workloads with 2 input configurations, recordcount=operationcount=64000000, and recordcount=operationcount=1000000, both seeing the segfault:
RECIPE# LD_PRELOAD="../pmdk/src/nondebug/libvmmalloc.so.1" ./build/ycsb clht a randint uniform 32
Loaded 1000001 keys
Segmentation fault (core dumped)
Machine config:
CPU: AMD Ryzen Threadripper 2990WX 32-Core Processor
DRAM: 8*16G DDR4
DRAM emulated persistent memory: 64G, mounted with ext4-dax file system
OS: Ubuntu 18.04.3 LTS, with linux-5.1.0+ kernel
scripts/set_vmmalloc.sh:
export VMMALLOC_POOL_SIZE=$((60*1024*1024*1024))
export VMMALLOC_POOL_DIR="/mnt/pmem/test"
GDB backtrace on the core file
warning: Error reading shared library list entry at 0x7f7154c39b00
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./build/ycsb clht a randint uniform 32'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x000055a0f68ab202 in ssmem_mem_reclaim ()
[Current thread is 1 (Thread 0x7f71545ff700 (LWP 20200))]
(gdb) bt
#0 0x000055a0f68ab202 in ssmem_mem_reclaim ()
#1 0x000055a0f68aa72e in clht_gc_release ()
#2 0x000055a0f68a9ba0 in ht_resize_pes ()
#3 0x000055a0f68a94dc in ht_status ()
#4 0x000055a0f68a98de in clht_put ()
#5 0x000055a0f6840dc8 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<ycsb_load_run_randint(int, int, int, int, int, std::vector<unsigned long, std::allocator<unsigned long> >&, std::vector<unsigned long, std::allocator<unsigned long> >&, std::vector<int, std::allocator<int> >&, std::vector<int, std::allocator<int> >&)::{lambda()#9}> > >::_M_run() ()
#6 0x00007f7d5662166f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7 0x00007f7d568f46db in start_thread (arg=0x7f71545ff700) at pthread_create.c:463
#8 0x00007f7d55cde88f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Hi, when I invoke a get for non-existing keys in P-Masstree, "should not enter here....." is printed.
code:
masstree::masstree *tree = new masstree::masstree();
auto t = tree->getThreadInfo();
char *str = "helloworld";
tree->get(str, t);
printed information:
should not enter here
fkey = rowolleh, key = 7522537965574647666, searched key = 0, key index = -1
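The numbers in that output appear consistent with the first 8 bytes of the key ("hellowor") being packed into a big-endian 64-bit integer; printing that integer's bytes in little-endian memory order would give "rowolleh". The following is a minimal sketch of such a conversion; `first_slice` is a hypothetical helper, not a P-Masstree function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Packs up to the first 8 bytes of `key` into a big-endian 64-bit integer,
// padding keys shorter than 8 bytes with zero bytes.
uint64_t first_slice(const char* key) {
    std::size_t len = std::strlen(key);
    uint64_t slice = 0;
    for (std::size_t i = 0; i < 8; ++i) {
        uint64_t byte = (i < len) ? static_cast<unsigned char>(key[i]) : 0;
        slice = (slice << 8) | byte;
    }
    return slice;
}
```

For "helloworld" this yields 7522537965574647666, the same value shown as `key` in the printed line, which suggests the lookup reached the comparison with the expected slice but found no matching entry.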
The value of hashtable->ht_oldest is not persisted after the free, meaning that a post-crash execution can read the previous value and perform a double-free.
Lines 183 to 196 in 05a49d7
Hello, we are trying to reproduce your results. Even though we have successfully plugged your work into the Intel Optane of our system by following exactly the procedure that you indicate, we do not observe significant scalability as we increase the number of threads, in contrast to DRAM execution, where the scalability is clear. Is this a known issue? Do these indexes scale on both DRAM and Intel Optane? Is there any known reason why scalability fails on Optane?
Thank you :)
This issue involves porting the RECIPE data structures to work on libpmem. For example, converting P-CLHT to a form that uses the libpmem pointers and allocation routines.