indeedeng / lsmtree
A fast key/value store that is efficient for high-volume random access reads and writes.
License: Apache License 2.0
Hi, first of all, compliments on this project! I came across this library by chance and took a look at it; it is very well written and complex.
If there is a durable transaction with 10 elements, I have to call flush once I have committed. In your opinion, is this approach sound and performant? Since each flush works on a large memory block, is the whole block saved even if only a small area of it changed? Is that correct?
I wrote the following test:
20 threads do puts into the Store; each thread puts 100 keys.
When I iterate over the Store using its iterator, I get fewer keys than I put.
package com.indeed.lsmtree.core;

import com.indeed.util.serialization.StringSerializer;
import junit.framework.TestCase;
import org.apache.commons.io.FileUtils;
import org.junit.Test;

import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.stream.IntStream;

public class MultiThreadsOp extends TestCase {

    // Each task writes 100 key/value pairs into the shared store.
    final class PutTask1 implements Runnable {
        private final int taskId;
        private final Store<String, String> map;

        public PutTask1(int id, Store<String, String> map) {
            this.taskId = id;
            this.map = map;
        }

        @Override
        public void run() {
            System.out.println("Task ID : " + this.taskId + " performed by "
                    + Thread.currentThread().getName());
            for (int rowIdx = 0; rowIdx < 100; rowIdx++) {
                try {
                    final String key = "Key-" + rowIdx + taskId;
                    final String val = "Value-" + rowIdx + taskId;
                    this.map.put(key, val);
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }

    @Test
    public void testLSMOpPutGetMultipleThreads() throws IOException {
        try {
            Store<String, String> map = new StoreBuilder<String, String>(
                    new File("/tmp/testLSMOpPutGetMultipleThreads"), new StringSerializer(),
                    new StringSerializer()).setMaxVolatileGenerationSize(8 * 1024).setCodec(null)
                    .setStorageType(StorageType.INLINE).build();
            final int numOfThreads = 20;
            ExecutorService taskExecutor = Executors.newFixedThreadPool(numOfThreads);
            IntStream.range(0, numOfThreads).forEach(i -> taskExecutor.submit(new PutTask1(i, map)));
            taskExecutor.shutdown();
            try {
                // Block until every put task has finished.
                while (!taskExecutor.awaitTermination(60, TimeUnit.SECONDS)) {
                    // Not done yet; wait for another 60 seconds.
                }
            } catch (InterruptedException e) {
                // Preserve the interrupt status instead of swallowing it.
                Thread.currentThread().interrupt();
            }
            // Count the entries visible through the iterator.
            Iterator<?> iterator = map.iterator();
            int count = 0;
            while (iterator.hasNext()) {
                count++;
                iterator.next();
            }
            assertEquals(100 * numOfThreads, count);
        } finally {
            FileUtils.deleteDirectory(new File("/tmp/testLSMOpPutGetMultipleThreads"));
        }
    }
}
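One thing worth checking before suspecting the store: the test builds keys as "Key-" + rowIdx + taskId with no separator, so different (rowIdx, taskId) pairs can produce the same string. For example, rowIdx 1 with taskId 11 and rowIdx 11 with taskId 1 both give "Key-111". The sketch below is plain Java with no lsmtree dependency; the class name KeyCollisionDemo and the separator parameter are made up for illustration. It counts how many distinct keys that scheme yields for 100 rows and 20 tasks.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative helper, not part of the original test or of lsmtree.
public class KeyCollisionDemo {

    // Builds keys the way the test does ("Key-" + rowIdx + taskId), with an
    // optional separator between the two numbers, and returns the number of
    // distinct keys produced.
    public static int countDistinctKeys(int rows, int tasks, String separator) {
        Set<String> keys = new HashSet<>();
        for (int taskId = 0; taskId < tasks; taskId++) {
            for (int rowIdx = 0; rowIdx < rows; rowIdx++) {
                keys.add("Key-" + rowIdx + separator + taskId);
            }
        }
        return keys.size();
    }

    public static void main(String[] args) {
        // Same scheme as the test: 2000 puts, but only 1910 distinct keys.
        System.out.println("no separator:   " + countDistinctKeys(100, 20, ""));
        // With a separator every (rowIdx, taskId) pair is unique: 2000 keys.
        System.out.println("with separator: " + countDistinctKeys(100, 20, "-"));
    }
}
```

If a put on an existing key overwrites the previous value, as is usual for a key/value store, the iterator could return at most 1910 entries for this test, so the assertion on 2000 would fail regardless of threading. This does not rule out a concurrency problem in the store; rerunning the test with a separator such as "Key-" + rowIdx + "-" + taskId would help tell the two apart.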
I may be wrong, but I think this entry:
talks about a system that uses lsmtree as a building block. If so, it would be good to add a link to it from the README (I can do a PR).
lsmtree depends on util-compress, which depends on a native snappy build.
It also depends on util-mmap, which by default contains a native Linux build.
We need to figure out the best way to document these dependencies and explain how to build and run on different (compatible) platforms.
Have some questions, not sure where else to ask. We're looking at this project for use in one of our systems.
Thanks for open-sourcing this project, by the way. I've poked around a bit and the code seems to be clean and well-designed.
Unit tests are failing, which causes the mvn install command to fail for the recordlog library.