leventov / koloboke
Java Collections till the last breadcrumb of memory and performance
Home Page: https://koloboke.com/
While debugging koloboke 0.6.7 (jdk8 flavor), I observed that after running this code:
Map map = HashObjObjMaps.newMutableMap();
map.put("a", "b");
map.clear();
the inner table (from the MutableParallelKVObjLHashSO class) still contains a reference to the "b" object. As far as I can see, the MutableParallelKVObjLHashSO.fillFree method only clears every second item (the keys), but not the values:
private void fillFree() {
    Object[] tab = table;
    for (int i = 0; i < tab.length; i += 2) {
        tab[i] = FREE;
    }
}
As long as the slot that used to be used by the "a" key is not reused, this is a potential memory leak for any application using this map.
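A minimal sketch of a possible fix for the parallel key/value layout described above (even slots hold keys, odd slots hold values): clearing should null out the value slots too, so cleared values become eligible for garbage collection. This is illustrative code, not the actual koloboke implementation; FREE here is an assumed sentinel for an empty key slot.

```java
public final class ParallelTableClearSketch {
    // Assumed sentinel object marking an empty key slot
    public static final Object FREE = new Object();

    public static void fillFree(Object[] tab) {
        for (int i = 0; i < tab.length; i += 2) {
            tab[i] = FREE;      // key slot -> FREE sentinel
            tab[i + 1] = null;  // value slot -> null, dropping the stale reference
        }
    }
}
```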
Hi,
We have a change to the Trove ensureCapacity method which ensures that the growth of the underlying tables (_set, _values) follows the normal growth policy, where the capacity is always doubled. Your (and Trove's) implementation always increases the capacity just enough to fit the new additional (or, in Trove's case, extra desired) size, which causes a lot of rehashing (including array re-allocations) when ensureCapacity is called many times for small increments. If ensureCapacity is going to rehash, it should follow the normal capacity growth policy (Trove: double it; Koloboke: grow by the configured scale).
Reimplemented ensureCapacity for Trove 2.0.x
@Override
public void ensureCapacity(int desiredExtraSize) {
    if (desiredExtraSize > (super._maxSize - size())) {
        final int desiredCapacity = ((int) ((desiredExtraSize + size()) / super._loadFactor)) + 1;
        final int doubleCapacity = this.capacity() << 1;
        this.rehash(PrimeFinder.nextPrime(Math.max(desiredCapacity, doubleCapacity)));
        final int capacity = capacity();
        this.computeMaxSize(capacity);
        this.computeNextAutoCompactionAmount(capacity);
    }
}
BTW: do you have a page listing all changes made to the Trove interface? ensureCapacity is now to be called with the expected total size, which is quite a change from the desired extra size in Trove.
Michal
Don't you think it would be more reasonable to use the Equivalence.hash method during get operations on the map?
The description might be a bit confusing and uninformative, but here is what I've run into.
The following code:
HashLongSet lal = lookALikes.getA();
LOG.info("singleLookALikesNoScore() - topResults size is {}, adding more {} results", topResults.size(), lal.size());
int cnt = 0;
long[] followers = topFollowers.toLongArray();
System.out.println("followers.length = " + followers.length);
for (int i = 0; i < followers.length; i++) {
    long follower = followers[i];
    lal.add(follower);
    if (cnt++ % 1000000 == 0) {
        System.out.println("Added, last value is " + follower);
    }
}
System.out.println("lal.size() = " + lal.size());
Resulting logs are interesting:
singleLookALikesNoScore() - topResults size is 1739131, adding more 3260870 results
followers.length = 1739131
Added, last value is 2157902531
And after that, the process hangs for about 5 minutes. That is, it outputs one line (which means the first value was added) and nothing else; after about 5-8 minutes it outputs the remaining parts.
Strangely, before that I was easily able to add 5 or 8 million entries to other HashLongSets, yet this one stalls on 3 million.
If I switch from HashLongSet to a regular set, everything works fine.
Ideas?
Over the last week I was looking for a primitive-type collection library and found
Trove, HPPC, FastUtil, Apache Commons Primitives (which has only ArrayLists), and Primitive Collections for Java (PCJ).
In my specific scenario I was looking at HashMaps and their "contains" and "get" performance. In the end, HPPC and FastUtil were the fastest, performing on the same level.
But FastUtil takes less memory for some collection sizes, has a faster construction process, and is more compatible with Java maps. On the other hand, it is over 10 MB in size, while HPPC, PCJ and Trove are around 2 MB.
All the libraries store the "states" (whether a cell in a hash map is filled or not) in a boolean array, even though a single bit would be enough; probably for performance reasons. I think OpenHFT, with its extensive use of Unsafe, could reduce the memory footprint even further and still have the same performance, if not better, because all collections do bounds checks before accessing an array, and we all know Java does the same check again while accessing the array.
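The bit-per-state idea mentioned above can be sketched as follows (illustrative only, not OpenHFT or koloboke code): a boolean[] spends a whole byte per slot, while a long[] used as a bitset spends one bit, an 8x reduction in state storage.

```java
public final class BitStates {
    private final long[] words;

    public BitStates(int slots) {
        words = new long[(slots + 63) >>> 6]; // 64 state bits per long
    }

    public void setFilled(int slot) {
        words[slot >>> 6] |= 1L << (slot & 63); // word index, then bit within word
    }

    public boolean isFilled(int slot) {
        return (words[slot >>> 6] & (1L << (slot & 63))) != 0;
    }
}
```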
I'm interested in this project and offer my help with testing and benchmarking the library on my Windows machines, if needed.
While building the project once, I caught an irreproducible bug: a single failed test among 300K. It occurred with a map with byte keys, so its probability should be on the order of 1/256-1/128, and it is probably related to the replacement of special free/removed values during insertion.
JUnit message:
testIterator_unknownOrderRemoveSupported[Tests of mutable ByteFloatMaps. Key samples: [0, -128, 127, 1, -2]. Value samples: [0.0, NaN, 3.4028235E38, 1.0, -2.0]. Factory: HashByteFloatMapFactory[keyConfig=ByteHashConfig[hashConfig=HashConfig[loadFactor[loadFactor=0.5,shrinkCondition=null,defaultExpectedSize=10],lowerKeyDomainBound=-128,upperKeyDomainBound=127],defaultValue=0.0]. Special features: [RESTRICTS_KEYS, RESTRICTS_VALUES]. [collection size: several] values [collection size: several]]
junit.framework.AssertionFailedError: failed with stimuli [hasNext, next, next, remove]
at com.google.common.collect.testing.Helpers.fail(Helpers.java:164)
at com.google.common.collect.testing.AbstractIteratorTester.compareResultsForThisListOfStimuli(AbstractIteratorTester.java:418)
at com.google.common.collect.testing.AbstractIteratorTester.recurse(AbstractIteratorTester.java:383)
at com.google.common.collect.testing.AbstractIteratorTester.recurse(AbstractIteratorTester.java:388)
at com.google.common.collect.testing.AbstractIteratorTester.recurse(AbstractIteratorTester.java:388)
at com.google.common.collect.testing.AbstractIteratorTester.recurse(AbstractIteratorTester.java:388)
at com.google.common.collect.testing.AbstractIteratorTester.recurse(AbstractIteratorTester.java:388)
at com.google.common.collect.testing.AbstractIteratorTester.recurse(AbstractIteratorTester.java:388)
at com.google.common.collect.testing.AbstractIteratorTester.test(AbstractIteratorTester.java:372)
at com.google.common.collect.testing.testers.CollectionIteratorTester.runIteratorTest(CollectionIteratorTester.java:96)
at com.google.common.collect.testing.testers.CollectionIteratorTester.testIterator_unknownOrderRemoveSupported(CollectionIteratorTester.java:84)
at sun.reflect.GeneratedMethodAccessor53.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at junit.framework.TestCase.runTest(TestCase.java:176)
at junit.framework.TestCase.runBare(TestCase.java:141)
at junit.framework.TestResult$1.protect(TestResult.java:122)
at junit.framework.TestResult.runProtected(TestResult.java:142)
at junit.framework.TestResult.run(TestResult.java:125)
at junit.framework.TestCase.run(TestCase.java:129)
at junit.framework.TestSuite.runTest(TestSuite.java:255)
at junit.framework.TestSuite.run(TestSuite.java:250)
at junit.framework.TestSuite.runTest(TestSuite.java:255)
at junit.framework.TestSuite.run(TestSuite.java:250)
at junit.framework.TestSuite.runTest(TestSuite.java:255)
at junit.framework.TestSuite.run(TestSuite.java:250)
at junit.framework.TestSuite.runTest(TestSuite.java:255)
at junit.framework.TestSuite.run(TestSuite.java:250)
at junit.framework.TestSuite.runTest(TestSuite.java:255)
at junit.framework.TestSuite.run(TestSuite.java:250)
at junit.framework.TestSuite.runTest(TestSuite.java:255)
at junit.framework.TestSuite.run(TestSuite.java:250)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:86)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:49)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:69)
at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:50)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at org.gradle.messaging.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
at org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:103)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at org.gradle.messaging.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:355)
at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:64)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:744)
Caused by: junit.framework.AssertionFailedError: did not contain expected element 0.0, expected = [NaN, 0.0], actual = [NaN]
at junit.framework.Assert.fail(Assert.java:57)
at com.google.common.collect.testing.Helpers.assertEqualIgnoringOrder(Helpers.java:91)
at com.google.common.collect.testing.AbstractContainerTester.expectContents(AbstractContainerTester.java:110)
at com.google.common.collect.testing.testers.CollectionIteratorTester.access$200(CollectionIteratorTester.java:48)
at com.google.common.collect.testing.testers.CollectionIteratorTester$1.verify(CollectionIteratorTester.java:104)
at com.google.common.collect.testing.AbstractIteratorTester.compareResultsForThisListOfStimuli(AbstractIteratorTester.java:416)
... 56 more
Currently rehash() first replaces the table arrays and then re-fills them with elements, so the collection stays in an inconsistent state for a relatively long time.
Rehash should first build the new table and then quickly swap the references.
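The proposed build-then-swap ordering can be sketched like this (a hypothetical simplification, not koloboke's actual internals): the new table is fully populated in a local array, and the field is replaced in a single reference write at the end, so the visible table is never half-filled. FREE marks empty slots; the hashing and probing here are placeholder choices.

```java
import java.util.Arrays;

public final class SwapRehashSketch {
    public static final Object FREE = new Object();
    public Object[] table;

    public SwapRehashSketch(int capacity) {
        table = new Object[capacity];
        Arrays.fill(table, FREE);
    }

    public void rehash(int newCapacity) {
        Object[] newTable = new Object[newCapacity];
        Arrays.fill(newTable, FREE);
        for (Object key : table) {
            if (key == FREE) continue;
            int i = (key.hashCode() & 0x7fffffff) % newCapacity;
            while (newTable[i] != FREE) i = (i + 1) % newCapacity; // linear probing
            newTable[i] = key;
        }
        table = newTable; // single reference swap; the old table stays consistent until here
    }
}
```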
This code:
public static void fillKeys(char[] table, byte key) {
    /* if !(long elem) */
    long base = CHAR_BASE + BYTE_KEY_OFFSET;
    for (long off = CHAR_SCALE * (long) table.length; (off -= CHAR_SCALE) >= 0L;) {
        U.putByte(table, base + off, key);
    }
    /* elif long elem */
    for (int i = 0; i < table.length; i += 2) {
        table[i] = key;
    }
    /* endif */
}
will perform worse than Arrays.fill, which is (sort of) an intrinsic on OpenJDK. The intrinsic version replaces 'fill loops' of the pattern 'for (int i = 0; i < length; i++) a[i] = v;' with an optimized version using wider write instructions. This is similar to System.arraycopy in spirit. The optimization applies to all types smaller than long.
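A hedged sketch of the alternative: instead of the Unsafe byte-stride loop, the byte key can be duplicated into both halves of a char and the char[] filled flat with Arrays.fill, which is eligible for the OpenJDK fill intrinsic. This is illustrative only; whether byte order matters depends on how the real table reads the bytes back.

```java
import java.util.Arrays;

public final class FillKeysSketch {
    public static void fillKeys(char[] table, byte key) {
        int b = key & 0xFF;
        char twoKeys = (char) ((b << 8) | b); // key byte in both high and low byte
        Arrays.fill(table, twoKeys);          // flat fill, intrinsifiable on HotSpot
    }
}
```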
Currently hash table implementations return the ordinary implementation, which scans the whole table without finding anything, which is rather inefficient in this case.
Depends on #12.
While the findbugs jsr dependency is needed to compile this project, it is not really a 'compile'-scope dependency, because it is not required at runtime. However, compile scope causes it to become a transitive runtime dependency for users of this library, who then need to actively exclude it.
The dependency in this project should be changed to 'provided' so that it is available when this project compiles but is not transitively included by projects using the library.
@NotNull and @Nullable from org.jetbrains.annotations are already used in some signatures, but not comprehensively.
Exception in thread "pool-1-thread-1" java.util.NoSuchElementException
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:340)
at java.util.ServiceLoader$1.next(ServiceLoader.java:428)
at net.openhft.koloboke.collect.map.hash.HashIntDoubleMaps.getDefaultFactory(HashIntDoubleMaps.java:56)
at net.openhft.koloboke.collect.map.hash.HashIntDoubleMaps.newMutableMap(HashIntDoubleMaps.java:81)
at Test.run(Test.java:18)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:695)
Hi,
I have downloaded the 0.6.6 Java 8 API files and wrote this one-line Java main class, which I took from the documentation example:
package test2;
import java.util.Map;
import net.openhft.koloboke.collect.map.hash.HashIntIntMaps;
public class Try
{
    public static void main(String[] args)
    {
        Map<Integer, Integer> map = HashIntIntMaps.newMutableMap();
    }
}
I get this runtime error:
Exception in thread "main" java.lang.ExceptionInInitializerError
at net.openhft.koloboke.collect.map.hash.HashIntIntMaps.getDefaultFactory(HashIntIntMaps.java:52)
at net.openhft.koloboke.collect.map.hash.HashIntIntMaps.newMutableMap(HashIntIntMaps.java:76)
at test2.Try.main(Try.java:11)
Caused by: java.util.NoSuchElementException
at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:365)
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
at net.openhft.koloboke.collect.map.hash.HashIntIntMaps$DefaultFactoryHolder.<clinit>(HashIntIntMaps.java:37)
... 3 more
Hi,
After cloning the repo and following the instructions in the readme, my build fails during the first gradle step (./gradlew :buildMeta).
The stacktrace is:
org.gradle.api.GradleScriptException: A problem occurred evaluating project ':jpsg:core'.
at org.gradle.groovy.scripts.internal.DefaultScriptRunnerFactory$ScriptRunnerImpl.run(DefaultScriptRunnerFactory.java:54)
at org.gradle.configuration.DefaultScriptPluginFactory$ScriptPluginImpl.apply(DefaultScriptPluginFactory.java:187)
at org.gradle.configuration.project.BuildScriptProcessor.execute(BuildScriptProcessor.java:39)
at org.gradle.configuration.project.BuildScriptProcessor.execute(BuildScriptProcessor.java:26)
at org.gradle.configuration.project.ConfigureActionsProjectEvaluator.evaluate(ConfigureActionsProjectEvaluator.java:34)
at org.gradle.configuration.project.LifecycleProjectEvaluator.evaluate(LifecycleProjectEvaluator.java:55)
at org.gradle.api.internal.project.AbstractProject.evaluate(AbstractProject.java:470)
at org.gradle.api.internal.project.AbstractProject.evaluationDependsOn(AbstractProject.java:535)
at org.gradle.api.internal.project.AbstractProject.evaluationDependsOn(AbstractProject.java:527)
at org.gradle.api.internal.BeanDynamicObject$MetaClassAdapter.invokeMethod(BeanDynamicObject.java:225)
at org.gradle.api.internal.BeanDynamicObject.invokeMethod(BeanDynamicObject.java:129)
at org.gradle.api.internal.CompositeDynamicObject.invokeMethod(CompositeDynamicObject.java:147)
at org.gradle.groovy.scripts.BasicScript.methodMissing(BasicScript.java:79)
at build_6ankpql6665bjrdrnft7v9epi5$_run_closure5.doCall(/Users/chadni/Projects/Koloboke/build.gradle:72)
at build_6ankpql6665bjrdrnft7v9epi5.run(/Users/chadni/Projects/Koloboke/build.gradle:72)
at org.gradle.groovy.scripts.internal.DefaultScriptRunnerFactory$ScriptRunnerImpl.run(DefaultScriptRunnerFactory.java:52)
at org.gradle.configuration.DefaultScriptPluginFactory$ScriptPluginImpl.apply(DefaultScriptPluginFactory.java:187)
at org.gradle.configuration.project.BuildScriptProcessor.execute(BuildScriptProcessor.java:39)
at org.gradle.configuration.project.BuildScriptProcessor.execute(BuildScriptProcessor.java:26)
at org.gradle.configuration.project.ConfigureActionsProjectEvaluator.evaluate(ConfigureActionsProjectEvaluator.java:34)
at org.gradle.configuration.project.LifecycleProjectEvaluator.evaluate(LifecycleProjectEvaluator.java:55)
at org.gradle.api.internal.project.AbstractProject.evaluate(AbstractProject.java:470)
at org.gradle.api.internal.project.AbstractProject.evaluate(AbstractProject.java:79)
at org.gradle.configuration.DefaultBuildConfigurer.configure(DefaultBuildConfigurer.java:28)
at org.gradle.initialization.DefaultGradleLauncher.doBuildStages(DefaultGradleLauncher.java:128)
at org.gradle.initialization.DefaultGradleLauncher.doBuild(DefaultGradleLauncher.java:105)
at org.gradle.initialization.DefaultGradleLauncher.run(DefaultGradleLauncher.java:85)
at org.gradle.launcher.exec.InProcessBuildActionExecuter$DefaultBuildController.run(InProcessBuildActionExecuter.java:81)
at org.gradle.launcher.cli.ExecuteBuildAction.run(ExecuteBuildAction.java:33)
at org.gradle.launcher.cli.ExecuteBuildAction.run(ExecuteBuildAction.java:24)
at org.gradle.launcher.exec.InProcessBuildActionExecuter.execute(InProcessBuildActionExecuter.java:39)
at org.gradle.launcher.daemon.server.exec.ExecuteBuild.doBuild(ExecuteBuild.java:45)
at org.gradle.launcher.daemon.server.exec.BuildCommandOnly.execute(BuildCommandOnly.java:34)
at org.gradle.launcher.daemon.server.exec.DaemonCommandExecution.proceed(DaemonCommandExecution.java:125)
at org.gradle.launcher.daemon.server.exec.WatchForDisconnection.execute(WatchForDisconnection.java:35)
at org.gradle.launcher.daemon.server.exec.DaemonCommandExecution.proceed(DaemonCommandExecution.java:125)
at org.gradle.launcher.daemon.server.exec.ResetDeprecationLogger.execute(ResetDeprecationLogger.java:24)
at org.gradle.launcher.daemon.server.exec.DaemonCommandExecution.proceed(DaemonCommandExecution.java:125)
at org.gradle.launcher.daemon.server.exec.StartStopIfBuildAndStop.execute(StartStopIfBuildAndStop.java:33)
at org.gradle.launcher.daemon.server.exec.DaemonCommandExecution.proceed(DaemonCommandExecution.java:125)
at org.gradle.launcher.daemon.server.exec.ReturnResult.execute(ReturnResult.java:34)
at org.gradle.launcher.daemon.server.exec.DaemonCommandExecution.proceed(DaemonCommandExecution.java:125)
at org.gradle.launcher.daemon.server.exec.ForwardClientInput$2.call(ForwardClientInput.java:71)
at org.gradle.launcher.daemon.server.exec.ForwardClientInput$2.call(ForwardClientInput.java:69)
at org.gradle.util.Swapper.swap(Swapper.java:38)
at org.gradle.launcher.daemon.server.exec.ForwardClientInput.execute(ForwardClientInput.java:69)
at org.gradle.launcher.daemon.server.exec.DaemonCommandExecution.proceed(DaemonCommandExecution.java:125)
at org.gradle.launcher.daemon.server.exec.LogToClient.doBuild(LogToClient.java:60)
at org.gradle.launcher.daemon.server.exec.BuildCommandOnly.execute(BuildCommandOnly.java:34)
at org.gradle.launcher.daemon.server.exec.DaemonCommandExecution.proceed(DaemonCommandExecution.java:125)
at org.gradle.launcher.daemon.server.exec.EstablishBuildEnvironment.doBuild(EstablishBuildEnvironment.java:60)
at org.gradle.launcher.daemon.server.exec.BuildCommandOnly.execute(BuildCommandOnly.java:34)
at org.gradle.launcher.daemon.server.exec.DaemonCommandExecution.proceed(DaemonCommandExecution.java:125)
at org.gradle.launcher.daemon.server.exec.StartBuildOrRespondWithBusy$1.run(StartBuildOrRespondWithBusy.java:45)
at org.gradle.launcher.daemon.server.DaemonStateCoordinator.runCommand(DaemonStateCoordinator.java:184)
at org.gradle.launcher.daemon.server.exec.StartBuildOrRespondWithBusy.doBuild(StartBuildOrRespondWithBusy.java:49)
at org.gradle.launcher.daemon.server.exec.BuildCommandOnly.execute(BuildCommandOnly.java:34)
at org.gradle.launcher.daemon.server.exec.DaemonCommandExecution.proceed(DaemonCommandExecution.java:125)
at org.gradle.launcher.daemon.server.exec.HandleStop.execute(HandleStop.java:30)
at org.gradle.launcher.daemon.server.exec.DaemonCommandExecution.proceed(DaemonCommandExecution.java:125)
at org.gradle.launcher.daemon.server.exec.DaemonHygieneAction.execute(DaemonHygieneAction.java:39)
at org.gradle.launcher.daemon.server.exec.DaemonCommandExecution.proceed(DaemonCommandExecution.java:125)
at org.gradle.launcher.daemon.server.exec.CatchAndForwardDaemonFailure.execute(CatchAndForwardDaemonFailure.java:32)
at org.gradle.launcher.daemon.server.exec.DaemonCommandExecution.proceed(DaemonCommandExecution.java:125)
at org.gradle.launcher.daemon.server.exec.DefaultDaemonCommandExecuter.executeCommand(DefaultDaemonCommandExecuter.java:51)
at org.gradle.launcher.daemon.server.DefaultIncomingConnectionHandler$ConnectionWorker.handleCommand(DefaultIncomingConnectionHandler.java:155)
at org.gradle.launcher.daemon.server.DefaultIncomingConnectionHandler$ConnectionWorker.receiveAndHandleCommand(DefaultIncomingConnectionHandler.java:128)
at org.gradle.launcher.daemon.server.DefaultIncomingConnectionHandler$ConnectionWorker.run(DefaultIncomingConnectionHandler.java:116)
at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:64)
Caused by: java.lang.StackOverflowError
at org.gradle.api.internal.CompositeDynamicObject.invokeMethod(CompositeDynamicObject.java:156)
at org.gradle.groovy.scripts.BasicScript.methodMissing(BasicScript.java:79)
at org.gradle.api.internal.CompositeDynamicObject.invokeMethod(CompositeDynamicObject.java:156)
at org.gradle.groovy.scripts.BasicScript.methodMissing(BasicScript.java:79)
(and the last frame repeats until the stack overflow occurs)
Any help gratefully appreciated.
Thanks,
Nick
There is an annoying peculiarity of java.lang.Float and java.lang.Double: their equals method is inconsistent with primitive == comparison: 1) -0.0 == 0.0, but ((Double) -0.0).equals(0.0) == false; 2) all NaN values are equal as objects, but as primitives they are not equal to anything, including themselves.
To be consistent with the JCF, we should convert all floats to bits using Float.floatToIntBits and Double.doubleToLongBits on insertion into the collections, and thus keep int[] and long[] arrays of elements internally instead of float[] and double[].
This requires some work in our code generator.
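A minimal demonstration of the inconsistency described above, and of how comparing raw bit patterns (as the JCF's boxed equals effectively does) restores a usable equivalence for hash collections:

```java
public final class FloatBitsDemo {
    // Primitive == : -0.0 equals 0.0, NaN equals nothing
    public static boolean primitiveEq(double a, double b) {
        return a == b;
    }

    // Bit comparison: distinguishes -0.0 from 0.0, collapses all NaNs together
    public static boolean bitsEq(double a, double b) {
        return Double.doubleToLongBits(a) == Double.doubleToLongBits(b);
    }
}
```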
Hi there,
I'm currently using Trove for a few projects, so naturally I wanted to benchmark hftc in comparison. However, the buildMain build always stops at > Building 8% > :lib:impl:compileJava.
My setup is Java 1.8.0_11 (Oracle) on Lubuntu 13.10 (64-bit), Gradle 2.0, Groovy 2.3.3, Ant 1.9.3.
PS. It would be great if you could release binaries.
In the implementation for Java 8 we can use forEach instead of plain old iteration.
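For illustration, here is what the forEach style looks like; plain java.util.HashMap is used only to keep the sketch self-contained, and since koloboke's maps implement java.util.Map, the Java 8 flavor exposes the same forEach entry point.

```java
import java.util.Map;

public final class ForEachDemo {
    public static long sumValues(Map<String, Integer> map) {
        long[] sum = {0};
        // No explicit Iterator or Entry handling in user code
        map.forEach((key, value) -> sum[0] += value);
        return sum[0];
    }
}
```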
Hi,
we are using the Trove library with some extra modifications, one of which uses the null reference for the FREE value. The initial array creation and rehashing then do not require filling the array with FREE values (so the extra iteration over the complete array is no longer required).
Additionally, it can be cheaper for the garbage collector to see a null reference than many references to a singleton object representing FREE; null should come for free.
Obviously there is some cost in handling key slots where a new NULL object represents a null key; this could be reduced by a noNull case (like you do with noRemoved).
protected static final Object REMOVED = new Object(), FREE = null, NULL = new Object();
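A hypothetical, self-contained sketch of the scheme (not Trove or koloboke code): a freshly allocated Object[] is already all-null, so no fill pass is needed, and a NULL sentinel stands in for a user-supplied null key.

```java
public final class NullFreeSketch {
    public static final Object REMOVED = new Object();
    public static final Object NULL = new Object(); // represents a null key
    public final Object[] keys;

    public NullFreeSketch(int capacity) {
        keys = new Object[capacity]; // no fill pass: null already means FREE
    }

    public void add(Object key) {
        Object k = (key == null) ? NULL : key;
        int i = (k.hashCode() & 0x7fffffff) % keys.length;
        while (keys[i] != null && keys[i] != REMOVED && !keys[i].equals(k)) {
            i = (i + 1) % keys.length; // linear probing past occupied slots
        }
        keys[i] = k;
    }

    public boolean contains(Object key) {
        Object k = (key == null) ? NULL : key;
        int i = (k.hashCode() & 0x7fffffff) % keys.length;
        for (int probes = 0; probes < keys.length; probes++) {
            Object cur = keys[i];
            if (cur == null) return false; // FREE slot: key absent
            if (cur.equals(k)) return true;
            i = (i + 1) % keys.length;     // skip REMOVED and mismatches
        }
        return false;
    }
}
```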
Michal
Can you add this to MC?
Hello Roman,
I'm currently trying to brew my own benchmarks using Mikhail Vorontsov's as inspiration (http://java-performance.info), with my own put() version modified for different inputs. Well, it turns out Koloboke is badly misbehaving, to the point of being pathological, compared to Fastutil for instance.
I've reproduced the case using a random distribution over [-6000000; 6000000] with 6000000 as the target preallocated size and a 0.75 load factor, using the same allocation logic as Mikhail did:
testedMap = HashObjIntMaps.getDefaultFactory()
        .withHashConfig(HashConfig.fromLoads(KolobokeObjectIntIssue.LOAD_FACTOR / 2,
                KolobokeObjectIntIssue.LOAD_FACTOR, KolobokeObjectIntIssue.LOAD_FACTOR))
        .newMutableMap(KolobokeObjectIntIssue.TARGET_SIZE);
Here is the code:
https://gist.github.com/vsonnier/35b68b109528884e2bdd
The int ==> int version does not seem to suffer from the same problem under the same input.
Do you intend to use fastutil, hppc or mahout collections?
Although I've been writing the API to be compatible with JDK 8 from the beginning, I haven't even tried to assemble the project with Java 8. Certainly, there are a number of bugs.
Key change:
Java 6 & 7 API:
    n.h.f.Consumer {}
    n.h.c.ObjCollection { void forEach(n.h.f.Consumer); }
Java 8 API:
    n.h.f.Consumer extends j.u.f.Consumer {}
    n.h.c.ObjCollection { void forEach(j.u.f.Consumer); }
The collections artifactId under the net.openhft groupId is taken by HugeCollections.
Current package: net.openhft.collect
Project name proposal: HFTC (HFT Collections)
I noticed the following problem in the rehash implementation, at least in MutableObjQHashSetSO.
void rehash(int newCapacity) {
    int mc = modCount();
    Object[] keys = set;
    initForRehash(newCapacity);
    mc++; // modCount is incremented in initForRehash()
    Object[] newKeys = set;
    int capacity = newKeys.length;
    if (noRemoved()) {
        ...
The method initForRehash calls internalInit, which calls initSlotCounts, which sets removedSlots to zero. Therefore the noRemoved() test always succeeds, even when the hash set contains REMOVED entries. This bug appears in koloboke-impl-jdk8 1.0.0, which I downloaded from Maven Central.
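The fix implied by this report would capture the "had removed slots" flag before initForRehash() resets the counters, rather than querying noRemoved() afterwards. Here is a hypothetical, self-contained sketch; the fields and methods are simplified stand-ins for the real MutableObjQHashSetSO internals.

```java
public final class RehashOrderSketch {
    public int removedSlots;
    public boolean sawRemovedDuringRehash; // records which rehash path ran

    public boolean noRemoved() {
        return removedSlots == 0;
    }

    public void initForRehash(int newCapacity) {
        removedSlots = 0; // the reset that makes a later noRemoved() check wrong
    }

    public void rehash(int newCapacity) {
        boolean hadNoRemoved = noRemoved(); // must be read BEFORE the reset
        initForRehash(newCapacity);
        sawRemovedDuringRehash = !hadNoRemoved; // picks the correct re-insertion path
    }
}
```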
Hi,
I'm currently working through Project Euler programming tasks and I've recently started using Koloboke. While writing a solution for problem 60, I got a strange ArrayIndexOutOfBoundsException when calling toLongArray() on a HashLongSet.
Sorry for the hodgepodge of code below, but I wanted you to be able to copy/paste/run it without any edits.
The line below throws the exception:
long[] tertArr = secondarySet.toLongArray();
No comments on the code please. I wasn't done yet :P
import java.math.BigInteger;
import java.util.Arrays;
import java.util.BitSet;
import net.openhft.koloboke.collect.map.hash.HashLongObjMap;
import net.openhft.koloboke.collect.map.hash.HashLongObjMaps;
import net.openhft.koloboke.collect.set.hash.HashLongSet;
import net.openhft.koloboke.collect.set.hash.HashLongSets;
public class Euler60 {
// The primes 3, 7, 109, and 673, are quite remarkable. By taking any two
// primes and concatenating them in any order the result will always be prime.
// For example, taking 7 and 109, both 7109 and 1097 are prime. The sum of
// these four primes, 792, represents the lowest sum for a set of four primes
// with this property.
//
// Find the lowest sum for a set of five primes for which any two primes
// concatenate to produce another prime.
// Analysis:
// 2 cannot be a part of the resulting set, since 2 is the only prime ending
// in 2. Same for 5.
// Use sets to intersect valid pairs until a set of five remains
private static final long[] primes = generatePrimesWithESieve(100000);
private static HashLongSet primeSet = HashLongSets.newUpdatableSet(primes);
private static long maxSum = 100000;
private static final int MAX_INDEX = Arrays.binarySearch(primes, BigInteger.valueOf(25000).nextProbablePrime().longValue());
private static final long MAX_PRIME = primes[primes.length - 1];
private static HashLongObjMap<HashLongSet> primePairMap = HashLongObjMaps.getDefaultFactory().<HashLongSet> newUpdatableMap(MAX_INDEX);
public static void main(String[] args) {
long time = System.currentTimeMillis();
for (int i = 1; i < MAX_INDEX; i++) {
HashLongSet primarySet = getPairSet(primes[i], i);
long[] secArr = primarySet.toLongArray();
Arrays.sort(secArr);
for (int j = 0; j < secArr.length; j++) {
if (primes[i] + secArr[j] * 4 > maxSum) break;
HashLongSet secondarySet = HashLongSets.getDefaultFactory().newMutableSet(primarySet);
secondarySet.retainAll(getPairSet(secArr[j]));
if (secondarySet.size() < 3) continue;
long[] tertArr = secondarySet.toLongArray();
Arrays.sort(tertArr);
for (int k = 0; k < tertArr.length; k++) {
if (primes[i] + secArr[j] + tertArr[k] * 3 > maxSum) break;
HashLongSet tertiarySet = HashLongSets.getDefaultFactory().newMutableSet(secondarySet);
if (tertiarySet.size() < 2) continue;
tertiarySet.retainAll(getPairSet(tertArr[k]));
long[] quatArr = tertiarySet.toLongArray();
Arrays.sort(quatArr);
for (int l = 0; l < quatArr.length; l++) {
if (primes[i] + secArr[j] + tertArr[k] + quatArr[l] * 2 > maxSum) break;
HashLongSet quaternarySet = HashLongSets.getDefaultFactory().newMutableSet(tertiarySet);
if (quaternarySet.size() < 1) continue;
quaternarySet.retainAll(getPairSet(quatArr[l]));
long[] quinArr = quaternarySet.toLongArray();
Arrays.sort(quinArr);
for (int m = 0; m < quinArr.length; m++) {
if (primes[i] + secArr[j] + tertArr[k] + quatArr[l] + quinArr[m] > maxSum) break;
System.out.format("{ %1$d, %2$d, %3$d, %4$d, %5$d } Sum = %6$d", primes[i], secArr[j], tertArr[k], quatArr[l], quinArr[m], primes[i]
+ secArr[j] + tertArr[k] + quatArr[l] + quinArr[m]);
System.out.println();
long sum = primes[i] + secArr[j] + tertArr[k] + quatArr[l] + quinArr[m];
if (sum < maxSum) {
maxSum = sum;
}
}
}
}
}
}
System.out.println("Result = " + maxSum);
System.out.println("Done.");
System.out.println("Solution took " + (System.currentTimeMillis() - time) + " ms.");
}
private static HashLongSet getPairSet(long prime) {
return getPairSet(prime, Arrays.binarySearch(primes, prime));
}
private static HashLongSet getPairSet(long prime, int primeIndex) {
HashLongSet ret = primePairMap.get(prime);
if (ret == null) {
primePairMap.put(prime, (ret = getPairs(prime, primeIndex)));
}
return ret;
}
private static HashLongSet getPairs(long prime, int primeIndex) {
HashLongSet pairSet = HashLongSets.newUpdatableSet();
long primeDigitSum = digitSum(prime);
for (int j = primeIndex + 1; j < MAX_INDEX; j++) {
if ((primeDigitSum + digitSum(primes[j])) % 3 == 0) continue;
if (isPrime(concat(prime, primes[j])) && isPrime(concat(primes[j], prime))) pairSet.add(primes[j]);
}
return pairSet;
}
private static boolean isPrime(long num) {
if (num > MAX_PRIME) { return BigInteger.valueOf(num).isProbablePrime(20); }
return primeSet.contains(num);
}
public static long[] generatePrimesWithESieve(long upperLimit) {
// No even numbers in sieve BitSet
// BitSet starts from 0..sieveBound
// p = 2i+3 => i = (p-3)/2
// jStart = 3p
// jStep = 2p
// i = 0 => p = 3 => jStart = 9, jStep = 6
// i = 0, j = 3, 6, 9
// i = 1 => p = 5 => jStart = 15, jStep = 10
// i = 1, j = 6, 11, 16
// i = 2 => p = 7 => jStart = 21, jStep = 14
// i = 2, j = 9, 16, 23
int intUpperLimit = Math.toIntExact(upperLimit);
int sieveBound = (intUpperLimit - 1) / 2;
int upperSqrt = ((int) Math.sqrt(upperLimit) - 1) / 2;
BitSet set = new BitSet(sieveBound + 1);
set.set(0, sieveBound + 1);
for (int i = 0; i <= upperSqrt; i++) {
if (set.get(i)) {
for (int j = 3 * i + 3; j <= sieveBound; j += 2 * i + 3) {
set.clear(j);
}
}
}
int i = 0;
// Number of primes can be approximated with pi(x) = x/log(x - 1)
LongArray list = new LongArray((int) (upperLimit / (Math.log(upperLimit) - 1.08366)));
list.setSizeIncreaseFactor(1.2d);
list.add(2); // sieve starts at 3
do {
list.add(2 * i + 3);
} while ((i = set.nextSetBit(i + 1)) != -1);
return list.getArray();
}
public static long concat(long a, long b) {
for (long c = b; c > 0;) {
a *= 10;
c /= 10;
}
return a + b;
}
public static long digitSum(long num) {
long sum = 0;
while (num > 0) {
sum += num - ((num / 10) * 10);
num /= 10;
}
return sum;
}
public static class LongArray {
private static final int DEFAULT_SIZE = 16;
private double sizeIncreaseFactor = 1.5d;
private long[] arr;
private int size = 0;
public LongArray() {
arr = new long[DEFAULT_SIZE];
}
public LongArray(int initialSize) {
arr = new long[initialSize];
}
public void add(long value) {
if (size == arr.length) {
increaseSize();
}
arr[size++] = value;
}
public long[] getArray() {
trim();
return arr;
}
public long get(int i) {
return arr[i];
}
public void trim() {
if (size < arr.length) {
long[] newArr = new long[size];
System.arraycopy(arr, 0, newArr, 0, size);
arr = newArr;
}
}
private void increaseSize() {
// Grow by the configured factor, but always by at least one slot, so that
// small arrays (where length * factor truncates back to length) still grow.
int newLength = Math.max(arr.length + 1, (int) (arr.length * sizeIncreaseFactor));
long[] newArr = new long[newLength];
System.arraycopy(arr, 0, newArr, 0, arr.length);
arr = newArr;
}
public double getSizeIncreaseFactor() {
return sizeIncreaseFactor;
}
public void setSizeIncreaseFactor(double sizeIncreaseFactor) {
this.sizeIncreaseFactor = sizeIncreaseFactor;
}
}
}
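As a quick sanity check of the two arithmetic helpers above, a standalone version (same logic, stdlib only) behaves like decimal string concatenation and a digit sum:

```java
public class ConcatDigitSumCheck {
    // Shift a left by the number of decimal digits in b, then add b.
    // Assumes b > 0 (always true for primes, as in the code above).
    static long concat(long a, long b) {
        for (long c = b; c > 0; c /= 10) {
            a *= 10;
        }
        return a + b;
    }

    // Sum of the decimal digits of num.
    static long digitSum(long num) {
        long sum = 0;
        while (num > 0) {
            sum += num % 10;
            num /= 10;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(concat(12, 345)); // 12345
        System.out.println(concat(7, 11));   // 711
        System.out.println(digitSum(987));   // 24
    }
}
```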
Naming: element()/first()/getFirst().
I've been benchmarking Koloboke against several alternatives and have been very pleased with its performance. I'd like to standardize all our code on it, but I see that right now there are no primitive array lists (e.g. IntArrayList from Mahout/Trove).
Are there any plans to include such functionality?
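For anyone needing a stopgap until such a class exists, a minimal growable int list is easy to sketch (hypothetical helper, not a Koloboke API; the LongArray class in the listing above does the same for long):

```java
import java.util.Arrays;

// Minimal growable primitive int list; a stopgap sketch, not a Koloboke class.
public class SimpleIntList {
    private int[] arr = new int[16];
    private int size = 0;

    public void add(int value) {
        if (size == arr.length) {
            arr = Arrays.copyOf(arr, arr.length + (arr.length >> 1)); // grow by 1.5x
        }
        arr[size++] = value;
    }

    public int get(int i) {
        if (i >= size) throw new IndexOutOfBoundsException(String.valueOf(i));
        return arr[i];
    }

    public int size() {
        return size;
    }

    public int[] toArray() {
        return Arrays.copyOf(arr, size); // trimmed copy
    }

    public static void main(String[] args) {
        SimpleIntList list = new SimpleIntList();
        for (int i = 0; i < 100; i++) list.add(i * i);
        System.out.println(list.size());  // 100
        System.out.println(list.get(10)); // 100
    }
}
```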
I'm using koloboke version 0.6.7 on JDK 8. I created a map with this code:
map = HashObjObjMaps.newMutableMap();
While debugging the clear() method with IntelliJ IDEA (version 14.1.5, but I don't think it matters), the IDE shows the debugging cursor on lines which actually contain comments or no code at all. Details: when I debug the MutableLHashParallelKVObjObjMapGO.clear() method:
@Override
public void clear() {
int mc = modCount() + 1;
super.clear();
if (mc != modCount())
throw new ConcurrentModificationException();
}
the debugging works fine; however, when I step into the modCount() method (inherited from MutableLHash) or into MutableLHash.clear(), things look as shown in the attached screenshot.
Many projects could benefit from drop-in replacement Set/Map implementations (Object -> Object). Please provide a small distribution of koloboke (in Maven Central) that contains only this core, instead of the 20 MB version with all the primitive specializations.
The code generator appears to be perfectly reusable. Currently it is located in the buildSrc/ module, where it acts as a Gradle jpsg (Java Primitive Specializations Generator) plugin.
Both koloboke-impl and koloboke-impl-common are automatically fetched as part of the project build. Both have a package-info.class file in com/koloboke/collect/impl. This causes ProGuard builds (and presumably some other forms of library repackaging) to fail.
guava-testlib doesn't cover this. The easiest way is probably to extend the Guava library, or to wait until this is implemented by the Guava developers (but they probably won't).
The same trick as in rehash is applicable here: we can assume each inserted key is not already present in the set.
If keys are objects, equivalences of source and target sets must be equal.
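A sketch of the trick being referenced: when copying into a fresh table, every key is known to be absent, so insertion can skip the equality checks a normal put needs and only probe for a free slot. This is a simplified linear-probe table over long keys with 0 reserved as the free-slot marker, not Koloboke's actual rehash code:

```java
// Simplified open-addressing insert that assumes the key is NOT already
// present: it never compares against stored keys, only probes for the first
// free slot. Safe during rehash, where every key is copied exactly once.
public class NoDupCheckInsert {
    public static void insertPresumedAbsent(long[] table, long key) {
        int mask = table.length - 1;               // capacity is a power of two
        int i = (int) (key ^ (key >>> 32)) & mask; // toy hash; illustrative only
        while (table[i] != 0) {                    // 0 == free (keys must be non-zero)
            i = (i + 1) & mask;                    // linear probing
        }
        table[i] = key;
    }

    public static void main(String[] args) {
        long[] table = new long[8];
        for (long k : new long[] {3, 11, 19}) {    // all three collide mod 8
            insertPresumedAbsent(table, k);
        }
        int occupied = 0;
        for (long v : table) if (v != 0) occupied++;
        System.out.println(occupied);              // 3
    }
}
```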
I was iterating through a HashMap with a cursor and called the setValue(V) method to replace the current value. The new value was set in the vals[] array but did not replace curValue.
MutableLHashSeparateKVIntObjMapGO.java
@Override
public void setValue(V value) {
if (curKey != free) {
if (expectedModCount == modCount()) {
vals[index] = value;
if (vals != values) {
values[index] = value;
}
} else {
throw new java.util.ConcurrentModificationException();
}
} else {
throw new java.lang.IllegalStateException();
}
}
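A minimal model of the expected behavior (toy classes, not Koloboke's generated code): when a cursor replaces the current value, it should update both the backing array and its own cached curValue, so that a subsequent value() call reflects the change:

```java
// Toy cursor over parallel key/value arrays, illustrating that setValue
// must refresh the cached current value as well as the backing array.
public class CursorSetValueModel {
    static class IntObjCursor<V> {
        final int[] keys;
        final V[] vals;
        int index = -1;
        V curValue;

        IntObjCursor(int[] keys, V[] vals) { this.keys = keys; this.vals = vals; }

        boolean moveNext() {
            if (index + 1 >= keys.length) return false;
            index++;
            curValue = vals[index];
            return true;
        }

        V value() { return curValue; }

        void setValue(V value) {
            vals[index] = value;
            curValue = value; // the step missing in the reported bug
        }
    }

    public static void main(String[] args) {
        IntObjCursor<Integer> cur =
                new IntObjCursor<>(new int[] {1, 2}, new Integer[] {10, 20});
        cur.moveNext();
        cur.setValue(99);
        System.out.println(cur.value()); // 99, not the stale 10
    }
}
```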
Currently there are two mutability profiles: Mutable and Immutable. For example, see HashObjSets.
But probably the most frequent use-case for collections (especially Maps) is a collection that is updated, but whose elements are never removed. It could be implemented almost as efficiently as an immutable collection.
The NoRemovals infix looks really ugly. Another option: AddOnlySet/PutOnlyMap. The disadvantage of this version is that such collections could still support the clear() operation; these names hide that fact.
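To make the naming concern concrete: an "add-only" set can forbid per-element removal yet still legitimately support clear(), which a name like AddOnlySet hides. A sketch over java.util, not a Koloboke API:

```java
import java.util.HashSet;
import java.util.Set;

// Wrapper that rejects element removal but still allows clear(),
// illustrating why "AddOnlySet" would be a slightly misleading name.
public class AddOnlySetDemo {
    static final class AddOnlySet<E> {
        private final Set<E> inner = new HashSet<>();

        boolean add(E e) { return inner.add(e); }
        boolean contains(E e) { return inner.contains(e); }
        int size() { return inner.size(); }

        void remove(E e) {
            throw new UnsupportedOperationException("removals are not supported");
        }

        void clear() { inner.clear(); } // still allowed -- the name hides this
    }

    public static void main(String[] args) {
        AddOnlySet<String> s = new AddOnlySet<>();
        s.add("a");
        s.add("b");
        System.out.println(s.size()); // 2
        s.clear();
        System.out.println(s.size()); // 0
    }
}
```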
I have encountered an issue where, in an incorrectly synchronized multi-threaded environment, the IntSet.toIntArray() method call throws an ArrayIndexOutOfBoundsException. This is not a major issue, and it is acceptable as "undefined behavior" when accessing a non-thread-safe collection without proper guards, but throwing a ConcurrentModificationException would be more helpful for finding these problems.
Test code
@Test
public void toIntArray_should_not_throw_ArrayIndexOutOfBoundsException() throws Exception {
final IntSet ids = HashIntSets.newUpdatableSet(1048576);
final Random r = new Random();
Thread thread = new Thread(new Runnable() {
@Override
public void run() {
for (int j = 0; j < 2000000; j++) {
ids.add(r.nextInt(1000000));
}
}
});
thread.start();
for (int j = 0; j < 2000000; j++) {
ids.add(r.nextInt(1000000));
if (ids.size() % 1000 == 0) {
int[] ints = ids.toIntArray();
System.out.println(ints.length);
}
}
}
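For comparison, JDK collections already fail fast in the single-threaded version of this mistake; modCount checking in toIntArray() would extend the same diagnostic to the concurrent case. A stdlib-only demonstration:

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// JDK iterators fail fast: a structural modification during iteration
// triggers ConcurrentModificationException on the next iterator call.
public class FailFastDemo {
    static boolean triggersCme() {
        Set<Integer> set = new HashSet<>();
        for (int i = 0; i < 10; i++) set.add(i);
        try {
            Iterator<Integer> it = set.iterator();
            while (it.hasNext()) {
                int v = it.next();
                if (v == 3) set.add(100); // structural modification mid-iteration
            }
        } catch (java.util.ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(triggersCme()); // true
    }
}
```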
Currently a mutable container is created first, then its state is moved to an immutable one.
For example: https://github.com/OpenHFT/UntitledCollectionsProject/blob/da3bea76e014fe00aed5f4dcadcbfca14499e4eb/impl/src/main/javaTemplates/net/openhft/collect/impl/hash/HashCharSetFactoryGO.java#L220
This approach causes considerable GC overhead if someone creates many small containers.
net.openhft.collect.CharSets {
CharSet empty(); // immutable
CharSet singleton(char e); // immutable
CharCollection asImmutable(Collection<Character> coll); // "coll" name added for illustration
}
net.openhft.collect.map.CharShortMaps {
CharShortMap emptyMap(); // immutable
CharShortMap asImmutableMap(char[] keys, char[] values);
CharShortMap asImmutable(Map<Character, Short> map);
}
Or something like this
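The proposed factory methods mirror helpers that java.util.Collections already provides for boxed types; the request is essentially for primitive-specialized equivalents. The JDK analogues:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Boxed-type analogues of the proposed factories: cheap immutable instances
// and views that avoid building a mutable container first.
public class ImmutableFactoryAnalogy {
    public static void main(String[] args) {
        Set<Character> empty = Collections.emptySet();      // ~ CharSets.empty()
        Set<Character> single = Collections.singleton('x'); // ~ CharSets.singleton('x')
        Set<Character> view = Collections.unmodifiableSet(
                new HashSet<>(Arrays.asList('a', 'b')));    // ~ asImmutable(...)

        System.out.println(empty.size());  // 0
        System.out.println(single.size()); // 1
        System.out.println(view.size());   // 2
    }
}
```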
Is there a possibility to expand the number of collection interfaces implemented, so these classes can be used wherever any other collection interface is used?
Hello leventov,
First of all, thanks for the very cool library.
Is it possible to build the library with a subset of primitives, e.g. key = {int, long}, value = {int, long, double}, to reduce the jar size?
I'm not sure if I'm missing something, but I would like access to either an Iterator or a Cursor to loop over the elements of a collection without it internally creating a new Iterator or Cursor instance (such as NoRemovedIterator or NoRemovedCursor) each time.
One way I imagine doing this is creating the Iterator or Cursor at initialization and then resetting it every time I need to loop over the elements.
I don't want to use the forEach lambdas, as I need other variables during iteration that would also potentially cause a new lambda instance to be created.
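A sketch of the reset-and-reuse pattern the request describes, over a plain int[] rather than Koloboke's internal cursors (hypothetical helper, not an existing API):

```java
// Reusable cursor: allocated once, reset() rewinds it for another pass,
// so hot loops create no per-iteration garbage.
public class ReusableCursorDemo {
    static final class IntCursor {
        private final int[] elements;
        private int index = -1;

        IntCursor(int[] elements) { this.elements = elements; }

        boolean moveNext() { return ++index < elements.length; }
        int elem() { return elements[index]; }
        void reset() { index = -1; } // rewind instead of allocating a new cursor
    }

    public static void main(String[] args) {
        IntCursor cur = new IntCursor(new int[] {1, 2, 3});
        long total = 0;
        for (int pass = 0; pass < 2; pass++) {
            cur.reset();
            while (cur.moveNext()) total += cur.elem();
        }
        System.out.println(total); // 12: both passes reuse the same cursor
    }
}
```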
The generated code contains calls to java.util.concurrent.ThreadLocalRandom, which is not present in Java 6. See http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadLocalRandom.html
(Env. Mac OSX 10.9.4, JRE Java SE6 version 1.6.0_33)
public class Test {
public static void main(final String[] args) {
new LHashParallelKVLongDoubleMapFactoryImpl();
}
}
Exception in thread "main" java.lang.NoClassDefFoundError: java/util/concurrent/ThreadLocalRandom
at com.busintel.ponti.impl.MapBasedValueCollectionImplTest.main(MapBasedValueCollectionImplTest.java:62)
Caused by: java.lang.ClassNotFoundException: java.util.concurrent.ThreadLocalRandom
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
... 1 more
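A common compatibility shim for this kind of problem (a hypothetical workaround, not what Koloboke does): fall back to a per-thread java.util.Random, which gives the same "one generator per thread, no contention" property as ThreadLocalRandom and works on Java 6:

```java
import java.util.Random;

// Java 6-compatible fallback for ThreadLocalRandom: one Random per thread.
public class PerThreadRandom {
    private static final ThreadLocal<Random> RANDOM = new ThreadLocal<Random>() {
        @Override
        protected Random initialValue() {
            // Mix the thread id into the seed so threads diverge.
            return new Random(System.nanoTime() ^ Thread.currentThread().getId());
        }
    };

    public static int nextInt(int bound) {
        return RANDOM.get().nextInt(bound);
    }

    public static void main(String[] args) {
        int v = nextInt(100);
        System.out.println(v >= 0 && v < 100); // true
    }
}
```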