Comments (6)
Maybe this happens because the Murmur3 implementation is not thread-safe, i.e. if you call it from different threads you could get a wrong hash.
from cardinalityestimation.
Thanks Sergey for the fix!
Thanks, I will check the fix. However, I see the same behavior when I use the cardinality estimator in a single thread. I think it might be because I add "long" numbers. I tested adding 10000000 random numbers between 10000000 and 10000100, so there are only 100 different numbers. When I compare with a hash set, the hash set actually contains 100 numbers, while the estimator estimates more (which is not expected, since we have a direct count). The strange thing is that it only occurs sometimes, although I initialize my random number generator with a fixed seed and the code is not multithreaded (unless the cardinality estimator adds multithreading).
OK, my test was wrong, since I used test methods for both the multithreaded and the single-threaded scenario. Because the test methods run in parallel and the Murmur3 hash is static, even the single-threaded test method actually tested the multithreaded scenario. :-S
Running a much larger test on the single-threaded scenario confirmed that the issue was thread safety.
Thanks for fixing!
Arthur
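To illustrate the failure mode described above, here is a minimal sketch in Python. It is not the library's C# code: FNV-1a stands in for Murmur3, and `SharedStateHasher`, `stateless_hash`, and `interleaved_hash` are hypothetical names invented for this example. It replays one bad interleaving by hand, so the corruption is deterministic rather than dependent on thread timing, and then shows that keeping all state local to the call gives the same hash from every thread.

```python
import threading

# FNV-1a (64-bit) constants; FNV-1a is a stand-in here, not the library's Murmur3.
FNV_OFFSET = 0xCBF29CE484222325
FNV_PRIME = 0x100000001B3
MASK64 = 0xFFFFFFFFFFFFFFFF


class SharedStateHasher:
    """Hypothetical hasher whose running state lives on a shared object."""

    def __init__(self):
        self.state = FNV_OFFSET

    def reset(self):
        self.state = FNV_OFFSET

    def update(self, byte):
        self.state = ((self.state ^ byte) * FNV_PRIME) & MASK64

    def digest(self):
        return self.state


def stateless_hash(data):
    """Same algorithm, but all state is local to the call, so it is thread-safe."""
    state = FNV_OFFSET
    for byte in data:
        state = ((state ^ byte) * FNV_PRIME) & MASK64
    return state


def interleaved_hash(shared, a, b):
    """Replay, deterministically, one bad interleaving of two threads."""
    shared.reset()          # thread A starts hashing a
    shared.update(a[0])
    shared.reset()          # thread B starts hashing b on the same object
    shared.update(b[0])
    for byte in a[1:]:      # thread A resumes on top of B's state
        shared.update(byte)
    return shared.digest()  # thread A reads a corrupted result


a = (10000000).to_bytes(8, "little")
b = (10000001).to_bytes(8, "little")

corrupted = interleaved_hash(SharedStateHasher(), a, b)
expected = stateless_hash(a)

# The stateless version returns the same value from every thread.
results = []


def worker():
    results.append(stateless_hash(a))


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(corrupted != expected)
print(all(r == expected for r in results))
```

In this sketch, a wrong hash for the same input makes one value look like two distinct values, which would be consistent with an estimator reporting more than 100 elements when only 100 were added.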
Hi Oron,
Are you planning to publish these changes to NuGet?
Thanks,
Sergey
Hi Sergey,
Thanks for the reminder :) Published as 1.2.1.
Thanks,
Oron
Date: Fri, 19 Aug 2016 15:32:49 -0700
Subject: Re: [Microsoft/CardinalityEstimation] Murmur hash gives different hash values for same input value (#12)