rmc00 / maybe
A class library implementing probabilistic data structures in .NET
License: MIT License
Environment:
Error Message (translated from the original Chinese-locale output; in short, the package is not compatible with net462):
Restoring packages for D:\GitProjects\BigsetExisting\BigsetExisting\BigsetExisting.csproj...
Package Maybe.NET 1.0.77 is not compatible with net462 (.NETFramework,Version=v4.6.2). Package Maybe.NET 1.0.77 supports: release (Release,Version=v0.0)
Package Maybe.NET 1.0.77 is not compatible with net462 (.NETFramework,Version=v4.6.2) / win. Package Maybe.NET 1.0.77 supports: release (Release,Version=v0.0)
Package Maybe.NET 1.0.77 is not compatible with net462 (.NETFramework,Version=v4.6.2) / win-x64. Package Maybe.NET 1.0.77 supports: release (Release,Version=v0.0)
Package Maybe.NET 1.0.77 is not compatible with net462 (.NETFramework,Version=v4.6.2) / win-x86. Package Maybe.NET 1.0.77 supports: release (Release,Version=v0.0)
Package restore failed. Rolling back package changes for 'BigsetExisting'.
Time Elapsed: 00:00:00.2589401
Reproduce steps:
I saw that most methods already have XML documentation written for them.
Unfortunately, that documentation is not distributed with the NuGet package. Including it would make it much easier for developers to use this library.
Hi - thanks for creating a library for Bloom filters (with a great API!)
I have run some experiments using your scalable Bloom filters, but they do not seem to perform very well :(
I created a fork containing code for benchmarking the use cases I want to use your library for, as well as experiments with optimizing some of your code.
One performance bottleneck can be found in your choice of method for converting objects to bytes, in order to hash the bytes. I have created a project for benchmarking various methods for the conversion, showing that the `BinaryFormatter` used in your library performs horribly compared to the alternatives.
The benchmarks in my fork show the performance improvements achievable by replacing `BinaryFormatter` with alternatives (I have included the benchmarking results at the bottom of this issue; notice the memory usage column on the far right).
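As a rough illustration of what such a replacement could look like (a minimal sketch only, not the code from the fork or from Maybe.NET), strings and fixed-size primitives can be turned into bytes directly, without serializing an object graph. The `ItemBytes` class and its method names are hypothetical.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not part of Maybe.NET or the fork.
public static class ItemBytes
{
    // Strings: encode directly as UTF-8 instead of serializing an object graph.
    public static byte[] FromString(string item) => Encoding.UTF8.GetBytes(item);

    // Fixed-size primitives: BitConverter returns the raw bytes
    // in the platform's byte order (4 bytes for int, 8 for long).
    public static byte[] FromInt32(int item) => BitConverter.GetBytes(item);

    public static byte[] FromInt64(long item) => BitConverter.GetBytes(item);
}
```

The resulting byte array can then be passed to whatever hash function the filter uses. The trade-off is that each item type needs its own conversion, whereas a general-purpose serializer handles any serializable type at a much higher cost per item.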
Despite the new optimizations, the memory usage of your Bloom filters (especially the scalable version, which I really want to use) is very high compared to an alternative like a `HashSet`.
Is this just the nature of the implementation, or can it be improved?
(The table below is part of the output of the benchmarking code. In the table, your original implementation of a scalable Bloom filter is called `ScalableBloomFilter`.)
BenchmarkDotNet=v0.11.5, OS=macOS Mojave 10.14.5 (18F203) [Darwin 18.6.0]
Intel Core i7-8850H CPU 2.60GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=2.2.300
[Host] : .NET Core 2.2.5 (CoreCLR 4.6.27617.05, CoreFX 4.6.27618.01), 64bit RyuJIT
Core : .NET Core 2.2.5 (CoreCLR 4.6.27617.05, CoreFX 4.6.27618.01), 64bit RyuJIT
Job=Core Runtime=Core
Method | ItemsToInsert | MaximumErrorRate | Mean | Error | StdDev | Ratio | RatioSD | Rank | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|---|---|---|
HashSet | 1000 | 0.02 | 146.3 μs | 0.1939 μs | 0.1620 μs | 1.00 | 0.00 | 1 | 15.1367 | - | - | 70.31 KB |
StringOptimizedScalableBloomFilter | 1000 | 0.02 | 909.7 μs | 5.6858 μs | 5.0404 μs | 6.21 | 0.03 | 2 | 234.3750 | - | - | 1083.49 KB |
GenericOptimizedScalableBloomFilter | 1000 | 0.02 | 5,701.6 μs | 42.8845 μs | 40.1142 μs | 39.00 | 0.27 | 3 | 2414.0625 | - | - | 11144.79 KB |
ScalableBloomFilter | 1000 | 0.02 | 6,271.6 μs | 56.0022 μs | 52.3845 μs | 42.89 | 0.41 | 4 | 2390.6250 | - | - | 11029.23 KB |
HashSet | 10000 | 0.02 | 1,492.3 μs | 3.4257 μs | 3.0368 μs | 1.00 | 0.00 | 1 | 152.3438 | - | - | 703.13 KB |
StringOptimizedScalableBloomFilter | 10000 | 0.02 | 17,605.3 μs | 45.3247 μs | 37.8482 μs | 11.80 | 0.04 | 2 | 4062.5000 | - | - | 18814.31 KB |
GenericOptimizedScalableBloomFilter | 10000 | 0.02 | 106,426.8 μs | 1,297.8431 μs | 1,214.0032 μs | 71.35 | 0.88 | 3 | 43600.0000 | - | - | 201469.91 KB |
ScalableBloomFilter | 10000 | 0.02 | 114,675.1 μs | 952.2085 μs | 844.1081 μs | 76.84 | 0.62 | 4 | 43200.0000 | - | - | 199364.2 KB |
HashSet | 50000 | 0.02 | 8,018.3 μs | 23.7326 μs | 21.0383 μs | 1.00 | 0.00 | 1 | 828.1250 | - | - | 3828.13 KB |
StringOptimizedScalableBloomFilter | 50000 | 0.02 | 103,353.9 μs | 1,100.2632 μs | 975.3547 μs | 12.89 | 0.11 | 2 | 26400.0000 | - | - | 122239.81 KB |
GenericOptimizedScalableBloomFilter | 50000 | 0.02 | 693,657.8 μs | 2,892.3871 μs | 2,564.0259 μs | 86.51 | 0.44 | 3 | 285000.0000 | - | - | 1316991.46 KB |
ScalableBloomFilter | 50000 | 0.02 | 734,364.0 μs | 9,817.7735 μs | 9,183.5514 μs | 91.65 | 1.16 | 4 | 282000.0000 | - | - | 1303198.18 KB |
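For context on the memory question, the standard sizing formulas for a single fixed-size Bloom filter give a sense of how small the bit array itself should be. The sketch below assumes a plain (non-scalable) filter; `BloomSizing` and its method names are illustrative and not part of Maybe.NET. A scalable filter chains several such filters with tighter error rates, so it needs somewhat more.

```csharp
using System;

// Illustrative helper, not part of Maybe.NET.
public static class BloomSizing
{
    // Bits needed for n items at false-positive rate p: m = -n * ln(p) / (ln 2)^2
    public static long OptimalBitCount(long expectedItems, double errorRate)
    {
        if (expectedItems <= 0) throw new ArgumentOutOfRangeException(nameof(expectedItems));
        if (errorRate <= 0 || errorRate >= 1) throw new ArgumentOutOfRangeException(nameof(errorRate));
        double ln2 = Math.Log(2);
        return (long)Math.Ceiling(-expectedItems * Math.Log(errorRate) / (ln2 * ln2));
    }

    // Number of hash functions that minimizes the error rate: k = (m / n) * ln 2
    public static int OptimalHashCount(long bitCount, long expectedItems)
    {
        return Math.Max(1, (int)Math.Round((double)bitCount / expectedItems * Math.Log(2)));
    }
}
```

For 1000 items at an error rate of 0.02 this gives 8143 bits (about 1 KB) and 6 hash functions, and roughly 50 KB for 50000 items. The Allocated column in BenchmarkDotNet counts all managed memory allocated per operation, including short-lived garbage such as temporary byte arrays, so the figures in the table likely reflect allocation churn during inserts rather than the retained size of the filter.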
I fixed it this way:
public IEnumerator<T> GetEnumerator()
{
    var currentNode = _headNode.Next[0];
    // Walk the level-0 links, yielding every node that still has a successor.
    while (currentNode != null && currentNode.HasNextAtLevel(0))
    {
        yield return currentNode.Value;
        currentNode = currentNode.Next[0];
    }
    // Yield the final node, which has no successor at level 0.
    if (currentNode != null)
    {
        yield return currentNode.Value;
    }
}