Comments (5)

martinsumner commented on August 20, 2024

There's a branch - mas-i95-lz4sst

rebar ct test runs appear to be consistently quicker (but only by the order of 1%).

Running a full volume test in Riak, we can see how the sst slot_fetch time has changed between this test and master:

index=* SST01 | rex "SST timing for result (?<ResultType>[a-z_]+) is sample (?<SampleSize>[0-9]+) total (?<TotalTime>[0-9]+) and max [0-9]+" | stats sum(TotalTime) AS TotalTimeAll sum(SampleSize) as SampleSizeAll by ResultType | eval MeanTimeAll = TotalTimeAll/SampleSizeAll 
| SST Timing Point | master (µs) | mas-i95-lz4sst (µs) |
|---|---|---|
| tiny_bloom | 10.85 | 10.33 |
| slot_bloom | 21.85 | 20.89 |
| slot_fetch | 320.61 | 275.33 |

Timings are in microseconds. So using LZ4 reduces the time it takes to fetch from a slot by about 45 microseconds (roughly 14%).
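For anyone without Splunk to hand, the same aggregation can be sketched in Python. This is a minimal sketch, assuming the log lines match the rex pattern in the query above; the sample lines and the function name `mean_time_by_result` are invented for illustration and are not taken from a real Riak log.

```python
import re
from collections import defaultdict

# Same pattern as the Splunk rex above; Python spells named groups (?P<name>...).
SST_TIMING = re.compile(
    r"SST timing for result (?P<ResultType>[a-z_]+) is sample (?P<SampleSize>[0-9]+) "
    r"total (?P<TotalTime>[0-9]+) and max [0-9]+"
)


def mean_time_by_result(lines):
    """Return {ResultType: sum(TotalTime) / sum(SampleSize)}, in microseconds."""
    totals = defaultdict(lambda: [0, 0])  # ResultType -> [total time, sample count]
    for line in lines:
        match = SST_TIMING.search(line)
        if match is None:
            continue  # not an SST timing line
        entry = totals[match["ResultType"]]
        entry[0] += int(match["TotalTime"])
        entry[1] += int(match["SampleSize"])
    return {name: total / count for name, (total, count) in totals.items() if count > 0}


# Hypothetical log lines, invented for illustration only.
sample_lines = [
    "SST01 SST timing for result slot_fetch is sample 100 total 30000 and max 900",
    "SST01 SST timing for result slot_fetch is sample 100 total 34000 and max 950",
    "SST01 SST timing for result tiny_bloom is sample 50 total 500 and max 40",
    "unrelated log line",
]
print(mean_time_by_result(sample_lines))
```

As in the Splunk query, the mean is weighted by sample size (total time divided by total samples) rather than being an average of per-line means.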

45 microseconds is irrelevant in the context of GET performance, but if we're doing 5K GETs per second on a 5 node cluster, with 4 slot_fetches per GET, it amounts to about 20% of one CPU core in CPU activity. When CPU bound, this may make an overall difference.
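One reading of that arithmetic, as a rough sketch: it assumes the 5K GETs per second is a cluster-wide rate and that the load is spread evenly across the 5 nodes. Neither assumption is stated explicitly above. On that reading the saving is about 0.9 of a core across the cluster, or about 18% of one core per node.

```python
# Figures quoted in the comment above.
GETS_PER_SECOND = 5_000       # assumed to be the cluster-wide GET rate
SLOT_FETCHES_PER_GET = 4
SAVING_PER_FETCH_US = 45      # slot_fetch saving, in microseconds
NODES = 5


def cpu_seconds_saved_per_second(gets, fetches_per_get, saving_us):
    """CPU time saved per wall-clock second, i.e. a fraction of one core."""
    return gets * fetches_per_get * saving_us / 1_000_000


cluster_cores = cpu_seconds_saved_per_second(
    GETS_PER_SECOND, SLOT_FETCHES_PER_GET, SAVING_PER_FETCH_US
)
per_node_cores = cluster_cores / NODES  # assumes an even spread of load
print(f"cluster: {cluster_cores:.2f} cores, per node: {per_node_cores:.2f} cores")
```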

When the SST file is being created, there appears to be a small delta with the SST file write times taking slightly longer in the test:

index=* SST08 | rex "at level (?<LedgerLevel>[0-9]{1}) with max sqn [0-9]+ with time taken (?<SSTCreateTime>[0-9]+) ms"  | stats mean(SSTCreateTime) by LedgerLevel 
| SST File Level | master (ms) | mas-i95-lz4sst (ms) |
|---|---|---|
| 1 | 20.86 | 21.56 |
| 2 | 21.51 | 22.42 |
| 3 | 24.63 | 25.94 |

Timings are in milliseconds. So the cost of creating an SST file has increased marginally, by roughly 1 ms. At peak in these tests, though, each node is seeing < 4 SST file creations per second, which is under 4 ms of extra CPU time per second (less than 0.4% of one core). So the CPU cost of this additional millisecond isn't relevant.

Tests are still running, I will post a comment with graphs on the impact of overall throughput this evening.

from leveled.

martinsumner commented on August 20, 2024

The chart below shows the difference with and without LZ4 compression of the SST (rather than using term_to_binary [compressed]). The throughput with LZ4 improves by 3%. The most notable performance improvement is a 12% reduction in 2i query time.

[image: chart of throughput with and without LZ4 compression of the SST]

martinsumner commented on August 20, 2024

The other aspect of compression in Leveled is compression in the Journal. By default (in the current master), the Journal does not compress values on receipt; it only compresses the values during compaction.

The current compression is zlib. Turning on compression on receipt (so values are compressed immediately rather than at compaction) doubles the time taken by the Inker to PUT the object (1ms -> 2ms). This increase leads to a drop in throughput of 6.5%.

This drop in throughput was why the compression was disabled.
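The trade-off between the two policies can be illustrated with a toy model. This is only a sketch: `ToyJournal` and its methods are hypothetical names, it is written in Python rather than Erlang, and it does not reflect how Leveled's Journal or Inker are actually implemented. It shows only where the zlib cost is paid: on the PUT path, or later in a background compaction step.

```python
import zlib


class ToyJournal:
    """Toy key/value store; NOT Leveled's Journal, just the two compression policies."""

    def __init__(self, compress_on_receipt=False):
        self.compress_on_receipt = compress_on_receipt
        self.entries = {}  # key -> (is_compressed, payload bytes)

    def put(self, key, value: bytes):
        if self.compress_on_receipt:
            # The compression cost is paid on the PUT path.
            self.entries[key] = (True, zlib.compress(value))
        else:
            self.entries[key] = (False, value)

    def compact(self):
        # Background step: compress anything still stored raw.
        for key, (is_compressed, payload) in list(self.entries.items()):
            if not is_compressed:
                self.entries[key] = (True, zlib.compress(payload))

    def get(self, key) -> bytes:
        is_compressed, payload = self.entries[key]
        return zlib.decompress(payload) if is_compressed else payload

    def stored_bytes(self):
        return sum(len(payload) for _, payload in self.entries.values())


value = b"riak object body " * 64  # made-up, highly compressible payload
lazy = ToyJournal(compress_on_receipt=False)
eager = ToyJournal(compress_on_receipt=True)
lazy.put("k1", value)
eager.put("k1", value)
size_before_compaction = lazy.stored_bytes()
lazy.compact()
print(size_before_compaction, lazy.stored_bytes(), eager.stored_bytes())
```

In this toy both policies reach the same stored size once compaction has run. The difference is that the eager policy pays for compression on every PUT, and that PUT-path cost is what the Inker timings above measure.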

So what if LZ4 compression is used for the Journal - can this 1ms PUT overhead be reduced?

martinsumner commented on August 20, 2024

Switching the Journal compression to LZ4, and actively compressing on receipt, had a negligible impact on throughput.

There is a 184 microsecond increase in the Inker PUT time, compared to disabling compression on receipt. This compares favourably to the 1ms increase from using zlib.

Although this CPU time will have an impact on throughput, it is a change with potential compensating improvements from reduced object sizes etc.

martinsumner commented on August 20, 2024

Based on these results - NIF support has been merged in:

#99
