
Comments (12)

lukoktonos commented on May 14, 2024

The performance results favored switching to the block-based format.

https://gerrit.readyset.name/c/readyset/+/5206

https://readyset-workspace.slack.com/archives/C05471W2M44/p1687374146825119

from readyset.

lukoktonos commented on May 14, 2024

It's REA-2714 (mentioned above). I'm going to run at least the PlainTable vs. BlockBased performance comparison to validate that it is OK to switch.

lukoktonos commented on May 14, 2024

We had a task slated to benchmark other table formats against PlainTable for both write and read latency, specifically for this reason. I think @Ethan might have been working on it, but I'm not sure where it ended up.

lukoktonos commented on May 14, 2024

Thanks. Not much to go by out there I'm afraid.

lukoktonos commented on May 14, 2024
commit 3e668db2de9a66e9788371de981377fbb19b17ed
Author: Martin Ek <[email protected]>
Date:   Wed May 2 23:13:11 2018 +0200

    Use rocksdb's plain table SST format

Looks like this change was made back in the noria days and there isn't much context as to why it was chosen.

If I had to take a wild guess, the wording in https://github.com/facebook/rocksdb/wiki/PlainTable-Format ("PlainTable is a RocksDB's SST file format optimized for low query latency on pure-memory or really low-latency media.") sounds appealing for a research project focused on low query latency.

They also had the advantage of running on smaller datasets and avoiding the "file is too large" problem that real world scenarios can run into.

lukoktonos commented on May 14, 2024

@KwilLuke Can you git-blame to see who might have made the choice to use PlainTable instead of the default?

lukoktonos commented on May 14, 2024

PlainTable Format is the only rocksdb SST format that has this size restriction, and it is a hard limit due to the data structure using 32-bit integers in indexing.

It seems like we should be able to configure rocksdb to avoid making a file greater than the 2GiB limit in compaction, and from what I can tell we set it to make 256MiB files at most, but there must be something that is at least temporarily making one too large. I'm not exactly sure where yet, and it is a bit slow to reproduce this repeatedly.
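To make the arithmetic concrete, here is a minimal sketch of the size check in plain Rust, using no RocksDB calls. It assumes the limit is 2^31 - 1 bytes, which is the figure the PlainTable wiki page gives, and it takes the 256MiB target from the comment above. The constant and function names are hypothetical and are not taken from the ReadySet code.

```rust
/// Assumed PlainTable file size limit: 2^31 - 1 bytes (signed 32-bit offsets).
/// This figure comes from the RocksDB PlainTable wiki page, not from our code.
const PLAIN_TABLE_MAX_FILE_BYTES: u64 = (1u64 << 31) - 1;

/// The per-file target size mentioned in this thread (256MiB).
const TARGET_FILE_SIZE_BYTES: u64 = 256 * 1024 * 1024;

/// Returns true if a file of `size_bytes` stays within the assumed limit.
fn fits_in_plain_table(size_bytes: u64) -> bool {
    size_bytes <= PLAIN_TABLE_MAX_FILE_BYTES
}

/// Number of whole target-sized chunks that fit under the assumed limit.
fn max_whole_targets() -> u64 {
    PLAIN_TABLE_MAX_FILE_BYTES / TARGET_FILE_SIZE_BYTES
}
```

Under these assumptions a file has to grow to roughly eight times the configured target before it crosses the limit, which suggests the oversized file is produced by a path that does not honor the target size.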

As we already have a hunch that BlockBased Table Format may be a better choice (it is the default for rocksdb), I will try using that and confirm that it can handle the failure scenario in this ticket, as well as sanity check the performance a bit, which will cover some of the matrix of REA-2714.

lukoktonos commented on May 14, 2024

cc: @jasobrown Who I know was interested in RocksDB settings as well.

lukoktonos commented on May 14, 2024

> So, are you looking at implementing REA-2900 first?

No, I was going to work on this one first.

> I thought we might already be invoking "PrepareForBulkLoad" in our code.

We don't call this directly at the moment. We do make option tweaks similar to the ones PrepareForBulkLoad makes to improve performance, though. I tried calling it and didn't see any significant performance improvement.

lukoktonos commented on May 14, 2024

So, are you looking at implementing REA-2900 first? I thought we might already be invoking "PrepareForBulkLoad" in our code. Seems worthwhile for us to try that out first, I agree.

lukoktonos commented on May 14, 2024

If we implement REA-2900 before this, it may either resolve this or have a different signature, with compaction failing with the same error after opening PersistentState rather than during.

lukoktonos commented on May 14, 2024

Fixed in 8a32b30

