Comments (8)

kakra commented on June 13, 2024

I'd go for rounding down, because hash table sizes below 16 or 32 MB probably make no sense anyway. Then just ignore the little bit of extra data; a slack of 16 MB or less really doesn't matter these days. Just write a warning to the log that the extra space isn't used because of alignment. That is probably also what people expect, because in storage terms, sizes are always rounded down for alignment, never up.

We should NOT use a default size when we end up with 0, otherwise we may create unwanted surprises.
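
A minimal sketch of that policy (round down, warn about the ignored slack, and refuse to fall back to a default size); the constant and function names here are illustrative, not the actual bees code:

#include <sys/types.h>
#include <iostream>
#include <stdexcept>

// Assumed alignment granularity; bees uses BLOCK_SIZE_HASHTAB_EXTENT (16 MiB).
static const off_t HASH_EXTENT_SIZE = 16 * 1024 * 1024;

off_t align_hash_table_size(off_t file_size)
{
        // Round down to the extent boundary and ignore the slack.
        off_t aligned = file_size - (file_size % HASH_EXTENT_SIZE);
        if (aligned < file_size) {
                std::cerr << "hash table: ignoring " << (file_size - aligned)
                          << " bytes at end of file due to alignment" << std::endl;
        }
        if (aligned == 0) {
                // No silent default size: make the admin pick a real size.
                throw std::runtime_error("hash table smaller than one extent after alignment");
        }
        return aligned;
}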

Zygo commented on June 13, 2024

Sounds good. Here's what I'm thinking:

  • There's no significant startup performance gain from the lazy fetching capability--we can just read the entire hash table in the BeesHashTable constructor before any other threads are running. Even if there is a benefit from reading in multiple threads, those threads can all run and finish before the constructor returns, simplifying error handling and getting rid of a lot of pointless flag checks. So we can delete all references to ExtentMetaData::m_missing and all but one reference to fetch_missing_extent.

  • Add O_CREAT to the flags when we try to open an existing hash table. If this fails, throw an exception to abort bees (there is no hash table file and we failed to create one).

  • Get the size of the file rounded down. If the file size is zero, and no size is specified on the command line, throw an exception to abort bees.

Now we have an open FD to an existing file (possibly an empty one we just created).

  • If no size was specified on the command line, set the target size to the existing file size; otherwise, the target size is the size specified on the command line.

  • Do the mmap(), mlock(), and madvise() with the target size (note that we don't use m_fd in mmap, so we don't require the file size and mmap size to be identical). This will give us an all-zero in-memory buffer of the final desired size.
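
A rough sketch of the open/size/mmap steps above, with simplified error handling; the struct and helper names are made up for illustration and don't match the real BeesHashTable code:

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdexcept>

static const off_t HASH_EXTENT_SIZE = 16 * 1024 * 1024;  // assumed alignment

struct HashFile {
        int   fd;
        off_t file_size;    // existing on-disk size, rounded down
        off_t target_size;  // size the in-memory table will use
        void *buf;          // all-zero buffer of target_size bytes
};

HashFile open_hash_table(const char *path, off_t cmdline_size /* 0 if not given */)
{
        HashFile h{};

        // O_CREAT: create the file if it doesn't exist; abort bees if that fails.
        h.fd = open(path, O_RDWR | O_CREAT, 0600);
        if (h.fd < 0) throw std::runtime_error("cannot open or create hash table");

        struct stat st{};
        if (fstat(h.fd, &st)) throw std::runtime_error("fstat failed");

        // Round the existing size down; zero size with no command-line size is fatal.
        h.file_size = st.st_size - (st.st_size % HASH_EXTENT_SIZE);
        if (h.file_size == 0 && cmdline_size == 0)
                throw std::runtime_error("hash table is empty and no size was specified");

        // Command-line size wins; otherwise keep the existing file size.
        h.target_size = cmdline_size ? cmdline_size : h.file_size;

        // Anonymous mapping of the target size: the fd isn't passed to mmap(),
        // so the file size and the mapping size don't have to match.
        h.buf = mmap(nullptr, h.target_size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (h.buf == MAP_FAILED) throw std::runtime_error("mmap failed");
        mlock(h.buf, h.target_size);                  // keep the table resident
        madvise(h.buf, h.target_size, MADV_DONTFORK); // one example advice flag
        return h;
}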

Before starting the prefetch and writeback threads, read the data into the hash table:

  • If the FD's file size (after rounding) is equal to the target size (after rounding), then simply read the data from disk directly into the hash table (this will be the one remaining reference to fetch_missing_extent). At the end, all m_dirty fields in m_extent_metadata should be false.
  • If the existing table size is different, read the existing table one extent at a time into a temporary buffer, and insert every non-zero Cell from that buffer into the hash table.
    • Add a new method push_back_hash_addr which places new data in the first zero Cell of a Bucket, or discards the data if no zero Cell is available. This method will set m_dirty as required (see the sketch after this list).
    • Keep counts of inserted and dropped Cells and log the counts at EOF.
    • Ignore read errors or short reads, just skip to the following extent.
    • Set a flag m_resized = true for the writeback thread.
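
A hypothetical sketch of push_back_hash_addr as described above; the Cell layout, bucket geometry, and counters are assumptions, not the real bees data structures:

#include <cstddef>
#include <cstdint>

using HashType = uint64_t;
using AddrType = uint64_t;

struct Cell { HashType e_hash; AddrType e_addr; };        // assumed layout

static const size_t CELLS_PER_BUCKET = 16;                // assumed geometry
struct Bucket { Cell cells[CELLS_PER_BUCKET]; };

struct InsertStats { uint64_t inserted = 0; uint64_t dropped = 0; };

// Place new data in the first zero (empty) Cell of the bucket, or drop it if
// the bucket is full. The caller marks the containing extent dirty when this
// returns true, and logs the counters at EOF.
bool push_back_hash_addr(Bucket &bucket, HashType hash, AddrType addr, InsertStats &stats)
{
        for (auto &cell : bucket.cells) {
                if (cell.e_hash == 0 && cell.e_addr == 0) {
                        cell.e_hash = hash;
                        cell.e_addr = addr;
                        ++stats.inserted;
                        return true;
                }
        }
        ++stats.dropped;
        return false;
}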

Start prefetch_thread (which should be renamed to 'statistics_thread' since it no longer prefetches anything) and writeback_thread.

  • Modify the writeback thread so that after all extents are written but before we wait on the condvar, we check m_resized. If m_resized is true, truncate the file to the target size and set m_resized to false.
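
A sketch of that writeback change; the locking and the extent flush are simplified stand-ins for the real writeback thread:

#include <sys/types.h>
#include <unistd.h>
#include <atomic>
#include <condition_variable>
#include <mutex>

static std::atomic<bool> m_resized{false};
static std::mutex m_mutex;
static std::condition_variable m_condvar;

// Placeholder for the existing "write all dirty extents" logic.
static void write_dirty_extents(int /* fd */) {}

void writeback_loop(int fd, off_t target_size)
{
        while (true) {
                write_dirty_extents(fd);

                // New step: after all extents are written, shrink (or grow)
                // the file to the target size exactly once per resize.
                if (m_resized.exchange(false)) {
                        if (ftruncate(fd, target_size) != 0) {
                                // A real implementation would log this error.
                        }
                }

                // Existing behavior: sleep until more extents become dirty.
                std::unique_lock<std::mutex> lock(m_mutex);
                m_condvar.wait(lock);
        }
}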

The above overwrites the hash table in place, which means the hash table may be lost if bees is interrupted during the (very slow) writeback. It may be preferable to do this variant instead:

  • After reading the old file as above but before creating writeback_thread, create a new temporary file
  • truncate() temporary file to desired size
  • Write the entire hash table to the temporary file (skip non-dirty pages to save IO in the file creation case)
  • Rename the temporary file over the old
  • Set all m_dirty = false
  • Keep the temporary file's FD as m_fd and start writeback_thread
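
A sketch of the temporary-file variant above; the path and names are illustrative, and the single pwrite() stands in for writing the table extent by extent (skipping clean extents in the file-creation case):

#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
#include <stdexcept>
#include <string>

int rewrite_hash_table(const std::string &path, const void *table, off_t target_size)
{
        const std::string tmp_path = path + ".tmp";   // hypothetical temp name

        int tmp_fd = open(tmp_path.c_str(), O_RDWR | O_CREAT | O_TRUNC, 0600);
        if (tmp_fd < 0) throw std::runtime_error("cannot create temporary hash table");

        // Size the file, then write the whole in-memory table into it.
        if (ftruncate(tmp_fd, target_size))
                throw std::runtime_error("ftruncate failed");
        if (pwrite(tmp_fd, table, target_size, 0) != target_size)
                throw std::runtime_error("short write of hash table");

        // Atomically replace the old table, so an interrupted resize never
        // leaves a half-written file behind.
        if (rename(tmp_path.c_str(), path.c_str()))
                throw std::runtime_error("rename failed");

        // Keep this fd as m_fd for writeback_thread; all extents are now clean.
        return tmp_fd;
}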

The problem with that approach is that the "write entire hash table to the temporary file" step is pretty heavy on IO, especially if the hash table is more than 50% of RAM. The system will be unusable during this IO as the VM subsystem will block all writes. Maybe nobody but me cares about that though, especially if resizing is a rare case.

Zygo commented on June 13, 2024

The hash table is loaded lazily by extent scanning. If the hash table fails to load because of an exception, the extent scanner's exception handler traps the exception and returns to the scanning loop for the next extent. The hash table loader throws an exception if the hash table is an unacceptable size.

Repeat until the hash table is modified to be an acceptable size, or the admin gives up and kills the bees process.

I guess we could try to load the hash table before starting crawlers, and if the exception occurs there, exit bees with an error.

nefelim4ag commented on June 13, 2024

@Zygo, maybe we could also align the hash table size down to BLOCK_SIZE_HASHTAB_EXTENT automatically, and run truncate() on the file afterwards?

kakra commented on June 13, 2024

Without having tested it, couldn't we just do:

diff --git a/src/bees-hash.cc b/src/bees-hash.cc
index 7ba562f..117452d 100644
--- a/src/bees-hash.cc
+++ b/src/bees-hash.cc
@@ -617,6 +617,7 @@ BeesHashTable::open_file()

        Stat st(new_fd);
        off_t new_size = st.st_size;
+       new_size = new_size - (new_size % BLOCK_SIZE_HASHTAB_EXTENT);

        THROW_CHECK1(invalid_argument, new_size, new_size > 0);
        THROW_CHECK1(invalid_argument, new_size, (new_size % BLOCK_SIZE_HASHTAB_EXTENT) == 0);

Zygo commented on June 13, 2024

Should we round up or round down? What happens if the file is 15.9MB and the size rounds down to zero? What if there is some other error opening the file?

I guess we should do all of these:

  1. Construct the hash table early enough that we can exit bees if there is an exception during the constructor
  2. Round the size down (no need to truncate, we'll just ignore the extra data?)
  3. If the rounded size is zero then either fail or use a default size

or

  1. As above
  2. Round the size up
  3. Ignore short reads during fetch. Flush will extend the file when it writes out the last extent.

I'd really like to have a 'hash table size' option, and create/resize the hash table if the on-disk size is different from the size specified by the option.
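
A sketch of that combined policy, assuming a hypothetical size-option value (0 when the option is not given); it picks "fail" rather than "use a default size" when the rounded size is zero, and is not the actual bees option handling:

#include <sys/types.h>
#include <stdexcept>

static const off_t HASH_EXTENT_SIZE = 16 * 1024 * 1024;   // assumed alignment

struct SizeDecision {
        off_t target_size;    // size the table should end up at
        bool  resize_needed;  // on-disk size differs from the target
};

SizeDecision decide_hash_size(off_t on_disk_size, off_t option_size)
{
        // Round down and ignore the unaligned tail (no immediate truncate).
        off_t rounded = on_disk_size - (on_disk_size % HASH_EXTENT_SIZE);

        // With no usable size anywhere, fail rather than guess a default.
        if (rounded == 0 && option_size == 0)
                throw std::runtime_error("hash table size is zero and no size option was given");

        // The option wins; a mismatch with the on-disk size triggers a resize.
        off_t target = option_size ? option_size : rounded;
        return SizeDecision{target, target != rounded};
}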

kakra commented on June 13, 2024

BTW: I volunteer to add the size configuration option if you don't mind. I might need some advice along the way, and we should settle on the expected behavior first.

Zygo commented on June 13, 2024

It seems we will need to support incompatible hash table format changes (see #33 (comment)). Changes to hash table bucket size (assuming the same hash algorithm) can be achieved the same way as a hash table resize.

Add to the above:

  • when a new hash table is written due to a size, algorithm, or layout change, store some format information in the table (e.g. store 0x4245455300000000 | format_info_bits in the first cell's hash field, which is not a possible value for that field in the current format).

  • change the size alignment to 128KB instead of 16M, or make it configurable. 128KB is the largest useful size for hash table storage on compressed filesystems. The hash table is usually compressible because adjacent hash values are similar (20+ bits in common) and there are a lot of zeros in block numbers.
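
A sketch of the format-marker idea above; the Cell layout and the high/low bit split are assumptions, not the actual on-disk format:

#include <cstdint>

struct Cell { uint64_t e_hash; uint64_t e_addr; };        // assumed layout

// "BEES" in the high 32 bits; the low bits carry format flags.
static const uint64_t HASH_FORMAT_MAGIC = 0x4245455300000000ULL;

// Stamp the first cell when a new table is written after a size, algorithm,
// or layout change.
void stamp_format(Cell &first_cell, uint64_t format_info_bits)
{
        first_cell.e_hash = HASH_FORMAT_MAGIC | format_info_bits;
}

// On load: a stamped first cell describes the table layout; anything else is
// a table in the current (unstamped) format.
bool read_format(const Cell &first_cell, uint64_t &format_info_bits)
{
        if ((first_cell.e_hash & 0xFFFFFFFF00000000ULL) != HASH_FORMAT_MAGIC)
                return false;
        format_info_bits = first_cell.e_hash & 0x00000000FFFFFFFFULL;
        return true;
}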
