Comments (3)

sweettea commented on July 18, 2024

Greetings;

The physical_block_size is the smallest unit that the device can address without doing read-modify-write operations. VDO is coded to do all IO (including all deduplication and compression) in 4096-byte units -- we found in our test datasets that deduplication in 4k chunks is a good balance of deduplication success and deduplication metadata usage.
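As a rough illustration of the point above, the following Python sketch reads `physical_block_size` from the standard Linux sysfs location (`/sys/block/<device>/queue/physical_block_size`) and checks whether a 4096-byte write would fill whole physical blocks. The helper names `read_queue_attr` and `needs_read_modify_write`, and the `sysfs_root` parameter, are made up for this example and are not part of VDO.

```python
from pathlib import Path

# VDO's fixed I/O unit, as stated in the comment above.
VDO_BLOCK_SIZE = 4096


def read_queue_attr(device: str, attr: str, sysfs_root: str = "/sys/block") -> int:
    """Read an integer block-queue attribute such as physical_block_size.

    sysfs_root is a hypothetical parameter so the function can be pointed
    at a fake directory tree for testing.
    """
    path = Path(sysfs_root) / device / "queue" / attr
    return int(path.read_text().strip())


def needs_read_modify_write(physical_block_size: int,
                            io_size: int = VDO_BLOCK_SIZE) -> bool:
    """Return True when an aligned write of io_size bytes cannot cover
    whole physical blocks, so a lower layer has to read-modify-write."""
    return io_size % physical_block_size != 0
```

For a device named `sdb` (a placeholder), `needs_read_modify_write(read_queue_attr("sdb", "physical_block_size"))` would be true for an 8k physical block size and false for 512-byte or 4k physical blocks.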

What are you trying to do? Do you have a device with an 8k physical_block_size under VDO? Are you trying to get a filesystem atop VDO to use an 8k block size? I'm happy to help make VDO work for your use case.

from kvdo.

HerlinCoder commented on July 18, 2024

Hi,
I have the same question: is it possible to change VDO_BLOCK_SIZE to 8K or more? I'm looking to add some features to compression and am wondering whether a 4K block size is too small.

corwin commented on July 18, 2024

Sorry to leave this hanging, but trying to operate VDO with a larger block size is most likely difficult if not impossible and probably won't achieve what you are looking to do. Absent an answer to the question sweettea posed in response to the original issue, I'll try to cover a few possibilities.

First off, we chose a 4K block size for VDO for two reasons: at least on x86 it is the default Linux page size, which is convenient, and our experiments showed that 4K was the best tradeoff between deduplication savings and metadata size. As the block size gets smaller, the amount of mapping data grows, which reduces space savings (due to increased overhead) and hurts performance (due to increased metadata updates). On the other hand, as the block size gets bigger, the number of duplicate blocks falls rapidly for most real-world datasets. Above 4K, VDO generally can't save enough space to make it worth the memory and performance costs.
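To make the tradeoff concrete, here is a small Python sketch that counts how many fixed-size blocks a buffer splits into (a stand-in for the number of mapping entries) and how many of those are unique (what would actually need to be stored). The data is a contrived toy, not one of the datasets mentioned above, and hashing with SHA-256 is only a convenient way to spot identical blocks here; it is not a claim about how VDO indexes data.

```python
import hashlib


def dedup_stats(data: bytes, block_size: int) -> tuple[int, int]:
    """Split data into fixed-size blocks and return (total_blocks, unique_blocks).

    total_blocks approximates the number of mapping entries needed;
    unique_blocks approximates how many blocks must actually be stored.
    """
    unique = set()
    total = 0
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        unique.add(hashlib.sha256(block).digest())
        total += 1
    return total, len(unique)


# Toy data: three distinct 4K blocks (A, B, C) repeated in a shuffled order.
A, B, C = (bytes([value]) * 4096 for value in (1, 2, 3))
toy_data = b"".join([A, B, A, C, B, A, C, B])

stats_4k = dedup_stats(toy_data, 4096)  # (8, 3): 8 entries, 3 blocks stored
stats_8k = dedup_stats(toy_data, 8192)  # (4, 4): 4 entries, nothing deduplicates
```

In this toy case, 4K blocks store 3 * 4096 = 12288 bytes, while 8K blocks halve the number of mapping entries but store 4 * 8192 = 32768 bytes, because the duplicates no longer line up on block boundaries. Real datasets will behave differently in degree.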

Using VDO on top of devices whose block size is larger than 4K does work (although there may still be some bugs related to this as we have limited ability to test in such environments). It is less performant since the layers below VDO are forced to do read-modify-writes.
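The read-modify-write cost mentioned above can be sketched with a toy model in Python. This is only an in-memory simulation of a device that can write whole physical blocks and nothing smaller; the function name `write_with_rmw` and its return value are invented for this example and do not correspond to any kernel or VDO interface.

```python
def write_with_rmw(device: bytearray, offset: int, payload: bytes,
                   physical_block_size: int) -> tuple[int, int]:
    """Write payload at offset on a simulated device that only writes whole
    physical blocks. Returns (bytes_read, bytes_written) at the device level.
    """
    end = offset + len(payload)
    if offset < 0 or end > len(device):
        raise ValueError("write falls outside the device")
    if not payload:
        return 0, 0

    bytes_read = bytes_written = 0
    first_block = offset - offset % physical_block_size
    for block_start in range(first_block, end, physical_block_size):
        block_end = block_start + physical_block_size
        lo = max(offset, block_start)
        hi = min(end, block_end)
        if hi - lo < physical_block_size:
            # Partial block: read the old contents first (the "read" in RMW).
            block = bytearray(device[block_start:block_end])
            bytes_read += physical_block_size
        else:
            # Fully covered block: no read needed.
            block = bytearray(physical_block_size)
        block[lo - block_start:hi - block_start] = payload[lo - offset:hi - offset]
        device[block_start:block_end] = block
        bytes_written += physical_block_size
    return bytes_read, bytes_written


disk = bytearray(16384)
# A 4K write onto 8K physical blocks reads 8K and writes 8K.
cost_8k = write_with_rmw(disk, 4096, b"\xaa" * 4096, 8192)
# The same 4K write onto 4K physical blocks reads nothing and writes 4K.
cost_4k = write_with_rmw(disk, 4096, b"\xaa" * 4096, 4096)
```

In this model, every 4K write on an 8K device costs an extra 8K read and doubles the bytes written, which is the overhead the layers below VDO would carry.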

If the goal is to have VDO actually read and write in larger chunks, that is probably not possible at this time. While it would be possible to change the value of the VDO_BLOCK_SIZE constant and recompile the module, it is likely that there would be problems either in the on-disk layout or in the assumptions about the number of entries which can fit in VDO's journals and other data structures. At best, the efficiency of the on-disk data structures would be reduced, and at worst, the correctness of the device would be compromised.
