
Comments (7)

khimaros commented on June 16, 2024

I've documented the behavior difference between the 4.19 and 5.7 kernels in README.md, along with a tabular overview, and have made the requested changes to random_write.py. I've also included configs for all explorations in the explorations/ folder. Closing this issue.

from raid-explorations.

khimaros commented on June 16, 2024

@tomato42 -- thank you for the feedback.

As explained in the README.md, reattaching the dm-integrity device with --integrity-no-journal still does not allow the device to be reattached to the MD array:

# integritysetup open --integrity sha256 -D -R /dev/sda3 sda3_int
# mdadm --manage /dev/md0 --add /dev/mapper/sda3_int
[ ... ] Buffer I/O error on dev dm-3, logical block 1, async page write

The only way to reattach that I found is to format and recreate the integrity device from scratch.

Do you happen to know where, precisely, the DM metadata would be on a 10GB device? In my tests I am skipping the first 20MB and last 16MB of the device to avoid metadata corruption.

Broadly, if you have an idea for a better way to simulate disk corruption I'm very open to pull requests.


tomato42 commented on June 16, 2024

> @tomato42 -- thank you for the feedback.
>
> As explained in the README.md, reattaching the dm-integrity device with --integrity-no-journal still does not allow the device to be reattached to the MD array. The only way to reattach that I found is to format and recreate the integrity device from scratch.

Ah, sorry, I didn't notice that. That would suggest a bug in dm-integrity, though: if we tell it to ignore the journal, it should ignore it...

> Do you happen to know where, precisely, the DM metadata would be on a 10GB device? In my tests I am skipping the first 20MB and last 16MB of the device to avoid metadata corruption.

Actually, it seems the journal size is independent of the size of the device. A quick check like this:

truncate -s 10G int-phys
integritysetup format --no-wipe --integrity sha256 int-phys
hexdump -C int-phys

shows that the last non-zero byte is at offset 0x03ffbfff, which is very close to the 64MiB size of the 1.5TiB device

So it looks like for dm-integrity, the first 64MiB of the device are special
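The arithmetic behind that estimate can be checked directly. This is only a sketch of the calculation, using the single offset quoted above; the variable names are mine, not from the repository.

```python
# Offset of the last non-zero byte reported by hexdump in the check above.
last_nonzero = 0x03FFBFFF
MIB = 1024 * 1024

# Bytes from offset 0 through the last non-zero byte, inclusive.
used_bytes = last_nonzero + 1
# How far short of a full 64 MiB that region ends.
gap_to_64mib = 64 * MIB - used_bytes

print(f"used: {used_bytes} bytes ({used_bytes / MIB:.3f} MiB)")
print(f"short of 64 MiB by {gap_to_64mib // 1024} KiB")
```

So the written region ends 16 KiB before the 64 MiB mark, which is why treating the first 64 MiB as special is a reasonable rule of thumb.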

> Broadly, if you have an idea for a better way to simulate disk corruption I'm very open to pull requests.

I think simply increasing the skips to something like 0.5GiB should be sufficient (see also mdadm --examine /dev/mapper/sda3_int | grep Data\ Offset)
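If the skip were to be derived from that mdadm output rather than hard-coded, the value could be pulled out roughly like this. It is a sketch that assumes the usual "Data Offset : N sectors" line with 512-byte sectors; parse_data_offset_bytes is a hypothetical helper, not something in random_write.py.

```python
import re

SECTOR_SIZE = 512  # assumption: mdadm reports the data offset in 512-byte sectors


def parse_data_offset_bytes(examine_output: str) -> int:
    """Return the 'Data Offset' from `mdadm --examine` text, in bytes."""
    match = re.search(r"Data Offset\s*:\s*(\d+)\s+sectors", examine_output)
    if match is None:
        raise ValueError("no 'Data Offset' line found in mdadm output")
    return int(match.group(1)) * SECTOR_SIZE
```

The text passed in would be the stdout of mdadm --examine for the member device, for example captured with subprocess.run(..., capture_output=True, text=True).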


tomato42 commented on June 16, 2024

One more thing about mdadm --manage /dev/md0 --add /dev/mapper/sda3_int: mdadm looks for a superblock in all the places a superblock can be (i.e. the beginning of the disk, 4K from the start, and the end of the disk, representing the metadata formats 1.1, 1.2 and 1.0 respectively), so if the blocks where the md-raid metadata can be are damaged, the --add operation will fail too.
You can override that behaviour by specifying the correct --metadata= option while adding the volume.
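To see which byte ranges that covers on a given device, the three candidate positions can be computed as below. This is a sketch: the 1.1 and 1.2 positions come from the description above, while the exact rule for 1.0 (at least 8 KiB before the end, rounded down to a 4 KiB boundary) is my assumption about the md driver and worth verifying. md_superblock_offsets is a hypothetical name.

```python
SECTOR = 512  # md counts positions in 512-byte sectors


def md_superblock_offsets(device_size: int) -> dict:
    """Return candidate superblock byte offsets keyed by metadata version."""
    sectors = device_size // SECTOR
    # Assumption: 1.0 sits at least 8 KiB (16 sectors) before the end,
    # rounded down to a 4 KiB (8 sector) boundary.
    end_sb = ((sectors - 16) & ~7) * SECTOR
    return {
        "1.1": 0,      # start of the device
        "1.2": 4096,   # 4K from the start
        "1.0": end_sb, # near the end of the device
    }


if __name__ == "__main__":
    for version, offset in md_superblock_offsets(10 * 1024**3).items():
        print(version, offset)
```

Any corruption range that avoids the first and last few megabytes stays clear of all three positions, whichever metadata version the array uses.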


khimaros commented on June 16, 2024

> One more thing about mdadm --manage /dev/md0 --add /dev/mapper/sda3_int: mdadm looks for a superblock in all the places a superblock can be (i.e. the beginning of the disk, 4K from the start, and the end of the disk, representing the metadata formats 1.1, 1.2 and 1.0 respectively), so if the blocks where the md-raid metadata can be are damaged, the --add operation will fail too.
> You can override that behaviour by specifying the correct --metadata= option while adding the volume.

It sounds like this should not be a constraint with the approach you recommended above: skipping the first 512MB and final 128MB of the disk.


tomato42 commented on June 16, 2024

Yes, ideally you would query dm-integrity for the size of the journal and then query mdadm for the data offset.

but then, do we have a reason to believe that mdadm and dm-integrity will handle a read error and checksum failure differently in first sector than in a millionth one? It's easier to just do "large enough" skip than to calculate exact value


khimaros commented on June 16, 2024

> but then, do we have a reason to believe that mdadm and dm-integrity will handle a read error and checksum failure differently in first sector than in a millionth one? It's easier to just do "large enough" skip than to calculate exact value

I'm not too concerned about the difficulty and have updated random_write.py to use a 512MB offset at the start and 128MB at the end. A further improvement would be to corrupt entire sectors rather than random byte locations.
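The sector-based variant mentioned above might look roughly like the following. This is a sketch, not the actual random_write.py: the function names, the 512-byte sector size and the seed parameter are assumptions, and only the 512MB and 128MB skips come from this thread. It overwrites data, so it should only ever be pointed at a scratch device or image file.

```python
import os
import random

SECTOR_SIZE = 512                # assumed logical sector size
SKIP_START = 512 * 1024 * 1024   # leave the first 512MB untouched
SKIP_END = 128 * 1024 * 1024     # leave the final 128MB untouched


def pick_sector_offsets(device_size, count, rng,
                        skip_start=SKIP_START, skip_end=SKIP_END,
                        sector_size=SECTOR_SIZE):
    """Choose `count` distinct sector-aligned byte offsets outside the skips."""
    first = -(-skip_start // sector_size)             # round up to a sector
    last = (device_size - skip_end) // sector_size    # exclusive upper bound
    if last - first < count:
        raise ValueError("device too small for the requested skips")
    return [s * sector_size for s in sorted(rng.sample(range(first, last), count))]


def corrupt_sectors(path, count, seed=None,
                    skip_start=SKIP_START, skip_end=SKIP_END):
    """Overwrite `count` whole sectors of `path` with pseudo-random bytes."""
    rng = random.Random(seed)
    with open(path, "r+b") as dev:
        size = dev.seek(0, os.SEEK_END)
        offsets = pick_sector_offsets(size, count, rng, skip_start, skip_end)
        for offset in offsets:
            dev.seek(offset)
            dev.write(bytes(rng.getrandbits(8) for _ in range(SECTOR_SIZE)))
    return offsets
```

Returning the chosen offsets makes it possible to log exactly which sectors were damaged and compare them with what the scrub later reports.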

I've also added a caveat to my notes that if using device-mapper-based block devices, it is a very good idea to back up these areas. It's particularly important with dm-crypt devices, where it is otherwise impossible to recover any data without healthy metadata.
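One way to take such a backup is to copy the head and tail of the device into separate files. The sketch below assumes the regions worth saving are the same 512MB and 128MB used for the skips above, which is a guess at "large enough" rather than the exact metadata extent; backup_edges is a hypothetical helper.

```python
import os

MIB = 1024 * 1024


def _copy_range(src, dst_path, start, length, chunk=MIB):
    """Copy `length` bytes starting at `start` from open file `src` to `dst_path`."""
    src.seek(start)
    remaining = length
    with open(dst_path, "wb") as dst:
        while remaining > 0:
            data = src.read(min(chunk, remaining))
            if not data:
                break  # unexpected end of device
            dst.write(data)
            remaining -= len(data)


def backup_edges(device_path, head_path, tail_path,
                 head_bytes=512 * MIB, tail_bytes=128 * MIB):
    """Save the first `head_bytes` and last `tail_bytes` of a device to files."""
    with open(device_path, "rb") as src:
        size = src.seek(0, os.SEEK_END)
        head_len = min(head_bytes, size)
        tail_len = min(tail_bytes, size)
        _copy_range(src, head_path, 0, head_len)
        _copy_range(src, tail_path, size - tail_len, tail_len)
    return head_len, tail_len
```

Restoring would be the reverse copy to the same offsets, so the device size should be recorded alongside the two files.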

