
Comments (4)

landley commented on July 22, 2024

On 08/11/2016 03:49 AM, Matthias Urhahn wrote:

Toybox 0.7.1, Busybox 1.24.2
On a [email protected].

Why is there a difference in blocksize?

    root@hammerhead:/sdcard # busybox stat -c %B:%b:%o:%s twrp-3.0.0-0-hammerhead.img
    512:28632:4096:14657536
    root@hammerhead:/sdcard # toybox stat -c %B:%b:%o:%s twrp-3.0.0-0-hammerhead.img
    4096:28632:4096:14657536

%B Bytes per block

In toybox, both %o and %B come from the stat() system call.

    else if (type == 'B') out('u', stat->st_blksize);

    } else if (type == 'o') out('u', stat->st_blksize);

I.E. we're telling you what the operating system told us (presumably
getting it from the filesystem driver).

A quick check shows busybox is printing a hardwired value for %B but the
stat value for %o:

    } else if (m == 'B') {
            strcat(pformat, "lu");
            printf(pformat, (unsigned long) 512); //ST_NBLOCKSIZE

    } else if (m == 'o') {
            strcat(pformat, "lu");
            printf(pformat, (unsigned long) statbuf->st_blksize);

I don't understand the point of printing a hardwired value for %B. FAT
block sizes vary from 512 bytes up to 64k. ext2 can be 1k, 2k, or 4k.

A few years ago hard drives went from 512 byte physical block to 4k,
which caused some problems because of longstanding assumptions:

https://lwn.net/Articles/322777/
https://lwn.net/Articles/377895/

And now the sectors are getting bigger:

https://lwn.net/Articles/582862/

Here are some articles about the damage conflicting block size assumptions
can do (data loss when an interrupted write changes stuff you didn't
think was being rewritten) and ways around it:

https://lwn.net/Articles/349970/
https://lwn.net/Articles/665299/
https://lwn.net/Articles/353411/

You shouldn't have to care about 90% of that (the OS handles it all for
you), but from my perspective having two ways to query block size would
be really nice if the OS gave me a way to determine the block size of
the physical media, as distinct from the block size of the filesystem.
Unfortunately, although it seems like %B would be filesystem block
size and %o would be physical media block size, Linux doesn't give me a
way to query physical media block size that I've noticed. (I might be
able to beat it out of the mtd layer for some types of flash?)

busybox says 512Byte while toybox says 4096Byte.

Busybox is returning a single hardwired answer.

That said, it looks like the Ubuntu version is also doing that. If %B
should always say "512" regardless of context, I can make it do that and
change the help text to say something like:

%B prints "512"

28632 Blocks * 512Byte = 14659584 Byte
Which is a lot closer to the actual file size of 14657536 Byte (reported
by both toybox&busybox).

More stat fields: %b is st_blocks and %s is st_size.

The stat(2) man page says that st_blocks is "number of 512B blocks
allocated" so %B is units for %b (and yes, it's a hardwired value). So
the help text should be something like:

%B units for %b (always 512)

Other commands from both binaries also show a block size of 4096 Byte
though.

The "man 1 stat" page says:

   %B     the size in bytes of each block reported by %b

The "man 2 stat" page says:

   blkcnt_t  st_blocks;  /* number of 512B blocks allocated */

So yes, you've found an inconsistency and I should fix it. %B should
output a hardwired "512", busybox is correct here.
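
To make the relationship between the fields concrete, here is a minimal C sketch (not the toybox or busybox code) that reads the same values through stat(2) and treats st_blocks as a count of 512-byte units, as the man page describes. The names allocated_bytes(), print_block_info() and STAT_BLOCK_UNIT are made up for this illustration.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Per stat(2), st_blocks counts 512-byte units, whatever the filesystem
   block size is. STAT_BLOCK_UNIT is a name invented for this sketch. */
#define STAT_BLOCK_UNIT 512ULL

/* Bytes allocated on disk, computed from an already filled-in struct stat. */
unsigned long long allocated_bytes(const struct stat *st)
{
    return (unsigned long long)st->st_blocks * STAT_BLOCK_UNIT;
}

/* Hypothetical helper: print the values behind "%B:%b:%o:%s" for one path,
   with %B hardwired to 512. Returns 0 on success, -1 if stat() fails. */
int print_block_info(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0) {
        perror(path);
        return -1;
    }

    printf("%llu:%llu:%lu:%lld\n",
           STAT_BLOCK_UNIT,                    /* %B: units for %b          */
           (unsigned long long)st.st_blocks,   /* %b: 512-byte blocks       */
           (unsigned long)st.st_blksize,       /* %o: preferred I/O size    */
           (long long)st.st_size);             /* %s: file size in bytes    */
    return 0;
}
```

Calling print_block_info("twrp-3.0.0-0-hammerhead.img") from a main() would print one line in the same colon-separated layout as the stat commands at the top of this thread, and allocated_bytes() gives the on-disk size implied by %b.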

Thanks. Good catch,

Rob


d4rken commented on July 22, 2024

This stackoverflow post indicates that the stat %B parameter for block size should be ignored as it may not necessarily reflect the actual file system block size value but usually some unrelated OS value.

This would explain the values of 4096, which lead to huge discrepancies between allocated and actual file size (it's not a sparse file). What would be interesting, though, is why busybox still gets the correct 512 block size value.

Is it planned to support something like --block-size for commands like stat, ls, and du?


d4rken commented on July 22, 2024

After more research, I think I got it now, see here.
%b comes from st_blocks, which is always in units of 512 bytes.

So the allocated size on the filesystem is always calculated as block count * 512 bytes, and the result comes out as a multiple of the actual filesystem block size, which is what the %o format (I/O block size) reports.
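
As a quick check of that arithmetic with the numbers from this thread, here is a small C sketch. The helper name round_up_to_block() is made up for illustration, and the equality only shows what happened for this particular file: a filesystem may allocate more (preallocation, metadata) or less (sparse files) than the rounded-up size.

```c
#include <stdint.h>

/* Round size up to the next multiple of blksize (blksize must be non-zero).
   Hypothetical helper, not taken from toybox or busybox. */
uint64_t round_up_to_block(uint64_t size, uint64_t blksize)
{
    return (size + blksize - 1) / blksize * blksize;
}
```

With %s = 14657536 and %o = 4096, round_up_to_block(14657536, 4096) gives 14659584, which is 3579 blocks of 4096 bytes and exactly matches %b * 512 = 28632 * 512 = 14659584. Multiplying %b by 4096 instead would give 117276672, which is the discrepancy noticed earlier.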

So why does busybox print the "correct" value of 512 for "Bytes per Block" which should usually just be ignored? Well it's hardcoded.

Toybox returns the same value of st_blksize for both IO Block size and Bytes per block.
Busybox returns a hardcoded 512 for Bytes per block and st_blksize for IO Block size.

@landley
What would you say to also hardcoding 512 for Bytes per block in toybox?
Returning the I/O block size for %B seems wrong.
Telling us the "Bytes per block" for the "Blocks allocated" count seems like the responsibility of stat and the %B format, and would actually fit the parameter's description.
Or Bytes per block could be cut from the code, since it serves no useful purpose... but that would break "spec" and could lead to compatibility issues. So hardcoding 512 seems like a good solution.


d4rken commented on July 22, 2024

Fixed with 4460e9f 👍 Thanks!
