Comments (4)

tavianator commented on August 14, 2024

This bug affects bfs too. I just implemented a binary search to find ARG_MAX empirically which is much better than before: tavianator/bfs@ac1d28e

from argmax.

tavianator commented on August 14, 2024

GNU xargs tries to do the same thing: https://git.savannah.gnu.org/cgit/findutils.git/tree/lib/buildcmd.c?id=372cd34894e247fe5c2991eb75185ea2ec850ee2#n190

But their implementation is buggy because limit can be far outside [largest_successful_arg_count, smallest_failed_arg_count]. In my case I get shift == 1 and it has to reduce limit from ~30k to ~10k which takes forever:

$ ./configure CFLAGS="-m32 -g" TIME_T_32_BIT_OK=yes
$ make
$ ulimit -s 512
$ yes foo | head -n100000 | time xargs echo >/dev/null
xargs echo > /dev/null  0.03s user 0.08s system 102% cpu 0.111 total
$ yes foo | head -n100000 | time ./xargs/xargs echo >/dev/null
./xargs/xargs echo > /dev/null  15.91s user 7.06s system 101% cpu 22.550 total

sharkdp commented on August 14, 2024

Thank you very much! I implemented the proposed fix in 2057ea1. This fixes the unit tests for i686-unknown-linux-gnu.

The updated status is here: #2. The only thing which would be nice to fix for now is the unreasonably small limit on musl targets.

> This bug affects bfs too. I just implemented a binary search to find ARG_MAX empirically which is much better than before: tavianator/bfs@ac1d28e

So even with the conservative guess for the pointer size, do you think there will always be cases where the command line length can overflow? Will such a backup strategy be needed for fd as well?

By binary search, do you mean: exponential backoff? I.e. divide the number of arguments by two until it works? Or do you really go up again after you found a working number of arguments? In that case, would bfs -exec do something like the following?

  • try with 400,000 arguments => fails
  • try with 200,000 arguments => fails
  • try with 100,000 arguments => succeeds => i.e. the first 100,000 search results have been processed; 300,000 to go
  • try with 150,000 arguments => fails
  • try with 125,000 arguments => succeeds => i.e. the first 225,000 search results have now been processed…
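The scheme described in the list above can be sketched roughly as follows. This is only an illustration of the idea, not the code used by bfs or fd: `process_in_batches` and `try_exec` are hypothetical names, and the limit is modelled as a plain argument count, whereas the real limit depends on the total size of the arguments and environment.

```python
def process_in_batches(items, try_exec):
    """Feed items to try_exec in batches, bisecting the batch size on failure.

    try_exec(batch) is a hypothetical callback that returns True if the
    command could be executed with this batch and False if it was too large.
    Returns the list of (batch_size, succeeded) attempts, for inspection.
    """
    lo = 0        # largest batch size known to succeed
    hi = None     # smallest batch size known to fail (None: no failure yet)
    pos = 0       # number of items already processed
    size = len(items)
    attempts = []

    while pos < len(items):
        size = min(size, len(items) - pos)   # never ask for more than is left
        ok = try_exec(items[pos:pos + size])
        attempts.append((size, ok))

        if ok:
            pos += size                      # this batch is done for good
            lo = max(lo, size)
        elif size == 1:
            raise RuntimeError("a single argument is already too large")
        else:
            hi = size if hi is None else min(hi, size)

        if hi is None:
            size = len(items) - pos          # nothing failed yet: take the rest
        else:
            size = max(1, (lo + hi) // 2)    # bisect between the known bounds

    return attempts


if __name__ == "__main__":
    # Assumed limit of 130,000 arguments, chosen only for the demonstration.
    log = process_in_batches(range(400_000), lambda batch: len(batch) <= 130_000)
    for size, ok in log:
        print(size, "succeeds" if ok else "fails")
```

With the assumed limit of 130,000 arguments, the first five attempts reproduce the sequence in the list (400,000, 200,000, 100,000, 150,000, 125,000), and every batch that succeeds is consumed immediately rather than retried.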

tavianator commented on August 14, 2024

> By binary search, do you mean: exponential backoff? I.e. divide the number of arguments by two until it works? Or do you really go up again after you found a working number of arguments?

Yeah, the limit can go back up:

bfs: -D exec: -exec: Got E2BIG, shrinking argument list...
bfs: -D exec: -exec: ARG_MAX between [0, 2086316], trying 1043158
bfs: -D exec: -exec: ARG_MAX between [1043085, 2086316], trying 1564700
bfs: -D exec: -exec: ARG_MAX between [1564682, 2086316], trying 1825499
bfs: -D exec: -exec: ARG_MAX between [1825451, 2086316], trying 1955883
bfs: -D exec: -exec: ARG_MAX between [1955849, 2086316], trying 2021082
bfs: -D exec: -exec: Got E2BIG, shrinking argument list...
bfs: -D exec: -exec: ARG_MAX between [1955849, 2021017], trying 1988433
...
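As a quick arithmetic check on the log above, each "trying" value is the floor of the midpoint of the interval printed on the same line. The snippet below only re-derives the numbers from the log; the helper name `midpoint` is made up for illustration and is not taken from bfs.

```python
# (lower bound, upper bound) -> value tried, copied from the log output above.
logged = [
    ((0, 2086316), 1043158),
    ((1043085, 2086316), 1564700),
    ((1564682, 2086316), 1825499),
    ((1825451, 2086316), 1955883),
    ((1955849, 2086316), 2021082),
    ((1955849, 2021017), 1988433),
]


def midpoint(lo, hi):
    """Floor of the midpoint, written so that lo + hi is never computed."""
    return lo + (hi - lo) // 2


for (lo, hi), tried in logged:
    assert midpoint(lo, hi) == tried
    print(f"[{lo}, {hi}] -> {midpoint(lo, hi)}")
```

Note that the bounds do not snap to the tried values: after trying 1043158 the lower bound becomes 1043085, and after the E2BIG at 2021082 the upper bound becomes 2021017. One plausible reading, which is an assumption here, is that the bounds record the size of the argument list that was actually built, which can fall slightly short of the target because arguments are added whole.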
