Comments (5)
I prefer dealing with this sort of thing through the mailing list, I
only really use github as repository hosting.
On 02/25/2016 01:31 AM, drinkcat wrote:
> We use toybox-0.7.0 as part of the Chromium OS project, and sometimes
> hit an issue when building it on our automated builders (see this issue
> https://bugs.chromium.org/p/chromium/issues/detail?id=584542):
> toybox-0.7.0: armv7a-cros-linux-gnueabi-gcc -O2 -O2 -pipe -march=armv7-a -mtune=cortex-a15 -mfpu=neon -mfloat-abi=hard -g -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -clang-syntax -funsigned-char -Wno-string-plus-int -I . -Os -ffunction-sections -fdata-sections -fno-asynchronous-unwind-tables -fno-strict-aliasing -c toys/posix/tail.c -o generated/obj/tail.o
> toybox-0.7.0: scripts/make.sh: line 270: wait: pid 8477 is not a child of this shell
Hmmm... PID wrap, maybe?
> toybox-0.7.0: Makefile:19: recipe for target 'toybox' failed
> toybox-0.7.0: make: *** [toybox] Error 1
> toybox-0.7.0: * ERROR: sys-apps/toybox-0.7.0::portage-stable failed (compile phase):
> toybox-0.7.0: * emake failed
> For some reason we cannot reproduce locally (it only happens on these
> builders that are compiling many other packages at the same time).
Neither can I.
Maybe I could do something with a restricted process ID range forcing
quick wrapping, but this seems more a bash problem than my script, so a
workaround's more likely than a proper fix. (I wonder if I can
distinguish this error from a compiler error? Hmmm... 127 is nonexistent
process or job, except how to distinguish "gcc not in $PATH" from "PID
we waited on went bye-bye and took its exit status with it"?)
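
To make the ambiguity above concrete, here is a minimal sketch showing that both situations surface as the same status in the shell. The command name `no_such_command_xyz` is a made-up placeholder assumed to be absent from $PATH, and PID 1 merely stands in for "a PID that is not a child of this shell"; neither comes from scripts/make.sh.

```shell
#!/bin/bash
# Case 1: the command cannot be found, so the background job exits 127.
# (no_such_command_xyz is a placeholder assumed not to exist in $PATH.)
no_such_command_xyz 2>/dev/null &
wait $!
missing_cmd_status=$?

# Case 2: wait on a PID that is not a child of this shell.
# PID 1 is used only as a stand-in for such a PID.
wait 1 2>/dev/null
not_child_status=$?

# Both variables hold the same value, so the exit status alone
# cannot tell the two failures apart.
echo "command not found: $missing_cmd_status"
echo "not a child:       $not_child_status"
```

Telling the two apart would need extra information, such as checking that the compiler is in $PATH before launching any jobs.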
> Looking at the code (|script/make.sh|), we are wondering about your use
> of |$(jobs -rp)|. Wouldn't it be more correct to add jobs to PENDING
> using |$!| right after you launch the job (|do_loudly|)?
If you think that'll help, I'm happy to give it a try, sure.
Thanks,
Rob
from toybox.
On Fri, Feb 26, 2016 at 1:53 PM, Rob Landley [email protected]
wrote:
> I prefer dealing with this sort of thing through the mailing list, I
> only really use github as repository hosting.
Oh, sorry, adding the list to this reply.
> On 02/25/2016 01:31 AM, drinkcat wrote:
>> We use toybox-0.7.0 as part of the Chromium OS project, and sometimes
>> hit an issue when building it on our automated builders (see this issue
>> https://bugs.chromium.org/p/chromium/issues/detail?id=584542):
>> toybox-0.7.0: armv7a-cros-linux-gnueabi-gcc -O2 -O2 -pipe -march=armv7-a -mtune=cortex-a15 -mfpu=neon -mfloat-abi=hard -g -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -clang-syntax -funsigned-char -Wno-string-plus-int -I . -Os -ffunction-sections -fdata-sections -fno-asynchronous-unwind-tables -fno-strict-aliasing -c toys/posix/tail.c -o generated/obj/tail.o
>> toybox-0.7.0: scripts/make.sh: line 270: wait: pid 8477 is not a child of this shell
> Hmmm... PID wrap, maybe?
That's what we were wondering about... The builder is building a lot of
other packages at the same time, including Chromium, so it's not unlikely
that the PID space is saturated... Also, the builder retries after the
first failure, and the second try always works (probably when the builder
is less busy...)
>> Looking at the code (|script/make.sh|), we are wondering about your use
>> of |$(jobs -rp)|. Wouldn't it be more correct to add jobs to PENDING
>> using |$!| right after you launch the job (|do_loudly|)?
> If you think that'll help, I'm happy to give it a try, sure.
I have a commit ready here, that appears to fix the problem:
drinkcat@4c70562
It's a little less aggressive at parallelizing, as it always waits for the
first PID if PENDING is full (instead of refreshing the PENDING list every
time)...
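
The approach described above (record each PID from $! at launch, and block on the oldest PID once the list is full) could look roughly like the following. This is a hypothetical sketch, not the code from the commit or from scripts/make.sh: MAX_JOBS, FAILED and launch are illustrative names, and only PENDING mirrors the variable named in this thread.

```shell
#!/bin/bash
# Hypothetical sketch, not the code in scripts/make.sh.
# MAX_JOBS, FAILED and launch are illustrative names.
MAX_JOBS=4
PENDING=""   # space-separated PIDs, oldest first
FAILED=0

launch() {
  "$@" &
  PENDING="$PENDING $!"     # record the PID right after launching the job
  set -- $PENDING           # split the list into positional parameters
  if [ "$#" -ge "$MAX_JOBS" ]; then
    wait "$1" || FAILED=1   # list is full: block on the oldest PID only
    shift
    PENDING="$*"
  fi
}

# Stand-in workload: eight trivial jobs instead of compiler invocations.
for i in 1 2 3 4 5 6 7 8; do
  launch true
done

# Drain whatever is still pending.
for pid in $PENDING; do
  wait "$pid" || FAILED=1
done
PENDING=""
echo "failed=$FAILED"
```

Because it always waits on the oldest PID, a slow first job can leave slots idle while later jobs have already finished, which is the reduced parallelism mentioned above.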
I guess that you prefer I send the patch to the list? Or is a github PR
fine too?
Thanks!
Best,
Nicolas
On 02/26/2016 12:31 AM, Nicolas Boichat wrote:
> On Fri, Feb 26, 2016 at 1:53 PM, Rob Landley [email protected] wrote:
>> On 02/25/2016 01:31 AM, drinkcat wrote:
>>> We use toybox-0.7.0 as part of the Chromium OS project,
P.S. Yay!
>>> and sometimes hit an issue when building it on our automated builders
>>> (see this issue <https://bugs.chromium.org/p/chromium/issues/detail?id=584542>):
>>> toybox-0.7.0: armv7a-cros-linux-gnueabi-gcc -O2 -O2 -pipe -march=armv7-a -mtune=cortex-a15 -mfpu=neon -mfloat-abi=hard -g -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -clang-syntax -funsigned-char -Wno-string-plus-int -I . -Os -ffunction-sections -fdata-sections -fno-asynchronous-unwind-tables -fno-strict-aliasing -c toys/posix/tail.c -o generated/obj/tail.o
>>> toybox-0.7.0: scripts/make.sh: line 270: wait: pid 8477 is not a child of this shell
>> Hmmm... PID wrap, maybe?
> That's what we were wondering about... The builder is building a lot of
> other packages at the same time, including Chromium, so it's not
> unlikely that the PID space is saturated... Also, the builder retries
> after the first failure, and the second try always works (probably when
> the builder is less busy...)
Possibly the OS is killing zombies if it wants to reuse that PID before
the zombie is reaped? (Which would be a horrible heuristic because
process exit could happen after a long runtime but right before a new fork.)
Or maybe it's doing so if there are no more free PIDs, instead of
fork failing?
In either case, moving to $! wouldn't fix it. But that also wouldn't
explain why only bash was seeing the problem...
It's an interesting bug and I'd be interested in tracking it down if I
was willing to get sucked into debugging GPLv3 bash. With GPLv2 bash I
spent days tracking down weirdness, à la:
The initial problem:
http://landley.net/notes-2011.html#24-08-2011
Mentioned in passing:
http://landley.net/notes-2011.html#26-08-2011
http://landley.net/notes-2011.html#28-08-2011
Deep dig:
http://landley.net/notes-2011.html#02-09-2011
http://landley.net/notes-2011.html#03-09-2011
http://landley.net/notes-2011.html#04-09-2011
And finally finding it:
http://landley.net/notes-2011.html#05-09-2011
Yes, that's me happily digging through libc, kernel, and back into a
userspace program to find a problem. But if a GPLv3 program is involved,
"it's broken, let's replace it".
>>> Looking at the code (|script/make.sh|), we are wondering about your use
>>> of |$(jobs -rp)|. Wouldn't it be more correct to add jobs to PENDING
>>> using |$!| right after you launch the job (|do_loudly|)?
>> If you think that'll help, I'm happy to give it a try, sure.
> I have a commit ready here, that appears to fix the problem:
> drinkcat@4c70562
I pushed a change last night based on your $! suggestion, did that fix
it? (Your patch is using ${%%} to filter, which is interesting. I
couldn't make ${//} work right but maybe that could replace my sed
invocation? Trying to get the number of execs in the dispatch/monitoring
cycle down as small as possible. Then again once it can build under a
toybox shell then it's just a fork() and not an exec, which is cheaper.
Eh, worry about it later...)
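
As an illustration of filtering a PID out of the list without an exec, here is a minimal sketch that uses only the ${%%} and ${#} expansions. It is an assumption about what such a filter could look like, not the code from either patch; remove_pid and its two arguments are hypothetical names.

```shell
#!/bin/bash
# Hypothetical helper, not taken from either patch.
# remove_pid LIST PID prints LIST with the first occurrence of PID removed.
remove_pid() {
  list=" $1 "   # pad with spaces so whole-word matching is simple
  pid="$2"
  case "$list" in
    *" $pid "*)
      # Keep everything before the match plus everything after it.
      list="${list%% $pid *} ${list#* $pid }"
      ;;
  esac
  set -- $list  # word splitting drops the padding
  echo "$*"
}

remove_pid "101 202 303" 202
remove_pid "101 202 303" 999
```

Everything here is a shell builtin or an expansion, so no sed process is started per call; the command substitution needed to capture the output still costs a fork.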
> It's a little less aggressive at parallelizing, as it always waits for
> the first PID if PENDING is full (instead of refreshing the PENDING list
> every time)...
So's the one I did last night. I should poke around on my 8-way machine
and see how it's doing keeping the cpus busy...
> I guess that you prefer I send the patch to the list? Or is a github PR
> fine too?
What would be really nice is if github gave me a button to get the
"git format-patch" version of the patch at the above URL. But of course
they don't do that, why would they do that?
When github emails me a pull request I can wget and "git am" from
there, so it's usable. (It's then up to the submitter to close said
request, but having a list of old irrelevant pull requests I've already
dealt with one way or another is github's problem, as far as I'm concerned.)
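
For the wget and "git am" workflow, one option that needs no button: GitHub serves mbox-style "git format-patch" output when ".patch" is appended to a commit or pull request URL. A small sketch follows, where patch_url is a hypothetical helper and OWNER, REPO and SHA are placeholders rather than values from this thread.

```shell
#!/bin/bash
# OWNER, REPO and SHA are placeholders, not values taken from this thread.
# patch_url is a hypothetical helper that only builds the URL.
patch_url() {
  echo "https://github.com/$1/$2/commit/$3.patch"
}

# Dry run: print the URL instead of fetching it.
patch_url OWNER REPO SHA

# Inside a checkout, the fetch-and-apply step would then be something like:
#   wget -O - "$(patch_url OWNER REPO SHA)" | git am
```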
Posting them to the list gives other people the chance to chime in, but
I think we covered that here. :)
Thanks,
Rob
Followed up on list.
Also, e17fbf1 seems to fix it.