I've discovered an issue while trying to perform builds via buildkit on a linux/amd64 host where the target is linux/arm64 (although this also happens for other emulated architectures). This happens when relying on the buildkit binfmt emulation, without host-level emulation support installed into the kernel. binfmt output for the host, showing no emulation support for arm64 and others:
{
"supported": [
"linux/amd64",
"linux/386"
],
"emulators": null
}
When run in this configuration with buildkit v0.10.4, which includes binfmt v6.2.0-24, buildkit itself is configured to run commands using the buildkit variant of the qemu binaries. Previously, I believe the way to support this was to install the host-level emulation support; however, with the capabilities in buildkit itself now, my understanding is that this should no longer be necessary?
RUN commands in the Dockerfile work via emulation in many cases with the buildkit-provided qemu, which modifies the RUN command to inject the buildkit qemu binary into the call. For some (many?) cases where the program being run itself tries to run something else in the PATH, this fails.
Minimal Dockerfile examples that fail when run through emulation:
FROM alpine:3.16
RUN /usr/bin/env sh -c 'echo Hello World'

FROM debian:bullseye
RUN /usr/bin/env sh -c 'echo Hello World'
Note: this doesn't just affect env, but also other commands that execute files within the PATH, like xargs.
From debugging the process, I can see that within the buildkit qemu emulation, the handling of the execve syscall modifies the arguments to inject .buildkit_qemu_emulator into the call, ensuring that the invocation is run through the emulator. This change to the call modifies the behaviour such that it breaks execution of binaries within the PATH, e.g. via an execvp call.
Typically the execvp call will iterate over each element of the PATH; each time the target file doesn't exist within that path element, the underlying execve returns ENOENT and the loop continues until either 1) the file is found (and executed), or 2) the entire PATH is searched and the file isn't found. When executed via .buildkit_qemu_emulator, however, the qemu binary itself is always found, but internally the command being called (e.g. absolute path /usr/local/sbin/sh) isn't found. .buildkit_qemu_emulator fails with Error while loading /usr/local/sbin/sh: No such file or directory, but what comes back through the execve call is the child process (.buildkit_qemu_emulator) failing with an error, rather than ENOENT. This bubbles back to the execvp call, matches its unhandled error case, and aborts the entire process.
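To make the difference concrete, here is a small Python model of the search loop described above. It is only a sketch of my understanding, not buildkit or libc code: the function names, the `ProcessExited` exception, and the exit status 127 for "not found" are all illustrative assumptions, and the real execvp handles more error cases than ENOENT.

```python
import errno

# Default PATH of the alpine example; /usr/local/sbin is searched first,
# which matches the path in the error message from the build log.
PATH = "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
FILES = {"/bin/sh"}  # assumed image contents for this model


class ProcessExited(Exception):
    """Models an exec that succeeded, after which the program exited."""

    def __init__(self, status):
        super().__init__(f"process exited with status {status}")
        self.status = status


def native_execve(path, files):
    """Plain execve: a missing file is reported to the caller as ENOENT."""
    if path not in files:
        raise OSError(errno.ENOENT, "No such file or directory", path)
    raise ProcessExited(0)


def emulated_execve(path, files):
    """Rewritten call: the emulator is exec'd and always exists.

    A missing target is only noticed inside the emulator, so it surfaces
    as the emulator exiting with status 1 instead of as ENOENT.
    """
    if path not in files:
        raise ProcessExited(1)  # "Error while loading <path>"
    raise ProcessExited(0)


def execvp(name, search_path, files, execve):
    """Simplified execvp: try each PATH element, skipping ENOENT failures."""
    for directory in search_path.split(":"):
        try:
            execve(f"{directory}/{name}", files)
        except OSError as exc:
            if exc.errno != errno.ENOENT:
                raise
    raise OSError(errno.ENOENT, "No such file or directory", name)


def run(name, search_path, files, execve):
    """Return the exit status an observer of the process would see."""
    try:
        execvp(name, search_path, files, execve)
    except ProcessExited as done:
        return done.status
    except OSError:
        return 127  # assumed "command not found" status


print(run("sh", PATH, FILES, native_execve))    # 0: search reaches /bin/sh
print(run("sh", PATH, FILES, emulated_execve))  # 1: aborts at /usr/local/sbin/sh
```

In the model, the native case keeps searching because each miss is an ENOENT, while the emulated case stops at the first PATH element because the miss arrives as a process exit that the loop never gets a chance to handle.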
Workaround: by modifying the PATH so that the first element contains the file to be executed, the process succeeds.
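As an illustration of the workaround, the following Python sketch computes a reordered PATH in which the directory holding the command comes first. The helper name `path_with_command_first` and its `is_executable` hook are hypothetical and exist only for this example; the sample assumes, as in the alpine case, that sh exists only at /bin/sh.

```python
import os


def path_with_command_first(command, search_path, is_executable=None):
    """Return search_path with the directory that holds command moved first.

    is_executable is a hypothetical hook so the lookup can be tested without
    touching the filesystem; by default the local filesystem is checked.
    """
    if is_executable is None:
        def is_executable(candidate):
            return os.path.isfile(candidate) and os.access(candidate, os.X_OK)

    directories = search_path.split(":")
    for directory in directories:
        if is_executable(f"{directory}/{command}"):
            rest = [d for d in directories if d != directory]
            return ":".join([directory] + rest)
    return search_path  # command not found anywhere: leave PATH unchanged


DEFAULT_PATH = "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

# Assumption for the example: sh is only present at /bin/sh.
print(path_with_command_first("sh", DEFAULT_PATH, lambda p: p == "/bin/sh"))
# -> /bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin
```

The resulting value would then be set as the PATH before the failing RUN instruction, for example through an ENV instruction in the Dockerfile. This only helps for the one command that was moved to the front; other commands found later in the PATH would presumably still fail in the same way.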
The relevant portion of the build for the alpine example, run with strace enabled for qemu:
#5 [2/2] RUN /usr/bin/env sh -c 'echo Hello World'
#5 0.302 1 set_tid_address(365117932080,1,16,16,0,365117246396) = 1
#5 0.304 1 brk(NULL) = 0x00000055000eb000
#5 0.304 1 brk(0x00000055000ed000) = 0x00000055000ed000
#5 0.304 1 mmap(0x00000055000eb000,4096,PROT_NONE,MAP_PRIVATE|MAP_ANONYMOUS|MAP_FIXED,-1,0) = 0x00000055000eb000
#5 0.305 1 mprotect(0x00000055000e6000,16384,PROT_READ) = 0
#5 0.307 1 getuid() = 0
#5 0.308 1 mmap(NULL,4096,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANONYMOUS,-1,0) = 0x0000005502b9a000
#5 0.308 1 mmap(NULL,4096,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANONYMOUS,-1,0) = 0x0000005502b9b000
#5 0.309 1 getpid() = 1
#5 0.309 1 mmap(NULL,8192,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANONYMOUS,-1,0) = 0x0000005502b9c000
#5 0.309 1 rt_sigprocmask(SIG_UNBLOCK,0x0000005502aeb820,NULL) = 0
#5 0.309 1 rt_sigaction(SIGCHLD,0x0000005502aeb800,NULL) = 0
#5 0.310 1 getppid() = 0
#5 0.310 1 uname(0x5502aeb9f0) = 0
#5 0.311 1 getcwd(0x5502aeab30,4096) = 2
#5 0.312 1 rt_sigaction(SIGINT,NULL,0x0000005502aeba30) = 0
#5 0.312 1 rt_sigaction(SIGINT,0x0000005502aeba10,NULL) = 0
#5 0.312 1 rt_sigaction(SIGQUIT,NULL,0x0000005502aeba30) = 0
#5 0.312 1 rt_sigaction(SIGQUIT,0x0000005502aeba10,NULL) = 0
#5 0.312 1 rt_sigaction(SIGTERM,NULL,0x0000005502aeba30) = 0
#5 0.327 Error while loading /usr/local/sbin/sh: No such file or directory
#5 ERROR: process "/dev/.buildkit_qemu_emulator -strace /bin/sh -c /usr/bin/env sh -c 'echo Hello World'" did not complete successfully: exit code: 1
The issue also impacts script shebang execution (which is how I initially encountered this), but I'm unclear on how much of that process differs inside the emulation from the example cases here.
#!/usr/bin/env sh
echo "Hello World!"
^ Also fails when executed via a RUN instruction.