Laszlo Ersek
2023-Mar-22 14:45 UTC
[Libguestfs] [libnbd PATCH v4 0/2] lib/utils: introduce async-signal-safe execvpe()
On 3/21/23 18:28, Eric Blake wrote:> it is indeed a bug in busybox now that POSIX is moving towards > standardizing realpath, so I've filed it: > https://bugs.busybox.net/show_bug.cgi?id=15466I've found another busybox bug. The "/bin/sh" utility is provided by busybox as well (via the usual symlinking). Per POSIX, if execvp(file, { argv[0], argv[1], ..., NULL }) were to fail with -1/ENOEXEC, then execvp() must retry "as if" with execv(<shell path>, { argv[0], file, argv[1], ..., NULL }) In other words, if direct execution of "file" failed because "file" "has the appropriate access permission but has an unrecognized format", then execvp() is required to try executing "file" as a shell script. For that, <shell path> is left unspecified by POSIX, but the arguments of the shell are specified: - Argv[0] remains the same. That is, what we wanted "file" to know itself as, is what we now want *the shell executable* to know itself as. - argv[1] becomes "file" -- this is the script that the shell is supposed to run. - argv[2] and onwards become positional parameters $1, $2, ... for the shell script. And the argv[0] specification is what's violated by busybox, because if argv[0] is anything other than "sh", then the busybox binary doesn't recognize itself as the shell! The simplest way to demonstrate the bug is this: bash-5.2$ ( exec -a foobar /bin/sh <<< "echo hello" ) foobar: applet not found And then, another way to demonstrate the same busybox issue... lets us, in fact, discover a musl bug in turn!!! Consider the following C program (called "test-execvp.c"; the binary is called "test-execvp"): ------------- #include <stdio.h> #include <unistd.h> int main(void) { char *args[] = { "foobar", NULL }; execvp("hello.sh", args); perror("execvp"); return 1; } ------------- The file "hello.sh" resides in the current directory (same directory where "test-execvp" resides). Furthermore it has execute permission, and the following contents: ------- echo hello ------- Now consider the following command (from bash): $ PATH=.:$PATH test-execvp What is supposed to happen is this: (1) bash shall find test-execvp in the current directory per PATH, (2) execvp() shall find "hello.sh" in the current directory per PATH, (3) execvp() shall hit an internal failure -1/ENOEXEC, (4) execvp() shall then invoke the shell (under an unspecified pathname), (5) the shell shall get "foobar" for its argv[0], and "hello.sh" for its argv[1] (6) we shall see "hello" on the standard output. That's exactly what happens on Linux/glibc. (Note: this result has absolutely nothing to do with my execvpe() implementation, or libnbd in the first place.) Now, according to my above description of the busybox bug, we're tempted to believe that step (6) fails on Alpine Linux (using musl + busybox). We expect the busybox binary to be launched, via the /bin/sh symlink, in step (4), and we expect it to fail after step (5), due to it not recognizing "foobar" as an "applet name". It turns out however that step (4) does not happen. musl does not handle ENOEXEC:> bash-5.2$ PATH=.:$PATH test-execvp > execvp: Exec format errorexecvp() is not permitted by POSIX to fail with ENOEXEC. (Not considering the virtually impossible situation when executing the system shell binary *itself* fails with ENOEXEC.) I've checked the musl source too, at commit 7d756e1c04de ("dns: prefer monotonic clock for timeouts", 2023-02-12). The execvp() implementation: https://git.musl-libc.org/cgit/musl/tree/src/process/execvp.c does not handle ENOEXEC; what's more, the entire tree only sets errno=ENOEXEC in "ldso/dynlink.c". Laszlo
Laszlo Ersek
2023-Mar-22 15:01 UTC
[Libguestfs] [libnbd PATCH v4 0/2] lib/utils: introduce async-signal-safe execvpe()
On 3/22/23 15:45, Laszlo Ersek wrote:> On 3/21/23 18:28, Eric Blake wrote: > >> it is indeed a bug in busybox now that POSIX is moving towards >> standardizing realpath, so I've filed it: >> https://bugs.busybox.net/show_bug.cgi?id=15466 > > I've found another busybox bug. > > The "/bin/sh" utility is provided by busybox as well (via the usual symlinking). > > Per POSIX, if > > execvp(file, { argv[0], argv[1], ..., NULL }) > > were to fail with -1/ENOEXEC, then execvp() must retry "as if" with > > execv(<shell path>, { argv[0], file, argv[1], ..., NULL }) > > In other words, if direct execution of "file" failed because "file" "has the appropriate access permission but has an unrecognized format", then execvp() is required to try executing "file" as a shell script. For that, <shell path> is left unspecified by POSIX, but the arguments of the shell are specified: > > - Argv[0] remains the same. That is, what we wanted "file" to know itself as, is what we now want *the shell executable* to know itself as. > > - argv[1] becomes "file" -- this is the script that the shell is supposed to run. > > - argv[2] and onwards become positional parameters $1, $2, ... for the shell script. > > And the argv[0] specification is what's violated by busybox, because if argv[0] is anything other than "sh", then the busybox binary doesn't recognize itself as the shell! > > The simplest way to demonstrate the bug is this: > > bash-5.2$ ( exec -a foobar /bin/sh <<< "echo hello" ) > foobar: applet not found > > > And then, another way to demonstrate the same busybox issue... lets us, in fact, discover a musl bug in turn!!! > > Consider the following C program (called "test-execvp.c"; the binary is called "test-execvp"): > > ------------- > #include <stdio.h> > #include <unistd.h> > > int main(void) > { > char *args[] = { "foobar", NULL }; > > execvp("hello.sh", args); > perror("execvp"); > return 1; > } > ------------- > > The file "hello.sh" resides in the current directory (same directory where "test-execvp" resides). Furthermore it has execute permission, and the following contents: > > ------- > echo hello > ------- > > Now consider the following command (from bash): > > $ PATH=.:$PATH test-execvp > > What is supposed to happen is this: > > (1) bash shall find test-execvp in the current directory per PATH, > (2) execvp() shall find "hello.sh" in the current directory per PATH, > (3) execvp() shall hit an internal failure -1/ENOEXEC, > (4) execvp() shall then invoke the shell (under an unspecified pathname), > (5) the shell shall get "foobar" for its argv[0], and "hello.sh" for its argv[1] > (6) we shall see "hello" on the standard output. > > That's exactly what happens on Linux/glibc. (Note: this result has absolutely nothing to do with my execvpe() implementation, or libnbd in the first place.) > > Now, according to my above description of the busybox bug, we're tempted to believe that step (6) fails on Alpine Linux (using musl + busybox). We expect the busybox binary to be launched, via the /bin/sh symlink, in step (4), and we expect it to fail after step (5), due to it not recognizing "foobar" as an "applet name". > > It turns out however that step (4) does not happen. musl does not handle ENOEXEC: > >> bash-5.2$ PATH=.:$PATH test-execvp >> execvp: Exec format error > > execvp() is not permitted by POSIX to fail with ENOEXEC. (Not considering the virtually impossible situation when executing the system shell binary *itself* fails with ENOEXEC.) > > I've checked the musl source too, at commit 7d756e1c04de ("dns: prefer monotonic clock for timeouts", 2023-02-12). The execvp() implementation: > > https://git.musl-libc.org/cgit/musl/tree/src/process/execvp.c > > does not handle ENOEXEC; what's more, the entire tree only sets errno=ENOEXEC in "ldso/dynlink.c".I considered reporting a musl bug, but: - per <https://wiki.musl-libc.org/reporting-bugs.html>, musl issues should be reported on the mailing list, - on the mailing list, I've found two reports of this symptom, one from 2018, another from 2020: https://www.openwall.com/lists/musl/2018/03/09/2 https://www.openwall.com/lists/musl/2020/02/12/9 So it turns out that my execvp[e]() implementation is not only async-signal-safe, but mitigates a known bug in musl, too. (BTW what Rich Felker seems to be missing, under the second link above, is that execvp() is *NOT* required to be async-signal-safe (approx. "safe to use from a vforked child"), so malloc() is fair game within execvp(), even in case malloc() is also async-signal-UNSAFE in musl.) Laszlo
Laszlo Ersek
2023-Mar-22 15:53 UTC
[Libguestfs] [libnbd PATCH v4 0/2] lib/utils: introduce async-signal-safe execvpe()
On 3/22/23 15:45, Laszlo Ersek wrote:> On 3/21/23 18:28, Eric Blake wrote: > >> it is indeed a bug in busybox now that POSIX is moving towards >> standardizing realpath, so I've filed it: >> https://bugs.busybox.net/show_bug.cgi?id=15466 > > I've found another busybox bug. > > The "/bin/sh" utility is provided by busybox as well (via the usual symlinking). > > Per POSIX, if > > execvp(file, { argv[0], argv[1], ..., NULL }) > > were to fail with -1/ENOEXEC, then execvp() must retry "as if" with > > execv(<shell path>, { argv[0], file, argv[1], ..., NULL }) > > In other words, if direct execution of "file" failed because "file" "has the appropriate access permission but has an unrecognized format", then execvp() is required to try executing "file" as a shell script. For that, <shell path> is left unspecified by POSIX, but the arguments of the shell are specified: > > - Argv[0] remains the same. That is, what we wanted "file" to know itself as, is what we now want *the shell executable* to know itself as. > > - argv[1] becomes "file" -- this is the script that the shell is supposed to run. > > - argv[2] and onwards become positional parameters $1, $2, ... for the shell script. > > And the argv[0] specification is what's violated by busybox, because if argv[0] is anything other than "sh", then the busybox binary doesn't recognize itself as the shell! > > The simplest way to demonstrate the bug is this: > > bash-5.2$ ( exec -a foobar /bin/sh <<< "echo hello" ) > foobar: applet not found > > > And then, another way to demonstrate the same busybox issue... lets us, in fact, discover a musl bug in turn!!! > > Consider the following C program (called "test-execvp.c"; the binary is called "test-execvp"): > > ------------- > #include <stdio.h> > #include <unistd.h> > > int main(void) > { > char *args[] = { "foobar", NULL }; > > execvp("hello.sh", args); > perror("execvp"); > return 1; > } > ------------- > > The file "hello.sh" resides in the current directory (same directory where "test-execvp" resides). Furthermore it has execute permission, and the following contents: > > ------- > echo hello > ------- > > Now consider the following command (from bash): > > $ PATH=.:$PATH test-execvp > > What is supposed to happen is this: > > (1) bash shall find test-execvp in the current directory per PATH, > (2) execvp() shall find "hello.sh" in the current directory per PATH, > (3) execvp() shall hit an internal failure -1/ENOEXEC, > (4) execvp() shall then invoke the shell (under an unspecified pathname), > (5) the shell shall get "foobar" for its argv[0], and "hello.sh" for its argv[1] > (6) we shall see "hello" on the standard output. > > That's exactly what happens on Linux/glibc. (Note: this result has absolutely nothing to do with my execvpe() implementation, or libnbd in the first place.) > > Now, according to my above description of the busybox bug, we're tempted to believe that step (6) fails on Alpine Linux (using musl + busybox). We expect the busybox binary to be launched, via the /bin/sh symlink, in step (4), and we expect it to fail after step (5), due to it not recognizing "foobar" as an "applet name". > > It turns out however that step (4) does not happen. musl does not handle ENOEXEC:It's getting crazier by the hour. I thought to create a reproducer for busybox, in spite of musl breaking down at an earlier point (see above). For that, I *statically linked* "test-execvp" on RHEL-9.1 (using glibc), and then executed the binary in the Alpine Linux container. This should eliminate musl from the picture, and exercise the ENOEXEC fallback (from glibc), invoking /bin/sh (aka busybox) under argv[0] "foobar", and trigger the "unknown applet" bug in busybox. However, this does not happen. Instead, I get "hello" printed. How is that possible? The solution is that glibc *too* has a bug, and that bug hides the busybox bug. Namely, in glibc, going back to historical commit commit 6a032d81581978187f562e5533a32e0a6a3d352b (tag: cvs/libc-960210) Author: Roland McGrath <roland at gnu.org> Date: Sat Feb 10 10:00:27 1996 +0000 Sat Feb 10 04:18:48 1996 Roland McGrath <roland at churchy.gnu.ai.mit.edu> * posix/execvp.c: If execv fails with ENOEXEC, run the shell on the file. Fri Feb 9 11:46:45 1996 Roland McGrath <roland at churchy.gnu.ai.mit.edu> * time/Makefile (CFLAGS-zdump.c, CFLAGS-zic.c, CFLAGS-ialloc.c, CFLAGS-scheck.c): Use -DNOID instead of -Wno-unused. * hurd/Makefile (user-interfaces): Added hurd/tioctl. (note the date: 1996!), the POSIX-mandated fallback execv(<shell path>, { argv[0], file, argv[1], ..., NULL }) is not being done. Instead, the following is done: execv(<shell path>, { <shell path>, file, argv[1], ..., NULL }) In other words, the original argv[0] is not preserved, but is replaced by <shell path>. (Look for _PATH_BSHELL in said historical glibc commit, and also in today's glibc file "posix/execvpe.c".) This can be demonstrated with: $ PATH=.:$PATH strace -etrace=execve test-execvp execve("./test-execvp", ["test-execvp"], 0x7ffc0d1e5248 /* 85 vars */) = 0 execve("./hello.sh", ["foobar"], 0x7ffc528e14a8 /* 85 vars */) = -1 ENOEXEC (Exec format error) execve("/bin/sh", ["/bin/sh", "./hello.sh"], 0x7ffc528e14a8 /* 85 vars */) = 0 hello +++ exited with 0 +++ The third execve() call should be: execve("/bin/sh", ["foobar", "./hello.sh"], 0x7ffc528e14a8 /* 85 vars */) = 0 ^^^^^^^^ Laszlo
Eric Blake
2023-Mar-22 16:10 UTC
[Libguestfs] [libnbd PATCH v4 0/2] lib/utils: introduce async-signal-safe execvpe()
On Wed, Mar 22, 2023 at 03:45:17PM +0100, Laszlo Ersek wrote:> On 3/21/23 18:28, Eric Blake wrote: > > > it is indeed a bug in busybox now that POSIX is moving towards > > standardizing realpath, so I've filed it: > > https://bugs.busybox.net/show_bug.cgi?id=15466 > > I've found another busybox bug. > > The "/bin/sh" utility is provided by busybox as well (via the usual symlinking). > > Per POSIX, if > > execvp(file, { argv[0], argv[1], ..., NULL }) > > were to fail with -1/ENOEXEC, then execvp() must retry "as if" with > > execv(<shell path>, { argv[0], file, argv[1], ..., NULL }) > > In other words, if direct execution of "file" failed because "file" "has the appropriate access permission but has an unrecognized format", then execvp() is required to try executing "file" as a shell script. For that, <shell path> is left unspecified by POSIX, but the arguments of the shell are specified: > > - Argv[0] remains the same. That is, what we wanted "file" to know itself as, is what we now want *the shell executable* to know itself as. > > - argv[1] becomes "file" -- this is the script that the shell is supposed to run. > > - argv[2] and onwards become positional parameters $1, $2, ... for the shell script. > > And the argv[0] specification is what's violated by busybox, because if argv[0] is anything other than "sh", then the busybox binary doesn't recognize itself as the shell!The as-if rule might allow us to invoke something like execv(<shell path>, {"sh", "-c", munge(file), argv[0], argv[1], ..., NULL}, where munge(file) produces ". quoted_file" as a way to source the contents of file in the current shell environment, without ';' or other metacharacters in file causing us to go off the rail. But getting munge(file) to work correctly without post-fork() malloc() is going to be just as difficult (our _init function pre-fork would have to pre-munge every candidate name...) The busybox list is annoying - it won't let me post without first being a subscriber (I attempted to post a quick patch to implement 'readlink -- foo'; implementing 'realpath -- foo' was not as quick). But if they let me on the list, I'll certainly bring it to their attention that their 'sh' behavior is indeed awkward.> > The simplest way to demonstrate the bug is this: > > bash-5.2$ ( exec -a foobar /bin/sh <<< "echo hello" ) > foobar: applet not found > > > And then, another way to demonstrate the same busybox issue... lets us, in fact, discover a musl bug in turn!!!I'll discuss that more in reply to your followup mail. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Reasonably Related Threads
- [libnbd PATCH v4 0/2] lib/utils: introduce async-signal-safe execvpe()
- [libnbd PATCH v4 0/2] lib/utils: introduce async-signal-safe execvpe()
- [libnbd PATCH v4 0/2] lib/utils: introduce async-signal-safe execvpe()
- [libnbd PATCH v4 0/2] lib/utils: introduce async-signal-safe execvpe()
- [libnbd PATCH v4 0/2] lib/utils: introduce async-signal-safe execvpe()