John Baldwin
2016-Oct-28 14:29 UTC
stable/11 -r307797 on BPi-M3 (cortex-a7): truss gets segmentation fault for handling unknown system call
On Tuesday, October 25, 2016 11:40:38 AM Mark Millard wrote:
> [The following has been reported in: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213778 .]
>
> In trying to build lang/gcc6 xgcc's cc1 got some SIGSYS examples. In trying to track things down I ran into truss getting a SIGSEGV when it tries to handle the situation. . .
>
> In truss's enter_syscall there is (from a live gdb on truss, after the segmentation fault):
>
> 380         t->cs.name = sysdecode_syscallname(t->proc->abi->abi, t->cs.number);
> 381         if (t->cs.name == NULL)
> (gdb)
> 382                 fprintf(info->outfile, "-- UNKNOWN %s SYSCALL %d --\n",
> 383                     t->proc->abi->type, t->cs.number);
> 384
> 385         sc = get_syscall(t->cs.name, narg);
> 386         t->cs.nargs = sc->nargs;
> 387         assert(sc->nargs <= nitems(t->cs.s_args));
> 388
> 389         t->cs.sc = sc;
>
> (gdb) print *t
> $2 = {entries = {le_next = 0x0, le_prev = 0x20617070}, proc = 0x20617060, tid = 100150, in_syscall = 1, cs = {sc = 0x0, name = 0x0, number = 580828064, args = 0x2061b0c0, nargs = 0,
> s_args = 0x2061b0ec}, before = {tv_sec = 1477418265, tv_nsec = 492342263}, after = {tv_sec = 1477418265, tv_nsec = 492496630}}
>
> (gdb) print sc
> $3 = (struct syscall *) 0x0
>
> So line 386 listed above gets a segmentation fault for sc->nargs when t->cs.name is a NULL pointer: sc ends up NULL.
>
> Looking at the two things that the fprintf on lines 382 and 383 would report:
>
> (gdb) print t->proc->abi->type
> $4 = 0x10166 "FreeBSD ELF32"
>
> (gdb) print t->cs.number
> $5 = 580828064
>
> (gdb) print narg
> $6 = 0
>
> (that last is for context for the get_syscall arguments).
>
> FYI: 580828064 = 0x229EBBA0

I have a patchset I have tested some in a git branch that I believe fixes handling of
unknown system calls.  Please try this:

https://github.com/freebsd/freebsd/compare/master...bsdjhb:truss_unknown

(Add .diff to get a diff you can apply with patch)

--
John Baldwin
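The failure mode in the quoted report (a lookup that yields NULL, followed by an unguarded sc->nargs) can be sketched in isolation. This is only a minimal illustration, not the contents of the truss_unknown branch: struct syscall_sketch, known_sketch, lookup_sketch and nargs_or_default are hypothetical names invented here, and the table holds two made-up entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for truss's struct syscall. */
struct syscall_sketch {
	const char *name;
	unsigned int nargs;
};

/* Made-up table contents, for illustration only. */
static struct syscall_sketch known_sketch[] = {
	{ "read", 3 },
	{ "write", 3 },
};

/* Models a lookup that returns NULL when the name is NULL or unknown. */
static struct syscall_sketch *
lookup_sketch(const char *name)
{
	size_t i;

	if (name == NULL)
		return (NULL);
	for (i = 0; i < sizeof(known_sketch) / sizeof(known_sketch[0]); i++)
		if (strcmp(known_sketch[i].name, name) == 0)
			return (&known_sketch[i]);
	return (NULL);
}

/* Caller-side guard: test the result before reading sc->nargs. */
static unsigned int
nargs_or_default(const char *name, unsigned int fallback)
{
	struct syscall_sketch *sc = lookup_sketch(name);

	return (sc != NULL ? sc->nargs : fallback);
}
```

With a guard of this kind, nargs_or_default(NULL, 0) returns the fallback instead of dereferencing a NULL pointer.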
Mark Millard
2016-Oct-28 23:02 UTC
stable/11 -r307797 on BPi-M3 (cortex-a7): truss gets segmentation fault for handling unknown system call
On 2016-Oct-28, at 7:29 AM, John Baldwin <jhb at freebsd.org> wrote:

> I have a patchset I have tested some in a git branch that I believe fixes handling of
> unknown system calls.  Please try this:
>
> https://github.com/freebsd/freebsd/compare/master...bsdjhb:truss_unknown
>
> (Add .diff to get a diff you can apply with patch)
>
> --
> John Baldwin

Sorry it took so long to try the build. . .

I got a compile failure for use of bool in my stable/11 context for the BPI-M3 build that the truss problem was discovered with (quoting the build log below):

> --- main.o ---
> cc -target armv6-gnueabihf-freebsd11.0 --sysroot=/usr/local/src/crochet/work/obj/arm.armv6/usr/src/tmp -B/usr/local/src/crochet/work/obj/arm.armv6/usr/src/tmp/usr/bin -O -pipe -I/usr/src/usr.bin/truss -I. -I/usr/src/usr.bin/truss/../../sys -g -MD -MF.depend.main.o -MTmain.o -std=gnu99 -Wsystem-headers -Wall -Wno-format-y2k -W -Wno-unused-parameter -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wreturn-type -Wcast-qual -Wwrite-strings -Wswitch -Wshadow -Wunused-parameter -Wcast-align -Wchar-subscripts -Winline -Wnested-externs -Wredundant-decls -Wold-style-definition -Wno-pointer-sign -Wmissing-variable-declarations -Wthread-safety -Wno-empty-body -Wno-string-plus-int -Wno-unused-const-variable -Qunused-arguments -c /usr/src/usr.bin/truss/main.c -o main.o
> In file included from /usr/src/usr.bin/truss/main.c:53:
> /usr/src/usr.bin/truss/syscall.h:75:2: error: unknown type name 'bool'
>         bool unknown; /* Uknown system call */
>         ^
> 1 error generated.
> *** [main.o] Error code 1
>
> make[4]: stopped in /usr/src/usr.bin/truss
> 1 error

In C99 bool is a macro from <stdbool.h> and _Bool is the C99 type itself. So apparently <stdbool.h> (or an equivalent) was not directly or indirectly included. (The macros true and false and __bool_true_false_are_defined are also from <stdbool.h> .)

Which way do you want the C99 typing to be handled for this: native C99 with no <stdbool.h> required? Use <stdbool.h> ?

Side note: I'll see about getting my normal stable/11 build environment going for the BPI-M3 instead of using the crochet from my first-time build for the target.
==
Mark Millard
markmi at dsl-only.net
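The two options raised in the message above (include <stdbool.h>, or use the built-in _Bool) can be compared with a small sketch. Only the member name "unknown" comes from the build log; the struct names and the helper bool_spellings_agree are hypothetical and exist just for this illustration.

```c
#include <assert.h>
#include <stdbool.h>	/* C99: supplies the macros bool, true and false */

/*
 * Hypothetical stand-in for the struct in truss's syscall.h; only the
 * "unknown" member is taken from the build log.
 */
struct syscall_sketch {
	const char *name;
	bool unknown;		/* expands to _Bool through <stdbool.h> */
};

/* The same flag spelled with the built-in C99 type; no header needed. */
struct syscall_sketch_native {
	const char *name;
	_Bool unknown;
};

/* Returns 1 when both spellings store the same truth value. */
static int
bool_spellings_agree(int value)
{
	struct syscall_sketch a = { "a", value != 0 };
	struct syscall_sketch_native b = { "b", value != 0 };

	assert(sizeof(a.unknown) == sizeof(b.unknown));
	return (a.unknown == b.unknown);
}
```

Either spelling compiles under -std=gnu99; the difference is only whether the header that declares the struct also has to pull in <stdbool.h>.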
Mark Millard
2016-Oct-29 21:18 UTC
stable/11 -r307797 on BPi-M3 (cortex-a7): truss gets segmentation fault for handling unknown system call
[I re-established the crochet-build based failure context finally. Unfortunately truss just dies in a new place.]

On 2016-Oct-28, at 7:29 AM, John Baldwin <jhb at freebsd.org> wrote:

> I have a patchset I have tested some in a git branch that I believe fixes handling of
> unknown system calls.  Please try this:
>
> https://github.com/freebsd/freebsd/compare/master...bsdjhb:truss_unknown
>
> (Add .diff to get a diff you can apply with patch)
>
> --
> John Baldwin

[Watch out for inlining consequences in how gdb presents things. Also I extracted from my explorations and changed the presentation order to eliminate junk.]

Summary:

st->syscalls ends up NULL from reallocf refusing a huge allocation because t->cs.number==580828064, which would make for a huge offset in st->syscalls[number] . new_count * sizeof(st->syscalls[0]) would be rather large (new_count == number+1) .

reallocf's result needs to be tested and/or reasonable-value-checks on t->cs.number (a.k.a. number) need to be made and unreasonable values handled some other way.

The supporting details:

root at bananapi-m3:/usr/obj/portswork/usr/ports/lang/gcc6/work/.build/armv6-portbld-freebsd11.0/libgcc # gdb truss
GNU gdb 6.1.1 [FreeBSD]
. . .
(gdb) run -faeH -o truss.log /usr/obj/portswork/usr/ports/lang/gcc6/work/.build/./gcc/xgcc -B/usr/obj/portswork/usr/ports/lang/gcc6/work/.build/./gcc/ -B/usr/local/armv6-portbld-freebsd11.0/bin/ -B/usr/local/armv6-portbld-freebsd11.0/lib/ -isystem /usr/local/armv6-portbld-freebsd11.0/include -isystem /usr/local/armv6-portbld-freebsd11.0/sys-include -O2 -pipe -mcpu=cortex-a7 -DLIBICONV_PLUG -g -fno-strict-aliasing -O2 -O2 -pipe -mcpu=cortex-a7 -DLIBICONV_PLUG -g -fno-strict-aliasing -DIN_GCC -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-format -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -fPIC -pthread -fno-inline -fomit-frame-pointer -g -DIN_LIBGCC2 -fbuilding-libgcc -fno-stack-protector -fPIC -pthread -fno-inline -fomit-frame-pointer -I. -I. -I../.././gcc -I/usr/obj/portswork/usr/ports/lang/gcc6/work/gcc-6.2.0/libgcc -I/usr/obj/portswork/usr/ports/lang/gcc6/work/gcc-6.2.0/libgcc/. -I/usr/obj/portswork/usr/ports/lang/gcc6/work/gcc-6.2.0/libgcc/../gcc -I/usr/obj/portswork/usr/ports/lang/gcc6/work/gcc-6.2.0/libgcc/../include -DHAVE_CC_TLS -o _muldi3.o -MT _muldi3.o -MD -MP -MF _muldi3.dep -DL_muldi3 -c /usr/obj/portswork/usr/ports/lang/gcc6/work/gcc-6.2.0/libgcc/libgcc2.c -fvisibility=hidden -DHIDE_EXPORTS
Starting program: /usr/bin/truss -faeH -o truss.log . . . .

Program received signal SIGSEGV, Segmentation fault.
0x20241ebc in memset () from /lib/libc.so.7
Current language: auto; currently minimal
(gdb) bt
#0 0x20241ebc in memset () from /lib/libc.so.7
#1 0x0000aec8 in get_syscall (t=<value optimized out>, number=580828064, nargs=0) at /usr/src/usr.bin/truss/syscalls.c:956
#2 0x0000ab8c in enter_syscall (info=0x20612000, t=0x2061b0a0, pl=<value optimized out>) at /usr/src/usr.bin/truss/setup.c:380
#3 0x0000a798 in eventloop (info=<value optimized out>) at /usr/src/usr.bin/truss/setup.c:664
#4 0x000098d4 in $a.6 () at /usr/src/usr.bin/truss/main.c:207
#5 0x000098d4 in $a.6 () at /usr/src/usr.bin/truss/main.c:207
(gdb) up
#1 0x0000aec8 in get_syscall (t=<value optimized out>, number=580828064, nargs=0) at /usr/src/usr.bin/truss/syscalls.c:956
956         memset(st->syscalls + st->count, 0, (new_count - st->count) *
. . .
0x20241eac <memset+244>: cmp r1, #4 ; 0x4
0x20241eb0 <memset+248>: bge 0x20241dd4 <memset+28>
0x20241eb4 <memset+252>: cmp r1, #0 ; 0x0
0x20241eb8 <memset+256>: moveq pc, lr
0x20241ebc <memset+260>: strb r3, [r12], #1
. . .
(gdb) info reg
r0             0x0          0
r1             0x8a7aee84   -1971655036
r2             0x8a7aee84   -1971655036
r3             0x0          0
r4             0x1          1
r5             0x2062000c   543293452
r6             0x20620000   543293440
r7             0x229ebba1   580828065
r8             0x2061b0b0   543273136
r9             0x0          0
r10            0x229ebba0   580828064
r11            0xbfbfe478   -1077943176
r12            0x0          0
sp             0xbfbfe450   -1077943216
lr             0xaec8       44744
pc             0x20241ebc   539238076
fps            0x0          0
cpsr           0xa0000010   -1610612720
. . .
(gdb)
946 static void
947 grow_syscall_table(struct syscall_table *st, u_int number)
948 {
949         u_int new_count;
950
951         new_count = number + 1;
952         if (st->count >= new_count)
953                 return;
954         st->syscalls = reallocf(st->syscalls, new_count *
955             sizeof(st->syscalls[0]));
(gdb)
956         memset(st->syscalls + st->count, 0, (new_count - st->count) *
957             sizeof(st->syscalls[0]));
958 }
959
960 /*
961  * If/when the list gets big, it might be desirable to do it
962  * as a hash table or binary search.
963  */
964 struct syscall *
965 get_syscall(struct threadinfo *t, u_int number, u_int nargs)
(gdb)
966 {
967         struct syscall_table *st;
968         struct syscall *sc;
969         const char *name;
970         u_int i;
971
972         st = lookup_syscall_table(t->proc->abi->abi);
973         grow_syscall_table(st, number);
974         sc = st->syscalls[number];
975         if (sc != NULL)
(gdb)
976                 return (sc);
. . .
951         new_count = number + 1;
952         if (st->count >= new_count)
953                 return;
954         st->syscalls = reallocf(st->syscalls, new_count *
955             sizeof(st->syscalls[0]));
956         memset(st->syscalls + st->count, 0, (new_count - st->count) *
957             sizeof(st->syscalls[0]));
958 }
959
960 /*
(gdb) up
#2 0x0000ab8c in enter_syscall (info=0x20612000, t=0x2061b0a0, pl=<value optimized out>) at /usr/src/usr.bin/truss/setup.c:380
380         sc = get_syscall(t, t->cs.number, narg);
(gdb) list
375         if (narg != 0 && t->proc->abi->fetch_args(info, narg) != 0) {
376                 free_syscall(t);
377                 return;
378         }
379
380         sc = get_syscall(t, t->cs.number, narg);
381         if (sc->unknown)
382                 fprintf(info->outfile, "-- UNKNOWN %s SYSCALL %d --\n",
383                     t->proc->abi->type, t->cs.number);
384
(gdb) print *t
$1 = {entries = {le_next = 0x0, le_prev = 0x20617028}, proc = 0x20617018, tid = 100103, in_syscall = 1, cs = {sc = 0x0, number = 580828064, nargs = 0, args = 0x2061b0c0, s_args = 0x2061b0e8}, before = {tv_sec = 1477771714, tv_nsec = 696971654}, after = {tv_sec = 1477771714, tv_nsec = 697117646}}
(gdb) print narg
$2 = 0
. . .
(gdb) print t->cs.number
$9 = 580828064
. . .
(gdb) print *(t->proc)
$6 = {entries = {le_next = 0x20617000, le_prev = 0x20617048}, pid = 808, abi = 0x1ee68, threadlist = {lh_first = 0x2061b0a0}}
(gdb) print *(t->proc->abi)
$7 = {type = 0x1026b "FreeBSD ELF32", abi = SYSDECODE_ABI_FREEBSD, fetch_args = 0xda44 <arm_fetch_args>, fetch_retval = 0xdb64 <arm_fetch_retval>}
(gdb) print t->proc->abi->abi
$8 = SYSDECODE_ABI_FREEBSD

So for t->cs.number==580828064 :

380         sc = get_syscall(t, t->cs.number, narg);
. . .
965 get_syscall(struct threadinfo *t, u_int number, u_int nargs)
. . .
973         grow_syscall_table(st, number);
974         sc = st->syscalls[number];

would get very far away from st->syscalls after indexing by the large number==580828064 --if the grow could even complete for number==580828064 :

947 grow_syscall_table(struct syscall_table *st, u_int number)
948 {
. . .
951         new_count = number + 1;
. . .
954         st->syscalls = reallocf(st->syscalls, new_count *
955             sizeof(st->syscalls[0]));
956         memset(st->syscalls + st->count, 0, (new_count - st->count) *
957             sizeof(st->syscalls[0]));

st->syscalls was NULL after reallocf returned the value NULL.

==
Mark Millard
markmi at dsl-only.net
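The two checks suggested in the summary (a sanity limit on the syscall number, and testing the allocation result) might look roughly like the sketch below. It is not the eventual truss fix. SKETCH_MAX_SYSCALL, struct syscall_table_sketch, table_bytes and grow_syscall_table_checked are hypothetical names, the limit of 4096 is an arbitrary assumption, and standard realloc is used in place of reallocf so the old table survives a failed allocation (reallocf frees it). The helper table_bytes reproduces the size arithmetic: with number == 580828064 and 4-byte pointers, (number + 1) * 4 is 0x8a7aee84, which matches the value shown in r1 and r2 in the register dump.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical upper bound on plausible syscall numbers; not from truss. */
#define	SKETCH_MAX_SYSCALL	4096u

struct syscall;			/* opaque for this sketch */

/* Simplified stand-in for truss's struct syscall_table. */
struct syscall_table_sketch {
	struct syscall **syscalls;
	unsigned int count;
};

/* Bytes the unchecked code would request: (number + 1) * element size. */
static uint64_t
table_bytes(unsigned int number, size_t elem_size)
{
	return (((uint64_t)number + 1) * elem_size);
}

/* Returns 0 on success, -1 for an implausible number or failed allocation. */
static int
grow_syscall_table_checked(struct syscall_table_sketch *st, unsigned int number)
{
	struct syscall **p;
	unsigned int new_count;

	if (number >= SKETCH_MAX_SYSCALL)
		return (-1);	/* reject before sizing anything */
	new_count = number + 1;
	if (st->count >= new_count)
		return (0);
	p = realloc(st->syscalls, new_count * sizeof(st->syscalls[0]));
	if (p == NULL)
		return (-1);	/* the old table is still valid */
	/* Zero only the newly added slots, then record the new size. */
	memset(p + st->count, 0, (new_count - st->count) * sizeof(p[0]));
	st->syscalls = p;
	st->count = new_count;
	return (0);
}
```

A caller would treat a -1 return as "unknown system call" and report the number without indexing the table.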