Hans Petter Selasky
2017-Sep-18 07:40 UTC
[Asterisk-bsd] Asterisk13 coredump on freebsd 11.1
On 09/18/17 09:39, Tao Zhou wrote:
> I recently upgraded asterisk13 from 13.17.0_1 to 13.17.1, and also
> upgraded freebsd from 11.0 to 11.1. since then asterisk starts crashing
> every few minutes.
>
> I ran /usr/local/share/asterisk/scripts/ast_coredumper /tmp/asterisk.core
>
> and got the following result in asterisk.core-full.txt
>
>
> Thread 6 (LWP 101423):
> #0 0x000000080335558a in _poll () from /lib/libc.so.7
> No symbol table info available.
> #1 0x000000080303e706 in ?? () from /lib/libthr.so.3
> No symbol table info available.
> #2 0x00000000004506b2 in ?? ()
> No symbol table info available.
> #3 0x00000000005921ea in ?? ()
> No symbol table info available.
> #4 0x000000080303bbc5 in ?? () from /lib/libthr.so.3
> No symbol table info available.
> #5 0x0000000000000000 in ?? ()
> No symbol table info available.
> Backtrace stopped: Cannot access memory at address 0x7fffdfe0a000
>
> Thread 5 (LWP 101336):
> #0 0x0000000803049c7c in ?? () from /lib/libthr.so.3
> No symbol table info available.
> #1 0x0000000803046325 in ?? () from /lib/libthr.so.3
> No symbol table info available.
> #2 0x0000000000589d18 in ?? ()
> No symbol table info available.
> #3 0x00000000004552f1 in ?? ()
> No symbol table info available.
> #4 0x00000000004552f1 in ?? ()
> No symbol table info available.
> #5 0x0000000000589c4b in ?? ()
> No symbol table info available.
> #6 0x000000000058318e in ast_taskprocessor_execute ()
> No symbol table info available.
> #7 0x00000000005832de in ?? ()
> No symbol table info available.
> #8 0x00000000005921ea in ?? ()
> No symbol table info available.
> #9 0x000000080303bbc5 in ?? () from /lib/libthr.so.3
> No symbol table info available.
> #10 0x0000000000000000 in ?? ()
> No symbol table info available.
> Backtrace stopped: Cannot access memory at address 0x7fffdff04000
>
> Thread 4 (LWP 101240):
> #0 0x00000008032d88b8 in _umtx_op () from /lib/libc.so.7
> No symbol table info available.
> #1 0x00000008032c275d in sem_clockwait_np () from /lib/libc.so.7
> No symbol table info available.
> #2 0x00000000005832c8 in ?? ()
> No symbol table info available.
> #3 0x00000000005921ea in ?? ()
> No symbol table info available.
> #4 0x000000080303bbc5 in ?? () from /lib/libthr.so.3
> No symbol table info available.
> #5 0x0000000000000000 in ?? ()
> No symbol table info available.
> Backtrace stopped: Cannot access memory at address 0x7fffdff81000
>
> Thread 3 (LWP 101183):
> #0 0x0000000803049c7c in ?? () from /lib/libthr.so.3
> No symbol table info available.
> #1 0x000000080303dba0 in ?? () from /lib/libthr.so.3
> No symbol table info available.
> #2 0x00000008030479f8 in ?? () from /lib/libthr.so.3
> No symbol table info available.
> #3 0x00000000004c9756 in ?? ()
> No symbol table info available.
> #4 0x00000000005921ea in ?? ()
> No symbol table info available.
> #5 0x000000080303bbc5 in ?? () from /lib/libthr.so.3
> No symbol table info available.
> #6 0x0000000000000000 in ?? ()
> No symbol table info available.
> Backtrace stopped: Cannot access memory at address 0x7fffdfffe000
>
> Thread 2 (LWP 100441):
> #0 0x000000080335558a in _poll () from /lib/libc.so.7
> No symbol table info available.
> #1 0x000000080303e706 in ?? () from /lib/libthr.so.3
> No symbol table info available.
> #2 0x0000000000452980 in ?? ()
> No symbol table info available.
> #3 0x0000000000436483 in ?? ()
> No symbol table info available.
> #4 0x0000000000437b1f in ?? ()
> No symbol table info available.
> #5 0x0000000800876000 in ?? ()
> No symbol table info available.
> #6 0x0000000000000000 in ?? ()
> No symbol table info available.
>
> Thread 1 (LWP 100493):
> #0 x86_64_freebsd_fallback_frame_state (context=0x7fffdfd0fe20,
> context=0x7fffdfd0fe20, fs=0x7fffdfd0fb70) at ./md-unwind-support.h:60
> sf = <optimized out>
> new_cfa = <optimized out>
> #1 uw_frame_state_for (context=context@entry=0x7fffdfd0fe20,
> fs=fs@entry=0x7fffdfd0fb70) at
> /wrkdirs/usr/ports/lang/gcc6/work/gcc-6.4.0/libgcc/unwind-dw2.c:1249
> fde = 0x0
> cie = <optimized out>
> aug = <optimized out>
> insn = <optimized out>
> end = <optimized out>
> #2 0x0000000802e2cffb in _Unwind_ForcedUnwind_Phase2
> (exc=exc@entry=0x80701dc30, context=context@entry=0x7fffdfd0fe20) at
> /wrkdirs/usr/ports/lang/gcc6/work/gcc-6.4.0/libgcc/unwind.inc:155
> fs = {regs = {reg = {{loc = {reg = 0, offset = 0, exp = 0x0},
> how = REG_UNSAVED} <repeats 18 times>}, prev = 0x0, cfa_offset = 0,
> cfa_reg = 0, cfa_exp = 0x0, cfa_how = CFA_UNSET}, pc = 0x0, personality
> = 0x0, data_align = 0, code_align = 0, retaddr_column = 0, fde_encoding
> = 0 '\000', lsda_encoding = 0 '\000', saw_z = 0 '\000', signal_frame = 0
> '\000', eh_ptr = 0x0}
> action = <optimized out>
> stop = 0x8030497b0
> stop_argument = 0x0
> code = <optimized out>
> stop_code = <optimized out>
> #3 0x0000000802e2d334 in _Unwind_ForcedUnwind (exc=0x80701dc30,
> stop=0x8030497b0, stop_argument=<optimized out>) at
> /wrkdirs/usr/ports/lang/gcc6/work/gcc-6.4.0/libgcc/unwind.inc:207
> this_context = {reg = {0x7fffdfd0ff18, 0x7fffdfd0ff20, 0x0,
> 0x7fffdfd0ff28, 0x0, 0x0, 0x7fffdfd0ff50, 0x0, 0x0, 0x0, 0x0, 0x0,
> 0x7fffdfd0ff30, 0x7fffdfd0ff38, 0x7fffdfd0ff40, 0x7fffdfd0ff48,
> 0x7fffdfd0ff58, 0x0}, cfa = 0x7fffdfd0ff60, ra = 0x803049613, lsda =
> 0x0, bases = {tbase = 0x0, dbase = 0x0, func = 0x802e2d2d0
> <_Unwind_ForcedUnwind>}, flags = 4611686018427387904, version = 0,
> args_size = 0, by_value = '\000' <repeats 17 times>}
> cur_context = {reg = {0x7fffdfd0ff18, 0x7fffdfd0ff20, 0x0,
> 0x7fffdfd0ffd8, 0x0, 0x0, 0x7fffdfd0fff0, 0x0, 0x0, 0x0, 0x0, 0x0,
> 0x7fffdfd0ff30, 0x7fffdfd0ff38, 0x7fffdfd0ffe0, 0x7fffdfd0ffe8,
> 0x7fffdfd0fff8, 0x0}, cfa = 0x7fffdfd10000, ra = 0x7fffdfc94000, lsda =
> 0x0, bases = {tbase = 0x0, dbase = 0x0, func = 0x80303ba80}, flags =
> 4611686018427387904, version = 0, args_size = 0, by_value = '\000'
> <repeats 17 times>}
> code = <optimized out>
> #4 0x0000000803049613 in ?? () from /lib/libthr.so.3
> No symbol table info available.
> #5 0x000000080304942b in pthread_exit () from /lib/libthr.so.3
> No symbol table info available.
> #6 0x000000080303bbcd in ?? () from /lib/libthr.so.3
> No symbol table info available.
> #7 0x00007fffdfc94000 in ?? ()
> No symbol table info available.
> Backtrace stopped: Cannot access memory at address 0x7fffdfd10000
>
>
> I tried to run without loading any asterisk modules, and it still crashes.
>
> I also tried copying all the dependent libs from another machine running
> freebsd 11.0, put them in a separate directory and use chrpath to force
> asterisk use those libs, and also got the same error.
>
>
> Any ideas?
>

Hi,

There is a known issue with the latest version of Asterisk 13.xxx crashing. I don't know the root cause. Try downgrading the Asterisk version. You probably should compile all code with debug flags enabled if you want to find the root cause of this.

--HPS
On 18/9/17 5:40 pm, Hans Petter Selasky wrote:
> There is a known issue with the latest version of Asterisk 13.xxx
> crashing. I don't know the root cause. Try downgrading the Asterisk
> version. You probably should compile all code with debug flags enabled
> if you want to find the root cause of this.

In our environment the crash happens almost always within two minutes. No calls or other activity are needed to make the crash happen.

Things that didn't help:

* downgrading asterisk13 to 13.17 or 13.16
* downgrading gcc5 or upgrading it to gcc6
* disabling all modules
* compiling asterisk13 with GCC or CLANG
* upgrading the poudriere build environment from 11.0 to 11.1

Thing that helped:

* installing asterisk 13.16 from https://pkg.freebsd.org

(All our previous attempts were with software which was compiled locally on poudriere under FreeBSD 11.1 or 11.0)

Not sure if it is relevant, but our make environment looks like this:

WITH_PKGNG=yes
WITHOUT_X11=yes
JAVA_PORT=java/openjdk8
JAVA_VERSION=1.8
apache22-worker-mpm_SET+=PROXY_AJP PROXY_BALANCER PROXY_CONNECT PROXY_FTP PROXY_HTTP PROXY_SCGI
WITH_BDB_VER=5
DEFAULT_VERSIONS+= php=7.1
DEFAULT_VERSIONS+= apache=2.4
DEFAULT_VERSIONS+= ssl=openssl
WITH_MYSQL_VER=102m
# This is needed when using openssl from ports
OPTIONS_UNSET+= GSSAPI_BASE
OPTIONS_SET+= GSSAPI_MIT

-- 
Tao ZHOU
ish
http://www.ish.com.au
Level 1, 30 Wilson Street Newtown 2042 Australia
phone +61 2 9550 5001
Hans Petter Selasky
2017-Sep-27 23:17 UTC
[Asterisk-bsd] Asterisk13 coredump on freebsd 11.1
Hi,

I just upgraded and hit these SEGFAULTs too. First of all you need to install GDB 8.0 from ports to get the right backtrace (important). This leads straight into LibUnwind in libgcc:

(gdb) bt
#0 uw_frame_state_for (context=context@entry=0x7fffdf3bbe20, fs=fs@entry=0x7fffdf3bbb70) at /wrkdirs/usr/ports/lang/gcc6/work/gcc-6.4.0/libgcc/unwind-dw2.c:1249
#1 0x0000000802cc8ffb in _Unwind_ForcedUnwind_Phase2 (exc=exc@entry=0x804427230, context=context@entry=0x7fffdf3bbe20) at /wrkdirs/usr/ports/lang/gcc6/work/gcc-6.4.0/libgcc/unwind.inc:155
#2 0x0000000802cc9334 in _Unwind_ForcedUnwind (exc=0x804427230, stop=0x8024d5450 <thread_unwind_stop>, stop_argument=<optimized out>) at /wrkdirs/usr/ports/lang/gcc6/work/gcc-6.4.0/libgcc/unwind.inc:207
#3 0x00000008024d52b3 in _Unwind_ForcedUnwind (ex=<optimized out>, stop_func=0x7fffdf3bb948, stop_arg=0x804427000) at /usr/img/freebsd.11/lib/libthr/thread/thr_exit.c:106
#4 thread_unwind () at /usr/img/freebsd.11/lib/libthr/thread/thr_exit.c:172
#5 _pthread_exit_mask (status=<optimized out>, mask=<optimized out>) at /usr/img/freebsd.11/lib/libthr/thread/thr_exit.c:257
#6 0x00000008024d50db in _pthread_exit (status=0x804427000) at /usr/img/freebsd.11/lib/libthr/thread/thr_exit.c:206
#7 0x00000008024c7c0d in thread_start (curthread=0x804427000) at /usr/img/freebsd.11/lib/libthr/thread/thr_create.c:289
#8 0x00007fffdf340000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdf3bc000

libgcc uses this format which is OK:

(gdb) ptype struct _Unwind_Context
type = struct _Unwind_Context {
    _Unwind_Context_Reg_Val reg[18];
    void *cfa;
    void *ra;
    void *lsda;
    struct dwarf_eh_bases bases;
    _Unwind_Word flags;
    _Unwind_Word version;
    _Unwind_Word args_size;
    char by_value[18];
}

> x86_64_freebsd_fallback_frame_state
> (struct _Unwind_Context *context, _Unwind_FrameState *fs)
> {
> struct sigframe *sf;
> long new_cfa;
>
> /* Prior to FreeBSD 9, the signal trampoline was located immediately
> before the ps_strings. To support non-executable stacks on AMD64,
> the sigtramp was moved to a shared page for FreeBSD 9. Unfortunately
> this means looking frame patterns again (sys/amd64/amd64/sigtramp.S)
> rather than using the robust and convenient KERN_PS_STRINGS trick.
>
> <pc + 00>: lea 0x10(%rsp),%rdi
> <pc + 05>: pushq $0x0
> <pc + 17>: mov $0x1a1,%rax
> <pc + 14>: syscall
>
> If we can't find this pattern, we're at the end of the stack.
> */
>
> if (!( *(unsigned int *)(context->ra) == 0x247c8d48

^^^^ fault is triggered by this read access on the stack

> && *(unsigned int *)(context->ra + 4) == 0x48006a10
> && *(unsigned int *)(context->ra + 8) == 0x01a1c0c7
> && *(unsigned int *)(context->ra + 12) == 0x050f0000 ))
> return _URC_END_OF_STACK;
>

The code in question reads from the caller's return address (context->ra), which here points into the stack. I think that read is caught by the recently added MAP_GUARD feature:

https://svnweb.freebsd.org/changeset/base/320763

I think this feature can be disabled by setting:

sysctl security.bsd.stack_guard_page=0

And then restart Asterisk. Not sure if it helps, currently testing. This is my best guess why Asterisk started segfaulting when upgrading to 11.1.

--HPS
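
For readers following the analysis above, here is a minimal Python sketch of what the four comparisons in the quoted libgcc check amount to. It is only an illustration: the names SIGTRAMP_WORDS, SIGTRAMP_BYTES and looks_like_sigtramp are made up for this example and do not exist in libgcc. The sketch works on a byte buffer, so unlike the C code it cannot fault; in libgcc the same comparison dereferences context->ra directly, which is where the crash occurs when that address is not readable.

```python
import struct

# Little-endian 32-bit words that the quoted check compares against at
# context->ra, context->ra + 4, + 8 and + 12.
SIGTRAMP_WORDS = (0x247C8D48, 0x48006A10, 0x01A1C0C7, 0x050F0000)

# The same 16 bytes written out as x86-64 machine code
# (hypothetical constant, assembled by hand for this illustration).
SIGTRAMP_BYTES = bytes.fromhex(
    "488d7c2410"      # lea   0x10(%rsp),%rdi   (5 bytes)
    "6a00"            # pushq $0x0              (2 bytes)
    "48c7c0a1010000"  # mov   $0x1a1,%rax       (7 bytes)
    "0f05"            # syscall                 (2 bytes)
)


def looks_like_sigtramp(code: bytes) -> bool:
    """Return True if the first 16 bytes match the sigtramp pattern."""
    if len(code) < 16:
        return False
    return struct.unpack("<4I", code[:16]) == SIGTRAMP_WORDS


if __name__ == "__main__":
    print(looks_like_sigtramp(SIGTRAMP_BYTES))  # the trampoline bytes
    print(looks_like_sigtramp(bytes(16)))       # sixteen zero bytes
```

When the pattern does not match, the quoted C code returns _URC_END_OF_STACK, so the check is meant as a harmless "are we at a signal frame?" probe. It only becomes a problem when the address being probed cannot be read at all.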