Mark Martinec
2017-Sep-12 13:40 UTC
11.1 coredumping in sendfile, as used by a uwsgi process
A couple of days ago I upgraded an Intel box from FreeBSD 10.3 to 11.1-RELEASE-p1 and reinstalled all the packages, built on the same OS version. This host is running the nginx web server with uwsgi as a backend. The file system is ZFS (recent as of 10.3, zpool not yet upgraded to the new 11.1 features).

Ever since the upgrade, this host has been crashing/rebooting two or three times per day. The reported crash location is always the same: it is in a sendfile function (same addresses each time), and the running process is always uwsgi:

Sep 12 15:03:12 xxx syslogd: kernel boot file is /boot/kernel/kernel
Sep 12 15:03:12 xxx kernel: [22677]
Sep 12 15:03:12 xxx kernel: [22677]
Sep 12 15:03:12 xxx kernel: [22677] Fatal trap 12: page fault while in kernel mode
Sep 12 15:03:12 xxx kernel: [22677] cpuid = 7; apic id = 07
Sep 12 15:03:12 xxx kernel: [22677] fault virtual address = 0xe8
Sep 12 15:03:12 xxx kernel: [22677] fault code = supervisor write data, page not present
Sep 12 15:03:12 xxx kernel: [22677] instruction pointer = 0x20:0xffffffff80afefb2
Sep 12 15:03:12 xxx kernel: [22677] stack pointer = 0x28:0xfffffe02397da5a0
Sep 12 15:03:12 xxx kernel: [22677] frame pointer = 0x28:0xfffffe02397da5e0
Sep 12 15:03:12 xxx kernel: [22677] code segment = base 0x0, limit 0xfffff, type 0x1b
Sep 12 15:03:12 xxx kernel: [22677] = DPL 0, pres 1, long 1, def32 0, gran 1
Sep 12 15:03:12 xxx kernel: [22677] processor eflags = interrupt enabled, resume, IOPL = 0
Sep 12 15:03:12 xxx kernel: [22677] current process = 34504 (uwsgi)
Sep 12 15:03:12 xxx kernel: [22677] trap number = 12
Sep 12 15:03:12 xxx kernel: [22677] panic: page fault
Sep 12 15:03:12 xxx kernel: [22677] cpuid = 7
Sep 12 15:03:12 xxx kernel: [22677] KDB: stack backtrace:
Sep 12 15:03:12 xxx kernel: [22677] #0 0xffffffff80aada97 at kdb_backtrace+0x67
Sep 12 15:03:12 xxx kernel: [22677] #1 0xffffffff80a6bb76 at vpanic+0x186
Sep 12 15:03:12 xxx kernel: [22677] #2 0xffffffff80a6b9e3 at panic+0x43
Sep 12 15:03:12 xxx kernel: [22677] #3 0xffffffff80edf832 at trap_fatal+0x322
Sep 12 15:03:12 xxx kernel: [22677] #4 0xffffffff80edf889 at trap_pfault+0x49
Sep 12 15:03:12 xxx kernel: [22677] #5 0xffffffff80edf0c6 at trap+0x286
Sep 12 15:03:12 xxx kernel: [22677] #6 0xffffffff80ec3641 at calltrap+0x8
Sep 12 15:03:12 xxx kernel: [22677] #7 0xffffffff80a6a2af at sendfile_iodone+0xbf
Sep 12 15:03:12 xxx kernel: [22677] #8 0xffffffff80a69eae at vn_sendfile+0x124e
Sep 12 15:03:12 xxx kernel: [22677] #9 0xffffffff80a6a4dd at sendfile+0x13d
Sep 12 15:03:12 xxx kernel: [22677] #10 0xffffffff80ee0394 at amd64_syscall+0x6c4
Sep 12 15:03:12 xxx kernel: [22677] #11 0xffffffff80ec392b at Xfast_syscall+0xfb
Sep 12 15:03:12 xxx kernel: [22677] Uptime: 6h17m57s
Sep 12 15:03:12 xxx kernel: [22677] Dumping 983 out of 8129 MB:..2%..12%..22%..31%..41%..51%..61%..72%..82%..92%Copyright (c) 1992-2017 The FreeBSD Project.
Sep 12 15:03:12 xxx kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Sep 12 15:03:12 xxx kernel: The Regents of the University of California. All rights reserved.
Sep 12 15:03:12 xxx kernel: FreeBSD is a registered trademark of The FreeBSD Foundation.
Sep 12 15:03:12 xxx kernel: FreeBSD 11.1-RELEASE-p1 #0: Wed Aug 9 11:55:48 UTC 2017
[...]
Sep 12 15:03:12 xxx savecore: reboot after panic: page fault
Sep 12 15:03:12 xxx savecore: writing core to /var/crash/vmcore.4

This host, with the same services and the same ZFS pool, was very stable under 10.3.

We have several other hosts running 11.1 with no incidents, running various services (but admittedly no other host has a comparably busy web server). Interestingly, nginx has its sendfile feature enabled too, but this does not cause a crash (on this or other hosts); only sendfile as used by uwsgi seems to be the problem.

For the time being I have disabled the use of sendfile in uwsgi; we'll see if this avoids the trouble.

Suggestions?

Mark
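One way to check that every panic reports the same crash location is to pull the backtrace frames out of each logged panic and compare them. The following is only a minimal Python sketch, not something taken from the report: the names parse_backtrace and same_crash_site are hypothetical, and it assumes the "#N 0xADDR at symbol+0xOFF" frame format seen in the log above.

```python
import re

# Assumed frame format, e.g. "#7 0xffffffff80a6a2af at sendfile_iodone+0xbf"
FRAME_RE = re.compile(r"#(\d+)\s+(0x[0-9a-f]+)\s+at\s+(\w+)\+(0x[0-9a-f]+)")


def parse_backtrace(log_text):
    """Return a list of (frame index, address, symbol, offset) tuples."""
    return [
        (int(idx), int(addr, 16), sym, int(off, 16))
        for idx, addr, sym, off in FRAME_RE.findall(log_text)
    ]


def same_crash_site(log_a, log_b):
    """True if both logs contain frames and their symbol+offset chains match."""
    chain_a = [(sym, off) for _, _, sym, off in parse_backtrace(log_a)]
    chain_b = [(sym, off) for _, _, sym, off in parse_backtrace(log_b)]
    return bool(chain_a) and chain_a == chain_b


# Three frames copied from the panic log above.
SAMPLE = """\
Sep 12 15:03:12 xxx kernel: [22677] #6 0xffffffff80ec3641 at calltrap+0x8
Sep 12 15:03:12 xxx kernel: [22677] #7 0xffffffff80a6a2af at sendfile_iodone+0xbf
Sep 12 15:03:12 xxx kernel: [22677] #8 0xffffffff80a69eae at vn_sendfile+0x124e
"""

if __name__ == "__main__":
    for idx, addr, sym, off in parse_backtrace(SAMPLE):
        print(f"#{idx} {addr:#x} {sym}+{off:#x}")
```

Comparing symbol plus offset rather than raw addresses keeps the comparison meaningful only as long as the same kernel build is running; after a kernel update the offsets would be expected to change.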
Steven Hartland
2017-Sep-12 13:46 UTC
11.1 coredumping in sendfile, as used by a uwsgi process
Could you post the decoded crash info from /var/crash/...

I would also create a bug report:
https://bugs.freebsd.org/bugzilla/enter_bug.cgi?product=Base%20System

Regards
Steve