thr3ads.net - freebsd stable - System hang on shutdown when running freebsd-update [Dec 2014]

Walter Hop

2014-Dec-10 20:48 UTC

System hang on shutdown when running freebsd-update

> On 10 Dec 2014, at 15:28, Juan Ramón Molina Menor <listjm at club-internet.fr> wrote:
> 
>> A FreeBSD 10.1-REL amd64 VM here also hangs after /sbin/reboot is run,
>> after installing updates with freebsd-update. The VM runs under
>> VirtualBox on an Ubuntu 14.04 host.
>> 
>> On hard-resetting the VM, it appears the filesystem (UFS) wasn't flushed:
>> 
>> Dec 11 00:29:52 vbox-freebsd kernel: WARNING: / was not properly dismounted
> 
> Yes, I forgot that, same here on bare metal.

I'd expect it to happen again, since the update touched /sbin/init, and that
seems to be one way to make reboot/unmount hang with UFS2+softupdates. (See my
minimal example in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=195458)

A workaround to prevent the hang/fsck is to first disable softupdates on /
before doing the freebsd-update. We updated around 40 boxes with this one weird
trick, and we did not have the problem.
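
The thread does not say how soft updates were disabled. What follows is a
minimal sketch, assuming the usual tunefs(8) route: the flag can only be
changed while the filesystem is unmounted or mounted read-only, so for / that
normally means single-user mode. The helper name softupdates_state and the
sample mount lines are illustrative and not taken from the thread. If
journaling (SU+J) is in use it may need separate handling, which is not
covered here.

```sh
#!/bin/sh
# Hypothetical helper (not from the thread): classify one line of mount(8)
# output by whether the "soft-updates" option appears in it.
softupdates_state() {
    case "$1" in
        *soft-updates*) echo "enabled" ;;
        *)              echo "disabled" ;;
    esac
}

# Illustrative sample lines; real device names will differ.
softupdates_state '/dev/ada0p2 on / (ufs, local, soft-updates)'
softupdates_state '/dev/ada0p2 on / (ufs, local)'

# Assumed procedure on the real machine (run as root, not executed here):
#   1. Boot into single-user mode, where / is mounted read-only.
#   2. tunefs -n disable /
#   3. Reboot to multi-user, then: freebsd-update fetch install
#   4. After the post-update reboot, repeat steps 1-2 with
#      "tunefs -n enable /" to turn soft updates back on.
```

On a live system the helper would be fed the line for / from mount(8) output,
before and after the tunefs step, to confirm that the flag actually changed.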

I've also checked whether the hang occurs in an 11-CURRENT snapshot, and it
does. (I tried the ISO snapshot available today, which is r273635.) WITNESS
reports a lock order reversal, which looks related to me:

root@current:~ # chflags noschg /sbin/init
root@current:~ # cp -Rp /sbin/init /sbin/init2
lock order reversal:
1st 0xfffffe007b842fa0 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:3093
2nd 0xfffff80002b9ea00 dirhash (dirhash) @ /usr/src/sys/ufs/ufs/ufs_dirhash.c:284
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe000025c270
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe000025c320
witness_checkorder() at witness_checkorder+0xdad/frame 0xfffffe000025c3b0
_sx_xlock() at _sx_xlock+0x75/frame 0xfffffe000025c3f0
ufsdirhash_add() at ufsdirhash_add+0x3a/frame 0xfffffe000025c430
ufs_direnter() at ufs_direnter+0x6a0/frame 0xfffffe000025c4f0
ufs_makeinode() at ufs_makeinode+0x560/frame 0xfffffe000025c6a0
VOP_CREATE_APV() at VOP_CREATE_APV+0xf1/frame 0xfffffe000025c6d0
vn_open_cred() at vn_open_cred+0x29d/frame 0xfffffe000025c820
kern_openat() at kern_openat+0x26f/frame 0xfffffe000025c9a0
amd64_syscall() at amd64_syscall+0x25a/frame 0xfffffe000025cab0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe000025cab0
--- syscall (5, FreeBSD ELF64, sys_open), rip = 0x80094f01a, rsp = 0x7fffffffe958, rbp = 0x7fffffffe9c0 ---

Screenshot is here: http://lf.ms/current-r273635-hang-1.png

Finally, when rebooting, another lock order reversal appears and the system
hangs. I don't have a text log of this, so I'll copy the first few lines:

Syncing disks, vnodes remaining...1 0 0 done
All buffers synced.
lock order reversal:
 1st 0xfffff80002e65d50 ufs (ufs) @ /usr/src/sys/kern/vfs_mount.c:1223
 2nd 0xfffff80002e665f0 devfs (devfs) @ /usr/src/sys/kern/vfs_subr.c:2144

Screenshot is here: http://lf.ms/current-r273635-hang-2.png

I don't have kernel hacking experience, but these source files look awfully
related to the parts that we are having problems with.

I would really like to see this investigated, and possibly an errata notice
issued for 10.1. What can we do to make this actionable?

WH

-- 
Walter Hop | PGP key: https://lifeforms.nl/pgp

Walter Hop

2014-Dec-10 21:20 UTC

System hang on shutdown when running freebsd-update

> On 10 Dec 2014, at 21:48, Walter Hop <freebsd at spam.lifeforms.nl> wrote:
> 
> I've tried if the hang is apparent in a 11-CURRENT snapshot too, and it is.
> (Tried the iso snapshot available today, which is r273635).

I lied! That was not the latest snapshot available, my apologies. I retried with
a snapshot from 7 Dec (r275582).

In r275582, the LOR when fiddling with /sbin/init is the same as in 10.1, but
the behavior at reboot is different, although it is still pathological
("Giving up on 1 buffers"). Screenshot:
http://lf.ms/current-r275582-hang-givingup.png

I have hit this behavior twice on r275582, but not on 10.1. So something has
changed in CURRENT after r273635, but the basic problem still seems to be there.

-- 
Walter Hop | PGP key: https://lifeforms.nl/pgp

Jilles Tjoelker

2014-Dec-14 18:37 UTC

System hang on shutdown when running freebsd-update

On Wed, Dec 10, 2014 at 09:48:18PM +0100, Walter Hop wrote:
> root@current:~ # chflags noschg /sbin/init
> root@current:~ # cp -Rp /sbin/init /sbin/init2
> lock order reversal:
> 1st 0xfffffe007b842fa0 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:3093
> 2nd 0xfffff80002b9ea00 dirhash (dirhash) @ /usr/src/sys/ufs/ufs/ufs_dirhash.c:284
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe000025c270
> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe000025c320
> witness_checkorder() at witness_checkorder+0xdad/frame 0xfffffe000025c3b0
> _sx_xlock() at _sx_xlock+0x75/frame 0xfffffe000025c3f0
> ufsdirhash_add() at ufsdirhash_add+0x3a/frame 0xfffffe000025c430
> ufs_direnter() at ufs_direnter+0x6a0/frame 0xfffffe000025c4f0
> ufs_makeinode() at ufs_makeinode+0x560/frame 0xfffffe000025c6a0
> VOP_CREATE_APV() at VOP_CREATE_APV+0xf1/frame 0xfffffe000025c6d0
> vn_open_cred() at vn_open_cred+0x29d/frame 0xfffffe000025c820
> kern_openat() at kern_openat+0x26f/frame 0xfffffe000025c9a0
> amd64_syscall() at amd64_syscall+0x25a/frame 0xfffffe000025cab0
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe000025cab0
> --- syscall (5, FreeBSD ELF64, sys_open), rip = 0x80094f01a, rsp = 0x7fffffffe958, rbp = 0x7fffffffe9c0 ---
>
> Screenshot is here: http://lf.ms/current-r273635-hang-1.png
>
> Finally, when rebooting, another lock order reversal appears and the
> system hangs. I don't have a text log of this, so I'll copy the first
> few lines:
>
> Syncing disks, vnodes remaining...1 0 0 done
> All buffers synced.
> lock order reversal:
>  1st 0xfffff80002e65d50 ufs (ufs) @ /usr/src/sys/kern/vfs_mount.c:1223
>  2nd 0xfffff80002e665f0 devfs (devfs) @ /usr/src/sys/kern/vfs_subr.c:2144
>
> Screenshot is here: http://lf.ms/current-r273635-hang-2.png
>
> I don't have kernel hacking experience, but these source files look
> awfully related to the parts that we are having problems with.
>
> I would really love some research into this and possibly an errata for
> 10.1. What can we do to make this actionable?

Both of these LORs are false positives. There is no mechanism in WITNESS
to suppress them properly.

I cannot reproduce the problem (VirtualBox, stable/10 amd64 and head
i386), so apparently there is something special about some users'
environments that causes this.

-- 
Jilles Tjoelker
