David Wolfskill
2016-Oct-30 18:20 UTC
(Circumvented) insta-panic from "pkg upgrade" stable/11 @r308090
Summary: I've worked around this -- at least, for now -- but a process
I've been using every Sunday since July 2015 on a pair of machines
suddenly failed this morning (on just one of the machines).

For background (if you're interested):
* <http://www.catwhisker.org/~david/FreeBSD/upgrade.html>
* <http://www.catwhisker.org/~david/FreeBSD/convert_i386_amd64.html>
* <http://www.catwhisker.org/~david/FreeBSD/history/>

So... this morning, the update from:

FreeBSD albert.catwhisker.org 11.0-STABLE FreeBSD 11.0-STABLE #95  r307797M/307819:1100505: Sun Oct 23 03:52:44 PDT 2016     root at freebeast.catwhisker.org:/common/S1/obj/usr/src/sys/ALBERT  amd64

to:

FreeBSD albert.catwhisker.org 11.0-STABLE FreeBSD 11.0-STABLE #102  r308090M/308101:1100506: Sun Oct 30 04:09:05 PDT 2016     root at freebeast.catwhisker.org:/common/S1/obj/usr/src/sys/ALBERT  amd64

Just Worked -- as usual.

I rebooted both "production" machines, logged in, fired up tmux on
each, rotated my typescript files, then fired up script and ran the
csh command alias I use on both machines to update the installed ports
(from the locally-built packages that reside on my build machine; the
production machines access them via NFS).

For one of the machines ("bats"), things Just Worked (again).

For the other ("albert"), I lost contact.  Eventually (after I actually
got up and went to the room where the machines are), I found that it
had rebooted.

Further experimentation showed that in the command sequence:

	mount -u -w / && \
	mount -u -w /usr && \
	( cd /etc/mail && make stop-mta ) && \
	service dovecot stop && \
	service apache24 stop && \
	pkg upgrade

it got through "service apache24 stop" OK, but when I issued "pkg
upgrade" -- the screen blanked, and the machine started rebooting.

On reboot, the /var file system (UFS2+soft updates) showed the
typescript files from before the above efforts -- not even the
"rotation" (mentioned above) was reflected.  (The typescript files in
question reside in /var/tmp on the machine.)
Oh: and the initial "fsck -p" for /var indicated that fsck needed to be
re-run (so when I booted to single-user mode, I did just that).

There was no hint in the logs of why the reboot (panic?) occurred.

One point that may be at issue is that for bats (where things still
worked), I manually mount the package repository from the build machine
to bats:/mnt, while for albert (where things failed), I have depended
on autofs to handle the mounting as needed (since I need albert to run
autofs anyway, and bats does not).  E.g.:

bats(11.0-S)[1] cat /usr/local/etc/pkg/repos/custom.conf
custom: {
#  url: file:///net/freebeast/tank/poudriere/poudriere/data/packages/11amd64-ports-home
  url: file:///mnt
  enabled: yes,
}
bats(11.0-S)[2]

vs.:

albert(11.0-S)[10] cat /usr/local/etc/pkg/repos/custom.conf
custom: {
  url: file:///net/freebeast/tank/poudriere/poudriere/data/packages/11amd64-ports-home
  enabled: yes,
}
albert(11.0-S)[11]

In the process of finally(!) getting albert's "pkg upgrade" working, I
did 2 things differently:

* I did not run under tmux.  I can't imagine that this contributed, but
  I cite it for completeness.

* Prior to invoking "pkg upgrade", I issued
  "ls /net/freebeast/tank/poudriere/poudriere/data/packages/11amd64-ports-home"
  (and got a sane result), so the mount was satisfied prior to
  "pkg upgrade" being run.

I note, too, that one of the times I logged in to albert, the login
seemed to "hang" for a while.  When I hit ^T (several times), it was
apparent that the process was trying to use autofs to mount my home
directory (from the FreeNAS box, "grundoon")... and that effort timed
out -- I ended up with the whine about inability to find my home
directory.  And then when I logged out & back in again,
/net/grundoon/mnt/tank/homedirs showed up 3 times in the output of
"df".

So perhaps there's something involving autofs and timing... though that
doesn't seem like much to go on.

Peace,
david
-- 
David H. Wolfskill				david at catwhisker.org
Those who would murder in the name of God or prophet are blasphemous cowards.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.
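
For reference, a minimal sh sketch of the workaround described above:
list the repository directory first, so that autofs has satisfied the
mount before "pkg upgrade" touches it.  This is an illustration under
stated assumptions, not the alias actually in use: the function name
repo_ready and the retry count and delay are hypothetical; only the
repository path comes from albert's custom.conf.  It does not address
whatever caused the reboot; it only orders the steps.

```shell
#!/bin/sh
# Sketch only. REPO_DIR is the path from albert's custom.conf; the
# function name, retry count and delay are hypothetical.
REPO_DIR="${REPO_DIR:-/net/freebeast/tank/poudriere/poudriere/data/packages/11amd64-ports-home}"

# repo_ready DIR [TRIES] [DELAY]
# Return 0 once DIR can be listed and is non-empty; otherwise retry up
# to TRIES times, sleeping DELAY seconds between attempts, then return 1.
repo_ready() {
    dir="$1"
    tries="${2:-5}"
    delay="${3:-2}"
    while [ "$tries" -gt 0 ]; do
        # Listing the directory is what triggers the autofs mount.
        if [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Intended use (not executed here), as the last step of the sequence:
#   repo_ready "$REPO_DIR" && pkg upgrade
```

Checking for a non-empty listing, rather than only for the directory's
existence, is a deliberate choice in this sketch: an autofs mount point
can exist while the mount behind it has not yet completed (or has timed
out).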