Kyle Evans
2018-Apr-17 14:49 UTC
amd64 kernel crash introduced between 20180329 & 20180408
On Tue, Apr 17, 2018 at 9:44 AM, Dan Allen <danallen46 at gmail.com> wrote:
> I run FreeBSD 11-STABLE on actual machines, and I build the system every few days. Things have been fine.
>
> However, I also run FreeBSD 11 via the qemu emulator on my Mac. I run lots of different BSD & Linux OSes here to test them out. I have been running the same binary of qemu-system-x86_64 v1.2 for six years. It runs great.
>
> Then recently this happened:
>
> This snapshot dated 20180329, after doing a fresh install, runs fine:
>
> https://download.freebsd.org/ftp/snapshots/ISO-IMAGES/11.1/FreeBSD-11.1-STABLE-amd64-20180329-r331742-disc1.iso
>
> I can run pkg install and begin adding stuff to the system and life is good.
>
> BUT
>
> This snapshot dated 20180408, after doing a fresh install, will crash when running pkg install:
>
> https://download.freebsd.org/ftp/snapshots/ISO-IMAGES/11.1/FreeBSD-11.1-STABLE-amd64-20180408-r332308-disc1.iso
>
> It crashes about 90% of the way through updating the pkg snapshot. It does not matter what pkg you try and install.
>
> However, the latest release in the i386 flavor works fine on qemu:
>
> https://download.freebsd.org/ftp/snapshots/ISO-IMAGES/11.1/FreeBSD-11.1-STABLE-i386-20180412-r332428-disc1.iso
>
> So sometime between March 29th & April 8th, in amd64 boot code, I believe the problem was introduced.

As "the guy most likely to have broken boot code in stable," may I ask what leads you specifically to amd64 boot code? Mostly curious if there's something beyond "i386 works well" that led you to this conclusion.

> I cannot debug the crash, because it does a kernel dump, and then when the system reboots, almost anything again triggers a kernel crash and it reboots again and again: no chance to inspect a mini dump or whatever.

When you say it crashes and does a kernel dump, you're landing at a ddb prompt, yeah? What does executing bt at that prompt look like?

> I wish I had more to go on, but I am happy to off list work with anyone that wants to pursue this, by testing out stuff or answering more questions.

> On 17 Apr 2018, at 8:49 AM, Kyle Evans <kevans at freebsd.org> wrote:
>
> As "the guy most likely to have broken boot code in stable," may I ask
> what leads you specifically to amd64 boot code? Mostly curious if
> there's something beyond "i386 works well" that led you to this
> conclusion.

It is partly just a hunch. I installed 11.0 for use with qemu a while ago. I did binary upgrades for patches using freebsd-update. When 11.1 came out, it would not work correctly, again with the same kind of behavior. Then, I got some later snapshots that worked again, notably the 20180329 build. When the next snapshot came out, things broke. I also tried my own builds, same story. I even got both source trees together - 20180329 and 20180408 - and did a diff on the entire trees, and I noticed activity in the boot & kernel code. It could just as likely be something in the kernel as well, but none of this happens with the i386 build.

> When you say it crashes and does a kernel dump- you're landing at a
> ddb prompt, yeah? What does executing bt at that prompt look like?

No, I am not ever given a prompt. I get to watch a mini-dump happen and then an automatic reboot. It is a kernel panic. Here is what I see: