On Thu, Jan 17, 2019 at 4:21 AM Pete French <petefrench at ingresso.co.uk> wrote:

> so, having got a workaround for yesterday's problems, I now went to
> upgrade my other pair of boxes using CARP. No 'pf' on these, just one
> shared address. This is the setup I have tested in development and it
> works fine.
>
> I install the new kernel and do the first reboot - and I get the panic
> below. Maybe it's not carp related, but seems suspicious as the last
> thing it spits out is a carp message.
>
> Note this is the first reboot - so 12.0 kernel with the 11 userland, in
> preparation for the installworld step. Machine is booting from ZFS and
> also has a GELI partition containing some data which requires a manual
> password.
>

To point out the obvious, booting a 12.0 kernel with 11 userland to
multiuser mode is seriously unsupported. You really need to boot to
single user and install 12.0 userland to really expect things to work.

OTOH, while I would expect that MANY things might not work, panics should
not come from problems in userland. Still, I would not be at all shocked
if it turns out to be coincidental to CARP. I would also not be shocked if
this makes no difference, but even between minor version updates, there
can be issues when the kernel and userland are different versions.

Is there a reason that a standalone boot is not possible? Both installworld
and mergemaster should be run before moving to multiuser mode, and a reboot
is preferred to simply exiting the single-user shell. In some cases even
delete-old can be required before safely going to multiuser mode.
--
Kevin Oberman, Part time kid herder and retired Network Engineer
E-mail: rkoberman at gmail.com
PGP Fingerprint: D03FB98AFA78E3B78C1694B318AB39EF1B055683
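
The install order described above can be sketched roughly as follows. This is only an illustration, not a tested procedure for this machine: it assumes the sources live in /usr/src, that world and kernel were already built and the new kernel installed, and that root is on ZFS. The step helper and the function name are made up for the example. By default nothing is executed; each command is only printed.

```shell
#!/bin/sh
# Sketch of a single-user installworld sequence (assumptions: /usr/src,
# ZFS root, world already built). Dry run by default: set RUN=1 to execute.

step() {
    printf '+ %s\n' "$*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@" || exit 1
    fi
}

install_world_single_user() {
    step mount -u /          # remount root read-write
    step zfs mount -a        # ZFS assumed; on UFS this would be: mount -a -t ufs
    step adjkerntz -i        # only matters if the CMOS clock runs on local time
    step cd /usr/src
    step mergemaster -Fp     # pre-installworld pass
    step make installworld
    step mergemaster -Fi     # merge the new configuration files
    step make delete-old     # remove obsolete files (prompts for confirmation)
    step shutdown -r now     # reboot instead of exiting to multiuser
}

install_world_single_user
```

Running it as is prints the nine steps in order, which makes it easy to check the plan before setting RUN=1 from the single-user shell.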
> To point out the obvious, booting a 12.0 kernel with 11 userland to
> multiuser mode is seriously unsupported. You really need to boot to
> single user and install 12.0 userland to really expect things to work.

Yes, good point. This has worked on every other machine I have upgraded
from 11 to 12, which is why I didn't think of that, but then all the
motherboards are slightly different.

> Is there a reason that a standalone boot is not possible?

Sort of - I am on a serial console to do this, which works in the BIOS,
and works after the kernel has started booting, but does not work in the
loader for some reason, so I can't select single user. So I go to single
user by booting multi user and then shutting down. Of course I could use
nextboot, so it's just laziness on my part actually.

Thanks for pointing this out, I immediately jumped to the CARP conclusion
due to last week's experiences on the other machine, but actually this is
far more likely to be the issue.

-pete.

PS: apparently I have been playing fast and loose with this - and bothering
the mailing list about it - since 2005... :-)

http://freebsd.1045724.x6.nabble.com/upgrading-5-4-gt-6-0-without-reinstalling-safe-td3932902.html

Time to change my ways I think!
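
For reference, the nextboot route mentioned above would look something like the sketch below, which avoids needing the loader menu on the serial console. The kernel name "kernel" is an assumption (the default directory under /boot), and the step helper is made up for the example. It is worth reading nextboot(8) before relying on this with a ZFS root, since the one-shot reset may not behave the same way there as on UFS. Dry run by default.

```shell
#!/bin/sh
# Sketch: request a one-time single-user boot via nextboot(8).
# Dry run by default: set RUN=1 to actually execute the commands.

KERNEL_NAME="${KERNEL_NAME:-kernel}"   # assumption: default kernel under /boot

step() {
    printf '+ %s\n' "$*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@" || exit 1
    fi
}

single_user_next_boot() {
    step nextboot -o "-s" -k "$KERNEL_NAME"   # -o passes boot flags, -s is single user
    step shutdown -r now
}

single_user_next_boot
```

If the request needs to be withdrawn before rebooting, nextboot -D removes the pending configuration.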
> > To point out the obvious, booting a 12.0 kernel with 11.0 userland to
> > multiuser mode is seriously unsupported. You really need to boot to
> > single user and install 12.0 userland to really expect things to work.
>
> Yes, good point. This has worked on every other machine I have upgraded
> from 11 to 12, which is why I didn't think of that, but then all the
> motherboards are slightly different.

So, I went back to this, and did it properly. Booted into single user mode,
which worked, then ran installworld and mergemaster, and rebooted single
user mode.

...and I get a kernel panic as I did before. So it wasn't the 11 world with
the 12 kernel after all. Panic is reproduced below. I am somewhat stuck now
though - where do I go from here?

Feeding entropy
lo0: link state changed to UP
carp: demoted by 240 to 240 (interface down)

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x28
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80ca1621
stack pointer           = 0x28:0xfffffe00004da740
frame pointer           = 0x28:0xfffffe00004da760
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (swi4: clock (0))
trap number             = 12
panic: page fault
cpuid = 0
time = 1549292394
KDB: stack backtrace:
#0 0xffffffff80be8d57 at kdb_backtrace+0x67
#1 0xffffffff80b9d293 at vpanic+0x1a3
#2 0xffffffff80b9d0e3 at panic+0x43
#3 0xffffffff8107384f at trap_fatal+0x35f
#4 0xffffffff810738a9 at trap_pfault+0x49
#5 0xffffffff81072ece at trap+0x29e
#6 0xffffffff8104ee55 at calltrap+0x8
#7 0xffffffff80ca1526 at ether_output+0x6b6
#8 0xffffffff80d0c824 at arprequest+0x4c4
#9 0xffffffff80d0e47c at garp_rexmit+0xbc
#10 0xffffffff80bb7169 at softclock_call_cc+0x129
#11 0xffffffff80bb7649 at softclock+0x79
#12 0xffffffff80b613a4 at ithread_loop+0x1d4
#13 0xffffffff80b5e2d2 at fork_exit+0x82
#14 0xffffffff8104fe3e at fork_trampoline+0xe
Uptime: 11s
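
One possible next step is to turn the addresses in the panic into function names and source lines. Frame #7 puts the start of ether_output at 0xffffffff80ca0e70 (0xffffffff80ca1526 minus 0x6b6), so the faulting instruction pointer sits 0x7b1 bytes past that, and the fault address 0x28 looks like a field read through a NULL pointer, though that is a guess. The sketch below assumes a debug kernel matching the panicking one is installed at /usr/lib/debug/boot/kernel/kernel.debug. That path and the step helper are assumptions for the example. Dry run by default.

```shell
#!/bin/sh
# Sketch: resolve panic addresses with addr2line(1).
# Dry run by default: set RUN=1 to actually run the lookup.

KERNEL_DEBUG="${KERNEL_DEBUG:-/usr/lib/debug/boot/kernel/kernel.debug}"  # assumed path
FAULT_IP="0xffffffff80ca1621"    # "instruction pointer" from the trap output
CALLER_RA="0xffffffff80ca1526"   # frame #7, ether_output+0x6b6

step() {
    printf '+ %s\n' "$*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@" || exit 1
    fi
}

resolve_panic_addrs() {
    # -f prints the enclosing function name, -e selects the ELF file to read
    step addr2line -f -e "$KERNEL_DEBUG" "$FAULT_IP" "$CALLER_RA"
}

resolve_panic_addrs
```

The output is only meaningful if the debug file was built from exactly the same kernel that panicked.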