Willem Jan Withagen
2020-Oct-14 10:00 UTC
12.2-RC2 crashing on leftover ZFS cache/log on SSD
Hi,

I managed to get myself into a somewhat weird situation on a new Epyc server. It has 2x 4T spinning HDDs and 2x Intel SSDs.

This is what I did:
 - Installed 12.1-RELEASE from a USB stick
   - Put the zroot on the 2 HDDs in a mirror
 - Downloaded 12/stable, compiled and installed that.
 - Then I added cache and log from the SSDs (2 gparts)
   to the zroot pool: caches as a stripe and log as a mirror.

That worked fine. Only then I found out that I had stable instead of RC2, which I actually wanted, since upgrading with freebsd-update is a lot easier once 12.2-RELEASE is out.

So I fetched 12.2-RC2 and installed that on the 2 rusty spinners. But now the SSDs have leftover ZFS stuff on them that panics 12.2-RC2.

The simple solution is to remove the SSDs from their trays, boot, and reinsert the SSDs. That works, and I'll be able to clean the SSDs and redo the cache and log stuff.

There are 2 things with the panic:
 - It flies by so fast that I had to film it to actually get the panic message.
     no kernel debugger
     no 30/60 secs before autoboot
   Is this a setting that has changed?
   debug.debugger_on_panic=1 does not seem to work....
 - The panic itself:
     panic: solaris assert: nvlist_lookup_uint64(configs[i], ZPOOL_CONFIG_POOL_TXG, &txg) == 0,
       file /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c
       line 5636

Now this could be a case of wrong order, where 12/stable has put things in the pool that 12.2-RC2 does not like, and I should just reuse the SSDs. Or it is a real bug, and somebody wants to take a look at it? If we get into the debugger we could look at more data....

I have a picture of the stack trace, and will leave the system as it is for a day. If there are no takers, then I'll fix things tomorrow and put the system into production.

--WjW
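
[Editorial sketch, not part of the original message.] The "clean the SSDs" step mentioned above could plausibly be done by clearing the stale ZFS labels from the former cache/log partitions with zpool labelclear. The script below is only a dry run: it prints the commands rather than executing them. The partition names (ada2p1 and so on) and the helper names labelclear_cmd and plan_cleanup are assumptions for illustration; the post does not name the devices.

```shell
#!/bin/sh
# Dry-run sketch: print the commands that would wipe leftover ZFS labels
# from partitions that used to be cache/log vdevs.
# The device names used here are hypothetical, not taken from the post.

# Build the labelclear command for a single partition name.
labelclear_cmd() {
    printf 'zpool labelclear -f /dev/%s\n' "$1"
}

# Print one labelclear command per partition given as an argument.
plan_cleanup() {
    for part in "$@"; do
        labelclear_cmd "$part"
    done
}

# Example: two SSDs with two partitions each (assumed layout).
plan_cleanup ada2p1 ada2p2 ada3p1 ada3p2
```

Review the printed commands against the output of gpart show before running any of them by hand, since labelclear destroys the ZFS label on whatever device it is pointed at.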