CC'ing Alexander Motin who committed the change.
20.07.2019 1:21, Garrett Wollman wrote:
> I recently upgraded several file servers from 11.2 to 11.3. All of
> them boot from a ZFS pool called "tank" (the data is in a different
> pool). In a couple of instances (which caused me to have to take a
> late-evening 140-mile drive to the remote data center where they are
> located), the servers crashed at the root mount phase. In one case,
> it bailed out with error 5 (I believe that's [EIO]) to the usual
> mountroot prompt. In the second case, the kernel panicked instead.
>
> The root cause (no pun intended) on both servers was a disk which was
> supplied by the vendor with a label on it that claimed to be part of
> the "tank" pool, and for some reason the 11.3 kernel was trying to
> mount that (faulted) pool rather than the real one. The disks and
> pool configuration were unchanged from 11.2 (and probably 11.1 as
> well) so I am puzzled.
>
> Other than laboriously running "zpool labelclear -f /dev/somedisk" for
> every piece of media that comes into my hands, is there anything else
> I could have done to avoid this?
Both the 11.3-RELEASE announcement and the Release Notes mention this:
> The ZFS filesystem has been updated to implement parallel mounting.
I strongly suggest reading the release documentation at the very least when
trouble shows up after an upgrade. Better yet, read it *before* updating.
I suspect this parallelism created a race in your case.
Unfortunately, the way to fall back to sequential mounting appears to be
undocumented: libzfs checks whether the ZFS_SERIAL_MOUNT environment
variable exists, with any value.
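For a userland mount of all datasets you can just set it in the
environment. A minimal sketch; any value should work, since libzfs only
checks that the variable exists:

    # force the pre-11.3 sequential mount order; the value is ignored
    env ZFS_SERIAL_MOUNT=1 zfs mount -a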
I'm not sure how to set it for mounting root; it may be read via kenv,
so try adding this to /boot/loader.conf:
ZFS_SERIAL_MOUNT=1
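After a reboot you can at least verify that the loader placed it in the
kernel environment (whether the root-mount code honors it there is the
open question):

    # show the variable as seen by the kernel environment
    kenv ZFS_SERIAL_MOUNT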
Alexander should have more knowledge on this.
And of course, attaching an unrelated device whose label conflicts with
the root pool is asking for trouble. Relabel it ASAP.
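Something along these lines for every disk before it goes into a server;
just a sketch, and da9 is a placeholder device name:

    # inspect any stale ZFS labels the vendor left on the disk
    zdb -l /dev/da9
    # wipe them if the disk is not part of any pool you care about
    zpool labelclear -f /dev/da9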