Hello.
On 28.04.2018 17:46, Willem Jan Withagen wrote:
> Hi,
>
> I upgraded a server from 10.4 to 11.1 and now all of a sudden the
> server complains about:
> ZFS: Can't find pool by guid
> And I end up in the boot prompt:
>
> lsdev gives disk0, with on p1 the partition that the zroot is/was.
>
> This is an active server, so redoing the install and stuff is not
> going to be really workable....
>
> So how do I get this to boot?
The basic scenario for this is having a "shadow" pool on the bootable
disks alongside the actual root pool. For example, you once had a zfs
pool on some disks used in whole-disk ("dedicated") mode, then you
pulled those disks without clearing the zpool labels ('zpool destroy'
never clears the labels) and installed the system onto them. In that
case 'zpool import' will show the old pool, which has no live replicas
and no live vdevs. The system on the disks may well be bootable (and
probably will be) until the data gets redistributed in some way; after
that gptzfsboot will start to see the remains of the old pool, try to
detect whether that pool has a bootfs on it, find no valid pool there,
fail with an error and stop. The newer 11.2 gptzfsboot loader handles
this better: it clearly states which pool it found and reports the
error - thanks to all the guys who did great work on this, seriously.
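From a working system you can check whether any such leftover pools
are visible; a minimal sketch (plain 'zpool import' scans devices for
importable pools that are not currently imported, and '-D' also lists
pools that were marked destroyed):

    # list pools found on devices but not currently imported
    zpool import
    # also include pools that were 'zpool destroy'ed
    zpool import -D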
The way to resolve this is to detach the disks from the root pool one
at a time (or offline them in the case of raidz), run 'zpool
labelclear' on each (please keep in mind that 'labelclear' is evil and
ignorant, and breaks things including the GPT table), attach them
back, resilver, and repeat until 'zpool import' shows no old
disassembled pools.
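A sketch of one round of that loop, assuming a two-way mirror zroot on
hypothetical devices ada0p3 and ada1p3 (adjust pool, disk and
partition names to your layout; the GPT backup step is there because
labelclear can destroy the partition table):

    gpart backup ada1 > /root/ada1.gpt    # save the partition table
    zpool detach zroot ada1p3             # drop one side of the mirror
    zpool labelclear -f /dev/ada1p3       # wipe the ZFS labels
    gpart restore ada1 < /root/ada1.gpt   # recreate GPT if it was damaged
    zpool attach zroot ada0p3 ada1p3      # re-add the disk, resilver starts
    zpool status zroot                    # wait until the resilver completes
    zpool import                          # check whether the stale pool is gone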
Determining which disks have the old labels can be done with 'zdb -l
/dev/<disk> | grep name:'.
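For example, with a hypothetical device name (zfs keeps four copies of
the label per vdev, so matching lines may repeat):

    zdb -l /dev/ada1p3 | grep name:
    # a pool name other than your root pool means stale labels remain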
I understand that your situation was resolved long ago; I'm writing
this merely to establish a knowledge point in case someone else steps
on this too, like I did yesterday.
Eugene.