Alan Somers
2015-Nov-06 16:00 UTC
unable to boot a healthy zfs pool: all block copies unavailable
On Thu, Nov 5, 2015 at 10:53 PM, Eugene M. Zheganin <emz at norma.perm.ru> wrote:
> Hi.
>
> On 06.11.2015 02:58, Andriy Gapon wrote:
>>
>> It could be that your BIOS is not able to read past 1TB (512 * INT_MAX). That
>> seems to be a rather common problem for consumer motherboards.
>> Here is an example of how it looked for me:
>> https://people.freebsd.org/~avg/IMAG1099.jpg
>> Fortunately, it wasn't a root pool that got the error.
> Mine looks way different: yours shows the pool info, mine shows a 'BTX
> halted' message: http://zhegan.in/files/cannot-read-MOS.jpg . I'm
> running the latest BIOS for this motherboard (Gigabyte Z77P-D3, updated
> yesterday; still, it is only from 2012). If it's still a BIOS-related
> bug, what workaround can I use - reslice the disk and create the root
> pool inside the first TB, right?
>
> Thanks.
> Eugene.

I notice that my 10.2-RELEASE VM prints the same message about "all
block copies unavailable" and then continues to boot just fine. So I
wonder if that part is just a red herring.

There is another possibility here: I have seen a bug where ZFS attempts
to open the root pool's vdevs by path (e.g. ada0p3) but can't find them
because disks have been replaced and no longer have their old devnames.
So vdev_geom searches through the list of geom providers looking for any
provider with the correct ZFS GUID. Normally it would find the right
devname (e.g. ada1p3). But sometimes, because the disks are partitioned,
it will find the wrong provider first (e.g. the whole disk, ada1). Since
ZFS has labels at both the beginning and the end of each vdev, vdev_geom
will see the label at the end of ada1 (really, it's the label at the end
of ada1p3, but it shares the same LBA that a label at the end of ada1
would) and think that it opened ada1 successfully. vdev_geom_open will
then return, and some time later another part of ZFS will fail to read
the MOS, and your boot will fail.

If this is the case, then there are three possible solutions:

1) Fix vdev_geom. I'm currently testing a patch to do just that.

2) With power off, shuffle disks around until the boot disks have the
   same devnames that they had the last time you successfully booted.
   If this is a SATA-only computer, swapping cables between different
   mobo ports should be enough.

3) Boot from your USB stick and carefully (oh so carefully!) erase the
   ZFS labels at the end of the boot disk. Don't touch the labels at
   the beginning. If your boot pool is mirrored, it should be
   sufficient to erase the labels on one disk only.

-Alan
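
To illustrate the label overlap described above, here is a minimal Python sketch (not taken from the ZFS sources). It assumes the usual on-disk layout: four 256 KiB labels per vdev, two at the front and two at the end, with the device size rounded down to a multiple of the label size. The disk size, the partition start, the 33-sector backup GPT, and the helper name label_offsets are hypothetical values chosen for illustration only.

```python
LABEL_SIZE = 256 * 1024   # assumed size of one vdev label, in bytes
NUM_LABELS = 4            # two at the front, two at the end
SECTOR = 512              # assumed sector size, in bytes


def label_offsets(start, size):
    """Absolute byte offsets of the four labels of a vdev.

    start: byte offset of the provider on the physical disk
    size:  size of the provider in bytes
    """
    # The usable size is rounded down to a multiple of the label size.
    psize = size - size % LABEL_SIZE
    offsets = []
    for index in range(NUM_LABELS):
        off = index * LABEL_SIZE
        if index >= NUM_LABELS // 2:
            # The last two labels sit at the end of the aligned area.
            off += psize - NUM_LABELS * LABEL_SIZE
        offsets.append(start + off)
    return offsets


# Hypothetical geometry: a disk of 3907029168 sectors whose last
# partition starts at sector 8390656 and runs up to the backup GPT
# (assumed to occupy the final 33 sectors of the disk).
disk_sectors = 3907029168
part_start = 8390656
part_sectors = disk_sectors - 33 - part_start

disk_labels = label_offsets(0, disk_sectors * SECTOR)
part_labels = label_offsets(part_start * SECTOR, part_sectors * SECTOR)

print("whole disk:", disk_labels)
print("partition: ", part_labels)
print("trailing labels coincide:", disk_labels[2:] == part_labels[2:])
```

With this geometry the two trailing labels of the whole disk and of the partition land on the same absolute offsets, while the two leading labels do not. A probe that reads a trailing label on the whole-disk provider would therefore find the partition's label there, which is consistent with the behaviour Alan describes. Other geometries may not overlap this way.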
Andriy Gapon
2015-Nov-06 22:40 UTC
unable to boot a healthy zfs pool: all block copies unavailable
On 06/11/2015 18:00, Alan Somers wrote:
> I notice that my 10.2-RELEASE VM prints the same message about "all
> block copies unavailable" and then continues to boot just fine.

Is that on a system with only one ZFS pool or are there more of them?

-- 
Andriy Gapon
Eugene M. Zheganin
2015-Nov-09 07:32 UTC
unable to boot a healthy zfs pool: all block copies unavailable
Hi.

On 06.11.2015 21:00, Alan Somers wrote:
> I notice that my 10.2-RELEASE VM prints the same message about "all
> block copies unavailable" and then continues to boot just fine. So I
> wonder if that part is just a red herring. There is another possibility
> here: I have seen a bug where ZFS attempts to open the root pool's
> vdevs by path (e.g. ada0p3) but can't find them because disks have been
> replaced and no longer have their old devnames. So vdev_geom searches
> through the list of geom providers looking for any provider with the
> correct ZFS GUID. Normally it would find the right devname (e.g.
> ada1p3). But sometimes, because the disks are partitioned, it will
> find the wrong provider first (e.g. the whole disk, ada1). Since ZFS
> has labels at both the beginning and the end of each vdev, vdev_geom
> will see the label at the end of ada1 (really, it's the label at the
> end of ada1p3, but it shares the same LBA that a label at the end of
> ada1 would) and think that it opened ada1 successfully. vdev_geom_open
> will then return, and some time later another part of ZFS will fail to
> read the MOS, and your boot will fail.
>

You are talking here about gptzfsboot not being smart enough, right?
The kernel itself is able to find that pool once it has been booted
from an alternative source. So it's a gptzfsboot issue, right?

Eugene.