Eugene M. Zheganin
2015-Nov-06 05:53 UTC
unable to boot a healthy zfs pool: all block copies unavailable
Hi.

On 06.11.2015 02:58, Andriy Gapon wrote:
> It could be that your BIOS is not able to read past 1TB (512 * INT_MAX). That
> seems to be a rather common problem for consumer motherboards.
> Here is an example of how it looked for me:
> https://people.freebsd.org/~avg/IMAG1099.jpg
> Fortunately, it wasn't a root pool that got the error.

Mine looks quite different: yours shows the pool info, mine shows a 'BTX halted' message: http://zhegan.in/files/cannot-read-MOS.jpg . I'm running the latest BIOS for this motherboard (Gigabyte Z77P-D3, updated yesterday, though the BIOS itself still dates from 2012). If this is still the BIOS-related bug, what workaround can I use? Reslice the disk and create the root pool inside the first TB, right?

Thanks.
Eugene.
Alan Somers
2015-Nov-06 16:00 UTC
unable to boot a healthy zfs pool: all block copies unavailable
On Thu, Nov 5, 2015 at 10:53 PM, Eugene M. Zheganin <emz at norma.perm.ru> wrote:
> Hi.
>
> On 06.11.2015 02:58, Andriy Gapon wrote:
>>
>> It could be that your BIOS is not able to read past 1TB (512 * INT_MAX). That
>> seems to be a rather common problem for consumer motherboards.
>> Here is an example of how it looked for me:
>> https://people.freebsd.org/~avg/IMAG1099.jpg
>> Fortunately, it wasn't a root pool that got the error.
> Mine looks quite different: yours shows the pool info, mine shows a 'BTX
> halted' message: http://zhegan.in/files/cannot-read-MOS.jpg . I'm
> running the latest BIOS for this motherboard (Gigabyte Z77P-D3, updated
> yesterday, though the BIOS itself still dates from 2012). If this is still
> the BIOS-related bug, what workaround can I use? Reslice the disk and
> create the root pool inside the first TB, right?
>
> Thanks.
> Eugene.

I notice that my 10.2-RELEASE VM prints the same message about "all block copies unavailable" and then continues to boot just fine. So I wonder if that part is just a red herring.

There is another possibility here: I have seen a bug where ZFS attempts to open the root pool's vdevs by path (e.g. ada0p3) but can't find them because disks have been replaced and no longer have their old devnames. So vdev_geom searches through the list of geom providers looking for any provider with the correct ZFS GUID. Normally it would find the right devname (e.g. ada1p3). But sometimes, because the disks are partitioned, it will find the wrong provider first (e.g. the whole disk ada1). Since ZFS has labels at both the beginning and the end of each vdev, vdev_geom will see the label at the end of ada1 (really, it's the label at the end of ada1p3, but it shares the same LBA that a label at the end of ada1 would) and think that it opened ada1 successfully. vdev_geom_open will then return, and at some later point another part of ZFS will fail to read the MOS, and your boot will fail.

If this is the case, then there are three possible solutions:

1) Fix vdev_geom. I'm currently testing a patch to do just that.
2) With power off, shuffle disks around until the boot disks have the same devnames that they had the last time you successfully booted. If this is a SATA-only computer, swapping cables between different mobo ports should be enough.
3) Boot from your USB stick and carefully (oh so carefully!) erase the ZFS labels at the end of the boot disk. Don't touch the labels at the beginning. If your boot pool is mirrored, it should be sufficient to erase the labels on one disk only.

-Alan
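
To illustrate why a label at the end of ada1p3 can be mistaken for one at the end of ada1, here is a minimal Python sketch of where ZFS puts its four vdev labels. It assumes the usual on-disk layout (four 256 KiB labels, two at the front and two at the back, with the device size rounded down to a multiple of the label size). The function names and the disk and partition sizes are made up for the example and do not come from the thread.

```python
KIB = 1024
MIB = 1024 * KIB

LABEL_SIZE = 256 * KIB   # assumed size of one ZFS vdev label
LABEL_COUNT = 4          # assumed: two labels at the front, two at the back


def label_offsets(device_size):
    """Return the byte offsets of the four labels, relative to the device start."""
    # ZFS only uses a whole number of label-sized chunks of the device.
    usable = device_size - device_size % LABEL_SIZE
    if usable < LABEL_COUNT * LABEL_SIZE:
        raise ValueError("device too small to hold four labels")
    front = [i * LABEL_SIZE for i in range(LABEL_COUNT // 2)]
    back = [usable - (LABEL_COUNT - i) * LABEL_SIZE
            for i in range(LABEL_COUNT // 2, LABEL_COUNT)]
    return front + back


def trailing_labels_on_disk(start, size):
    """Absolute disk offsets of the two trailing labels of a provider."""
    return [start + off for off in label_offsets(size)[2:]]


# Hypothetical sizes: a small disk whose last partition runs almost to the end,
# leaving a little room for the backup partition table.
disk_size = 1000 * MIB + 100 * KIB
part_start = 1 * MIB
part_size = 999 * MIB + 50 * KIB

whole_disk = trailing_labels_on_disk(0, disk_size)
partition = trailing_labels_on_disk(part_start, part_size)
print(whole_disk == partition)
```

With these hypothetical sizes, the two trailing labels of the partition land on exactly the same disk offsets that a whole-disk vdev's trailing labels would occupy, which is the confusion described above. The front labels do not coincide, because the partition starts 1 MiB into the disk.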
Andriy Gapon
2015-Nov-06 22:46 UTC
unable to boot a healthy zfs pool: all block copies unavailable
On 06/11/2015 07:53, Eugene M. Zheganin wrote:
> Hi.
>
> On 06.11.2015 02:58, Andriy Gapon wrote:
>>
>> It could be that your BIOS is not able to read past 1TB (512 * INT_MAX). That
>> seems to be a rather common problem for consumer motherboards.
>> Here is an example of how it looked for me:
>> https://people.freebsd.org/~avg/IMAG1099.jpg
>> Fortunately, it wasn't a root pool that got the error.
> Mine looks quite different: yours shows the pool info, mine shows a 'BTX
> halted' message: http://zhegan.in/files/cannot-read-MOS.jpg .

My output is more verbose because I've added some extra diagnostics. Also, as I've said, in my case the complaint was not about a root pool.

> I'm
> running the latest BIOS for this motherboard (Gigabyte Z77P-D3, updated
> yesterday, though the BIOS itself still dates from 2012).

Fun fact - my problem is also with the latest BIOS for my motherboard. The previous BIOS version does not have the problem.

> If this is still the BIOS-related
> bug, what workaround can I use? Reslice the disk and create the root
> pool inside the first TB, right?

That's what I would do to be sure that I'm safe.

-- 
Andriy Gapon
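
As a rough check of the 1TB figure and of the "keep the root pool inside the first TB" workaround, here is a small Python sketch. It assumes 512-byte sectors and that the affected BIOS cannot address sectors beyond INT_MAX (2^31 - 1). The helper name and the partition sizes are hypothetical.

```python
SECTOR_SIZE = 512        # assumed logical sector size in bytes
INT_MAX = 2**31 - 1      # largest signed 32-bit value

# The limit quoted in the thread: 512 * INT_MAX bytes, just under 1 TiB.
BIOS_LIMIT_BYTES = SECTOR_SIZE * INT_MAX


def fits_below_limit(start_sector, sector_count, max_lba=INT_MAX):
    """True if every sector of the partition is at or below max_lba."""
    if sector_count <= 0:
        raise ValueError("sector_count must be positive")
    last_sector = start_sector + sector_count - 1
    return last_sector <= max_lba


GIB = 1024**3

# Hypothetical layouts: a 900 GiB root partition starting at sector 2048,
# and a 2 TiB partition starting at the same place.
small_root = fits_below_limit(2048, 900 * GIB // SECTOR_SIZE)
big_root = fits_below_limit(2048, 2048 * GIB // SECTOR_SIZE)

print(BIOS_LIMIT_BYTES, small_root, big_root)
```

Under these assumptions, the 900 GiB partition stays entirely within the range such a BIOS can read, while the 2 TiB one does not, so a boot loader reading through the BIOS could fail on blocks stored in its upper half.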