Russ Price
2010-Mar-27 19:26 UTC
[zfs-discuss] b134 - Mirrored rpool won't boot unless both mirrors are present
I have two 500 GB drives on my system that are attached to built-in SATA ports on my Asus M4A785-M motherboard, running in AHCI mode. If I shut down the system, remove either drive, and then try to boot the system, it will fail to boot. If I disable the splash screen, I find that it will display the SunOS banner and the hostname, but it never gets as far as the "Reading ZFS config:" stage. GRUB is installed on both drives, and if both drives are present, I can flip the boot order in the BIOS and still have it boot successfully. I can even move one of the mirrors to a different SATA port and still have it boot. But if a mirror is missing, forget it. I can't find any log entries in /var/adm/messages about why it fails to boot, and the console is equally uninformative. If I check fmdump, it reports an empty fault log.

If I throw in a blank drive in place of one of the mirrors, the boot still fails. Needless to say, this pretty much makes the whole idea of mirroring rather useless.

Any idea what's really going wrong here?

-- This message posted from opensolaris.org
Tim Cook
2010-Mar-27 19:35 UTC
[zfs-discuss] b134 - Mirrored rpool won't boot unless both mirrors are present
What build? How long have you waited for the boot? It almost sounds to me like it's waiting for the drive and hasn't timed out before you give up and power it off.

--Tim
Russ Price
2010-Mar-27 19:45 UTC
[zfs-discuss] b134 - Mirrored rpool won't boot unless both mirrors are present
> What build? How long have you waited for the boot? It
> almost sounds to me like it's waiting for the
> drive and hasn't timed out before you give up and
> power it off.

I waited about three minutes. This is a b134 installation.

In one of my tests, I tried shoving the removed mirror into the hotswap bay, and got a console message indicating that the device was detected, but that didn't make the boot complete. I restarted the system with the drive present, and everything's fine.

How long should I expect to wait if a drive is missing? It shouldn't take more than 30 seconds, IMHO.
Tim Cook
2010-Mar-27 20:20 UTC
[zfs-discuss] b134 - Mirrored rpool won't boot unless both mirrors are present
> How long should I expect to wait if a drive is missing? It shouldn't take
> more than 30 seconds, IMHO.

Depends on a lot of things. I'd let it sit for at least half an hour to see if you get any messages. 30 seconds, if it's waiting for the driver stack timeouts, is way too short.

--Tim
William Bauer
2010-Mar-28 00:57 UTC
[zfs-discuss] b134 - Mirrored rpool won't boot unless both mirrors are present
Posted this reply in the help forum, copying it here:

I frequently use mirrors to replace disks, or even as a backup with an esata dock. So I set up v134 with a mirror in VB, ran installgrub, then detached each drive in turn. I completely duplicated and can confirm your problem, and since I'm quite comfortable with this process I suggest you have found a serious bug and should report it immediately. This is a horrible problem if a mirror member fails and renders a system unbootable!!
Tim Cook
2010-Mar-28 01:05 UTC
[zfs-discuss] b134 - Mirrored rpool won't boot unless both mirrors are present
Have you tried booting from a livecd and importing the pool from there? It might help narrow down exactly where the problem lies.

--Tim
William Bauer
2010-Mar-28 01:53 UTC
[zfs-discuss] b134 - Mirrored rpool won't boot unless both mirrors are present
Good idea (importing from a LiveCD). I just did this, and it imported without any unusual complaint, except for the usual "DEGRADED" state because a member is missing.

Also, for whatever this is worth, I noticed that v134 now shows the mirror (or the first mirror) as "mirror-0" instead of just "mirror". This is both with the LiveCD and the installed image. Probably irrelevant.
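
For anyone repeating the LiveCD check, here is a minimal Python sketch of pulling the pool state and the non-ONLINE entries out of `zpool status` text. The sample output is illustrative only and was not captured from any system in this thread; the device names (c7d0s0, c7d1s0) and the helper name `summarize_pool` are hypothetical.

```python
import re

# Illustrative sample only: NOT output captured from the systems in this thread.
# Device names c7d0s0 / c7d1s0 are made up for the example.
SAMPLE_STATUS = """\
  pool: rpool
 state: DEGRADED
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         DEGRADED     0     0     0
          mirror-0    DEGRADED     0     0     0
            c7d0s0    ONLINE       0     0     0
            c7d1s0    UNAVAIL      0     0     0  cannot open

errors: No known data errors
"""

HEALTHY = {"ONLINE"}


def summarize_pool(status_text):
    """Return (pool_state, [(name, state), ...]) for table rows that are not ONLINE."""
    state_match = re.search(r"^\s*state:\s*(\S+)", status_text, re.MULTILINE)
    pool_state = state_match.group(1) if state_match else None

    unhealthy = []
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if not in_config:
            # The device table starts right after the NAME/STATE header row.
            in_config = fields[:2] == ["NAME", "STATE"]
            continue
        if not fields:
            break  # a blank line ends the device table
        if len(fields) >= 2 and fields[1] not in HEALTHY:
            unhealthy.append((fields[0], fields[1]))
    return pool_state, unhealthy


if __name__ == "__main__":
    state, bad = summarize_pool(SAMPLE_STATUS)
    print("pool state:", state)
    for name, dev_state in bad:
        print("  %s: %s" % (name, dev_state))
```

With a missing member, the pool and the "mirror-0" vdev both show up as DEGRADED alongside the unavailable disk, which matches the import behavior described above.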
William Bauer
2010-Mar-28 03:03 UTC
[zfs-discuss] b134 - Mirrored rpool won't boot unless both mirrors are present
> Depends on a lot of things. I'd let it sit for at least half an hour to
> see if you get any messages. 30 seconds, if it's waiting for the driver
> stack timeouts, is way too short.

I'm not the OP, but I let my VB guest sit for an hour now, and nothing new has happened. The last thing it displayed was the "Hostname:" line, as the original post stated. Personally, I've never seen an OpenSolaris system, virtual or physical, take more than a few seconds to pause for a missing mirror member.

I did notice something a little odd--usually an OpenSolaris VB guest doesn't use all of its allocated memory immediately, even after a user logs into gnome. However, this seemingly idle system quickly ate up all of the 2GB I allocated to it. The host is 2009.06 with 8GB memory and an Intel quad Q6600, so I have adequate resources for this guest.
Tim Cook
2010-Mar-28 03:18 UTC
[zfs-discuss] b134 - Mirrored rpool won't boot unless both mirrors are present
Sounds exactly like the behavior people have had previously while a system is trying to recover a pool with a faulted drive. I'll have to check and see if I can dig up one of those old threads. I vaguely recall someone here had a single drive fail on an import and it took forever to import the pool, running out of memory every time. I think he eventually added significantly more memory and was able to import the pool (of course my memory sucks, so I'm sure that's not quite accurate).

--Tim
Ian Collins
2010-Mar-28 03:27 UTC
[zfs-discuss] b134 - Mirrored rpool won't boot unless both mirrors are present
On 03/28/10 04:18 PM, Tim Cook wrote:
> I think he eventually added significantly more memory and was able to import
> the pool (of course my memory sucks, so I'm sure that's not quite
> accurate).

Maybe *you* need to add some more memory!

--
Ian.
William Bauer
2010-Mar-28 03:33 UTC
[zfs-discuss] b134 - Mirrored rpool won't boot unless both mirrors are present
What I don't understand then is why can I do this with some frequency without any delays on my 2009.06 and S10 systems? I have a three disk mirror at home, one disk in an esata dock. Sometimes I don't turn on the dock, and the system boots just as quickly. Likewise, I've done this with two-disk mirrors at work for various reasons. No delays or issues. This problem seems to have come along since 2009.06--even Solaris 10 handles this situation without problems.

I just gave that VB guest 4GB memory, and it used it all within five minutes. It's consuming very little host CPU.

(I'd quote your messages, but seem to be having problems with the web interface preview working)
Victor Latushkin
2010-Mar-28 05:35 UTC
[zfs-discuss] b134 - Mirrored rpool won't boot unless both mirrors are present
This problem is known and fixed in later builds:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6923585

AFAIK it is going to be included into b134a as well.
Dick Hoogendijk
2010-Mar-28 13:01 UTC
[zfs-discuss] b134 - Mirrored rpool won't boot unless both mirrors are present
On 28-3-2010 7:35, Victor Latushkin wrote:
> This problem is known and fixed in later builds:
>
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6923585
>
> AFAIK it is going to be included into b134a as well

It's now March 28. For OpenSolaris 2010.03 that means only a few days remaining... Or would it be called 2010.04 ? ;-)

--
Dick Hoogendijk -- PGP/GnuPG key: 01D2433D
+ http://nagual.nl/ | OpenSolaris 2010.03 b134
+ All that's really worth doing is what we do for others (Lewis Carroll)
Russ Price
2010-Mar-28 22:13 UTC
[zfs-discuss] b134 - Mirrored rpool won't boot unless both mirrors are present
> This problem is known and fixed in later builds:
>
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6923585
>
> AFAIK it is going to be included into b134a as well

OK, I just did some checking, and my rpool was already set up with autoreplace=off. It's necessary to use the -r boot flag as well; that works. Hopefully, b134a will actually have the fix.

Once I use -r, the boot proceeds at normal speed - no need to wait ridiculous amounts of time as Tim suggested.
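
As a rough illustration of the -r workaround, here is a small Python sketch that inserts a boot flag into a GRUB `kernel$` line. The sample line in the comment is an assumption about what a typical OpenSolaris menu.lst entry looks like, not one copied from a system in this thread, and `add_boot_flag` is a hypothetical helper name. For a one-off boot, the same change is normally typed by hand at the GRUB edit prompt rather than scripted.

```python
def add_boot_flag(kernel_line, flag="-r"):
    """Insert a boot flag right after the kernel path on a GRUB kernel line.

    Assumes the line has the shape "<kernel keyword> <kernel path> [options...]".
    Whitespace between fields is normalized to single spaces.
    """
    parts = kernel_line.split()
    if len(parts) < 2 or not parts[0].startswith("kernel"):
        raise ValueError("not a GRUB kernel line: %r" % kernel_line)
    if flag in parts[2:]:
        return " ".join(parts)  # flag already present, nothing to add
    return " ".join(parts[:2] + [flag] + parts[2:])


if __name__ == "__main__":
    # Assumed example entry; check the real line in your own menu.lst.
    line = "kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics"
    print(add_boot_flag(line))
```

The helper leaves a line alone if the flag is already there, so running it twice does not stack up duplicate -r options.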