Michael Kennedy
2005-Dec-15 22:26 UTC
[zfs-discuss] ZFS volume is hosing BIOS POST on Ultra20 (BIOS 2.1.7)
I'm a proud owner of a shiny new Sun Ultra 20 Workstation (1 of 10). I ordered it with the standard 80GB SATA drive and a secondary 250GB SATA drive.

I installed the machine with Nevada 27a and proceeded to update things the way I wanted them.

I used 'zpool' to create a new data pool and assigned the 250GB SATA drive to it. I realize this isn't necessarily an optimal use case with only the one drive assigned, but hey, I want to learn how ZFS works, and what better way to do it than to use it?

So, ZFS works like a charm, exactly as the docs said it would. I'm creating filesystems, moving mountpoints, assigning quotas, etc.

Then I rebooted... and my heart sank when the BIOS POST froze right after detecting the drives, which it did successfully. After fiddling for a bit, trying to understand what was going on, I started the dreary process of elimination. Luckily for me, my first inclination was to remove the drives and add them back one by one, so the process of elimination didn't take long.

The end result was that I pulled the 250GB SATA drive and the machine POSTed. So, like a good consumer, I assumed the drive had died and proceeded to RMA it. In the meantime, I inserted another 250GB SATA drive from another Ultra 20 yet to be deployed (one of the other 9 we have), re-created my ZFS pool and filesystems... and rebooted.

Wasn't I surprised to have the same problem? Then a colleague a few cubes over stuck his head up and said, "My machine won't boot! WTF is wrong with this thing?" To which I replied, "Did you ZFS your data drive?", and his answer was "Yeah, why? Is there a problem with that?"

So, you see, a pattern is forming. If we put ZFS on a drive, the machine won't pass BIOS POST and boot. I can't find a reason for it in any SunSolve or Google searching. Does anyone here have the same experience?

Two drive models have been used, both Sun part 540-6521-01: a Hitachi Deskstar HDS722525VLSA80 and a Seagate ST3250823AS. The machine is a Sun Ultra 20 (Opteron 148, 2GB RAM) running Solaris 5.11 nv27a.

Any tips would be appreciated. I will be trying nv28 next, but my breath, she is not being held. And I'm starting to pile up SATA drives that I cannot use.

MK
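For readers following along, the setup described above boils down to commands like these. This is only a minimal sketch: the pool name "datapool", the device name c2t0d0, the mountpoint, and the quota size are illustrative, not taken from the original post.

    # create a pool on the whole 250GB data drive (substitute the real cXtYdZ name);
    # giving ZFS the whole disk is what writes the EFI label discussed later in this thread
    zpool create datapool c2t0d0

    # carve out a filesystem, move its mountpoint, and assign a quota
    zfs create datapool/home
    zfs set mountpoint=/export/home datapool/home
    zfs set quota=50g datapool/home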
Dan Price
2005-Dec-15 22:56 UTC
[zfs-discuss] ZFS volume is hosing BIOS POST on Ultra20 (BIOS 2.1.7)
On Thu 15 Dec 2005 at 02:26PM, Michael Kennedy wrote:

> [...]
>
> Wasn't I surprised to have the same problem? Then a colleague a few
> cubes over stuck his head up and said, "My machine won't boot! WTF is
> wrong with this thing?" To which I replied, "Did you ZFS your data
> drive?", and his answer was "Yeah, why? Is there a problem with that?"

I had the same problem with my U20 --- precisely this sequence. Let's get a bug filed.

-dp

--
Daniel Price - Solaris Kernel Engineering - dp at eng.sun.com - blogs.sun.com/dp
Bill Sommerfeld
2005-Dec-15 23:13 UTC
[zfs-discuss] ZFS volume is hosing BIOS POST on Ultra20 (BIOS 2.1.7)
On Thu, 2005-12-15 at 17:56, Dan Price wrote:

> I had the same problem with my U20 --- precisely this sequence. Let's
> get a bug filed.

sounds to me like the EFI GPT label used by ZFS when you give it the whole disk is somehow toxic to the BIOS...

- Bill
Dan Price
2005-Dec-15 23:21 UTC
[zfs-discuss] ZFS volume is hosing BIOS POST on Ultra20 (BIOS 2.1.7)
On Fri 16 Dec 2005 at 10:22AM, Nathan Kroenert wrote:

> Silly question (I don't have a U20... :( )...
>
> Does the BIOS have boot sector monitoring (virus checking) capabilities,
> and is it possible the label ZFS uses is hurting its brain?
>
> Just a thought... :)

It has a virus protection option in the BIOS, but it is off, at least on mine. But I'll dork with the BIOS settings now that I realize it isn't a bad drive.

-dp

--
Daniel Price - Solaris Kernel Engineering - dp at eng.sun.com - blogs.sun.com/dp
Nathan Kroenert
2005-Dec-15 23:22 UTC
[zfs-discuss] ZFS volume is hosing BIOS POST on Ultra20 (BIOS 2.1.7)
Silly question (I don't have a U20... :( )...

Does the BIOS have boot sector monitoring (virus checking) capabilities, and is it possible the label ZFS uses is hurting its brain?

Just a thought... :)

Nathan.

Dan Price wrote:
> On Thu 15 Dec 2005 at 02:26PM, Michael Kennedy wrote:
>
>> [...]
>
> I had the same problem with my U20 --- precisely this sequence. Let's
> get a bug filed.
>
> -dp
Bill Moore
2005-Dec-16 00:36 UTC
[zfs-discuss] ZFS volume is hosing BIOS POST on Ultra20 (BIOS 2.1.7)
On Thu, Dec 15, 2005 at 06:13:47PM -0500, Bill Sommerfeld wrote:

> On Thu, 2005-12-15 at 17:56, Dan Price wrote:
> > I had the same problem with my U20 --- precisely this sequence. Let's
> > get a bug filed.
>
> sounds to me like the EFI GPT label used by ZFS when you give it the
> whole disk is somehow toxic to the BIOS...

This is a recently discovered bug (by James Gosling, no less). It seems to be, as Bill S. points out, caused by the EFI label that ZFS uses: it interacts with the BIOS RAID configuration scanning code in some evil way. A bug has been filed and is being aggressively pursued by Sun, the BIOS vendor, and the Nvidia RAID folks.

The problem should go away if you manually format the disk with a Sun VTOC label and then give ZFS a slice:

   zpool create mypool c2d0s0

The way I've worked around this is by unplugging the drive until the GRUB boot screen comes up, then hot-plugging the drive in and booting Solaris. YMMV.

--Bill
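For anyone wanting to apply that workaround end to end, the sequence looks roughly like this. It is only a sketch: the device name c2d0 follows Bill's example, and the exact format(1M) prompts vary by build.

    # Put a Sun (SMI/VTOC) label on the disk instead of the EFI label ZFS
    # writes for whole disks.  In format(1M) expert mode, relabel the disk
    # and pick the SMI label type, then size slice 0 to cover the disk:
    format -e c2d0
    #   format> label
    #     [0] SMI Label
    #     [1] EFI Label
    #   (choose the SMI label, then use "partition" to give slice 0 the
    #    whole disk and label again)

    # Build the pool on the slice rather than the whole disk, so ZFS keeps
    # the VTOC label instead of rewriting an EFI one:
    zpool create mypool c2d0s0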
Michael Kennedy
2005-Dec-16 03:50 UTC
[zfs-discuss] Re: ZFS volume is hosing BIOS POST on Ultra20 (BIOS 2.1.7)
Thanks for the insight, Bill. It's appreciated. I will attempt the hot-plug ritual tomorrow morning, and if the gods are smiling, I will be able to VTOC the drive and get on with my life.

ZFS is showing incredible potential, and I'm discovering more uses than any of the marketing material that has been spewing since '04 has alluded to. I can't wait for this to go GA. This is a disruptive technology. Unfortunately I'm getting ahead of myself, imagining potential uses and solving problems that don't exist... as we are all prone to do from time to time.

Thanks again for the steer.

MK
Casper.Dik at Sun.COM
2005-Dec-16 07:20 UTC
[zfs-discuss] ZFS volume is hosing BIOS POST on Ultra20 (BIOS 2.1.7)
> I had the same problem with my U20 --- precisely this sequence. Let's
> get a bug filed.

Can you remove the drive from the boot sequence? I've seen this happen when a Tyan 2885 wanted to boot from an EFI-labelled disk.

Casper
Dan Price
2005-Dec-16 08:02 UTC
[zfs-discuss] ZFS volume is hosing BIOS POST on Ultra20 (BIOS 2.1.7)
On Thu 15 Dec 2005 at 04:36PM, Bill Moore wrote:

> This is a recently discovered bug (by James Gosling, no less). It seems
> to be, as Bill S. points out, caused by the EFI label that ZFS uses: it
> interacts with the BIOS RAID configuration scanning code in some evil
> way. A bug has been filed and is being aggressively pursued by Sun,
> the BIOS vendor, and the Nvidia RAID folks. The problem should go away

BugID?

> if you manually format the disk with a Sun VTOC label and then give ZFS
> a slice:
>
>    zpool create mypool c2d0s0
>
> The way I've worked around this is by unplugging the drive until the
> GRUB boot screen comes up, then hot-plugging the drive in and booting
> Solaris. YMMV.

I tried that, and I just wound up with a sad machine; the second drive never got recognized, and things like format would hang for a long while before giving up.

I also filed a bug (6364104), which maybe can now be closed as a dup.

-dp

--
Daniel Price - Solaris Kernel Engineering - dp at eng.sun.com - blogs.sun.com/dp
Scott Howard
2005-Dec-16 08:43 UTC
[zfs-discuss] ZFS volume is hosing BIOS POST on Ultra20 (BIOS 2.1.7)
On Fri, Dec 16, 2005 at 12:02:17AM -0800, Dan Price wrote:

> > way. A bug has been filed and is being aggressively pursued by Sun,
> > the BIOS vendor, and the Nvidia RAID folks. The problem should go away
>
> BugID?

CR 6363449, I'd guess.

  Scott
Keith Chan
2005-Dec-16 09:03 UTC
[zfs-discuss] Re: ZFS volume is hosing BIOS POST on Ultra20 (BIOS 2.1.7)
> > I had the same problem with my U20 --- precisely this sequence. Let's
> > get a bug filed.
>
> Can you remove the drive from the boot sequence? I've seen this
> happen when a Tyan 2885 wanted to boot from an EFI-labelled disk.

The same thing happened to me on my MSI K8N Neo4 Platinum-based system - I just set the drive type to "None".
Richard Elling
2005-Dec-16 18:07 UTC
[zfs-discuss] Re: ZFS volume is hosing BIOS POST on Ultra20 (BIOS 2.1.7)
Since EFI is an Intel (sponsored?) standard, why are *we* just now seeing this? Shouldn't everyone who does EFI see this?

-- richard
Cyril Plisko
2005-Dec-16 19:30 UTC
[zfs-discuss] Re: ZFS volume is hosing BIOS POST on Ultra20 (BIOS 2.1.7)
On 12/16/05, Richard Elling <Richard.Elling at sun.com> wrote:

> Since EFI is an Intel (sponsored?) standard, why are *we* just now seeing this? Shouldn't everyone who does EFI see this?

Richard,

EFI was designed for the IA64 architecture, and the GPT label (a.k.a. the EFI label in Solaris) is used there. Add to this the fact that a PC BIOS usually cannot cope with the EFI label (didn't we just see that :-Q), and here is what we have: most PC (UNIX/Windows) users simply have no reason to put an EFI label on their disks. ZFS just changed this.

Oh, BTW, before ZFS one couldn't put an EFI label on an [S]ATA drive at all, even in Solaris. I think that explains the situation.

--
Regards,
        Cyril
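If you want to check which kind of label a disk currently carries, here are a couple of quick checks. Again only a sketch, and the pool name "datapool" is an example:

    # A pool built on whole disks lists its vdevs without a slice suffix
    # (e.g. c2t0d0, i.e. an EFI/GPT label); a pool built on slices lists
    # them as c2t0d0s0 (VTOC/SMI label):
    zpool status datapool

    # format(1M)'s "verify" command prints the current label contents for
    # the selected disk, which shows whether it is SMI or EFI:
    format
    #   (select the disk)
    #   format> verify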
Casper.Dik at Sun.COM
2005-Dec-16 20:26 UTC
[zfs-discuss] Re: ZFS volume is hosing BIOS POST on Ultra20 (BIOS 2.1.7)
> EFI was designed for the IA64 architecture, and the GPT label (a.k.a. the
> EFI label in Solaris) is used there. Add to this the fact that a PC BIOS
> usually cannot cope with the EFI label (didn't we just see that :-Q), and
> here is what we have: most PC (UNIX/Windows) users simply have no reason
> to put an EFI label on their disks. ZFS just changed this.
>
> Oh, BTW, before ZFS one couldn't put an EFI label on an [S]ATA drive at
> all, even in Solaris. I think that explains the situation.

And that was a *very* recent putback; it hit the gates in the two weeks before ZFS, I believe. So the exposure of the new feature was very limited before ZFS came out.

Casper
Kyle McDonald
2005-Dec-21 17:17 UTC
[zfs-discuss] Zpool output is weird after export/import.
I had nv_28 on a SPARC machine with six 12-disk multipacks. I created a pool and several filesystems (no data yet, though). Today I ran 'zpool export' and then jumpstarted to nv_29. After booting up I ran 'zpool import' and the output looked like the below.

I'm pretty sure there wasn't anything wrong with the disks before re-jumpstarting. But what I find suspicious is that it says 'c3t2d0' is missing, and then says 'c3t2d0s0' is OK and ONLINE. Which is it?

'c0' is the boot disk controller. But there are 3 dual Ultra SCSI controllers in this box, so there really should be a 'c6' somewhere too.

Does this look fishy to anyone else?

 -Kyle


bell# zpool import
  pool: datapool0
    id: 17061535701658615450
 state: DEGRADED
status: One or more devices are missing from the system.
action: The pool can be imported despite missing or damaged devices.  The
        fault tolerance of the pool may be compromised if imported.
   see: http://www.sun.com/msg/ZFS-8000-2Q
config:

        datapool0        DEGRADED
          raidz          DEGRADED
            c1t2d0s0     ONLINE
            c2t2d0s0     ONLINE
            c3t2d0       FAULTED   cannot open
            c3t2d0s0     ONLINE
            c4t2d0s0     ONLINE
            c5t2d0s0     ONLINE
          raidz          DEGRADED
            c1t3d0s0     ONLINE
            c2t3d0s0     ONLINE
            c3t3d0       FAULTED   cannot open
            c3t3d0s0     ONLINE
            c4t3d0s0     ONLINE
            c5t3d0s0     ONLINE
          raidz          DEGRADED
            c1t4d0s0     ONLINE
            c2t4d0s0     ONLINE
            c3t4d0       FAULTED   cannot open
            c3t4d0s0     ONLINE
            c4t4d0s0     ONLINE
            c5t4d0s0     ONLINE
          raidz          DEGRADED
            c1t5d0s0     ONLINE
            c2t5d0s0     ONLINE
            c3t5d0       FAULTED   cannot open
            c3t5d0s0     ONLINE
            c4t5d0s0     ONLINE
            c5t5d0s0     ONLINE
          raidz          DEGRADED
            c1t8d0s0     ONLINE
            c2t8d0s0     ONLINE
            c3t8d0       FAULTED   cannot open
            c3t8d0s0     ONLINE
            c4t8d0s0     ONLINE
            c5t8d0s0     ONLINE
          raidz          DEGRADED
            c1t9d0s0     ONLINE
            c2t9d0s0     ONLINE
            c3t9d0       FAULTED   cannot open
            c3t9d0s0     ONLINE
            c4t9d0s0     ONLINE
            c5t9d0s0     ONLINE
          raidz          DEGRADED
            c1t10d0s0    ONLINE
            c2t10d0s0    ONLINE
            c3t10d0      FAULTED   cannot open
            c3t10d0s0    ONLINE
            c4t10d0s0    ONLINE
            c5t10d0s0    ONLINE
          raidz          DEGRADED
            c1t11d0s0    ONLINE
            c2t11d0s0    ONLINE
            c3t11d0      FAULTED   cannot open
            c3t11d0s0    ONLINE
            c4t11d0s0    ONLINE
            c5t11d0s0    ONLINE
          raidz          DEGRADED
            c1t12d0s0    ONLINE
            c2t12d0s0    ONLINE
            c3t12d0      FAULTED   cannot open
            c3t12d0s0    ONLINE
            c4t12d0s0    ONLINE
            c5t12d0s0    ONLINE
          raidz          DEGRADED
            c1t13d0s0    ONLINE
            c2t13d0s0    ONLINE
            c3t13d0      FAULTED   cannot open
            c3t13d0s0    ONLINE
            c4t13d0s0    ONLINE
            c5t13d0s0    ONLINE
          raidz          DEGRADED
            c1t14d0s0    ONLINE
            c2t14d0s0    ONLINE
            c3t14d0      FAULTED   cannot open
            c3t14d0s0    ONLINE
            c4t14d0s0    ONLINE
            c5t14d0s0    ONLINE
          raidz          DEGRADED
            c1t15d0s0    ONLINE
            c2t15d0s0    ONLINE
            c3t15d0      FAULTED   cannot open
            c3t15d0s0    ONLINE
            c4t15d0s0    ONLINE
            c5t15d0s0    ONLINE
bell#
Eric Schrock
2005-Dec-21 18:53 UTC
[zfs-discuss] Zpool output is weird after export/import.
Yes, this is probably related to:

6362672 import gets confused about overlapping slices

When you created this pool, did you use whole disks? This might also be related to:

6344272 re-think how whole disks are stored

The latter should be fixed in build 31; the former is on my short list. This may also be a new pathology. Can you recreate this? Can you send the output of 'zpool status' before exporting the pool?

Thanks.

- Eric

On Wed, Dec 21, 2005 at 12:17:33PM -0500, Kyle McDonald wrote:
> I had nv_28 on a SPARC machine with six 12-disk multipacks. I created a
> pool and several filesystems (no data yet, though). Today I ran 'zpool
> export' and then jumpstarted to nv_29. After booting up I ran 'zpool
> import' and the output looked like the below.
>
> I'm pretty sure there wasn't anything wrong with the disks before
> re-jumpstarting. But what I find suspicious is that it says 'c3t2d0' is
> missing, and then says 'c3t2d0s0' is OK and ONLINE. Which is it?
>
> 'c0' is the boot disk controller. But there are 3 dual Ultra SCSI
> controllers in this box, so there really should be a 'c6' somewhere too.
>
> Does this look fishy to anyone else?
>
> [...]

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
Kyle McDonald
2005-Dec-21 19:06 UTC
[zfs-discuss] Zpool output is weird after export/import.
Eric Schrock wrote:
> Yes, this is probably related to:
>
> 6362672 import gets confused about overlapping slices
>
> When you created this pool, did you use whole disks? This might also
> be related to:
>
> 6344272 re-think how whole disks are stored
>
> The latter should be fixed in build 31; the former is on my short list.
> This may also be a new pathology. Can you recreate this? Can you send
> the output of 'zpool status' before exporting the pool?

I think I figured it out. Maybe this will help you:

One of the multipacks was confused. I power-cycled it, and ran 'disks', 'drvconfig', and 'devlinks', and c6 showed up. This c6 would have been numbered c3 if it had been found during the jumpstart, but since the disks on it were whacked it didn't get a number at all.

I think the missing c3 in the output is zpool showing me the name of the device the last time it was present. The other c3 is the name of the device that is currently in the c3 position, but which used to be in the c4 position. Since the controller was missing entirely, there wasn't any name to replace the old c3 with. Maybe zpool should print 'missing' instead?

 -Kyle
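For the record, the recovery Kyle describes is roughly this sequence. It is a sketch only: on more recent builds devfsadm(1M) replaces the older disks/drvconfig/devlinks trio, and the pool name follows his example.

    # after power-cycling the confused multipack, rebuild the device tree
    # and clean up stale /dev links
    devfsadm -Cv

    # list importable pools again (all vdevs should now show ONLINE),
    # then import by name
    zpool import
    zpool import datapool0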
Eric Schrock
2005-Dec-21 19:26 UTC
[zfs-discuss] Zpool output is weird after export/import.
On Wed, Dec 21, 2005 at 02:06:43PM -0500, Kyle McDonald wrote:

> One of the multipacks was confused. I power-cycled it, and ran 'disks',
> 'drvconfig', and 'devlinks', and c6 showed up. This c6 would have been
> numbered c3 if it had been found during the jumpstart, but since the
> disks on it were whacked it didn't get a number at all.
>
> I think the missing c3 in the output is zpool showing me the name of the
> device the last time it was present. The other c3 is the name of the
> device that is currently in the c3 position, but which used to be in the
> c4 position. Since the controller was missing entirely, there wasn't any
> name to replace the old c3 with. Maybe zpool should print 'missing' instead?

OK. That makes sense. The additional 's0' suffixes are due to the whole-disk bug, and will be fixed soon.

This could probably be made a little more explicit, but it gets difficult once you actually import the pool. If we were to import your pool, we would not know whether 'c3t0d0' was the right name of the device or not. Note that once you plugged the disk in, we would correctly open it by devid and all would be well, modulo this bug:

6364582 need to fixup paths if they've changed

Which doesn't affect correctness, but can produce confusing output when using zpool(1M). For example, what if you had:

        pool
          mirror
            c0t1d0   ONLINE
            c0t1d0   OFFLINE   cannot open
            c0t2d0   ONLINE

Is there any way to display this in a non-confusing manner? One possibility is that if there is a device with the given path, but it doesn't match the one we're expecting, then we display it differently, either with 'missing', or marking the path somehow, like "(c0t1d0)". Would any of this help?

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
Casper.Dik at Sun.COM
2005-Dec-21 19:56 UTC
[zfs-discuss] Zpool output is weird after export/import.
>         pool
>           mirror
>             c0t1d0   ONLINE
>             c0t1d0   OFFLINE   cannot open
>             c0t2d0   ONLINE
>
> Is there any way to display this in a non-confusing manner? One
> possibility is that if there is a device with the given path, but it
> doesn't match the one we're expecting, then we display it differently,
> either with 'missing', or marking the path somehow, like "(c0t1d0)".
> Would any of this help?

If I may suggest *not* mentioning the device name at all? Whatever way you present it, it is going to be confusing. In this case we'd have c0t1d0 and (c0t1d0); now the user may well say "but c0t1d0 is there, it's c1t1d0 that's gone missing!". It makes much more sense to use some WWN id.

Casper
Kyle McDonald
2005-Dec-21 21:09 UTC
[zfs-discuss] Zpool output is weird after export/import.
Eric Schrock wrote:
> [...]
>
> Which doesn't affect correctness, but can produce confusing output when
> using zpool(1M). For example, what if you had:
>
>         pool
>           mirror
>             c0t1d0   ONLINE
>             c0t1d0   OFFLINE   cannot open
>             c0t2d0   ONLINE
>
> Is there any way to display this in a non-confusing manner? One
> possibility is that if there is a device with the given path, but it
> doesn't match the one we're expecting, then we display it differently,
> either with 'missing', or marking the path somehow, like "(c0t1d0)".
> Would any of this help?

I would think:

        pool
          mirror
            c0t1d0   ONLINE
            missing  OFFLINE   cannot open   (was c0t1d0)
            c0t2d0   ONLINE

would be the most useful.

 -Kyle
Eric Schrock
2005-Dec-21 21:19 UTC
[zfs-discuss] Zpool output is weird after export/import.
On Wed, Dec 21, 2005 at 04:09:14PM -0500, Kyle McDonald wrote:

> I would think:
>
>         pool
>           mirror
>             c0t1d0   ONLINE
>             missing  OFFLINE   cannot open   (was c0t1d0)
>             c0t2d0   ONLINE
>
> would be the most useful.

This makes sense. We can do this for the following cases:

1. During import, if during our device scan we never touched the disk in question.

2. For an active pool, if the path is valid, but doesn't refer to the device we expect it to be.

However, if we have an active pool whose path and devid are invalid, does it still make sense to display "(was c0t1d0)"? For example, if I unplug a USB device, or a network attached drive goes away for some reason, does it make sense to display it as if the path is somehow wrong? At this point we can't tell if the path is right or not - can/should we distinguish between these cases?

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
Casper.Dik at Sun.COM
2005-Dec-21 21:22 UTC
[zfs-discuss] Zpool output is weird after export/import.
> I would think:
>
>         pool
>           mirror
>             c0t1d0   ONLINE
>             missing  OFFLINE   cannot open   (was c0t1d0)
>             c0t2d0   ONLINE
>
> would be the most useful.

I'm still not sure how useful this is; if the device was moved from one system to another, the controller number could very well be wrong. The unique identifier would be a better indicator; perhaps, if it's available, disk brand and serial.

Casper
Eric Schrock
2005-Dec-21 21:39 UTC
[zfs-discuss] Zpool output is weird after export/import.
On Wed, Dec 21, 2005 at 10:22:20PM +0100, Casper.Dik at sun.com wrote:

> I'm still not sure how useful this is; if the device was moved from
> one system to another, the controller number could very well be wrong.

Except that it may very well clue the admin in to what went wrong. If they notice, for example, that every disk that was on (former) controller 3 is missing, it suggests that they didn't quite connect controller 3 correctly, even though it is now controller 6.

> The unique identifier would be a better indicator; perhaps, if it's
> available, disk brand and serial.

I disagree. If we had a unique identifier that was in any way intelligible to the user, this would make sense. We have both a 64-bit GUID and a device ID, neither of which provides any useful input to the administrator on how to fix the problem. In the example here, how would a bunch of devids like 'SEAGATE@WWC49384598610004949993999/SS3948829' be any indication of the real problem (controller 6 was set up incorrectly)?

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
Anton B. Rang
2005-Dec-22 02:51 UTC
[zfs-discuss] Re: Zpool output is weird after export/import.
Perhaps we should allow users to name disks when they're added, as well as labeling them with the internal ID? Then there'd be an identifier -- unique if the user is careful -- which was meaningful.
Robert Milkowski
2007-Jan-22 13:48 UTC
[zfs-discuss] Re: ZFS volume is hosing BIOS POST on Ultra20 (BIOS 2.1.7)
Is there a BIOS update for the Ultra 20 that makes it understand EFI?
Casper.Dik at Sun.COM
2007-Jan-22 13:56 UTC
[zfs-discuss] Re: ZFS volume is hosing BIOS POST on Ultra20 (BIOS 2.1.7)
> Is there a BIOS update for the Ultra 20 that makes it understand EFI?

Understanding EFI is perhaps asking too much, but I believe the latest BIOS no longer hangs or crashes when it encounters EFI labels on the disks it examines (it probes all disks).

Casper
Robert Milkowski
2007-Feb-06 18:48 UTC
[zfs-discuss] Re: ZFS volume is hosing BIOS POST on Ultra20 (BIOS 2.1.7)
Hello Casper,

Monday, January 22, 2007, 2:56:16 PM, you wrote:

>> Is there a BIOS update for the Ultra 20 that makes it understand EFI?

CDSC> Understanding EFI is perhaps asking too much, but I believe the
CDSC> latest BIOS no longer hangs or crashes when it encounters EFI labels
CDSC> on the disks it examines (it probes all disks).

That's what we were looking for. Somehow my friend missed it. Thank you.

--
Best regards,
 Robert                          mailto:rmilkowski at task.gda.pl
                                 http://milek.blogspot.com
Eric Haycraft
2007-Feb-12 15:27 UTC
[zfs-discuss] Re: Re: ZFS volume is hosing BIOS POST on Ultra20 (BIOS 2.1.7)
I had the same issue with ZFS killing my Ultra 20. I can confirm that flashing the BIOS fixed the issue.

http://www.sun.com/desktop/workstation/ultra20/downloads.jsp#Ultra

Eric