I am having trouble getting ZFS to behave as I would expect.

I am using the HP driver (cpqary3) for the Smart Array P400 (in an HP ProLiant DL385 G2) with 10k 2.5" 146GB SAS drives. The drives appear correctly; however, because the controller does not offer JBOD functionality, I had to configure each drive as a RAID0 logical drive.

Everything appears to work fine: the drives are detected, and I created a mirror for the OS to install to plus an additional raidz2 array with the remaining 6 discs.

But when I remove a disc and then reinsert it, I cannot get ZFS to accept it back into the array; see below for the details.

I thought it might be a problem with using the whole discs (e.g. c1t*d0), so I created a single partition on each and used that instead, but had the same results. The module seems to detect that the drive has been reinserted successfully, but the OS doesn't seem to want to write to it.

Any help would be most appreciated, as I would much prefer to use ZFS's software capabilities rather than the hardware RAID card in the machine.

When rebooting the system, the Array BIOS also displays some interesting behavior.

### BIOS Output

1792-Slot 1 Drive Array - Valid Data Found in the Array Accelerator
     Data will automatically be written to the drive array
1779-Slot 1 Drive Array - Replacement drive(s) detected OR previously failed drive(s) now appear to be operational
     Port 2I: Box 1: Bay 3
     Logical drive(s) disabled due to possible data loss.
     Select "F1" to continue with logical drive(s) disabled
     Select "F2" to accept data loss and to re-enable logical drive(s)

#### Terminal output

bash-3.00# zpool status test

  pool: test
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: resilver completed after 0h0m with 0 errors on Tue Jan 27 03:30:16 2009
config:

        NAME          STATE     READ WRITE CKSUM
        test          DEGRADED     0     0     0
          raidz2      DEGRADED     0     0     0
            c1t2d0p0  ONLINE       0     0     0
            c1t3d0p0  ONLINE       0     0     0
            c1t4d0p0  ONLINE       0     0     0
            c1t5d0p0  UNAVAIL      0     0     0  cannot open
            c1t6d0p0  ONLINE       0     0     0
            c1t8d0p0  ONLINE       0     0     0

errors: No known data errors

bash-3.00# zpool online test c1t5d0p0
warning: device 'c1t5d0p0' onlined, but remains in faulted state
        use 'zpool replace' to replace devices that are no longer present

bash-3.00# dmesg

Jan 27 03:27:40 unknown cpqary3: [ID 823470 kern.notice] NOTICE: Smart Array P400 Controller
Jan 27 03:27:40 unknown cpqary3: [ID 823470 kern.notice] Hot-plug drive inserted, Port: 2I Box: 1 Bay: 3
Jan 27 03:27:40 unknown cpqary3: [ID 479030 kern.notice] Configured Drive ? ....... YES
Jan 27 03:27:40 unknown cpqary3: [ID 100000 kern.notice]
Jan 27 03:27:40 unknown cpqary3: [ID 823470 kern.notice] NOTICE: Smart Array P400 Controller
Jan 27 03:27:40 unknown cpqary3: [ID 834734 kern.notice] Media exchange detected, logical drive 6
Jan 27 03:27:40 unknown cpqary3: [ID 100000 kern.notice]
...
Jan 27 03:36:24 unknown scsi: [ID 107833 kern.warning] WARNING: /pci@38,0/pci1166,142@10/pci103c,3234@0/sd@5,0 (sd6):
Jan 27 03:36:24 unknown         SYNCHRONIZE CACHE command failed (5)
...
Jan 27 03:47:58 unknown scsi: [ID 107833 kern.warning] WARNING: /pci@38,0/pci1166,142@10/pci103c,3234@0/sd@5,0 (sd6):
Jan 27 03:47:58 unknown         drive offline
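For reference, the raidz2 pool itself was created roughly like this (from memory, so the exact invocation may have differed slightly; device names as in the status output above):

# six single-disk RAID0 logical drives, double-parity raidz2 on top
zpool create test raidz2 c1t2d0p0 c1t3d0p0 c1t4d0p0 c1t5d0p0 c1t6d0p0 c1t8d0p0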
Fajar A. Nugraha
2009-Jan-27 12:47 UTC
[zfs-discuss] Problems using ZFS on Smart Array P400
On Tue, Jan 27, 2009 at 7:16 PM, Alex <alex at pancentric.com> wrote:

> I am using the HP driver (cpqary3) for the Smart Array P400 (in an HP ProLiant
> DL385 G2) with 10k 2.5" 146GB SAS drives. The drives appear correctly; however,
> because the controller does not offer JBOD functionality, I had to configure
> each drive as a RAID0 logical drive.

Ouch. Short comment: it might not be worth it. Seriously.

Does the P400 have a battery-backed cache? If yes, it will be MUCH easier to simply let it handle RAID5/10 and use a stripe config for ZFS. HW controllers with a battery-backed cache reduce the possibility of the RAID5 write hole. Depending on what your goal is, it might be the best choice.

> But when I remove a disc and then reinsert it, I cannot get ZFS to accept it
> back into the array; see below for the details.

I don't think it's ZFS's fault.

> Any help would be most appreciated, as I would much prefer to use ZFS's
> software capabilities rather than the hardware RAID card in the machine.

Yet you're stuck with hardware that practically does not allow you to do just that.

> 1779-Slot 1 Drive Array - Replacement drive(s) detected OR previously failed
> drive(s) now appear to be operational
> Port 2I: Box 1: Bay 3
> Logical drive(s) disabled due to possible data loss.
> Select "F1" to continue with logical drive(s) disabled
> Select "F2" to accept data loss and to re-enable logical drive(s)

From this output, the HW controller refuses to enable the removed disc, so the OS (and thus ZFS) can't see it yet. You can enable it from the BIOS during booting (which kinda defeats the whole hot-swap thing), or you might be able to find a way (using iLO perhaps, or with some HP-supplied software installed on the OS that can talk to the P400) to enable it without having to reboot.

Regards,
Fajar
I'm testing the same thing on a DL380 G5 with a P400 controller. I set up individual RAID 0 logical drives for each disk and ended up with the same result upon drive removal. I'm looking into whether the hpacucli array command-line utility will let me re-enable a logical drive from its interface.

--
Edmund William White
ewwhite at mac.com

> From: Alex <alex at pancentric.com>
> Date: Tue, 27 Jan 2009 04:16:56 -0800 (PST)
> To: <zfs-discuss at opensolaris.org>
> Subject: [zfs-discuss] Problems using ZFS on Smart Array P400
>
> [snip]
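If hpacucli does allow it, I'd expect the re-enable to look something like the following (untested, slot and logical drive numbers are only examples, and the exact syntax may vary between utility versions):

# show the state of all logical drives on the controller in slot 1
hpacucli ctrl slot=1 logicaldrive all show status

# force the failed single-disk logical drive back online, accepting
# the possible data loss the array BIOS warns about
hpacucli ctrl slot=1 logicaldrive 6 modify reenable forced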
You need to step back and appreciate that the manner in which you are presenting Solaris with disks is the problem, and not necessarily ZFS.

As your storage system is incapable of JBOD operation, you have decided to present each disk as a 'simple' RAID0 volume. Whilst this looks like a 'pass-thru' access method to the disk and its contents, it is far from it. The HW RAID sub-system is creating a logical volume based on this single spindle (in exactly the same way it would for multiple spindles, aka a stripe), and metadata is recorded by the RAID system with regard to the make-up of said volume.

The important issue here is that you have a non-redundant RAID (!) config, hence a single failure (in this case your single spindle failure) causes the RAID sub-system to declare the volume (and hence its operational status) failed, and this in turn is reported to the OS as a failed volume. At this juncture intervention is normally necessary to destroy and re-create the volume (remember, no redundancy, so this is manual!) and hence re-present it to the OS (which will assign a new UID to the volume and treat it as a new device). On occasion it may be possible to intervene and "resurrect" a volume by manually overriding the status of the RAID0 volume, but in many HW RAID systems this is not recommended.

In short, you've got more abstractions (layers) in place than you need or desire, and that is fundamentally the cause of your problem. Either plump for a simpler array, or swallow some loss of transparency in the ZFS layer and present redundant RAID sets from your array, and live with the consequences: increased admin and complexity and some loss of transparency/protection, but hopefully a RAID sub-system capable of automated recovery in most simple failure cases.

Craig

On 27 Jan 2009, at 13:00, Edmund White wrote:

> [snip]
--
Craig Morgan
t: +44 (0)791 338 3190
f: +44 (0)870 705 1726
e: craig.morgan at sun.com
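Once the RAID0 volume has been re-created and re-presented to the OS, the ZFS-side recovery would then be along these lines (a sketch only, using the device name from the earlier zpool output):

# tell ZFS to rebuild onto the re-presented device
zpool replace test c1t5d0p0

# watch the resilver complete
zpool status test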
Given that I have lots of ProLiant equipment, are there any recommended controllers that would work in this situation? Is this an issue unique to the Smart Array controllers?

If I do choose to use some level of hardware RAID on the existing Smart Array P400, what's the best way to use it with ZFS (assume 8 disks with an emphasis on capacity)?

--
Edmund William White
ewwhite at mac.com

> From: Craig Morgan <Craig.Morgan at Sun.COM>
> Date: Tue, 27 Jan 2009 13:54:46 +0000
> To: Edmund White <ewwhite at mac.com>
> Cc: Alex <alex at pancentric.com>, <zfs-discuss at opensolaris.org>
> Subject: Re: [zfs-discuss] Problems using ZFS on Smart Array P400
>
> [snip]
Fajar A. Nugraha
2009-Jan-28 02:23 UTC
[zfs-discuss] Problems using ZFS on Smart Array P400
On Tue, Jan 27, 2009 at 10:41 PM, Edmund White <ewwhite at mac.com> wrote:

> Given that I have lots of ProLiant equipment, are there any recommended
> controllers that would work in this situation? Is this an issue unique to
> the Smart Array controllers?

It's an issue with controllers that can't present JBOD to the OS. I think IBM and Dell are affected as well.

> If I do choose to use some level of hardware RAID on the existing Smart Array
> P400, what's the best way to use it with ZFS (assume 8 disks with an emphasis
> on capacity)?

Both Craig and I have mentioned it earlier: let the HW controller manage the RAID redundancy (RAID5 should do).

Regards,
Fajar
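For example, with the 8 disks configured as a single RAID5 logical drive on the P400, the ZFS side is just a plain pool (non-redundant from ZFS's point of view) on that one device. The device name below is only an example:

# the controller presents the whole RAID5 set as one logical drive;
# ZFS adds checksumming and snapshots on top, while the controller
# and its battery-backed cache handle the redundancy and rebuilds
zpool create tank c1t0d0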