I'm currently trying to get a SuperMicro JBOD with dual SAS expander chips running in MPxIO, but I'm a total amateur at this and would like to ask how to detect whether MPxIO is working (or not).

My SAS topology is:

*) One LSI SAS2008-equipped HBA (running the latest IT firmware from LSI) with two external ports.
*) Two SAS cables running from the HBA to the SuperMicro JBOD, where they enter the JBOD's rear backplane (which is equipped with two LSI SAS expander chips).
*) From the rear backplane, two internal SAS cables run to the front backplane (also with two SAS expanders on it).
*) The JBOD is populated with 45 2TB Toshiba SAS 7200rpm drives.

The machine also has a PERC H700 for the boot media, configured into a hardware RAID-1 (on which rpool resides).

Here is the relevant part from cfgadm -al for the MPxIO bits:

c5                             scsi-sas   connected    configured   unknown
c5::dsk/c5t50000393D8CB4452d0  disk       connected    configured   unknown
c5::dsk/c5t50000393E8C90CF2d0  disk       connected    configured   unknown
c5::dsk/c5t50000393E8CAF2A6d0  disk       connected    configured   unknown
c5::dsk/c5t50000393E8CAF2AAd0  disk       connected    configured   unknown
c5::dsk/c5t50000393E8CAF2BEd0  disk       connected    configured   unknown
c5::dsk/c5t50000393E8CAF2C6d0  disk       connected    configured   unknown
c5::dsk/c5t50000393E8CAF2E2d0  disk       connected    configured   unknown
c5::dsk/c5t50000393E8CAF2F2d0  disk       connected    configured   unknown
c5::dsk/c5t50000393E8CAF5C6d0  disk       connected    configured   unknown
c5::dsk/c5t50000393E8CAF28Ad0  disk       connected    configured   unknown
c5::dsk/c5t50000393E8CAF32Ed0  disk       connected    configured   unknown
c5::dsk/c5t50000393E8CAF35Ad0  disk       connected    configured   unknown
c5::dsk/c5t50000393E8CAF35Ed0  disk       connected    configured   unknown
c5::dsk/c5t50000393E8CAF36Ad0  disk       connected    configured   unknown
c5::dsk/c5t50000393E8CAF36Ed0  disk       connected    configured   unknown
c5::dsk/c5t50000393E8CAF52Ed0  disk       connected    configured   unknown
c5::dsk/c5t50000393E8CAF53Ad0  disk       connected    configured   unknown
c5::dsk/c5t50000393E8CAF53Ed0  disk       connected    configured   unknown
c5::dsk/c5t50000393E8CAF312d0  disk       connected    configured   unknown
c5::dsk/c5t50000393E8CAF316d0  disk       connected    configured   unknown
c5::dsk/c5t50000393E8CAF506d0  disk       connected    configured   unknown
c5::dsk/c5t50000393E8CAF546d0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C84F5Ed0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C84FBAd0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C851EEd0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C852A6d0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C852C2d0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C852CAd0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C852EAd0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C854BAd0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C854E2d0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C855AAd0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C8509Ad0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C8520Ad0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C8528Ad0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C8530Ed0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C8531Ed0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C8557Ed0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C8558Ed0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C8560Ad0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C85106d0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C85222d0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C85246d0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C85366d0  disk       connected    configured   unknown
c5::dsk/c5t50000393F8C85636d0  disk       connected    configured   unknown
c5::es/ses0                    ESI        connected    configured   unknown
c5::es/ses1                    ESI        connected    configured   unknown
c5::smp/expd0                  smp        connected    configured   unknown
c5::smp/expd1                  smp        connected    configured   unknown
c6                             scsi-sas   connected    configured   unknown
c6::dsk/c6t50000393D8CB4453d0  disk       connected    configured   unknown
c6::dsk/c6t50000393E8C90CF3d0  disk       connected    configured   unknown
c6::dsk/c6t50000393E8CAF2A7d0  disk       connected    configured   unknown
c6::dsk/c6t50000393E8CAF2ABd0  disk       connected    configured   unknown
c6::dsk/c6t50000393E8CAF2BFd0  disk       connected    configured   unknown
c6::dsk/c6t50000393E8CAF2C7d0  disk       connected    configured   unknown
c6::dsk/c6t50000393E8CAF2E3d0  disk       connected    configured   unknown
c6::dsk/c6t50000393E8CAF2F3d0  disk       connected    configured   unknown
c6::dsk/c6t50000393E8CAF5C7d0  disk       connected    configured   unknown
c6::dsk/c6t50000393E8CAF28Bd0  disk       connected    configured   unknown
c6::dsk/c6t50000393E8CAF32Fd0  disk       connected    configured   unknown
c6::dsk/c6t50000393E8CAF35Bd0  disk       connected    configured   unknown
c6::dsk/c6t50000393E8CAF35Fd0  disk       connected    configured   unknown
c6::dsk/c6t50000393E8CAF36Bd0  disk       connected    configured   unknown
c6::dsk/c6t50000393E8CAF36Fd0  disk       connected    configured   unknown
c6::dsk/c6t50000393E8CAF52Fd0  disk       connected    configured   unknown
c6::dsk/c6t50000393E8CAF53Bd0  disk       connected    configured   unknown
c6::dsk/c6t50000393E8CAF53Fd0  disk       connected    configured   unknown
c6::dsk/c6t50000393E8CAF313d0  disk       connected    configured   unknown
c6::dsk/c6t50000393E8CAF317d0  disk       connected    configured   unknown
c6::dsk/c6t50000393E8CAF507d0  disk       connected    configured   unknown
c6::dsk/c6t50000393E8CAF547d0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C84F5Fd0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C84FBBd0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C851EFd0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C852A7d0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C852C3d0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C852CBd0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C852EBd0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C854BBd0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C854E3d0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C855ABd0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C8509Bd0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C8520Bd0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C8528Bd0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C8530Fd0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C8531Fd0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C8557Fd0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C8558Fd0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C8560Bd0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C85107d0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C85223d0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C85247d0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C85367d0  disk       connected    configured   unknown
c6::dsk/c6t50000393F8C85637d0  disk       connected    configured   unknown
c6::es/ses2                    ESI        connected    configured   unknown
c6::es/ses3                    ESI        connected    configured   unknown
c6::smp/expd2                  smp        connected    configured   unknown
c6::smp/expd3                  smp        connected    configured   unknown

I can see all drives in format(1M), but "mpathadm list lu" doesn't show anything, even after I did "stmsboot -e" (and rebooted). Can anybody please help me find out whether MPxIO is properly enabled and how I can start using it? My understanding is that MPxIO should mask each drive under some common name (e.g. c5t20d0) and handle all of the multipathing internally underneath, but maybe I'm wrong and it behaves completely differently. I'd like to create 5x 9-drive raidz's on the JBOD, but I'm not sure how to do it now that I can see each drive twice...

All help greatly appreciated!

Cheers,
--
Saso
Sorry, I can't comment on MPxIO, except that I thought ZFS could by itself discern two paths to the same drive, if only to protect against double-importing the disk into a pool.

2012-05-25 21:07, Sašo Kiselkov wrote:
> I'd like to create 5x 9-drive raidz's on the JBOD, but
> I'm not sure how to do it now that I can see each drive twice...

I am not sure it is a good idea to use such low protection (raidz1) with drives this large. At least, I was led to believe that above 2 TB raidz2 is preferable and raidz3 optimal, because long scrub/resilver times leave a pool that has already lost one drive exposed to a fatal second failure for a long window (single-parity protection cannot survive a double failure).

//Jim
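(To put a rough number on that exposure window, here is a back-of-envelope sketch, assuming an optimistic ~100 MB/s sustained resilver rate on busy 7200 rpm drives. Real resilvers on a wide, loaded raidz are usually much slower, so the window can stretch to a day or more.)

    # Hours needed just to rewrite 2 TB at ~100 MB/s (assumed, not measured)
    echo "2 * 1000 * 1000 / 100 / 3600" | bc -l    # ~5.6 hours, best case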
On 05/25/2012 07:35 PM, Jim Klimov wrote:
> Sorry, I can't comment on MPxIO, except that I thought ZFS could by
> itself discern two paths to the same drive, if only to protect
> against double-importing the disk into a pool.

Unfortunately, it isn't the same thing. MPxIO provides redundant signaling to the drives, independent of the storage/RAID layer above it, so it does have its place (besides simply increasing throughput).

> I am not sure it is a good idea to use such low protection (raidz1)
> with drives this large. At least, I was led to believe that above 2 TB
> raidz2 is preferable and raidz3 optimal, because long scrub/resilver
> times leave a pool that has already lost one drive exposed to a fatal
> second failure for a long window.

I'd use lower protection if it were available :) The data on that array is not very important; the primary design parameter is low cost per MB. We're in a very demanding I/O environment: we need large quantities of high-throughput, high-IOPS storage, but we don't need stellar reliability. If the pool gets corrupted by an unfortunate double-drive failure, well, that's tough, but not unbearable (the pool stores customer channel recordings for nPVR, so nothing critical really).

-- Saso
2012-05-25 21:45, Sašo Kiselkov wrote:
> On 05/25/2012 07:35 PM, Jim Klimov wrote:
>> Sorry, I can't comment on MPxIO, except that I thought ZFS could by
>> itself discern two paths to the same drive, if only to protect
>> against double-importing the disk into a pool.
>
> Unfortunately, it isn't the same thing. MPxIO provides redundant
> signaling to the drives, independent of the storage/RAID layer above
> it, so it does have its place (besides simply increasing throughput).

Yes, I know - I just don't have hands-on experience with that in Solaris (and only limited experience in Linux); not so many dual-link boxes around here :)

> I'd use lower protection if it were available :)
> The data on that array is not very important; the primary design
> parameter is low cost per MB.

Why not just stripe it all then? That would give good speeds ;) Arguably, mirroring would indeed cost about twice as much per MB, but as a tradeoff that may be useful to you: it can also give a lot more IOPS, because more top-level vdevs (TLVDEVs) are available for striping, and read throughput is roughly doubled by mirroring.

> We're in a very demanding I/O environment: we need large
> quantities of high-throughput, high-IOPS storage, but we don't need
> stellar reliability.

Does your array include SSD L2ARC caches? I guess (and want to be corrected if wrong) that, since ZFS tolerates the loss of L2ARC devices so well that mirroring them is not even supported, you may get away with several single-link SSDs connected to one controller or the other (or, more likely, a dedicated controller other than those driving the disk arrays, since IOPS on the few SSDs will be higher than on tens of disks). Likely you shouldn't connect those single-link (SATA) SSDs to the dual-link backplane either - i.e. mount them in the server chassis, not in the JBOD box. I may be wrong though :)

> If the pool gets corrupted by an unfortunate double-drive failure,
> well, that's tough, but not unbearable (the pool stores customer
> channel recordings for nPVR, so nothing critical really).

//Jim
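(If the L2ARC route is taken, attaching cache devices later is a one-liner. A sketch only: "tank" and the c7 device names are placeholders, and as noted above cache devices cannot be mirrored, so losing one only costs cached data.)

    # Add two SSDs as L2ARC cache devices to an existing pool
    zpool add tank cache c7t0d0 c7t1d0

    # Verify the cache vdevs and watch how they fill and get hit
    zpool status tank
    zpool iostat -v tank 5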
See the solution at https://www.illumos.org/issues/644
 -- richard

On May 25, 2012, at 10:07 AM, Sašo Kiselkov wrote:

> I'm currently trying to get a SuperMicro JBOD with dual SAS expander
> chips running in MPxIO, but I'm a total amateur at this and would like
> to ask how to detect whether MPxIO is working (or not).
>
> [SAS topology and full cfgadm -al listing snipped]
>
> I can see all drives in format(1M), but "mpathadm list lu" doesn't show
> anything, even after I did "stmsboot -e" (and rebooted). Can anybody
> please help me find out whether MPxIO is properly enabled and how I can
> start using it? My understanding is that MPxIO should mask each drive
> under some common name (e.g. c5t20d0) and handle all of the
> multipathing internally underneath, but maybe I'm wrong and it behaves
> completely differently. I'd like to create 5x 9-drive raidz's on the
> JBOD, but I'm not sure how to do it now that I can see each drive
> twice...

--
ZFS Performance and Training
Richard.Elling at RichardElling.com
+1-760-896-4422
On 05/25/2012 08:40 PM, Richard Elling wrote:
> See the solution at https://www.illumos.org/issues/644
> -- richard

Good Lord, that was it! It never occurred to me that the drives had a say in this. Thanks a billion!

Cheers,
-- Saso
On 05/25/2012 08:40 PM, Richard Elling wrote:
> See the solution at https://www.illumos.org/issues/644
> -- richard

And predictably, I'm back with another n00b question regarding this array. I've put a pair of LSI 9200-8e controllers in the server and attached the cables from the enclosure to each of the HBAs. As a result (why?) I'm getting some really strange behavior:

* piss-poor performance (around 5 MB/s per disk, tops)
* fmd(1M) running one core at near 100% saturation each time something writes to or reads from the pool
* using fmstat I noticed that it's the eft module receiving hundreds of fault reports every second
* fmd is flooded by multipath failover ereports like:

...
May 29 21:11:44.9408 ereport.io.scsi.cmd.disk.tran
May 29 21:11:44.9423 ereport.io.scsi.cmd.disk.tran
May 29 21:11:44.8474 ereport.io.scsi.cmd.disk.recovered
May 29 21:11:44.9455 ereport.io.scsi.cmd.disk.tran
May 29 21:11:44.9457 ereport.io.scsi.cmd.disk.dev.rqs.derr
May 29 21:11:44.9462 ereport.io.scsi.cmd.disk.tran
May 29 21:11:44.9527 ereport.io.scsi.cmd.disk.tran
May 29 21:11:44.9535 ereport.io.scsi.cmd.disk.dev.rqs.derr
May 29 21:11:44.6362 ereport.io.scsi.cmd.disk.recovered
...

I suspect that multipath is somehow not exactly happy with my Toshiba disks, but I have no idea what to do to make it work at least somewhat acceptably. I tried messing with scsi_vhci.conf to set load-balance="none" and to change the scsi-vhci-failover-override for the Toshiba disks to f_asym_lsi, flashing the latest as well as older firmware on the cards, reseating them in other PCI-e slots, removing one cable and even removing one whole HBA, unloading the eft fmd module, etc., but nothing has helped so far and I'm sort of out of ideas. Anybody else got an idea on what I might try?

Cheers,
-- Saso
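(For reference, the scsi_vhci.conf tweaks described above would look roughly like the excerpt below. This is a sketch only: the "TOSHIBA MK2001TRKB" VID/PID string is a placeholder, and it must match exactly what the drives report, with the vendor field padded out to eight characters.)

    # /kernel/drv/scsi_vhci.conf (excerpt)
    load-balance="none";
    scsi-vhci-failover-override =
        "TOSHIBA MK2001TRKB", "f_asym_lsi";    # placeholder VID/PID

    # After a reboot, watch whether the transport ereports keep coming:
    fmstat 1 5
    fmdump -e | tail -20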
On May 30, 2012, at 1:07 PM, Sašo Kiselkov wrote:

> And predictably, I'm back with another n00b question regarding this
> array. I've put a pair of LSI 9200-8e controllers in the server and
> attached the cables from the enclosure to each of the HBAs. As a result
> (why?) I'm getting some really strange behavior:
>
> * piss-poor performance (around 5 MB/s per disk, tops)
> * fmd(1M) running one core at near 100% saturation each time something
>   writes to or reads from the pool
> * using fmstat I noticed that it's the eft module receiving hundreds of
>   fault reports every second
> * fmd is flooded by multipath failover ereports like:
>
> May 29 21:11:44.9408 ereport.io.scsi.cmd.disk.tran
> May 29 21:11:44.9457 ereport.io.scsi.cmd.disk.dev.rqs.derr
> [...]
>
> Anybody else got an idea on what I might try?

Those ereports are consistent with faulty cabling. You can trace all of the cables and errors using tools like lsiutil, sg_logs, kstats, etc. Unfortunately, it is not really possible to get into this level of detail over email, and it can consume many hours.
 -- richard

--
ZFS Performance and Training
Richard.Elling at RichardElling.com
+1-760-896-4422
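(A few concrete starting points for that kind of tracing. The options below are hedged from memory and worth checking against the man pages, and the device path is a placeholder.)

    # Per-device soft/hard/transport error counters maintained by sd
    iostat -En

    # The same counters as raw kstats, handy for watching deltas
    kstat -p -m sderr | grep -i transport

    # SAS PHY-level counters (invalid DWORDs, disparity errors, PHY resets)
    # from a drive's Protocol Specific Port log page, via sg3_utils
    sg_logs -p 0x18 /dev/rdsk/c0t<WWN>d0s0

    # lsiutil (interactive) can dump equivalent counters for each HBA and
    # expander PHY, which narrows a fault down to a specific cable or bay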
On 05/30/2012 10:53 PM, Richard Elling wrote:
> Those ereports are consistent with faulty cabling. You can trace all of
> the cables and errors using tools like lsiutil, sg_logs, kstats, etc.
> Unfortunately, it is not really possible to get into this level of
> detail over email, and it can consume many hours.
> -- richard

That's actually a pretty good piece of information for me! I will try changing my cabling to see if I can get the errors to go away. Thanks again for the suggestions!

Cheers,
-- Saso
On 05/30/2012 10:53 PM, Richard Elling wrote:
> Those ereports are consistent with faulty cabling. You can trace all of
> the cables and errors using tools like lsiutil, sg_logs, kstats, etc.
> Unfortunately, it is not really possible to get into this level of
> detail over email, and it can consume many hours.
> -- richard

And it turns out you were right. Looking at errors using iostat -E while manipulating the path taken by the data using mpathadm clearly shows that one of the paths is faulty. Thanks again for pointing me in the right direction!

Cheers,
-- Saso
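(For anyone repeating this isolation test later, the path-by-path check can be done roughly as follows; the logical-unit and port names are placeholders taken from mpathadm's own output.)

    # Find the initiator/target port names for one suspect logical unit
    mpathadm list lu
    mpathadm show lu /dev/rdsk/c0t<WWN>d0s2

    # Force I/O down one path at a time by disabling the other
    mpathadm disable path -i <initiator-port-name> -t <target-port-name> \
        -l /dev/rdsk/c0t<WWN>d0s2

    # Generate some I/O, then compare the per-device "Transport Errors"
    iostat -En

    # Re-enable the path afterwards
    mpathadm enable path -i <initiator-port-name> -t <target-port-name> \
        -l /dev/rdsk/c0t<WWN>d0s2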