So, finally! Sun has a SAS/SATA JBOD that doesn't have the extra crap we, as ZFS users, do not need.

How does ZFS handle device renumbering? IIRC from my last foray into SAS, the LSI or Sun HBA remembers drive IDs: if you remove all the drives from a chassis (some SAS/SATA chassis, not the J4500 specifically) and put them back in random order, they keep the SCSI IDs they had before, because the controller remembers which drive was which. That makes it difficult to know which drive to pull when one fails, since the chassis slot numbering isn't related to the SCSI ID (and the SCSI ID is all you have to go on). It also makes single-drive replacement awkward, even when the SCSI-ID-to-slot mapping is already known, because the new drive doesn't show up *instead of* the old one; it shows up as a new addition.

Just wondering how this works now. Or does the SAS driver do away with SCSI IDs entirely?

-frank
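P.S. I realize ZFS itself shouldn't care about the numbering -- each vdev is identified by the GUID in its on-disk label, so presumably something like this still works after a reshuffle (pool name made up):

  zpool export tank
    ...pull the drives and put them back in a different order...
  zpool import tank     # import scans the devices and matches them by label GUID
  zpool status tank     # same pool; devices just show up under their new c#t#d# names

It's the mapping from device name to physical slot that I'm worried about.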
Well, I haven't used a J4500, but when we had an x4500 (Thumper) on loan they had Solaris pretty well integrated with the hardware. When a disk failed, I used cfgadm to offline it, and as soon as I did that a bright blue "Ready to Remove" LED lit up on the drive tray of the faulty disk, right next to the handle you need to lift to remove the drive. There's also a bright red "Fault" LED as well as the standard green "OK" LED, so spotting failed drives really should be a piece of cake. Certainly in my tests, as long as you followed the procedure in the manual, it really was impossible to get the wrong drive.
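The rough sequence was something like the following -- I don't have the box any more, so the attachment point and device names here are illustrative rather than copied from the real thing:

  # take the failing disk out of the pool, then offline it at the bus level
  zpool offline tank c1t4d0
  cfgadm -c unconfigure sata1/4      # the blue "Ready to Remove" LED lights up
    ...swap the drive in the indicated tray...
  cfgadm -c configure sata1/4
  zpool replace tank c1t4d0          # resilver starts; watch it with zpool status

cfgadm -al shows the attachment point for each disk if you're not sure which one to use.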
On Tue, Jul 15, 2008 at 4:17 AM, Ross <myxiplx at hotmail.com> wrote:
> Well I haven't used a J4500, but when we had an x4500 (Thumper) on loan they had Solaris
> pretty well integrated with the hardware. When a disk failed, I used cfgadm to offline it
> and as soon as I did that a bright blue "Ready to Remove" LED lit up on the drive tray of
> the faulty disk, right next to the handle you need to lift to remove the drive. [...]

http://blogs.sun.com/eschrock/entry/external_storage_enclosures_in_solaris has a bit more info on some of this -- while I would expect Sun products to integrate that well, it's nice to know the framework is there for other vendors to do the same if they wish.
On July 15, 2008 7:44:53 AM -0500 Jason King <jason at ansipunx.net> wrote:
> http://blogs.sun.com/eschrock/entry/external_storage_enclosures_in_solaris
> has a bit more info on some of this -- while I would expect Sun
> products to integrate that well, it's nice to know the framework is
> there for other vendors to do the same if they wish.

Wow, that's hot off the presses. Sounds perfect, but I guess it will be a while before it makes it into Solaris. I'm surprised Sun would release a storage product without the ability to know which drive has failed.

-frank
On Tue, 15 Jul 2008, Ross wrote:
> Well I haven't used a J4500, but when we had an x4500 (Thumper) on
> loan they had Solaris pretty well integrated with the hardware.
> When a disk failed, I used cfgadm to offline it and as soon as I did
> that a bright blue "Ready to Remove" LED lit up on the drive tray of
> the faulty disk, right next to the handle you need to lift to remove
> the drive.

That sure sounds a whole lot easier to manage than my setup with a StorageTek 2540 and each drive as a LUN. The 2540 can detect a failed drive by itself and turn an LED on, but if ZFS decides that a drive has failed and the 2540 does not, then I have to use the 2540's CAM administrative interface and manually take the drive out of service. I very much doubt that cfgadm will communicate with the 2540 and tell it to do anything.

A little while back I created this table so I could understand how things were mapped:

Disk    Volume   LUN  WWN                                              Device                                 ZFS
======  =======  ===  ===============================================  =====================================  ====
t85d01  Disk-01    0  60:0A:0B:80:00:3A:8A:0B:00:00:09:61:47:B4:51:BE  c4t600A0B80003A8A0B0000096147B451BEd0  P3-A
t85d02  Disk-02    1  60:0A:0B:80:00:39:C9:B5:00:00:0A:9C:47:B4:52:2D  c4t600A0B800039C9B500000A9C47B4522Dd0  P6-A
t85d03  Disk-03    2  60:0A:0B:80:00:39:C9:B5:00:00:0A:A0:47:B4:52:9B  c4t600A0B800039C9B500000AA047B4529Bd0  P1-B
t85d04  Disk-04    3  60:0A:0B:80:00:3A:8A:0B:00:00:09:66:47:B4:53:CE  c4t600A0B80003A8A0B0000096647B453CEd0  P4-A
t85d05  Disk-05    4  60:0A:0B:80:00:39:C9:B5:00:00:0A:A4:47:B4:54:4F  c4t600A0B800039C9B500000AA447B4544Fd0  P2-B
t85d06  Disk-06    5  60:0A:0B:80:00:3A:8A:0B:00:00:09:6A:47:B4:55:9E  c4t600A0B80003A8A0B0000096A47B4559Ed0  P1-A
t85d07  Disk-07    6  60:0A:0B:80:00:39:C9:B5:00:00:0A:A8:47:B4:56:05  c4t600A0B800039C9B500000AA847B45605d0  P3-B
t85d08  Disk-08    7  60:0A:0B:80:00:3A:8A:0B:00:00:09:6E:47:B4:56:DA  c4t600A0B80003A8A0B0000096E47B456DAd0  P2-A
t85d09  Disk-09    8  60:0A:0B:80:00:39:C9:B5:00:00:0A:AC:47:B4:57:39  c4t600A0B800039C9B500000AAC47B45739d0  P4-B
t85d10  Disk-10    9  60:0A:0B:80:00:39:C9:B5:00:00:0A:B0:47:B4:57:AD  c4t600A0B800039C9B500000AB047B457ADd0  P5-B
t85d11  Disk-11   10  60:0A:0B:80:00:3A:8A:0B:00:00:09:73:47:B4:57:D4  c4t600A0B80003A8A0B0000097347B457D4d0  P5-A
t85d12  Disk-12   11  60:0A:0B:80:00:39:C9:B5:00:00:0A:B4:47:B4:59:5F  c4t600A0B800039C9B500000AB447B4595Fd0  P6-B

When I selected the drive pairings, it was based on a dump from a multipath utility, and it seems that at the chassis level there is no rhyme or reason to the ZFS mirror pairings.

This is an area where traditional RAID hardware makes ZFS more difficult to use.

Bob
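P.S. For anyone wanting to build a similar table: I don't recall the exact invocations off-hand, but the raw data can be pulled from commands along these lines (the device name is just one row from the table above):

  # multipathed LUNs and their scsi_vhci device names
  mpathadm list lu
  # per-LUN detail, including the WWN the array assigned to that volume
  luxadm display /dev/rdsk/c4t600A0B80003A8A0B0000096147B451BEd0s2
  # and zpool status supplies the ZFS column

The tedious part is cross-referencing those WWNs against what CAM shows for each physical drive.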
It sounds like you might be interested to read up on Eric Schrock's work. I read today about some of the stuff he's been doing to bring integrated fault management to Solaris:

http://blogs.sun.com/eschrock/entry/external_storage_enclosures_in_solaris

His last paragraph is great to see; Sun really do seem to be headed in the right direction:

"I often like to joke about the amount of time that I have spent just getting a single LED to light. At first glance, it seems like a pretty simple task. But to do it in a generic fashion that can be generalized across a wide variety of platforms, correlated with physically meaningful labels, and incorporate a diverse set of diagnoses (ZFS, SCSI, HBA, etc) requires an awful lot of work. Once it's all said and done, however, future platforms will require little to no integration work, and you'll be able to see a bad drive generate checksum errors in ZFS, resulting in a FMA diagnosis indicating the faulty drive, activate a hot spare, and light the fault LED on the drive bay (wherever it may be). Only then will we have accomplished our goal of an end-to-end storage strategy for Solaris - and hopefully someone besides me will know what it has taken to get that little LED to light."

Ross

> Date: Tue, 15 Jul 2008 12:51:22 -0500
> From: bfriesen at simple.dallas.tx.us
> To: myxiplx at hotmail.com
> CC: zfs-discuss at opensolaris.org
> Subject: Re: [zfs-discuss] J4500 device renumbering
>
> That sure sounds a whole lot easier to manage than my setup with a
> StorageTek 2540 and each drive as a LUN. The 2540 could detect a
> failed drive by itself and turn an LED on, but if ZFS decides that a
> drive has failed and the 2540 does not, then I will have to use the
> 2540's CAM administrative interface and manually set the drive out of
> service. [...]
On Tue, 15 Jul 2008, Ross Smith wrote:
> It sounds like you might be interested to read up on Eric Schrock's work. I read today
> about some of the stuff he's been doing to bring integrated fault management to Solaris:
> http://blogs.sun.com/eschrock/entry/external_storage_enclosures_in_solaris
> His last paragraph is great to see, Sun really do seem to be headed in the right direction:

That does sound good. It seems like this effort is initially limited to SAS enclosures.

It would be nice to see a special Solaris "JBOD" operating mode for firmware-managed RAID shelves. Currently there is no relationship between a LUN and a drive unless the administrator carefully establishes that mapping, and the Solaris system does not know how the LUN is constructed. With a special "JBOD" mode, the RAID shelf could put a set of drives into JBOD mode with a single command, and the Solaris fault/device framework could then be integrated so that it knows about drive errors, knows how to light LEDs, and can take drives in and out of service.

I have no doubt that the Solaris command-line administrative utilities that come with CAM could be used to do useful things, but these may not be well documented.

Bob
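P.S. Even today the command-line pieces that do exist can at least be queried to see what the fault system knows; something like:

  # current FMA diagnoses -- a faulted disk would show up here with its FMRI
  fmadm faulty
  # the underlying error and fault events behind a diagnosis
  fmdump -v
  # ZFS's own view of any unhealthy pools
  zpool status -x

What is missing is the glue that turns those diagnoses into a lit LED on the right shelf slot.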
On Tue, 2008-07-15 at 15:32 -0500, Bob Friesenhahn wrote:
> That does sound good. It seems like this effort is initially limited
> to SAS enclosures.

It seems to get some info from a SE3510 jbod (fiberchannel), but doesn't identify which disk is in each drive slot:

# /usr/lib/fm/fmd/fmtopo -V '*/ses-enclosure=0/bay=0'
TIME                 UUID
Jul 15 17:33:37      6033e234-94a3-ca79-9138-af1ee7f95b8d

hc://:product-id=SUN-StorEdge-3510F-D:chassis-id=205000c0ff086b4a:server-id=/ses-enclosure=0/bay=0
  group: protocol                       version: 1   stability: Private/Private
    resource       fmri     hc://:product-id=SUN-StorEdge-3510F-D:chassis-id=205000c0ff086b4a:server-id=/ses-enclosure=0/bay=0
    label          string   Disk Drives 0
    FRU            fmri     hc://:product-id=SUN-StorEdge-3510F-D:chassis-id=205000c0ff086b4a:server-id=/ses-enclosure=0/bay=0
  group: authority                      version: 1   stability: Private/Private
    product-id     string   SUN-StorEdge-3510F-D
    chassis-id     string   205000c0ff086b4a
    server-id      string
  group: ses                            version: 1   stability: Private/Private
    node-id        uint64   0x3
    target-path    string   /dev/es/ses0

# /usr/lib/fm/fmd/fmtopo '*/ses-enclosure=0/*'
TIME                 UUID
Jul 15 17:35:23      16ff7d01-7f1d-e8ef-f8a5-d60a01d99b68

hc://:product-id=SUN-StorEdge-3510F-D:chassis-id=205000c0ff086b4a:server-id=/ses-enclosure=0/psu=0
hc://:product-id=SUN-StorEdge-3510F-D:chassis-id=205000c0ff086b4a:server-id=/ses-enclosure=0/psu=1
hc://:product-id=SUN-StorEdge-3510F-D:chassis-id=205000c0ff086b4a:server-id=/ses-enclosure=0/fan=0
hc://:product-id=SUN-StorEdge-3510F-D:chassis-id=205000c0ff086b4a:server-id=/ses-enclosure=0/fan=1
hc://:product-id=SUN-StorEdge-3510F-D:chassis-id=205000c0ff086b4a:server-id=/ses-enclosure=0/fan=2
hc://:product-id=SUN-StorEdge-3510F-D:chassis-id=205000c0ff086b4a:server-id=/ses-enclosure=0/fan=3
hc://:product-id=SUN-StorEdge-3510F-D:chassis-id=205000c0ff086b4a:server-id=/ses-enclosure=0/bay=0
hc://:product-id=SUN-StorEdge-3510F-D:chassis-id=205000c0ff086b4a:server-id=/ses-enclosure=0/bay=1
hc://:product-id=SUN-StorEdge-3510F-D:chassis-id=205000c0ff086b4a:server-id=/ses-enclosure=0/bay=2
hc://:product-id=SUN-StorEdge-3510F-D:chassis-id=205000c0ff086b4a:server-id=/ses-enclosure=0/bay=3
hc://:product-id=SUN-StorEdge-3510F-D:chassis-id=205000c0ff086b4a:server-id=/ses-enclosure=0/bay=4
hc://:product-id=SUN-StorEdge-3510F-D:chassis-id=205000c0ff086b4a:server-id=/ses-enclosure=0/bay=5
hc://:product-id=SUN-StorEdge-3510F-D:chassis-id=205000c0ff086b4a:server-id=/ses-enclosure=0/bay=6
hc://:product-id=SUN-StorEdge-3510F-D:chassis-id=205000c0ff086b4a:server-id=/ses-enclosure=0/bay=7
hc://:product-id=SUN-StorEdge-3510F-D:chassis-id=205000c0ff086b4a:server-id=/ses-enclosure=0/bay=8
hc://:product-id=SUN-StorEdge-3510F-D:chassis-id=205000c0ff086b4a:server-id=/ses-enclosure=0/bay=9
hc://:product-id=SUN-StorEdge-3510F-D:chassis-id=205000c0ff086b4a:server-id=/ses-enclosure=0/bay=10
hc://:product-id=SUN-StorEdge-3510F-D:chassis-id=205000c0ff086b4a:server-id=/ses-enclosure=0/bay=11

- Bill
Frank Cusack wrote:
> On July 15, 2008 7:44:53 AM -0500 Jason King <jason at ansipunx.net> wrote:
>> http://blogs.sun.com/eschrock/entry/external_storage_enclosures_in_solaris
>> has a bit more info on some of this -- while I would expect Sun
>> products to integrate that well, it's nice to know the framework is
>> there for other vendors to do the same if they wish.
>
> Wow, that's hot off the presses. Sounds perfect, but I guess it will
> be a while before it makes it into Solaris. I'm surprised Sun would
> release a storage product without the ability to know which drive has
> failed.

Eh? The J4000 series, and almost all other Sun storage arrays, use the Sun Common Array Manager (CAM) software, which includes a Fault Management Service (FMS) component that will manage faults, automatically log service calls (aka phone home), etc. For more details on how to use CAM with the J4000 arrays, see the docs at http://docs.sun.com/app/docs/coll/cam6.1, especially the User Guide for the J4000 Array.

The difference between CAM and the native Solaris FMA services is that CAM runs on Solaris, Windows, and Linux servers, and it provides local and remote CLIs as well as a BUI. So I would say that CAM is an array-specific manager which runs on multiple operating systems, as opposed to the work being done by Eric and team, which is to make an array-agnostic manager for Solaris. In truth, both solutions are needed :-)
 -- richard
The current SES enumerator doesn't support parsing AES FC descriptors (which are required to correlate disks with the Solaris abstraction), so we get the PSU/fan/bay information but don't know which disk is which. It should be pretty straightforward to do, though we may need to make sure that the HBAs are exporting the appropriate properties to do the correlation. I don't have a machine to test it on, but it'd be a nice little project for someone sufficiently motivated.

- Eric

On Tue, Jul 15, 2008 at 02:37:09PM -0700, Bill Sommerfeld wrote:
> It seems to get some info from a SE3510 jbod (fiberchannel), but doesn't
> identify which disk is in each drive slot:
> [...]

--
Eric Schrock, Fishworks                    http://blogs.sun.com/eschrock