My first real-hardware Solaris install. I've installed S10 u2 on a system with an Asus M2n-SLI Deluxe nForce 570-SLI motherboard, Athlon 64 X2 dual core CPU. It's in a Chenbro SR107 case with two Chenbro 4-drive SATA hot-swap bays.

C1D0 is in the first hot-swap bay, and is the boot drive (an 80GB). C2D0 is in the second bay, and is not used (eventually this will be a mirror of the boot drive, via the controller hardware, or Solaris software; possibly I'll even go to a ZFS boot system, when that's available). Another 80GB drive.

C3D0 and C4D0 are 400GB drives in the third and fourth hot-swap bays. They're in a ZFS mirror vdev, and are the only thing in ZFS pool zp1.

Everything works fine; I can create ZFS filesystems on the ZFS pool, put files on them, read them back, etc. I can run a scrub and it all checks out. ZFS status reports it healthy, online, all there, etc.

So, having gotten this far, and it being a scratch install and all, I reached over and pulled out C3D0. I then typed a zpool status command. This hung after the first line of output. And I started getting messages on the console, saying things like (retyped; the system never really unhung, and isn't on a network yet anyway):

gda: Warning: /pci@0,0/pci-ide@5,1/ide@1/cmdk@0,0 (disk 3)
Error for command: write sector Error level: informational

gda: sense key: aborted command
gda: vendor "Gen-ATA" error code: 0x3
<illegible>: ata-disk start: select failed

Eventually I have to hard-reset the box. It comes up again fine, and the pool is okay (I pushed the drive back in), and scrub doesn't find any errors.

So what's going on? Does there have to be some special driver to communicate with the hot-swap hardware? I didn't think one was needed.

Also, shouldn't some of these error messages end up in some kind of log file on disk somewhere? I found /var/log/syslog, and some other log files nearby, and none of them had any disk-related issues at all. Are those log files kept somewhere else entirely?

--
David Dyer-Bennet, <mailto:dd-b at dd-b.net>, <http://www.dd-b.net/dd-b/>
RKBA: <http://www.dd-b.net/carry/>
Pics: <http://www.dd-b.net/dd-b/SnapshotAlbum/>
Dragaera/Steven Brust: <http://dragaera.info/>
It sounds like the SATA (or SD) driver might be overambitious at retrying operations. It seems to me, coming from the SCSI world, that a "select failed" really ought to be a pretty strong indication that the device is gone; but perhaps SATA acts quite differently. As for these messages, they ought to wind up in /var/adm/messages by default; the standard Solaris /etc/syslog.conf file directs them there. This message posted from opensolaris.org
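To make the syslog.conf point a bit more concrete, here is a rough Python sketch of how selector lines route a message to a destination. The two sample lines are written from memory of a stock Solaris file and are an assumption, as is treating the disk warnings as kern-facility messages; check the real /etc/syslog.conf on your own system. The destinations() helper is made up for illustration. It does not expand m4 macros, does not handle comma-separated facility lists, and skips "none" instead of treating it as an exclusion.

```python
# Severity names in syslog order: lower index means more severe.
LEVELS = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# ASSUMPTION: sample lines resembling a stock Solaris /etc/syslog.conf,
# written from memory. The real file on your system is the authority.
SAMPLE_CONF = """\
# selector<TAB>action
*.err;kern.notice;auth.notice\t\t\t/dev/sysmsg
*.err;kern.debug;daemon.notice;mail.crit\t/var/adm/messages
"""


def destinations(conf_text, facility, level):
    """Return the actions whose selector matches facility.level.

    Hypothetical helper for illustration only; it is a simplified
    reading of the selector syntax, not a full syslogd parser.
    """
    wanted = LEVELS.index(level)
    matches = []
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # no action field on this line; skip it
        selector, action = parts
        for item in selector.split(";"):
            fac, _, lvl = item.partition(".")
            if lvl not in LEVELS:
                continue  # skips "none" and anything unrecognised
            # A selector level matches that severity and anything more severe.
            if fac in ("*", facility) and wanted <= LEVELS.index(lvl):
                matches.append(action.strip())
                break
    return matches


if __name__ == "__main__":
    print(destinations(SAMPLE_CONF, "kern", "warning"))
```

With these sample lines, a kern.warning message matches both /dev/sysmsg and /var/adm/messages, while a kern.info message matches only /var/adm/messages, which fits the advice to look there rather than in /var/log/syslog.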
David Dyer-Bennet wrote:

[...]
> So, having gotten this far, and it being a scratch install and all, I
> reached over and pulled out C3D0. I then typed a zpool status
> command. This hung after the first line of output. And I started
> getting messages on the console, saying things like (retyped; the
> system never really unhung, and isn't on a network yet anyway):
>
> gda: Warning: /pci@0,0/pci-ide@5,1/ide@1/cmdk@0,0 (disk 3)
> Error for command: write sector Error level: informational
>
> gda: sense key: aborted command
> gda: vendor "Gen-ATA" error code: 0x3
> <illegible>: ata-disk start: select failed
>
> Eventually I have to hard-reset the box. It comes up again fine, and
> the pool is okay (I pushed the drive back in), and scrub doesn't find
> any errors.
>
> So what's going on? Does there have to be some special driver to
> communicate with the hot-swap hardware? I didn't think one was
> needed.

Here's my take without looking anything up. While the drive is physically hot-pluggable, the software stack doesn't support what you did. I *think* the correct sequence of events would probably be:

1. Detach the mirrored drive (c3d0) from the ZFS pool.
2. Use 'cfgadm' to 'unplug' the drive
3. Remove it

However, it sounds like your drive is SATA running in legacy ATA mode with the ata driver, and I do not off-hand know if the driver and/or cfgadm support 'unplugging' it.

Dana
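The three steps above could be sketched roughly as follows, as a dry-run helper in Python. This is only an illustration of the suggested order, not a tested procedure: removal_plan() and run_plan() are made-up names, and "sata0/2" is a hypothetical attachment point ID. Real IDs come from `cfgadm -al`, and with the legacy ata driver there may be no attachment point for the disk at all. Later replies in this thread report that the workaround did not help on this hardware.

```python
import shlex
import subprocess


def removal_plan(pool, disk, ap_id):
    """Build the commands for steps 1 and 2; step 3 is pulling the drive.

    Hypothetical helper: it only assembles argument lists.
    """
    return [
        ["zpool", "detach", pool, disk],         # 1. drop the disk from the mirror
        ["cfgadm", "-c", "unconfigure", ap_id],  # 2. ask the OS to release the device
    ]


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in plan:
        print("+ " + " ".join(shlex.quote(arg) for arg in cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failure


if __name__ == "__main__":
    # ASSUMPTION: "sata0/2" is a placeholder attachment point ID.
    run_plan(removal_plan("zp1", "c3d0", "sata0/2"))
```

Keeping dry_run=True by default means the script only prints what it would do, which seems the safer choice while it is still unclear whether the driver supports unconfiguring the disk.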
On September 8, 2006 9:34:29 PM -0500 David Dyer-Bennet <dd-b at dd-b.net> wrote:
> My first real-hardware Solaris install. I've installed S10 u2 on a
> system with an Asus M2n-SLI Deluxe nForce 570-SLI motherboard, Athlon
> 64 X2 dual core CPU. It's in a Chenbro SR107 case with two Chenbro
> 4-drive SATA hot-swap bays.

... [ hot swap doesn't work ] ...

> So what's going on?

S10 does not support SATA hot swap unless they are SATA drives behind a SAS controller. Maybe it will work in U3.

-frank
On 9/9/06, Frank Cusack <frank at cusack.net> wrote:
> On September 8, 2006 9:34:29 PM -0500 David Dyer-Bennet <dd-b at dd-b.net>
> wrote:
> > My first real-hardware Solaris install. I've installed S10 u2 on a
> > system with an Asus M2n-SLI Deluxe nForce 570-SLI motherboard, Athlon
> > 64 X2 dual core CPU. It's in a Chenbro SR107 case with two Chenbro
> > 4-drive SATA hot-swap bays.
> ... [ hot swap doesn't work ] ...
> > So what's going on?
>
> S10 does not support SATA hot swap unless they are SATA drives behind
> a SAS controller.

Gack!

I don't want to sound *too* pissy about this, but I'd be a lot happier if somebody had thought to mention this when I was first asking about issues I'd face building a home disk server on an SATA hot-swap box. Because now I've invested $2k in this box that apparently can't really do what I want, at least yet.

I see suggestions on what might be a usable workaround (basically telling zfs manually to stop using the disk before physically removing it), and a hope that full hot-swap might appear in a later release.

I don't suppose Express or Community Edition might already have released the bits needed for SATA hot-swap?

I'll certainly try out the workaround, and if it does work, that's really a pretty adequate workaround; mostly what it loses is the cool demo potential of simply casually ripping the drive out.

--
David Dyer-Bennet
On 9/9/06, Dana H. Myers <Dana.Myers at sun.com> wrote:
> David Dyer-Bennet wrote:
>
> Here's my take without looking anything up. While the drive is physically
> hot-pluggable, the software stack doesn't support what you did. I *think*
> the correct sequence of events would probably be:
>
> 1. Detach the mirrored drive (c3d0) from the ZFS pool.
> 2. Use 'cfgadm' to 'unplug' the drive
> 3. Remove it

I'll look into this; it'd be excellent to have some sort of workaround. Other than the loss of cool demo value (and I'm not using this system to sell ZFS or anything, demo value is purely recreational here) and the slight possibility of forgetting and pulling a drive without typing the right commands first, having to type a few commands isn't a problem.

> However, it sounds like your drive is SATA running in legacy ATA
> mode with the ata driver, and I do not off-hand know if the driver
> and/or cfgadm support 'unplugging' it.

Any idea how I could approach finding drivers to let me run in a more real SATA mode? Is it worth trying Express or CE just to see?

--
David Dyer-Bennet
On September 9, 2006 10:51:30 AM -0500 David Dyer-Bennet <dd-b at dd-b.net> wrote:
> I see suggestions on what might be a usable workaround (basically
> telling zfs manually to stop using the disk before physically removing
> it), and a hope that full hot-swap might appear in a later release.

My experience is that this won't work. Once you remove a SATA drive you no longer have access to it, period. This kind of makes sense; hot swap works at the device level, not the filesystem level.

-frank
On September 9, 2006 10:51:30 AM -0500 David Dyer-Bennet <dd-b at dd-b.net> wrote:
> On 9/9/06, Frank Cusack <frank at cusack.net> wrote:
>> On September 8, 2006 9:34:29 PM -0500 David Dyer-Bennet <dd-b at dd-b.net>
>> wrote:
>> > My first real-hardware Solaris install. I've installed S10 u2 on a
>> > system with an Asus M2n-SLI Deluxe nForce 570-SLI motherboard, Athlon
>> > 64 X2 dual core CPU. It's in a Chenbro SR107 case with two Chenbro
>> > 4-drive SATA hot-swap bays.
>> ... [ hot swap doesn't work ] ...
>> > So what's going on?
>>
>> S10 does not support SATA hot swap unless they are SATA drives behind
>> a SAS controller.
>
> Gack!
>
> I don't want to sound *too* pissy about this, but I'd be a lot happier
> if somebody had thought to mention this when I was first asking about
> issues I'd face building a home disk server on an SATA hot-swap box.
> Because now I've invested $2k in this box that apparently can't really
> do what I want, at least yet.

Buy a SAS controller. Of course that's more money down the hole if it doesn't work out for you.

-frank
On 9/9/06, Frank Cusack <fcusack at fcusack.com> wrote:
> On September 9, 2006 10:51:30 AM -0500 David Dyer-Bennet <dd-b at dd-b.net>
> wrote:
> > I see suggestions on what might be a usable workaround (basically
> > telling zfs manually to stop using the disk before physically removing
> > it), and a hope that full hot-swap might appear in a later release.
>
> My experience is that this won't work. Once you remove a SATA drive you
> no longer have access to it, period. This kind of makes sense; hot swap
> works at the device level, not the filesystem level.

Yeah, it doesn't in fact work.

On the other hand, nv_44 does find my ethernet interfaces, which I don't think S10_u2 did (that system isn't on any net yet), so it's not a total loss.

--
David Dyer-Bennet
On 9/9/06, Frank Cusack <fcusack at fcusack.com> wrote:
> On September 9, 2006 10:51:30 AM -0500 David Dyer-Bennet <dd-b at dd-b.net>
> wrote:
> > On 9/9/06, Frank Cusack <frank at cusack.net> wrote:
> >> On September 8, 2006 9:34:29 PM -0500 David Dyer-Bennet <dd-b at dd-b.net>
> >> wrote:
> >> > My first real-hardware Solaris install. I've installed S10 u2 on a
> >> > system with an Asus M2n-SLI Deluxe nForce 570-SLI motherboard, Athlon
> >> > 64 X2 dual core CPU. It's in a Chenbro SR107 case with two Chenbro
> >> > 4-drive SATA hot-swap bays.
> >> ... [ hot swap doesn't work ] ...
> >> > So what's going on?
> >>
> >> S10 does not support SATA hot swap unless they are SATA drives behind
> >> a SAS controller.
> >
> > Gack!
> >
> > I don't want to sound *too* pissy about this, but I'd be a lot happier
> > if somebody had thought to mention this when I was first asking about
> > issues I'd face building a home disk server on an SATA hot-swap box.
> > Because now I've invested $2k in this box that apparently can't really
> > do what I want, at least yet.
>
> Buy a SAS controller. Of course that's more money down the hole if it
> doesn't work out for you.

If it comes to that, I buy one of the SATA controllers known to work (there's a Supermicro card, for example). I probably run without hot swap for a while, until either it comes out, or I get tired enough of it to buy a card.

Or else I run Linux, but that loses me the ability to add mirrors into the pool (expanding pool size) and the block checksums and scrubbing, so that's not so attractive (unless the ZFS port goes unexpectedly fast).

My big worry going to Solaris was hardware compatibility, and I've gotten bitten pretty bad by it here. Sigh.

--
David Dyer-Bennet
David Dyer-Bennet wrote:
> Any idea how I could approach finding drivers to let me run in a more
> real SATA mode? Is it worth trying Express or CE just to see?

Today, use the Marvell controllers. See marvell88sx(7d). SuperMicro has several products which use these controllers.

-- richard