Roberto Scudeller
2012-Jul-17  14:43 UTC
[zfs-discuss] Problem: Disconnected command timeout for Target X
Hi all,

I'm using OpenSolaris snv_134 with LSI controllers and a Supermicro motherboard, with 20 SATA disks and ZFS in a RAID-10 configuration. This zfs_storage is mounted over NFS.

I'm not an OpenSolaris specialist. What are the commands to show hardware information? Something like 'lshw' in Linux, but for OpenSolaris.

The storage stopped working, but it still responds to ping. SSH and NFS are down. When I open the console, it shows these messages:

Jul  2 13:00:27 storage scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,3140@0 (mpt2):
Jul  2 13:00:27 storage         Disconnected command timeout for Target 4
Jul  2 13:01:28 storage scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,3140@0 (mpt2):
Jul  2 13:01:28 storage         Disconnected command timeout for Target 3
Jul  2 13:02:28 storage scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,3140@0 (mpt2):
Jul  2 13:02:28 storage         Disconnected command timeout for Target 2
Jul  2 13:03:29 storage scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,3140@0 (mpt2):
Jul  2 13:03:29 storage         Disconnected command timeout for Target 1
Jul  2 13:04:29 storage scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,3140@0 (mpt2):
Jul  2 13:04:29 storage         Disconnected command timeout for Target 0
Jul  2 13:05:40 storage scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,3140@0 (mpt2):
Jul  2 13:05:40 storage         Disconnected command timeout for Target 6
Jul  2 13:06:40 storage scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,3140@0 (mpt2):
Jul  2 13:06:40 storage         Disconnected command timeout for Target 5

Any ideas? Could you help me?

--
Roberto Scudeller
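To see at a glance which targets are timing out and how often, the console messages can be tallied per target. The following is only a minimal sketch: the function name count_timeouts is made up for illustration, and it assumes the target number is the last field of each "Disconnected command timeout" line, as in the messages quoted above.

```shell
#!/bin/sh
# Tally "Disconnected command timeout" warnings per target.
# Reads syslog text on stdin. Assumption: the target number is the
# last whitespace-separated field on the matching line.
count_timeouts() {
    awk '/Disconnected command timeout for Target/ { n[$NF]++ }
         END { for (t in n) print "target " t ": " n[t] }' |
        sort -k2,2n   # order the summary by target number
}

# Usage, reading the default Solaris system log:
#   count_timeouts < /var/adm/messages
```

If the counts are spread over many targets on the same controller instead of concentrated on one disk, that points toward something the disks have in common.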
Bob Friesenhahn
2012-Jul-17  15:23 UTC
[zfs-discuss] Problem: Disconnected command timeout for Target X
On Tue, 17 Jul 2012, Roberto Scudeller wrote:

> I'm not opensolaris specialist. What're the commands to show hardware
> information? Like 'lshw' in linux but for opensolaris.

cfgadm, prtconf, prtpicl, prtdiag

zpool status

fmadm faulty

It sounds like you may have a broken cable or a power supply failure affecting some of the disks.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
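The commands listed above can be collected into one report. This is a sketch, not a tested procedure for this particular system: the helper names run_if_present and hw_report are invented for illustration, and the option flags shown are common choices added here, not ones given in the reply.

```shell
#!/bin/sh
# Run a command only if it is installed, so the script still works
# on hosts that lack some of the Solaris tools.
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $*"
        "$@" 2>&1
    else
        echo "== $1: not found, skipped"
    fi
}

# Collect the diagnostics named in the reply.
hw_report() {
    run_if_present cfgadm -al       # attachment points: controllers and disks
    run_if_present prtconf -v       # device tree with driver properties
    run_if_present prtdiag -v       # platform summary: board, CPUs, memory
    run_if_present prtpicl -v       # PICL platform tree (very verbose)
    run_if_present zpool status -x  # only pools that are not healthy
    run_if_present fmadm faulty     # faults diagnosed by the fault manager
}

# Usage (as root; the output path is only an example):
#   hw_report > /var/tmp/hw_report.txt
```

Saving the output while the system is healthy gives a baseline to compare against after a failure.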
Roberto Scudeller
2012-Jul-17  20:34 UTC
[zfs-discuss] Problem: Disconnected command timeout for Target X
Hi Bob,

Thanks for the answers.

How do I test your theory?

In this case I use ordinary SATA 2 disks, not Nearline SAS (NL SATA) or SAS. Do you think the SATA disks are the problem?

Cheers,
--
Roberto Scudeller
Bob Friesenhahn
2012-Jul-17  20:46 UTC
[zfs-discuss] Problem: Disconnected command timeout for Target X
On Tue, 17 Jul 2012, Roberto Scudeller wrote:

> How do I test your theory?

I would use 'dd' to see if it is possible to transfer data from one of the problem devices. Gain physical access to the system and closely check the signal and power cables to these devices.

Use 'iostat -xe' to see what error counts have accumulated. Also 'iostat -E'.

> In this case, I use common disks SATA 2, not Nearline SAS (NL SATA) or SAS.
> Do you think the disks SATA are the problem?

There have been reports of congestion leading to timeouts and resets when SATA disks are on expanders. There have also been reports that one failing disk can cause problems when on expanders.

Regardless, if this system has previously been operating fine for some time, these errors would indicate a change in the hardware shared by all these devices.

Bob
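The 'dd' check suggested above could look roughly like this. It is a sketch under assumptions: read_test is a made-up helper name, the device path in the usage comment is hypothetical, and the 100 MiB default read size is an arbitrary choice.

```shell
#!/bin/sh
# Read the first N MiB from a device and discard the data.
# A non-zero exit status from dd is reported as a failed read.
read_test() {
    dev=$1
    count=${2:-100}   # number of 1 MiB blocks to read; arbitrary default
    if dd if="$dev" of=/dev/null bs=1024k count="$count" 2>/dev/null; then
        echo "$dev: read OK"
    else
        echo "$dev: read FAILED"
        return 1
    fi
}

# Usage (hypothetical device name; take the real ones from 'iostat -E'):
#   read_test /dev/rdsk/c0t4d0s0
#   iostat -xe    # extended statistics including error columns
#   iostat -E     # accumulated error totals per device
```

A read from a device that is timing out may block for a long time before dd returns, so it is worth running the test on one suspect disk at a time.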