M P
2009-Nov-11 16:08 UTC
[zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
Server using [b]Sun StorageTek 8-port external SAS PCIe HBA[/b] (mpt driver) connected to an external JBOD array with 12 disks.

Here is a link to the exact SAS (Sun) adapter: http://www.sun.com/storage/storage_networking/hba/sas/PCIe.pdf (LSI SAS3801)

When running IO-intensive operations (zpool scrub) for a couple of hours, the server locks up with the following repeating messages:

Nov 10 16:31:45 sunserver scsi: [ID 365881 kern.info] /pci@0,0/pci10de,376@a/pci1000,3150@0 (mpt0):
Nov 10 16:31:45 sunserver Log info 0x31140000 received for target 17.
Nov 10 16:31:45 sunserver scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
Nov 10 16:32:55 sunserver scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci10de,376@a/pci1000,3150@0 (mpt0):
Nov 10 16:32:55 sunserver Disconnected command timeout for Target 19
Nov 10 16:32:56 sunserver scsi: [ID 365881 kern.info] /pci@0,0/pci10de,376@a/pci1000,3150@0 (mpt0):
Nov 10 16:32:56 sunserver Log info 0x31140000 received for target 19.
Nov 10 16:32:56 sunserver scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
Nov 10 16:34:16 sunserver scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci10de,376@a/pci1000,3150@0 (mpt0):
Nov 10 16:34:16 sunserver Disconnected command timeout for Target 21

I tested this on two servers:
- [b]Sun Fire X2200[/b] using [b]Sun Storage J4200 JBOD[/b] array and
- [b]Dell R410 Server[/b] with [b]Promise VTJ-310SS JBOD array[/b]

Both show the same repeating messages and lock up after a couple of hours of zpool scrub.

Solaris appears to be more stable than OpenSolaris - it doesn't lock up when scrubbing, but it still locks up after 5-6 hours of reading from the JBOD array (10TB in size).

So at this point this looks like an issue with the mpt driver or with these SAS cards (I tested two) when under heavy load. I installed the latest firmware for the SAS card from LSI's web site (v1.29.00), but nothing changed; the server still locks up.

Any ideas or suggestions on how to fix or work around this issue? The adapter is supposed to be enterprise-class.

Here is more detailed log info:

=======================================================
Sun Fire X2200 and Sun Storage J4200 JBOD array
SAS card: Sun StorageTek 8-port external SAS PCIe HBA
http://www.sun.com/storage/storage_networking/hba/sas/PCIe.pdf (LSI SAS3801)
Operating System: SunOS sunserver 5.11 snv_111b i86pc i386 i86pc Solaris

Nov 10 16:30:33 sunserver scsi: [ID 365881 kern.info] /pci@0,0/pci10de,376@a/pci1000,3150@0 (mpt0):
Nov 10 16:30:33 sunserver Log info 0x31140000 received for target 0.
Nov 10 16:30:33 sunserver scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
Nov 10 16:31:43 sunserver scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci10de,376@a/pci1000,3150@0 (mpt0):
Nov 10 16:31:43 sunserver Disconnected command timeout for Target 17
Nov 10 16:32:55 sunserver scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci10de,376@a/pci1000,3150@0 (mpt0):
Nov 10 16:32:55 sunserver Disconnected command timeout for Target 19
Nov 10 16:32:56 sunserver scsi: [ID 365881 kern.info] /pci@0,0/pci10de,376@a/pci1000,3150@0 (mpt0):
Nov 10 16:32:56 sunserver Log info 0x31140000 received for target 19.
Nov 10 16:32:56 sunserver scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
Nov 10 16:34:16 sunserver scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci10de,376@a/pci1000,3150@0 (mpt0):
Nov 10 16:34:16 sunserver Disconnected command timeout for Target 21

----------------
Dell R410 Server and Promise VTJ-310SS JBOD array
SAS card: Sun StorageTek 8-port external SAS PCIe HBA
Operating System: SunOS dellserver 5.10 Generic_141445-09 i86pc i386 i86pc

Nov 11 00:18:22 dellserver scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1028,1f0f@0 (mpt0):
Nov 11 00:18:22 dellserver Disconnected command timeout for Target 0
Nov 11 00:18:22 dellserver scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1028,1f0f@0/sd@0,0 (sd13):
Nov 11 00:18:22 dellserver Error for Command: read(10) Error Level: Retryable
Nov 11 00:18:22 dellserver scsi: [ID 107833 kern.notice] Requested Block: 276886498 Error Block: 276886498
Nov 11 00:18:22 dellserver scsi: [ID 107833 kern.notice] Vendor: Dell Serial Number: Dell Interna
Nov 11 00:18:22 dellserver scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Nov 11 00:18:22 dellserver scsi: [ID 107833 kern.notice] ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
Nov 11 00:19:33 dellserver scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1028,1f0f@0 (mpt0):
Nov 11 00:19:33 dellserver Disconnected command timeout for Target 0
Nov 11 00:19:34 dellserver scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1028,1f0f@0/sd@0,0 (sd13):
Nov 11 00:19:34 dellserver SCSI transport failed: reason 'reset': retrying command
Nov 11 00:20:44 dellserver scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1028,1f0f@0 (mpt0):
Nov 11 00:20:44 dellserver Disconnected command timeout for Target 0

-- This message posted from opensolaris.org
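[Editor's sketch] The hex values in the mpt messages above (ioc_status=0x8048, scsi_state=0xc, Log info 0x31140000) are bit fields, and splitting them apart can make the log easier to read. The snippet below is a minimal sketch, not the driver's own decoding: the constant values and their descriptions are recalled from the LSI MPI headers (mpi.h, mpi_log_sas.h) and are assumptions that should be checked against those headers, and the function names are made up for illustration.

```python
# Assumed values, recalled from the LSI MPI headers; verify before relying on them.
IOCSTATUS_FLAG_LOG_INFO_AVAILABLE = 0x8000  # high bit: a "Log info" word accompanies the reply
IOCSTATUS_MASK = 0x7FFF                     # remaining bits: the status code itself
IOCSTATUS_NAMES = {0x0048: "SCSI task terminated"}

SCSI_STATE_BITS = {
    0x01: "autosense valid",
    0x02: "autosense failed",
    0x04: "no SCSI status",
    0x08: "terminated",
    0x10: "response info valid",
    0x20: "queue tag rejected",
}


def decode_ioc_status(value):
    """Split ioc_status into its log-info flag and status code."""
    code = value & IOCSTATUS_MASK
    return {
        "log_info_available": bool(value & IOCSTATUS_FLAG_LOG_INFO_AVAILABLE),
        "code": code,
        "name": IOCSTATUS_NAMES.get(code, "unknown"),
    }


def decode_scsi_state(value):
    """List the scsi_state flags that are set, lowest bit first."""
    return [name for bit, name in sorted(SCSI_STATE_BITS.items()) if value & bit]


def decode_log_info(value):
    """Split a 32-bit log info word into its four sub-fields."""
    return {
        "bus_type": (value >> 28) & 0xF,    # 3 is assumed to mean SAS
        "originator": (value >> 24) & 0xF,  # assumed: 0 = IOP, 1 = PL, 2 = IR
        "code": (value >> 16) & 0xFF,
        "sub_code": value & 0xFFFF,
    }


if __name__ == "__main__":
    print(decode_ioc_status(0x8048))
    print(decode_scsi_state(0xC))
    print(decode_log_info(0x31140000))
```

Read this way, the messages say that the controller terminated the command without receiving a SCSI status from the target. That fits the "Disconnected command timeout" warnings but does not by itself show whether the drive, the cabling, or the driver is at fault.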
Markus Kovero
2009-Nov-11 16:30 UTC
[zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
Hi, you could try the LSI itmpt driver as well; it seems to handle this better, although I think it only supports 8 devices at once or so.

You could also try a more recent build of OpenSolaris (123 or even 126), as there seem to be a lot of fixes for the mpt driver (which still seems to have issues).

Yours
Markus Kovero
Maurice Volaski
2009-Nov-11 18:23 UTC
[zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
I've experienced behavior similar to this several times. Each time it was a single bad drive, which in this case looks like target 0. For whatever reason (buggy Solaris/mpt driver), some of the other drives get wind of it, then hide from their respective buses in fear. :-)

> Operating System: SunOS dellserver 5.10 Generic_141445-09 i86pc i386 i86pc
>
> Nov 11 00:18:22 dellserver scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1028,1f0f@0 (mpt0):
> Nov 11 00:18:22 dellserver Disconnected command timeout for Target 0
> Nov 11 00:18:22 dellserver scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1028,1f0f@0/sd@0,0 (sd13):
> Nov 11 00:18:22 dellserver Error for Command: read(10) Error Level: Retryable
> Nov 11 00:18:22 dellserver scsi: [ID 107833 kern.notice] Requested Block: 276886498 Error Block: 276886498
> Nov 11 00:18:22 dellserver scsi: [ID 107833 kern.notice] Vendor: Dell Serial Number: Dell Interna
> Nov 11 00:18:22 dellserver scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
> Nov 11 00:18:22 dellserver scsi: [ID 107833 kern.notice] ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
> Nov 11 00:19:33 dellserver scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1028,1f0f@0 (mpt0):
> Nov 11 00:19:33 dellserver Disconnected command timeout for Target 0
> Nov 11 00:19:34 dellserver scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1028,1f0f@0/sd@0,0 (sd13):
> Nov 11 00:19:34 dellserver SCSI transport failed: reason 'reset': retrying command
> Nov 11 00:20:44 dellserver scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1028,1f0f@0 (mpt0):
> Nov 11 00:20:44 dellserver Disconnected command timeout for Target 0

--
Maurice Volaski, maurice.volaski at einstein.yu.edu
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University
M P
2009-Nov-11 19:05 UTC
[zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
I already changed some of the drives, and it made no difference. The affected target appears to be random, so the problem is most likely not coming from the drives.
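[Editor's sketch] One way to test the "single bad drive" theory against the "random target" observation is to tally the timeout warnings per target over a longer stretch of /var/adm/messages. The snippet below is a minimal sketch under that assumption; the function name is hypothetical, and the sample text is taken from the sunserver excerpt earlier in the thread.

```python
import re
from collections import Counter

# Matches the mpt warning text exactly as it appears in the posted logs.
TIMEOUT_RE = re.compile(r"Disconnected command timeout for Target (\d+)")


def timeouts_per_target(log_text):
    """Count 'Disconnected command timeout' warnings for each target number."""
    return Counter(int(target) for target in TIMEOUT_RE.findall(log_text))


if __name__ == "__main__":
    sample = (
        "Nov 10 16:31:43 sunserver Disconnected command timeout for Target 17\n"
        "Nov 10 16:32:55 sunserver Disconnected command timeout for Target 19\n"
        "Nov 10 16:34:16 sunserver Disconnected command timeout for Target 21\n"
    )
    for target, count in sorted(timeouts_per_target(sample).items()):
        print(f"target {target}: {count} timeout(s)")
```

If one target dominates the tally, a single failing drive becomes more plausible; an even spread across targets, as in the sunserver excerpt, points away from any individual disk.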
Markus Kovero
2009-Nov-11 19:08 UTC
[zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
Have you tried another SAS cable?

Yours
Markus Kovero
Travis Tabbal
2009-Nov-12 05:09 UTC
[zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
> Hi, you could try the LSI itmpt driver as well; it seems to handle this better, although I think it only supports 8 devices at once or so.
>
> You could also try a more recent build of OpenSolaris (123 or even 126), as there seem to be a lot of fixes for the mpt driver (which still seems to have issues).

I won't speak for the OP, but I've been seeing this same behaviour on build 126 with LSI 1068E-based cards (Supermicro USAS-L8i).

For the LSI driver, how does one install it? I'm new to OpenSolaris and don't want to mess it up. It looked to be very old; is Solaris backward compatibility that good?

It would be really nice if Sun would at least acknowledge the bug and say whether they can or can't reproduce it. I'm happy to supply information and test things if it will help. I have some spare disks I can attach to one of these cards to test driver updates and such. It sounds like people with Sun hardware are experiencing this as well.
Travis Tabbal
2009-Nov-12 05:12 UTC
[zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
> Have you tried another SAS-cable?

I have. 2 identical SAS cards, different cables, different disks (brand, size, etc.). I get the errors on random disks in the pool. I don't think it's hardware related, as there have been a few reports of this issue already.
James C. McPherson
2009-Nov-12 05:25 UTC
[zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
Travis Tabbal wrote:
>> Hi, you could try the LSI itmpt driver as well; it seems to handle this better, although I think it only supports 8 devices at once or so.
>>
>> You could also try a more recent build of OpenSolaris (123 or even 126), as there seem to be a lot of fixes for the mpt driver (which still seems to have issues).
>
> I won't speak for the OP, but I've been seeing this same behaviour on 126 with LSI 1068E based cards (Supermicro USAS-L8i). For the LSI driver, how does one install it? I'm new to OpenSolaris and don't want to mess it up. It looked to be very old, is Solaris backward compatibility that good?

I don't know whether itmpt has been updated to cope with OpenSolaris. Use it at your own risk.

> It would be really nice if Sun would at least acknowledge the bug and that they can/can't reproduce it. I'm happy to supply information and test things if it will help. I have some spare disks I can attach to one of these cards and test driver updates and such. It sounds like people with Sun hardware are experiencing this as well.

The first step towards "acknowledging" that there is a problem is you logging a bug in bugs.opensolaris.org. If you don't, we don't know that there might be a problem outside of the ones that we identify.

James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp http://www.jmcp.homeunix.com/blog
Travis Tabbal
2009-Nov-12 05:54 UTC
[zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
On Wed, Nov 11, 2009 at 10:25 PM, James C. McPherson <jmcp at opensolaris.org> wrote:

> The first step towards "acknowledging" that there is a problem is you logging a bug in bugs.opensolaris.org. If you don't, we don't know that there might be a problem outside of the ones that we identify.

I apologize if I offended by not knowing the protocol. I thought that the forums were watched and the bug tracker updated by people at Sun. I didn't think normal users had access to submit bugs. Thank you for the reply. I have submitted a bug on the issue with all the information I think might be useful. If someone at Sun would like more information, output from commands, or testing, I would be happy to help.

I was not provided with a bug number by the system. I assume that those are given out if the bug is deemed worthy of further consideration.
James C. McPherson
2009-Nov-12 06:42 UTC
[zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
Travis Tabbal wrote:
> On Wed, Nov 11, 2009 at 10:25 PM, James C. McPherson <jmcp at opensolaris.org> wrote:
>
>> The first step towards "acknowledging" that there is a problem is you logging a bug in bugs.opensolaris.org. If you don't, we don't know that there might be a problem outside of the ones that we identify.
>
> I apologize if I offended by not knowing the protocol. I thought that posting in the forums was watched and the bug tracker updated by people at Sun. I didn't think normal users had access to submit bugs. Thank you for the reply. I have submitted a bug on the issue with all the information I think might be useful. If someone at Sun would like more information, output from commands, or testing, I would be happy to help.

Hi Travis, no, you didn't offend at all. There's a chunk of doco on the hub.opensolaris.org site which talks about bugs, and there's a link to both "bugster" and bugzilla. "Bugster" is the internal tool, which you can view via bugs.opensolaris.org. The bugzilla instance is read/write for anybody who has an account on opensolaris.org. Most of the kernel groups are not yet looking at the bugzilla instance, so it's better to use bugster at this point in time.

> I was not provided with a bug number by the system. I assume that those are given out if the bug is deemed worthy of further consideration.

That's an invalid assumption. As it happens, the bugs.o.o interface does not always provide you with a bug id; we have to wait for the new entry to show up in our internal triage queue and then reassign it to where it really should go.

I haven't seen your bug turn up yet, so I can't help more at this point. No doubt a copy of it will turn up in my inbox after I've gone to sleep. Whoever picks up your bug should contact you directly to get copies of the other data you mention.

James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp http://www.jmcp.homeunix.com/blog
Peter Eriksson
2009-Nov-12 14:55 UTC
[zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
Have you tried wrapping your disks inside LVM metadevices and then using those for your ZFS pool?
Peter Eriksson
2009-Nov-12 14:59 UTC
[zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
What type of disks are you using?
M P
2009-Nov-12 16:23 UTC
[zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
I was just looking to see if it is a known problem before I submit it as a bug. What would be the best category to submit the bug under? I am not sure if it is a driver or a kernel issue. I would be more than glad to help. One of the machines is a test environment, and I can run any dumps or debug versions you want.

The issue is reproducible on both servers (Sun and Dell) and with different SAS JBOD storage. Each system has a raidz2 pool made from 11 large SATA disks (1.5TB Seagate). The pool is 60% or so full. The easiest way to reproduce it is to run the bacula client to back up the whole pool overnight. After a couple of hours the issue will manifest. The machine just prints these messages and does not respond to any connections, or even to the keyboard.

I was also looking into one other machine that we have, a relatively old custom-built machine with 11 1TB (Western Digital) disks connected to an 8-port SATA controller (+3 from the motherboard). I noticed that there are similar messages for the disks there. The machine doesn't lock up, it just prints the messages when under heavy load (backup), see below:

===========================
Operating System: Solaris 10 8/07 s10x_u4wos_12b X86
Adapter: 8 port SATA: http://www.supermicro.com/products/accessories/addon/AOC-SAT2-MV8.cfm

Oct 21 17:47:22 mirror sata: [ID 801593 kern.notice] NOTICE: /pci@0,0/pci10de,26f@10/pci11ab,11ab@6:
Oct 21 17:47:22 mirror port 1: link lost
Oct 21 17:47:22 mirror sata: [ID 801593 kern.notice] NOTICE: /pci@0,0/pci10de,26f@10/pci11ab,11ab@6:
Oct 21 17:47:22 mirror port 1: link established
Oct 21 17:47:22 mirror marvell88sx: [ID 812950 kern.warning] WARNING: marvell88sx0: error on port 1:
Oct 21 17:47:22 mirror marvell88sx: [ID 517869 kern.info] device disconnected
Oct 21 17:47:22 mirror marvell88sx: [ID 517869 kern.info] device connected
Oct 21 17:47:22 mirror scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci10de,26f@10/pci11ab,11ab@6/disk@1,0 (sd2):
Oct 21 17:47:22 mirror Error for Command: read(10) Error Level: Retryable
Oct 21 17:47:22 mirror scsi: [ID 107833 kern.notice] Requested Block: 178328863 Error Block: 178328863
Oct 21 17:47:22 mirror scsi: [ID 107833 kern.notice] Vendor: ATA Serial Number:
Oct 21 17:47:22 mirror scsi: [ID 107833 kern.notice] Sense Key: No Additional Sense
Oct 21 17:47:22 mirror scsi: [ID 107833 kern.notice] ASC: 0x0 (no additional sense info), ASCQ: 0x0, FRU: 0x0
Oct 21 17:58:51 mirror sata: [ID 801593 kern.notice] NOTICE: /pci@0,0/pci10de,26f@10/pci11ab,11ab@6:
Oct 21 17:58:51 mirror port 0: device reset
Oct 21 17:58:51 mirror sata: [ID 801593 kern.notice] NOTICE: /pci@0,0/pci10de,26f@10/pci11ab,11ab@6:
Oct 21 17:58:51 mirror port 0: device reset
Oct 21 17:58:51 mirror sata: [ID 801593 kern.notice] NOTICE: /pci@0,0/pci10de,26f@10/pci11ab,11ab@6:
Oct 21 17:58:51 mirror port 0: link lost
Oct 21 17:58:51 mirror sata: [ID 801593 kern.notice] NOTICE: /pci@0,0/pci10de,26f@10/pci11ab,11ab@6:
Oct 21 17:58:51 mirror port 0: link established
Oct 21 17:58:51 mirror marvell88sx: [ID 812950 kern.warning] WARNING: marvell88sx0: error on port 0:
Oct 21 17:58:51 mirror marvell88sx: [ID 517869 kern.info] device disconnected
Oct 21 17:58:51 mirror marvell88sx: [ID 517869 kern.info] device connected
Oct 21 17:58:51 mirror scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci10de,26f@10/pci11ab,11ab@6/disk@0,0 (sd1):
Oct 21 17:58:51 mirror Error for Command: read(10) Error Level: Retryable
Oct 21 17:58:51 mirror scsi: [ID 107833 kern.notice] Requested Block: 929071121 Error Block: 929071121
Oct 21 17:58:51 mirror scsi: [ID 107833 kern.notice] Vendor: ATA Serial Number:
Oct 21 17:58:51 mirror scsi: [ID 107833 kern.notice] Sense Key: No Additional Sense
Oct 21 17:58:51 mirror scsi: [ID 107833 kern.notice] ASC: 0x0 (no additional sense info), ASCQ: 0x0, FRU: 0x0
Oct 21 18:02:10 mirror sata: [ID 801593 kern.notice] NOTICE: /pci@0,0/pci10de,26f@10/pci11ab,11ab@6:
Oct 21 18:02:10 mirror port 4: device reset
Oct 21 18:02:10 mirror sata: [ID 801593 kern.notice] NOTICE: /pci@0,0/pci10de,26f@10/pci11ab,11ab@6:
Oct 21 18:02:10 mirror port 4: device reset
Oct 21 18:02:10 mirror sata: [ID 801593 kern.notice] NOTICE: /pci@0,0/pci10de,26f@10/pci11ab,11ab@6:
Oct 29 00:03:24 mirror sata: [ID 801593 kern.notice] NOTICE: /pci@0,0/pci10de,26f@10/pci11ab,11ab@6:
Oct 29 00:03:24 mirror port 5: device reset
Oct 29 00:03:24 mirror sata: [ID 801593 kern.notice] NOTICE: /pci@0,0/pci10de,26f@10/pci11ab,11ab@6:
Oct 29 00:03:24 mirror port 5: device reset
Oct 29 00:03:24 mirror sata: [ID 801593 kern.notice] NOTICE: /pci@0,0/pci10de,26f@10/pci11ab,11ab@6:
Oct 29 00:03:24 mirror port 5: link lost
Oct 29 00:03:24 mirror sata: [ID 801593 kern.notice] NOTICE: /pci@0,0/pci10de,26f@10/pci11ab,11ab@6:
Oct 29 00:03:24 mirror port 5: link established
Oct 29 00:03:24 mirror marvell88sx: [ID 812950 kern.warning] WARNING: marvell88sx0: error on port 5:
Oct 29 00:03:24 mirror marvell88sx: [ID 517869 kern.info] device disconnected
Oct 29 00:03:24 mirror marvell88sx: [ID 517869 kern.info] device connected
Oct 29 00:03:24 mirror scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci10de,26f@10/pci11ab,11ab@6/disk@5,0 (sd6):
Oct 29 00:03:24 mirror Error for Command: write(10) Error Level: Retryable
Oct 29 00:03:24 mirror scsi: [ID 107833 kern.notice] Requested Block: 1513181930 Error Block: 1513181930
Oct 29 00:03:24 mirror scsi: [ID 107833 kern.notice] Vendor: ATA Serial Number:
Oct 29 00:03:24 mirror scsi: [ID 107833 kern.notice] Sense Key: No Additional Sense
Oct 29 00:03:24 mirror scsi: [ID 107833 kern.notice] ASC: 0x0 (no additional sense info), ASCQ: 0x0, FRU: 0x0
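[Editor's sketch] The SATA messages above follow a repeating pattern per port (device reset, link lost, link established). Summarising them by port can show whether the resets follow one disk or move around the controller. This is a minimal sketch; the function name is hypothetical, and the regular expression only covers the three event strings that appear in the excerpt above, so other driver messages would be ignored.

```python
import re
from collections import Counter, defaultdict

# Only the three event strings seen in the posted excerpt are recognised.
PORT_EVENT_RE = re.compile(r"port (\d+): (link lost|link established|device reset)")


def sata_events_per_port(log_text):
    """Return {port: Counter({event: count})} for the sata port notices."""
    counts = defaultdict(Counter)
    for port, event in PORT_EVENT_RE.findall(log_text):
        counts[int(port)][event] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "Oct 21 17:47:22 mirror port 1: link lost\n"
        "Oct 21 17:47:22 mirror port 1: link established\n"
        "Oct 21 17:58:51 mirror port 0: device reset\n"
        "Oct 21 17:58:51 mirror port 0: device reset\n"
        "Oct 21 17:58:51 mirror port 0: link lost\n"
        "Oct 21 17:58:51 mirror port 0: link established\n"
    )
    for port, events in sorted(sata_events_per_port(sample).items()):
        print(f"port {port}: {dict(events)}")
```

In the excerpt, ports 0, 1, 4 and 5 all report events, which suggests the problem on this machine is not confined to a single disk either.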
Travis Tabbal
2009-Nov-12 18:27 UTC
[zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
I submitted a bug on this issue. It looks like you can reference other bugs when you submit one, so everyone having this issue could possibly link mine and submit their own hardware config. It sounds like it's widespread though, so I'm not sure if that would help or hinder. I'd hate to bury the developers/QA team under a mountain of duplicate requests.

CR 6900767
Travis Tabbal
2009-Nov-12 18:30 UTC
[zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
> What type of disks are you using?

I'm using SATA disks with SAS-SATA breakout cables. I've tried different cables, as I have a couple of spares.

mpt0 has 4x1.5TB Samsung "Green" drives.
mpt1 has 4x400GB Seagate 7200 RPM drives.

I get errors from both adapters. Each adapter has an unused SAS channel available. If I can get this fixed, I'm planning to populate those as well.
Travis Tabbal
2009-Nov-12 18:33 UTC
[zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
> Have you tried wrapping your disks inside LVM metadevices and then used those for your ZFS pool?

I have not tried that. I could try it with my spare disks, I suppose. I avoided LVM as it didn't seem to offer me anything ZFS/ZPOOL didn't.