Maurice Volaski
2010-Feb-07 01:52 UTC
[zfs-discuss] Workaround for mpt timeouts in snv_127
>For those who've been suffering this problem and who have non-Sun
>jbods, could you please let me know what model of jbod and cables
>(including length thereof) you have in your configuration.
>
>For those of you who have been running xVM without MSI support,
>could you please confirm whether the devices exhibiting the problem
>are internal to your host, or connected via jbod. And if via jbod,
>please confirm the model number and cables.

I have a SuperMicro X8DTN motherboard and an LSI SAS3081E-R, which is running firmware 1.28.02.00-IT. I have 24 drives attached to the backplane of the system with a single mini-SAS cable probably not even 18 inches long. All the drives are WD RE4-GP.

OpenSolaris, snv_130, is running on VMware, but I am using PCI passthrough for the LSI card.

I have

    set mpt:mpt_enable_msi=0
    set mptsas:mptsas_enable_msi=0

in /etc/system and have rebooted.

Here is an excerpt of messages during a scrub:

Feb 6 19:13:24 thecratedoor scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3140@0 (mpt1):
Feb 6 19:13:24 thecratedoor     Log info 0x31123000 received for target 22.
Feb 6 19:13:24 thecratedoor     scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
Feb 6 19:13:29 thecratedoor scsi: [ID 243001 kern.warning] WARNING: /pci@0,0/pci15ad,7a0@15/pci1000,3140@0 (mpt1):
Feb 6 19:13:29 thecratedoor     mpt_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31123000
Feb 6 19:13:29 thecratedoor scsi: [ID 243001 kern.warning] WARNING: /pci@0,0/pci15ad,7a0@15/pci1000,3140@0 (mpt1):
Feb 6 19:13:29 thecratedoor     mpt_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31123000
Feb 6 19:13:29 thecratedoor scsi: [ID 243001 kern.warning] WARNING: /pci@0,0/pci15ad,7a0@15/pci1000,3140@0 (mpt1):
Feb 6 19:13:29 thecratedoor     mpt_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31123000
Feb 6 19:13:29 thecratedoor scsi: [ID 243001 kern.warning] WARNING: /pci@0,0/pci15ad,7a0@15/pci1000,3140@0 (mpt1):
Feb 6 19:13:29 thecratedoor     mpt_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31123000
Feb 6 19:13:30 thecratedoor scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3140@0 (mpt1):
Feb 6 19:13:30 thecratedoor     Log info 0x31123000 received for target 20.
Feb 6 19:13:30 thecratedoor     scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc

 scrub: scrub in progress for 3h28m, 50.53% done, 3h23m to go
config:

        NAME           STATE     READ WRITE CKSUM
        cratepooldoor  ONLINE       0     0     0
          raidz1-0     ONLINE       0     0     0
            c4t1d0     ONLINE       0     0     0
            c4t2d0     ONLINE       0     0     0
            c4t3d0     ONLINE       0     0     0
            c4t4d0     ONLINE       0     0     0  18K repaired
            c4t5d0     ONLINE       0     0     0
            c4t6d0     ONLINE       0     0     0  512 repaired
            c4t7d0     ONLINE       0     0     0
            c4t8d0     ONLINE       0     0     0
          raidz1-1     ONLINE       0     0     0
            c4t9d0     ONLINE       0     0     0
            c4t10d0    ONLINE       0     0     0
            c4t11d0    ONLINE       0     0     0
            c4t12d0    ONLINE       0     0     0  128K repaired
            c4t13d0    ONLINE       0     0     0
            c4t14d0    ONLINE       0     0     0
            c4t15d0    ONLINE       0     0     0
            c4t16d0    ONLINE       0     0     0
          raidz1-2     ONLINE       0     0     0
            c4t17d0    ONLINE       0     0     0  18.5K repaired
            c4t18d0    ONLINE       0     0     0  18.5K repaired
            c4t19d0    ONLINE       0     0     0
            c4t20d0    ONLINE       0     0     0  1K repaired
            c4t21d0    ONLINE       0     0     0
            c4t22d0    ONLINE       0     0     0  256K repaired
            c4t23d0    ONLINE       0     0     0
            c4t24d0    ONLINE       0     0     0  109K repaired

I have another identical system and it's behaving the same way.

-- 
Maurice Volaski, maurice.volaski at einstein.yu.edu
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University
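Editorial note: when collecting reports like the one above, it can help to tally which targets the "Log info" messages name, since a handful of misbehaving drives would show up as a few targets dominating the count. The following is only a minimal sketch in Python; the function name `tally_targets` and the sample text are illustrative assumptions, and the regular expression matches nothing more than the message format quoted in this thread.

```python
import re
from collections import Counter

# Matches the Solaris mpt/mpt_sas message format seen in this thread, e.g.
#   "Log info 0x31123000 received for target 22."
TARGET_RE = re.compile(r"Log info (0x[0-9a-fA-F]+) received for target (\d+)")


def tally_targets(log_text):
    """Count how often each (target, loginfo) pair appears in the log text."""
    counts = Counter()
    for loginfo, target in TARGET_RE.findall(log_text):
        counts[(int(target), loginfo.lower())] += 1
    return counts


# Two lines taken from the excerpt in the message above.
sample = """\
Feb 6 19:13:24 thecratedoor Log info 0x31123000 received for target 22.
Feb 6 19:13:30 thecratedoor Log info 0x31123000 received for target 20.
"""

if __name__ == "__main__":
    for (target, loginfo), n in sorted(tally_targets(sample).items()):
        print(f"target {target}: {n} x {loginfo}")
```

In practice one would feed it the contents of the system log (on Solaris typically /var/adm/messages) instead of the two-line sample.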
Maurice Volaski
2010-Feb-10 16:20 UTC
[zfs-discuss] Workaround for mpt timeouts in snv_127
>>For those who've been suffering this problem and who have non-Sun
>>jbods, could you please let me know what model of jbod and cables
>>(including length thereof) you have in your configuration.
>>
>>For those of you who have been running xVM without MSI support,
>>could you please confirm whether the devices exhibiting the problem
>>are internal to your host, or connected via jbod. And if via jbod,
>>please confirm the model number and cables.
>
>I have a SuperMicro X8DTN motherboard and an LSI SAS3081E-R, which is
>firmware 1.28.02.00-IT. I have 24 drives attached to the backplane
>of the system with a single mini-SAS cable probably not even 18
>inches long. All the drives are WD RE4-GP.
>
>OpenSolaris, snv_130, is running on VMware, but I am using PCI
>passthrough for the LSI card.

I tried the card and drives under Linux. Interestingly, it appears to be causing similar messages. On top of that, I upgraded the BIOS/firmware on the card to 1.29 (phase 17) and it made no difference. Under OpenSolaris, bits were flipping all over the place. It's not clear whether that is happening here as well; Linux simply has no way to tell. Either way, the card's not behaving under Linux either.

Feb 10 09:01:17 [kernel] mptbase: ioc1: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
Feb 10 09:01:17 [kernel] mptbase: ioc1: LogInfo(0x31112000): Originator={PL}, Code={Reset}, SubCode(0x2000)
Feb 10 09:01:18 [kernel] mptbase: ioc1: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
                - Last output repeated 10 times -
Feb 10 09:03:54 [kernel] mptbase: ioc1: LogInfo(0x31112000): Originator={PL}, Code={Reset}, SubCode(0x2000)
Feb 10 09:03:55 [kernel] mptbase: ioc1: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
                - Last output repeated 12 times -
Feb 10 09:05:39 [kernel] mptbase: ioc1: LogInfo(0x31112000): Originator={PL}, Code={Reset}, SubCode(0x2000)
Feb 10 09:05:40 [kernel] mptbase: ioc1: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
                - Last output repeated 25 times -
Feb 10 09:10:01 [cron] (root) CMD (test -x /usr/sbin/run-crons && /usr/sbin/run-crons )
Feb 10 09:11:15 [kernel] mptbase: ioc1: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
Feb 10 09:11:15 [kernel] mptbase: ioc1: LogInfo(0x31112000): Originator={PL}, Code={Reset}, SubCode(0x2000)
Feb 10 09:11:17 [kernel] mptbase: ioc1: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
                - Last output repeated 13 times -
Feb 10 09:13:09 [kernel] mptbase: ioc1: LogInfo(0x31112000): Originator={PL}, Code={Reset}, SubCode(0x2000)
Feb 10 09:13:10 [kernel] mptbase: ioc1: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
                - Last output repeated 11 times -
Feb 10 09:20:01 [cron] (root) CMD (test -x /usr/sbin/run-crons && /usr/sbin/run-crons )
                - Last output repeated 2 times -
Feb 10 09:40:06 [kernel] mptbase: ioc1: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
Feb 10 09:40:06 [kernel] mptbase: ioc1: LogInfo(0x31112000): Originator={PL}, Code={Reset}, SubCode(0x2000)
Feb 10 09:40:07 [kernel] mptbase: ioc1: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
                - Last output repeated 10 times -
Feb 10 09:45:52 [kernel] mptbase: ioc1: LogInfo(0x31112000): Originator={PL}, Code={Reset}, SubCode(0x2000)
Feb 10 09:45:53 [kernel] mptbase: ioc1: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
                - Last output repeated 10 times -
Feb 10 09:50:01 [cron] (root) CMD (test -x /usr/sbin/run-crons && /usr/sbin/run-crons )
Feb 10 09:58:56 [kernel] mptbase: ioc1: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
Feb 10 09:58:56 [kernel] mptbase: ioc1: LogInfo(0x31112000): Originator={PL}, Code={Reset}, SubCode(0x2000)
Feb 10 09:58:57 [kernel] mptbase: ioc1: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
                - Last output repeated 10 times -
Feb 10 10:00:01 [cron] (root) CMD (rm -f /var/spool/cron/lastrun/cron.hourly)
Feb 10 10:00:01 [cron] (root) CMD (test -x /usr/sbin/run-crons && /usr/sbin/run-crons )
Feb 10 10:00:01 [run-crons] (root) CMD (/etc/cron.hourly/hourly_woodchuck)
Feb 10 10:00:02 [sSMTP] Sent mail for root@theradargun (221 2.0.0 Bye) uid=0 username=root outbytes=877
Feb 10 10:10:01 [cron] (root) CMD (test -x /usr/sbin/run-crons && /usr/sbin/run-crons )
                - Last output repeated 3 times -
Feb 10 10:40:15 [kernel] mptbase: ioc1: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
Feb 10 10:40:15 [kernel] mptbase: ioc1: LogInfo(0x31112000): Originator={PL}, Code={Reset}, SubCode(0x2000)
Feb 10 10:40:17 [kernel] mptbase: ioc1: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
                - Last output repeated 11 times -

-- 
Maurice Volaski, maurice.volaski at einstein.yu.edu
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University
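Editorial note: the Linux mptbase lines above split each 32-bit LogInfo word into an originator, a code and a subcode, which makes it possible to read the raw values that the Solaris driver prints (for example 0x31123000) the same way. The sketch below is an assumption-laden illustration in Python, not the driver's code: the bit layout (bus type in the top 4 bits, originator in the next 4, code in the next 8, subcode in the low 16) is assumed from how the Linux driver's output lines up with the hex values, and the lookup tables contain only the values that actually appear in this thread. The names `decode_loginfo` and `describe` are hypothetical.

```python
# Lookup tables limited to values visible in this thread's logs.
ORIGINATORS = {0x1: "PL"}
PL_CODES = {0x11: "Reset", 0x12: "Abort"}


def decode_loginfo(value):
    """Split a 32-bit SAS LogInfo word into its four fields (assumed layout)."""
    return {
        "bus_type": (value >> 28) & 0xF,    # 0x3 in every value in this thread
        "originator": (value >> 24) & 0xF,
        "code": (value >> 16) & 0xFF,
        "subcode": value & 0xFFFF,
    }


def describe(value):
    """Render a LogInfo word in the style of the Linux mptbase messages."""
    f = decode_loginfo(value)
    originator = ORIGINATORS.get(f["originator"], hex(f["originator"]))
    if f["originator"] == 0x1:
        code = PL_CODES.get(f["code"], hex(f["code"]))
    else:
        code = hex(f["code"])
    return (
        f"LogInfo(0x{value:08x}): Originator={{{originator}}}, "
        f"Code={{{code}}}, SubCode(0x{f['subcode']:04x})"
    )


if __name__ == "__main__":
    for word in (0x31123000, 0x31112000):
        print(describe(word))
```

Unknown originators or codes are printed as raw hex rather than guessed, so the sketch stays honest about what the thread does and does not show.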
Maurice Volaski
2010-Feb-18 14:51 UTC
[zfs-discuss] Workaround for mpt timeouts in snv_127
>>For those who've been suffering this problem and who have non-Sun
>>jbods, could you please let me know what model of jbod and cables
>>(including length thereof) you have in your configuration.
>>
>>For those of you who have been running xVM without MSI support,
>>could you please confirm whether the devices exhibiting the problem
>>are internal to your host, or connected via jbod. And if via jbod,
>>please confirm the model number and cables.
>
>I have a SuperMicro X8DTN motherboard and an LSI SAS3081E-R, which is
>firmware 1.28.02.00-IT. I have 24 drives attached to the backplane
>of the system with a single mini-SAS cable probably not even 18
>inches long. All the drives are WD RE4-GP.
>
>OpenSolaris, snv_130, is running on VMware, but I am using PCI
>passthrough for the LSI card.

It turns out that the mpt_sas HBAs are affected the same way:

Feb 17 04:45:10 thecratewall     scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
Feb 17 04:45:10 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
Feb 17 04:45:10 thecratewall     Log info 0x31110630 received for target 13.
Feb 17 04:45:10 thecratewall     scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
Feb 17 04:45:10 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
Feb 17 04:45:10 thecratewall     Log info 0x31110630 received for target 13.
Feb 17 04:45:10 thecratewall     scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
Feb 17 04:45:10 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
Feb 17 04:45:10 thecratewall     Log info 0x31110630 received for target 13.
Feb 17 04:45:10 thecratewall     scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
Feb 17 04:47:57 thecratewall scsi: [ID 243001 kern.warning] WARNING: /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
Feb 17 04:47:57 thecratewall     mptsas_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31110630
Feb 17 04:47:57 thecratewall scsi: [ID 243001 kern.warning] WARNING: /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
Feb 17 04:47:57 thecratewall     mptsas_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31110630
Feb 17 04:47:57 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
Feb 17 04:47:57 thecratewall     Log info 0x31110630 received for target 33.
Feb 17 04:47:57 thecratewall     scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
Feb 17 04:47:57 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
Feb 17 04:47:57 thecratewall     Log info 0x31110630 received for target 33.
Feb 17 04:47:57 thecratewall     scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
Feb 17 04:47:57 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
Feb 17 04:47:57 thecratewall     Log info 0x31110630 received for target 33.
Feb 17 04:47:57 thecratewall     scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
Feb 17 04:47:57 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
Feb 17 04:47:57 thecratewall     Log info 0x31110630 received for target 33.
Feb 17 04:47:57 thecratewall     scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
Feb 17 04:47:57 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
Feb 17 04:47:57 thecratewall     Log info 0x31110630 received for target 33.
Feb 17 04:47:57 thecratewall     scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc

-- 
Maurice Volaski, maurice.volaski at einstein.yu.edu
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University
> >>For those who've been suffering this problem and who have non-Sun
> >>jbods, could you please let me know what model of jbod and cables
> >>(including length thereof) you have in your configuration.

We are seeing the problem on both Sun and non-Sun hardware. On our Sun thumper x4540, we can reproduce it on all 3 devices. Our configuration is large stripes with only 2 vdevs. Doing a simple scrub will show the typical mpt timeout. We are running snv_131.

-- 
This message posted from opensolaris.org
James C. McPherson
2010-Feb-19 01:19 UTC
[zfs-discuss] Workaround for mpt timeouts in snv_127
On 19/02/10 12:51 AM, Maurice Volaski wrote:
>>> For those who've been suffering this problem and who have non-Sun
>>> jbods, could you please let me know what model of jbod and cables
>>> (including length thereof) you have in your configuration.
>>>
>>> For those of you who have been running xVM without MSI support,
>>> could you please confirm whether the devices exhibiting the problem
>>> are internal to your host, or connected via jbod. And if via jbod,
>>> please confirm the model number and cables.
>>
>> I have a SuperMicro X8DTN motherboard and an LSI SAS3081E-R, which is
>> firmware 1.28.02.00-IT. I have 24 drives attached to the backplane of
>> the system with a single mini-SAS cable probably not even 18 inches
>> long. All the drives are WD RE4-GP.
>>
>> OpenSolaris, snv_130, is running on VMware, but I am using PCI
>> passthrough for the LSI card.
>
> It turns out that the mpt_sas HBAs are affected the same way:

Hi Maurice,
this is very interesting to note. I'll pass the info along to
the relevant team (they're in Beijing, so away for another few
days due to Spring Festival).

> Feb 17 04:45:10 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
> Feb 17 04:45:10 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
> Feb 17 04:45:10 thecratewall Log info 0x31110630 received for target 13.
> Feb 17 04:45:10 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
> Feb 17 04:45:10 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
> Feb 17 04:45:10 thecratewall Log info 0x31110630 received for target 13.
> Feb 17 04:45:10 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
> Feb 17 04:45:10 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
> Feb 17 04:45:10 thecratewall Log info 0x31110630 received for target 13.
> Feb 17 04:45:10 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
> Feb 17 04:47:57 thecratewall scsi: [ID 243001 kern.warning] WARNING: /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
> Feb 17 04:47:57 thecratewall mptsas_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31110630
> Feb 17 04:47:57 thecratewall scsi: [ID 243001 kern.warning] WARNING: /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
> Feb 17 04:47:57 thecratewall mptsas_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31110630
> Feb 17 04:47:57 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
> Feb 17 04:47:57 thecratewall Log info 0x31110630 received for target 33.
> Feb 17 04:47:57 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
> Feb 17 04:47:57 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
> Feb 17 04:47:57 thecratewall Log info 0x31110630 received for target 33.
> Feb 17 04:47:57 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
> Feb 17 04:47:57 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
> Feb 17 04:47:57 thecratewall Log info 0x31110630 received for target 33.
> Feb 17 04:47:57 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
> Feb 17 04:47:57 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
> Feb 17 04:47:57 thecratewall Log info 0x31110630 received for target 33.
> Feb 17 04:47:57 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
> Feb 17 04:47:57 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
> Feb 17 04:47:57 thecratewall Log info 0x31110630 received for target 33.
> Feb 17 04:47:57 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc

-- 
James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp       http://www.jmcp.homeunix.com/blog
Hello.

> We are seeing the problem on both Sun and non-Sun hardware. On our Sun thumper x4540, we can reproduce it on all 3 devices. Our configuration is large stripes with only 2 vdevs. Doing a simple scrub will show the typical mpt timeout.
> We are running snv_131.

Has anybody observed similar problems with the new LSI 2008 SAS2 6Gb HBAs?
Maurice Volaski
2010-Apr-10 04:56 UTC
[zfs-discuss] Workaround for mpt timeouts in snv_127
At 11:19 AM +1000 2/19/10, James C. McPherson wrote:
>On 19/02/10 12:51 AM, Maurice Volaski wrote:
>>>>For those who've been suffering this problem and who have non-Sun
>>>>jbods, could you please let me know what model of jbod and cables
>>>>(including length thereof) you have in your configuration.
>>>>
>>>>For those of you who have been running xVM without MSI support,
>>>>could you please confirm whether the devices exhibiting the problem
>>>>are internal to your host, or connected via jbod. And if via jbod,
>>>>please confirm the model number and cables.
>>>
>>>I have a SuperMicro X8DTN motherboard and an LSI SAS3081E-R, which is
>>>firmware 1.28.02.00-IT. I have 24 drives attached to the backplane of
>>>the system with a single mini-SAS cable probably not even 18 inches
>>>long. All the drives are WD RE4-GP.
>>>
>>>OpenSolaris, snv_130, is running on VMware, but I am using PCI
>>>passthrough for the LSI card.
>>
>>It turns out that the mpt_sas HBAs are affected the same way:
>
>Hi Maurice,
>this is very interesting to note. I'll pass the info along to
>the relevant team (they're in Beijing, so away for another few
>days due to Spring Festival).

I have identified the culprit: the Western Digital drive WD2002FYPS-01U1B0. It's not clear if they can fix it in firmware, but Western Digital is replacing my drives.

>>Feb 17 04:45:10 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
>>Feb 17 04:45:10 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
>>Feb 17 04:45:10 thecratewall Log info 0x31110630 received for target 13.
>>Feb 17 04:45:10 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
>>Feb 17 04:45:10 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
>>Feb 17 04:45:10 thecratewall Log info 0x31110630 received for target 13.
>>Feb 17 04:45:10 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
>>Feb 17 04:45:10 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
>>Feb 17 04:45:10 thecratewall Log info 0x31110630 received for target 13.
>>Feb 17 04:45:10 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
>>Feb 17 04:47:57 thecratewall scsi: [ID 243001 kern.warning] WARNING: /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
>>Feb 17 04:47:57 thecratewall mptsas_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31110630
>>Feb 17 04:47:57 thecratewall scsi: [ID 243001 kern.warning] WARNING: /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
>>Feb 17 04:47:57 thecratewall mptsas_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31110630
>>Feb 17 04:47:57 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
>>Feb 17 04:47:57 thecratewall Log info 0x31110630 received for target 33.
>>Feb 17 04:47:57 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
>>Feb 17 04:47:57 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
>>Feb 17 04:47:57 thecratewall Log info 0x31110630 received for target 33.
>>Feb 17 04:47:57 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
>>Feb 17 04:47:57 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
>>Feb 17 04:47:57 thecratewall Log info 0x31110630 received for target 33.
>>Feb 17 04:47:57 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
>>Feb 17 04:47:57 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
>>Feb 17 04:47:57 thecratewall Log info 0x31110630 received for target 33.
>>Feb 17 04:47:57 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
>>Feb 17 04:47:57 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
>>Feb 17 04:47:57 thecratewall Log info 0x31110630 received for target 33.
>>Feb 17 04:47:57 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
>
>--
>James C. McPherson
>--
>Senior Kernel Software Engineer, Solaris
>Sun Microsystems
>http://blogs.sun.com/jmcp       http://www.jmcp.homeunix.com/blog

-- 
Maurice Volaski, maurice.volaski at einstein.yu.edu
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University
...
> I have identified the culprit is the Western Digital drive WD2002FYPS-01U1B0. It's not clear if they can fix it in firmware, but Western Digital is replacing my drives.

> Feb 17 04:45:10 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
> Feb 17 04:45:10 thecratewall scsi: [ID 365881 kern.info] /pci@0,0/pci15ad,7a0@15/pci1000,3010@0 (mpt_sas0):
> Feb 17 04:45:10 thecratewall Log info 0x31110630 received for target 13.
> Feb 17 04:45:10 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc

Hi, do you have disks connected in SATA1/2? With WD2003FYYS-01T8B0/WD20EADS-00S2B0/WD1001FALS-00J7B1/WD1002FBYS-01A6B0, these timeouts are to be expected if the disk is in SATA2 mode. We got rid of these timeouts after forcing the disks into SATA1 mode with jumpers; now they only appear when a disk has real issues and needs to be replaced.

Yours
Markus Kovero
Maurice Volaski
2010-Apr-10 07:42 UTC
[zfs-discuss] Workaround for mpt timeouts in snv_127
>Hi, do you have disks connected in sata1/2? With
>WD2003FYYS-01T8B0/WD20EADS-00S2B0/WD1001FALS-00J7B1/WD1002FBYS-01A6B0
>these timeouts are to be expected if disk is in SATA2 mode,

No, why are they to be expected with SATA2 mode? Is the defect specific to the SATA2 circuitry? I guess it could be a temporary workaround provided they would eventually fix the problem in firmware, but I'm getting new drives, so I guess I can't complain :-)

-- 
Maurice Volaski, maurice.volaski at einstein.yu.edu
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University
> No, why are they to be expected with SATA2 mode? Is the defect
> specific to the SATA2 circuitry? I guess it could be a temporary
> workaround provided they would eventually fix the problem in
> firmware, but I'm getting new drives, so I guess I can't complain :-)

Your new disks will probably do this too. I really don't know what's up with flaky SATA2, but I'm fairly sure forcing SATA1 mode would fix your issues. The performance drop is not even noticeable, so it's worth a try.

Yours
Markus Kovero