Hello
I have a stability issue with my new Intel SASUC8I [1] PCIe controller. It is
an LSI 1068E-based controller. After a few minutes of disk I/O (csup or a scrub,
for example), my FreeBSD 8-STABLE (64-bit) system "freezes" for a couple of
minutes and I see a lot of error messages like:
Sep 17 03:10:03 gw kernel: mpt0: request 0xffffff80002bc3b0:48367 timed out for ccb 0xffffff00050a8000 (req->ccb 0xffffff00050a8000)
Sep 17 03:10:03 gw kernel: mpt0: request 0xffffff80002bbb40:48368 timed out for ccb 0xffffff0004f81800 (req->ccb 0xffffff0004f81800)
Sep 17 03:10:03 gw kernel: mpt0: completing timedout/aborted req 0xffffff80002bc3b0:48367
Sep 17 03:10:03 gw kernel: mpt0: completing timedout/aborted req 0xffffff80002bbb40:48368
Sep 17 03:10:03 gw kernel: mpt0: Timedout requests already complete. Interrupts may not be functioning
After the first timeout message, commands take minutes to execute. The whole
OS runs on a single raidz ZFS pool. I use 4x 2TB SpinPoint F4 EcoGreen
HD204UI disks with the fixed firmware [2]. The disks are connected with brand-new
Adaptec cables. I already replaced one cable to rule out a cable issue. The
smartctl output looks OK. The system works fine if I connect all 4 disks to the
onboard controller.
Is this a known issue with this controller and the mpt driver? Is there a
workaround? I hope someone can help me.
Regards,
Thomas
FreeBSD Information:
8.2-STABLE FreeBSD 8.2-STABLE #3: Sun Jul  3 19:39:11 UTC 2011 root@gw.lan:/usr/obj/usr/src/sys/GENERIC  amd64
Motherboard:
smbios.planar.maker="ASUSTeK Computer INC."
smbios.planar.product="P5N73-AM"
Controller:
mptutil show adapter
mpt0 Adapter:
       Board Name: SASUC8I
   Board Assembly: L3-25071-00C
        Chip Name: C1068E
    Chip Revision: UNUSED
      RAID Levels: RAID0, RAID1, RAID1E
    RAID0 Stripes: 64k
   RAID1E Stripes: 64k
 RAID0 Drives/Vol: 2-10
 RAID1 Drives/Vol: 2
dmesg about the controller:
mpt0: <LSILogic SAS/SATA Adapter> port 0xee00-0xeeff mem 0xef7fc000-0xef7fffff,0xef7e0000-0xef7effff irq 16 at device 0.0 on pci2
mpt0: [ITHREAD]
mpt0: MPI Version=1.5.20.0
mpt0: Capabilities: ( RAID-0 RAID-1E RAID-1 )
mpt0: 0 Active Volumes (2 Max)
mpt0: 0 Hidden Drive Members (14 Max)
zpool status:
  pool: tank
 state: ONLINE
 scan: scrub canceled on Sun Sep 18 10:44:46 2011
config:
	NAME                                            STATE     READ WRITE CKSUM
	tank                                            ONLINE       0     0     0
	  raidz1-0                                      ONLINE       0     0     0
	    gptid/7cd20811-2af6-11e0-9271-e5dbd4b6b481  ONLINE       0     0     0
	    gptid/7d7af0fc-2af6-11e0-b0b1-85b4c14d926b  ONLINE       0     0     0
	    gptid/7e24b41a-2af6-11e0-944f-fb3dae8bad6a  ONLINE       0     0     0
	    gptid/dac06b61-972b-11e0-affc-1c6f65565b30  ONLINE       0     0     0
/boot/loader.conf:
zfs_load="YES"
vfs.root.mountfrom="zfs:tank/root"
aio_load="yes"
ahci_load="YES"
ataahci_load="YES"
ipfw_load="YES"
vm.kmem_size="6G"
vfs.zfs.arc_max="4G"
vfs.zfs.prefetch_disable="1"
vfs.zfs.txg.timeout="5"
rc.conf:
ipv6_firewall_enable="YES"
ipv6_firewall_type="OPEN"
sendmail_enable="NO"
sshd_enable="YES"
zfs_enable="YES"
firewall_enable="YES"
firewall_type="OPEN"
dmesg:
http://pastebin.com/sL15g1vt
smartctl:
http://pastebin.com/wYU2DXJ4
[1] http://www.intel.com/content/www/us/en/servers/raid/raid-controller-sasuc8i.html
[2] http://www.samsung.com/global/business/hdd/faqView.do?b2b_bbs_msg_id=386
Hello
On Sep 19, 2011, at 2:45 PM, Thomas Vogt wrote:
> I've stability issue with my new intel SASUC8I [1] PCIe controller. It's
> a LSI 1068e based controller. After a few minutes with disk io (csup or scrub by
> example) my FreeBSD 8-stable (64bit) is "freezing" for a couple of
> minutes and I see a lot of error messages like:
>
> dmesg:
> http://pastebin.com/sL15g1vt
>
> smartctl:
> http://pastebin.com/wYU2DXJ4
Sorry: http://pastebin.com/0Na23R54
Regards,
Tom
Hi Marius
On 19.09.2011, at 21:06, Marius Strobl wrote:
> On Mon, Sep 19, 2011 at 02:45:04PM +0200, Thomas Vogt wrote:
>> Hello
>>
>> I've stability issue with my new intel SASUC8I [1] PCIe controller. It's
>> a LSI 1068e based controller. After a few minutes with disk io (csup or scrub by
>> example) my FreeBSD 8-stable (64bit) is "freezing" for a couple of
>> minutes and I see a lot of error messages like:
>>
>> Sep 17 03:10:03 gw kernel: mpt0: request 0xffffff80002bc3b0:48367 timed out for ccb 0xffffff00050a8000 (req->ccb 0xffffff00050a8000)
>> Sep 17 03:10:03 gw kernel: mpt0: request 0xffffff80002bbb40:48368 timed out for ccb 0xffffff0004f81800 (req->ccb 0xffffff0004f81800)
>> Sep 17 03:10:03 gw kernel: mpt0: completing timedout/aborted req 0xffffff80002bc3b0:48367
>> Sep 17 03:10:03 gw kernel: mpt0: completing timedout/aborted req 0xffffff80002bbb40:48368
>> Sep 17 03:10:03 gw kernel: mpt0: Timedout requests already complete. Interrupts may not be functioning
>>
>
> If this really is an issue with interrupts not getting delivered you
> could try whether disabling MSI/MSI-X by setting hw.pci.enable_msi=0
> and hw.pci.enable_msix=0 either on the loader prompt or via loader.conf
> works around it.
I already tried this. It didn't change anything. Same timeouts. I also tried
vfs.zfs.vdev.min_pending="1" and vfs.zfs.vdev.max_pending="1" in loader.conf
without any positive change.
Regards,
Tom
On Mon, Sep 19, 2011 at 02:45:04PM +0200, Thomas Vogt wrote:
> Hello
>
> I've stability issue with my new intel SASUC8I [1] PCIe controller. It's
> a LSI 1068e based controller. After a few minutes with disk io (csup or scrub by
> example) my FreeBSD 8-stable (64bit) is "freezing" for a couple of
> minutes and I see a lot of error messages like:
>
> Sep 17 03:10:03 gw kernel: mpt0: request 0xffffff80002bc3b0:48367 timed out for ccb 0xffffff00050a8000 (req->ccb 0xffffff00050a8000)
> Sep 17 03:10:03 gw kernel: mpt0: request 0xffffff80002bbb40:48368 timed out for ccb 0xffffff0004f81800 (req->ccb 0xffffff0004f81800)
> Sep 17 03:10:03 gw kernel: mpt0: completing timedout/aborted req 0xffffff80002bc3b0:48367
> Sep 17 03:10:03 gw kernel: mpt0: completing timedout/aborted req 0xffffff80002bbb40:48368
> Sep 17 03:10:03 gw kernel: mpt0: Timedout requests already complete. Interrupts may not be functioning
>
If this really is an issue with interrupts not getting delivered you
could try whether disabling MSI/MSI-X by setting hw.pci.enable_msi=0
and hw.pci.enable_msix=0 either on the loader prompt or via loader.conf
works around it.
Marius
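For reference, the suggestion above could be written as the following loader.conf fragment. This is only a sketch using the two tunables named in the message. The comments and the file path header are added here for illustration, and as reported elsewhere in this thread, these settings did not make the timeouts go away on this system.

```
# /boot/loader.conf
# Disable MSI and MSI-X so PCI devices fall back to legacy (INTx) interrupts.
hw.pci.enable_msi="0"
hw.pci.enable_msix="0"
```

For a one-off test, the same values can instead be entered at the loader prompt (set hw.pci.enable_msi=0 and set hw.pci.enable_msix=0) before booting, which leaves loader.conf untouched.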