Dennis Kögel
2013-Jun-19 13:01 UTC
Weird I/O hangs (9.1R, arcsas, interrupt spikes on uhci0)
Hi,

we periodically see I/O hangs lasting about 10 seconds, roughly once per minute.

Each time this happens, the I/O rate simply drops to zero and all disk access hangs; this is also very noticeable on the shell, for NFS clients, etc. Everything else (networking, kernel, ...) seems to continue normally.

Environment: FreeBSD 9.1R GENERIC on amd64, using ZFS, on an ARC1320 PCIe with 24x Seagate ST33000650SS (3rd-party arcsas.ko driver).

It's easy to observe these hangs under write load, e.g. with 'zpool iostat 1':

    void  22.4T  42.6T     34  2.73K  1.07M   293M
    void  22.4T  42.6T     20  2.74K   623K   289M
    void  22.4T  42.6T    144  2.62K  4.83M   279M
    void  22.4T  42.6T     13  2.60K   437K   283M
    void  22.4T  42.6T      0      0      0      0   <-- hang starts
    void  22.4T  42.6T      0      0      0      0
    void  22.4T  42.6T      0      0      0      0
    void  22.4T  42.6T      0      0      0      0
    void  22.4T  42.6T      0      0      0      0
    void  22.4T  42.6T      0      0      0      0
    void  22.4T  42.6T      0      0      0      0
    void  22.4T  42.6T      0      0      0      0
    void  22.4T  42.6T      0    296  4.00K  34.2M   <-- hang ends
    void  22.4T  42.6T      2  2.64K  73.8K   288M
    void  22.4T  42.6T      8  3.12K   278K   329M

Each time this happens, there is a completely unexplained spike of interrupts on uhci0: 'systat -vm' then displays numbers around 270k.

    # vmstat -i | grep -E '(arcsas|uhci0|Total)'
    irq16: uhci0                  1227020890      67708
    irq24: arcsas0                  12045211        664
    Total                         1266417827      69882

Things to note:

- Booting a USB-less kernel or disabling all USB in the BIOS doesn't change a thing (no interrupt spikes to be seen, but the hangs remain)
- The hangs / interrupt spikes happen just as often when the system is idle
- Board is a Supermicro x8dth
- There are two igb cards
- Root is ZFS as well (separate pool though)
- BIOS, Areca FW and driver are already the latest versions
- Moving the controller to a different slot doesn't change the behaviour
- We have two identical systems and both show the exact same symptoms, so flaky hardware is probably not the issue

Any ideas would be appreciated.

Thanks,
D.
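To put numbers on how often and how long the pool stalls, captured 'zpool iostat 1' output can be scanned for runs of all-zero samples. The following is only a minimal sketch: the function name find_stalls and the min_samples threshold are made up for illustration, and it assumes the seven-column row layout shown above (pool, alloc, free, read ops, write ops, read bandwidth, write bandwidth).

```python
def find_stalls(lines, min_samples=2):
    """Return (start_index, length) for each run of samples with zero I/O.

    `lines` are rows of captured `zpool iostat <interval>` output.
    Indices count data rows only; header and separator rows are skipped.
    """
    stalls = []
    run_start = None  # data-row index where the current zero run began
    sample = 0        # index of the current data row

    for line in lines:
        fields = line.split()
        # A data row has at least 7 fields; anything after the seventh
        # (such as a hand-written "<-- hang starts" note) is ignored.
        if len(fields) < 7 or fields[0] == "pool" or fields[0].startswith("-"):
            continue

        # Columns 4-7 are read ops, write ops, read and write bandwidth.
        if all(f == "0" for f in fields[3:7]):
            if run_start is None:
                run_start = sample
        else:
            if run_start is not None and sample - run_start >= min_samples:
                stalls.append((run_start, sample - run_start))
            run_start = None
        sample += 1

    # A stall may still be in progress when the capture ends.
    if run_start is not None and sample - run_start >= min_samples:
        stalls.append((run_start, sample - run_start))
    return stalls
```

With a one-second interval, the length of each run is roughly the stall duration in seconds, and the gap between start indices gives the period. Feeding it a longer capture (for example the lines of a file written by 'zpool iostat 1 > iostat.log') would show whether the "once per minute" spacing is exact, which would point towards a timer or cron job.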
Ronald Klop
2013-Jun-19 13:28 UTC
Weird I/O hangs (9.1R, arcsas, interrupt spikes on uhci0)
On Wed, 19 Jun 2013 15:01:14 +0200, Dennis Kögel <dk at neveragain.de> wrote:

> Hi,
>
> very periodically, we see I/O hangs for about 10 seconds, roughly once
> per minute.
>
> Each time this happens, the I/O rate simply drops to zero, and all disk
> access hangs; this is also very noticeable on the shell, for NFS clients
> etc. Everything else (networking, kernel, ...) seems to continue normally.
>
> Environment: FreeBSD 9.1R GENERIC on amd64, using ZFS, on a ARC1320 PCIe
> with 24x Seagate ST33000650SS (3rd party arcsas.ko driver).
>
> It's easy to observe these hangs under write load, e.g. with 'zpool
> iostat 1':
>
> void  22.4T  42.6T     34  2.73K  1.07M   293M
> void  22.4T  42.6T     20  2.74K   623K   289M
> void  22.4T  42.6T    144  2.62K  4.83M   279M
> void  22.4T  42.6T     13  2.60K   437K   283M
> void  22.4T  42.6T      0      0      0      0   <-- hang starts
> void  22.4T  42.6T      0      0      0      0
> void  22.4T  42.6T      0      0      0      0
> void  22.4T  42.6T      0      0      0      0
> void  22.4T  42.6T      0      0      0      0
> void  22.4T  42.6T      0      0      0      0
> void  22.4T  42.6T      0      0      0      0
> void  22.4T  42.6T      0      0      0      0
> void  22.4T  42.6T      0    296  4.00K  34.2M   <-- hang ends
> void  22.4T  42.6T      2  2.64K  73.8K   288M
> void  22.4T  42.6T      8  3.12K   278K   329M
>
> Each time this happens, there is a completely unexplained spike of
> interrupts on uhci0: 'systat -vm' then displays numbers around 270k.
>
> # vmstat -i | grep -E '(arcsas|uhci0|Total)'
> irq16: uhci0                  1227020890      67708
> irq24: arcsas0                  12045211        664
> Total                         1266417827      69882
>
> Things to note:
>
> - Booting an USB-less kernel or disabling all USB in the BIOS doesn't
>   change a thing (no interrupt spikes to be seen, but the hangs remain)
> - The hangs / interrupt spikes happen just as often when the system is
>   idle
> - Board is a Supermicro x8dth
> - There's two igb cards
> - Root is ZFS as well (separate pool though)
> - BIOS, Areca FW and driver already are latest versions
> - Putting the controller to a different slot doesn't change the behaviour
> - We have two identical systems and both show the exact same symptoms,
>   so flaky hardware is probably not the issue
>
> Any ideas would be appreciated.
>
> Thanks,
> D.

First send more information about the system:

- The content of /var/run/dmesg.boot.
- Install /usr/ports/sysutils/zfs-stats and send the output of zfs-stats -a.
- Send the output of zpool status + zpool list.
- Did you configure compression or dedup on the pool?
- Do you keep a lot of snapshots?
- Do you run a cronjob every minute which does something with the pool? Gathers statistics or something like that.

Ronald.
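The items requested above could be gathered in one pass with a small script. This is only a sketch under some assumptions: the pool name "void" is taken from the quoted iostat output, the output file names and the collect() helper are invented for illustration, and 'zfs get', 'zfs list -t snapshot' and 'crontab -l' are just one way to answer the compression/dedup, snapshot and cronjob questions (system-wide jobs in /etc/crontab would need a separate look).

```python
import shutil
import subprocess
from pathlib import Path

POOL = "void"  # assumption: pool name as shown in the quoted iostat output

# Output file name -> command to run (file names are arbitrary).
COMMANDS = {
    "zpool-status.txt": ["zpool", "status"],
    "zpool-list.txt": ["zpool", "list"],
    "zfs-stats.txt": ["zfs-stats", "-a"],
    "compression-dedup.txt": ["zfs", "get", "-r", "compression,dedup", POOL],
    "snapshots.txt": ["zfs", "list", "-t", "snapshot"],
    "crontab-root.txt": ["crontab", "-l"],
}


def collect(outdir, run=subprocess.run):
    """Write each command's output to a file in outdir; return the file names."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    written = []

    # The boot messages are a plain file, so copy it if it exists.
    dmesg = Path("/var/run/dmesg.boot")
    if dmesg.is_file():
        shutil.copy(dmesg, outdir / "dmesg.boot")
        written.append("dmesg.boot")

    for name, argv in COMMANDS.items():
        try:
            result = run(argv, capture_output=True, text=True)
            text = result.stdout + result.stderr
        except OSError as exc:
            # E.g. zfs-stats is not installed; record that instead of failing.
            text = f"could not run {' '.join(argv)}: {exc}\n"
        (outdir / name).write_text(text)
        written.append(name)
    return written
```

Running collect("diag") as root would leave one text file per item in ./diag, ready to attach to a reply. The run parameter exists only so the command runner can be swapped out for a dry run.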