Dear All,

I seem to have a problem where really heavy disk I/O is drowning my machine. I see hangs in the shell where I am logged on using ssh. Network connections get dropped for no apparent reason and some HTTP requests are served really slowly. Profiling the app code shows that the hangs are in completely random places. Operations that are no more than a few lines of code apart suddenly take seconds to complete.

In my search I found that my machine is quite slow on the disk. I find that rather odd, given that the device in question is an SSD and it is a good bit faster than the WD drive that used to carry the heavily accessed data set. This drive is doing 1.5 times the throughput, but the hangs have not gone away.

To clarify, the data set used to live on ada2 (see the devlist below), which is a spinning disk. When I experienced intermittent hangs I plugged in an SSD (ada3 on the devlist) and moved the data there. This improved the MB per second being written (it is mostly-write data) but has not changed the hangs. If anything, they have gotten worse since.

Using gstat I notice that I/O service time is quite high. From the gstat output below you can see that it takes just over 2s to serve the requests. The L(q) never seems to drop far below 100 and %busy hovers around 100% all day long. Can someone please help me troubleshoot this further? What can I do to make the underlying problem visible?

I should mention that all data is referenced through cross-mountpoint symlinks. Would that make a difference? Should I use canonical paths in the code instead?

All file systems are mounted "noatime, soft-updates".
Details:

# uname -a
FreeBSD cumin.java-monitor.com 9.0-STABLE FreeBSD 9.0-STABLE #0: Mon Mar 26 14:30:19 UTC 2012 kjkoster@cumin.java-monitor.com:/usr/obj/usr/src/sys/CUMIN amd64
# gstat -f 'ada[0-3]$' -b
dT: 1.001s  w: 1.000s  filter: ada[0-3]$
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0      0      0      0    0.0      0      0    0.0     0.0 ada0
    0      0      0      0    0.0      0      0    0.0     0.0 ada1
    0      0      0      0    0.0      0      0    0.0     0.0 ada2
  103    273      0      0    0.0    273  34630   2062   121.9 ada3
# camcontrol devlist
<WDC WD740ADFD-00NLR1 20.07P20>    at scbus1 target 0 lun 0 (pass0,ada0)
<WDC WD740GD-00FLC0 33.08F33>      at scbus2 target 0 lun 0 (pass1,ada1)
<WDC WD740GD-00FLC0 33.08F33>      at scbus3 target 0 lun 0 (pass2,ada2)
<OCZ SUMMIT VBM1801Q>              at scbus4 target 0 lun 0 (pass3,ada3)
<PepperC Virtual Disc 1 0.01>      at scbus7 target 0 lun 0 (pass4,cd0)
<PepperC Virtual Disc 2 0.01>      at scbus8 target 0 lun 0 (pass5,cd1)

--
Kees Jan

http://java-monitor.com/
kjkoster@kjkoster.org
+31651838192

The secret of success lies in the stability of the goal.
-- Benjamin Disraeli
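
For reference, the ada3 row can be reduced to two easier numbers: the average size of each write and the write throughput in MB/s. The sketch below only does arithmetic on the values posted above; the helper name summarize_row is made up for illustration, and it assumes the column order of `gstat -b` shown in the post (w/s in column 6, write kBps in column 7).

```shell
#!/bin/sh
# Hypothetical helper: summarize one gstat row.
# Assumes the column layout from the post:
#   L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
summarize_row() {
    echo "$1" | awk '{
        # column 7 = write kBps, column 6 = writes per second
        printf "%s: %.1f kB per write, %.1f MB/s written\n", $10, $7 / $6, $7 / 1024
    }'
}

# The ada3 row exactly as posted.
summarize_row "103 273 0 0 0.0 273 34630 2062 121.9 ada3"
```

So the writes are fairly large (well over 100 kB each) rather than many tiny ones, which is worth keeping in mind when reading the replies.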

On 5/29/2012 12:26 PM, Kees Jan Koster wrote:
> I seem to have a problem where really heavy disk I/O is drowning my machine.

Assuming you're using the default scheduler (SCHED_ULE), try switching to the 4BSD scheduler in your kernel config file and see if that helps.

Doug

--
This .signature sanitized for your protection
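
Before rebuilding anything, it may help to confirm which scheduler the running kernel actually uses. A minimal sketch: kern.sched.name is the FreeBSD sysctl that reports it, while the sched_advice helper and its wording are made up for illustration. On a system without that sysctl the script just reports "unknown".

```shell
#!/bin/sh
# Report the CPU scheduler of the running kernel.
# kern.sched.name is FreeBSD-specific; fall back to "unknown" elsewhere.
current_sched() {
    sysctl -n kern.sched.name 2>/dev/null || echo unknown
}

# Hypothetical helper: map the scheduler name to a next step.
sched_advice() {
    case "$1" in
        ULE)  echo "running SCHED_ULE; a SCHED_4BSD kernel is worth a test" ;;
        4BSD) echo "already running SCHED_4BSD" ;;
        *)    echo "scheduler not detected" ;;
    esac
}

sched_advice "$(current_sched)"
```

If it reports ULE, the switch Doug describes means replacing "options SCHED_ULE" with "options SCHED_4BSD" in the custom kernel config (CUMIN, going by the uname output) and rebuilding the kernel.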

On Tue, May 29, 2012 at 12:26 PM, Kees Jan Koster <kjkoster@gmail.com> wrote:
> I seem to have a problem where really heavy disk I/O is drowning my machine. I see hangs in the shell where I am logged on using ssh. Network connections get dropped for no apparent reason and some HTTP requests are served really slowly.
> [...]
> Using gstat I notice that I/O service time is quite high. [...] The L(q) seems to never drop far below 100 and %busy hovers around 100% all day long.
> [...]

You may want to play around with gsched, the GEOM scheduler. Matt Dillon did a bunch of tests comparing FreeBSD+UFS to DragonflyBSD+HAMMER and found that FreeBSD starves read threads in order to satisfy write threads (or the other way around?).

But adding gsched into the mix helped things immensely, allowing mixed reads/writes to better share disk I/O resources. I'll see if I can dig up a link to his testing e-mail messages.

--
Freddie Cash
fjwcash@gmail.com
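
A rough sketch of what trying gsched on the busy disk might look like. The command forms follow the examples in gsched(8) as best I recall them, so please check the man page on your release before running anything. The rr (round-robin) algorithm is only an assumed first choice, and the run and gsched_try helpers are made up for illustration. The script is a dry run by default: it prints the commands instead of executing them.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 and run as root on a test system to actually apply them.
DRY_RUN=${DRY_RUN:-1}
DISK=${DISK:-ada3}   # the busy SSD from the gstat output
ALGO=${ALGO:-rr}     # assumption: round-robin as a first algorithm to try

# Print a command and execute it only when not in dry-run mode.
run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

gsched_try() {
    run kldload geom_sched                     # GEOM scheduler framework
    run kldload "gsched_$ALGO"                 # the scheduling algorithm module
    run geom sched insert -a "$ALGO" "$DISK"   # attach a scheduler to the disk
    run geom sched status                      # confirm it is in place
}

gsched_try
```

Comparing gstat output and the ssh hangs before and after would show whether the scheduler makes any difference for this workload.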

On Tue, May 29, 2012 at 09:26:32PM +0200, Kees Jan Koster wrote:
> I seem to have a problem where really heavy disk I/O is drowning my machine.
> [...]
> # gstat -f 'ada[0-3]$' -b
> dT: 1.001s  w: 1.000s  filter: ada[0-3]$
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
> [...]
>   103    273      0      0    0.0    273  34630   2062   121.9 ada3
> # camcontrol devlist
> [...]
> <OCZ SUMMIT VBM1801Q>              at scbus4 target 0 lun 0 (pass3,ada3)
> [...]

Check the SSD for its internal block size and make sure your filesystem and partitions are aligned with the disk block size. Unless there is something wrong with your SATA controller, I'd expect a lot more than 273 IOPS and ~30 MByte/sec from an SSD.

Regards,
Gary
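
The alignment check Gary suggests comes down to one modulo operation once the numbers are known. A minimal sketch: the partition start (in sectors) can be read from "gpart show ada3" and the sector size from "diskinfo -v ada3". The start sectors 63 and 2048 below are made-up examples, not values from this machine, and 4096 bytes is only an assumed internal block size, since the thread does not say what the OCZ Summit actually uses. The is_aligned helper is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: is a partition start aligned to a given block size?
#   $1 = partition start in sectors (from "gpart show")
#   $2 = sector size in bytes       (from "diskinfo -v")
#   $3 = block size to align to, in bytes (assumed, e.g. 4096)
is_aligned() {
    [ $(( $1 * $2 % $3 )) -eq 0 ]
}

# Made-up example start sectors: 63 is the classic MBR offset, 2048 is 1 MiB.
for start in 63 2048; do
    if is_aligned "$start" 512 4096; then
        echo "start sector $start: aligned to 4096 bytes"
    else
        echo "start sector $start: NOT aligned to 4096 bytes"
    fi
done
```

If the SSD's real erase block is larger than the assumed 4096 bytes, pass that size as the third argument instead. A partition that starts on a 1 MiB boundary is aligned for any power-of-two block size up to 1 MiB.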