The Emulex Fibre Channel device driver supports a max-xfer-size tunable
(via /kernel/drv/emlxs.conf) which applies only to i386. This tunable
specifies the scatter-gather list buffer size. By default, this value is
set to 339968 for Solaris 10. Today I experimented with doubling this
value to 688128 and was happy to see a large increase in sequential read
performance from my ZFS pool, which is based on six mirror vdevs.
Sequential read performance jumped from 552787 KB/s to 799626 KB/s. It
seems that the default driver buffer size interferes with ZFS's ability
to double the read performance by balancing the reads across the mirror
devices. Now the read performance is almost 2X the write performance.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
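A minimal sketch of the change being discussed, assuming the usual
driver.conf name=value syntax for /kernel/drv/emlxs.conf (the parameter
name and the two values are the ones quoted above, not copied from a
live config):

    # /kernel/drv/emlxs.conf (this parameter applies to i386 only)
    # Solaris 10 default scatter-gather list buffer size is 339968.
    max-xfer-size=688128;

As with any driver.conf change, the driver has to re-read its
configuration (typically a reboot) before the new value takes effect.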
On Sun, 28 Jun 2009, Bob Friesenhahn wrote:

> Today I experimented with doubling this value to 688128 and was happy
> to see a large increase in sequential read performance from my ZFS
> pool, which is based on six mirror vdevs. Sequential read performance
> jumped from 552787 KB/s to 799626 KB/s. It seems that the default
> driver buffer size interferes with ZFS's ability to double the read
> performance by balancing the reads across the mirror devices. Now the
> read performance is almost 2X the write performance.

Grumble. This may be a bit of a red herring. When testing with a 16GB
file, the reads were definitely faster. With 32GB and 64GB test files
the read performance is the same as before. Now I am thinking that the
improved performance with the 16GB file is due to the test being
executed on a freshly booted system vs. one that has run for a week.
Perhaps somehow there is still some useful caching going on with the
16GB file.

Bob
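One way to check whether ARC caching explains the 16GB result would be
to watch the ARC size while the test runs; a sketch using the standard
Solaris kstat names (not actual output from this system):

    # Print the ZFS ARC size in bytes once per second during the read test.
    kstat -p zfs:0:arcstats:size 1

If the ARC is large enough to hold a useful fraction of the 16GB file
but not of the 32GB or 64GB files, that would line up with the caching
explanation above.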
On Sun, 28 Jun 2009, Bob Friesenhahn wrote:

> On Sun, 28 Jun 2009, Bob Friesenhahn wrote:
>> Today I experimented with doubling this value to 688128 and was
>> happy to see a large increase in sequential read performance from my
>> ZFS pool, which is based on six mirror vdevs. Sequential read
>> performance jumped from 552787 KB/s to 799626 KB/s. It seems that
>> the default driver buffer size interferes with ZFS's ability to
>> double the read performance by balancing the reads across the mirror
>> devices. Now the read performance is almost 2X the write performance.
>
> Grumble. This may be a bit of a red herring.

Perhaps this Emulex tunable was not entirely a red herring. Doubling
the default for this tunable made a difference to my application: it
dropped total real execution time from 2:45:03.152 to 2:24:25.675,
which is a pretty large improvement. If I run two copies of my
application at once and divide up the work, the execution time is
1:42:32.42. Even with two (or three) copies of the application running,
it seems that ZFS is still the bottleneck, since the square wave of
system CPU utilization becomes even more prominent, indicating that all
readers are blocked during the TXG sync.

Bob
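To put numbers on the TXG sync stalls, something like the DTrace
one-liner below should report how long each sync takes; this is a
sketch that assumes the fbt provider exposes spa_sync() on this kernel
and that its second argument is the txg number:

    # Report the duration of each ZFS transaction group sync, in ms.
    dtrace -n 'fbt::spa_sync:entry { self->ts = timestamp; self->txg = arg1; }
               fbt::spa_sync:return /self->ts/ {
                 printf("txg %d synced in %d ms", self->txg,
                     (timestamp - self->ts) / 1000000);
                 self->ts = 0; self->txg = 0;
               }'

If the sync times line up with the flat tops of the square wave in
system CPU utilization, that would support the theory that the readers
are blocked for the whole sync.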