Maureen Chew
2009-Nov-18  13:58 UTC
[zfs-discuss] open(2), but no I/O to large files creates performance hit
I'm seeing a performance anomaly where opening a large file (but doing
*no* I/O to it) seems to cause (or correlates to) a significant
performance hit on a mirrored ZFS filesystem.  Counterintuitively, if I
set zfs_prefetch_disable (i.e., turn prefetching off), I don't see the
performance degradation.  It doesn't make sense that this would help
unless there is some cache/VM pollution resulting from the open(2).
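(For reference, the usual ways to flip this tunable on Solaris 10 are
either live via mdb or persistently via /etc/system; shown here for
completeness, not necessarily exactly what I typed:

    # echo "zfs_prefetch_disable/W0t1" | mdb -kw
    set zfs:zfs_prefetch_disable = 1        <- in /etc/system, takes effect at boot

Use W0t0 / = 0 to turn prefetch back on.)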
The basic situation is that there are 8 SAS applications running a
statistical procedure.  The 8 applications read the same 26GB file, but
each writes its own unique 41GB output file (using vanilla
read(2)/write(2) with 4K/8K/16K I/O sizes).  All I/O goes to the same
mirrored ZFS filesystem.  The performance problem occurs only if the
41GB output files already exist.  In that case, SAS opens the file to be
overwritten for exclusive read/write (so nothing else can touch it),
writes the output to a file with a temporary name, and then rename()s
the temp file to the real name when done.  I've verified with dtrace
that no I/O is done to the original file that is to be overwritten.
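To make the pattern concrete, here is a minimal C sketch of the sequence
as I understand it (this is not SAS source; the paths, the fcntl() lock,
and the 16K write size are just placeholders for illustration):

/*
 * Sketch only -- not SAS source.  Paths, the locking mechanism, and the
 * write size are placeholders; the point is the sequence: open the
 * existing output for exclusive read/write but never touch it, write
 * the new data to a temp file, rename() the temp over the original.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	const char *final = "/zpl1/FSUSER1.out";	/* existing 41GB output (placeholder path) */
	const char *temp  = "/zpl1/FSUSER1.out.tmp";
	static char buf[16384];				/* stands in for the 4K/8K/16K writes */
	struct flock fl = { 0 };

	/* Open the old output and lock it exclusively; no I/O ever hits old_fd. */
	int old_fd = open(final, O_RDWR);
	if (old_fd == -1) { perror("open old"); exit(1); }
	fl.l_type = F_WRLCK;
	fl.l_whence = SEEK_SET;				/* l_start/l_len = 0: whole file */
	if (fcntl(old_fd, F_SETLK, &fl) == -1) { perror("lock"); exit(1); }

	/* Write the new result to a temp file (shortened here; the real job writes ~41GB). */
	int tmp_fd = open(temp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (tmp_fd == -1) { perror("open temp"); exit(1); }
	memset(buf, 'x', sizeof (buf));
	for (int i = 0; i < 4; i++)
		if (write(tmp_fd, buf, sizeof (buf)) == -1) { perror("write"); exit(1); }
	close(tmp_fd);

	/* Atomically replace the original, then drop the old fd (and its lock). */
	if (rename(temp, final) == -1) { perror("rename"); exit(1); }
	close(old_fd);
	return (0);
}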
If the files don't exist (or if zfs_prefetch_disable is set to true),
the times for each of the 8 jobs look similar to this (all jobs finish
in under 5 hours):
# ../get-times.sh
FSUSER1.log:4:50:31.21
FSUSER2.log:4:51:09.99
FSUSER3.log:4:49:35.22
FSUSER4.log:4:50:31.05
FSUSER5.log:4:50:30.11
FSUSER6.log:4:50:29.51
FSUSER7.log:4:49:35.22
FSUSER8.log:4:51:08.53
If the output files exist (and I don't touch zfs_prefetch_disable), the
times look like this (3 of the 8 jobs run in less than 5 hours; the
other 5 take about 5.5 hours, almost 50 minutes longer):
# ../get-times.sh
FSUSER1.log:5:35:30.77
FSUSER2.log:5:36:23.41
FSUSER3.log:5:32:16.61
FSUSER4.log:5:33:53.25
FSUSER5.log:4:49:54.28
FSUSER6.log:5:35:42.93
FSUSER7.log:4:49:54.86
FSUSER8.log:4:50:38.42
Does this behavior make sense, or does tweaking zfs_prefetch_disable
just trigger some other effect?  Setting that flag (disabling prefetch)
doesn't seem to hurt performance, at least for this test scenario.
System specs:
- Solaris 10 U7
- 2-way Nehalem (Xeon X5550, 8 virtual processors), 2.67GHz
- 24GB RAM
- J4400 for storage (JBOD)
# zpool status zpl1
   pool: zpl1
  state: ONLINE
  scrub: none requested
config:
         NAME         STATE     READ WRITE CKSUM
         zpl1         ONLINE       0     0     0
           mirror     ONLINE       0     0     0
             c0t2d0   ONLINE       0     0     0
             c0t3d0   ONLINE       0     0     0
           mirror     ONLINE       0     0     0
             c0t4d0   ONLINE       0     0     0
             c0t5d0   ONLINE       0     0     0
           mirror     ONLINE       0     0     0
             c0t6d0   ONLINE       0     0     0
             c0t7d0   ONLINE       0     0     0
           mirror     ONLINE       0     0     0
             c0t8d0   ONLINE       0     0     0
             c0t9d0   ONLINE       0     0     0
           mirror     ONLINE       0     0     0
             c0t10d0  ONLINE       0     0     0
             c0t11d0  ONLINE       0     0     0
           mirror     ONLINE       0     0     0
             c0t12d0  ONLINE       0     0     0
             c0t13d0  ONLINE       0     0     0
errors: No known data errors
