Maureen Chew
2009-Nov-18 13:58 UTC
[zfs-discuss] open(2), but no I/O to large files creates performance hit
I'm seeing a performance anomaly where opening a large file (but doing *no* I/O to it) seems to cause, or at least correlates with, a significant performance hit on a mirrored ZFS filesystem. Counterintuitively, if I set zfs_prefetch_disable (i.e., turn prefetch off), I don't see the degradation. It doesn't make sense that this would help unless there is some cache/VM pollution resulting from the open(2).

The basic situation: 8 SAS applications run the same statistical procedure. All 8 read the same 26GB input file, but each writes its own unique 41GB output file (vanilla read(2)/write(2) with 4K/8K/16K I/O sizes). All I/O goes to the same mirrored ZFS filesystem.

The performance problem occurs only if the 41GB output files already exist. In that case SAS opens the file to be overwritten for exclusive read/write (so no one else can touch it), writes the output under a temp name, and then rename()s the temp file to the real name when done. I've verified with dtrace that no I/O is done to the original file that is about to be overwritten.

If the output files don't exist (or if zfs_prefetch_disable is set to true), the times for the 8 jobs look like this (all jobs finish in under 5 hours):

# ../get-times.sh
FSUSER1.log:4:50:31.21
FSUSER2.log:4:51:09.99
FSUSER3.log:4:49:35.22
FSUSER4.log:4:50:31.05
FSUSER5.log:4:50:30.11
FSUSER6.log:4:50:29.51
FSUSER7.log:4:49:35.22
FSUSER8.log:4:51:08.53

If the output files exist (and I don't touch zfs_prefetch_disable), the times look like this (3 of the 8 jobs finish in under 5 hours; the other 5 take about 5.5 hours, almost 50 minutes longer):

# ../get-times.sh
FSUSER1.log:5:35:30.77
FSUSER2.log:5:36:23.41
FSUSER3.log:5:32:16.61
FSUSER4.log:5:33:53.25
FSUSER5.log:4:49:54.28
FSUSER6.log:5:35:42.93
FSUSER7.log:4:49:54.86
FSUSER8.log:4:50:38.42

Does this behavior make sense, or does tweaking zfs_prefetch_disable just trigger some other effect? For this test, setting that flag (disabling prefetch) doesn't seem to hurt performance, at least in this scenario. (Sketches of how prefetch can be toggled and how to check for I/O against the original files are appended after the pool config below.)

System specs:
- S10 U7
- 2-way Nehalem (Xeon X5550, 8 virtual processors), 2.67GHz
- 24GB RAM
- J4400 for storage (JBOD)

# zpool status zpl1
  pool: zpl1
 state: ONLINE
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        zpl1         ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c0t2d0   ONLINE       0     0     0
            c0t3d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c0t4d0   ONLINE       0     0     0
            c0t5d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c0t6d0   ONLINE       0     0     0
            c0t7d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c0t8d0   ONLINE       0     0     0
            c0t9d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c0t10d0  ONLINE       0     0     0
            c0t11d0  ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c0t12d0  ONLINE       0     0     0
            c0t13d0  ONLINE       0     0     0

errors: No known data errors
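
For anyone who wants to reproduce the prefetch comparison, here's a minimal sketch of how zfs_prefetch_disable can be flipped at runtime with mdb, or set persistently in /etc/system (the mdb write takes effect immediately; the /etc/system line needs a reboot):

# echo "zfs_prefetch_disable/W0t1" | mdb -kw     (1 disables prefetch, 0 re-enables it)
# echo "zfs_prefetch_disable/D" | mdb -k         (show the current value)

or, persistently, in /etc/system:

set zfs:zfs_prefetch_disable = 1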
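
For what it's worth, a check along these lines is roughly the kind of thing I mean by "verified with dtrace" -- it counts any read(2)/write(2) that actually lands on a given file. This assumes your DTrace build has the fds[] array, and the pathname below is just a placeholder for one of the pre-existing 41GB output files:

# dtrace -n 'syscall::read:entry,syscall::write:entry
    /fds[arg0].fi_pathname == "/zpl1/sasout/FSUSER1.sas7bdat"/
    { @[probefunc] = count(); }'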
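
And to poke at the cache-pollution theory directly, the ARC kstats carry prefetch hit/miss counters (assuming the prefetch_* statistics are present in this S10 build) that can be sampled before and after a run to see whether prefetch is chewing on the pre-existing output files:

# kstat -p zfs:0:arcstats:prefetch_data_hits \
        zfs:0:arcstats:prefetch_data_misses \
        zfs:0:arcstats:size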