Tao Ma
2010-Jul-06 06:27 UTC
[Ocfs2-devel] [PATCH 0/6 v6][RFC] jbd[2]: enhance fsync performance when using CFQ
Hi Jeff,

On 07/03/2010 03:58 AM, Jeff Moyer wrote:
> Hi,
>
> Running iozone or fs_mark with fsync enabled, the performance of CFQ is
> far worse than that of deadline for enterprise class storage when dealing
> with file sizes of 8MB or less. I used the following command line as a
> representative test case:
>
> fs_mark -S 1 -D 10000 -N 100000 -d /mnt/test/fs_mark -s 65536 -t 1 -w 4096 -F
>
I ran the script with "35-rc4 + this patch version" for an ocfs2 volume,
and get no hang now. Thanks for the work. I also have some number for you.
See below.
>
> Because the iozone process is issuing synchronous writes, it is put
> onto CFQ's SYNC service tree. The significance of this is that CFQ
> will idle for up to 8ms waiting for requests on such queues. So,
> what happens is that the iozone process will issue, say, 64KB worth
> of write I/O. That I/O will just land in the page cache. Then, the
> iozone process does an fsync which forces those I/Os to disk as
> synchronous writes. Then, the file system's fsync method is invoked,
> and for ext3/4, it calls log_start_commit followed by log_wait_commit.
> Because those synchronous writes were forced out in the context of the
> iozone process, CFQ will now idle on iozone's cfqq waiting for more I/O.
> However, iozone's progress is gated by the journal thread, now.
>
> With this patch series applied (in addition to the two other patches I
> sent [1]), CFQ now achieves 530.82 files / second.
>
> I also wanted to improve the performance of the fsync-ing process in the
> presence of a competing sequential reader. The workload I used for that
> was a fio job that did sequential buffered 4k reads while running the fs_mark
> process. The run-time was 30 seconds, except where otherwise noted.
>
> Deadline got 450 files/second while achieving a throughput of 78.2 MB/s for
> the sequential reader. CFQ, unpatched, did not finish an fs_mark run
> in 30 seconds. I had to bump the time of the test up to 5 minutes, and then
> CFQ saw an fs_mark performance of 6.6 files/second and sequential reader
> throughput of 137.2MB/s.
>
> The fs_mark process was being starved as the WRITE_SYNC I/O is marked
> with RQ_NOIDLE, and regular WRITES are part of the async workload by
> default. So, a single request would be served from either the fs_mark
> process or the journal thread, and then they would give up the I/O
> scheduler.
>
> After applying this patch set, CFQ can now perform 113.2 files/second while
> achieving a throughput of 78.6 MB/s for the sequential reader. In table
> form, the results (all averages of 5 runs) look like this:
>
>                  just     just
>                fs_mark    fio       mixed
> -------------------------------+--------------
> deadline        529.44   151.4 | 450.0   78.2
> vanilla cfq     107.88   164.4 |   6.6  137.2
> patched cfq     530.82   158.7 | 113.2   78.6

Just some updates from the test of ocfs2.

                fs_mark
------------------------
deadline        386.3
vanilla cfq      59.7
patched cfq     366.2

So there is really a fantastic improvement at least from what fs_mark
gives us. Great thanks.

Regards,
Tao
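
For readers who want to see the I/O pattern discussed in this thread, here is a minimal Python sketch of the write-then-fsync loop: buffered writes land in the page cache, then fsync forces them to disk before the next file is started. It is not how fs_mark itself is implemented, and the function name `write_and_fsync_files` and its parameters are hypothetical. The defaults mirror the 65536 and 4096 values from the command line above, on the assumption that they are the file size and the write size. Rates measured this way are not comparable to the figures in the tables.

```python
import os
import tempfile
import time


def write_and_fsync_files(directory, num_files, file_size=65536, write_size=4096):
    """Create num_files files, fsync each one, and return files per second.

    Hypothetical helper for illustration only; not part of fs_mark.
    """
    buf = b"\0" * write_size
    start = time.monotonic()
    for i in range(num_files):
        path = os.path.join(directory, f"file_{i:06d}")
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            written = 0
            while written < file_size:
                chunk = buf[: min(write_size, file_size - written)]
                # Buffered write: the data normally lands in the page cache.
                written += os.write(fd, chunk)
            # fsync blocks until the file's data has been pushed to storage.
            os.fsync(fd)
        finally:
            os.close(fd)
    elapsed = time.monotonic() - start
    return num_files / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Assumption: a temporary directory stands in for the test mount point.
    with tempfile.TemporaryDirectory() as tmp:
        rate = write_and_fsync_files(tmp, num_files=100)
        print(f"{rate:.1f} files/second")
```

Pointing `directory` at a filesystem on the device under test, and switching the I/O scheduler between runs, would give a rough feel for the effect described above.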