Finally cleared out a backlog of results to upload. Main performance
page is updated with all the links. (http://btrfs.boxacle.net/) Most
recent results are on 2.6.29-rc2. As usual see analysis directory of
results for oprofile, including call graphs.

Single disk results are not too bad. Raid still falls apart on any
write heavy workload.

Steve
On Mon, 2009-02-02 at 09:58 -0600, Steven Pratt wrote:
> Finally cleared out a backlog of results to upload. Main performance
> page is updated with all the links. (http://btrfs.boxacle.net/) Most
> recent results are on 2.6.29-rc2. As usual see analysis directory of
> results for oprofile, including call graphs.
>
> Single disk results are not too bad. Raid still falls apart on any
> write heavy workload.

Thanks Steve, was the mainline btrfs used for this? I'm working on the
write heavy problems this week.

-chris
Chris Mason wrote:
> On Mon, 2009-02-02 at 09:58 -0600, Steven Pratt wrote:
>> Finally cleared out a backlog of results to upload. Main performance
>> page is updated with all the links. (http://btrfs.boxacle.net/) Most
>> recent results are on 2.6.29-rc2. As usual see analysis directory of
>> results for oprofile, including call graphs.
>>
>> Single disk results are not too bad. Raid still falls apart on any
>> write heavy workload.
>
> Thanks Steve, was the mainline btrfs used for this? I'm working on the
> write heavy problems this week.

This was straight mainline Linus tree.

Steve
On Mon, Feb 2, 2009 at 9:28 PM, Steven Pratt <slpratt@austin.ibm.com> wrote:
> Finally cleared out a backlog of results to upload. Main performance page is updated with all the links. (http://btrfs.boxacle.net/) Most recent results are on 2.6.29-rc2. As usual see analysis directory of results for oprofile, including call graphs.
>
> Single disk results are not too bad. Raid still falls apart on any write heavy workload.

Would you mind explaining how bad the results are, and how much more
this needs to be improved for Btrfs to be acceptable performance-wise?

I see that Btrfs lags behind XFS almost everywhere, and behind the
others in some cases.

Thanks,
Andev
debian developer wrote:
> On Mon, Feb 2, 2009 at 9:28 PM, Steven Pratt <slpratt@austin.ibm.com> wrote:
>> Finally cleared out a backlog of results to upload. Main performance page is updated with all the links. (http://btrfs.boxacle.net/) Most recent results are on 2.6.29-rc2. As usual see analysis directory of results for oprofile, including call graphs.
>>
>> Single disk results are not too bad. Raid still falls apart on any write heavy workload.
>
> Would you mind explaining how bad the results are, and how much more
> this needs to be improved for Btrfs to be acceptable performance-wise?
>
> I see that Btrfs lags behind XFS almost everywhere, and behind the
> others in some cases.

Nobody working on btrfs development is satisfied with the current
performance. We knew before the merge that the present code would not
be a benchmarking champion. We are working on improving it. The more
testing of different configurations and the more feedback we get, the
better we understand which areas need work.

For example, I'm working on really implementing O_DIRECT. Today
O_DIRECT just goes through the buffer cache.

"Acceptable performance" will depend on what features are important to
a user. For example, we expect btrfs to use more CPU than other
filesystems because it does checksumming.

jim
debian developer wrote:
> On Mon, Feb 2, 2009 at 9:28 PM, Steven Pratt <slpratt@austin.ibm.com> wrote:
>> Finally cleared out a backlog of results to upload. Main performance page is updated with all the links. (http://btrfs.boxacle.net/) Most recent results are on 2.6.29-rc2. As usual see analysis directory of results for oprofile, including call graphs.
>>
>> Single disk results are not too bad. Raid still falls apart on any write heavy workload.
>
> Would you mind explaining how bad the results are, and how much more
> this needs to be improved for Btrfs to be acceptable performance-wise?

Well, as I pointed out, most of the write workloads seem to run into
CPU/locking issues on RAID systems (especially at higher thread counts)
where high levels of throughput are expected. There is lots of data out
there, but a good place to look would be
http://btrfs.boxacle.net/repository/raid/2.6.29-rc2/2.6.29-rc2/2.6.29-rc2.html
which shows performance on the latest RC kernel. As thread counts go
up, BTRFS lags more and more on write workloads. For example, on the
128-thread random write test
http://btrfs.boxacle.net/repository/raid/2.6.29-rc2/2.6.29-rc2/2.6.29-rc2_Large_file_random_writes._num_threads=128.html
BTRFS achieves about 4MB/sec where the next-worst FS (XFS in this case)
gets 78MB/sec. So for this example BTRFS is slower by a factor of
almost 20x.

Let me point out that this is not a criticism of BTRFS; this is just
the normal development cycle. Most of the major function is now in, and
performance can now become a focus. The point of these benchmarks is to
help identify the areas that need attention and to provide the debug
and analysis data to help facilitate that. Chris has already stated
that he hopes to start looking at write performance this week.

> I see that Btrfs lags behind XFS almost everywhere, and behind the
> others in some cases.

For now.

Steve
On Tue, 2009-02-03 at 19:02 +0530, debian developer wrote:
> On Mon, Feb 2, 2009 at 9:28 PM, Steven Pratt <slpratt@austin.ibm.com> wrote:
>> Finally cleared out a backlog of results to upload. Main performance page is updated with all the links. (http://btrfs.boxacle.net/) Most recent results are on 2.6.29-rc2. As usual see analysis directory of results for oprofile, including call graphs.
>>
>> Single disk results are not too bad. Raid still falls apart on any write heavy workload.
>
> Would you mind explaining how bad the results are, and how much more
> this needs to be improved for Btrfs to be acceptable performance-wise?
>
> I see that Btrfs lags behind XFS almost everywhere, and behind the
> others in some cases.

These benchmarks are great because they hammer on some of the
worst-case code in btrfs. The mail-server benchmark, for example, isn't
quite a mail server workload because it doesn't fsync the files to
disk. But what it does do is hammer on a mixed file read/write/delete
workload, which hits btree concurrency and file layout. In my testing
here, the big difference between ext4 and btrfs isn't writing to files,
it is actually the unlinks. If I take them out of the run, btrfs is
very close to ext4 times.

So, I'm working on that.

The random write workload is probably just a file allocation problem.
Btrfs should perform very well in that workload.

-chris
Chris Mason wrote:
> On Tue, 2009-02-03 at 19:02 +0530, debian developer wrote:
>> On Mon, Feb 2, 2009 at 9:28 PM, Steven Pratt <slpratt@austin.ibm.com> wrote:
>>> Finally cleared out a backlog of results to upload. Main performance page is updated with all the links. (http://btrfs.boxacle.net/) Most recent results are on 2.6.29-rc2. As usual see analysis directory of results for oprofile, including call graphs.
>>>
>>> Single disk results are not too bad. Raid still falls apart on any write heavy workload.
>>
>> Would you mind explaining how bad the results are, and how much more
>> this needs to be improved for Btrfs to be acceptable performance-wise?
>>
>> I see that Btrfs lags behind XFS almost everywhere, and behind the
>> others in some cases.
>
> These benchmarks are great because they hammer on some of the
> worst-case code in btrfs. The mail-server benchmark, for example, isn't
> quite a mail server workload because it doesn't fsync the files to
> disk.

Actually it does. We fixed this after the first round was posted. Any
results since October have fsync on the create of new files. From the
latest runs for the mailserver workload:

op weights
    read            = 0 (0.00%)
    readall         = 4 (57.14%)
    write           = 0 (0.00%)
    create          = 0 (0.00%)
    append          = 0 (0.00%)
    delete          = 1 (14.29%)
    metaop          = 0 (0.00%)
    createdir       = 0 (0.00%)
    stat            = 0 (0.00%)
    writeall        = 0 (0.00%)
    writeall_fsync  = 0 (0.00%)
    open_close      = 0 (0.00%)
    write_fsync     = 0 (0.00%)
    create_fsync    = 2 (28.57%)
    append_fsync    = 0 (0.00%)

We should probably add fsync to the list of system calls that we track
latency for.

Steve

> But what it does do is hammer on a mixed file read/write/delete
> workload, which hits btree concurrency and file layout. In my testing
> here, the big difference between ext4 and btrfs isn't writing to files,
> it is actually the unlinks. If I take them out of the run, btrfs is
> very close to ext4 times.
>
> So, I'm working on that.
>
> The random write workload is probably just a file allocation problem.
> Btrfs should perform very well in that workload.
>
> -chris
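[For readers unfamiliar with the op names above, a minimal sketch of what a create_fsync operation amounts to at the syscall level is shown below: create a new file, write its contents, and fsync before close. This is an illustration using plain POSIX calls, not the benchmark's actual code, and the function name is made up for the example.]

```c
/* Illustrative only: the create-then-fsync pattern the create_fsync
 * op weight above refers to. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int create_fsync(const char *path, const char *buf, size_t len)
{
        int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
        if (fd < 0) {
                perror("open");
                return -1;
        }
        if (write(fd, buf, len) != (ssize_t)len) {
                perror("write");
                close(fd);
                return -1;
        }
        /* force the new file's data and metadata to disk before close */
        if (fsync(fd) < 0) {
                perror("fsync");
                close(fd);
                return -1;
        }
        return close(fd);
}
```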
On Tue, 2009-02-03 at 09:13 -0600, Steven Pratt wrote:
> Chris Mason wrote:
>> On Tue, 2009-02-03 at 19:02 +0530, debian developer wrote:
>>> On Mon, Feb 2, 2009 at 9:28 PM, Steven Pratt <slpratt@austin.ibm.com> wrote:
>>>> Finally cleared out a backlog of results to upload. Main performance page is updated with all the links. (http://btrfs.boxacle.net/) Most recent results are on 2.6.29-rc2. As usual see analysis directory of results for oprofile, including call graphs.
>>>>
>>>> Single disk results are not too bad. Raid still falls apart on any write heavy workload.
>>>
>>> Would you mind explaining how bad the results are, and how much more
>>> this needs to be improved for Btrfs to be acceptable performance-wise?
>>>
>>> I see that Btrfs lags behind XFS almost everywhere, and behind the
>>> others in some cases.
>>
>> These benchmarks are great because they hammer on some of the
>> worst-case code in btrfs. The mail-server benchmark, for example, isn't
>> quite a mail server workload because it doesn't fsync the files to
>> disk.
>
> Actually it does. We fixed this after the first round was posted.

Oh, sorry, I thought that was put into a different run with fsync on.
So, the mail server benchmark I've been tuning locally doesn't have
fsync on ;) Deletes are still the slow part on that one.

Thanks for the correction though, I'll look at the fsync perf once the
non-fsync perf is faster.

-chris
On Tue, Feb 03, 2009 at 10:01:36AM -0500, Chris Mason wrote:
> [...] In my testing
> here, the big difference between ext4 and btrfs isn't writing to files,
> it is actually the unlinks. If I take them out of the run, btrfs is
> very close to ext4 times.

Oh man, what is it with unlinks? Nobody does them very fast. We use
"delayed delete" with Cyrus so that the majority of unlinks get saved
for the weekend, and even then we run them serially because the IO hit
is so high. We do more IO during the cyr_expire run than even at the
peak of the U.S. day.

A "multi-unlink" API would be seriously nice, where you could say "I
want all these files to disappear, so don't bother trying to keep the
directory entries consistent in between". Especially, 'rm -rf'
performance really sucks with single unlinks - you're re-creating all
this directory data that's just going to be discarded in a second
anyway.

Bron ( wondering how much is "it's a hard problem" and how much is
       "nobody bothers to optimise it" )
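[To make the per-file cost concrete, here is a minimal sketch of what a recursive delete boils down to on Linux today: one unlink()/rmdir() call per entry, with the parent directory kept consistent after each one. This is an illustration built on POSIX nftw(), not Cyrus or coreutils code.]

```c
/* Illustrative only: recursive delete as a series of single unlinks,
 * the pattern 'rm -rf' is stuck with in the absence of a batch API. */
#define _XOPEN_SOURCE 500
#include <ftw.h>
#include <stdio.h>
#include <unistd.h>

static int delete_one(const char *path, const struct stat *sb,
                      int typeflag, struct FTW *ftwbuf)
{
        /* With FTW_DEPTH, directories are visited after their contents,
         * so rmdir() here always sees an already-emptied directory. */
        int ret = (typeflag == FTW_DP) ? rmdir(path) : unlink(path);

        if (ret)
                perror(path);
        return ret;
}

int rm_rf(const char *root)
{
        /* One syscall per entry; each unlink forces the filesystem to
         * update the parent directory before the next one runs. */
        return nftw(root, delete_one, 64, FTW_DEPTH | FTW_PHYS);
}
```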
On Mon, 2009-02-02 at 09:58 -0600, Steven Pratt wrote:
> Finally cleared out a backlog of results to upload. Main performance
> page is updated with all the links. (http://btrfs.boxacle.net/) Most
> recent results are on 2.6.29-rc2. As usual see analysis directory of
> results for oprofile, including call graphs.
>
> Single disk results are not too bad. Raid still falls apart on any
> write heavy workload.
>
> Steve

Quick link to the latest single disk IO results:
http://btrfs.boxacle.net/repository/single-disk/2.6.29-rc2/2.6.29-rc2/2.6.29-rc2.html

Steve's fs performance tests show a regression on ext4 compared with
ext3 for O_DIRECT random writes, about 10% on a single disk:
http://btrfs.boxacle.net/repository/single-disk/2.6.29-rc2/2.6.29-rc2/2.6.29-rc2_Large_file_random_writes_odirect._num_threads=32.html

Looking at the oprofile data, ext4 seems to mark the inode dirty quite
a bit more often than ext3 does:

     62  0.0063  ext3.ko  ext3_mark_iloc_dirty
    367  0.0371  ext4.ko  ext4_mark_iloc_dirty

This is also seen in the random read test (with no O_DIRECT) oprofile
data, so it doesn't seem to be an O_DIRECT-specific issue. Any
thoughts?

More detail about the ext3/4 random O_DIRECT write oprofiles...

ext3:
:~/ext4$ grep ext3 oprofile-ext3-random-write-odirect.001
    554  0.0567  ext3.ko  ext3_get_branch
    346  0.0354  ext3.ko  ext3_file_write
    336  0.0344  ext3.ko  ext3_direct_IO
    215  0.0220  ext3.ko  ext3_get_blocks_handle
    125  0.0128  ext3.ko  ext3_block_to_path
    112  0.0115  ext3.ko  verify_chain
     76  0.0078  ext3.ko  ext3_get_block
     62  0.0063  ext3.ko  ext3_mark_iloc_dirty
     39  0.0040  ext3.ko  __ext3_get_inode_loc
     17  0.0017  ext3.ko  ext3_getblk
     14  0.0014  ext3.ko  ext3_get_group_desc
     13  0.0013  ext3.ko  __ext3_journal_get_write_access
     13  0.0013  ext3.ko  ext3_find_entry
     13  0.0013  ext3.ko  ext3_new_blocks
     13  0.0013  ext3.ko  ext3_reserve_inode_write
    ....

ext4:
$ grep ext4 oprofile-ext4-random-write-odirect.001
warning: could not check that the binary file /lib/modules/2.6.29-rc2/kernel/fs/ext4/ext4.ko has not been modified since the profile was taken. Results may be inaccurate.
    420  0.0425  ext4.ko  ext4_direct_IO
    411  0.0416  ext4.ko  ext4_file_write
    403  0.0408  ext4.ko  ext4_ext_find_extent
    374  0.0378  ext4.ko  __ext4_get_inode_loc
    367  0.0371  ext4.ko  ext4_mark_iloc_dirty
    202  0.0204  ext4.ko  ext4_ext_get_blocks
    194  0.0196  ext4.ko  ext4_get_blocks_wrap
    179  0.0181  ext4.ko  __ext4_ext_check_header
    178  0.0180  ext4.ko  ext4_journal_start_sb
    117  0.0118  ext4.ko  ext4_get_block
    109  0.0110  ext4.ko  ext4_get_group_desc
     89  0.0090  ext4.ko  ext4_mark_inode_dirty
     84  0.0085  ext4.ko  ext4_dirty_inode
     74  0.0075  ext4.ko  __ext4_journal_stop
    .....
:~/ext4$ grep jbd2 oprofile-ext4-random-write-odirect.001
    450  0.0455  jbd2.ko  start_this_handle
    288  0.0291  jbd2.ko  do_get_write_access
    287  0.0290  jbd2.ko  jbd2_journal_stop
    244  0.0247  jbd2.ko  jbd2_journal_start
    235  0.0238  jbd2.ko  jbd2_journal_dirty_metadata
    214  0.0216  jbd2.ko  jbd2_journal_add_journal_head

~/ext4$ grep jbd oprofile-ext3-random-write-odirect.001
    116  0.0119  jbd.ko  journal_add_journal_head
     94  0.0096  jbd.ko  do_get_write_access
     73  0.0075  jbd.ko  start_this_handle
     68  0.0070  jbd.ko  journal_clean_one_cp_list
     58  0.0059  jbd.ko  journal_commit_transaction
     48  0.0049  jbd.ko  journal_dirty_metadata
     45  0.0046  jbd.ko  __journal_file_buffer
     42  0.0043  jbd.ko  __journal_temp_unlink_buffer
     42  0.0043  jbd.ko  journal_put_journal_head
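[For context, a minimal sketch of the kind of I/O the O_DIRECT random-write test issues: block-aligned buffers written with pwrite() at random block-aligned offsets within an existing file opened with O_DIRECT. The 4096-byte block size and the pre-existing file are assumptions for the illustration; this is not the actual benchmark harness.]

```c
/* Illustrative only: O_DIRECT random writes with aligned buffers and
 * offsets, roughly the access pattern the benchmark exercises. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLOCK   4096          /* assumed alignment for O_DIRECT */
#define NWRITES 1024

int main(int argc, char **argv)
{
        if (argc != 2) {
                fprintf(stderr, "usage: %s <existing file>\n", argv[0]);
                return 1;
        }

        int fd = open(argv[1], O_WRONLY | O_DIRECT);
        if (fd < 0) {
                perror("open");
                return 1;
        }

        off_t size = lseek(fd, 0, SEEK_END);   /* write within the file */
        if (size < BLOCK) {
                fprintf(stderr, "file too small\n");
                return 1;
        }

        void *buf;
        if (posix_memalign(&buf, BLOCK, BLOCK)) {
                perror("posix_memalign");
                return 1;
        }
        memset(buf, 0, BLOCK);

        for (int i = 0; i < NWRITES; i++) {
                /* pick a random block-aligned offset inside the file */
                off_t off = (random() % (size / BLOCK)) * BLOCK;
                if (pwrite(fd, buf, BLOCK, off) != BLOCK) {
                        perror("pwrite");
                        return 1;
                }
        }

        free(buf);
        return close(fd);
}
```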