As discussed on the BTRFS conference call, Kevin Corry and I have set up
some test machines for doing performance testing on BTRFS. The intent is
to have a semi-permanent setup that we can use to test new features and
code drops in BTRFS, as well as to do comparisons to other file systems.
The systems are pretty much fully automated for execution, so we should
be able to crank out large numbers of different benchmarks as well as
keep up with git changes.

The data is hosted at http://btrfs.boxacle.net/. So far we have uploaded
the data for the single-disk tests. We should be able to upload results
from the larger RAID config tomorrow.

Initial tests were done with the FFSB benchmark, and we picked 5 common
workloads: create, random read, sequential read, random write, and a
mail server emulation. We plan to expand this based on feedback to
include more FFSB tests and/or other workloads.

All runs have complete analysis data with them (iostat, mpstat,
oprofile, sar), as well as the FFSB profiles that can be used to
recreate any test we ran. We also collected blktrace data, but have not
uploaded it due to its size.

Please follow the results link at the bottom of the main page to get to
the current results. Let me know what you like or don't like. I will
post again when we get the RAID data uploaded.

Steve
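For anyone who wants to build a similar harness, a minimal sketch of the
kind of wrapper such a setup uses is below. This is an illustration, not
the actual boxacle.net scripts; the device, results directory, and
profile name are placeholders.

#!/bin/sh
# Sketch of a benchmark wrapper: start the monitors, run one FFSB
# profile, then stop the monitors. Paths below are placeholders.
RESULTS=/tmp/results
mkdir -p "$RESULTS"

# Each monitor samples every 10 seconds until killed.
iostat -x 10  > "$RESULTS/iostat.log"  & IOSTAT_PID=$!
mpstat -P ALL 10 > "$RESULTS/mpstat.log" & MPSTAT_PID=$!
sar -o "$RESULTS/sar.data" 10 > /dev/null & SAR_PID=$!

# Run the benchmark itself (profile name is a placeholder).
ffsb profiles/randomwrite.ffsb > "$RESULTS/ffsb.log" 2>&1

kill $IOSTAT_PID $MPSTAT_PID $SAR_PID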
On Tue, Oct 21, 2008 at 05:20:03PM -0500, Steven Pratt wrote:
> As discussed on the BTRFS conference call, Kevin Corry and I have set
> up some test machines for doing performance testing on BTRFS.
> [...]
> Please follow the results link at the bottom of the main page to get
> to the current results. Let me know what you like or don't like. I
> will post again when we get the RAID data uploaded.

Very interesting data, thank you for posting this. The first comment
I'll make is that -o nodatacow requires -o nodatasum. The sums aren't
valid without the cow.

The FFSB mail server workload, does it do fsync writes?

For the sequential read workload, I'm guessing (hoping) the files are
created in parallel?

-chris
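Concretely, a run that disables data COW should also disable data
checksums at mount time; a minimal example, with the device and mount
point as placeholders:

# Checksums are only valid for COWed data, so the two options go
# together. Device and mount point below are placeholders.
mount -t btrfs -o nodatacow,nodatasum /dev/sdb /mnt/btrfs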
Chris Mason wrote:
> Very interesting data, thank you for posting this. The first comment
> I'll make is that -o nodatacow requires -o nodatasum. The sums aren't
> valid without the cow.

Thought that might be the case. OK, we will drop this variation.

> The FFSB mail server workload, does it do fsync writes?

No, but we have the ability to add that if we choose.

> For the sequential read workload, I'm guessing (hoping) the files are
> created in parallel?

Sorry, setup is still single-threaded.

Steve
On Wed, 2008-10-22 at 08:53 -0500, Steven Pratt wrote:
> Chris Mason wrote:
> > The FFSB mail server workload, does it do fsync writes?
>
> No, but we have the ability to add that if we choose.

I'd be interested in it at least.

> > For the sequential read workload, I'm guessing (hoping) the files
> > are created in parallel?
>
> Sorry, setup is still single-threaded.

Ok, I'll try to reproduce these results. Thanks.

-chris
Steven Pratt wrote:
> As discussed on the BTRFS conference call, Kevin Corry and I have set
> up some test machines for doing performance testing on BTRFS.
> [...]
> Please follow the results link at the bottom of the main page to get
> to the current results. Let me know what you like or don't like. I
> will post again when we get the RAID data uploaded.

RAID data is now uploaded. The config used is 136 15k-rpm fibre channel
disks in 8 arrays, all striped together with DM. These results are not
as favorable to BTRFS, as there seem to be some major issues with the
random write and mail server workloads.

http://btrfs.boxacle.net/repository/raid/Initial-compare/Initial-Compare-RAID0.html

Steve
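For readers reproducing a similar layout, striping several arrays into
one device with the DM striped target looks roughly like the sketch
below. Device names, sizes, and chunk size are placeholders, not the
actual boxacle.net configuration.

#!/bin/bash
# DM striped target table format:
#   <start> <length> striped <#stripes> <chunk_sectors> <dev> <offset> ...
# Eight equally sized arrays are assumed; sizes here are made up.
SECTORS=$((8 * 104857600))   # 8 arrays of 50 GiB each, in 512-byte sectors
echo "0 $SECTORS striped 8 512 \
  /dev/sdb 0 /dev/sdc 0 /dev/sdd 0 /dev/sde 0 \
  /dev/sdf 0 /dev/sdg 0 /dev/sdh 0 /dev/sdi 0" | dmsetup create testvol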
On Wed, 2008-10-22 at 10:00 -0500, Steven Pratt wrote:
> Steven Pratt wrote:
> > All runs have complete analysis data with them (iostat, mpstat,
> > oprofile, sar), as well as the FFSB profiles that can be used to
> > recreate any test we ran. We also collected blktrace data, but have
> > not uploaded it due to its size.

I'll try to reproduce things here, but I might end up asking for some of
the blktrace data.

> RAID data is now uploaded. The config used is 136 15k-rpm fibre
> channel disks in 8 arrays, all striped together with DM. These
> results are not as favorable to BTRFS, as there seem to be some major
> issues with the random write and mail server workloads.
>
> http://btrfs.boxacle.net/repository/raid/Initial-compare/Initial-Compare-RAID0.html

I need to look harder at the mail server workload; my initial guess is
that I'm doing too much metadata readahead in these effectively random
operations.

If I'm reading the config correctly, the random write workload does
this:

1) create a file sequentially
2) do buffered random writes to the file

Since buffered writeback happens via pdflush, the IO isn't actually as
random as you would expect. Pages are written back in file offset
order, which actually corresponds to disk order. When btrfs is doing
COW, file offset order maps to random order on disk, leading to much
lower throughput. The nocow results should be better than they are, and
I'll see what I can do about the cow results too.

-chris
Chris Mason wrote:
> I'll try to reproduce things here, but I might end up asking for some
> of the blktrace data.

Sure, not a problem for select workloads, but it was just too much data
to upload for every run. Just let me know which ones you need.

> If I'm reading the config correctly, the random write workload does
> this:
>
> 1) create a file sequentially
> 2) do buffered random writes to the file

Correct, although there are multiple files (created serially) and
multiple threads writing to different files at the same time. We also
only write 5% of a file before moving on to a new file. So while there
can be some ordering, the merging should be minimal. In fact we see
that from iostat (this is from the 16-thread ext3 run):

Device:  rrqm/s  wrqm/s    r/s      w/s  rsec/s    wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sdf        0.00  177.05  14.17  3665.07   56.69  15374.05     8.39    67.88  18.48   0.24  86.53

That is 177 merges out of 3665 IOs, with an average request size of
about 4.2KB.

> Since buffered writeback happens via pdflush, the IO isn't actually
> as random as you would expect. Pages are written back in file offset
> order, which actually corresponds to disk order.

Right, there will be a fair amount of locality to the random writes.

> When btrfs is doing COW, file offset order maps to random order on
> disk, leading to much lower throughput. The nocow results should be
> better than they are, and I'll see what I can do about the cow
> results too.

Not sure I understand this point; doesn't the COW code allocate new
space sequentially?

Steve
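For the curious, the random-write job described above would look roughly
like the FFSB profile sketched below. This is written from memory of
FFSB's profile syntax as an illustration, not one of the actual
boxacle.net profiles; the directive names, file counts, and sizes are
all assumptions.

# Sketch of an FFSB random-write profile (illustrative only).
# 16 threads each pick a file, write ~5% of it in random 4KB
# blocks, then move on to another file.
time=600

[filesystem0]
	location=/mnt/test
	num_files=128
	min_filesize=1073741824    # 1GB files, created serially at setup
	max_filesize=1073741824
[end0]

[threadgroup0]
	num_threads=16
	write_weight=1
	write_random=1
	write_size=53687091        # ~5% of a 1GB file per chosen file
	write_blocksize=4096
[end0]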
On Wed, 2008-10-22 at 10:45 -0500, Steven Pratt wrote:
> Sure, not a problem for select workloads, but it was just too much
> data to upload for every run. Just let me know which ones you need.

Hopefully I'll get similar results to yours; I'll give it a shot later
this week.

> Correct, although there are multiple files (created serially) and
> multiple threads writing to different files at the same time. We also
> only write 5% of a file before moving on to a new file. So while
> there can be some ordering, the merging should be minimal.
> [...]
> Not sure I understand this point; doesn't the COW code allocate new
> space sequentially?

Yes, COW allocates new space sequentially and via delayed allocation,
and based on the config the extents should be about 5MB in size. But
based on the numbers, we're getting something much more random. pdflush
is really tricky here, and when it does the wrong thing the COW mode
will show it most.

I'd be curious to see the difference in performance between this run
and this run with 5MB O_SYNC (or O_DIRECT) writes. Btrfs can do both;
the O_DIRECT write just does the normal page cache write plus an
invalidate.

-chris
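A quick way to approximate that comparison without touching FFSB is dd
with large direct writes. A sketch, with the file path and sizes as
placeholders; swap oflag=sync for oflag=direct to get O_SYNC behavior
instead of O_DIRECT:

#!/bin/bash
# 1) Create a 1GB file sequentially, then 2) rewrite random 5MB-aligned
# chunks of it with O_DIRECT. File path and sizes are placeholders.
FILE=/mnt/btrfs/testfile
dd if=/dev/zero of="$FILE" bs=5M count=200 oflag=direct 2>/dev/null
for i in $(seq 1 100); do
    OFF=$((RANDOM % 200))               # pick one of the 200 5MB slots
    dd if=/dev/zero of="$FILE" bs=5M count=1 seek="$OFF" \
       oflag=direct conv=notrunc 2>/dev/null
done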
Replying to Steven Pratt:
> RAID data is now uploaded. The config used is 136 15k-rpm fibre
> channel disks in 8 arrays, all striped together with DM. These
> results are not as favorable to BTRFS, as there seem to be some major
> issues with the random write and mail server workloads.

Why not use btrfs' own RAID capabilities instead? Honestly, I will
never ever use md again as soon as I get btrfs working :)

> http://btrfs.boxacle.net/repository/raid/Initial-compare/Initial-Compare-RAID0.html

-- 
Paul P 'Stingray' Komkoff Jr // http://stingr.net/key <- my pgp key
This message represents the official view of the voices in my head
Paul P Komkoff Jr wrote:
> Why not use btrfs' own RAID capabilities instead? Honestly, I will
> never ever use md again as soon as I get btrfs working :)

It's on the list of things to try. The main reason was that we wanted
to be able to compare to other file systems, and they are lacking that
feature.

Steve
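When that comparison happens, the multi-device setup would presumably
look something like the following, assuming the current mkfs.btrfs
options (-d and -m select the data and metadata profiles; device names
are placeholders):

# Create a btrfs filesystem striped (raid0) across the eight arrays,
# for both data and metadata, then mount via any member device.
mkfs.btrfs -d raid0 -m raid0 /dev/sdb /dev/sdc /dev/sdd /dev/sde \
                             /dev/sdf /dev/sdg /dev/sdh /dev/sdi
mount -t btrfs /dev/sdb /mnt/btrfs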