Jacob Ritorto
2009-Jan-06 16:44 UTC
[zfs-discuss] Observation of Device Layout vs Performance
My OpenSolaris 2008.11 PC seems to attain better throughput with one big sixteen-device RAIDZ2 than with four stripes of 4-device RAIDZ. I know it's by no means an exhaustive test, but catting /dev/zero to a file in the pool now frequently exceeds 600 megabytes per second, whereas before, with the striped RAIDZ, I was only occasionally peaking around 400 MB/s. The kit is a SuperMicro Intel 64-bit box, 2 sockets by 4 threads at 3 GHz, with two AOC MV8 boards and an 800 MHz (iirc) FSB connecting 16 GB of RAM that runs at the same speed as the FSB. Cheap 7200 RPM Seagate SATA half-TB disks with 32 MB cache.

Is this increase explicable / expected? The throughput calculator sheet output I saw seemed to forecast better IOPS with the striped RAIDZ vdevs, and I'd read that, generally, throughput is helped by keeping the number of devices per vdev in the single digits. Is my superlative result perhaps related to the large CPU and memory bandwidth? Just throwing this out for the sake of discussion / sanity check..

thx
jake

--
This message posted from opensolaris.org
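For reference, the quick-and-dirty write test described above can be reproduced with something like the following sketch. /tank is a placeholder for the actual pool mountpoint, and dd is used instead of cat because it reports a throughput figure directly (Solaris dd wants bs=1024k rather than bs=1M):

```shell
# Stream 8 GB of zeros into the pool and let dd report the rate.
# /tank is a placeholder for the real pool mountpoint.
dd if=/dev/zero of=/tank/zerotest bs=1024k count=8192

# Flush outstanding writes so the figure is not purely ARC caching.
sync
rm /tank/zerotest
```

As the rest of the thread points out, a stream of zeros is a best-case (and easily optimized-away) workload, so treat the number as an upper bound rather than a benchmark.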
Keith Bierman
2009-Jan-06 16:51 UTC
[zfs-discuss] Observation of Device Layout vs Performance
On Jan 6, 2009, at 9:44 AM, Jacob Ritorto wrote:

> but catting /dev/zero to a file in the pool now frequently exceeds
> 600 Megabytes per second

Do you get the same sort of results from /dev/random? I wouldn't be surprised if /dev/zero turns out to be a special case. Indeed, using any of the special files is probably not ideal.

-- 
Keith H. Bierman   khbkhb at gmail.com | AIM kbiermank
5430 Nassau Circle East | Cherry Hills Village, CO 80113 | 303-997-2749
<speaking for myself*> Copyright 2008
A Darren Dunham
2009-Jan-06 17:38 UTC
[zfs-discuss] Observation of Device Layout vs Performance
On Tue, Jan 06, 2009 at 08:44:01AM -0800, Jacob Ritorto wrote:

> Is this increase explicable / expected? The throughput calculator
> sheet output I saw seemed to forecast better iops with the striped
> raidz vdevs and I'd read that, generally, throughput is augmented by
> keeping the number of vdevs in the single digits. Is my superlative
> result perhaps related to the large cpu and memory bandwidth?

I'd think that for pure sequential loads, larger column setups wouldn't have too many performance issues. But as soon as you try to do random reads on the large setup you're going to be much more limited. Do you have tools to do random I/O exercises?

-- 
Darren
Bob Friesenhahn
2009-Jan-06 17:39 UTC
[zfs-discuss] Observation of Device Layout vs Performance
On Tue, 6 Jan 2009, Jacob Ritorto wrote:

> My OpenSolaris 2008/11 PC seems to attain better throughput with one
> big sixteen-device RAIDZ2 than with four stripes of 4-device RAIDZ.
> I know it's by no means an exhaustive test, but catting /dev/zero to
> a file in the pool now frequently exceeds 600 Megabytes per second,
> whereas before with the striped RAIDZ I was only occasionally
> peaking around 400MB/s. The kit is SuperMicro Intel 64 bit,
>
> Is this increase explicable / expected? The throughput calculator

This is not surprising. However, your test is only testing the write performance using a single process. With multiple writers and readers, the throughput will be better when using the configuration with more vdevs. It is not recommended to use such a large RAIDZ2 due to the multi-user performance concern, and because a single slow/failing disk drive can destroy the performance until it is identified and fixed. Maybe a balky (but still functioning) drive won't be replaced under warranty and so you have to pay for a replacement out of your own pocket.

Bob
=====================================
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
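For comparison, the two layouts under discussion would be created roughly as follows. This is a sketch only: the device names (c1t0d0 etc.) are placeholders for the sixteen disks on the two controllers.

```shell
# Layout 1: one wide vdev -- a single 16-disk RAIDZ2.
# Good sequential throughput, but random IOPS of roughly one disk.
zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 \
                         c1t4d0 c1t5d0 c1t6d0 c1t7d0 \
                         c2t0d0 c2t1d0 c2t2d0 c2t3d0 \
                         c2t4d0 c2t5d0 c2t6d0 c2t7d0

# Layout 2: four vdevs -- ZFS stripes writes across four 4-disk
# RAIDZ groups, giving roughly 4x the random IOPS of layout 1.
zpool create tank raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 \
                  raidz c1t4d0 c1t5d0 c1t6d0 c1t7d0 \
                  raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0 \
                  raidz c2t4d0 c2t5d0 c2t6d0 c2t7d0
```

Note the trade-off Bob describes: layout 1 gives more usable space and double parity, while layout 2 multiplies the number of independent vdevs that concurrent readers and writers can be spread across.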
Bob Friesenhahn
2009-Jan-06 18:12 UTC
[zfs-discuss] Observation of Device Layout vs Performance
On Tue, 6 Jan 2009, Keith Bierman wrote:

> Do you get the same sort of results from /dev/random?

/dev/random is very slow and should not be used for benchmarking.

Bob
=====================================
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Jacob Ritorto
2009-Jan-06 18:14 UTC
[zfs-discuss] Observation of Device Layout vs Performance
Is urandom nonblocking?

On Tue, Jan 6, 2009 at 1:12 PM, Bob Friesenhahn
<bfriesen at simple.dallas.tx.us> wrote:
> On Tue, 6 Jan 2009, Keith Bierman wrote:
>
>> Do you get the same sort of results from /dev/random?
>
> /dev/random is very slow and should not be used for benchmarking.
>
> Bob
Keith Bierman
2009-Jan-06 18:14 UTC
[zfs-discuss] Observation of Device Layout vs Performance
On Jan 6, 2009, at 11:12 AM, Bob Friesenhahn wrote:

> On Tue, 6 Jan 2009, Keith Bierman wrote:
>
>> Do you get the same sort of results from /dev/random?
>
> /dev/random is very slow and should not be used for benchmarking.

Not directly, no. But copying from /dev/random to a real file and using that should provide better insight than all zeros or all ones (I have seen "clever" devices optimize things away). Tests like bonnie are probably a better bet than rolling one's own; although the latter is good for building intuition ;>

-- 
Keith H. Bierman   khbkhb at gmail.com | AIM kbiermank
5430 Nassau Circle East | Cherry Hills Village, CO 80113 | 303-997-2749
<speaking for myself*> Copyright 2008
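A minimal sketch of that approach: pre-generate an incompressible file once, outside the pool under test, then time copying it in. /tank and the 2 GB size are placeholders, and /dev/urandom is substituted for /dev/random because, as noted elsewhere in the thread, the blocking random device is far too slow for generating a large file.

```shell
# One-time setup: 2 GB of incompressible data, kept off the test pool.
dd if=/dev/urandom of=/var/tmp/random.dat bs=1024k count=2048

# Timed run: random data defeats zero-detection or compression
# anywhere along the I/O path, unlike a stream from /dev/zero.
time cp /var/tmp/random.dat /tank/random.dat
```

For results that reflect the disks rather than the ARC, the source file (or the total copied) should exceed RAM, which matches the sizing advice given later in the thread.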
Bob Friesenhahn
2009-Jan-06 18:19 UTC
[zfs-discuss] Observation of Device Layout vs Performance
On Tue, 6 Jan 2009, Jacob Ritorto wrote:

> Is urandom nonblocking?

The OS-provided random devices need to be secure and so they depend on collecting "entropy" from the system so the random values are truly random. They also execute complex code to produce the random numbers. As a result, both of the random device interfaces are much slower than a disk drive.

Bob
=====================================
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Jacob Ritorto
2009-Jan-06 18:21 UTC
[zfs-discuss] Observation of Device Layout vs Performance
OK, so use a real I/O test program, or at least pre-generate files large enough to exceed RAM caching?

On Tue, Jan 6, 2009 at 1:19 PM, Bob Friesenhahn
<bfriesen at simple.dallas.tx.us> wrote:
> On Tue, 6 Jan 2009, Jacob Ritorto wrote:
>
>> Is urandom nonblocking?
>
> The OS provided random devices need to be secure and so they depend on
> collecting "entropy" from the system so the random values are truely
> random. They also execute complex code to produce the random numbers.
> As a result, both of the random device interfaces are much slower than
> a disk drive.
>
> Bob
Jacob Ritorto
2009-Jan-06 18:55 UTC
[zfs-discuss] Observation of Device Layout vs Performance
I have that iozone program loaded, but its results were rather cryptic for me. Is it adequate if I learn how to decipher the results? Can it thread out and use all of my CPUs?

> Do you have tools to do random I/O exercises?
>
> --
> Darren
Bob Friesenhahn
2009-Jan-06 19:19 UTC
[zfs-discuss] Observation of Device Layout vs Performance
On Tue, 6 Jan 2009, Jacob Ritorto wrote:

> I have that iozone program loaded, but its results were rather cryptic
> for me. Is it adequate if I learn how to decipher the results? Can
> it thread out and use all of my CPUs?

Yes, iozone does support threading. Here is a test with a record size of 8KB, eight threads, synchronous writes, and a 2GB test file:

        Multi_buffer. Work area 16777216 bytes
        OPS Mode. Output is in operations per second.
        Record Size 8 KB
        SYNC Mode.
        File size set to 2097152 KB
        Command line used: iozone -m -t 8 -T -O -r 8k -o -s 2G
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
        Throughput test with 8 threads
        Each thread writes a 2097152 Kbyte file in 8 Kbyte records

When testing with iozone, you will want to make sure that the test file is larger than available RAM, such as 2X the size.

Bob
=====================================
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
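Applied to the 16 GB machine in this thread, that sizing guideline would look something like the sketch below. Note that -s is per thread, so the per-thread sizes are additive across the eight threads:

```shell
# Same flags as Bob's example, resized for a 16 GB host:
#   -m  multi-buffer          -t 8  eight threads
#   -T  use POSIX threads     -O    report operations/second
#   -r 8k  8 KB records       -o    synchronous (O_SYNC) writes
#   -s 4g  4 GB per thread => 32 GB aggregate, 2x the 16 GB of RAM
iozone -m -t 8 -T -O -r 8k -o -s 4g
```

With the aggregate working set at twice RAM, the ops/second figures should reflect the pool rather than the ARC, which is where the one-wide-vdev and four-vdev layouts would be expected to diverge under random I/O.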
Jacob Ritorto
2009-Jan-06 20:55 UTC
[zfs-discuss] Observation of Device Layout vs Performance
> Yes, iozone does support threading. Here is a test with a record size of
> 8KB, eight threads, synchronous writes, and a 2GB test file:
>
>         Multi_buffer. Work area 16777216 bytes
>         OPS Mode. Output is in operations per second.
>         Record Size 8 KB
>         SYNC Mode.
>         File size set to 2097152 KB
>         Command line used: iozone -m -t 8 -T -O -r 8k -o -s 2G
>         Time Resolution = 0.000001 seconds.
>         Processor cache size set to 1024 Kbytes.
>         Processor cache line size set to 32 bytes.
>         File stride size set to 17 * record size.
>         Throughput test with 8 threads
>         Each thread writes a 2097152 Kbyte file in 8 Kbyte records
>
> When testing with iozone, you will want to make sure that the test file is
> larger than available RAM, such as 2X the size.
>
> Bob

OK, I ran it as suggested (using a 17GB file pre-generated from urandom) and I'm getting what appear to be sane iozone results now. Do we have a place to compare performance notes?

thx
jake
Toby Thain
2009-Jan-07 15:56 UTC
[zfs-discuss] Observation of Device Layout vs Performance
On 6-Jan-09, at 1:19 PM, Bob Friesenhahn wrote:

> On Tue, 6 Jan 2009, Jacob Ritorto wrote:
>
>> Is urandom nonblocking?
>
> The OS provided random devices need to be secure and so they depend on
> collecting "entropy" from the system so the random values are truely
> random. They also execute complex code to produce the random numbers.
> As a result, both of the random device interfaces are much slower than
> a disk drive.

That is true, of course, but only one of them (/dev/random) blocks without entropy (Jacob's question).

--Toby
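The distinction is easy to demonstrate with dd (a small sketch; on Solaris of this era, as on contemporary Linux, /dev/urandom returns immediately while /dev/random may stall when the entropy pool runs dry):

```shell
# /dev/urandom never blocks: this completes right away,
# though still far more slowly than reading a disk.
dd if=/dev/urandom of=/dev/null bs=1024k count=16

# /dev/random may block indefinitely waiting for entropy, which is
# why it is unsuitable for generating large test files:
# dd if=/dev/random of=/dev/null bs=1024k count=16
```

This is why the 17 GB test file earlier in the thread was pre-generated from urandom rather than from the blocking device.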