Hi,

I currently have access to a bunch of disks and a fat machine, so I took the chance and ran some vinum benchmarks. You can find the results at <http://mailbox.univie.ac.at/~le/bonnie.html>.

I created several RAID 0 and RAID 5 volumes with different stripe sizes and let bonnie++ run over the filesystems. I was quite disappointed by the RAID 5 performance, and even the RAID 0 performance wasn't too good (a plain single-disk filesystem was almost as fast as, or even faster than, a RAID 0 stripe, which I wouldn't expect).

RAID 5 performance was really a mess; some of the tests took more than 30 minutes to complete.

Does anyone have an idea what's going wrong here? (Apart from me doing bullsh*t benchmarking :-) .)

Please keep me cc'ed.

best regards,
le

--
Lukas Ertl                            eMail: l.ertl@univie.ac.at
UNIX-Systemadministrator              Tel.:  (+43 1) 4277-14073
Zentraler Informatikdienst (ZID)      Fax.:  (+43 1) 4277-9140
der Universität Wien                  http://mailbox.univie.ac.at/~le/
Lukas Ertl <l.ertl@univie.ac.at> writes:

> Does anyone have an idea what's going wrong here? (Apart from me doing
> bullsh*t benchmarking :-) .)

Just out of curiosity, try again with prime stripe sizes (31, 61, 127, 257, 509) or at least odd ones (31, 63, 127, 255, 511).

DES
--
Dag-Erling Smørgrav - des@ofug.org
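Presumably the motivation behind odd or prime stripe sizes is the usual alignment argument: filesystem structures that recur at regular, power-of-two-ish intervals (superblocks, cylinder-group metadata) can all land on the same drive when the stripe size is itself a power of two, turning one spindle into a hotspot. The following is only a made-up illustration of that effect, not anything from the benchmarks here; the 1 MB metadata interval, 4 disks, and 256/257 kB stripe sizes are assumed for the sake of the example.

/*
 * Sketch: how accesses that recur at a power-of-two interval spread
 * across disks for a power-of-two vs. an odd stripe size.  All numbers
 * are hypothetical; the point is the distribution, not the values.
 */
#include <stdio.h>

static void distribute(long stripe_kb, int ndisks)
{
    long hits[16] = { 0 };
    long interval_kb = 1024;            /* "metadata" every 1 MB (assumed) */

    for (long i = 0; i < 10000; i++) {
        long offset_kb = i * interval_kb;
        int disk = (int)((offset_kb / stripe_kb) % ndisks);
        hits[disk]++;
    }
    printf("stripe %4ld kB:", stripe_kb);
    for (int d = 0; d < ndisks; d++)
        printf(" disk%d=%ld", d, hits[d]);
    printf("\n");
}

int main(void)
{
    distribute(256, 4);   /* power of two: every access maps to disk 0  */
    distribute(257, 4);   /* odd size: accesses spread over all 4 disks */
    return 0;
}

With the power-of-two stripe, every 1 MB-aligned access hits the same disk; with the odd stripe the same access pattern spreads roughly evenly.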
Lukas Ertl wrote:
[ ... ]
> I created several RAID 0 and RAID 5 volumes with different stripe sizes
> and let bonnie++ run over the filesystems. I was quite disappointed by
> the RAID 5 performance, and even the RAID 0 performance wasn't too good
> (a plain single-disk filesystem was almost as fast as, or even faster
> than, a RAID 0 stripe, which I wouldn't expect).
>
> RAID 5 performance was really a mess; some of the tests took more than
> 30 minutes to complete.

There are three goals to trade off when configuring RAID: performance, reliability, and cost. What are yours? The tasks you intend to use the RAID filesystem for are also critical to consider, even if the answer is simply "undifferentiated general-purpose storage".

In particular, RAID-5 write performance is going to be slow, even with RAID hardware support that offloads the parity calculations from the system CPU(s). RAID-5 is best suited for read-mostly or read-only volumes, where you value cost more than performance.

Um, that is a dual-channel card, and you're splitting the drives across both channels, right?

Anyway, if I had your hardware and no specs as to what to do with it, I'd probably configure 2 disks as a RAID-1 mirror for an OS boot volume, configure 4 disks as RAID-10, and use the 7th disk as a staging area, hot spare, etc.

--
-Chuck
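To make the write penalty concrete: for a write smaller than a full stripe, a RAID-5 array has to (1) read the old data block, (2) read the old parity block, (3) write the new data, and (4) write the new parity, where the new parity is the old parity XORed with both the old and the new data. The following is a minimal sketch of that textbook arithmetic only; it is not Vinum's actual code, and the byte values are arbitrary.

/*
 * Textbook RAID-5 "small write" parity update (read-modify-write).
 * new_parity = old_parity ^ old_data ^ new_data
 * One logical write costs four disk I/Os.
 */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint8_t old_data   = 0x3c;  /* I/O 1: read the old data block   */
    uint8_t old_parity = 0xa5;  /* I/O 2: read the old parity block */
    uint8_t new_data   = 0x7e;  /* the block the application wrote  */

    uint8_t new_parity = (uint8_t)(old_parity ^ old_data ^ new_data);

    /* I/O 3: write new_data; I/O 4: write new_parity */
    printf("old parity 0x%02x -> new parity 0x%02x\n", old_parity, new_parity);
    return 0;
}

A full-stripe write can compute the parity from the new data alone and skip the two reads, which is why small, sub-stripe writes are RAID-5's worst case.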
On Sunday, 30 March 2003 at 13:00:56 +0200, Lukas Ertl wrote:
> Hi,
>
> I currently have access to a bunch of disks and a fat machine, so I took
> the chance and ran some vinum benchmarks. You can find the results at
> <http://mailbox.univie.ac.at/~le/bonnie.html>.
>
> I created several RAID 0 and RAID 5 volumes with different stripe sizes
> and let bonnie++ run over the filesystems.

As I keep saying, bonnie measures systems, not storage. Hint: ignore any test that uses over 10% CPU time. That leaves you with random seeks.

> I was quite disappointed by the RAID 5 performance, and even the
> RAID 0 performance wasn't too good (a plain single-disk filesystem
> was almost as fast as, or even faster than, a RAID 0 stripe, which I
> wouldn't expect).

I would. You're measuring the system, not the subsystem.

> RAID 5 performance was really a mess; some of the tests took more
> than 30 minutes to complete.

It's interesting to note that some of the "CPU intensive" tests did very little with RAID-5. It would be interesting to find out why, though the problem could be with the benchmark. To find out what's going on, you'd need to:

1. Understand what bonnie++ is doing in these tests. I've measured with bonnie, and in the process discovered that only the random seeks test comes close to measuring the disk; the sequential I/O looked nothing like it.

2. Use Vinum's built-in monitoring capabilities to see what's really getting as far as Vinum. Look at the info command.

Apart from that, it would be interesting to see what rawio shows. That, at least, goes to the disk.

Greg
--
See complete headers for address and phone numbers
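The reason a rawio-style number says more about the storage is that it talks to the disk device directly instead of going through the filesystem. The sketch below shows only that basic idea, timed large sequential reads from a device node; the device path is a placeholder (point it at a disposable partition), and rawio itself does far more (random I/O, writes, multiple processes).

/*
 * Minimal sketch of a raw sequential-read throughput check.
 * Not rawio -- just the idea: big reads straight from the device, timed.
 * The default device path is a placeholder, not from this thread.
 */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>

#define CHUNK (1024 * 1024)             /* 1 MB per read            */
#define TOTAL (256LL * 1024 * 1024)     /* read 256 MB, then report */

int main(int argc, char **argv)
{
    const char *dev = (argc > 1) ? argv[1] : "/dev/da0s1e";  /* placeholder */
    static char buf[CHUNK];
    struct timeval t0, t1;
    long long done = 0;

    int fd = open(dev, O_RDONLY);
    if (fd < 0) {
        perror(dev);
        return 1;
    }

    gettimeofday(&t0, NULL);
    while (done < TOTAL) {
        ssize_t n = read(fd, buf, CHUNK);
        if (n <= 0)                      /* EOF or error: stop timing */
            break;
        done += n;
    }
    gettimeofday(&t1, NULL);
    close(fd);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
    printf("%lld bytes in %.2f s = %.1f MB/s\n", done, secs, done / secs / 1e6);
    return 0;
}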
Craig Boston wrote:
> On Mon, 2003-03-31 at 15:41, Jason Andresen wrote:
>
>> (Both of these were on previously untouched files to prevent any
>> caching, and the "write" test is on a new file, not rewriting an old one.)
>>
>> Write speed:
>> 81920000 bytes transferred in 3.761307 secs (21779663 bytes/sec)
>> Read speed:
>> 81920000 bytes transferred in 3.488978 secs (23479655 bytes/sec)
>>
>> But on the RAID5:
>> Write speed:
>> 81920000 bytes transferred in 17.651300 secs (4641018 bytes/sec)
>> Read speed:
>> 81920000 bytes transferred in 4.304083 secs (19033090 bytes/sec)
>
> Disclaimer: IANAVE (I am not a Vinum Expert)
>
> What block size are you using with dd? If your bs= is less than your
> RAID stripe size, it seems like it would end up doing unnecessary
> read-modify-write cycles...

Commands used:

dd if=/dev/zero of=test bs=8192 count=10000

and

dd if=random_unused_file_larger_than_80MB of=/dev/null bs=8192 \
   count=10000

My stripe size is 384k.

--
 \  |_ _|__ __|_ \  __|   Jason Andresen        jandrese@mitre.org
|\/ |  |     |   /  _|    Network and Distributed Systems Engineer
 _| _|___|  _| _|_\___|   Office: 703-883-7755
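For scale: with a 384 kB stripe unit and 8 kB dd blocks, each write covers only about 2% of one stripe unit, so every write that isn't coalesced with its neighbours is a candidate for the four-I/O parity update sketched earlier. The arithmetic below just makes that explicit; the stripe and block sizes are from the mail, and whether the filesystem actually clusters adjacent writes before they reach the volume is not something this shows.

/*
 * How an 8 kB write relates to a 384 kB RAID-5 stripe unit.
 * Numbers taken from the dd command and stripe size quoted above.
 */
#include <stdio.h>

#define STRIPE_BYTES (384L * 1024)   /* stripe unit from the mail */
#define IO_BYTES     (8L * 1024)     /* dd bs=8192                */

int main(void)
{
    long ios_per_stripe = STRIPE_BYTES / IO_BYTES;
    double coverage = 100.0 * IO_BYTES / STRIPE_BYTES;

    printf("each %ld kB write covers %.1f%% of one %ld kB stripe unit\n",
           IO_BYTES / 1024, coverage, STRIPE_BYTES / 1024);
    printf("%ld consecutive writes land on the same disk before the stripe\n"
           "unit fills up and I/O moves on to the next disk\n",
           ios_per_stripe);
    return 0;
}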
On Mon, 31 Mar 2003, Greg 'groggy' Lehey wrote:

> Apart from that, it would be interesting to see what rawio shows.
> That, at least, goes to the disk.

I've done some rawio benchmarks with different process counts (1, 2, 4, 8, 16, 32 and 64). The results are at <http://mailbox.univie.ac.at/~le/rawio.html>.

regards,
le

--
Lukas Ertl                            eMail: l.ertl@univie.ac.at
UNIX-Systemadministrator              Tel.:  (+43 1) 4277-14073
Zentraler Informatikdienst (ZID)      Fax.:  (+43 1) 4277-9140
der Universität Wien                  http://mailbox.univie.ac.at/~le/
This whole thing takes me back to my old SGI days. We had an array on one machine that was meant to stream uncompressed HDTV data (this runs at about 1 Gbit/s in plain RGB; the SGI video adapters wanted padding to 32 bits/pixel, so it turns out to be around 1.2-1.4 Gbit/s).

RAID 5 was not a consideration: with the controllers in question it was faster to just telecine the film again than to do a parity recovery (film is a *wonderful* storage medium!!). Plus the write-speed demands are pretty strict too, even though the telecine was on a single HIPPI channel and so a bit slower than the playout speed. At least it was a step (drum) telecine, so it didn't care about missing the frame rate.

The array was 40 drives on 4 fibre-channel controllers. The stripe parameters were chosen to match the size of a video frame (about 150-160 MB for color) to the size of one stripe across the whole array; there was a little padding needed to make this come out even, with stripes being multiples of 512 bytes. (And, to get around some of Greg's other hints, you get some seek independence and lots of other overhead help (OS DMA setup) by making the cross-controller index vary fastest and the in-controller index slowest.)

This stripe scheme is *very* particular to one kind of performance optimization (BIG specific-I/O-size streaming); it would be terrible for usenet, for example. You could take it as one extreme, with transaction-database storage probably the other (where reliability is often judged more important than raw speed, and transactions generally fit in one I/O request; also, the read part of the transaction can be cached easily, so the write only involves steps 3 and 4 of the RAID-5 steps mentioned before). Remember the 3-way tradeoff mentioned earlier in this thread...

And at least 2 years ago, none of the major RAID cabinet folks made (stock) arrays that optimized this kind of streaming performance; they all aimed at database customers.

This was on a cray-link Challenge machine with a 2-3 Gbit/s backplane and memory, btw; the drive array was set up as JBOD with XFS software RAID. Lucky I didn't have to pay for it :-) (And you had to turn off XFS journaling and other such things that could get you without knowing quite why...) Fortunately the SGI graphics folks furnished scripts that normally got this right. We often needed to restripe the array for each transfer, and always newfs to get the sequential write properties right.

--
Pete
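A sketch of the sizing arithmetic Pete describes: divide one frame evenly across all drives and round each drive's stripe unit up to a whole number of 512-byte sectors, so one full-array stripe holds exactly one (slightly padded) frame. The 40-drive count and a 156 MB frame follow his description loosely; the exact rounding rule is an assumption, not something stated in the mail.

/*
 * Full-array stripe sizing for fixed-size streaming I/O.
 * One stripe across all drives == one frame, padded up to sector multiples.
 * Frame size and drive count are taken loosely from the mail above;
 * the round-up rule is an assumption for illustration.
 */
#include <stdio.h>

int main(void)
{
    long frame_bytes = 156L * 1024 * 1024;  /* ~150-160 MB per frame */
    int  ndrives     = 40;                  /* 4 FC controllers      */
    long sector      = 512;

    /* per-drive share, rounded up to a whole number of sectors */
    long per_drive = (frame_bytes + ndrives - 1) / ndrives;
    per_drive = ((per_drive + sector - 1) / sector) * sector;

    long full_stripe = per_drive * ndrives;
    long padding     = full_stripe - frame_bytes;

    printf("per-drive stripe unit: %ld bytes (%ld sectors)\n",
           per_drive, per_drive / sector);
    printf("full-array stripe:     %ld bytes, %ld bytes padding per frame\n",
           full_stripe, padding);
    return 0;
}

With that layout, every frame read or write keeps all 40 spindles (and all 4 controllers) busy at once, which is exactly the streaming behaviour he was after and exactly what a transaction-sized workload would not want.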