Hello,

BACKGROUND

I need to purchase a new system for our developers, for use as a
Postgres database test server.  Having a RAID, probably RAID1, is
desirable for performance and reliability.  I have recently set up a
system with a gmirror-based software RAID1 on a pair of 250GB ATA
drives.  I would like to stick with gmirror, because:

- We save money on extra hardware.
- Since gmirror is part of FreeBSD, maintenance is a lot easier than
  with a hardware solution.

SUMMARY

But before I go balls out, I should see how well it compares to hardware
RAID.  So I ran some benchmarks with bonnie++.  Since this simulates
create, write and read on thousands of random files, this sounds like a
good approximation of what Postgres does. :)

I don't have the time and hardware to do very scientific tests, but I
have been able to run a series of benchmarks using bonnie++ on some
systems I have available to me.  The ATA-based gmirror performs
extremely well, compared to a few Adaptec RAIDs that we have, EXCEPT
that the sequential and random reads are MUCH SLOWER than the hardware
solution, and even *slower than the preceding write operations*.  This
is counter-intuitive, especially since RAID1 implies slowed writes and
faster reads.  I tried the benchmark on my workstation (single 2.5" IDE
in a laptop) and got comparable write-faster-than-read results.

DATA

I was able to make use of the following test systems.
I ran tests in multi-user, but tried to favor times when there wasn't
much background activity:

mito: (lone 2.5" ATA)
    5.4-PRERELEASE
    CPU: Intel(R) Pentium(R) M processor 1.50GHz (1495.16-MHz 686-class CPU)
    atapci0: <Intel ICH4 UDMA100 controller>
    ata0: channel #0 on atapci0
    ata1: channel #1 on atapci0
    ad0: 28615MB <TOSHIBA MK3021GAS/GA129D> [58140/16/63] at ata0-master UDMA100

amun: (gmirror RAID1 2 x WD250GB High Intensity)
    FreeBSD 5.3-RELEASE
    CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2799.22-MHz 686-class CPU)
    atapci0: <SiI 3112 SATA150 controller>
    ata2: channel #0 on atapci0
    ata3: channel #1 on atapci0
    ad4: 238475MB <WDC WD2500SD-01KCB0/08.02D08> [484521/16/63] at ata2-master SATA150
    ad6: 238475MB <WDC WD2500SD-01KCB0/08.02D08> [484521/16/63] at ata3-master SATA150
    atapci1: <SiI 3112 SATA150 controller>
    acd0: CDROM <CD-224E/1.9A> at ata1-master UDMA33

janus: (Adaptec RAID1 2 x 72G 10,000 RPM SCSI)
    FreeBSD 4.10-STABLE
    CPU: Intel(R) Xeon(TM) CPU 2.40GHz (2399.33-MHz 686-class CPU) [DUAL]
    aac0: <Adaptec SCSI RAID 2120S> mem 0xf8000000-0xfbffffff irq 16 at device 1.0 on pci2
    aac0: i960RX 100MHz, 48MB cache memory, optional battery present
    aac0: Kernel 4.0-0, Build 6011, S/N baec64
    aac0: Supported Options=1f7e<CLUSTERS,WCACHE,DATA64,HOSTTIME,RAID50,WINDOW4GB,SOFTERR,NORECOND,SGMAP64,ALARM,NONDASD>
    aacd0: <RAID 1 (Mirror)> on aac0
    aacd0: 69998MB (143357184 sectors)

db2: (Adaptec RAID10 4 x 36G 15,000 RPM SCSI)
    FreeBSD 4.8-STABLE
    CPU: Intel(R) Xeon(TM) CPU 3.06GHz (3065.81-MHz 686-class CPU) [DUAL]
    aac0: <Adaptec SCSI RAID 2120S> mem 0xf8000000-0xfbffffff irq 18 at device 2.0 on pci5
    aac0: i960RX 100MHz, 48MB cache memory, optional battery present
    aac0: Kernel 4.0-0, Build 6008, S/N b97ce8
    aac0: Supported Options=1f7e<CLUSTERS,WCACHE,DATA64,HOSTTIME,RAID50,WINDOW4GB,SOFTERR,NORECOND,SGMAP64,ALARM,NONDASD>
    aacp0: <SCSI Passthrough Bus> on aac0

The raw data can be viewed at
http://dannyman.toldme.com/scratch/benchmarks/

ANALYSIS

Unfortunately, my hardware RAIDs are on FreeBSD
4, and gmirror is on 5.  My hardware RAIDs are on dual CPU systems, with
2G RAM, and my gmirror is on a single hyperthreaded CPU with 512M.  Yes,
sorry, not especially scientific.  Maybe the changes in FreeBSD make a
big difference?  Maybe RAM makes a big difference?

The first results show a serious advantage for the gmirror setup:

Sequential Output (char)
    gmirror ATA RAID1:      avg 320K/s
    Adaptec SCSI RAID1:     avg 222K/s
    Adaptec SCSI RAID10:    avg 202K/s

Sequential Input (char)
    gmirror ATA RAID1:      avg 617K/s
    Adaptec SCSI RAID1:     avg 345K/s
    Adaptec SCSI RAID10:    avg 336K/s

Sequential Output (block)
    gmirror ATA RAID1:      avg 37893K/s
    Adaptec SCSI RAID1:     avg 13829K/s
    Adaptec SCSI RAID10:    avg 40440K/s

The gmirror sees slightly poorer performance in random seeks:

Random Seeks
    gmirror ATA RAID1:      avg 4144/s
    Adaptec SCSI RAID1:     avg 5428/s
    Adaptec SCSI RAID10:    avg 13302/s

That all sounds great if I were streaming video, but I want to run a
database, opening and closing, reading, writing, and rewriting several
small files.  This is where things seem to go rotten.

We see the ATA performance go to heck on the File Create tests:

Sequential Create
    laptop 2.5" ATA:        avg 101/s
    gmirror ATA RAID1:      avg 365/s
    Adaptec SCSI RAID1:     avg 160/s
    Adaptec SCSI RAID10:    avg 412/s

Sequential Read
    laptop 2.5" ATA:        avg 76/s     # SLOWER than write!
    gmirror ATA RAID1:      avg 251/s    # SLOWER than write!
    Adaptec SCSI RAID1:     avg 7862/s
    Adaptec SCSI RAID10:    avg 7618/s

Random Create
    laptop 2.5" ATA:        avg 124/s
    gmirror ATA RAID1:      avg 354/s
    Adaptec SCSI RAID1:     avg 155/s
    Adaptec SCSI RAID10:    avg 504/s

Random Read
    laptop 2.5" ATA:        avg 57/s     # SLOWER than write!
    gmirror ATA RAID1:      avg 144/s    # SLOWER than write!
    Adaptec SCSI RAID1:     avg 7655/s
    Adaptec SCSI RAID10:    avg 7413/s

CONFUSION

Now, I could explain poor read performance by:

- Less RAM == Less buffer
- Bigger Disks == Slower Seeks
- Less CPU == ???

I DO have a 4.8-STABLE with a single IDE disk, no Soft Updates, and
faster read than write:

Version  1.93c      ------Sequential Create------ --------Random Create--------
anubis.xxxxxxxxxxxx -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
          files:max  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
      10:104884:0/5   183  32  1739  97   502  16   176  32  1624  94   368  13
Latency               707ms   11487us   32824us     488ms     207ms     117ms

However, seeing read SLOWER than write ... I have to wonder if something
fishy is going on.  Suggestions?  Ideas?  I'm fresh out, at the moment.
My suspicion is that something in 5.x is out-of-tune!?

Thanks a lot.

Sincerely,
-danny

-- 
http://dannyman.toldme.com/
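
As a quick sanity check on the file-operation averages in the post above, the read/create ratio per system can be computed directly. This is only a rough sketch: the numbers are the averages quoted in the post, while the dictionary layout and the function name are made up here for illustration.

```python
# Average files/sec from the bonnie++ file tests quoted in the post.
# The key names ("seq_create", ...) are illustrative, not bonnie++ output fields.
results = {
    'laptop 2.5" ATA':     {"seq_create": 101, "seq_read": 76,
                            "rand_create": 124, "rand_read": 57},
    "gmirror ATA RAID1":   {"seq_create": 365, "seq_read": 251,
                            "rand_create": 354, "rand_read": 144},
    "Adaptec SCSI RAID1":  {"seq_create": 160, "seq_read": 7862,
                            "rand_create": 155, "rand_read": 7655},
    "Adaptec SCSI RAID10": {"seq_create": 412, "seq_read": 7618,
                            "rand_create": 504, "rand_read": 7413},
}


def read_to_create_ratios(row):
    """Return (sequential, random) read/create ratios for one system."""
    return (row["seq_read"] / row["seq_create"],
            row["rand_read"] / row["rand_create"])


ratios = {name: read_to_create_ratios(row) for name, row in results.items()}

for name, (seq, rand) in ratios.items():
    if max(seq, rand) < 1:
        note = "reads slower than creates"
    elif min(seq, rand) > 1:
        note = "reads faster than creates"
    else:
        note = "mixed"
    print(f"{name:22s} seq {seq:6.2f}x  rand {rand:6.2f}x  ({note})")
```

A ratio below 1 marks the "SLOWER than write" rows. On the two Adaptec systems, reads come out well over ten times faster than creates.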

On April 8, 2005 04:41 pm, Danny Howard wrote:
> SUMMARY
>
> But before I go balls out, I should see how well it compares to hardware
> RAID.  So, I do some benchmarks with bonnie++.  Since this simulates
> create, write and read on thousands of random files, this sounds like a
> good approximation of what Postgres does. :)
>
> I don't have the time and hardware to do very scientific tests, but I
> have been able to run a series of benchmarks using bonnie++ on some
> systems I have available to me.  The ATA-based gmirror performs
> extremely well, compared to a few Adaptec RAIDs that we have, EXCEPT
> that the sequential and random reads are MUCH SLOWER than the hardware
> solution, and even *slower than the preceding write operations*.  This
> is counter-intuitive, especially since RAID1 implies slowed writes and
> faster reads.  I tried the benchmark on my workstation (single 2.5" IDE
> in a laptop) and got comparable write-faster-than-read results.

May I suggest that you turn softupdates off and sync on for the
filesystems you are testing. If you don't you are not really testing the
hardware. You are testing the FreeBSD disk caching system in the kernel.

-- 
Ean Kingston

E-Mail: ean AT hedron DOT org
URL: http://www.hedron.org/

On Fri, 8 Apr 2005, Danny Howard wrote:
> I don't have the time and hardware to do very scientific tests, but I
> have been able to run a series of benchmarks using bonnie++ on some
> systems I have available to me.  The ATA-based gmirror performs
> extremely well, compared to a few Adaptec RAIDs that we have, EXCEPT
> that the sequential and random reads are MUCH SLOWER than the hardware
> solution, and even *slower than the preceding write operations*.  This
> is counter-intuitive, especially since RAID1 implies slowed writes and
> faster reads.  I tried the benchmark on my workstation (single 2.5" IDE
> in a laptop) and got comparable write-faster-than-read results.

> The raw data can be viewed at
> http://dannyman.toldme.com/scratch/benchmarks/

Could you place the 'dmesg' output for each system in this directory?

The output here is marginally useful since it shows the bonnie command
line.  However, 100MB as the test filesize is really small unless the
systems have 64MB of RAM; otherwise you're testing how well FreeBSD
manages memory (or how much crap the systems are running when you run
this test).  For recent I/O tests I was doing with iozone I was using
10GB filesizes.  This blows out the cache on just about everything.

> Unfortunately, my hardware RAIDs are on FreeBSD 4, and gmirror is on 5.
> My hardware RAIDs are on dual CPU systems, with 2G RAM, and my gmirror
> is on a single hyperthreaded CPU with 512M.  Yes, sorry, not especially
> scientific.  Maybe the changes in FreeBSD make a big difference?  Maybe
> RAM makes a big difference?

Yes, lots.  Both 4.x vs. 5.x and RAM :)

> Sequential Read
> laptop 2.5" ATA: avg 76/s # SLOWER than write!
> gmirror ATA RAID1: avg 251/s # SLOWER than write!
> Adaptec SCSI RAID1: avg 7862/s
> Adaptec SCSI RAID10: avg 7618/s

I'd be really careful here...
that is # of files read per second after the create, and as pointed out
before, SoftUpdates usually gets you a big win until it has to flush the
directories out, then things suffer during the actual flush op since the
disk gets hammered.  Lots of free memory to use for the directory cache
helps.  The disk cache on the RAID controller is buying you even more.

From your results you were getting 9ms latency, which is spot-on, so I
think you are simply misinterpreting your results here.  File creation
tests are usually more for filesystem-specific benchmarking than for
throughput benchmarking.

I'd suggest something more like iozone for throughput testing.  If the
volumes have nothing on them you care about then rawio can also be
instructive.

-- 
Doug White                    |  FreeBSD: The Power to Serve
dwhite@gumbysoft.com          |  www.FreeBSD.org
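
The file-size point in the reply above can be made concrete with a little arithmetic. The sketch below compares the 100MB test file against the RAM figures given in the original post (512M for the gmirror box, 2G for the hardware RAID boxes). The "twice RAM" factor is an assumed rule of thumb for getting past the buffer cache, not a figure taken from this thread, and the function names are hypothetical.

```python
TEST_FILE_MB = 100  # test file size mentioned in the reply above

# RAM per system in MB, as stated in the original post.
ram_mb = {
    "amun (gmirror RAID1)":  512,
    "janus (Adaptec RAID1)": 2048,
    "db2 (Adaptec RAID10)":  2048,
}


def fits_in_ram(test_mb, ram):
    """True if the whole test file could be held in memory."""
    return test_mb <= ram


def suggested_test_size_mb(ram, factor=2):
    """Test file size that exceeds RAM by `factor` (assumed rule of thumb)."""
    return ram * factor


for host, ram in ram_mb.items():
    cached = "fits in RAM" if fits_in_ram(TEST_FILE_MB, ram) else "exceeds RAM"
    print(f"{host:24s} RAM {ram:5d}MB: {TEST_FILE_MB}MB file {cached}; "
          f"try at least {suggested_test_size_mb(ram)}MB")
```

On all three systems the 100MB file fits in memory several times over, which is why the numbers may say more about caching than about the disks. The 10GB size mentioned for the iozone runs is larger than twice the RAM of any of these machines.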

Have you been certain to disable write caching on the drives?  If not, I
bet that's what you're seeing (which also means you're begging for
database corruption).

-- 
Jim C. Nasby, Database Consultant               decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"