Hi everyone,

I'm looking into setting up a SATA hardware RAID, probably RAID 5, to use with CentOS 4. I chose hardware RAID over software mostly because I like the fact that the RAID is transparent to the OS.

Does anyone know of any SATA controllers that are well tested for this sort of usage?

From what I can tell from googling, this is more or less where RHEL stands: Red Hat Enterprise Linux version 3 and the current version of Fedora support the following SATA chipsets:

Intel's ICH5 SATA chipset
Silicon Image's SATA chipset

Does CentOS 4 add anything to this, as it is based on the 2.6 kernel?

My question, after reading what I found on Google: if it is true hardware RAID, shouldn't the OS be unable to tell it's RAID at all? I'm assuming that the chipsets listed above are driver-based "hardware" RAID? What I am after is an array based on true hardware RAID, such that the OS sees just one drive and the controller firmware handles any mirroring/striping.

Any suggestions?

regards

Franki
On Sat, 16 Apr 2005 at 1:11am, Franki wrote

> I'm looking into setting up a SATA hardware raid, probably 5 to use with
> CentOS 4. I chose hardware raid over software mostly because I like the
> fact that the raid is transparent to the OS.
>
> Does anyone know of any SATA controllers that are well tested for this
> sort of usage?

IMO, 3ware is far and away the best choice for true hardware SATA RAID. The 3w-xxxx driver (for the 8000 series boards) has been in the kernel for a *long* time, and the 3w-9xxx driver (for the 9000 series) has been in for a while as well. There are some better performers out there at the moment (Areca, for example), but their drivers are unproven.

FWIW, I currently have 4TB (2 servers) worth of 3ware-based storage in production, and just took delivery of a new 5.5TB server.

--
Joshua Baker-LePain
Department of Biomedical Engineering
Duke University
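If you want to check what a given CentOS kernel ships for those cards, something along these lines should do it (the module names are the ones mentioned above; the output obviously varies by kernel and hardware):

  modinfo 3w-xxxx         # 8000-series driver
  modinfo 3w-9xxx         # 9000-series driver
  dmesg | grep -i 3ware   # confirm an installed card was detected at boot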
On Fri, 15 Apr 2005 at 19:11, Franki wrote:

> I'm looking into setting up a SATA hardware raid, probably 5 to use with
> CentOS 4. I chose hardware raid over software mostly because I like the
> fact that the raid is transparent to the OS.
>
> Does anyone know of any SATA controllers that are well tested for this
> sort of usage?
>
> From what I can tell from googling, this is more or less where RHEL stands:
> Red Hat Enterprise Linux version 3 and the current version of Fedora
> support the following SATA chipsets:
>
> Intel's ICH5 SATA chipset
> Silicon Image's SATA chipset

Those are not hardware RAID controllers. They are 'winraid' / 'fake raid', i.e. BIOS-assisted software RAID. If you want real hardware RAID, have a look at 3Ware's products.

> Does CentOS4 add anything to this as it is based on 2.6 kernel?

The 2.6 kernel supports 3Ware controllers well, too.

> My question upon reading what I found on Google, is if it is true
> hardware raid, shouldn't the OS not be able to tell it's raid at all?

Right, the system sees one drive rather than the independent drives. That's the difference between hardware RAID and software RAID.

> I'm assuming that the chipsets listed above are driver based hardware
> raid? what I am after is a raid array based on true hardware raid such
> that the OS sees just one drive, and the hardware firmware handles any
> mirror/striping.

Correct :)

> Any suggestions?

Yes, 3Ware - from small controllers up to those with RAID5 capability and several SATA connectors. ICP Vortex may be a good choice too. The stories about Adaptec controllers don't exactly invite interest in them.

> Franki

Alexander

--
Alexander Dalloz | Enger, Germany | GPG http://pgp.mit.edu 0xB366A773
legal statement: http://www.uni-x.org/legal.html
Fedora Core 2 GNU/Linux on Athlon with kernel 2.6.11-1.14_FC2smp
Serendipity 19:22:18 up 3 days, 16:02, load average: 0.22, 0.31, 0.32
On Sat, 2005-04-16 at 01:11 +0800, Franki wrote:

> Hi everyone,
>
> I'm looking into setting up a SATA hardware raid, probably 5 to use with
> CentOS 4. I chose hardware raid over software mostly because I like the
> fact that the raid is transparent to the OS.
>
> Does anyone know of any SATA controllers that are well tested for this
> sort of usage?
>
> From what I can tell from googling, this is more or less where RHEL stands:
> Red Hat Enterprise Linux version 3 and the current version of Fedora
> support the following SATA chipsets:
>
> Intel's ICH5 SATA chipset
> Silicon Image's SATA chipset
>
> Does CentOS4 add anything to this as it is based on 2.6 kernel?
>
> My question upon reading what I found on Google, is if it is true
> hardware raid, shouldn't the OS not be able to tell it's raid at all?
> I'm assuming that the chipsets listed above are driver based hardware
> raid? what I am after is a raid array based on true hardware raid such
> that the OS sees just one drive, and the hardware firmware handles any
> mirror/striping.

I have used the 6-port LSI MegaRAID SATA controller for an application using an RHEL rebuild (Rocks) with a 2.4 kernel. It simply uses the same modules as the SCSI flavor (e.g. some of the Dell PERC 4 line). It has worked like a champ (so far), and it wasn't too outrageously priced.

--
Sean O'Connell
Office of Engineering Computing      oconnell at soe.ucsd.edu
Jacobs School of Engineering, UCSD   858.534.9716 (49716)
----- Original Message -----
From: "Franki" <franki at htmlfixit.com>
Sent: Friday, April 15, 2005 10:11 AM
Subject: [CentOS] Serial ATA hardware raid.

> I'm looking into setting up a SATA hardware raid, probably 5 to use with
> CentOS 4. I chose hardware raid over software mostly because I like the
> fact that the raid is transparent to the OS.
>
> Does anyone know of any SATA controllers that are well tested for this
> sort of usage?

[...]

> Any suggestions?

From experience, 3Ware has the best support for hardware SATA RAID on Linux, whether on kernel 2.4 or 2.6. Their drivers have been included in most distros for a very long time, and even the latest 9xxx cards are fully supported out of the box, or with drivers you can freely download from the 3Ware web site.

We use their cards on several production servers, including a few that have 12 SATA drives for massive 3TB storage solutions. I don't recall ever having problems with their cards or support.

Chris
centos-bounces at centos.org wrote on 15.04.2005 19:11:09:

> I'm looking into setting up a SATA hardware raid, probably 5 to use with
> CentOS 4. I chose hardware raid over software mostly because I like the
> fact that the raid is transparent to the OS.
>
> Does anyone know of any SATA controllers that are well tested for this
> sort of usage?

3ware (7500, 8500, 9500) is IMHO the best bet on all distros, including CentOS 4. I also run it on an LSI SATA controller, but I cannot remember the exact model. I can find it if you want to know. :)

Regards,
Harald
centos-bounces at centos.org wrote on 15.04.2005 19:11:09:

> Does anyone know of any SATA controllers that are well tested for this
> sort of usage?

Just have to mention it while I remember. If you decide on the 3ware 9500S controller, be aware that you HAVE to do some manual tuning. It performs like crap out of the box.

Regards,
Harald
Harald Finnås wrote:

> centos-bounces at centos.org wrote on 15.04.2005 19:11:09:
>
> > Does anyone know of any SATA controllers that are well tested for this
> > sort of usage?
>
> Just have to mention it while I remember. If you decide on the 3ware
> 9500S controller, be aware that you HAVE to do some manual tuning. It
> performs like crap out of the box.
>
> Regards,
> Harald

Thanks for all the tips guys, I'm now wandering around 3ware's site.

http://www.3ware.com/products/serial_ata8000.asp

What I get is determined by what is available in Western Australia. I suppose these are small enough that I can get them shipped pretty cheaply; however, if one dies I'd like to be able to get it swapped out quickly, so I guess I should stick with one of my local wholesalers.

Looking at this page:

http://www.auspcmarket.com.au/index.php?redir=http://www.auspcmarket.com.au/show_product_info.php?input[product_code]=CO-3W9500-8&input[category_id]

the 3ware 9500 goes for $1023 AUD. Once I add some fast drives to that, this thing is starting to look like an expensive upgrade. I've just checked my wholesaler price lists and I don't see any 3ware kit (bummer), so if I buy one, I'll be paying retail and still have to ship it 5000km (from Sydney to Perth).

My main wholesaler sells Adaptec stuff, and I can get this one:

http://www.techbuy.com.au/products/34526/I_OCARDS_HARDDRIVECONTROLLERS/Adaptec/AAR2410SA/AAR2410SA_4-Port_Serial_ATA_RAID_Card_-_RAID_0_1_5_10_JBOD.asp

For $450 it is a 4-port controller and has 64MB of onboard ECC cache. But Alexander on the list here said that Adaptec is questionable for RAID cards. The last time I had an Adaptec anything was a 2940UW SCSI controller, and that thing gave me years of good service.

Not sure what to chase up now. :-(

Does anyone have any experience with the above Adaptec card?

rgds

Franki
From: Harald Finnås
> 3ware (7500, 8500, 9500) is IMHO best bet on all distros

3Ware includes a GPL driver in the stock kernel. Although, as with any "intelligent" storage adapter, you should verify that the driver, firmware and user-space tools are compatible. I need to get my FAQ out showing 3Ware releases v. Red Hat updates.

The 3Ware 7506/8506 share the same 64-bit ASIC, 66MHz PCI64 storage-switch design. They are non-blocking (like a network switch) thanks to their ASIC+SRAM (static RAM) design and kill at RAID-0, 1 and 0+1. While they are also high performing at RAID-5 reads, the small amount of 2MB (-4, -8) or 4MB (-12) SRAM can overflow on random RAID-5 writes. The 3Ware 9500S series adds 128MB+ SDRAM to help buffer these overflows. Unlike RAID-0, 1 and 0+1, RAID-3/4/5 writes are always blocking/buffered I/O, so the ASIC+SRAM design goes from an advantage to a liability. So if your application calls for lots of random writes, either use RAID-0+1 or get a 9500S for $50-200 more.

> I also run it on an LSI SATA controller,
> but I cannot remember the exact model.
> But I can find it if you want to know. :)

Unless the model says "X" (for XScale), you don't want it. Pre-XScale (StrongARM lineage) microcontroller+DRAM designs are *dog*slow*. Intel's i960 (including the IOP30x series) can't push much beyond 60-70MBps. While that was sufficient for yesteryear's drives in RAID-3/4/5, it is not today. And blocking/buffered is never ideal for ATA.

The StrongARM and the subsequent, superscalar XScale microcontrollers change everything. They can push anywhere from 200MBps in old designs up to 500MBps in the new IOP32x/33x series. But you'll pay through the nose for the cards, $500+, although they do beat even the 3Ware 9500S at RAID-5. Some have RAID-3/4 options that are more ideal than RAID-5 for desktops and NFS servers, respectively.

But if you are looking for maximum performance, reliability and compatibility as well as load-balanced reads, RAID-0+1 on a sub-$300 3Ware Escalade 8506-4 (or a -8 for a little more) is pretty much unbeatable. The 9500S is overkill for just RAID-0+1, and if you're doing lots of writes, RAID-0+1 is typically going to be faster than any RAID-5 solution.

For extreme performance with disk redundancy, consider (2) 3Ware 8506 cards in an Opteron system with an AMD8131 or AMD8132 dual-PCI-X-bus HyperTransport tunnel -- each card on its own PCI-X bus, and their individual RAID-0+1 volumes combined in a single LVM2 stripe set (RAID-0). Connect that to a GbE NIC on the HyperTransport interconnect or, better yet, an HTX (HyperTransport) 10GbE card (just coming out) or InfiniBand (up to 1.8GBps actual performance on HTX) and you're slapping silly anything Intel can offer with either Xeon or Itanium over PCI-X by a factor of 2-3x.
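The LVM2 side of that last suggestion would roughly look like the following. This is only a sketch - the device names (/dev/sda and /dev/sdb for the two exported 3Ware units), the volume group / logical volume names, the stripe size, the capacity and the ext3 choice are all placeholders to adjust for your own arrays:

  pvcreate /dev/sda /dev/sdb           # each 3Ware card exports its RAID-0+1 unit as one block device
  vgcreate vg_data /dev/sda /dev/sdb   # put both units into one volume group
  lvcreate -i 2 -I 256 -L 200G -n lv_data vg_data   # -i 2 stripes across both PVs, -I 256 = 256KB stripe
  mkfs.ext3 /dev/vg_data/lv_data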
From: Franki
> The 3ware 9500 goes for $1023 AUD.
> Once I add some fast drives to that, this thing is starting to look like
> an expensive upgrade.

RAID-5 is only ideal for lots of contiguous reads or for disk storage efficiency. If you have more writes, or more random reads, then RAID-0+1 is better. And there the 8506 does nicely for less.

> My main wholesaler sells Adaptec stuff,

I live 5 miles from one of Adaptec's major support centers in the 1,006-acre University of Central Florida Research Park. I know many Adaptec employees. Adaptec still does not officially support Linux for RAID cards, only standard SCSI adapters. And unlike LSI/Symbios, who treat OEM and retail the same (making support easier), Adaptec is retail focused and produces variants for OEMs that are real compatibility nightmares.

> http://www.techbuy.com.au/products/34526/I_OCARDS_HARDDRIVECONTROLLERS/Adaptec/AAR2410SA/AAR2410SA_4-Port_Serial_ATA_RAID_Card_-_RAID_0_1_5_10_JBOD.asp
> For $450 it is a 4-port controller and has 64MB of onboard ECC cache.

Adaptec's aging 2400A/2800A series are 66-100MHz i960/IOP30x parts based on their former DPT acquisition. In fact, before the DPT acquisition, they had virtually *no* i960 or StrongARM solution that worked with Linux. Since then, the support has been varied, although the dpt_i2o driver seems to work with most products. Intel designed I2O for its i960 to make drivers and user-space software more uniformly compatible, but only a few vendors (like DPT and LSI) took advantage of it. I used to have DPT and LSI i960 SCSI RAID controllers on Linux, NT, OpenVMS and UNIX on a variety of non-PC platforms like Alpha and MIPS.

But, as I mentioned before, the i960/IOP30x is a *massive*slouch* for today's drives. The ones used in the 2400A/2800A series were fine 5 years ago, but don't even bother with the 28x0, and the 24x0's i960 is going to be the bottleneck. And then you have the fact that Adaptec is "hands-off," officially, on the dpt_i2o driver.

Get a 3Ware Escalade 8506-4 for about the same price, and either run RAID-5 if you have lots of reads, or RAID-0+1 if you are doing lots of writes (especially random). Yeah, you'll lose a disk of effective storage with RAID-0+1, but I've yet to see an application where RAID-5 could beat RAID-0+1 on any controller.

> But Alexander on the list here said that Adaptec is questionable
> for RAID cards. The last time I had an Adaptec anything was a 2940UW SCSI
> controller, and that thing gave me years of good service.

Retail, of course. For those of us with OEM versions, Linux compatibility is a nightmare. Why can't Adaptec be like LSI and just have one set of firmware?

> Not sure what to chase up now. :-(
> Does anyone have any experience with the above Adaptec card?

Yes. At RAID-5, it only beats the Escalade 8506 at random writes, where the measly 2-4MB SRAM overflows. But then the Escalade 9500S with its 128MB+ of DRAM buffer (in addition to the ASIC+SRAM cache) wipes the floor with it, because it doesn't have a 10-year-old i960 behind it. When it comes to non-writes (or non-blocking RAID-0, 1 or 0+1), its 32-bit blocking/buffering i960 microcontroller+DRAM bottleneck loses badly to the 64-bit non-blocking/caching ASIC+SRAM design of any 3Ware product.

And 3Ware writes the GPL drivers in the kernel, which have been included since 2.2.15 (yes, 2.*2*.15). 3Ware's 3DM2 suite is also very nice for user-space monitoring. And yes, you can load 3DM2 from the 9500S download on the older 7000/8000 series (just not the 6000 series).
BTW, performance matters *little* if you are putting your storage controller on the same bus as your NIC - especially if it's a measly, "shared" 33MHz PCI32 bus. You might as well use 2 hard drives in RAID-1, because you're going to saturate it - let alone that your GbE is going to be "fighting" it. Get a mainboard with PCI-X, or at least get a chipset with the GbE on a HyperTransport interconnect or a PCIe x1 rail, so it's not fighting the storage controller for the PCI bus.

Of course, most chipset GbE and 99% of GbE cards are poorly designed (8-32KB of SRAM is typical, 64KB if you are lucky), and one should get a GbE card for a server with at least 256KB SRAM.
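To put rough numbers on the shared-bus point (theoretical peaks only; real-world throughput is lower, and the 50MB/s per-drive figure is the burst rate quoted later in this thread):

  shared 32-bit/33MHz PCI bus:  4 bytes x 33MHz ~ 133MB/s for everything on it
  gigabit Ethernet:             ~125MB/s at wire speed
  two SATA drives streaming:    ~2 x 50MB/s = 100MB/s

So even a simple mirror plus a busy GbE NIC already oversubscribes that one bus.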
Jonathan wrote:

> Franki:
>
> Can't speak for all of them, but the IDE 160s I bought failed me pretty
> hard (2 of 6 failed within a year). The 250s SATA 8ms I bought for a
> NAS application have been rock solid.
>
> Seems inconsistent, but in general the WD SATA drives I have are pretty
> solid.
>
> Jonathan

I don't understand - aren't they more or less the same drive with a different PCB on the back? I'm just remembering all the problems I had several years back when I was running RH 4-6.2 on WD drives.

I would imagine that with striping, the 8MB 7200rpm drives would be good enough, and besides, it's not like I can't upgrade later. The 10,000rpm drives have me a little worried about heat and longevity anyway.

Thanks for all your help guys, I'm amazed at the wealth of knowledge on this list.

regards

Frank
From: Franki
> After a bit of searching,
> I found a 4 port 8506 for retail $480 here:

BTW, when I was estimating prices earlier, I was using US$.

> For the record, this machine is to replace a web server with about 60+
> domains hosted,

I've been maintaining Linux DNS/SMTP servers since just before Apache surfaced, adding CERN httpd and then the "patchy" server. But the majority of my Linux deployment has been for file servers - especially NFS as of 1999. I still prefer Fujitsu SPARC/Solaris and, even more so, NetApp filers as NFS/SMB platforms, but throwing around TBph is typical.

> some of which get a lot of hits,
> most are dynamic (Perl/PHP/Java) and the
> server runs local MySQL/PostgreSQL as well.

I feel very strongly about using RAID-0+1 in this configuration, if you can afford to reduce your effective storage by 33% compared to RAID-5. But the choice is yours. You're not going to take a massive hit either way, since CPU and network are more important here. But you should still segment network and storage with an AMD8131 HyperTransport tunnel that provides separate PCI-X channels for each. And don't skimp on the NIC - get something with at least 256KB SRAM receive cache.

> I don't think speed is as important as redundancy,

RAID-0+1 means you could possibly lose 2 discs and be okay, although there's no guarantee after losing 1. The other thing 3Ware does is read interleaving between the two mirrors.

> but having said that, for future proofing,
> I'd like to get a good performer as well.

3Ware's ASIC compared to a microcontroller is like comparing a layer-2 ASIC+SRAM switch to a PC CPU+DRAM doing switching/routing. The ASIC can do it much faster, although the PC can buffer more. So it depends on whether you're just throwing data around like layer-2 switching (RAID-0, 1, 0+1 reads/writes), or calculating dynamic routes in non-real-time (like RAID-5 XORs for parity). That's why 3Ware calls it a "storage switch." Of course, if you have a 9500, it has DRAM too, so it's like having a layer-3 switch (that can route as well) - and without the inefficiency of the PC interconnect.

> Which brings up another question,
> I was looking at populating this thing
> with 10,000 RPM 8MB cache 40gig (roughly) drives,
> but the only drives I see in my pricelists that
> match those specs are Western Digital WD360GB's,
> and I generally have stayed away from WD drives in the
> past due to dodgy standards implementation
> and the problems that can cause with Linux,

First, other than Maxtor, Seagate and Hitachi, no one makes their own drives. WD taps the first and last for many, and Maxtor's approach is different from Seagate's and Hitachi's (I'll send you a link to an explanation). Secondly, the ATA drives *never* talk to the system, only to the 3Ware ASIC. And unlike GPL Linux, 3Ware can get proprietary command-set info from vendors. I have *never* had a "DMA timeout" in my 6 years of 3Ware usage.

> should I continue to stay away?

WD's 10,000rpm drives come off the same "enterprise" line as Hitachi's SCSI/SATA. They are not commodity at all. Hitachi and Seagate have dedicated "enterprise" lines for some SCSI/SATA, and then "commodity" lines for SCSI/SATA/ATA. Maxtor has only one line and then tests for tolerance; those that rate high are declared "enterprise" with a 3-5 year warranty.

> Actually, now that I think about it,
> I guess 10,000 rpm drives with raid is probably overkill for my usage,

Actually, since you are more concerned with latency in your application, a faster spindle is better.
In reality, the higher the density, typically the higher the DTR (data transfer rate). So it is very common to see a lower-density or one-generation-back drive at a higher spindle speed deliver a *lower* DTR than a denser drive at a lower spindle speed.

> but they are pretty cheap anyway, (149 AUD each) but
> if WD are still dodgy

I have WD 40, 80 and 160GB "consumer" ATA/SATA drives that give some chipsets fits under Linux, but 0 issues on a 3Ware card.

> then I'd probably go for 7200rpm seagates
> as I can get the 8MB 80gig 7200rpm drives for $85 AUD.

Seagate is the only manufacturer giving 3-5 year warranties on their "consumer" drives instead of 1-3. Yes, they cost 10-20% more than Maxtor/WD, but that's peace of mind IMHO. The "sweet spot" right now seems to be 200GB (160-250GB) for size, density (DTR performance) and price (sub-US$100). (4) 200GB = 400GB effective RAID-0+1.

> Are Western Digitals still dodgy?

All "integrated drive electronics" (IDE) drives via Advanced Technology (AT) Attachment (ATA) seem to do a poor job of following the ATA-5 and ATA-6 specs. But with 3Ware, I've yet to have an issue. I even have 80GB WDs on old 3Ware 6000 series cards with 3-year-old firmware 6.9 that don't have an issue.

BTW, 40C is the maximum ambient temperature a "consumer" SCSI/SATA/ATA drive should be exposed to. Drive life drops off if they run hotter. So consider getting a hot-swap enclosure that fits in 3x5.25" bays and holds 5 drives. They run $150.
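If you do go 3Ware, you can keep an eye on drive temperatures through the card with smartmontools. A rough sketch, assuming the 3w-xxxx driver and a reasonably recent smartctl (the device path varies by driver and version - older combinations use /dev/sda instead of /dev/twe0, and ",0" just means the first port on the card):

  smartctl -a -d 3ware,0 /dev/twe0 | grep -i temperature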
Hi,

On Fri, Apr 15, 2005 at 09:13:17PM +0200, Harald Finnås wrote:

> centos-bounces at centos.org wrote on 15.04.2005 19:11:09:
>
> > Does anyone know of any SATA controllers that are well tested for this
> > sort of usage?
>
> Just have to mention it while I remember. If you decide on the 3ware 9500s
> controller, be aware that you HAVE to do some manual tuning. It performs
> like crap out of the box.

This is as good an insertion point as any for what I am going to say.

3ware is working like a champ, but slowly. The tuning won't make it magically go over 100MB/s sustained writes. Random I/O sucks (from what I've seen) for any SATA setup. Even for /dev/mdX. A puny oldish A1000 can beat those by almost a factor of ten for random I/O, while being limited to max. 40MB/s transfers by its interface (UW/HVD).

But what I am going to say is that for my CentOS devel work (as in my NFS server), I just recently moved my 1.6TB RAID to /dev/md5 on a HighPoint RocketRAID 1820. I don't care that it is NOT hardware RAID. The /dev/mdX setup beats the 3ware 9500S-8 I formerly used hands down, when you do have 'spare CPU cycles to let the kernel handle the parity operations'.

http://www.supermicro.com/products/accessories/addon/DAC-SATA-MV8.cfm should be an even cheaper 8-port solution, but those RocketRAIDs are available in just about every store. Google reveals that there is a source driver for these Marvell chips (mvsata340.zip), for example at

http://www2.abit.com.tw/page/en/server/server_detail.php?pMODEL_NAME=SU-2S&fMTYPE=Server%20Boards&pPRODINFO=Driver

That driver isn't exactly GPL, but it has been working quite well for the past month or two while I've been in my 'testing phase' with this. With that Supermicro board, one doesn't even have the annoying BIOS problems at POST (that damn 128k BIOS init window + order-of-card-detection etc. problems, which won't always result in the system you'd like to have :).

So I am not saying 3ware isn't a good solution, I am just saying that at least for me there are better solutions, which get me a cheaper, more compatible and faster setup. The metadevice is just something one can toss onto any hardware, as it's in the kernel. A 3ware array needs another 3ware card for replacement if RAID5(0) is used. I've been told that RAID1 on 3ware is just a plain hardware mirror and those disks can be used on any controller, but I have not personally verified it.

Then a few words about this '3ware driver'. How I see it, it's just a few lines wrapping the Linux SCSI layer onto the 3ware firmware, and not even doing that too well. There has constantly been talk that the 3ware firmware is the reason for the bad performance of, for example, CentOS-3/ext3 + 3ware. No one outside of 3ware can do anything about it, and it is as it is. With the in-kernel solution, every line of code is there and anyone can study it and find the problem, should it exist.

The metadata format now used for MD devices is pretty much plug'n'play. I just recently had 'a pack of disks lying around in random order' which had been an 8-disk RAID5 /dev/md5. I plugged those into a machine and found out that it was re-syncing the RAID before I even logged in - and I had just tossed those disks in without any knowledge of the previous order, as I wasn't even going to preserve it. So the Linux software RAID implementation has grown to be pretty damn good and neat.

But then again, I need to get another cup of coffee to really wake up :P

my .02 euros

--
Pasi Pirhonen - upi at iki.fi - http://iki.fi/upi/
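For anyone wanting to try the same /dev/mdX route, a minimal sketch with mdadm (the device names, partition numbers and md number are placeholders - the disks here sit behind whatever plain SATA controller you use):

  mdadm --create /dev/md5 --level=5 --raid-devices=8 /dev/sd[a-h]1   # build an 8-disk RAID5
  cat /proc/mdstat                                                   # watch the initial sync
  mdadm --assemble --scan   # on any later box/boot: reassemble from the superblocks,
                            # regardless of cabling order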
From: Pasi Pirhonen
> This is as good an insertion point as any
> for what I am going to say.
> 3ware is working like a champ, but slowly.
> The tuning won't make it magically go over 100MB/s sustained writes.

I am bumping up against 100MBps on my bonnie write benchmarks on an older 6410. Now that's in RAID-0+1, not RAID-5.

> Random I/O sucks (from what I've seen) for any SATA setup.

It depends. "Raw" ATA sucks for multiple operations because it has no I/O queuing. AHCI is trying to address that, but it's still unintelligent. 3Ware queues reads/writes very well, and sequences them as best as it can. But it's still not perfect.

> Even for /dev/mdX.

Now with MD, you're starting to tax your interconnect on writes. E.g., with microcontroller or ASIC RAID, you only push the data you write. With software (including "FRAID"), you push 2x for RAID-1: that's 2x through your memory, over the system interconnect, into the I/O and out the PCI bus.

When you talk RAID-3/4/5 writes, you slaughter the interconnect. The bottleneck isn't the CPU. It's the fact that for each stripe, you've gotta load from memory through the CPU and back to memory - all over the system interconnect - before even looking at I/O. For 4+ modern ATA disks, you're talking a round trip that costs you 30%+ of your aggregate system interconnect time.

On a dynamic web server or other CPU-intensive server, it matters little. The XOR operations actually use very little CPU power, and the web or computational streams aren't saturating the interconnect. But when you are doing file server I/O, and the system interconnect is used for raw bursts of network I/O as much as storage, it kills.

> A puny oldish A1000 can beat those by almost a factor of ten for random I/O,
> while being limited to max. 40MB/s transfers by its interface (UW/HVD).

Or more likely limited by the i960 - because, after all, RAID should stripe some operations across multiple channels.

> But what I am going to say is that for my CentOS devel work
> (as in my NFS server), I just recently moved my 1.6TB RAID to /dev/md5 on a
> HighPoint RocketRAID 1820. I don't care that it is NOT hardware RAID.
> The /dev/mdX setup beats the 3ware 9500S-8 I formerly used hands down,
> when you do have 'spare CPU cycles to let the kernel handle the parity
> operations'.

It has nothing to do with CPU cycles, but with the interconnect. XOR puts no strain on modern CPUs; it's the added data streams being fed from memory to CPU. Furthermore, using async I/O, MD can actually be _faster_ than hardware RAID. Volume management in an OS will typically do much better than a hardware RAID card when it comes to block writes.

Of course, the 9500S is still maturing. Which is why I still prefer to use the 4 and 8 channel 7506/8506 cards with RAID-0+1. Even the AccelATA and 5000 left much to be desired before the 6000 and later 7000/8000 series.
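For reproducing that kind of bonnie measurement, a sketch only - assuming bonnie++ rather than the original bonnie, with a placeholder test directory (use a test size of at least 2x RAM so the page cache doesn't mask the controller; -u is needed when running as root):

  bonnie++ -d /mnt/array/tmp -s 4096 -u nobody   # 4096MB test file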
Harald Finnås wrote:

> Just have to mention it while I remember. If you decide on the 3ware
> 9500s controller, be aware that you HAVE to do some manual tuning. It
> performs like crap out of the box.

I was wondering if you could expand on that. What type of array(s) are you using, which controller, and any specific settings?
Bryan J. Smith <b.j.smith@ieee.org>
2005-Apr-17 04:06 UTC
[CentOS] Serial ATA hardware raid.
From: Les Mikesell <lesmikesell at gmail.com>
> Note that software mirroring has some advantages too and is not
> a big performance hit as long as the underlying hardware uses
> DMA and works independently (i.e. don't do it with drives
> on the same IDE controller cable).

Actually, I wouldn't do it on the same PCI bus either. With drives able to burst at 50MBps, the amount of traffic on your interconnect can actually be high.

> I consider it a big plus to be able to take a drive from a mirrored pair, plug
> it into just about any computer without worrying about having
> exactly the same brand of controller, and recover the data - or
> take over the service the broken computer was providing.

Then you'll like 3Ware. For RAID-1, it doesn't change the on-disk organization. But even if you use RAID-0, 0+1 or RAID-5, the great thing about 3Ware is that you _can_ safely move the drives to _any_ model card, as long as the firmware is at least the same version or newer. I have done this with volumes from 5000 to 6000 or 7000, and from 6000 to 7000.

> And by the way, there's nothing wrong with running a DNS server
> and mail server on the same machine - but you should have a 2nd
> dns server which can also provide other services.

A secondary mail server is always a good idea too. Ideally, you should consider a chroot environment or even a UML instance for each.

> You might need a bit more RAM but DNS typically is not a big cpu load.

But you are correct from a load standpoint - neither will hit you too much.

--
Bryan J. Smith mailto:b.j.smith at ieee.org
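On the chroot point, a minimal sketch for BIND on CentOS (on CentOS 4 the bind-chroot package sets up the /var/named/chroot tree; the path and user below follow that usual Red Hat layout):

  named -u named -t /var/named/chroot   # run named as an unprivileged user inside the chroot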
Bryan J. Smith <b.j.smith@ieee.org>
2005-Apr-17 04:16 UTC
[CentOS] Serial ATA hardware raid.
From: Harald Finnås <spamcatcher at lantrix.no>
> If memory serves, I had the worst performance on a 7506-8 running R5 and
> MDK 10.1. Write performance was extremely slow, so the box would use all
> memory for cache almost immediately. This resulted in very high CPU, and
> the machine became unresponsive. After applying:
> sysctl -w "vm.bdflush=0 500 0 0 500 3000 0 20 0"
> it was back to normal. This works with 2.4 kernels only.

Correct, because the Linux kernel is tuned to assume the storage does NOT provide I/O queuing. As I stated in another e-mail, the Linux kernel does async I/O by default. Some cards, like the 3Ware, try to sync I/O as best as they can for safety. This hurts performance quite a bit -- especially on the 9000 series.

The 9000 series introduces DRAM which, by default, isn't battery backed (the battery kits just became available), hence defaults that are very "sync-like." I _highly_recommend_ you change those defaults (and get the kit too). All prior 3Ware cards are SRAM-only, which does not require an external battery source. DRAM is a leaky cell that uses 100x+ the power of SRAM, which is a complex, combinational circuit. That's why prior 3Ware cards didn't need an external battery - they use SRAM. You can typically cycle power and the 3Ware write cache will be flushed (at least on the 7000/8000 series).

> With 2.6 this improves performance:
> blockdev --setra 16384 /dev/sda
> I'm pretty sure I found both these settings in the 3ware knowledge base.

So many people don't hit the 3Ware knowledge base. You should *ALWAYS* hit the knowledge base of your intelligent card vendor. Remember, the Linux kernel is tuned to assume storage controllers are unintelligent and have *0* I/O queuing capability - hence why LVM/MD is "tuned out of the box" with Linux. Furthermore, the SCSI-2 protocol provides standard command sets for I/O queuing on SCSI cards, as does Intel's I2O (dpt_i2o). But when you have a "generic" SCSI block driver, like with 3Ware and many others (including the DAC960 for a few older Mylex cards), you need to tell the kernel's VM and block interface how to best stage writes and other operations.

--
Bryan J. Smith mailto:b.j.smith at ieee.org
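To pull both of those tunings together, a sketch of how one might apply them at boot (untested; /dev/sda is a placeholder for whatever device the 3Ware unit appears as, and each line only applies to the kernel generation noted):

  # e.g. appended to /etc/rc.d/rc.local
  sysctl -w "vm.bdflush=0 500 0 0 500 3000 0 20 0"   # 2.4 kernels only: bdflush tuning from the 3Ware KB
  blockdev --setra 16384 /dev/sda                    # 2.6 kernels: larger readahead on the array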