Hello all,

I've been on a real roller coaster ride getting a large virtual host up and running. One troublesome thing I've discovered (the hard way) is that the drivers for Marvell SAS/SATA chips still have a few problems.

After Googling around quite a bit, I see a significant number of others have had similar issues, especially evident in the Ubuntu forums but also for a few RHEL/CentOS users.

I have found that under heavy load (in my case, simply doing the initial sync of large RAID-6 arrays) the current 0.8 driver can wander off into the weeds after a while, less so for the older 0.5 driver in CentOS-5. It would appear that some sort of bug has been introduced into the newer driver.

I've had to replace the Marvell-based controllers with LSI, which seem rock solid. I'm rather disappointed that I've wasted good money on several Marvell-based controller cards (2 SAS/SATA and 2 SATA).

My question ... is anyone aware of the *real* status of these drivers? The Internet is full of somewhat conflicting reports. I'm referring to 'mvsas' and 'sata_mv', both of which seem to have issues under heavy load. It sure would be nice to return to using what appear to be well-made controller cards. I understand that even Alan Cox has expressed some frustration with the current driver status.

FWIW, I had similar problems under the RHEL-6 evaluation OS too.

Chuck
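For anyone wanting to confirm which of the two Marvell drivers a given card has been claimed by before loading it up, something along these lines works; the PCI output and devices you see will of course depend on your own hardware:

    # Show Marvell storage controllers and the kernel driver bound to each
    lspci -nnk | grep -A3 -i marvell

    # Check whether mvsas or sata_mv is actually loaded
    lsmod | grep -E 'mvsas|sata_mv'

    # Watch for resets/timeouts from either driver while an array is syncing
    dmesg | grep -E -i 'mvsas|sata_mv|reset|timeout' | tail -50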
On 3/3/11 6:52 PM, Chuck Munro wrote:
> I've been on a real roller coaster ride getting a large virtual host up and running. One troublesome thing I've discovered (the hard way) is that the drivers for Marvell SAS/SATA chips still have a few problems.
>
> After Googling around quite a bit, I see a significant number of others have had similar issues, especially evident in the Ubuntu forums but also for a few RHEL/CentOS users.
>
> I have found that under heavy load (in my case, simply doing the initial sync of large RAID-6 arrays) the current 0.8 driver can wander off into the weeds after a while, less so for the older 0.5 driver in CentOS-5. It would appear that some sort of bug has been introduced into the newer driver.
>
> I've had to replace the Marvell-based controllers with LSI, which seem rock solid. I'm rather disappointed that I've wasted good money on several Marvell-based controller cards (2 SAS/SATA and 2 SATA).

I replaced separate SII and Promise controllers with a single 8-port Marvell-based card and thought it was a big improvement. No problems with CentOS 5.x, mostly running RAID1 pairs, one of which is frequently hot-swapped and re-synced. I hope it's not going to have problems when I upgrade.

-- 
Les Mikesell
lesmikesell at gmail.com
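For reference, the usual mdadm sequence for that kind of hot-swap and re-sync of a RAID1 member looks roughly like this; /dev/md0 and /dev/sdb1 are placeholder names, not anyone's actual devices:

    # Mark the outgoing member failed and pull it from the mirror
    mdadm --manage /dev/md0 --fail /dev/sdb1
    mdadm --manage /dev/md0 --remove /dev/sdb1

    # ...swap the physical drive, partition it to match the survivor...

    # Re-add the member and let the mirror rebuild
    mdadm --manage /dev/md0 --add /dev/sdb1

    # Watch the re-sync progress
    cat /proc/mdstat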
On 03/04/2011 09:00 AM, Les Mikesell wrote:
> On 3/3/11 6:52 PM, Chuck Munro wrote:
>> I've had to replace the Marvell-based controllers with LSI, which seem rock solid. I'm rather disappointed that I've wasted good money on several Marvell-based controller cards (2 SAS/SATA and 2 SATA).
>
> I replaced separate SII and Promise controllers with a single 8-port Marvell-based card and thought it was a big improvement. No problems with CentOS 5.x, mostly running RAID1 pairs, one of which is frequently hot-swapped and re-synced. I hope it's not going to have problems when I upgrade.

Since I have the luxury of time to evaluate options, I've just downloaded Scientific Linux 6 to see what happens with either the mvsas or sata_mv driver. This is my first experience with SL, but I wanted native ext4 rather than the preview version in CentOS-Plus.

Even if I stick with SL-6 as the KVM host, I'll continue using CentOS for the guest machines. If the Marvell drivers don't pan out, it looks like I'll have to either spend money on a 3Ware|LSI|Promise controller or revert to CentOS-Plus 5.5 for ext4.

SL-6 is installing as I write this.

Chuck
On 03/05/2011 09:00 AM, Nico Kadel-Garcia wrote:
> On Fri, Mar 4, 2011 at 4:16 PM, compdoc <compdoc at hotrodpc.com> wrote:
>>> If the Marvell drivers don't pan out, it looks like I'll have
>>> to either spend money on a 3Ware|LSI|Promise controller
>>
>> The 3ware are excellent...
>
> And Promise, historically, is *not*.

Yes, I've had problems with Promise cards in the past, but haven't bought any for a long time. They seem to be moving upscale these days.

Regarding the Marvell drivers, I had good luck with the 'sata_mv' driver in Scientific Linux 6 just yesterday, running a pair of 4-port PCIe-x4 Tempo 'Sonnet' controller cards. So it appears someone has fixed that particular driver. I've decided to stick with those cards rather than re-install the Supermicro/Marvell SAS/SATA 8-port controllers, which use the 'mvsas' driver that I had problems with on the RHEL-6 evaluation distro.

So far, SL-6 has performed very well: all RAID-6 arrays re-synced properly, and running concurrent forced fscks on eight arrays was very fast (because the ext4 filesystems were still empty :-) ).

I think I'll stick with SL-6 as the VM host OS, but will use CentOS for the guest VMs. CentOS-5.x will do fine for now, and I'll have the luxury of upgrading guest OSs to CentOS-6 as the opportunity arises.

Chuck
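If anyone wants to repeat that kind of concurrent forced check, a minimal sketch looks like this; the /dev/md0 through /dev/md7 names are placeholders, and the filesystems need to be unmounted first:

    # Force a full check of one unmounted ext4 filesystem
    e2fsck -f /dev/md0

    # Check several arrays in parallel and wait for all of them to finish
    for md in /dev/md{0..7}; do
        e2fsck -f -y "$md" &
    done
    wait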
On 03/06/2011 09:00 AM, compdoc wrote:
>> Regarding the Marvell drivers, I had good luck with the 'sata_mv' driver in Scientific Linux 6 just yesterday, running a pair of 4-port PCIe-x4 Tempo 'Sonnet' controller cards.
>
> Are those the Mac/Windows Sonnet cards that go for less than $200?
>
> What kind of performance are you seeing? Are you doing software RAID on them?

Yes, those are the cards which target Windows and OS X, but they work fine on Linux as well. They use the Marvell 88SX series chips. They control six 2TB WD Caviar Black drives, arranged as five drives in a RAID-6 array with one hot spare. Three drives are connected to each of the two cards. mdstat shows array re-sync speed is usually over 100 MBytes/sec, although that tends to vary quite a bit over time.

------------------------------

On 03/06/2011 09:00 AM, John R Pierce wrote:
> On 03/05/11 7:01 AM, Eero Volotinen wrote:
>> areca works..
>
> for SAS, I prefer LSI Logic.

The Supermicro mobo I'm using (X8DAL-3) has an on-board LSI 1068E SAS/SATA controller chip, although I have the RAID functionality disabled so I can use it as a bunch of drives for software RAID-6. Like the Tempo cards, it has six 2TB WD SATA drives attached, which provides a second set of arrays.

Performance really sucks, for some unknown reason, and I get lots of I/O error messages logged when the drives get busy. There appears to be no data corruption, just a lot of retries that slow things down significantly. The LSI web site has no info about the errors. The firmware is passing back I/O abort code 0403 and LSI debug info related to "channel 0 id 9". There are only 8 ports, so I don't know which disk drive may or may not be causing problems. The SMART data on all disks shows no issues, although I tend to treat some SMART data with scepticism.

I need to track this error down, because my understanding is that the LSI controller chip has very good performance.

Chuck
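For the "which drive is it" question, one quick thing worth doing is dumping the SMART counters that most often betray a flaky drive or cable on every member; the device names below are only examples:

    # Dump the SMART attributes that most often flag a failing drive or bad cable
    for d in /dev/sd{b..i}; do
        echo "=== $d ==="
        smartctl -A "$d" | grep -E 'Reallocated|Pending|Uncorrect|CRC'
    done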
On 03/07/2011 09:00 AM, Nico Kadel-Garcia wrote:
> On Sun, Mar 6, 2011 at 10:07 PM, Charles Polisher <cpolish at surewest.net> wrote:
>> https://secure.wikimedia.org/wikipedia/en/wiki/Fakeraid#Firmware.2Fdriver-based_RAID
>> covers fake RAID.
>
> Ouch. That was *precisely* why I used the 2410, not the 1420, SATA card, some years back. It was nominally more expensive but well worth the reliability and support, which was very good for RHEL and CentOS.
>
> I hadn't been thinking about that HostRaid messiness because I read the reviews and avoided it early.

Here's the latest info, which I'll share ... it's good news, thankfully. The problem with terrible performance on the LSI controller was traced to a flaky disk. It turns out that if you examine 'dmesg' carefully, you'll find a mapping of the controller's PHY numbers to the "id X" strings (thanks to an IT friend for that tip). The LSI error messages have dropped from several thousand per day to maybe 4 or 5 per day when stressed. Now the LSI controller is busy re-syncing the arrays, with speed consistently over 100,000K/sec, which is excellent.

My scepticism regarding SMART data continues ... the flaky drive showed no errors, and a full test and full zero-write using the WD diagnostics revealed no errors either. If the drive is bad, there's no evidence that would cause WD to issue an RMA.

Regarding "fake raid" controllers, I use them in several small machines, but only as JBOD with software RAID. I haven't used Adaptec cards for many years, mostly because their SCSI controllers back in the early days were junk.

Using RAID to protect the root/boot drives requires one bit of extra work ... make sure you install grub in the boot sector of at least two drives, so you can boot from an alternate if necessary. CentOS/SL/RHEL doesn't do that for you; it only puts grub in the boot sector of the first drive in an array.

Chuck
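To make those last two tips concrete, here is roughly what the dmesg digging and the extra grub installs look like; the device names are examples, and the grub commands assume GRUB legacy as shipped with EL5/EL6:

    # Find the controller's phy-to-"id X" mapping in the boot/error messages
    dmesg | grep -i -E 'phy|sas' | less

    # Put grub on the boot sector of a second array member as well,
    # so the machine can still boot if the first drive dies
    grub-install /dev/sda
    grub-install /dev/sdb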