m.roth at 5-cent.us wrote:
> Mark Haney wrote:
>> On 09/08/2017 09:49 AM, hw wrote:
>>> Mark Haney wrote:
> <snip>
>>>
>>> It depends, i.e. I can't tell how these SSDs would behave if large
>>> amounts of data were written to and/or read from them over extended
>>> periods of time, because I haven't tested that.  That isn't the
>>> application, anyway.
>>
>> If your I/O is going to be heavy (and you've not mentioned expected
>> traffic, so we can only go on what little we glean from your posts),
>> then SSDs will likely start having issues sooner than a mechanical
>> drive might.  (Though, YMMV.)  As I've said, we process 600 million
>> messages a month, on primary SSDs in a VMWare cluster, with mechanical
>> storage for older, archived user mail.  "Archived" may not be exactly
>> correct, but the context should be clear.
>>
> One thing to note, which I'm aware of because I was recently spec'ing
> out a Dell server: Dell, at least, offers two kinds of SSDs, one for
> heavy write, I think it was, and one for equal r/w.  You might dig into
> that.
>>>
>>> But mdadm does, and the impact is severe.  I know there are people
>>> saying otherwise, but I've seen the impact myself, and I definitely
>>> don't want it on that particular server because it would likely
>>> interfere with other services.  I don't know whether the software RAID
>>> of btrfs is any better in that regard, but I'm seeing btrfs on SSDs
>>> being fast, and testing with the particular application has shown a
>>> speedup by a factor of 20--30.
>
> Odd, we've never seen anything like that.  Of course, we're not handling
> the kind of mail you are... but serious scientific computing hits
> storage hard, too.
>
>> I never said anything about MD RAID.  I trust that about as far as I
>> could throw it.  And having had 5 surgeries on my throwing shoulder,
>> that wouldn't be far.
>
> Why?  We have it all over, and have never seen a problem with it.  Nor
> have I, personally, as I have a RAID 1 at home.
> <snip>

Make a test and replace a software RAID5 with a hardware RAID5.  Even with
only 4 disks, you will see an overall performance gain.  I'm guessing that
the SATA controllers they put onto the mainboards are not designed to
handle all the data --- which gets multiplied to all the disks --- and
that the PCI bus might get clogged.  There's also the CPU being burdened
with the calculations required for the RAID, and that may not show up in
tools like top, so you can be fooled easily.

Graphics cards have hardware acceleration for a reason.  When was the last
time you tried software rendering, and what frame rates did you get? :)
Offloading the I/O to a dedicated controller gives you room for the things
you actually want to do, much like a graphics card does.

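For anyone who wants to actually run that comparison, a minimal sketch
might look like the following.  The member disks /dev/sd{b..e} and the
hardware controller's volume /dev/sdX are placeholders, fio is assumed to
be installed, and both commands destroy whatever is on the devices they
are pointed at:

    # Software RAID5 across four disks (wipes the named devices):
    mdadm --create /dev/md0 --level=5 --raid-devices=4 \
          /dev/sdb /dev/sdc /dev/sdd /dev/sde

    # Identical random-write load against the md array and against the
    # volume exported by the hardware RAID controller:
    fio --name=raidtest --filename=/dev/md0 --direct=1 --ioengine=libaio \
        --rw=randwrite --bs=4k --iodepth=32 --runtime=300 --time_based \
        --group_reporting
    fio --name=raidtest --filename=/dev/sdX --direct=1 --ioengine=libaio \
        --rw=randwrite --bs=4k --iodepth=32 --runtime=300 --time_based \
        --group_reporting

Repeating the run with --rw=write --bs=1M adds a sequential-throughput
comparison; whether the hardware controller really shows "an overall
performance gain" is exactly what such a test would settle.
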
On 09/08/2017 11:06 AM, hw wrote:
> Make a test and replace a software RAID5 with a hardware RAID5.  Even
> with only 4 disks, you will see an overall performance gain.  I'm
> guessing that the SATA controllers they put onto the mainboards are not
> designed to handle all the data --- which gets multiplied to all the
> disks --- and that the PCI bus might get clogged.  There's also the CPU
> being burdened with the calculations required for the RAID, and that may
> not be displayed by tools like top, so you can be fooled easily.

That sounds like a whole lot of guesswork, which I'd suggest should
inspire slightly less confidence than you are showing in it.

RAID parity calculations are accounted under a process named
md<number>_raid<level>.  You will see time consumed by that code under all
of the normal process accounting tools, including total time under "ps"
and current time under "top".  Typically, your CPU is vastly faster than
the cheap processors on hardware RAID controllers, and the advantage will
go to software RAID over hardware.  If your system is CPU bound, however,
and you need that extra fraction of a percent of CPU cycles that go to
calculating parity, hardware might offer an advantage.

The last system I purchased had its storage controller on a PCIe 3.0 x16
port, so its throughput to the card should be around 16 GB/s.  Yours might
be different.  I should be able to put roughly 20 disks on that card
before the PCIe bus is the bottleneck.  If this were a RAID6 volume, a
hardware RAID card would be able to support sustained writes to 22 drives
vs. 20 for md RAID.  I don't see that as a compelling advantage, but it is
potentially an advantage for a hypothetical hardware RAID card.

When you are testing your 4-disk RAID5 array, microbenchmarks like
bonnie++ will show you a very significant advantage for the hardware RAID,
as very small writes are added to the battery-backed cache on the card and
the OS considers them complete.  However, on many cards, if the system
writes data to the card faster than the card writes to disks, the cache
will fill up, and at that point the system performance can suddenly and
unexpectedly plummet.  I've run a few workloads where that happened, and
we had to replace the system entirely and use software RAID instead.
Software RAID's performance tends to be far more predictable as the
workload increases.

Outside of microbenchmarks like bonnie++, software RAID often offers much
better performance than hardware RAID controllers.  Having tested systems
extensively for many years, my advice is this: there is no simple answer
to the question of whether software or hardware RAID is better.  You need
to test your specific application on your specific hardware to determine
what configuration will work best.  There are some workloads where a
hardware controller will offer better write performance, since a
battery-backed write cache can complete very small random writes very
quickly.  If that is not the specific behavior of your application,
software RAID will very often offer you better performance, as well as
other advantages.  On the other hand, software RAID absolutely requires a
monitored UPS and tested auto-shutdown in order to be remotely reliable,
just as a hardware RAID controller requires a battery-backed write cache
and monitoring of the battery state.

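A quick way to see that accounting in practice, assuming an existing md
array (the array name md0 and level 5 below are just examples):

    cat /proc/mdstat                          # list arrays and their levels
    ps -eo pid,comm,time | grep md0_raid5     # total CPU time spent on parity
    top -p "$(pgrep -d, md0_raid5)"           # current CPU use under load

During a sustained write to the array, the %CPU column for that kernel
thread shows what the parity math is actually costing, which on a modern
CPU is usually the small fraction described above.
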
Gordon Messmer wrote:
> On 09/08/2017 11:06 AM, hw wrote:
>> Make a test and replace a software RAID5 with a hardware RAID5.  Even
>> with only 4 disks, you will see an overall performance gain.  I'm
>> guessing that the SATA controllers they put onto the mainboards are not
>> designed to handle all the data --- which gets multiplied to all the
>> disks --- and that the PCI bus might get clogged.  There's also the CPU
>> being burdened with the calculations required for the RAID, and that
>> may not be displayed by tools like top, so you can be fooled easily.
>
> That sounds like a whole lot of guesswork, which I'd suggest should
> inspire slightly less confidence than you are showing in it.

It's called "experience".  I haven't tested a great number of machines
extensively enough to experience the difference between software and
hardware RAID on them, and I agree with what you're saying: it's all
theory until it has been suitably tested, hence my recommendation to test
it.

> RAID parity calculations are accounted under a process named
> md<number>_raid<level>.  You will see time consumed by that code under
> all of the normal process accounting tools, including total time under
> "ps" and current time under "top".  Typically, your CPU is vastly faster
> than the cheap processors on hardware RAID controllers, and the
> advantage will go to software RAID over hardware.  If your system is CPU
> bound, however, and you need that extra fraction of a percent of CPU
> cycles that go to calculating parity, hardware might offer an advantage.
>
> The last system I purchased had its storage controller on a PCIe 3.0 x16
> port, so its throughput to the card should be around 16 GB/s.  Yours
> might be different.  I should be able to put roughly 20 disks on that
> card before the PCIe bus is the bottleneck.  If this were a RAID6
> volume, a hardware RAID card would be able to support sustained writes
> to 22 drives vs. 20 for md RAID.  I don't see that as a compelling
> advantage, but it is potentially an advantage for a hypothetical
> hardware RAID card.
>
> When you are testing your 4-disk RAID5 array, microbenchmarks like
> bonnie++ will show you a very significant advantage for the hardware
> RAID, as very small writes are added to the battery-backed cache on the
> card and the OS considers them complete.  However, on many cards, if the
> system writes data to the card faster than the card writes to disks, the
> cache will fill up, and at that point the system performance can
> suddenly and unexpectedly plummet.  I've run a few workloads where that
> happened, and we had to replace the system entirely and use software
> RAID instead.  Software RAID's performance tends to be far more
> predictable as the workload increases.
>
> Outside of microbenchmarks like bonnie++, software RAID often offers
> much better performance than hardware RAID controllers.  Having tested
> systems extensively for many years, my advice is this: there is no
> simple answer to the question of whether software or hardware RAID is
> better.  You need to test your specific application on your specific
> hardware to determine what configuration will work best.  There are some
> workloads where a hardware controller will offer better write
> performance, since a battery-backed write cache can complete very small
> random writes very quickly.  If that is not the specific behavior of
> your application, software RAID will very often offer you better
> performance, as well as other advantages.
> On the other hand, software RAID absolutely requires a monitored UPS and
> tested auto-shutdown in order to be remotely reliable, just as a
> hardware RAID controller requires a battery-backed write cache and
> monitoring of the battery state.

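For what it's worth, the monitoring that last paragraph calls for is
mostly stock tooling.  A rough sketch, assuming CentOS with the mdmonitor
service, a NUT-managed UPS (the UPS name "myups" is a placeholder), and an
LSI/Avago controller for the hardware case:

    # md RAID: get mail when an array degrades or fails
    echo 'MAILADDR root' >> /etc/mdadm.conf
    systemctl enable mdmonitor
    systemctl start mdmonitor

    # UPS: verify the OS actually sees the battery state ("OL" = on line,
    # "OB" = on battery); the auto-shutdown itself is configured via
    # SHUTDOWNCMD in upsmon.conf and should be tested by pulling the plug.
    upsc myups@localhost ups.status

    # Hardware RAID: the write-cache battery is only visible through the
    # vendor tool, e.g. something like:
    #   storcli64 /c0/bbu show

None of this is exotic, but the point above stands: if it isn't monitored
and tested, neither flavor of RAID is reliable when the power goes away.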