On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
> On Nov 11, 2020, at 2:01 PM, hw <hw at gc-24.de> wrote:
> > I have yet to see software RAID that doesn't kill the performance.
> 
> When was the last time you tried it?

I'm currently using it, and the performance sucks.  Perhaps it's
not the software itself or the CPU but the on-board controllers
or other components being incapable of handling multiple disks in a
software RAID.  That's something I can't verify.

> Why would you expect that a modern 8-core Intel CPU would impede I/O in
> any measurable way as compared to the outdated single-core 32-bit RISC
> CPU typically found on hardware RAID cards?  These are the same CPUs,
> mind, that regularly crunch through TLS 1.3 on line-rate fiber Ethernet
> links, a much tougher task than mediating spinning disk I/O.

It doesn't matter what I expect.

> > And where
> > do you get cost-efficient cards that can do JBOD?
> 
> $69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1

That says it's for HP.  So will you still get firmware updates once
the warranty is expired?  Does it exclusively work with HP hardware?

And are these good?

> Search for "LSI JBOD" for tons more options.  You may have to fiddle
> with the firmware to get it to stop trying to do clever RAID stuff,
> which lets you do smart RAID stuff like ZFS instead.
> 
> > What has HP been thinking?
> 
> That the hardware vs software RAID argument is over in 2020.

Do you have a reference for that, like a final statement from HP?
Did they stop developing RAID controllers, or do they ship their
servers now without them and tell customers to use btrfs or mdraid?
On Sat, Nov 14, 2020, 4:57 AM hw <hw at gc-24.de> wrote:
> On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
> 
> > > And where
> > > do you get cost-efficient cards that can do JBOD?
> > 
> > $69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
> 
> That says it's for HP.  So will you still get firmware updates once
> the warranty is expired?  Does it exclusively work with HP hardware?
> 
> And are these good?

That specific card is a bad choice: it's the very obsolete SAS1068E chip,
which is SAS 1.0, with a roughly 2 TB per-disk limit.

Cards based on the SAS 2008, 2308, and 3008 chips are a much better choice.

Any OEM card with these chips can be flashed with generic LSI/Broadcom IT
firmware.
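A quick back-of-the-envelope on that per-disk ceiling. This is a sketch,
assuming the limit on SAS 1.0-era chips such as the SAS1068E comes from
32-bit LBA addressing of 512-byte logical sectors (the commonly cited
cause, not something stated in the thread):

# Assumed: 32-bit logical block addresses and 512-byte sectors.
SECTOR_BYTES = 512
MAX_LBAS = 2 ** 32

max_bytes = SECTOR_BYTES * MAX_LBAS
print(f"max addressable: {max_bytes} bytes "
      f"= {max_bytes / 1024**4:.2f} TiB "
      f"= {max_bytes / 1000**4:.2f} TB")

That works out to about 2.2 TB (2 TiB), which matches the limit usually
quoted for these controllers.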
> On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
>> On Nov 11, 2020, at 2:01 PM, hw <hw at gc-24.de> wrote:
>> > I have yet to see software RAID that doesn't kill the performance.
>>
>> When was the last time you tried it?
>
> I'm currently using it, and the performance sucks.  Perhaps it's
> not the software itself or the CPU but the on-board controllers
> or other components being incapable of handling multiple disks in a
> software RAID.  That's something I can't verify.
>
>> Why would you expect that a modern 8-core Intel CPU would impede I/O in
>> any measurable way as compared to the outdated single-core 32-bit RISC
>> CPU typically found on hardware RAID cards?  These are the same CPUs,
>> mind, that regularly crunch through TLS 1.3 on line-rate fiber Ethernet
>> links, a much tougher task than mediating spinning disk I/O.
>
> It doesn't matter what I expect.
>
>> > And where
>> > do you get cost-efficient cards that can do JBOD?
>>
>> $69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
>
> That says it's for HP.  So will you still get firmware updates once
> the warranty is expired?  Does it exclusively work with HP hardware?
>
> And are these good?
>
>> Search for "LSI JBOD" for tons more options.  You may have to fiddle
>> with the firmware to get it to stop trying to do clever RAID stuff,
>> which lets you do smart RAID stuff like ZFS instead.
>>
>> > What has HP been thinking?
>>
>> That the hardware vs software RAID argument is over in 2020.
>
> Do you have a reference for that, like a final statement from HP?
> Did they stop developing RAID controllers, or do they ship their
> servers now without them and tell customers to use btrfs or mdraid?

HPE and the other large vendors won't tell you directly because they love
to sell you their outdated SAS/SATA RAID stuff. They were quite slow to
introduce NVMe storage, be it as PCIe cards or U.2 format, but it's also
clear to them that NVMe is the future and that it's used with software
redundancy provided by MD RAID, ZFS, Btrfs etc. Just search for HPE's
4AA4-7186ENW.pdf file, which also mentions it.

In fact, local storage was one reason why we turned away from HPE and
Dell after many years: we just didn't want to invest in outdated
technology.

Regards,
Simon
On Nov 14, 2020, at 5:56 AM, hw <hw at gc-24.de> wrote:
> 
> On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
>> On Nov 11, 2020, at 2:01 PM, hw <hw at gc-24.de> wrote:
>>> I have yet to see software RAID that doesn't kill the performance.
>> 
>> When was the last time you tried it?
> 
> I'm currently using it, and the performance sucks.

Be specific.  Give chip part numbers, drivers used, whether this is
on-board software RAID or something entirely different like LVM or MD
RAID, etc.  For that matter, I don't even see that you've identified
whether this is CentOS 6, 7 or 8.  (I hope it isn't older!)

> Perhaps it's
> not the software itself or the CPU but the on-board controllers
> or other components being incapable of handling multiple disks in a
> software RAID.  That's something I can't verify.

Sure you can.  Benchmark RAID-0 vs RAID-1 in 2, 4, and 8 disk arrays.

In a 2-disk array, a proper software RAID system should give 2x a single
disk's performance for both read and write in RAID-0, but single-disk
write performance for RAID-1.

Such values should scale reasonably as you add disks: RAID-0 over 8 disks
gives 8x performance, RAID-1 over 8 disks gives 4x write but 8x read, etc.

These are rough numbers, but what you're looking for are failure cases
where it's 1x a single disk for read or write.  That tells you there's a
bottleneck or serialization condition, such that you aren't getting the
parallel I/O you should be expecting.

>> Why would you expect that a modern 8-core Intel CPU would impede I/O
> 
> It doesn't matter what I expect.

It *does* matter if you know what the hardware's capable of.

TLS is a much harder problem than XOR checksumming for traditional RAID,
yet it imposes [approximately zero][1] performance penalty on modern
server hardware, so if your CPU can fill a 10GE pipe with TLS, then it
should have no problem dealing with the simpler calculations needed by
the ~2 Gbit/sec flat-out max data rate of a typical RAID-grade 4 TB
spinning HDD.

Even with 8 in parallel in the best case where they're all reading
linearly, you're still within a small multiple of the Ethernet case, so
we should still expect the software RAID stack not to become CPU-bound.

And realize that HDDs don't fall into this max data rate case often
outside of benchmarking.  Once you start throwing ~5 ms seek times into
the mix, the CPU's job becomes even easier.

[1]: https://stackoverflow.com/a/548042/142454

>>> And where
>>> do you get cost-efficient cards that can do JBOD?
>> 
>> $69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
> 
> That says it's for HP.  So will you still get firmware updates once
> the warranty is expired?  Does it exclusively work with HP hardware?
> 
> And are these good?

You asked for "cost-efficient," which I took to be a euphemism for
"cheapest thing that could possibly work."  If you're willing to spend
money, then I fully expect you can find JBOD cards you'll be happy with.

Personally, I get servers with enough SFF-8087 SAS connectors on them to
address all the disks in the system.  I haven't bothered with add-on SATA
cards in years.

I use ZFS, so absolute flat-out benchmark speed isn't my primary
consideration.  Data durability and data set features matter to me far
more.

>>> What has HP been thinking?
>> 
>> That the hardware vs software RAID argument is over in 2020.
> 
> Do you have a reference for that, like a final statement from HP?

Since I'm not posting from an hpe.com email address, I think it's pretty
obvious that that is my opinion, not an HP corporate statement.
I base it on observing the Linux RAID market since the mid-90s.  The
massive consolidation for hardware RAID is a big part of it.  That's what
happens when a market becomes "mature," which is often the step just
prior to "moribund."

> Did they stop developing RAID controllers, or do they ship their
> servers now without them

Were you under the impression that HP was trying to provide you the best
possible technology for all possible use cases, rather than make money by
maximizing the ratio of cash in vs cash out?

Just because they're serving it up on a plate doesn't mean you hafta pick
up a fork.
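For anyone who wants the rough scaling expectations above in one place,
here is a small illustrative Python sketch.  It assumes "RAID-1 over
8 disks" means striped mirror pairs (RAID-10), and the single-disk rate
is a made-up nominal figure; the point is the shape of the scaling, not
the absolute numbers.

# Ideal streaming throughput relative to one disk, ignoring seeks, bus
# limits, and parity math.  "raid1" is treated as striped mirror pairs
# once there are more than two disks, matching the 4x write / 8x read
# figure given for 8 disks.

def expected_multipliers(level: str, disks: int) -> tuple[int, int]:
    """Return (read_x, write_x) vs. a single disk for an ideal array."""
    if level == "raid0":
        return disks, disks            # striping scales both ways
    if level == "raid1":               # mirrored pairs, striped
        pairs = max(disks // 2, 1)
        return disks, pairs            # reads hit all disks, writes hit each pair once
    raise ValueError(f"unknown level {level!r}")

SINGLE_DISK_MBPS = 200   # assumed nominal streaming rate of one HDD

for level in ("raid0", "raid1"):
    for disks in (2, 4, 8):
        r, w = expected_multipliers(level, disks)
        print(f"{level} x{disks}: ~{r * SINGLE_DISK_MBPS} MB/s read, "
              f"~{w * SINGLE_DISK_MBPS} MB/s write")

A measured result that stays near 1x a single disk as the disk count
grows is the failure case described above: something other than the RAID
software is serializing the I/O.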
On Sat, 2020-11-14 at 07:11 -0800, John Pierce wrote:
> On Sat, Nov 14, 2020, 4:57 AM hw <hw at gc-24.de> wrote:
> 
> > On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
> > 
> > > > And where
> > > > do you get cost-efficient cards that can do JBOD?
> > > 
> > > $69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
> > 
> > That says it's for HP.  So will you still get firmware updates once
> > the warranty is expired?  Does it exclusively work with HP hardware?
> > 
> > And are these good?
> 
> That specific card is a bad choice: it's the very obsolete SAS1068E chip,
> which is SAS 1.0, with a roughly 2 TB per-disk limit.

Thanks!  That's probably why it isn't so expensive.

> Cards based on the SAS 2008, 2308, and 3008 chips are a much better choice.
> 
> Any OEM card with these chips can be flashed with generic LSI/Broadcom IT
> firmware.

I don't like the idea of flashing one.  I don't have the firmware, and I
don't know whether they can be flashed from Linux.  Aren't there any good
--- and cost-efficient --- ones that do JBOD by default, preferably
including 16-port cards with mini-SAS connectors?
On Sat, 2020-11-14 at 18:55 +0100, Simon Matter wrote:
> > On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
> > > On Nov 11, 2020, at 2:01 PM, hw <hw at gc-24.de> wrote:
> > > > I have yet to see software RAID that doesn't kill the performance.
> > > 
> > > When was the last time you tried it?
> > 
> > I'm currently using it, and the performance sucks.  Perhaps it's
> > not the software itself or the CPU but the on-board controllers
> > or other components being incapable of handling multiple disks in a
> > software RAID.  That's something I can't verify.
> > 
> > > Why would you expect that a modern 8-core Intel CPU would impede I/O
> > > in any measurable way as compared to the outdated single-core 32-bit
> > > RISC CPU typically found on hardware RAID cards?  These are the same
> > > CPUs, mind, that regularly crunch through TLS 1.3 on line-rate fiber
> > > Ethernet links, a much tougher task than mediating spinning disk I/O.
> > 
> > It doesn't matter what I expect.
> > 
> > > > And where
> > > > do you get cost-efficient cards that can do JBOD?
> > > 
> > > $69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
> > 
> > That says it's for HP.  So will you still get firmware updates once
> > the warranty is expired?  Does it exclusively work with HP hardware?
> > 
> > And are these good?
> > 
> > > Search for "LSI JBOD" for tons more options.  You may have to fiddle
> > > with the firmware to get it to stop trying to do clever RAID stuff,
> > > which lets you do smart RAID stuff like ZFS instead.
> > > 
> > > > What has HP been thinking?
> > > 
> > > That the hardware vs software RAID argument is over in 2020.
> > 
> > Do you have a reference for that, like a final statement from HP?
> > Did they stop developing RAID controllers, or do they ship their
> > servers now without them and tell customers to use btrfs or mdraid?
> 
> HPE and the other large vendors won't tell you directly because they love
> to sell you their outdated SAS/SATA RAID stuff. They were quite slow to
> introduce NVMe storage, be it as PCIe cards or U.2 format, but it's also
> clear to them that NVMe is the future and that it's used with software
> redundancy provided by MD RAID, ZFS, Btrfs etc. Just search for HPE's
> 4AA4-7186ENW.pdf file, which also mentions it.
> 
> In fact, local storage was one reason why we turned away from HPE and
> Dell after many years: we just didn't want to invest in outdated
> technology.

I'm currently running an mdadm raid-check on two RAID-1 arrays, and the
server shows two processes at 24--27% CPU each and two others at around
5%.  And you want to tell me that the CPU load is almost non-existent?

I've also consistently seen much better performance with hardware RAID
than with software RAID over the years, with ZFS having the worst
performance of anything, even with SSD caches.  It speaks for itself,
and, like I said, I have yet to see a software RAID that doesn't bring
the performance down.  Show me one that doesn't.

Are there any hardware RAID controllers designed for NVMe storage you
could use to compare software RAID with?  Are there any ZFS or btrfs
hardware controllers you could compare with?
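One way to put numbers like "24--27% CPU" into context is to watch what
data rate the check is actually sustaining at the same time.  A minimal
sketch along those lines, assuming a Linux md array named md0 (adjust the
name to whatever /proc/mdstat shows on your box; the sysfs attributes
used are the standard md ones, but verify them on your kernel):

import time

MD_DEV = "md0"   # assumed array name; see /proc/mdstat for yours

def sync_speed_kib(dev: str) -> int:
    """Current check/resync speed in KiB/s as reported by the kernel,
    or 0 if no sync action is running."""
    with open(f"/sys/block/{dev}/md/sync_speed") as f:
        value = f.read().strip()
    return int(value) if value.isdigit() else 0

def mdstat() -> str:
    """Raw /proc/mdstat; the progress bar shows up here during a check."""
    with open("/proc/mdstat") as f:
        return f.read()

for _ in range(5):
    print(f"{MD_DEV}: check/resync running at ~{sync_speed_kib(MD_DEV)} KiB/s")
    print(mdstat())
    time.sleep(2)

If the check competes too hard with production I/O, the md resync
throttles (the dev.raid.speed_limit_min/max sysctls) can cap that rate,
trading CPU and disk contention against a longer check window.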
On Sat, 2020-11-14 at 14:37 -0700, Warren Young wrote:
> On Nov 14, 2020, at 5:56 AM, hw <hw at gc-24.de> wrote:
> > On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
> > > On Nov 11, 2020, at 2:01 PM, hw <hw at gc-24.de> wrote:
> > > > I have yet to see software RAID that doesn't kill the performance.
> > > 
> > > When was the last time you tried it?
> > 
> > I'm currently using it, and the performance sucks.
> 
> Be specific.  Give chip part numbers, drivers used, whether this is
> on-board software RAID or something entirely different like LVM or MD
> RAID, etc.  For that matter, I don't even see that you've identified
> whether this is CentOS 6, 7 or 8.  (I hope it isn't older!)

I don't need to be specific because I have seen the difference in
practical usage over the last 20 years.  I'm not setting up scientific
testing environments that would cost tremendous amounts of money; I'm
using available and cost-efficient hardware and software.

> > Perhaps it's
> > not the software itself or the CPU but the on-board controllers
> > or other components being incapable of handling multiple disks in a
> > software RAID.  That's something I can't verify.
> 
> Sure you can.  Benchmark RAID-0 vs RAID-1 in 2, 4, and 8 disk arrays.

No, I can't.  I don't have tons of different CPUs, mainboards, controller
cards and electronic diagnostic equipment around to do that, and what
would you even benchmark?  Is the user telling you that the software they
are using in a VM --- stored on an NFS server and run by another server
connected to it --- is now running faster or slower?  Are you doing SQL
queries to create reports that are rarely required and take a while to
run as your benchmark?  And what is even relevant?

I am seeing that a particular piece of software running in a VM is now
running no slower, and maybe even faster, than before the failed disk was
replaced.  That means hardware RAID with 8 disks in RAID 1+0 vs. two
disks as RAID 0 each in software RAID, using otherwise the same hardware,
is not faster and even slower than the software RAID.  The CPU load on
the storage server is also higher, which in this case does not matter.
I'm happy with the result so far, and that is what matters.

If the disks were connected to the mainboard instead, the software might
be running slower.  I can't benchmark that, either, because I can't
connect the disks to the SATA ports on the board.  If there were 8 disks
in a RAID 1+0, all connected to the board, it might be a lot slower.  I
can't benchmark that; the board doesn't have so many SATA connectors.  I
only have two new disks and no additional or different hardware.

Telling me to specify particular chips and such is totally pointless.
Benchmarking is not feasible and is pointless, either way.  Sure, you can
do some kind of benchmarking in a lab if you can afford it, but how does
that correlate to the results you'll be getting in practice?  Even if you
involve users, those users will be different from the users I'm dealing
with.

> In a 2-disk array, a proper software RAID system should give 2x a single
> disk's performance for both read and write in RAID-0, but single-disk
> write performance for RAID-1.
> 
> Such values should scale reasonably as you add disks: RAID-0 over 8 disks
> gives 8x performance, RAID-1 over 8 disks gives 4x write but 8x read, etc.
> 
> These are rough numbers, but what you're looking for are failure cases
> where it's 1x a single disk for read or write.
> That tells you there's a bottleneck or serialization condition, such
> that you aren't getting the parallel I/O you should be expecting.

And?

> > > Why would you expect that a modern 8-core Intel CPU would impede I/O
> > 
> > It doesn't matter what I expect.
> 
> It *does* matter if you know what the hardware's capable of.

I can expect the hardware to do something as much as I want; it will
always only do whatever it does regardless.

> TLS is a much harder problem than XOR checksumming for traditional RAID,
> yet it imposes [approximately zero][1] performance penalty on modern
> server hardware, so if your CPU can fill a 10GE pipe with TLS, then it
> should have no problem dealing with the simpler calculations needed by
> the ~2 Gbit/sec flat-out max data rate of a typical RAID-grade 4 TB
> spinning HDD.
> 
> Even with 8 in parallel in the best case where they're all reading
> linearly, you're still within a small multiple of the Ethernet case, so
> we should still expect the software RAID stack not to become CPU-bound.
> 
> And realize that HDDs don't fall into this max data rate case often
> outside of benchmarking.  Once you start throwing ~5 ms seek times into
> the mix, the CPU's job becomes even easier.
> 
> [1]: https://stackoverflow.com/a/548042/142454

This may all be nice and good in theory.  In practice, I'm seeing up to
30% CPU during an mdraid resync for a single 2-disk array.  How much
performance impact does that indicate for "normal" operations?

> > > > And where
> > > > do you get cost-efficient cards that can do JBOD?
> > > 
> > > $69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
> > 
> > That says it's for HP.  So will you still get firmware updates once
> > the warranty is expired?  Does it exclusively work with HP hardware?
> > 
> > And are these good?
> 
> You asked for "cost-efficient," which I took to be a euphemism for
> "cheapest thing that could possibly work."

Buying crap tends not to be cost-efficient.

> If you're willing to spend money, then I fully expect you can find JBOD
> cards you'll be happy with.

Like $500+ cards?  That's not cost-efficient for my backup server, which
I run about once a month to put backups on.  If I can get one good
16-port card or two 8-port cards for max. $100, I'll consider it.
Otherwise, I can keep using the P410s, turn all disks into RAID 0 and use
btrfs.

> Personally, I get servers with enough SFF-8087 SAS connectors on them to
> address all the disks in the system.  I haven't bothered with add-on
> SATA cards in years.

How do you get all these servers?

> I use ZFS, so absolute flat-out benchmark speed isn't my primary
> consideration.  Data durability and data set features matter to me far
> more.

Well, I tried ZFS and was not happy with it, though it does have some
nice features.

> > > > What has HP been thinking?
> > > 
> > > That the hardware vs software RAID argument is over in 2020.
> > 
> > Do you have a reference for that, like a final statement from HP?
> 
> Since I'm not posting from an hpe.com email address, I think it's pretty
> obvious that that is my opinion, not an HP corporate statement.

I haven't paid attention to the email address.

> I base it on observing the Linux RAID market since the mid-90s.  The
> massive consolidation for hardware RAID is a big part of it.  That's
> what happens when a market becomes "mature," which is often the step
> just prior to "moribund."
> > Did they stop developing RAID controllers, or do they ship their
> > servers now without them
> 
> Were you under the impression that HP was trying to provide you the best
> possible technology for all possible use cases, rather than make money by
> maximizing the ratio of cash in vs cash out?
> 
> Just because they're serving it up on a plate doesn't mean you hafta
> pick up a fork.

If they had stopped making hardware RAID controllers, that would show
that they have turned away from hardware RAID, and that might be seen as
putting an end to the discussion --- *because* they are trying to make
money.  If they haven't stopped making them, that might indicate that
there is still sufficient demand for the technology, and there are
probably good reasons for that.

That different technologies have matured over time doesn't mean that
others have become bad.  Besides, always "picking up the best technology"
comes with its own disadvantages, while all technology will eventually
fail, and sometimes hardware RAID can be the "best technology".
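As a closing worked example of the bandwidth argument quoted earlier in
the thread, here is the arithmetic using the nominal figures given there
(~2 Gbit/s flat-out per RAID-grade spinning disk, a 10 GbE link that
modern CPUs can already fill with TLS); these are assumptions for
illustration, not measurements.

DISK_GBIT = 2.0        # assumed flat-out streaming rate of one RAID-grade HDD
TEN_GBE_GBIT = 10.0    # line rate of a 10 GbE link

for disks in (1, 2, 4, 8):
    total = disks * DISK_GBIT
    print(f"{disks} disk(s) streaming flat out: {total:4.1f} Gbit/s "
          f"({total / TEN_GBE_GBIT:.1f}x a 10 GbE link)")

Even eight disks reading linearly land at about 16 Gbit/s, i.e. within
the "small multiple of the Ethernet case" mentioned above; whether that
settles the hardware vs software RAID question is exactly what the thread
disagrees about.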