On 09/08/2017 09:49 AM, hw wrote:
> Mark Haney wrote:
>> I hate top posting, but since you've got two items I want to comment
>> on, I'll suck it up for now.
>
> I do, too, yet sometimes it's reasonable. I also hate it when the lines
> are too long :)

I'm afraid you'll have to live with it a bit longer. Sorry.

>> Having SSDs alone will give you great performance regardless of
>> filesystem.
>
> It depends, i.e. I can't tell how these SSDs would behave if large
> amounts of data were written and/or read to/from them over extended
> periods of time, because I haven't tested that. That isn't the
> application, anyway.

If your I/O is going to be heavy (and you've not mentioned expected
traffic, so we can only go on what little we glean from your posts), then
SSDs will likely start having issues sooner than a mechanical drive
might. (Though, YMMV.) As I've said, we process 600 million messages a
month, on primary SSDs in a VMware cluster, with mechanical storage for
older, archived user mail. "Archived" may not be exactly correct, but the
context should be clear.

>> BTRFS isn't going to impact I/O any more significantly than, say, XFS.
>
> But mdadm does, and the impact is severe. I know there are people
> saying otherwise, but I've seen the impact myself, and I definitely
> don't want it on that particular server because it would likely
> interfere with other services. I don't know whether the software RAID
> of btrfs is any better in that regard, but I'm seeing btrfs on SSDs
> being fast, and testing with the particular application has shown a
> speedup of a factor of 20-30.

I never said anything about MD RAID. I trust that about as far as I
could throw it. And having had 5 surgeries on my throwing shoulder, that
wouldn't be far.

> That is the crucial improvement. If the hardware RAID delivers that,
> I'll use that and probably remove the SSDs from the machine, as it
> wouldn't even make sense to put temporary data onto them because that
> would involve software RAID.

Again, if the idea is to have fast primary storage, there are pretty
large SSDs available now, and I've hardware-RAIDed SSDs before without
trouble, though not for any heavy lifting; that's on my test servers at
home. Without an idea of the expected mail traffic, this is all
speculation.

>> It does have serious stability/data integrity issues that XFS doesn't
>> have. There's no reason not to use SSDs for storage of immediate data
>> and mechanical drives for archival data storage.
>>
>> As for VMs, we run a huge Zimbra cluster in VMs on VPC with large
>> primary SSD volumes and even larger (and slower) secondary volumes
>> for archived mail. It's all CentOS 6 and works very well. We process
>> 600 million emails a month on that virtual cluster. All EXT4 inside
>> LVM.
>
> Do you use hardware RAID with SSDs?

We do not here where I work, but that was set up LONG before I arrived.

>> I can't tell you what to do, but it seems to me you're viewing your
>> setup from a narrow SSD/BTRFS standpoint. Lots of ways to skin that
>> cat.
>
> That's because I do not store data on a single disk, without
> redundancy, and the SSDs I have are not suitable for hardware RAID.
> So what else is there but either md-RAID or btrfs when I do not want
> to use ZFS? I also do not want to use md-RAID, hence only btrfs
> remains.
> I also like to use sub-volumes, though that isn't a requirement
> (because I can use directories instead and lose the ability to make
> snapshots).

If the SSDs you have aren't suitable for hardware RAID, then they aren't
good for production-level mail spools, IMHO. I mean, you're talking like
you're expecting a metric buttload of mail traffic, so it stands to
reason you'll need really beefy hardware. I don't think you can do what
you seem to need on budget hardware. Personally, and solely based on
this thread alone, if I was building this in-house, I'd get a decent
server cluster together and build an FC or iSCSI SAN to a Nimble storage
array with Flash/SSD front ends and large HDDs in the back end. This
solves virtually all your problems. The servers will have tiny SSD boot
drives (which I prefer over booting from the SAN), and then everything
else gets handled by the storage back-end. In effect, this is how our
mail servers are set up here. And they are virtual.

> I stay away from LVM because that just sucks. It wouldn't even have
> any advantage in this case.

LVM is a joke. It's always been something I've avoided like the plague.

>> On 09/08/2017 08:07 AM, hw wrote:
>>>
>>> PS:
>>>
>>> What kind of storage solutions do people use for cyrus mail spools?
>>> Apparently you can not use remote storage, at least not NFS. That
>>> even makes it difficult to use a VM due to limitations of available
>>> disk space.
>>>
>>> I'm reluctant to use btrfs, but there doesn't seem to be any
>>> reasonable alternative.
>>>
>>> hw wrote:
>>>> Mark Haney wrote:
>>>>> On 09/07/2017 01:57 PM, hw wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> is there anything that speaks against putting a cyrus mail spool
>>>>>> onto a btrfs subvolume?
>>>>>>
>>>>> I might be the lone voice on this, but I refuse to use btrfs for
>>>>> anything, much less a mail spool. I used it in production on DB
>>>>> and web servers and fought corruption issues and scrubs hanging
>>>>> the system more times than I can count. (This was within the last
>>>>> 24 months.) I was told by certain mailing lists that btrfs isn't
>>>>> considered production level. So, I scrapped the lot, went to XFS,
>>>>> and haven't had a problem since.
>>>>>
>>>>> I'm not sure why you'd want your mail spool on a filesystem that
>>>>> seems to hate being hammered with reads/writes. Personally, on
>>>>> all my mail spools, I use XFS or EXT4. Our servers here handle
>>>>> 600 million messages a month without trouble on those filesystems.
>>>>>
>>>>> Just my $0.02.
>>>>
>>>> Btrfs appears rather useful because the disks are SSDs, because it
>>>> allows me to create subvolumes and because it handles SSDs nicely.
>>>> Unfortunately, the SSDs are not suited for hardware RAID.
>>>>
>>>> The only alternative I know is xfs or ext4 on mdadm and no
>>>> subvolumes, and md RAID has severe performance penalties which I'm
>>>> not willing to afford.
>>>>
>>>> Part of the data I plan to store on these SSDs greatly benefits
>>>> from the low latency, making things about 20-30 times faster for
>>>> an important application.
>>>>
>>>> So what should I do?

-- 
Mark Haney
Network Engineer at NeoNova
919-460-3330 option 1
mark.haney at neonova.net
www.neonova.net
Mark Haney wrote:
> On 09/08/2017 09:49 AM, hw wrote:
>> Mark Haney wrote:
>>> I hate top posting, but since you've got two items I want to comment
>>> on, I'll suck it up for now.
>>
>> I do, too, yet sometimes it's reasonable. I also hate it when the
>> lines are too long :)
>>
> I'm afraid you'll have to live with it a bit longer. Sorry.
>
>>> Having SSDs alone will give you great performance regardless of
>>> filesystem.
>>
>> It depends, i.e. I can't tell how these SSDs would behave if large
>> amounts of data were written and/or read to/from them over extended
>> periods of time, because I haven't tested that. That isn't the
>> application, anyway.
>
> If your I/O is going to be heavy (and you've not mentioned expected
> traffic, so we can only go on what little we glean from your posts),
> then SSDs will likely start having issues sooner than a mechanical
> drive might. (Though, YMMV.) As I've said, we process 600 million
> messages a month, on primary SSDs in a VMware cluster, with mechanical
> storage for older, archived user mail. "Archived" may not be exactly
> correct, but the context should be clear.

I/O is not heavy in that sense; that's why I said that's not the
application. There is I/O which, as tests have shown, benefits greatly
from low latency, which is where the idea to use SSDs for the relevant
data has arisen from. This I/O only involves a small amount of data and
is not sustained over long periods of time. What exactly the problem is
with the application being slow with spinning disks is unknown because I
don't have the sources, and the maker of the application refuses to deal
with the problem entirely.

Since the data requiring low latency will occupy about 5% of the
available space on the SSDs, and since they are large enough to hold the
mail spool for about 10 years at its current rate of growth besides that
data, these SSDs could be well used to hold that mail spool.

>>> BTRFS isn't going to impact I/O any more significantly than, say,
>>> XFS.
>>
>> But mdadm does, and the impact is severe. I know there are people
>> saying otherwise, but I've seen the impact myself, and I definitely
>> don't want it on that particular server because it would likely
>> interfere with other services. I don't know whether the software
>> RAID of btrfs is any better in that regard, but I'm seeing btrfs on
>> SSDs being fast, and testing with the particular application has
>> shown a speedup of a factor of 20-30.
>
> I never said anything about MD RAID. I trust that about as far as I
> could throw it. And having had 5 surgeries on my throwing shoulder,
> that wouldn't be far.

How else would I create a RAID with these SSDs?

I've been using md-RAID for years, and it always worked fine.

>> That is the crucial improvement. If the hardware RAID delivers that,
>> I'll use that and probably remove the SSDs from the machine, as it
>> wouldn't even make sense to put temporary data onto them because
>> that would involve software RAID.
>
> Again, if the idea is to have fast primary storage, there are pretty
> large SSDs available now, and I've hardware-RAIDed SSDs before
> without trouble, though not for any heavy lifting; that's on my test
> servers at home. Without an idea of the expected mail traffic, this
> is all speculation.

The SSDs don't need to be large, and they aren't. They are already
greatly oversized at 512GB nominal capacity.

There's only a few hundred emails per day. There is no special
requirement for their storage, but there is a lot of free space on these
SSDs, and since the email traffic is mostly read-only, it won't wear out
the SSDs.
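The wear is also easy to keep an eye on. A minimal check with
smartmontools could look like this; the device name and the attribute
names are only examples, since vendors report wear differently:

  # overall health verdict
  smartctl -H /dev/sda
  # vendor-specific wear counters, e.g. Wear_Leveling_Count or
  # Total_LBAs_Written, depending on the maker
  smartctl -A /dev/sda | grep -Ei 'wear|written'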
It simply would make sense to put the mail spool onto these SSDs.

>>> It does have serious stability/data integrity issues that XFS
>>> doesn't have. There's no reason not to use SSDs for storage of
>>> immediate data and mechanical drives for archival data storage.
>>>
>>> As for VMs, we run a huge Zimbra cluster in VMs on VPC with large
>>> primary SSD volumes and even larger (and slower) secondary volumes
>>> for archived mail. It's all CentOS 6 and works very well. We
>>> process 600 million emails a month on that virtual cluster. All
>>> EXT4 inside LVM.
>>
>> Do you use hardware RAID with SSDs?
>
> We do not here where I work, but that was set up LONG before I
> arrived.

Probably with the very expensive SSDs suited for this ...

>>> I can't tell you what to do, but it seems to me you're viewing your
>>> setup from a narrow SSD/BTRFS standpoint. Lots of ways to skin that
>>> cat.
>>
>> That's because I do not store data on a single disk, without
>> redundancy, and the SSDs I have are not suitable for hardware RAID.
>> So what else is there but either md-RAID or btrfs when I do not want
>> to use ZFS? I also do not want to use md-RAID, hence only btrfs
>> remains. I also like to use sub-volumes, though that isn't a
>> requirement (because I can use directories instead and lose the
>> ability to make snapshots).
>
> If the SSDs you have aren't suitable for hardware RAID, then they
> aren't good for production-level mail spools, IMHO. I mean, you're
> talking like you're expecting a metric buttload of mail traffic, so
> it stands to reason you'll need really beefy hardware. I don't think
> you can do what you seem to need on budget hardware. Personally, and
> solely based on this thread alone, if I was building this in-house,
> I'd get a decent server cluster together and build an FC or iSCSI SAN
> to a Nimble storage array with Flash/SSD front ends and large HDDs in
> the back end. This solves virtually all your problems. The servers
> will have tiny SSD boot drives (which I prefer over booting from the
> SAN), and then everything else gets handled by the storage back-end.

If SSDs not suitable for RAID usage aren't suitable for production use,
then basically all SSDs not suitable for RAID usage are SSDs that can't
be used for anything that requires something less volatile than a
ramdisk. Experience with such SSDs contradicts this so far.

There is no "storage backend" but a file server which, instead of 99.95%
idling, is being assigned additional tasks, and since it is difficult to
put a cyrus mail spool on remote storage, the email server is one of
these tasks.

> In effect, this is how our mail servers are set up here. And they are
> virtual.

You have entirely different requirements.

>> I stay away from LVM because that just sucks. It wouldn't even have
>> any advantage in this case.
>
> LVM is a joke. It's always been something I've avoided like the
> plague.

I've also avoided it until I had an application where it would have been
advantageous if it actually provided the benefits it seems supposed to
provide. It turned out that it didn't and only made things much worse,
and I continue to stay away from it.

After all, you're saying it's a bad idea to use these SSDs, especially
with btrfs. I don't feel good about it, either, and I'll try to avoid
using them.
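P.S.: For completeness, the layout I had been considering would be
roughly the following, with btrfs doing the mirroring itself instead of
md underneath; device names and paths are only examples:

  # mirror both data and metadata across the two SSDs
  mkfs.btrfs -m raid1 -d raid1 /dev/sdb /dev/sdc
  mount /dev/sdb /srv

  # one subvolume per purpose, so the spool can be snapshotted alone
  btrfs subvolume create /srv/mail
  btrfs subvolume snapshot -r /srv/mail /srv/mail-snapshot-2017-09-08

That is the ability I would lose by using plain directories instead of
subvolumes.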
Mark Haney wrote:
> On 09/08/2017 09:49 AM, hw wrote:
>> Mark Haney wrote:
<snip>
>> It depends, i.e. I can't tell how these SSDs would behave if large
>> amounts of data were written and/or read to/from them over extended
>> periods of time, because I haven't tested that. That isn't the
>> application, anyway.
>
> If your I/O is going to be heavy (and you've not mentioned expected
> traffic, so we can only go on what little we glean from your posts),
> then SSDs will likely start having issues sooner than a mechanical
> drive might. (Though, YMMV.) As I've said, we process 600 million
> messages a month, on primary SSDs in a VMware cluster, with mechanical
> storage for older, archived user mail. "Archived" may not be exactly
> correct, but the context should be clear.
>
One thing to note, which I'm aware of because I was recently spec'ing
out a Dell server: Dell, at least, offers two kinds of SSDs, one for
heavy write, I think it was, and one for equal r/w. You might dig into
that.

>> But mdadm does, and the impact is severe. I know there are people
>> saying otherwise, but I've seen the impact myself, and I definitely
>> don't want it on that particular server because it would likely
>> interfere with other services. I don't know whether the software
>> RAID of btrfs is any better in that regard, but I'm seeing btrfs on
>> SSDs being fast, and testing with the particular application has
>> shown a speedup of a factor of 20-30.

Odd, we've never seen anything like that. Of course, we're not handling
the kind of mail you are... but serious scientific computing hits
storage hard, also.

> I never said anything about MD RAID. I trust that about as far as I
> could throw it. And having had 5 surgeries on my throwing shoulder,
> that wouldn't be far.

Why? We have it all over, and have never seen a problem with it. Nor
have I, personally, as I have a RAID 1 at home.
<snip>
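For anyone who hasn't set one up, a two-disk md mirror really is about
as simple as it gets; a sketch, with the device and partition names
obviously being examples:

  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
  cat /proc/mdstat          # watch the initial resync
  mdadm --detail /dev/md0   # state, members, sync status

mark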
hw wrote:
> Mark Haney wrote:
>> On 09/08/2017 09:49 AM, hw wrote:
>>> Mark Haney wrote:
<snip>
> Probably with the very expensive SSDs suited for this ...
<snip>
>>> That's because I do not store data on a single disk, without
>>> redundancy, and the SSDs I have are not suitable for hardware RAID.
<snip>

That's a biggie: are these SSDs consumer grade, or enterprise grade? It
was common knowledge 8-9 years ago that you *never* want consumer grade
in anything that mattered, other than maybe a home PC - they wear out
much sooner. But then, you can't really use consumer-grade hard drives
in a server, either. We like the NAS-rated ones, like WD Red, which are
about 1.33 times the price of consumer grade, and solid... and a lot
less than the enterprise-grade ones, which are about 3x consumer grade.

mark
m.roth at 5-cent.us wrote:
> Mark Haney wrote:
>> On 09/08/2017 09:49 AM, hw wrote:
>>> Mark Haney wrote:
> <snip>
>>>
>>> It depends, i.e. I can't tell how these SSDs would behave if large
>>> amounts of data were written and/or read to/from them over extended
>>> periods of time, because I haven't tested that. That isn't the
>>> application, anyway.
>>
>> If your I/O is going to be heavy (and you've not mentioned expected
>> traffic, so we can only go on what little we glean from your posts),
>> then SSDs will likely start having issues sooner than a mechanical
>> drive might. (Though, YMMV.) As I've said, we process 600 million
>> messages a month, on primary SSDs in a VMware cluster, with
>> mechanical storage for older, archived user mail. "Archived" may not
>> be exactly correct, but the context should be clear.
>>
> One thing to note, which I'm aware of because I was recently spec'ing
> out a Dell server: Dell, at least, offers two kinds of SSDs, one for
> heavy write, I think it was, and one for equal r/w. You might dig
> into that.
>
>>> But mdadm does, and the impact is severe. I know there are people
>>> saying otherwise, but I've seen the impact myself, and I definitely
>>> don't want it on that particular server because it would likely
>>> interfere with other services. I don't know whether the software
>>> RAID of btrfs is any better in that regard, but I'm seeing btrfs on
>>> SSDs being fast, and testing with the particular application has
>>> shown a speedup of a factor of 20-30.
>
> Odd, we've never seen anything like that. Of course, we're not
> handling the kind of mail you are... but serious scientific computing
> hits storage hard, also.
>
>> I never said anything about MD RAID. I trust that about as far as I
>> could throw it. And having had 5 surgeries on my throwing shoulder,
>> that wouldn't be far.
>
> Why? We have it all over, and have never seen a problem with it. Nor
> have I, personally, as I have a RAID 1 at home.
> <snip>

Make a test and replace a software RAID5 with a hardware RAID5. Even
with only 4 disks, you will see an overall performance gain. I'm
guessing that the SATA controllers they put onto the mainboards are not
designed to handle all the data --- which gets multiplied to all the
disks --- and that the PCI bus might get clogged. There's also the CPU
being burdened with the calculations required for the RAID, and that may
not be displayed by tools like top, so you can be fooled easily.

Graphics cards have acceleration in hardware for a reason. When was the
last time you tried to do software rendering, and what frame rates did
you get? :) Offloading the I/O to a dedicated controller gives you room
for the things you actually want to do, similar to a graphics card.
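It is easy enough to measure on your own hardware. A quick sketch with
fio, run once against the md array and once against the hardware array;
the file name, size and runtime are only examples:

  fio --name=raid-test --filename=/mnt/test/fio.dat --size=2G \
      --rw=randwrite --bs=4k --iodepth=32 --ioengine=libaio \
      --direct=1 --runtime=60 --time_based --group_reporting

While it runs on the md array, also watch the md kernel threads (e.g.
md0_raid5) in top; that is where part of the hidden cost shows up.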
On 09/08/2017 01:31 PM, hw wrote:
> Mark Haney wrote:
>
> I/O is not heavy in that sense; that's why I said that's not the
> application. There is I/O which, as tests have shown, benefits
> greatly from low latency, which is where the idea to use SSDs for the
> relevant data has arisen from. This I/O only involves a small amount
> of data and is not sustained over long periods of time. What exactly
> the problem is with the application being slow with spinning disks is
> unknown because I don't have the sources, and the maker of the
> application refuses to deal with the problem entirely.
>
> Since the data requiring low latency will occupy about 5% of the
> available space on the SSDs, and since they are large enough to hold
> the mail spool for about 10 years at its current rate of growth
> besides that data, these SSDs could be well used to hold that mail
> spool.

See, this is the kind of information that would have made this thread
far shorter. (Maybe.) The one thing that you didn't explain is whether
this application is the one /using/ the mail spool, or if you're adding
Cyrus to that system to be a mail server.

>>>> BTRFS isn't going to impact I/O any more significantly than, say,
>>>> XFS.
>>>
>>> But mdadm does, and the impact is severe. I know there are people
>>> saying otherwise, but I've seen the impact myself, and I definitely
>>> don't want it on that particular server because it would likely
>>> interfere with other services. I don't know whether the software
>>> RAID of btrfs is any better in that regard, but I'm seeing btrfs on
>>> SSDs being fast, and testing with the particular application has
>>> shown a speedup of a factor of 20-30.
>> I never said anything about MD RAID. I trust that about as far as I
>> could throw it. And having had 5 surgeries on my throwing shoulder,
>> that wouldn't be far.
>
> How else would I create a RAID with these SSDs?
>
> I've been using md-RAID for years, and it always worked fine.
>
>>> That is the crucial improvement. If the hardware RAID delivers
>>> that, I'll use that and probably remove the SSDs from the machine,
>>> as it wouldn't even make sense to put temporary data onto them
>>> because that would involve software RAID.
>> Again, if the idea is to have fast primary storage, there are pretty
>> large SSDs available now, and I've hardware-RAIDed SSDs before
>> without trouble, though not for any heavy lifting; that's on my test
>> servers at home. Without an idea of the expected mail traffic, this
>> is all speculation.
>
> The SSDs don't need to be large, and they aren't. They are already
> greatly oversized at 512GB nominal capacity.
>
> There's only a few hundred emails per day. There is no special
> requirement for their storage, but there is a lot of free space on
> these SSDs, and since the email traffic is mostly read-only, it won't
> wear out the SSDs. It simply would make sense to put the mail spool
> onto these SSDs.
>
>>>> It does have serious stability/data integrity issues that XFS
>>>> doesn't have. There's no reason not to use SSDs for storage of
>>>> immediate data and mechanical drives for archival data storage.
>>>>
>>>> As for VMs, we run a huge Zimbra cluster in VMs on VPC with large
>>>> primary SSD volumes and even larger (and slower) secondary volumes
>>>> for archived mail. It's all CentOS 6 and works very well. We
>>>> process 600 million emails a month on that virtual cluster. All
>>>> EXT4 inside LVM.
>>>
>>> Do you use hardware RAID with SSDs?
>> We do not here where I work, but that was set up LONG before I
>> arrived.
>
> Probably with the very expensive SSDs suited for this ...

Possibly, but that's somewhat irrelevant. I've taken off-the-shelf SSDs
and hardware-RAID'd them. If they work for the hell I put them through
(processing weather data), they'll work for the type of service you're
saying you have.

>> If the SSDs you have aren't suitable for hardware RAID, then they
>> aren't good for production-level mail spools, IMHO. I mean, you're
>> talking like you're expecting a metric buttload of mail traffic, so
>> it stands to reason you'll need really beefy hardware. I don't think
>> you can do what you seem to need on budget hardware. Personally, and
>> solely based on this thread alone, if I was building this in-house,
>> I'd get a decent server cluster together and build an FC or iSCSI
>> SAN to a Nimble storage array with Flash/SSD front ends and large
>> HDDs in the back end. This solves virtually all your problems. The
>> servers will have tiny SSD boot drives (which I prefer over booting
>> from the SAN), and then everything else gets handled by the storage
>> back-end.
>
> If SSDs not suitable for RAID usage aren't suitable for production
> use, then basically all SSDs not suitable for RAID usage are SSDs
> that can't be used for anything that requires something less volatile
> than a ramdisk. Experience with such SSDs contradicts this so far.

Not true at all. Maybe 5 years ago SSDs were hit or miss with hardware
RAID. Not anymore. It's just another drive to the system; the
controllers don't know the difference between a SATA HDD and a SATA SSD.
Couple that with the low volume of mail, and you should be fine for HW
RAID.

> There is no "storage backend" but a file server which, instead of
> 99.95% idling, is being assigned additional tasks, and since it is
> difficult to put a cyrus mail spool on remote storage, the email
> server is one of these tasks.

Again, you never mentioned the volume of mail expected, and your
previous threads seemed to indicate you were expecting enough to cause
issues with SSDs and BTRFS. In IT, when we get a 'my printer is broken'
ticket, we ask for more info, since that's not descriptive enough. If
this server is as asleep as you (now) make it sound, BTRFS might be
fine. Though, personally, I'd avoid it regardless.

>> In effect, this is how our mail servers are set up here. And they
>> are virtual.
>
> You have entirely different requirements.

I know that now. Previously, you made it sound like your mail flow would
be a lot closer to 'heavy' than what you've finally described. I can
only offer thoughts based on what information I'm given.

>>> I stay away from LVM because that just sucks. It wouldn't even have
>>> any advantage in this case.
>> LVM is a joke. It's always been something I've avoided like the
>> plague.
>
> I've also avoided it until I had an application where it would have
> been advantageous if it actually provided the benefits it seems
> supposed to provide. It turned out that it didn't and only made
> things much worse, and I continue to stay away from it.
>
> After all, you're saying it's a bad idea to use these SSDs,
> especially with btrfs. I don't feel good about it, either, and I'll
> try to avoid using them.

No, I'm not saying not to use your SSDs. I'm saying that BTRFS is not
worth using in any server. The SSD question, prompted by you, was
whether the SSDs could:
1) be hardware RAID'd
2) handle the load of mail you were expecting.
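And even behind a hardware controller you can still sanity-check the
individual drives; smartmontools can usually reach through it. The
controller type and the device number here are only examples, for an
LSI/MegaRAID card:

  smartctl -a -d megaraid,0 /dev/sda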
512GB SSDs are new enough to probably be HW RAID'd fine, assuming they
aren't weird ones from a third party no one has really heard of. I know
because my last company bought some inexpensive (I call them knockoffs)
third-party SSDs that were utter crap from the moment an OS was
installed on them. If yours are from Seagate, WD, or another big-name
drive maker, I would be surprised if they choked being on a hardware
RAID card. A setup like yours doesn't appear to need 'Enterprise' level
hardware; SMB hardware would work for you just as well.

Just not with BTRFS. On any drive. Ever.

-- 
Mark Haney
Network Engineer at NeoNova
919-460-3330 option 1
mark.haney at neonova.net
www.neonova.net