Chris Murphy
2017-Aug-11 17:00 UTC
[CentOS] Btrfs going forward, was: Errors on an SSD drive
Changing the subject since this is rather Btrfs specific now.

On Fri, Aug 11, 2017 at 5:41 AM, hw <hw at gc-24.de> wrote:
> Chris Murphy wrote:
>> On Wed, Aug 9, 2017, 11:55 AM Mark Haney <mark.haney at neonova.net> wrote:
>>> To be honest, I'd not try a btrfs volume on a notebook SSD. I did
>>> that on a couple of systems and it corrupted pretty quickly. I'd
>>> stick with xfs/ext4 if you manage to get the drive working again.
>>
>> Sounds like a hardware problem. Btrfs is explicitly optimized for
>> SSD; the maintainers worked for FusionIO for several years of its
>> development. If the drive is silently corrupting data, Btrfs will
>> pretty much immediately start complaining where other filesystems
>> will continue. Bad RAM can also result in scary warnings that you
>> don't get with other filesystems. And I've been using it on numerous
>> SSDs for years, and NVMe for a year, with zero problems.
>
> That's one thing I've been wondering about: when using btrfs RAID, do
> you need to somehow monitor the disks to see if one has failed?

Yes.

The block layer has no faulty-device handling, i.e. it just reports
whatever problems the device or the controller report, whereas md/mdadm
and dm/LVM have implemented policies for ejecting (setting to faulty) a
block device. Btrfs does not do that; it'll just keep trying to use a
faulty device.

So you have to set up something that monitors for physical device
errors, or Btrfs errors, or both, depending on what you want (see the
monitoring sketch after this message).

>> On CentOS though, I'd get a newer btrfs-progs RPM from Fedora, and
>> use either an elrepo.org kernel, a Fedora kernel, or build my own
>> latest long-term from kernel.org. There's just too much development
>> that's happened since the tree found in RHEL/CentOS kernels.
>
> I can't go with a more recent kernel version before NVIDIA has
> updated their drivers to no longer need fence.h (or whatever it was).
>
> And I thought stuff gets backported, especially things as important
> as file systems.

There are 1500 to 3000 lines of changes to Btrfs code per kernel
release. That's too much to backport most of it. Serious fixes do get
backported by upstream to longterm kernels, but to what degree, you
have to check the upstream changelogs to know. And right now most
backports go only to 4.4 and 4.9. And I can't tell you what
kernel-3.10.0-514.10.2.el7.x86_64.rpm translates into; that requires a
secret decoder ring near as I can tell, as it's a kernel made from
multiple branches plus a bunch of separate patches.

>> Also FWIW, Red Hat is deprecating Btrfs in the RHEL 7.4
>> announcement. Support will probably be removed in RHEL 8. I have no
>> idea how it'll affect CentOS kernels, though. It will remain in
>> Fedora kernels.
>
> That would suck badly, to the point at which I'd have to look for yet
> another distribution. The only one remaining is Arch.
>
> What do they suggest as a replacement? The only other FS that comes
> close is ZFS, and removing btrfs altogether would be taking living in
> the past too many steps too far.

Red Hat is working on a new user-space wrapper and volume format based
on md, device mapper, LVM, and XFS:
http://stratis-storage.github.io/
https://stratis-storage.github.io/StratisSoftwareDesign.pdf

It's an aggressive development schedule, and as so much of it is
journaling- and CoW-based, I have no way to assess whether it ends up
with its own set of problems, not dissimilar to Btrfs. We'll just have
to see.
But if there are underlying guts in device-mapper that do things
better/faster/easier than Btrfs, the Btrfs devs have said they can hook
into device-mapper for those things to consolidate code base, in
particular for the multiple-device handling.

By its own vague timetable it will be years before Stratis has "rough
ZFS features," and, again estimating bootloader support and the degree
to which other distros pick up on it, it could very well end up being
widely adopted, or it could be a Red Hat-only thing in practice.

Canonical appears to be charging ahead with OpenZFS included by default
out of the box (although not for rootfs yet, I guess), and that has an
open-ended and possibly long window before the legal issues get tested.
But it is by far the most cross-platform solution: FreeBSD, Illumos,
Linux, macOS. And ZoL has RHEL/CentOS-specific packages. But I can't
tell you for sure what ZoL's faulty-device behavior is either: whether
it ejects faulty or flaky devices and when, or whether, like Btrfs, it
just tolerates them.

The elrepo.org folks can still sanely set CONFIG_BTRFS_FS=m, but I
suspect that if RHEL unsets that in RHEL 8 kernels, CentOS will do the
same.

-- 
Chris Murphy
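[For the monitoring question above, a minimal cron-style sketch of what
that can look like; the script name, mount point, and mail address are
illustrative placeholders, not a recommendation of a specific tool:]

    #!/bin/bash
    # btrfs-errmon.sh (hypothetical name): mail root if any Btrfs
    # device error counter on $MNT is non-zero. Run from cron or a
    # systemd timer. /srv/data and root@localhost are placeholders.
    MNT=/srv/data
    ERRS=$(btrfs device stats "$MNT" | awk '$2 != 0')
    if [ -n "$ERRS" ]; then
        echo "$ERRS" | mail -s "btrfs device errors on $MNT" root@localhost
    fi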
Mark Haney
2017-Aug-11 17:17 UTC
[CentOS] Btrfs going forward, was: Errors on an SSD drive
On Fri, Aug 11, 2017 at 1:00 PM, Chris Murphy <lists at colorremedies.com> wrote:
> Changing the subject since this is rather Btrfs specific now.
>
> >> Sounds like a hardware problem. Btrfs is explicitly optimized for
> >> SSD; the maintainers worked for FusionIO for several years of its
> >> development. If the drive is silently corrupting data, Btrfs will
> >> pretty much immediately start complaining where other filesystems
> >> will continue. Bad RAM can also result in scary warnings that you
> >> don't get with other filesystems. And I've been using it on
> >> numerous SSDs for years and NVMe for a year with zero problems.
>
> LMFAO. Trust me, I tried several SSDs with BTRFS over the last couple
> of years and had trouble the entire time. I constantly had to scrub
> the drive, had freezes under moderate load, and general nastiness. If
> that's 'optimized for SSDs', then something is very wrong with the
> definition of optimized. Not to mention the fact that BTRFS is not
> production ready for anything, and I'm done trying to use it and
> going with XFS or EXT4 depending on my need.

As for a hardware problem, the drives were ones purchased in Lenovo
professional workstation laptops, and, while you do get lemons
occasionally, I tried 4 different ones of the exact same model and had
the exact same issues. It's highly unlikely I'd get 4 of the same brand
with hardware issues. Once I went back to ext4 on those systems I could
run the devil out of them and not see any freezes under even heavy
load, nor any other hardware-related problems. In fact, the one I used
at my last job was given to me on my way out and it's now being used by
my daughter. It's been upgraded from Fedora 23 to 26 without a hitch.
On ext4. Say what you want, BTRFS is a very bad filesystem in my
experience.

-- 
Mark Haney
Network Engineer at NeoNova
919-460-3330 (opt 1)
mark.haney at neonova.net
www.neonova.net
Chris Murphy wrote:
> Changing the subject since this is rather Btrfs specific now.
>
> On Fri, Aug 11, 2017 at 5:41 AM, hw <hw at gc-24.de> wrote:
>> Chris Murphy wrote:
>>> On Wed, Aug 9, 2017, 11:55 AM Mark Haney <mark.haney at neonova.net> wrote:
>>>> To be honest, I'd not try a btrfs volume on a notebook SSD. I did
>>>> that on a couple of systems and it corrupted pretty quickly. I'd
>>>> stick with xfs/ext4 if you manage to get the drive working again.
>>>
>>> Sounds like a hardware problem. Btrfs is explicitly optimized for
>>> SSD; the maintainers worked for FusionIO for several years of its
>>> development. If the drive is silently corrupting data, Btrfs will
>>> pretty much immediately start complaining where other filesystems
>>> will continue. Bad RAM can also result in scary warnings that you
>>> don't get with other filesystems. And I've been using it on
>>> numerous SSDs for years and NVMe for a year with zero problems.
>>
>> That's one thing I've been wondering about: when using btrfs RAID,
>> do you need to somehow monitor the disks to see if one has failed?
>
> Yes.
>
> The block layer has no faulty-device handling, i.e. it just reports
> whatever problems the device or the controller report, whereas
> md/mdadm and dm/LVM have implemented policies for ejecting (setting
> to faulty) a block device. Btrfs does not do that; it'll just keep
> trying to use a faulty device.
>
> So you have to set up something that monitors for physical device
> errors, or Btrfs errors, or both, depending on what you want.

I want to know when a drive has failed. How can I monitor that? I've
begun to use btrfs only recently.

>>> On CentOS though, I'd get a newer btrfs-progs RPM from Fedora, and
>>> use either an elrepo.org kernel, a Fedora kernel, or build my own
>>> latest long-term from kernel.org. There's just too much development
>>> that's happened since the tree found in RHEL/CentOS kernels.
>>
>> I can't go with a more recent kernel version before NVIDIA has
>> updated their drivers to no longer need fence.h (or whatever it was).
>>
>> And I thought stuff gets backported, especially things as important
>> as file systems.
>
> There are 1500 to 3000 lines of changes to Btrfs code per kernel
> release. That's too much to backport most of it. Serious fixes do
> get backported by upstream to longterm kernels, but to what degree,
> you have to check the upstream changelogs to know.
>
> And right now most backports go only to 4.4 and 4.9. And I can't
> tell you what kernel-3.10.0-514.10.2.el7.x86_64.rpm translates into;
> that requires a secret decoder ring near as I can tell, as it's a
> kernel made from multiple branches plus a bunch of separate patches.

So these kernels are a mess. What's the point of backports when they
aren't done correctly? This puts a big "stay away from" stamp on
RHEL/CentOS.

>>> Also FWIW, Red Hat is deprecating Btrfs in the RHEL 7.4
>>> announcement. Support will probably be removed in RHEL 8. I have
>>> no idea how it'll affect CentOS kernels, though. It will remain in
>>> Fedora kernels.
>>
>> That would suck badly, to the point at which I'd have to look for
>> yet another distribution. The only one remaining is Arch.
>>
>> What do they suggest as a replacement? The only other FS that comes
>> close is ZFS, and removing btrfs altogether would be taking living
>> in the past too many steps too far.
> Red Hat is working on a new user-space wrapper and volume format
> based on md, device mapper, LVM, and XFS:
> http://stratis-storage.github.io/
> https://stratis-storage.github.io/StratisSoftwareDesign.pdf
>
> It's an aggressive development schedule, and as so much of it is
> journaling- and CoW-based, I have no way to assess whether it ends up

So in another 15 or 20 years, some kind of RH file system might become
usable. I'd say they need to wake up, because the need for the features
provided by ZFS and btrfs has existed for years. Even their current XFS
implementation is flawed, because there is no way to install onto an
XFS that is adjusted to the geometry of the hardware RAID it is created
on, as it is supposed to be.

> [...]
> tested. But this is by far the most cross-platform solution: FreeBSD,
> Illumos, Linux, macOS. And ZoL has RHEL/CentOS-specific packages.

That can be an advantage. What is the state of ZFS for CentOS? I'm
going to need it because I have data on some disks that were used for
ZFS and now need to be read by a machine running CentOS.

Does it require a particular kernel version?

> But I can't tell you for sure what ZoL's faulty-device behavior is
> either, whether it ejects faulty or flaky devices and when, or if,
> like Btrfs, it just tolerates them.

You can monitor the disks and see when one has failed.

> The elrepo.org folks can still sanely set CONFIG_BTRFS_FS=m, but I
> suspect if RHEL unsets that in RHEL 8 kernels, that CentOS will do
> the same.

Sanely? With the kernel being such a mess?
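[For what it's worth, reading an existing pool on CentOS 7 looks
roughly like the following, assuming the ZFS on Linux repository
packages; the release-RPM URL and the pool name 'tank' are illustrative
placeholders, so check zfsonlinux.org for the current ones:]

    # install the ZoL repo and the DKMS-built module (URL is an example)
    yum install http://download.zfsonlinux.org/epel/zfs-release.el7_4.noarch.rpm
    yum install kernel-devel zfs

    # load the module, then look for the old pool on the attached disks
    modprobe zfs
    zpool import          # with no argument: lists importable pools
    zpool import tank     # 'tank' is a placeholder pool name
    zfs list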
Mark Haney wrote:
> On Fri, Aug 11, 2017 at 1:00 PM, Chris Murphy <lists at colorremedies.com> wrote:
>> Changing the subject since this is rather Btrfs specific now.
>>
>> >> Sounds like a hardware problem. Btrfs is explicitly optimized
>> >> for SSD; the maintainers worked for FusionIO for several years
>> >> of its development. [...]
>>
>> LMFAO. Trust me, I tried several SSDs with BTRFS over the last
>> couple of years and had trouble the entire time. I constantly had
>> to scrub the drive, had freezes under moderate load, and general
>> nastiness. If that's 'optimized for SSDs', then something is very
>> wrong with the definition of optimized. Not to mention the fact
>> that BTRFS is not production ready for anything, and I'm done
>> trying to use it and going with XFS or EXT4 depending on my need.
>
> As for a hardware problem, the drives were ones purchased in Lenovo
> professional workstation laptops, and, while you do get lemons
> occasionally, I tried 4 different ones of the exact same model and
> had the exact same issues. It's highly unlikely I'd get 4 of the
> same brand with hardware issues. Once I went back to ext4 on those
> systems I could run the devil out of them and not see any freezes
> under even heavy load, nor any other hardware-related problems. In
> fact, the one I used at my last job was given to me on my way out
> and it's now being used by my daughter. It's been upgraded from
> Fedora 23 to 26 without a hitch. On ext4. Say what you want, BTRFS
> is a very bad filesystem in my experience.

What's the alternative? Hardware RAID with SSDs not particularly
designed for that application is a bad idea. Software RAID with mdadm
is a bad idea because it comes with quite some performance loss. ZFS is
troublesome because it's not as well integrated as we might wish,
booting from a ZFS volume gives you even more trouble, and it is rather
noticeable that ZFS wasn't designed with performance in mind.

That doesn't even mention features like checksumming, deduplication,
compression, and the creation of subvolumes (or their equivalent). It
also doesn't mention that LVM is a catastrophe.

I could use hardware RAID, but neither XFS nor ext4 offers the required
features. So what should I use instead of btrfs or ZFS? I went with
btrfs because it's less troublesome than ZFS and provides features for
which I don't know any good alternative. So far, it's working fine, but
I'd rather switch now than experience disaster.
Warren Young
2017-Aug-11 18:12 UTC
[CentOS] Btrfs going forward, was: Errors on an SSD drive
On Aug 11, 2017, at 11:00 AM, Chris Murphy <lists at colorremedies.com> wrote:
> On Fri, Aug 11, 2017 at 5:41 AM, hw <hw at gc-24.de> wrote:
>> That's one thing I've been wondering about: when using btrfs RAID,
>> do you need to somehow monitor the disks to see if one has failed?
>
> Yes.
>
> The block layer has no faulty device handling

That is one of the open questions about Stratis: should its stratisd
act in the place of smartd? Vote and comment on its GitHub issue here:

https://github.com/stratis-storage/stratisd/issues/72

I'm in favor of it. The daemon had to be there anyway, it makes sense
to push SMART failure indicators up through the block layer into the
volume manager layer so it can react intelligently to the failure, and
FreeBSD's ZFS is getting such a daemon soon, so we want one, too:

https://www.phoronix.com/scan.php?page=news_item&px=ZFSD-For-FreeBSD

>>> Also FWIW, Red Hat is deprecating Btrfs in the RHEL 7.4
>>> announcement. Support will probably be removed in RHEL 8. I have
>>> no idea how it'll affect CentOS kernels, though. It will remain in
>>> Fedora kernels.

I rather doubt btrfs will be compiled out of the kernel in EL8, and
even if it is, it'll probably be in the CentOSPlus kernels.

What you *won't* get from Red Hat is the ability to install EL8 onto a
btrfs volume from within Anaconda, the btrfs tools won't be installed
by default, and if you have a Red Hat subscription, they won't be all
that willing to help you with btrfs-related problems.

But will you be able to install EL8 onto an existing XFS-formatted boot
volume and mount your old btrfs data volume? I guess "yes." I suspect
you'll even be able to manually create new btrfs data volumes in EL8.

>> That would suck badly, to the point at which I'd have to look for
>> yet another distribution. The only one remaining is Arch.

openSUSE defaults to btrfs on root, though XFS on /home for some
reason: https://goo.gl/Hiuzbu

>> What do they suggest as a replacement?

Stratis: https://stratis-storage.github.io/StratisSoftwareDesign.pdf

The main downside to Stratis I see is that it looks like 1.0 is
scheduled to coincide with RHEL 8, based on the release dates of RHELs
past, which means it won't have any kind of redundant storage options
to begin with, not even RAID-1, the only meaningful RAID level when it
comes to comparing against btrfs.

The claim is that "enterprise" users don't want software RAID anyway,
so they don't need to provide it in whatever version of Stratis ships
with EL 8. I think my reply to that holds true for many of us CentOS
users:

https://github.com/stratis-storage/stratis-docs/issues/54

Ah well, my company has historically been skipping even-numbered RHEL
releases anyway, due to lack of compelling reasons to migrate from the
prior odd-numbered release still being supported. Maybe Stratis will be
ready for prime time by the time EL9 ships.

>> removing btrfs altogether would be taking living in the past too
>> many steps too far.

The Red Hat/Fedora developers are well aware that they started out ~7
years behind when they pushed btrfs forward as a "technology preview"
with RHEL 6, and are now more like 12 years behind the ZFS world after
waiting in vain for btrfs to catch up.

Basically, Stratis is their plan to catch up on the cheap, building
atop existing, tested infrastructure already in Linux.

My biggest worry is that because it's not integrated top-to-bottom like
ZFS is, they'll miss out on some of the key advantages you have with
ZFS.
I'm all for making the current near-manual LVM2 + MD + DM + XFS lash-up
more integrated and automated, even if it's just a pretty face in front
of those same components. The question is how well that interface
mimics the end-user experience of ZFS, which in my mind still provides
the best CLI experience, even if you compare only the features they
share in common. btrfs' tools are close, but I guess the correct
command much more often with ZFS' tools.

That latter is an explicit goal of the Stratis project. They know that
filesystem maintenance is not a daily task for most of us, so we tend
to forget commands, since we haven't used them in months. It is a major
feature of a filesystem to have commands you can guess correctly based
on fuzzy memories of having used them once months ago.

> Canonical appears to be charging ahead with OpenZFS included by
> default out of the box (although not for rootfs yet I guess)

Correct. ZFS-on-root-on-Ubuntu is still an unholy mess:

https://github.com/zfsonlinux/zfs/wiki/Ubuntu

> I can't tell you for sure what ZoL's faulty device behavior is
> either, whether it ejects faulty or flaky devices and when, or if,
> like Btrfs, it just tolerates them.

Lacking something like zfsd, I'd guess it just tolerates it, and that
you need to pair it with smartd to have notification of failing
devices. You could script that to have automatic spare replacement. Or,
port FreeBSD's zfsd over.
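[A minimal sketch of that smartd pairing, assuming smartmontools is
installed; the config path varies by distro, and the mail address is an
illustrative placeholder:]

    # /etc/smartmontools/smartd.conf
    # Scan all disks; check overall health, the error and self-test
    # logs, and failing usage attributes; mail on any of them.
    DEVICESCAN -H -l error -l selftest -f -m admin@example.com

    # then enable the daemon:
    #   systemctl enable --now smartd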
Warren Young wrote:
> [...]
>>> What do they suggest as a replacement?
>
> Stratis: https://stratis-storage.github.io/StratisSoftwareDesign.pdf

Can I use that now?

> The main downside to Stratis I see is that it looks like 1.0 is
> scheduled to coincide with RHEL 8, based on the release dates of
> RHELs past, which means it won't have any kind of redundant storage
> options to begin with, not even RAID-1, the only meaningful RAID
> level when it comes to comparing against btrfs.

Redundancy is required.

How do you install onto an XFS that is adjusted to the stripe size and
the number of units when using hardware RAID? I tried that, without
success (a sketch of the relevant mkfs.xfs knobs follows this message).

What if you want to use SSDs to install the system on? That usually
puts hardware RAID out of the question.

> The claim is that "enterprise" users don't want software RAID
> anyway, so they don't need to provide it in whatever version of
> Stratis ships with EL 8. I think my reply to that holds true for
> many of us CentOS users:

That leaves them unable to overcome the disadvantages of hardware RAID.
I don't want the performance penalty MD brings about, even as a home
user. The same goes for ZFS. I can't tell yet how the penalty looks
with btrfs, only that I haven't noticed any yet.

And that brings back the question why nobody makes a hardware ZFS
controller. Enterprise users would probably love that, provided the
performance issues could be resolved.

> [...]
> I'm all for making the current near-manual LVM2 + MD + DM + XFS
> lash-up more integrated and automated,

I'm more for getting rid of it. Just try to copy an LV into another VG,
especially when the VG resides on different devices. Or try to make a
snapshot in another VG because the devices the source of the snapshot
resides on don't have enough free space. LVM lacks so much flexibility
that it is more a cumbersome burden than anything else.

I lost a whole VM when I tried to copy it, thanks to LVM. It was so
complicated that the LV somehow vanished, and I still don't know what
happened. No more LVM.

> [...]
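[For reference, the mkfs.xfs geometry knobs in question; a sketch
assuming a hardware RAID with six data disks and a 256 KiB per-disk
stripe unit, where the device name, mount point, and numbers are
illustrative placeholders:]

    # su = per-disk stripe unit, sw = number of *data* disks in the
    # array (e.g. an 8-disk RAID-6 has sw=6)
    mkfs.xfs -d su=256k,sw=6 /dev/sda

    # verify the geometry it picked up; sunit/swidth are reported in
    # filesystem blocks
    xfs_info /mnt/point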
Chris Murphy
2017-Aug-11 22:14 UTC
[CentOS] Btrfs going forward, was: Errors on an SSD drive
On Fri, Aug 11, 2017 at 11:17 AM, Mark Haney <mark.haney at neonova.net> wrote:
> On Fri, Aug 11, 2017 at 1:00 PM, Chris Murphy <lists at colorremedies.com> wrote:
>> Changing the subject since this is rather Btrfs specific now.
>>
>> >> Sounds like a hardware problem. Btrfs is explicitly optimized
>> >> for SSD; the maintainers worked for FusionIO for several years
>> >> of its development. [...]
>>
>> LMFAO. Trust me, I tried several SSDs with BTRFS over the last
>> couple of years and had trouble the entire time. I constantly had
>> to scrub the drive, had freezes under moderate load, and general
>> nastiness. If that's 'optimized for SSDs', then something is very
>> wrong with the definition of optimized. Not to mention the fact
>> that BTRFS is not production ready for anything, and I'm done
>> trying to use it and going with XFS or EXT4 depending on my need.

Could you get your quoting in proper order? The way you did this looks
like I wrote the above steaming pile rant.

Whoever did write it, it's ridiculous, meaning it's worthy of ridicule.
From the provably unscientific and non-technical, to craptasticly
snotty writing "not to mention the fact" and then proceeding to mention
it. That's just being an idiot, and then framing it.

Where are your bug reports? That question is a trap if you haven't in
fact filed any bugs, in particular upstream.

> As for a hardware problem, the drives were ones purchased in Lenovo
> professional workstation laptops, and, while you do get lemons
> occasionally, I tried 4 different ones of the exact same model and
> had the exact same issues. It's highly unlikely I'd get 4 of the
> same brand to have hardware issues.

In fact it's highly likely, because a.) it's a non-scientific sample
and b.) the hardware is intentionally identical. For SSDs, all the
sauce is in the firmware. If the model and firmware were all the same,
it is more likely to be a firmware bug than a Btrfs bug.

There are absolutely cases where Btrfs runs into problems that other
file systems don't, because Btrfs is designed to detect them and others
aren't. There's a reason why XFS and ext4 have added metadata
checksumming in recent versions. Hardware lies. Firmware has bugs, and
it causes problems. And it can be months before they materialize into a
noticeable problem.
https://lwn.net/Articles/698090/

Btrfs tends to complain early and often when it encounters confusion.
It will also go read-only sooner than other file systems, in order to
avoid corrupting the file system. Almost always a normal mount will
automatically fall back to the most recent consistent state. Sometimes
it needs to be mounted with the -o usebackuproot option. And in still
fewer cases it will need to be mounted read-only, where other file
systems won't even tolerate that in the same situation.

The top two complaints I have about Btrfs are: a.)
what to do when a normal mount doesn't work. It's really non-obvious
what you *should* do, and in what order, because there are many
specialized tools for different problems; so if your file system
doesn't mount normally, you are really best off going straight to the
upstream list and asking for help, which is sorta shitty, but that's
the reality. And b.) there are still some minority workloads where
users have to micromanage the file system with a filtered balance to
avoid a particular variety of bogus enospc (see the sketch after this
message). Most of the enospc problems were fixed with changes in
kernels 4.1 and 4.8. The upstream expert users are discussing some sort
of one-size-fits-all user-space filtered (meaning partial) balance so
regular users don't have to micromanage. It's a completely legitimate
complaint that having to micromanage a file system is b.s. This has
been a particularly difficult problem, and it's been around long enough
that I think a lot of normal workloads that would have run into
problems have been masked (no problem reported) because so many users
have gotten into the arguably bad habit of doing their own filtered
balances.

But as for Btrfs having some inherent flaw that results in corrupt file
systems, it's silly. There are thousands of users in many production
workloads using this file system, and they'd have given up a long time
ago, including myself.

> Once I went back to ext4 on those systems I could run the devil out
> of them and not see any freezes under even heavy load, nor any other
> hardware-related items. In fact, the one I used at my last job was
> given to me on my way out and it's now being used by my daughter.
> It's been upgraded from Fedora 23 to 26 without a hitch. On ext4.
> Say what you want, BTRFS is a very bad filesystem in my experience.

Read this.
https://www.spinics.net/lists/linux-btrfs/msg67308.html

If there were some inherent problem with Btrfs and SSDs, as you've
asserted, that wouldn't be possible. And that's an example with quota
support enabled; that's my big surprise. There are some performance
implications with Btrfs quotas, and it's a relatively new feature, but
that is a very good report.

-- 
Chris Murphy
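[For reference, the micromanagement being described looks something
like the following sketch; the mount point, device, and usage
thresholds are illustrative placeholders:]

    # reclaim mostly-empty data/metadata chunks to head off bogus
    # enospc; raise the threshold gradually if nothing is reclaimed
    btrfs balance start -dusage=25 /mnt/data
    btrfs balance start -dusage=50 -musage=50 /mnt/data

    # and when a normal mount fails, one of the first things to try:
    mount -o usebackuproot /dev/sdb1 /mnt/data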
Chris Murphy
2017-Aug-11 22:45 UTC
[CentOS] Btrfs going forward, was: Errors on an SSD drive
On Fri, Aug 11, 2017 at 11:37 AM, hw <hw at gc-24.de> wrote:
> I want to know when a drive has failed. How can I monitor that? I've
> begun to use btrfs only recently.

Maybe check out epylog and have it monitor for Btrfs kernel messages.
That's your earliest warning, because Btrfs will complain on any csum
mismatch even if the hardware is not reporting problems. For impending
drive failures, your best bet is still smartd, even though the stats
say it only predicts drive failures maybe 60% of the time.

> Chris Murphy wrote:
>> There are 1500 to 3000 lines of changes to Btrfs code per kernel
>> release. That's too much to backport most of it. Serious fixes do
>> get backported by upstream to longterm kernels, but to what degree,
>> you have to check the upstream changelogs to know.
>>
>> And right now most backports go only to 4.4 and 4.9. And I can't
>> tell you what kernel-3.10.0-514.10.2.el7.x86_64.rpm translates
>> into; that requires a secret decoder ring near as I can tell, as
>> it's a kernel made from multiple branches plus a bunch of separate
>> patches.
>
> So these kernels are a mess. What's the point of backports when they
> aren't done correctly?

*sigh*

Can we try to act rationally instead of emotionally? Backporting is
fucking hard. Have you bothered to look at kernel code and how
backporting is done? Or do you just assume that it's like microwaving a
hot pocket or something trivial?

If it were easy, it would be automated. It's not easy. A human has to
look at the new code, new fixes for old problems, and graft them onto
old ways of doing things, and very often the new code does not apply
cleanly to old kernels. It's just a fact. And then that person has to
come up with a fix using the old methods. That's a backport.

It is only messy to an outside observer, which includes me. The people
doing the work at Red Hat very clearly understand it; the whole point
is to have a thoroughly understood, stable, conservative kernel.
They're very picky about taking on new features, which tend to include
new regressions.

> This puts a big "stay away from" stamp on RHEL/CentOS.

You have to pick your battles, is what it comes down to. It is
completely legitimate to use CentOS for stability elsewhere, and use a
nearly upstream kernel from elrepo.org or Fedora. Offhand I'm not sure
who is building CentOS-compatible kernel packages based on upstream
longterm. A really good compromise right now is the 4.9 series, so if
someone has a 4.9.42 kernel somewhere, that'd be neat. It's not
difficult to build yourself either, for that matter. I can't advise you
on the Nvidia stuff, though.

> Chris Murphy wrote:
>> Red Hat is working on a new user-space wrapper and volume format
>> based on md, device mapper, LVM, and XFS:
>> http://stratis-storage.github.io/
>> https://stratis-storage.github.io/StratisSoftwareDesign.pdf
>>
>> It's an aggressive development schedule, and as so much of it is
>> journaling- and CoW-based, I have no way to assess whether it ends
>> up
>
> So in another 15 or 20 years, some kind of RH file system might
> become usable.

Lovely, more hyperbole... Read the document. It talks about an initial
production-quality release in the first half of next year. It admits
they're behind, *and* it also says they can't wait 10 more years. So
maybe 3? Maybe 5? I have no idea. File systems are hard. Backups are
good.

> Chris Murphy wrote:
>> tested. But this is by far the most cross-platform solution:
>> FreeBSD, Illumos, Linux, macOS. And ZoL has RHEL/CentOS-specific
>> packages.
>
> That can be an advantage.
> What is the state of ZFS for CentOS? I'm going to need it because I
> have data on some disks that were used for ZFS and now need to be
> read by a machine running CentOS.
>
> Does it require a particular kernel version?

Well, not to be a jerk, but RTFM: http://zfsonlinux.org/

It's like - I can't answer your question without reading it myself. So
there you go. I think it's DKMS based, so it has some kernel
dependencies, but I think it's quite a bit more tolerant of different
kernel versions while maintaining the same relative ZFS feature/bug set
for a particular release - basically it's decoupled from Linux.

>> But I can't tell you for sure what ZoL's faulty-device behavior is
>> either, whether it ejects faulty or flaky devices and when, or if,
>> like Btrfs, it just tolerates them.
>
> You can monitor the disks and see when one has failed.

That doesn't tell me anything about how it differs from anything else.
mdadm offers email notifications as an option; LVM has its own
notification system I haven't really looked at, but I don't think it
includes email notifications; smartd can do emails but also dumps
standard messages to dmesg (a sketch of the mdadm notification setup
follows this message).

>> The elrepo.org folks can still sanely set CONFIG_BTRFS_FS=m, but I
>> suspect if RHEL unsets that in RHEL 8 kernels, that CentOS will do
>> the same.
>
> Sanely? With the kernel being such a mess?

I don't speak for elrepo; I have no idea how their config options
differ from RHEL or CentOS. But I do know elrepo offers stable upstream
kernels very soon after kernel.org posts them. It seems completely
reasonable to me for them to include the Btrfs module. If there's a big
regression that bites people in the ass, you can rest assured you will
not be the only person pissed off.

Btrfs has been really good about few regressions in the kernel for a
few years now. The maintainers run a bunch of the riskier patches for
months, and sometimes even once they're in the mainline kernel they
aren't the default (for example, the v2 space cache has been in the
kernel since 4.5 but is still not the default in 4.13).

-- 
Chris Murphy
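[For completeness, the mdadm notification option mentioned above is a
one-liner in the config plus a monitor process; the mail address is an
illustrative placeholder:]

    # /etc/mdadm.conf -- where alerts from mdadm --monitor get mailed
    MAILADDR admin@example.com

    # run the monitor (CentOS starts this as the mdmonitor service)
    mdadm --monitor --scan --daemonise

    # send a test alert for each array to confirm mail delivery
    mdadm --monitor --scan --oneshot --test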
Chris Murphy
2017-Aug-12 00:20 UTC
[CentOS] Btrfs going forward, was: Errors on an SSD drive
On Fri, Aug 11, 2017 at 12:12 PM, Warren Young <warren at etr-usa.com> wrote:
> I rather doubt btrfs will be compiled out of the kernel in EL8, and
> even if it is, it'll probably be in the CentOSPlus kernels.

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/7.4_Release_Notes/chap-Red_Hat_Enterprise_Linux-7.4_Release_Notes-Deprecated_Functionality.html

"Red Hat will not be moving Btrfs to a fully supported feature and it
will be removed in a future major release of Red Hat Enterprise Linux."

Unless the EL8 kernel is based on a longterm kernel regularly
maintained by Btrfs upstream, 4.9 at the oldest, it's way too much work
to backport Btrfs fixes to older kernels. Red Hat is clearly saying
they're removing it. And I expect CentOS won't have the resources to do
this work either: either they get it for free from upstream by using a
longterm kernel, or it gets removed. But we'll just have to see, I
guess. All I can say is that Fedora is keeping it and still has high
hopes for Btrfs users, experts, contributors, developers, etc. That's
in Fedora's interest.

> But will you be able to install EL8 onto an existing XFS-formatted
> boot volume and mount your old btrfs data volume? I guess "yes."

I guess no, based on the word "removed".

>>> removing btrfs altogether would be taking living in the past too
>>> many steps too far.
>
> The Red Hat/Fedora developers are well aware that they started out
> ~7 years behind when they pushed btrfs forward as a "technology
> preview" with RHEL 6, and are now more like 12 years behind the ZFS
> world after waiting in vain for btrfs to catch up.
>
> Basically, Stratis is their plan to catch up on the cheap, building
> atop existing, tested infrastructure already in Linux.

Well, they have no upstream Btrfs developers. SUSE has around a dozen.
Big difference. And then Red Hat has probably at least a dozen
developers involved in md, device-mapper, LVM, and XFS. So it makes
sense: if they are hedging their bets, they're going to go with what
they already have resources in.

Also, I see it as tacit acknowledgement that Btrfs is stable enough
that it's silly to call it a technology preview, but not so stable that
they can just assume it will work on its own without having
knowledgeable support staff and developers on hand, which they don't.
So it seems to me they pretty much had to cut off Btrfs, but I don't
know if this was a technical decision, a recruiting problem, or some
combination.

> My biggest worry is that because it's not integrated top-to-bottom
> like ZFS is, they'll miss out on some of the key advantages you have
> with ZFS.
>
> I'm all for making the current near-manual LVM2 + MD + DM + XFS
> lash-up more integrated and automated, even if it's just a pretty
> face in front of those same components. The question is how well
> that interface mimics the end-user experience of ZFS, which in my
> mind still provides the best CLI experience, even if you compare
> only the features they share in common. btrfs' tools are close, but
> I guess the correct command much more often with ZFS' tools.

It always comes down to edge cases. The XFS folks are adding CoW for
reflinks, which Btrfs has had stable for years, practically from the
very beginning, and they have had plenty of problems with mainly one
developer working on it; it's not a default feature yet.
LVM thin provisioning likewise uses CoW, and they've had plenty of
problems with fragmentation that are, from the user's perspective, not
all that different from when Btrfs experiences massive free-space
fragmentation. So... these are really hard problems to fix, and I think
people really do still underestimate the brilliance of the ZFS
development team and the hands-free approach they were given for
development. And then I also think we don't recognize how lucky we are
in free software to have so many options to address myriad workloads
and use cases. Yes, it's tedious to have so many choices rather than a
magic unicorn file system that's one size fits all. But I think we're
more lucky than we are cursed.

> That latter is an explicit goal of the Stratis project. They know
> that filesystem maintenance is not a daily task for most of us, so
> that we tend to forget commands, since we haven't used them in
> months. It is a major feature of a filesystem to have commands you
> can guess correctly based on fuzzy memories of having used them once
> months ago.

Yeah, and frankly a really constrained feature set that doesn't try to
account for literally everything. LVM's fatal flaw, in my view, is, as
an Anaconda dev once put it, that it's emacs for storage. It's a
bottomless rabbit hole. Badass, but it's just a massive "choose your
own adventure" book.

Btrfs isn't magic, but it does do a lot of things really right in terms
of UX (see the sketch after this message):

'btrfs device add'
'btrfs device delete'

That does the file system resize, including moving extents if necessary
in the delete case, and it removes the device from the volume/array and
wipes the signature from the device. Resize is always online, atomic,
and in theory crash-proof.

There's also the seldom-discussed seed device / overlay feature, useful
for live media, which is substantially simpler to implement and
understand compared to dm - and it's also much more reliable. The dm
solution we currently have for live media will eventually blow up
without warning when it gets full and the overlay is toast.
https://github.com/kdave/btrfs-wiki/wiki/Seed-device

-- 
Chris Murphy
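[For reference, the grow/shrink cycle those two commands give you; a
sketch where the device names and mount point are illustrative
placeholders:]

    # grow: add a device, then rebalance to spread existing chunks
    # across the new member
    btrfs device add /dev/sdc /mnt/data
    btrfs balance start /mnt/data

    # shrink: migrates extents off the device, resizes the fs, and
    # wipes the device's signature, all while mounted
    btrfs device delete /dev/sdb /mnt/data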