On 11/13/2015 01:46 AM, J Martin Rushton wrote:
> If you really _need_ the guarantee of a snapshot, consider either LVM
> or RAID1. Break out a volume from the RAID set, back it up, then
> rebuild.

FFS, don't do the latter. LVM is the standard filesystem backing for
Red Hat and CentOS systems, and fully supports consistent snapshots
without doing half-ass shit like breaking a RAID volume.

Breaking a RAID volume doesn't make filesystems consistent, so when you
try to mount it, you might have a corrupt filesystem, or corrupt data.
Breaking the RAID will duplicate the UUIDs of filesystems and the names
of volume groups. There are a whole bunch of configurations where it
just won't work. At best, it's unreliable. Never do this. Don't advise
other people to do it. Use LVM snapshots (or ZFS if that's an option
for you).
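[For anyone following along, the LVM snapshot approach advocated above
might be sketched roughly like this. The VG/LV names (vg0, root), sizes,
and paths are illustrative assumptions, not anything from the thread;
this needs root and an LVM-backed system:]

```shell
# Create a snapshot of the "root" LV in VG "vg0". 1G of copy-on-write
# space is set aside to absorb writes made to the origin during the backup.
lvcreate --snapshot --size 1G --name root-snap /dev/vg0/root

# Mount the snapshot read-only. XFS needs "nouuid" because the snapshot
# carries the same filesystem UUID as the still-mounted origin.
mkdir -p /mnt/root-snap
mount -o ro,nouuid /dev/vg0/root-snap /mnt/root-snap

# Back up from the frozen, point-in-time view of the filesystem.
rsync -a /mnt/root-snap/ /backup/root/

# Tear the snapshot down; the origin LV was never touched.
umount /mnt/root-snap
lvremove -f /dev/vg0/root-snap
```

The whole sequence is scriptable and needs no hardware changes, which is
the point being made against the break-the-mirror approach.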
On 13/11/15 17:55, Gordon Messmer wrote:
> On 11/13/2015 01:46 AM, J Martin Rushton wrote:
>> If you really _need_ the guarantee of a snapshot, consider either
>> LVM or RAID1. Break out a volume from the RAID set, back it up,
>> then rebuild.
>
> FFS, don't do the latter. LVM is the standard filesystem backing
> for Red Hat and CentOS systems, and fully supports consistent
> snapshots without doing half-ass shit like breaking a RAID volume.
>
> Breaking a RAID volume doesn't make filesystems consistent, so when
> you try to mount it, you might have a corrupt filesystem, or
> corrupt data. Breaking the RAID will duplicate UUIDs of filesystems
> and the name of volume groups. There are a whole bunch of
> configurations where it just won't work. At best, it's unreliable.
> Never do this. Don't advise other people to do it. Use LVM
> snapshots (or ZFS if that's an option for you).

Maybe I should have been clearer: use (LVM) OR (RAID1 and break).
Don't use LVM and break, that would be silly.

I hope I'm wrong, but you wouldn't be thinking of mounting the
broken-out copy on the same system, would you? You must never do that,
not even during disaster recovery. Use dd or similar on the disk, not
the mounted partitions - isn't that obvious? I wasn't trying to give
step-by-step instructions.

Way before LVM existed we used this technique to back up VAXes (and
later Alphas) under VMS using "volume shadowing" (ie RAID1). It worked
quite happily for several years with disks shared across the cluster.
IIRC it was actually recommended by DEC, indeed a selling point, but I
don't have any manuals to hand to confirm that nowadays!
One thing I did omit was you MUST sync first (there was an equivalent
VMS command, don't ask me now), and also ensure that as the disks are
added back a full catch-up copy occurs. You may consider it half a
mule's droppings, but it is, after all, what happens if you lose a
spindle and hot-replace.
On 11/13/2015 12:59 PM, J Martin Rushton wrote:
> Maybe I should have been clearer: use (LVM) OR (RAID1 and break).

I took your meaning. I'm saying that's a terrible backup strategy, for
a list of reasons.

For instance, it only works if you mirror a single disk. It doesn't
work if you use RAID10, or RAID5, or RAID6, or RAIDZ, etc.

Breaking RAID doesn't make the data consistent, so you might have
corrupt files (especially if the system runs any kind of database: SQL,
LDAP, etc). It doesn't make the filesystem consistent, so you might
have a corrupt filesystem.

Even if you ignore the potential for corruption, you have a backup
process that only works on some specific hardware configurations.
Everything else has to have a different backup solution. That's insane.
Use one backup process that works for everything. You're much more
likely to consistently back up your data that way.

> I hope I'm wrong, but you wouldn't be thinking of mounting the broken
> out copy on the same system would you? You must never do that, not
> even during disaster recovery. Use dd or similar on the disk, not the
> mounted partitions - isn't that obvious? I wasn't trying to give step
> by step instructions.

Well, that's *one* of the problems with your advice. Even if we ignore
the fact that it doesn't work reliably (and IMO, it therefore doesn't
work), it's far more complicated than you pretend it is. Because now
you're talking about quiescing your services, breaking your RAID,
physically removing the drive, connecting it to another system, fscking
the filesystems, mounting them, and backing up the data. For each
backup. Every day. Or using 'dd' and... backing up the whole image? No
incrementals or differentials?

Your process involves a human being doing physical tasks as part of the
backup. Maybe I'm the only one, but I want my backups fully automated.
People make mistakes. I don't want them involved in regular processes.
In fact, the entire point of computing is that the computer should do
the work so that I don't have to.

> Way before LVM existed we used this technique to back up VAXes (and
> later Alphas) under VMS using "volume shadowing" (ie RAID1). It worked
> quite happily for several years with disks shared across the cluster.
> IIRC it was actually recommended by DEC, indeed a selling point, but I
> don't have any manuals to hand to confirm that nowadays! One thing I
> did omit was you MUST sync first

sync flushes the OS data buffers to disk, but it does not sync
application data buffers, it does not flush the journal, it doesn't
make filesystems "clean", and even if you break the RAID volume
immediately after "sync" there's no guarantee that there weren't cached
writes from other processes in between those two steps. There is
absolutely no way to make this a reliable process without a full
shutdown.
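[The application-buffer point above is easy to demonstrate. sync(2)
flushes the kernel's page cache, but data still sitting in a process's
userspace buffers has never reached the kernel at all, so sync can't
see it. A small illustrative sketch, not from the thread:]

```python
import os
import tempfile

# Create an empty temp file to write through.
fd, path = tempfile.mkstemp()
os.close(fd)

# Write through a buffered file object, but don't flush: the data sits
# in the process's userspace buffer, not in the kernel's page cache.
f = open(path, "w")
f.write("important data")

# sync() flushes kernel buffers to disk, but it cannot flush data the
# application hasn't handed to the kernel yet.
os.sync()

with open(path) as check:
    on_disk = check.read()
print(repr(on_disk))  # '' - the write is still stuck in userspace

# Only after the application itself flushes (and fsyncs, for stable
# storage) does the data actually land in the file.
f.flush()
os.fsync(f.fileno())
f.close()

with open(path) as check:
    final = check.read()
print(repr(final))  # 'important data'

os.unlink(path)
```

The same applies to a database's internal write buffers: no amount of
sync, snapshotting, or mirror-breaking captures data the application is
still holding in memory.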
On Fri, 13 Nov 2015, Gordon Messmer wrote:
> Breaking a RAID volume doesn't make filesystems consistent,

While using LVM arranges for some filesystems to be consistent (it is
not always possible), it does nothing to ensure application
consistency, which can be just as important. Linux doesn't have a
widely deployed analog to Windows' VSS, which provides both, though
only for applications that cooperate. On Linux you must arrange to
quiesce applications yourself, which is seldom possible.

> Breaking the
> RAID will duplicate UUIDs of filesystems and the name of volume groups.

Making an LVM snapshot duplicates UUIDs (and LABELs) too; the whole LV
is the same in the snapshot as it was in the source. There are ways to
cope with that for XFS (I usually use mount -ro nouuid) -- ext2/3/4
doesn't care (so just mount -r for them). If the original filesystem
isn't yet mounted, then a mount by UUID (or label) would not be pretty
for either. And that's just two filesystems; others are supported and
they too will potentially have issues.

/mark
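[The per-filesystem workaround described above looks like this in
practice. Device and mount-point names are assumptions for
illustration:]

```shell
# XFS refuses to mount a second filesystem with an already-mounted
# UUID, and a snapshot shares its origin's UUID, so skip the check:
mount -o ro,nouuid /dev/vg0/data-snap /mnt/data-snap

# ext2/3/4 doesn't enforce UUID uniqueness at mount time, so a plain
# read-only mount of the snapshot is enough:
mount -o ro /dev/vg0/home-snap /mnt/home-snap
```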
On 11/14/2015 09:01 AM, Mark Milhollan wrote:
> On Fri, 13 Nov 2015, Gordon Messmer wrote:
>> Breaking a RAID volume doesn't make filesystems consistent,
>
> While using LVM arranges for some filesystems to be consistent (it is
> not always possible)

Can you explain what you mean? The standard filesystems, ext4 and XFS,
will both be made consistent when making an LVM snapshot.

> , it does nothing to ensure application consistency
> which can be just as important. Linux doesn't have a widely deployed
> analog to Windows' VSS, which provides both though only for those
> that cooperate.

I know. That's why I wrote snapshot:
https://bitbucket.org/gordonmessmer/dragonsdawn-snapshot

> On Linux you must arrange to quiesce applications yourself,
> which is seldom possible.

I have not found that to be true. Examples?

>> Breaking the
>> RAID will duplicate UUIDs of filesystems and the name of volume
>> groups.
>
> Making an LVM snapshot duplicates UUIDs (and LABELs) too, the whole
> LV is the same in the snapshot as it was in the source.

The VG name is the bigger problem. If you tried to activate the VG in
the broken RAID1 component, Very Bad Things(TM) would happen.
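[For what it's worth, LVM2 does ship a tool for the duplicate-VG
situation that arises when a cloned or broken-out mirror member is
attached to a host that still has the original VG. A hedged sketch;
the device name and new VG name are assumptions:]

```shell
# vgimportclone rewrites the PV and VG UUIDs on the cloned disk and
# renames its VG, so it can coexist with the original volume group.
vgimportclone --basevgname vg0_clone /dev/sdc1

# Only after the rename is it safe to activate the cloned VG:
vgchange -ay vg0_clone
```

Activating the clone *without* doing this first is exactly the "Very
Bad Things(TM)" scenario: two VGs with identical names and UUIDs
visible to the same LVM metadata scan.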