On 11/13/2015 12:59 PM, J Martin Rushton wrote:
> Maybe I should have been clearer: use (LVM) OR (RAID1 and break).

I took your meaning. I'm saying that's a terrible backup strategy, for a list of reasons.

For instance, it only works if you mirror a single disk. It doesn't work if you use RAID10 or RAID5, or RAID6, or RAIDZ, etc. Breaking RAID doesn't make the data consistent, so you might have corrupt files (especially if the system runs any kind of database. SQL, LDAP, etc). It doesn't make the filesystem consistent, so you might have a corrupt filesystem.

Even if you ignore the potential for corruption, you have a backup process that only works on some specific hardware configurations. Everything else has to have a different backup solution. That's insane. Use one backup process that works for everything. You're much more likely to consistently back up your data that way.

> I hope I'm wrong, but you wouldn't be thinking of mounting the broken
> out copy on the same system would you? You must never do that, not
> even during disaster recovery. Use dd or similar on the disk, not the
> mounted partitions - isn't that obvious? I wasn't trying to give step
> by step instructions.

Well, that's *one* of the problems with your advice. Even if we ignore the fact that it doesn't work reliably (and IMO, it therefore doesn't work), it's far more complicated than you pretend it is.

Because now you're talking about quiescing your services, breaking your RAID, physically removing the drive, connecting it to another system, fsck the filesystems, mount them, and backing up the data. For each backup. Every day. Or using 'dd' and... backing up the whole image? No incremental or differentials?

Your process involves a human being doing physical tasks as part of the backup. Maybe I'm the only one, but I want my backups fully automated. People make mistakes. I don't want them involved in regular processes.
In fact, the entire point of computing is that the computer should do the work so that I don't have to.

> Way before LVM existed we used this technique to back up VAXes (and
> later Alphas) under VMS using "volume shadowing" (ie RAID1). It worked
> quite happily for several years with disks shared across the cluster.
> IIRC it was actually recommended by DEC, indeed a selling point, but I
> don't have any manuals to hand to confirm that nowadays! One thing I
> did omit was you MUST sync first

sync flushes the OS data buffers to disk, but it does not sync application data buffers, it does not flush the journal, it doesn't make filesystems "clean", and even if you break the RAID volume immediately after "sync" there's no guarantee that there weren't cached writes from other processes in between those two steps.

There is absolutely no way to make this a reliable process without a full shutdown.
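To illustrate the point about application data buffers, here is a minimal Python sketch. It assumes a Unix-like system where os.sync() is available; the function name and the file name table.dat are made up for the illustration. A small write sits in the process's own buffer, so a system-wide sync cannot push it to disk until the application itself flushes.

```python
import os
import tempfile


def bytes_on_disk_before_and_after_flush(payload):
    """Return the file size the OS reports before and after the application flushes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "table.dat")  # hypothetical data file
        # Small writes stay in a user-space buffer, much like a database's own cache.
        with open(path, "wb") as f:
            f.write(payload)
            os.sync()  # flushes kernel buffers only; the payload is still in the process
            before = os.path.getsize(path)
            f.flush()             # hand the data to the kernel
            os.fsync(f.fileno())  # ask the kernel to push it to the device
            after = os.path.getsize(path)
    return before, after
```

With a payload smaller than Python's default buffer, the first size should come back as 0 and the second as the payload length. A mirror broken between those two points would not contain the write at all, which is the situation described above for databases.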
Have a coffee or a beer, breathe deeply, then:

On 14/11/15 00:42, Gordon Messmer wrote:
> On 11/13/2015 12:59 PM, J Martin Rushton wrote:
>> Maybe I should have been clearer: use (LVM) OR (RAID1 and
>> break).
>
> I took your meaning. I'm saying that's a terrible backup strategy,
> for a list of reasons.
>
> For instance, it only works if you mirror a single disk. It
> doesn't work if you use RAID10 or RAID5, or RAID6, or RAIDZ, etc.

That of course is exactly why I said RAID1.

> Breaking RAID
> doesn't make the data consistent, so you might have corrupt files
> (especially if the system runs any kind of database. SQL, LDAP,
> etc). It doesn't make the filesystem consistent, so you might have
> a corrupt filesystem.

Possibly, but that is another problem altogether. Any low level backup will do the same. You need to have an understanding of the filesystem to handle filesystem problems. Even if the utility understands the filesystem you have problems with open files such as databases.

More generally, for anything except a trivial database you should use the database to dump itself; for instance using mysqldump. Have a look at the page https://mariadb.com/kb/en/mariadb/backup-and-restore-overview/ for (as it says) an overview. Try running a database backup timed to complete before your normal filesystem backups run, whatever method you use.

> Even if you ignore the potential for corruption, you have a backup
> process that only works on some specific hardware configurations.
> Everything else has to have a different backup solution. That's
> insane. Use one backup process that works for everything.
> You're much more likely to consistently back up your data that way.

Remember that this is a last resort if (1) the user can't accept more sensible backups and handle (or let the backup handle) the dates safely; (2) the user insists on a snapshot; (3) the user can't use a filesystem snapshot (ZFS, GPFS etc) and (4) the user can't/won't use LVM. You can't refuse to use "better solutions" and then complain that the last resort is not as good as the "better solutions"!

>> I hope I'm wrong, but you wouldn't be thinking of mounting the
>> broken out copy on the same system would you? You must never
>> do that, not even during disaster recovery. Use dd or similar on
>> the disk, not the mounted partitions - isn't that obvious? I
>> wasn't trying to give step by step instructions.
>
> Well, that's *one* of the problems with your advice. Even if we
> ignore the fact that it doesn't work reliably (and IMO, it
> therefore doesn't work), it's far more complicated than you pretend
> it is.
>
> Because now you're talking about quiescing your services, breaking
> your RAID, physically removing the drive, connecting it to another
> system, fsck the filesystems, mount them, and backing up the data.
> For each backup. Every day.

No need to remove the drive if you handle the whole disk. When we used this technique we only did it monthly - it would be pretty crazy to do level 0 backups daily.

> Or using 'dd' and... backing up the whole image? No incremental
> or differentials?

See the previous.

> Your process involves a human being doing physical tasks as part of
> the backup. Maybe I'm the only one, but I want my backups fully
> automated. People make mistakes. I don't want them involved in
> regular processes. In fact, the entire point of computing is that
> the computer should do the work so that I don't have to.

See the comments about using better solutions. I'd be worried though if you use a solution that doesn't remove the backup media from the vicinity of the machine.
Fine if you have a remote site, but otherwise you still need a person to physically take the tapes (or whatever) out of the machine room to fireproof storage. That's pretty manual.

>> Way before LVM existed we used this technique to back up VAXes
>> (and later Alphas) under VMS using "volume shadowing" (ie RAID1).
>> It worked quite happily for several years with disks shared
>> across the cluster. IIRC it was actually recommended by DEC,
>> indeed a selling point, but I don't have any manuals to hand to
>> confirm that nowadays! One thing I did omit was you MUST sync
>> first
>
> sync flushes the OS data buffers to disk, but it does not sync
> application data buffers, it does not flush the journal, it doesn't
> make filesystems "clean", and even if you break the RAID volume
> immediately after "sync" there's no guarantee that there weren't
> cached writes from other processes in between those two steps.

The journal is a fair point if it is stored on a separate spindle, as for instance is possible under XFS.

> There is absolutely no way to make this a reliable process without
> a full shutdown.

Not IME. At that date the preferred method for monthly backups was a shutdown and standalone utility for disk-disk copies, but that was not always possible.
The technique worked.

> _______________________________________________
> CentOS mailing list
> CentOS at centos.org
> https://lists.centos.org/mailman/listinfo/centos
On 11/14/2015 03:04 AM, J Martin Rushton wrote:
> On 14/11/15 00:42, Gordon Messmer wrote:
>> For instance, it only works if you mirror a single disk. It
>> doesn't work if you use RAID10 or RAID5, or RAID6, or RAIDZ, etc.
> That of course is exactly why I said RAID1.

I know. And I was trying to make the point that the process of breaking RAID1 for backup purposes is inflexible in addition to being unreliable. Users should not have to re-engineer their backup system for every hardware configuration.

>> Breaking RAID
>> doesn't make the data consistent, so you might have corrupt files
>> (especially if the system runs any kind of database. SQL, LDAP,
>> etc). It doesn't make the filesystem consistent, so you might have
>> a corrupt filesystem.
> Possibly, but that is another problem altogether. Any low level
> backup will do the same.

If you were to attempt a block-level backup of the raw device, then yes, you would have similar problems. But since that is insane, and no one is suggesting that process, I didn't feel the need to address it.

> You need to have an understanding of the
> filesystem to handle filesystem problems. Even if the utility
> understands the filesystem you have problems with open files such as
> databases.

There *are* tools that exist to dump filesystems, but they're not intended to be used for backup, and they won't operate on mounted filesystems. For instance, clonezilla includes tools to dump ext4 and ntfs filesystems for the purpose of cloning a system. You could treat that as a backup, but you have to shut down the host OS to boot clonezilla.

> More generally, for anything except a trivial database you should use
> the database to dump itself; for instance using mysqldump.

Uhh.... no. I'd argue the opposite. You should only use DB dump tools for trivial databases (or in some cases, such as PostgreSQL, upgrades). Dumping a database is *slow*. The only thing slower than dumping a database is restoring a database dump.
If you have a non-trivial database, you definitely want to quiesce, snapshot, resume, and back up the snapshot.

> Have a
> look at the page
> https://mariadb.com/kb/en/mariadb/backup-and-restore-overview/ for (as
> it says) an overview. Try running a database backup timed to complete
> before your normal filesystem backups run, whatever method you use.

Again, you seem entirely too willing to accept unreliable processes. Timing? You should never, under any circumstances, trust the timing of two processes to not overlap. If you're dumping data, you should either trigger the backup from the dump job, after it completes, or you should employ a locking system so that only one of the two processes can run at a time.

> Remember that this is a last resort if (1) the user can't accept more
> sensible backups and handle (or let the backup handle) the dates
> safely; (2) the user insists on a snapshot; (3) the user can't use a
> filesystem snapshot (ZFS, GPFS etc) and (4) the user can't/won't use
> LVM. You can't refuse to use "better solutions" and then complain that
> the last resort is not as good as the "better solutions"!

No one is refusing better solutions. You are tilting at windmills.

> See the comments about using better solutions. I'd be worried though
> if you use a solution that doesn't remove the backup media from the
> vicinity of the machine. Fine if you have a remote site

We agree, there. You should have backups in a physically separate location.
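The two alternatives to timing mentioned above (triggering the backup from the dump job, or using a lock) can be combined. What follows is only a sketch, assuming a Unix-like host where fcntl.flock is available. The names exclusive_lock, dump_then_backup and the path /tmp/backup.lock are hypothetical, and the dump and backup commands are whatever argument lists the site already uses.

```python
import fcntl
import subprocess
import sys
from contextlib import contextmanager

LOCK_PATH = "/tmp/backup.lock"  # hypothetical lock file shared by both jobs


@contextmanager
def exclusive_lock(path=LOCK_PATH):
    """Hold an exclusive advisory lock; fail at once if another job holds it."""
    with open(path, "w") as handle:
        # LOCK_NB raises BlockingIOError instead of waiting for the other job.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        try:
            yield
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)


def dump_then_backup(dump_cmd, backup_cmd, lock_path=LOCK_PATH):
    """Run the dump, and start the backup only if the dump succeeded."""
    with exclusive_lock(lock_path):
        subprocess.run(dump_cmd, check=True)    # raises CalledProcessError on failure
        subprocess.run(backup_cmd, check=True)  # never reached if the dump failed
```

Because the backup is started by the same job that ran the dump, it cannot begin before the dump finishes, however long the dump takes. Any other job that takes the same lock before touching the dump files will get a BlockingIOError rather than running concurrently.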