As we move forward with our Lustre testing, I am wondering about MDT backup.

Is it feasible to unmount the MDT, create an image of it, and remount it after the backup? Of course, this would happen only nightly.

From what I can identify, in the case of an MDT failure we would have to do the following:

Restore from the last backup.
Run an lfsck across the filesystem.

Am I missing anything else at this point? We will also be doing file-level backups of the filesystem as a whole, but we are looking for quick ways to recover from an MDT failure.

Thanks,
Dan Kulinski
On Jun 17, 2009 09:41 -0600, Daniel Kulinski wrote:
> From what I can identify, in the case of an MDT failure we would have to do
> the following:
>
> Restore from the last backup.
>
> Run an lfsck across the filesystem.
>
> Am I missing anything else at this point?

There is a documented process for doing MDT backup/restore that should be used. In particular, there are some files which should not be restored if the MDT is being restored.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
Hi Daniel,

From reading Chapter 15 of the Lustre Operations Manual, it seems that an MDT backup is only useful if you are changing hardware or the like. I am afraid you cannot expect to replace a failed MDT with a previous image, as the data on the OSTs and on the MDT would no longer match, right?

Cheers

On Wed, 2009-06-17 at 09:41 -0600, Daniel Kulinski wrote:
> Is it feasible to unmount the MDT, create an image of it and remount
> it after the backup.

--
Ramiro Alba
Centre Tecnològic de Transferència de Calor
http://www.cttc.upc.edu
Escola Tècnica Superior d'Enginyeries Industrial i Aeronàutica de Terrassa
Colom 11, E-08222, Terrassa, Barcelona, Spain
Tel: (+34) 93 739 86 46
IMHO, one could set up an HA MDS using shared storage. All the data is on the shared storage, so you can fail over.

Ramiro Alba Queipo wrote:
> I am afraid you cannot expect to replace a failed MDT with a previous
> image, as the data on the OSTs and on the MDT would no longer match, right?
We are actually in an HA setup now. My main concern is a double disk failure on the MDT device.

Thanks,
Dan Kulinski

-----Original Message-----
From: Hung-Sheng.Tsao at Sun.COM [mailto:Hung-Sheng.Tsao at Sun.COM]
Sent: Wednesday, June 17, 2009 11:49 AM
To: Ramiro Alba Queipo
Cc: Daniel Kulinski; lustre-discuss at lists.lustre.org
Subject: Re: [Lustre-discuss] MDT backup procedure

IMHO, one could set up an HA MDS using shared storage. All the data is on the shared storage, so you can fail over.
Ramiro Alba Queipo wrote:
> From reading Chapter 15 of the Lustre Operations Manual, it seems that an
> MDT backup is only useful if you are changing hardware or the like.
> I am afraid you cannot expect to replace a failed MDT with a previous
> image, as the data on the OSTs and on the MDT would no longer match, right?

If you do a backup and an immediate restore, it should be fine. If you restore from an old image you will lose the changes made post-backup, but the rest of the data should be fine.

cliffw
On Jun 17, 2009 12:35 -0700, Cliff White wrote:
> If you do a backup/immediate restore, it should be fine. If you restore
> from an old image you will lose the changes made post-backup, but the
> rest of the data should be fine.

Right - just like any backup, any changes made after the backup will of course not be restored. One additional issue is that some OST objects will not be available if they were deleted after the backup, even though the restored MDS will still reference them. Accessing these files will return -ENOENT.

At that point it would be possible (though not necessary) to run "lfsck" to clean up the inconsistencies between the MDT and OST filesystems. It is also possible to just re-delete the files that return "-ENOENT" and restore the rest of the files from some other filesystem-level backup.

An MDS backup is a good idea, because it avoids having to restore 100TB+ (or whatever) of data from backup, leaving only a smaller number of changed files that might need to be restored. It should NOT be the only form of backup for the filesystem, since it does not contain any of the FILE data. You, or your users, should do backups of their critical files separately.

Cheers, Andreas
Thanks for this verbose reply. It is exactly what I needed and what I suspected I would run into. We are planning on multiple backup procedures: users will back up at checkpoints in their workflow, IT will back up the MDT nightly, and we are also looking at the possibility of backing up the complete file system.

Thanks again for everyone's input; this gives me some good ammunition going forward for proposals.

Thanks,
Dan Kulinski

-----Original Message-----
From: Andreas.Dilger at sun.com [mailto:Andreas.Dilger at sun.com] On Behalf Of Andreas Dilger
Sent: Wednesday, June 17, 2009 4:23 PM
To: Cliff White
Cc: Ramiro Alba Queipo; lustre-discuss at lists.lustre.org; Daniel Kulinski
Subject: Re: [Lustre-discuss] MDT backup procedure

> An MDS backup is a good idea, because it avoids having to restore 100TB+
> (or whatever) of data from backup, leaving only a smaller number of changed
> files that might need to be restored. It should NOT be the only form of
> backup for the filesystem, since it does not contain any of the FILE data.
Pertaining to your original email: rather than taking the MDT down to back it up, it is very convenient to use LVM snapshots. A snapshot creates an LV duplicate of the MDT, which you can mount as ldiskfs and back up files from as a consistent copy (it won't change even if your MDT continues to add/remove data). Your Lustre filesystem therefore stays operational during the backup. If you time it cleverly, you can snapshot your MDT and OSTs at the same time and back up from all of them to get a consistent copy of the whole filesystem as well.

> IT will backup the MDT nightly and we are also looking at the possibility
> of backup the complete file system.
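The snapshot approach described above might look roughly like the sketch below. It only prints the commands so they can be reviewed before anything is run. The volume group, LV name, backup directory, snapshot size, and mount point are all hypothetical placeholders, and the getfattr invocation is the one quoted elsewhere in this thread from the Operations Manual. Treat it as an outline under those assumptions, not a tested procedure.

```shell
# Print (do not run) the commands for an LVM-snapshot backup of an MDT.
# Arguments are hypothetical: volume group, MDT logical volume,
# backup directory, and an optional snapshot size (default 5G).
mdt_snapshot_plan() {
    vg=$1
    lv=$2
    dest=$3
    snap=${lv}_snap          # name of the temporary snapshot LV
    mnt=/mnt/$snap           # where the snapshot is mounted as ldiskfs
    cat <<EOF
lvcreate --snapshot --size ${4:-5G} --name $snap /dev/$vg/$lv
mkdir -p $mnt
mount -t ldiskfs /dev/$vg/$snap $mnt
(cd $mnt && getfattr -R -d -m '.*' -P . > $dest/ea.bak)
tar czf $dest/mdt.tgz --sparse -C $mnt .
umount $mnt
lvremove -f /dev/$vg/$snap
EOF
}

# Example: show the plan for /dev/vgmds/mdt with backups under /backup.
mdt_snapshot_plan vgmds mdt /backup
```

Removing the snapshot as soon as the archive is written keeps only one snapshot alive at a time, which limits the copy-on-write overhead on the live MDT.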
Hi all,

To clarify ideas, let me sum up (please tell me if I am wrong). There are three ways of doing an MDT backup:

1) Device-level, using the dd command

You can copy the original device to another local device of at least the same capacity, BUT no clients and no OSTs should be active, so it is NOT SUITABLE for an automated nightly backup.

2) File-level, using the tar or rsync commands

You can make a copy to another directory (even remotely), BUT you MUST STOP Lustre and remount the MDT as an 'ldiskfs' file system type. You also have to save additional information (cd /lustre/mds; getfattr -R -d -m '.*' -P . > /<backup-dir>/ea.bak). So it is NOT SUITABLE for an automated nightly backup either.

3) File-level, on LVM snapshots

LVM allows you to duplicate the MDT while the Lustre file system is operational, so you can afterwards make a file-level backup of the LVM snapshot while everything is running. So it IS SUITABLE for an automated backup. The disadvantages are that you need extra local space for the LVM snapshots, and that running the MDT on top of LVM has an impact on performance.

By the way, the procedure described under 'How do I replace an OST or MDS?' in Appendix B of the Lustre Operations Manual differs from the procedure described in 15.1.3.1 (Backing Up an MDS File):

- getfattr -R -d -m '.*' -P . > ea.bak
- getfattr -R -e base64 -d . > /tmp/mdsea

Which one is the right one?

Cheers

On Wed, 2009-06-17 at 16:23 -0600, Andreas Dilger wrote:
> An MDS backup is a good idea, because it avoids having to restore 100TB+
> (or whatever) of data from backup, leaving only a smaller number of changed
> files that might need to be restored.

--
Ramiro Alba
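Method 1 in the summary above (a device-level copy with dd) could be outlined as follows. As a precaution the sketch only prints the commands. The device paths and mount point are hypothetical, and the read-only e2fsck pass on the copy is an added sanity check that is not part of the summary. It assumes the MDT has been stopped first and that an e2fsprogs build able to read ldiskfs is installed.

```shell
# Print (do not run) the commands for a device-level MDT copy.
# Arguments are hypothetical: source MDT device, spare device of at
# least the same size, and the MDT's normal Lustre mount point.
mdt_dd_plan() {
    src=$1
    dst=$2
    mnt=$3
    cat <<EOF
umount $mnt
dd if=$src of=$dst bs=1M
e2fsck -f -n $dst
mount -t lustre $src $mnt
EOF
}

# Example: copy /dev/sdb to /dev/sdc while the MDT is offline.
mdt_dd_plan /dev/sdb /dev/sdc /mnt/mds
```

The `-n` flag makes e2fsck open the copy read-only and answer "no" to every prompt, so the check cannot alter the backup image.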
On Jun 18, 2009 11:32 +0200, Ramiro Alba Queipo wrote:
> There are 3 ways of doing an MDT backup:
>
> 1) Device-level using dd command
>
> You can do it from the original device to another local device with at
> least the same capacity, BUT no clients and no OSTs should be active, so
> NOT SUITABLE for an automated nightly backup

Well, "no clients/OSTs should be active" is a relative term. You will almost certainly have a usable backup even if the filesystem was active, because ext3 has a robust on-disk layout, but you would need to run an e2fsck afterward.

> 2) File-level using tar or rsync commands
>
> You can make a copy to other directory (even remotely) BUT you MUST STOP
> lustre and remount it as an 'ldiskfs' file system type. You also have to
> save additional information (cd /lustre/mds; getfattr -R -d -m '.*' -P .
> > /<backup-dir>/ea.bak). So NOT SUITABLE for an automated nightly backup
> either

Right. Note that when using "tar" or "rsync" you should use the "--sparse" option so that the empty (sparse) file data is not backed up. Also, with newer versions of tar (on RHEL/FC) and rsync it is possible to have them back up and restore the extended attributes directly.

You could also use "dump-0.4b40" (or later) to do a hybrid device/file-level backup. It will back up the filesystem directly from the block device, but only the files that are in use. Versions 0.4b40+ can also do the backup/restore of extended attributes, which is critical.

> 3) File-level on LVM snapshots
>
> LVM allows you to make a duplication of the MDT while lustre file system
> is operational, so you can make afterwards a File-level backup of the
> LVM snapshot while everything is running. Then it IS SUITABLE for an
> automated backup.
> Disadvantages are that you need extra local space for LVM snapshots and
> the impact on performance of using LVM over the MDT.

This is probably the best option. It allows consistent backups to be done, and if you only keep a single snapshot the performance hit isn't too big.

> By the way. The procedure described at 'How do I replace an OST or MDS?'
> in Appendix B of Lustre Operations Manual differs from procedure
> described at 15.1.3.1 (Backing Up an MDS File):
> - getfattr -R -d -m '.*' -P . > ea.bak
> - getfattr -R -e base64 -d . > /tmp/mdsea

I would say the first one is better, though I like to use "-e hex" instead of "-e base64" because the hex output is easier for me to decode if I need to for some reason. Probably the "replace an OST/MDT" chapter should just reference the backup/restore section instead of duplicating the content.

Cheers, Andreas
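To illustrate why "-e hex" output is easy to decode by hand or with a few lines of script, here is a small sketch that turns a hex-encoded value of the form getfattr prints (0x followed by pairs of hex digits) back into raw bytes. The function name and the sample value are made up for the example, and it does not validate that the digits are actually hexadecimal.

```shell
# Decode a hex-encoded attribute value ("0x" + hex digit pairs) into raw bytes.
# Returns non-zero if the number of digits is odd.
decode_hex_ea() {
    hex=${1#0x}
    # An odd number of digits cannot be split into whole bytes.
    [ $(( ${#hex} % 2 )) -eq 0 ] || return 1
    while [ -n "$hex" ]; do
        rest=${hex#??}          # everything after the first two digits
        byte=${hex%"$rest"}     # the first two digits
        hex=$rest
        # Turn the hex pair into a 3-digit octal escape and let printf emit the byte.
        printf "\\$(printf '%03o' "0x$byte")"
    done
}

# Sample (made-up) value; prints: lustre
decode_hex_ea 0x6c7573747265; echo
```

A base64 value would need a separate decoding tool, whereas each pair of hex digits maps directly to one byte.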
On Jun 17, 2009, at 4:09 PM, Adam Knight wrote:
> Pertaining to your original email, rather than taking the MDT down to
> backup, it is very convenient to use LVM snapshots. With this
> functionality it creates a LV duplicate of the MDT and allows you to
> mount that as ldiskfs and backup files from a consistent copy (won't be
> changing even if your MDT continues to add/remove data). Your lustre
> filesystem will therefore stay operational during the backup. If you
> time it cleverly, you can snapshot your MDT and OSTs at the same time
> and backup from all of them to have a consistent copy of the whole
> filesystem as well.

So following this, has anyone migrated an MDT to new storage with this sort of procedure?

- create an lvm'd MDT that produces snapshots
- use it for a while in production
- get some snazzy new disk
- shutdown lustre
- take a snapshot of the MDT and shuffle it off to some different storage media
- create a new LVM with snazzy new disk (specifically of a different size from the original MDT)
- restore snapshot
- run lfsck for good measure (is this advisable on what could feasibly be a clean filesystem?)
- bring up lustre

Please keep in mind, I've used LVM but haven't used snapshots, I'm not familiar with their limitations. We're looking to create a filesystem immediately but would like to get some much faster storage for the MDT later without burning and building a new FS.

>> Thanks for this verbose reply. It is exactly what I needed and
>> what I suspected I would run into. We are planning on multiple
>> backup procedures. Users will backup at checkpoints in their work
>> flow, IT will backup the MDT nightly and we are also looking at the
>> possibility of backup the complete file system.
>>
>> Thanks again for everyone's input, this gives me some good
>> ammunition going forward for proposals.
>>
>> Thanks,
>> Dan Kulinski
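The snapshot-based backup Adam describes can be sketched roughly as below. This is a minimal sketch, not a tested procedure: it assumes the MDT already lives on an LVM logical volume, and the volume group, volume, snapshot name, snapshot size and mount point are all hypothetical placeholders. The `run` helper and the `DRY_RUN` switch are inventions of this example; by default the commands are only printed so they can be reviewed first.

```shell
#!/bin/bash
# Sketch only. All names passed to these functions are hypothetical.
# With DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create a snapshot of the MDT volume and mount it read-only as ldiskfs.
# Arguments: volume group, MDT volume, snapshot name, snapshot size, mount point.
snapshot_mdt() {
    local vg="$1" lv="$2" snap="$3" size="$4" mnt="$5"
    run lvcreate --snapshot --size "$size" --name "$snap" "/dev/$vg/$lv"
    run mkdir -p "$mnt"
    run mount -t ldiskfs -o ro "/dev/$vg/$snap" "$mnt"
}

# Unmount the snapshot and remove it once the backup has finished.
# Arguments: volume group, snapshot name, mount point.
release_snapshot() {
    local vg="$1" snap="$2" mnt="$3"
    run umount "$mnt"
    run lvremove -f "/dev/$vg/$snap"
}
```

A run would call `snapshot_mdt vgmdt mdt mdtsnap 1G /mnt/mdtsnap`, back up the files under the mount point with whatever file-level tool is in use, and then call `release_snapshot vgmdt mdtsnap /mnt/mdtsnap`. Removing the snapshot promptly matters because a snapshot costs write performance on the origin volume for as long as it exists.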
Ah, if I had only read on one mail I would have seen this procedure is far more complex than described in the Operations Manual. I suppose that's one of the benefits of an object based FS.

----------------
John White
High Performance Computing Services (HPCS)
(510) 486-7307
One Cyclotron Rd, MS: 50B-3209C
Lawrence Berkeley National Lab
Berkeley, CA 94720

On Jun 18, 2009, at 10:58 PM, John White wrote:
> So following this, has anyone migrated an MDT to new storage with this
> sort of procedure?
> [...]
Andreas,

This is a very interesting discussion, and it has raised some doubts on the matter.

On Thu, 2009-06-18 at 10:24 -0600, Andreas Dilger wrote:
> On Jun 18, 2009 11:32 +0200, Ramiro Alba Queipo wrote:
> > There are 3 ways of doing an MDT backup:
> >
> > 1) Device-level using dd command
> >
> > You can do it from the original device to another local device with at
> > least the same capacity, BUT no clients and no OSTs should be active, so
> > NOT SUITABLE for an automated nightly backup
>
> Well, "no clients/OSTs should be active" is a relative term. You will
> almost certainly have a usable backup even if the filesystem was active,
> because ext3 has a robust on-disk layout, but you would need to run an
> e2fsck afterward.
>
> > 2) File-level using tar or rsync commands
> >
> > You can make a copy to other directory (even remotely) BUT you MUST STOP
> > lustre and remount it as an 'ldiskfs' file system type. You also have to
> > save additional information
> > (cd /lustre/mds; getfattr -R -d -m '.*' -P . > /<backup-dir>/ea.bak).
> > So NOT SUITABLE for an automated nightly backup either
>
> Right. Note that when using "tar" or "rsync" you should use the "--sparse"
> option so that it doesn't back up empty files. Also, with newer versions

Can you tell me which versions? (I am using Ubuntu 8.04 with tar-1.19 and rsync-2.6.9).

> of tar (on RHEL/FC) and rsync it is possible to have it do the backup/restore
> of the extended attributes directly.

You mean there is no need to use the getfattr/setfattr commands?

> You could also use "dump-0.4b40" (or later) to do a hybrid device/file
> level backup. It will back up the filesystem directly from the block device,
> but only the files that are in use. Versions 0.4b40+ can also do the
> backup/restore of extended attributes, which is critical.
>
> > 3) File-level on LVM snapshots
> >
> > LVM allows you to make a duplication of the MDT while lustre file system
> > is operational, so you can make afterwards a File-level backup of the
> > LVM snapshot while everything is running. Then it IS SUITABLE for an
> > automated backup.
> > Disadvantages are that you need extra local space for LVM snapshots and
> > the impact on performance of using LVM over the MDT.
>
> This is probably the best option. It allows consistent backups to be
> done, and if you only keep a single snapshot the performance hit isn't
> too big.

So, the best option for automated backups could be to use LVM snapshots and then run 'dump' with dump levels over the mounted snapshot. There is no need for the getfattr/setfattr commands in that case, right?

What about the performance impact of using LVM for the MDT on overall Lustre performance?

> > By the way. The procedure described at 'How do I replace an OST or MDS?'
> > in Appendix B of Lustre Operational Manual differs from procedure
> > described at 15.1.3.1 (Backing Up an MDS File):
> > - getfattr -R -d -m '.*' -P . > ea.bak
> > - getfattr -R -e base64 -d . > /tmp/mdsea
>
> I would say the first one is better, though I like to use "-e hex"
> instead of "-e base64" because the hex output is easier for me to
> decode if I need to for some reason. Probably the "replace an OST/MDT"
> chapter should just reference the backup/restore section instead of
> duplicating the content.
>
> > On Wed, 2009-06-17 at 16:23 -0600, Andreas Dilger wrote:
> > > On Jun 17, 2009 12:35 -0700, Cliff White wrote:
> > > > Ramiro Alba Queipo wrote:
> > > > > By reading Chapter 15 of Lustre Operations Manual, it follows that an
> > > > > MDT backup is only useful if you are changing hardware or the like.
> > > > > I am afraid that you can not pretend to replace with a previous image an
> > > > > failed MDT, as data in OSTs and MDT is not matching any more, right?
> > > >
> > > > If you do a backup/immediate restore, it should be fine. If you restore
> > > > from an old image you will lose the changes made post-backup, but the
> > > > rest of the data should be fine.
> > > > cliffw
> > >
> > > Right - just like any backup, any changes made after the backup will of
> > > course not be restored. One additional issue is that some OST objects
> > > will not be available if they were deleted after the backup, even though
> > > the restored MDS will still reference them. Accessing these files will
> > > return -ENOENT.
> > >
> > > At that point it would be possible (though not necessary) to run "lfsck"
> > > to clean up the inconsistencies between the MDT and OST filesystems.
> > > It is also possible to just re-delete the files that have "-ENOENT" and
> > > restore (from some other filesystem-level backup) the rest of the files.
> > >
> > > An MDS backup is a good idea, because it avoids having to restore 100TB+
> > > (or whatever) of data from backup, leaving only a smaller number of changed
> > > files that might need to be restored. It should NOT be the only form of
> > > backup for the filesystem, since it does not contain any of the FILE data.
> > > You, or your users, should do backups of their critical files separately.

--
Ramiro Alba
Centre Tecnològic de Transferència de Calor
http://www.cttc.upc.edu
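Andreas's preference for "-e hex" is easier to appreciate with a concrete case: in a dump written by `getfattr -R -d -m '.*' -e hex -P .`, every value is `0x` followed by two hex digits per byte, so its size can be read off with plain shell string operations. The helper below is only an illustrative sketch; the function name `summarize_ea_dump` is made up for this example, and it assumes the usual getfattr dump layout of a `# file: <path>` header followed by `name=value` lines.

```shell
#!/bin/bash
# Sketch only: summarize a hex-encoded getfattr dump read from stdin.
# Prints "<path> <attribute> <size in bytes>" for each hex-encoded value.
summarize_ea_dump() {
    local file="" line name value
    while IFS= read -r line; do
        case "$line" in
            "# file: "*)
                # Remember which file the following attributes belong to.
                file="${line#"# file: "}"
                ;;
            *=0x*)
                name="${line%%=*}"      # text before the first '='
                value="${line#*=0x}"    # hex digits after the '0x' prefix
                echo "$file $name $(( ${#value} / 2 ))"
                ;;
        esac
    done
}
```

Running `summarize_ea_dump < ea.bak` against a saved dump gives a quick overview of which files carry which attributes (for example `trusted.lov`) and how large each value is, which is a cheap sanity check before relying on the file for a restore.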
Hi John,

I migrated my MGS/MDT to new hardware just a few weeks ago without much difficulty. I did not use LVM snapshots, though, but rather the procedure outlined in the manual (section 15.1.3.1 of the 1.6 manual) using tar (with the "--sparse" option, this is very important!) and getfattr.

Mine is a combination MGS/MDT, so I also needed to run tunefs.lustre --writeconf to get the OSTs to update their configuration logs on the new server. I gave the new server the same IP address as the old one, so there weren't any issues with changing NIDs. It's been running great ever since.

FYI, it took a few hours to create the tar and extended attribute files on the old server (~3.4M inodes) and about half that time to restore them onto the new server (faster disks :). All in all, about 4 hours of downtime.

Ron Jerome
National Research Council Canada

> -----Original Message-----
> From: lustre-discuss-bounces at lists.lustre.org On Behalf Of John White
> Sent: June 19, 2009 1:58 AM
> To: Adam Knight
> Cc: lustre-discuss at lists.lustre.org
> Subject: Re: [Lustre-discuss] MDT backup procedure
>
> So following this, has anyone migrated an MDT to new storage with this
> sort of procedure?
> [...]
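The file-level migration Ron describes can be laid out as a command sequence. The sketch below only prints such a sequence for review; it does not run anything. It is a rough outline under several assumptions: Lustre is stopped, the new device has already been formatted (the mkfs.lustre options are site-specific and not shown), and the device names, mount point and backup directory are placeholders supplied by the caller. The function name `mdt_migration_plan` is made up for this example, and handling of OBJECTS and CATALOGS is left to the manual's procedure.

```shell
#!/bin/bash
# Sketch only: print (do not run) the steps of a file-level MDT move.
# Arguments: old device, new device, mount point, backup directory.
mdt_migration_plan() {
    local old_dev="$1" new_dev="$2" mnt="$3" backup_dir="$4"
    cat <<EOF
mount -t ldiskfs $old_dev $mnt
cd $mnt && getfattr -R -d -m '.*' -e hex -P . > $backup_dir/ea.bak
cd $mnt && tar czf $backup_dir/mdt.tgz --sparse .
umount $mnt
mount -t ldiskfs $new_dev $mnt
cd $mnt && tar xzpf $backup_dir/mdt.tgz --sparse
cd $mnt && setfattr --restore=$backup_dir/ea.bak
umount $mnt
tunefs.lustre --writeconf $new_dev
EOF
}
```

For example, `mdt_migration_plan /dev/old /dev/new /mnt/mdt /backup` prints nine lines that can be checked against the Operations Manual before anything is typed on a production server. Printing instead of executing is a deliberate choice here, since every step is destructive or needs downtime.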
On Fri, Jun 19, 2009 at 10:39:06AM +0200, Ramiro Alba Queipo wrote:
> On Thu, 2009-06-18 at 10:24 -0600, Andreas Dilger wrote:
> > Right. Note that when using "tar" or "rsync" you should use the "--sparse"
> > option so that it doesn't back up empty files. Also, with newer versions
>
> Can you tell me which versions? (I am using Ubuntu 8.04 with tar-1.19
> and rsync-2.6.9).

The versions we tested:
- tar-1.20-5 from Fedora 10 works.
- tar-1.15.1-23.0.1 from RHEL 5 does NOT work

Also, for file level backup, exclude /OBJECTS/* and /CATALOGS from the backup, and make sure clients are unmounted during the restore or their caches will become corrupt when the restored MDS comes back online (due to changing inode numbers on the backing fs I believe).

The procedure that we tested a while back is as follows (to which I would add Andreas's suggestion of --sparse):

    # Backup
    mount -t ldiskfs -ouser_xattr /dev/sda /mnt/mdt
    tar --xattrs --no-selinux --exclude './OBJECTS/*' \
        --exclude './CATALOGS' -C/mnt/mdt -cf backup.tar .

    # Restore
    mount -t ldiskfs -ouser_xattr /dev/sda /mnt/mdt
    tar -C/mnt/mdt -xf backup.tar
    # (be afraid if this command produces no output)
    getfattr -d -m ".*" -R /mnt/mdt | grep trusted.lov | more
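The "be afraid if this command produces no output" check can be turned into something a script can act on. The sketch below reads getfattr output on stdin and fails when no `trusted.lov` attribute is present, which is the symptom of a restore that dropped the extended attributes. The function name `check_lov_restored` and the wording of its messages are made up for this example.

```shell
#!/bin/bash
# Sketch only: verify that a restored MDT still carries trusted.lov
# attributes. Reads "getfattr -d" output on stdin; returns 1 if none found.
check_lov_restored() {
    local count
    # grep -c prints 0 and exits non-zero when nothing matches.
    count="$(grep -c '^trusted\.lov=' || true)"
    if [ "$count" -eq 0 ]; then
        echo "WARNING: no trusted.lov attributes found" >&2
        return 1
    fi
    echo "found $count trusted.lov attributes"
}
```

A possible use after the restore step is `getfattr -d -m ".*" -R /mnt/mdt 2>/dev/null | check_lov_restored || echo "do not bring this MDT online"`. Only the presence of the attribute is tested; the check says nothing about whether the values match the objects on the OSTs.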
Hi all, Thank you very much to all the people in this thread. You have been really helpful Cheers On Fri, 2009-06-19 at 08:59 -0700, Jim Garlick wrote:> On Fri, Jun 19, 2009 at 10:39:06AM +0200, Ramiro Alba Queipo wrote: > > Andreas, > > > > This is a very interesting discussion, and it has raised some doubts on > > the matter. > > > > On Thu, 2009-06-18 at 10:24 -0600, Andreas Dilger wrote: > > > On Jun 18, 2009 11:32 +0200, Ramiro Alba Queipo wrote: > > > > There are 3 ways of doing an MDT backup: > > > > > > > > 1) Device-level using dd command > > > > > > > > You can do it from the original device to another local device with at > > > > least the same capacity, BUT no clients and no OSTs should be active, so > > > > NOT SUITABLE for an automated nightly backup > > > > > > Well, "no clients/OSTs should be active" is a relative term. You will > > > almost certainly have a usable backup even if the filesystem was active, > > > because ext3 has a robust on-disk layout, but you would need to run an > > > e2fsck afterward. > > > > > > > 2) File-level using tar or rsync commands > > > > > > > > You can make a copy to other directory (even remotely) BUT you MUST STOP > > > > lustre and remount it as an ''ldiskfs'' file system type. You also have to > > > > save aditional information (cd /lustre/mds; getfattr -R -d -m ''.*'' -P . > > > > > /<backup-dir>/ea.bak). So NOT SUITABLE for an automated nightly backup > > > > either > > > > > > Right. Note that when using "tar" or "rsync" you should use the "--sparse" > > > option so that it doesn''t back up empty files. Also, with newer versions > > > > Can you tell me which versions? (I am using Ubuntu 8.04 with tar-1.19 > > and rsync-2.6.9). > > The versions we tested: > - tar-1.20-5 from Fedora 10 works. 
> - tar-1.15.1-23.0.1 from RHEL 5 does NOT work > > Also, for file level backup, exclude /OBJECTS/* and /CATALOGS from the > backup, and make sure clients are unmounted during the restore or their > caches will become corrupt when the restored MDS comes back online > (due to changing inode numbers on the backing fs I believe). > > The procedure that we tested a while back is as follows (to which I would > add Andreas''s suggestion of --sparse): > > # Backup > mount -t ldiskfs -ouser_xattr /dev/sda /mnt/mdt > tar --xattrs --no-selinux --exclude ''./OBJECTS/*'' \ > --exclude ''./CATALOGS'' -C/mnt/mdt -cf backup.tar > > # Restore > mount -t ldiskfs -ouser_xattr /dev/sda /mnt/mdt > tar -C/mnt/mdt -xf backup.tar > # (be afraid if this command produces no output) > getfattr -d -m ".*" -R /mnt/mdt | grep trusted.lov | more > > > > of tar (on RHEL/FC) and rsync it is possible to have it do the backup/restore > > > of the extended attributes directly. > > > > You mean there is no need use getfattr/setfattr commands? > > > > > > > > You could also use "dump-0.4b40" (or later) to do a hybrid device/file > > > level backup. It will back up the filesystem directly from the block device, > > > but only the files that are in use. Versions 0.4b40+ can also do the > > > backup/restore of extended attributes, which is critical. > > > > > > > 3) File-level on LVM snapshots > > > > > > > > LVM allows you to make a duplication of the MDT while lustre file system > > > > is operational, so you can make afterwards a File-level backup of the > > > > LVM snapshot while everything is running. Then it IS SUITABLE for an > > > > automated backup. > > > > Disadvantages are that you need extra local space for LVM snapshots and > > > > the impact on performance of using LVM over the MDT. > > > > > > This is probably the best option. It allows consistent backups to be > > > done, and if you only keep a single snapshot the performance hit isn''t > > > too big. 
> > > > So, the best option for automated backups could be the use of LVM > > snapshots and then use ''dump'' with dump levels over the mounted > > snapshot. No needed the use of getfattr/setfattr commands, right? > > > > What about performance influence of LMV for MDT on the overall Lustre > > performance? > > > > > > > > > By the way. The procedure described at ''How do I replace an OST or MDS?'' > > > > in Apendix B of Lustre Operational Manual differs from procedure > > > > discribed at 15.1.3.1 (Backing Up an MDS File): > > > > - getfattr -R -d -m ''.*'' -P . > ea.bak > > > > - getfattr -R -e base64 -d . > /tmp/mdsea > > > > > > I would say the first one is better, though I like to use "-e hex" > > > instead of "-e base64" because the hex output is easier for me to > > > decode if I need to for some reason. Probably the "replace an OST/MDT" > > > chapter should just reference the backup/restore section instead of > > > duplicating the content. > > > > > > > On Wed, 2009-06-17 at 16:23 -0600, Andreas Dilger wrote: > > > > > On Jun 17, 2009 12:35 -0700, Cliff White wrote: > > > > > > Ramiro Alba Queipo wrote: > > > > > > > By reading Chapter 15 of Lustre Operations Manual, it follows that an > > > > > > > MDT backup is only useful if you are changing hardwary or the like. > > > > > > > I am afraid that you can not pretend to replace with a previous image an > > > > > > > failed MDT, as data in OSTs and MDT is not matching any more, right? > > > > > > > > > > > > If you do a backup/immediate restore, it should be fine. If you restore > > > > > > from an old image you will lose the changes made post-backup, but the > > > > > > rest of the data should be fine. > > > > > > cliffw > > > > > > > > > > Right - just like any backup, any changes made after the backup will of > > > > > course not be restored. 
One additional issue is that some OST objects > > > > > will not be available if they were deleted after the backup, even though > > > > > the restored MDS will still reference them. Accessing these files will > > > > > return -ENOENT. > > > > > > > > > > At that point it would be possible (though not necessary) to run "lfsck" > > > > > to clean up the inconsistencies between the MDT and OST filesystems. > > > > > It is also possible to just re-delete the files that have "-ENOENT" and > > > > > restore (from some other filesystem-level backup) the rest of the files. > > > > > > > > > > An MDS backup is a good idea, because it avoids having to restore 100TB+ > > > > > (or whatever) of data from backup, leaving only a smaller number of changed > > > > > files that might need to be restored. It should NOT be the only form of > > > > > backup for the filesystem, since it does not contain any of the FILE data. > > > > > You, or your users, should do backups of their critical files separately. > > > > > > > > > > > > On Wed, 2009-06-17 at 09:41 -0600, Daniel Kulinski wrote: > > > > > > >> As we move forward with our lustre testing I am wondering about MDT > > > > > > >> backup. > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> Is it feasible to unmount the MDT, create an image of it and remount > > > > > > >> it after the backup. Of course this wouldn???t happen but nightly. > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> From what I can identify, in the case of an MDT failure we would have > > > > > > >> to do the following: > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> Restore from the last backup. > > > > > > >> > > > > > > >> Run an lfsck across the filesystem. > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> Am I missing anything else at this point? We will also be doing file > > > > > > >> level backups of the filesystem as a whole but we are looking for > > > > > > >> quick ways to recover from an MDT failure. 
> > > > > > >>
> > > > > > >> Thanks,
> > > > > > >> Dan Kulinski
> > > > >
> > > > > Cheers, Andreas
> > >
> > > Cheers, Andreas

--
Ramiro Alba
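
For reference, the snapshot-based backup discussed in this thread could be sketched roughly as below. This is only an illustration under stated assumptions, not the procedure from the manual: the volume group, logical volume, mount point and output directory names are hypothetical, the 1G snapshot size is arbitrary, and the sketch uses getfattr plus tar rather than 'dump', because the thread does not settle whether dump alone preserves the extended attributes. The "-e hex" encoding follows the preference stated above. By default the script only prints the commands (DRY_RUN=1).

```shell
#!/bin/sh
# Sketch only: all device, volume and path names are hypothetical.
# With DRY_RUN=1 (the default) commands are printed instead of executed.

DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage: mdt_snapshot_backup VG LV MOUNTPOINT OUTDIR
mdt_snapshot_backup() {
    vg="$1"; lv="$2"; mnt="$3"; out="$4"
    snap="${lv}_snap"

    # 1. Take a point-in-time snapshot of the MDT volume.
    run lvcreate --snapshot --size 1G --name "$snap" "/dev/$vg/$lv" || return 1

    # 2. Mount the snapshot read-only as the backing filesystem.
    if run mount -t ldiskfs -o ro "/dev/$vg/$snap" "$mnt"; then
        # 3. Save the extended attributes, then 4. archive the files.
        run sh -c "cd '$mnt' && getfattr -R -d -m '.*' -e hex -P . > '$out/ea.bak'" &&
            run tar czf "$out/mdt.tgz" --sparse -C "$mnt" .
        rc=$?
        # 5. Unmount the snapshot whether or not the backup succeeded.
        run umount "$mnt"
    else
        rc=1
    fi

    # 6. Drop the snapshot so it does not keep consuming space.
    run lvremove -f "/dev/$vg/$snap"
    return "$rc"
}
```

Calling `mdt_snapshot_backup vg0 mdt /mnt/mdt_snap /backup` prints the six commands in order. Setting DRY_RUN=0 would run them for real, which requires root on the MDS and should be tried on a test system first.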