Hi,

What is the right way to back up the MDT? People worry about what "The Day After" a catastrophic disk failure will look like. We are about to release a Lustre system into production; preproduction testing has gone well so far (thanks!).

- At present we take LVM snapshots to back up the MDT, and we were able to restore a snapshot to another node. Is there any way to capture changes made on the MDT after the snapshot was taken, and how close can we get to the point of the crash? Is there some kind of MDT journal synchronized with the LVM snapshot? Is there a way to do incremental backups so that we can back up more often?

- What is the experience with DRBD replication? There are multiple reports from sites using it, and there are also reports indicating that the replicated file system is not clean when the master MDT crashes, as DRBD knows nothing about the file system on top of it and does not synchronize with it. Is there a way to avoid corruption, or is it just fixed by fsck? Can DRBD fail over "cleanly" if we do it manually, e.g. to upgrade the master MDS? Can I verify that the slave disk is consistent with the master and is not corrupt after a year of running?

It seems that neither the LVM nor the DRBD approach is perfect. Are there plans to implement native replication of the MDT in Lustre? SNS is for OSTs only, right?

I would appreciate hearing about experience with MDT backup.

Thanks,
Alex.
On Mon, 2009-01-19 at 22:46 -0600, Alex Kulyavtsev wrote:
> Hi,
> what is the right way to backup MDT ?

I think this is covered in the operations manual.

> People get worried what will be
> "The Day After" the catastrophic disk failure.

Indeed. You should use "reliable" storage under your MDT. RAID 1 (mirroring, not necessarily limited to only 2 mirrors) is recommended.

> - At present we do LVM snapshots to backup MDT and we were able to
> restore snapshot to another node.

Excellent. You have already done more than most people do in planning and testing your recovery strategy.

> Is there any way to capture changes made on MDT after snapshot done and
> how close can we get to point of crash ?

Server Changelogs (see the roadmap) might give you what you are looking for. Specifically, they will be used to implement our Replication feature. A replicated copy of your MDT would serve your disaster recovery requirements as well. Both of these features are still in the future, however.

> Is there some kind of MDT
> journal synchronized with LVM snapshot ?

No.

> Is there way to do incremental
> backups to do it more often ?

You can make snapshots and back them up as frequently as you wish (subject, obviously, to the bandwidth limitations of your backup strategy). You could even do it incrementally by comparing adjacent snapshots and only backing up the delta between them.

> - What is an experience with DRBD replication ?

I believe there are folks here who are using DRBD for their Lustre targets.

> There are multiple
> reports from sites using it and there are also there are reports
> indicating replicated file system is not clean when master MDT crashes
> as DRBD knows nothing and does not synchronize with file system on top
> of it.

DRBD should not need to know anything about the filesystem that is on it, any more than Linux RAID (or LVM) needs to know what is on it.

> Is there way to avoid corruption or it just fixed by fsck ?

Of course, if an MDS just up and dies, the MDT, whether it be on the local disk or a DRBD replicated copy, should be fsck'd to follow best practices. That has nothing to do with DRBD though; it is simply a matter of cleaning up a filesystem left in an "open" state before using it again.

> Can DRBD
> failover "cleanly" if we do it manually e.g. to upgrade master MDS ?
> Can I verify slave disk is consistent with master and is not corrupt
> after a year of running ?

Those are questions better asked of the DRBD developers, I think. I am sure you will get more accurate answers from them.

> It seems like both LVM and DRBD approaches are not perfect.

Perfect in what sense? Every backup solution has its drawbacks, whether they be performance, cost, data freshness, etc. I think you just have to decide what aspects are important for you and budget accordingly. A "perfect" backup is probably also a very expensive (in more aspects than just money) backup. Your "happy medium" probably lies somewhere around mirroring, whether that be within a single chassis (i.e. Linux RAID) or remote mirroring such as DRBD, or both.

> Are there
> plans to implement native replication of MDT in lustre ?

Well, as I said before, Server Changelogs (and perhaps a modification to the planned "Replication" feature to only replicate metadata) could probably be utilized to that end. Do note, however, that "replication" is not mirroring: it is more "lazy" (i.e. asynchronous; incoherent) than the synchronous (and coherent) nature of mirroring, so there is still a chance of data staleness.

> SNS is for OSTs only, right ?

I believe so.

b.
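The suggestion of comparing adjacent snapshots and backing up only the delta can be read at the block level. What follows is only a rough sketch under assumptions that are not in the thread: both snapshots are readable as equal-size images or block devices, the function names `changed_blocks` and `save_delta` and the 4096-byte block size are made up for illustration, and nothing here is a Lustre or LVM tool.

```shell
#!/bin/sh
# Sketch only: block-level delta between two snapshot images.
# Function names and the default block size are illustrative assumptions.

# changed_blocks OLD NEW [BLOCK_SIZE]
# Print the zero-based index of each block that differs between OLD and NEW.
changed_blocks() {
    cb_bs=${3:-4096}
    # cmp -l prints one line per differing byte: 1-based offset, then both bytes.
    cmp -l "$1" "$2" 2>/dev/null |
        awk -v bs="$cb_bs" 'BEGIN { last = -1 }
            { b = int(($1 - 1) / bs); if (b != last) print b; last = b }'
}

# save_delta OLD NEW OUTDIR [BLOCK_SIZE]
# Copy every changed block of NEW into OUTDIR as a file named block.<index>.
save_delta() {
    sd_bs=${4:-4096}
    mkdir -p "$3" || return 1
    changed_blocks "$1" "$2" "$sd_bs" | while read -r sd_idx; do
        dd if="$2" of="$3/block.$sd_idx" bs="$sd_bs" skip="$sd_idx" count=1 2>/dev/null
    done
}
```

A call such as `save_delta /backup/mdt-monday.img /backup/mdt-tuesday.img /backup/delta-tuesday` (hypothetical paths) would leave one small file per changed block. Reading both images in full is still required, so this saves backup space and bandwidth rather than read time.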
On Jan 20, 2009 09:43 -0500, Brian J. Murrell wrote:
> > Is there way to avoid corruption or it just fixed by fsck ?
>
> Of course, if an MDS just up and dies, the MDT, whether it be on the
> local disk or a DRBD replicated copy should be fsck'd to follow best
> practises. That's got nothing to do with DRBD though, but just simply
> cleaning up a filesystem left in an "open" state before using it again.

To clarify: you do not need to run e2fsck just because of a crash. e2fsck is required when the backing storage is running with write cache enabled and the write cache was not safely written to disk, or in the case of a double failure (e.g. RAID disk failure + server failure).

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
Brian J. Murrell wrote:
> On Mon, 2009-01-19 at 22:46 -0600, Alex Kulyavtsev wrote:
>> Hi,
>> what is the right way to backup MDT ?
>
> I think this is covered in the operations manual.
>

The manual does indeed cover backups:
http://manual.lustre.org/manual/LustreManual16_HTML/BackupAndRestore.html

It includes this step:

> 3. Back up the EAs, run:
> getfattr -R -d -m '.*' -P . > ea.bak
>

Is that necessary for newer versions of tar that support the "--xattrs" option? From the man page (I have GNU tar 1.15.1):

    --xattrs  this option causes tar to store each file's
              extended attributes in the archive. This option also
              enables --acls and --selinux if they haven't been
              set already, due to the fact that the data for those
              are stored in special xattrs.

Also, AFAIK, cpio will archive extended attributes as well.

Is either 'tar --xattrs' or 'cpio' sufficient for capturing all the Lustre MDT information? Would that eliminate the need for the 'getfattr' step, and cut the time for backup almost in half by requiring only one pass through the filesystem?

Thanks,
Nathan
On Tue, 2009-02-03 at 10:57 -0700, Nathan Dauchy wrote:
> Is either 'tar --xattrs' or 'cpio' sufficient for capturing all the
> Lustre MDT information? Would that eliminate the need for the
> 'getfattr' step, and cut the time for backup almost in half by requiring
> only one pass through the filesystem?

The "tar" package shipped with distros cannot recognize "lustre." extended attributes and hence cannot be used for Lustre backups.

You will need to use the Lustre-patched tar available for download at
http://downloads.lustre.org/public/tools/lustre-tar/

Thanks,
Kalpak
On Tue, Feb 03, 2009 at 11:47:37PM +0530, Kalpak Shah wrote:
> The "tar" package shipped with distros cannot recognize "lustre."
> extended attributes and hence cannot be used for Lustre backups.
>
> You will need to use Lustre patched tar available for download at
> http://downloads.lustre.org/public/tools/lustre-tar/

I just did some testing with rsync-2.6.8-3.1 from RHEL 5 and it appears to support Lustre's extended attributes with its -X (--xattrs) option.

The test I ran was as follows:

See if initial copy looks correct:
- create some files in lustre (nonzero length)
- stop lustre and remount mdt with -t ldiskfs on /mnt/lustre/mdt
- create an ldiskfs file system on the MDS with -Oext_attr and mount
  on /mnt/lustre/mdt-backup
- rsync -Xav --delete /mnt/lustre/mdt/ /mnt/lustre/mdt-backup
- verify with getfattr -d -m '.*' -P that attrs are identical on all files
  in both copies

Make sure rsync updates file/dir if only attrs change:
- unmount /mnt/lustre/mdt and restart lustre
- run lfs setstripe --count -1 on an existing directory in lustre
- stop lustre and remount mdt with -t ldiskfs on /mnt/lustre/mdt
- verify with getfattr that attrs differ on that dir in both copies
- repeat rsync
- verify with getfattr that attrs are identical on that dir in both copies

Verify that a restore works:
- rm -rf /mnt/lustre/mdt/*
- rsync -Xav --delete /mnt/lustre/mdt-backup/ /mnt/lustre/mdt
- unmount /mnt/lustre/mdt and restart lustre
- verify that md5sums of files created in first test match originals

Jim
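The two "verify" steps in the test procedure above (identical attributes in both copies, matching md5sums after restore) could be scripted roughly as follows. This is a sketch under stated assumptions, not the commands actually used in the test: the function names are invented for illustration, `md5sum` is assumed to be the GNU coreutils one, and the attribute dumps are assumed to have been produced from inside each tree, e.g. with the manual's `getfattr -R -d -m '.*' -P . > ea.bak`, so that paths are relative.

```shell
#!/bin/sh
# Sketch only: helpers for the "verify" steps. All function names are illustrative.

# flatten_ea_dump DUMP
# Turn "getfattr -d" output into sorted "path<TAB>name=value" lines so that two
# dumps can be compared even if the trees were walked in a different order.
flatten_ea_dump() {
    awk '/^# file: / { path = substr($0, 9); next }
         NF          { print path "\t" $0 }' "$1" | sort
}

# same_eas DUMP_A DUMP_B
# Succeed only if both dumps list the same attributes with the same values.
same_eas() {
    [ "$(flatten_ea_dump "$1")" = "$(flatten_ea_dump "$2")" ]
}

# make_manifest DIR
# Print an md5sum line for every regular file under DIR, with relative paths.
make_manifest() {
    (cd "$1" && find . -type f -exec md5sum {} + | sort -k 2)
}

# check_manifest DIR MANIFEST
# Succeed only if every file listed in MANIFEST still has the recorded checksum.
check_manifest() {
    mf_abs=$(cd "$(dirname "$2")" && pwd)/$(basename "$2")
    (cd "$1" && md5sum -c "$mf_abs" >/dev/null 2>&1)
}
```

The manifest would be taken on a Lustre client before the MDT is wiped (`make_manifest /mnt/lustre > /tmp/before.md5`, with a hypothetical client mount point) and checked after the restore with `check_manifest /mnt/lustre /tmp/before.md5`. `same_eas` holds both flattened dumps in shell variables, which is fine for a test tree but not for a production-size MDT.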
On Mon, 2009-03-16 at 13:43 -0700, Jim Garlick wrote:
> I just did some testing with rsync-2.6.8-3.1 from rhel 5 and it appears
> to support Lustre's extended attributes with its -X (--xattrs) option.
[...]
> Verify that a restore works:
> - rm -rf /mnt/lustre/mdt/*
> - rsync -Xav --delete /mnt/lustre/mdt-backup/ /mnt/lustre/mdt
> - unmount /mnt/lustre/mdt and restart lustre
> - verify that md5sums of files created in first test match originals

Is the file restored with the same striping pattern as it was archived with?

Anyway, it's good to know that rsync can back up Lustre attributes; we can think about modifying rsync to restore striping parameters as well.

Thanks,
Kalpak
Hello,

No intention to spam, but I would like to mention that there is a Lustre group on LinkedIn for those interested.

The URL is http://www.linkedin.com/groups?home=&gid=1772375

jab
Alex:

For what it's worth, we gave up on backing up our MDS because of its sheer size and the contents of our filesystem. We have close to 40TB of space with very small files. Backing up the MDS without an LVM snapshot took almost 5 days. It's probably better to have a backup of your most important files; if the MDS gets corrupted or lost, I would simply recreate the filesystem and restore the most important files. Of course, your network admin won't be happy :-)

Hope this helps.

On Mon, Mar 16, 2009 at 11:47 PM, Kalpak Shah <Kalpak.Shah at sun.com> wrote:
> Is the file restored with the same striping pattern as it was archived
> with?
>
> Anyways, its good to know that rsync can backup Lustre attributes, we
> can think about modifying rsync to restore striping parameters as well.
>
> Thanks,
> Kalpak
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
Is there a Facebook page?

On Tue, Mar 17, 2009 at 1:48 AM, Jeffrey Bennett <jab at sdsc.edu> wrote:
> Hello,
>
> No intention to spam but I would like to mention that there is a Lustre group on LinkedIn for those interested.
>
> The URL is http://www.linkedin.com/groups?home=&gid=1772375
>
> jab