Nick Jennings | Technical Director
2009-Apr-17 12:45 UTC
[Lustre-discuss] Direct Snapshots of Lustre Filesystem & MDT size
Hello,

I've got two questions that don't seem to be answered in the Lustre Operations Manual.

- The first is the recommended MDT size. How big should your MDT partition be? Does it depend on the estimated eventual size of your Lustre file system? If so, what's the ratio?

- The next question has to do with backing up a Lustre file system. The operations manual says that there is significant CPU overhead when snapshotting the Lustre file system directly. However, the recommended workaround is to make a separate LVM file system, copy important files/directories onto it, and then take a snapshot of that. This kind of skirts the issue for me; ideally I'd like to back up the entire file system (everything is important). Making a full copy of the file system on another drive is already a backup in and of itself. Doesn't this kind of defeat the purpose? I can understand it if you only need a few files backed up, but if you have a 5TB file system and want a snapshot of it, it doesn't make much sense to make another 5TB LVM slice, copy everything there, and then take a snapshot of that.

The manual doesn't explain how to snapshot the Lustre file system directly. Is it not supported? Would it cost a significant amount of CPU cycles just during the snapshotting process, or continuously during the course of operations, in order to maintain the snapshot?

Thanks for any advice on these issues.

-Nick

-- 
Nick Jennings
Technical Director
Creative Motion Design
www.creativemotiondesign.com
Brian J. Murrell
2009-Apr-17 12:57 UTC
[Lustre-discuss] Direct Snapshots of Lustre Filesystem & MDT size
On Fri, 2009-04-17 at 14:45 +0200, Nick Jennings | Technical Director wrote:
> - The first is recommended MDT size.

I think this is discussed, at least indirectly, in the manual. If not, it most definitely has been discussed here several times. A search of the archives will probably prove quite fruitful.

> How big should your MDT partition be? Does it depend on your estimated
> eventual size of your Lustre file system?

Yes. In number of files, not total capacity.

> If so whats the ratio?

No ratio. Number of files.

> This kind of skirts the issue for me, ideally I'd like to backup the
> entire file system (everything is important).

Backup has been discussed on this list many times as well.

> Making a full copy of the file system on another drive is already a
> backup in and of itself. Doesn't this kind of defeat the purpose?

If the purpose is to make a backup, I don't see how that can be.

> I can understand if you only need a few files backed up, but if you have
> a 5TB file system, and want a snapshot of it, it doesn't make much sense
> to make another 5TB LVM slice, and copy everything there - then take a
> snapshot of that.

Yeah. I don't recall reading anything like that in the manual. It does not sound terribly practical. But then again, I suppose practicality is a function of the importance/value of your data.

> The manual doesn't explain how to snapshot the lustre file system
> directly. Is it not supported?

No. The closest you can get is LVM snapshots of the targets, and that comes with all the regular LVM snapshot caveats, again discussed on this list many times.

b.
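To make the "number of files, not total capacity" point concrete, here is a rough sizing sketch. The 4096 bytes per inode and the 2x headroom factor are assumed planning values for illustration only; they do not come from this thread, so check the sizing section of the manual for your Lustre version before relying on them. The function names are hypothetical.

```python
TIB = 1024 ** 4
GIB = 1024 ** 3
MIB = 1024 ** 2
KIB = 1024


def files_for_capacity(capacity_bytes: int, avg_file_bytes: int) -> int:
    """Estimate how many files fit in capacity_bytes at a given average file size."""
    if avg_file_bytes <= 0:
        raise ValueError("avg_file_bytes must be positive")
    return capacity_bytes // avg_file_bytes


def estimate_mdt_bytes(expected_files: int,
                       bytes_per_inode: int = 4096,  # assumed value
                       headroom: float = 2.0) -> int:  # assumed value
    """Return a rough MDT size in bytes for an expected number of files."""
    if expected_files < 0:
        raise ValueError("expected_files must be non-negative")
    return int(expected_files * headroom * bytes_per_inode)


if __name__ == "__main__":
    # The same 5 TiB of file data, stored as large files or as small files.
    for label, avg in (("5 MiB average", 5 * MIB), ("64 KiB average", 64 * KIB)):
        n_files = files_for_capacity(5 * TIB, avg)
        mdt_gib = estimate_mdt_bytes(n_files) / GIB
        print(f"{label}: {n_files} files, about {mdt_gib:.0f} GiB of MDT")
```

Under these assumptions the same 5 TiB of data needs a very different MDT depending on how many files it is split into, which is why a capacity ratio alone is not a useful answer.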
Nick Jennings
2009-Apr-17 13:37 UTC
[Lustre-discuss] Direct Snapshots of Lustre Filesystem & MDT size
Thanks for the reply Brian. I actually did try several searches regarding both questions; I guess I just wasn't using the right keywords (I was getting much more unrelated info). A couple of comments below:

Brian J. Murrell wrote:
> On Fri, 2009-04-17 at 14:45 +0200, Nick Jennings | Technical Director wrote:
>> Making a full copy of the file system on another drive is already a
>> backup in and of itself. Doesn't this kind of defeat the purpose?
>
> If the purpose is to make a backup, I don't see how that can be.

Snapshotting combined with offsite backup is what I'm going for here. Not a full copy of the filesystem on another set of (our) disks.

>> I can understand if you only need a few files backed up, but if you have
>> a 5TB file system, and want a snapshot of it, it doesn't make much sense
>> to make another 5TB LVM slice, and copy everything there - then take a
>> snapshot of that.
>
> Yeah. I don't recall reading anything like that in manual. It does not
> sound terribly practical. But then again, I suppose practicality is a
> function of the importance/value of your data.

In the intro paragraph of section 15.3 (page 214 of the PDF) it says:

  To get around this problem [performance loss of snapshotting main Lustre filesystem], create a new, backup filesystem and periodically back up new/changed files. Take periodic snapshots of this backup filesystem to create a series of compact "full" backups.

Maybe I'm misinterpreting that, but it seems to suggest copying any important data to a separate partition (w/LVM+snapshotting). If all my data is important, that's a full copy.

>> The manual doesn't explain how to snapshot the lustre file system
>> directly. Is it not supported?
>
> No. The closest you can get is LVM snapshots of the targets and that
> comes with all the regular LVM snapshot caveats, again discussed on this
> list many times.

This is what I'm going for. By "Lustre filesystem" I meant, more specifically, direct snapshotting of an OST, as opposed to a copy/snapshot on another partition.

Is there a downside to having just one large OST per OSS, or is that actually better? (Assuming the storage target can only be connected to one host at a time anyway.)
Brian J. Murrell
2009-Apr-17 15:17 UTC
[Lustre-discuss] Direct Snapshots of Lustre Filesystem & MDT size
On Fri, 2009-04-17 at 15:37 +0200, Nick Jennings wrote:
> Thanks for the reply Brian, I actually did try several searches
> regarding both questions, guess I just wasn't using the right keywords
> (was getting much more unrelated info).

Searching for "inode size" would probably find some relevant discussions on MDT size.

> Snapshotting combined with offsite backup is what I'm going for here.

Well, snapshots are just a vehicle for getting the state of the target "frozen". You could achieve the same by simply taking the device off-line, but that kind of downtime is unacceptable to some, hence the option to use an LVM snapshot, with its penalties.

> Not a full copy of the filesystem on another set of (our) disks.

Well, no matter who/what it's on, you still need a "snapshot in time" to make your copy from, whether that's to another disk (which you keep locally or move offsite) or tape or whatever.

> In the intro paragraph of section 15.3 (page 214 of the PDF) it says:
> To get around this problem [performance loss of snapshotting main Lustre
> filesystem], create a new, backup filesystem and periodically back up
> new/changed files. Take periodic snapshots of this backup filesystem to
> create a series of compact "full" backups.

Hrm. Yeah. Well, that really is your only answer to avoiding the penalties of snapshotting your live filesystem. There's no free lunch.

> Maybe I'm misinterpreting that, but it seems to suggest copying any
> important data to a separate partition (w/LVM+snapshotting). If all my
> data is important, that's a full copy.

That's right. Again, this is a balance between cost, performance and redundancy. Cheap, redundant, fast. Pick two. :-)

> This is what I'm going for. By lustre filesystem I meant more
> specifically, direct snapshotting of an OST, as opposed to a
> copy/snapshot on another partition.

You can most certainly snapshot an OST if it's an LVM target. But there are the usual COW (copy-on-write) performance caveats with that (for the life of the snapshot(s), anyway). Perhaps given your budget and data redundancy requirements, the performance penalty (again, only while a snapshot is present) is acceptable. You will have to be the judge of that.

> Is there a downside to having just one large OST per OSS,

Only the 8TB device limitation, and perhaps backup considerations. It is usually easier to back up smaller devices. The larger your dataset gets, the more difficulties you run into backing it up.

b.
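For anyone who wants to see the shape of "snapshot an OST that lives on LVM", here is a dry-run sketch that only builds and prints the command sequence; it executes nothing. The volume group and LV names, the 10G snapshot size, the mount point, and the `snapshot_backup_plan` helper are all hypothetical. Mounting the snapshot read-only as `ldiskfs` is an assumption about the target's backing filesystem, and the copy step is left to the caller because a Lustre target backup has to preserve extended attributes.

```python
import shlex
from typing import List


def snapshot_backup_plan(vg: str, lv: str, backup_cmd: List[str],
                         snap_size: str = "10G",
                         mount_point: str = "/mnt/ost-snap") -> List[List[str]]:
    """Build (but do not run) the commands for one snapshot-based backup."""
    snap_name = f"{lv}-snap"
    snap_dev = f"/dev/{vg}/{snap_name}"
    return [
        # Freeze a point-in-time view; COW overhead starts here.
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", snap_name, f"/dev/{vg}/{lv}"],
        # Assumed: the target's backing filesystem mounts as ldiskfs.
        ["mount", "-t", "ldiskfs", "-o", "ro", snap_dev, mount_point],
        # Caller-supplied copy step (to disk, tape, or offsite).
        list(backup_cmd),
        ["umount", mount_point],
        # Dropping the snapshot ends the COW penalty on the live OST.
        ["lvremove", "--force", snap_dev],
    ]


if __name__ == "__main__":
    # "my-backup-tool" is a placeholder, not a real program.
    plan = snapshot_backup_plan("vg0", "ost0",
                                ["my-backup-tool", "/mnt/ost-snap"])
    for cmd in plan:
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The ordering is the point: the snapshot exists only between the `lvcreate` and the `lvremove`, so the copy-on-write penalty lasts only as long as the copy step takes.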
Nick Jennings
2009-Apr-17 16:21 UTC
[Lustre-discuss] Direct Snapshots of Lustre Filesystem & MDT size
Brian J. Murrell wrote:
> On Fri, 2009-04-17 at 15:37 +0200, Nick Jennings wrote:
>> Thanks for the reply Brian, I actually did try several searches
>> regarding both questions, guess I just wasn't using the right keywords
>> (was getting much more unrelated info).
>
> "inode size" would probably find some relevant discussions on MDT size.

Thanks, this helped! I found the relevant piece of info under "Lustre Tuning":

http://manual.lustre.org/manual/LustreManual16_HTML/LustreTuning.html

It might be worth adding a note to the quickstart Lustre configuration section, or at least a pointer to the relevant section. The quickstart goes over formatting the MDT device but provides no info on MDT device size.

-Nick