Hi all,

I have a bit of a problem here. I am currently running Lustre 1.6.7 on 5 servers: 2 servers for MGS/MDT and 3 for OSDs. Behind the 2 MGS/MDT servers (failover setup) is a RAID array, so that both machines can use the device. Now the partition table on the RAID device got deleted and cannot be recovered. The OSTs are OK and the data on those should be fine.

Now here is my question: is it possible to create a new MGS and new MDTs and somehow connect the old OSTs to them? Is there a way to recreate the metadata from the data which is held on the OSTs?

I'm deeply grateful for any help or hint on this issue.

Thanks in advance,
Tom Woezel
Hi again,

Or maybe there is a way to get my hands on the data which is on the OSTs? Some way to mount it without MGS and MDT maybe?

If this is possible I could just recreate everything from scratch.

Thanks again.
Tom
Hi Tom!

On Wed, 15 Jul 2009 13:28:18 +0200 Tom Woezel <twoezel at it.dcs.ch> wrote:
> Or maybe there is a way to get my hands on the data which is on the
> OSTs? Some way to mount it without MGS and MDT maybe?
>
> If this is possible I could just recreate everything from scratch.

Well... as far as I understand, the OST is a plain ext3 partition, so mounting the partition locally as ext3 should work fine. If you didn't use striping on the filesystem (no striping is the default), you should be able to access all the files of a single OST.

wolfgang
On Wed, 2009-07-15 at 10:53 +0200, Tom Woezel wrote:
> Hi all,

Hi,

> I have a bit of a problem here. I am currently running Lustre 1.6.7 on
> 5 servers. 2 servers for MGS/MDT and 3 for OSDs.

OST, not OSD.

> Behind the 2 MGS/MDT servers (failover setup) is a raidarray so that
> both machines can use the device.

Good.

> Now the partition table on the raiddevice got deleted and
> cannot be recovered.

Ouch. How did it get deleted? Why can it not be recovered? A partition table is nothing more than a small area at the start of a disk that contains pointers (i.e. offsets on the disk) to where partitions start and end. Even if it was completely wiped, scanning the entire disk for "signatures" that identify a likely partition beginning, and then recreating the partition table from them, is usually quite successful. You might want to look into the "gpart" tool for this.

> The OSTs are ok and the data on those should be fine.

Yes. But all you have is file "contents", nothing else.

> Now here is my question: is it possible to create a new MGS and new
> MDTs and somehow connect the old OSTs to them?

No. There is nothing on the OSTs that indicates what file an object belongs to. This is why we are adamant about MDT storage being reliable and backed up.

> Is there a way to recreate the metadata with the data which is held
> on the OSTs?

No.

> I'm deeply grateful for any help or hint on this issue.

Without knowing the whole story of your MDT/RAID saga, I'd say gpart is your best bet.

b.
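The signature scan described above can be illustrated with a small sketch. This is not how gpart is implemented; it only shows the idea for one filesystem family. It assumes the lost partitions hold ext2/ext3-style filesystems (ldiskfs is derived from ext3, so the same magic number is assumed to apply), whose superblock starts 1024 bytes into the filesystem and carries the magic value 0xEF53 at offset 56. The function name and parameters are made up for illustration.

```python
import io
import struct

SECTOR = 512
EXT_SUPERBLOCK_OFFSET = 1024   # superblock starts 1024 bytes into the filesystem
EXT_MAGIC_OFFSET = 56          # position of s_magic within the superblock
EXT_MAGIC = 0xEF53             # little-endian 16-bit magic value


def find_ext_candidates(dev, max_bytes):
    """Return sector-aligned byte offsets where an ext-style filesystem may start.

    dev is any seekable binary file object (a disk image or a block device
    opened read-only); max_bytes limits how far the scan goes.
    """
    hits = []
    offset = 0
    magic_pos = EXT_SUPERBLOCK_OFFSET + EXT_MAGIC_OFFSET
    while offset + magic_pos + 2 <= max_bytes:
        dev.seek(offset + magic_pos)
        raw = dev.read(2)
        if len(raw) == 2 and struct.unpack("<H", raw)[0] == EXT_MAGIC:
            hits.append(offset)
        offset += SECTOR
    return hits


# Small self-check on an in-memory "disk" with one fake filesystem at sector 8.
image = bytearray(64 * SECTOR)
fs_start = 8 * SECTOR
image[fs_start + 1080:fs_start + 1082] = struct.pack("<H", EXT_MAGIC)
candidates = find_ext_candidates(io.BytesIO(bytes(image)), len(image))
```

A real scan would also see backup superblocks and stray data that happens to contain the magic value, so every hit is only a candidate that needs checking (for example against the filesystem size recorded in the superblock) before a new partition table is written.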
On Wed, 2009-07-15 at 13:28 +0200, Tom Woezel wrote:
> Hi again,

Hi,

> Or maybe there is a way to get my hands on the data which is on the
> OSTs? Some way to mount it without MGS and MDT maybe?

Sure. You can mount an OST with "mount -t ldiskfs ..." to see the contents of it. The objects are in /O/0 under the place you mounted it. From there you can examine the contents of individual objects. But note that if any files were striped, their total contents will be in multiple objects, with no linkage from one object to another.

b.
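Once an OST is mounted that way (preferably read-only), an inventory of the objects is a useful first step. The following is a minimal sketch that walks O/0 below a mount point and records each object's relative path and size. The function name is hypothetical, and the code makes no assumption about how the objects are spread over subdirectories of O/0; it simply walks whatever is there.

```python
import os


def list_ost_objects(mount_point):
    """Return sorted (relative_path, size_in_bytes) pairs for files under O/0."""
    root = os.path.join(mount_point, "O", "0")
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                rel = os.path.relpath(path, root)
                found.append((rel, os.path.getsize(path)))
    return sorted(found)
```

For example, `list_ost_objects("/mnt/ost0")` (the mount point is an assumption) returns one entry per object. Sizes alone will not identify files, but they help to spot the large objects worth examining first.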
On Wed, Jul 15, 2009 at 08:32:27AM -0400, Brian J. Murrell wrote:
>On Wed, 2009-07-15 at 10:53 +0200, Tom Woezel wrote:
>> Now the partition table on the raiddevice got deleted and
>> cannot be recovered.
>Ouch. How did it get deleted? How come it cannot be recovered? A
>partition table is nothing more than a small area at the start of a disk
>that contains pointers (i.e. offsets on the disk) to where partitions
>start and end.

if a kernel is still up and looking at the device then /proc/partitions and /sys/block/<disk>/* might well still contain enough valid data from which the previous partition table can be reconstructed.

been there, dd'd over that. (almost) all good in the end :-)
thankfully not to a Lustre fs, just my home server :-/

cheers,
robin
--
Dr Robin Humble, HPC Systems Analyst, NCI National Facility
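A rough sketch of how the kernel's view could be captured before a reboot discards it: on Linux, each partition directory under /sys/block/<disk>/ exposes a "start" and a "size" file, both counted in 512-byte sectors, while /proc/partitions lists sizes in 1 KiB blocks. The function name and the `sysfs_root` parameter below are made up for illustration; the parameter exists only so the code can be tried against a copy of the tree.

```python
import os


def read_partition_geometry(disk, sysfs_root="/sys/block"):
    """Return {partition_name: (start_sector, sector_count)} for one disk.

    Only subdirectories that contain both a 'start' and a 'size' file are
    treated as partitions; other entries (queue, holders, ...) are skipped.
    """
    disk_dir = os.path.join(sysfs_root, disk)
    geometry = {}
    for entry in sorted(os.listdir(disk_dir)):
        start_file = os.path.join(disk_dir, entry, "start")
        size_file = os.path.join(disk_dir, entry, "size")
        if os.path.isfile(start_file) and os.path.isfile(size_file):
            with open(start_file) as f:
                start = int(f.read().strip())
            with open(size_file) as f:
                size = int(f.read().strip())
            geometry[entry] = (start, size)
    return geometry
```

Saving this output somewhere off the affected device gives the start sector and length of every partition the running kernel still knows about, which is what a partitioning tool needs in order to recreate the table by hand.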
On Jul 15, 2009 08:32 -0400, Brian J. Murrell wrote:
> Even if it was completely wiped, the process of scanning the entire disk
> looking for "signatures" that can help identify a likely partition
> beginning and then recreate the partition table is usually quite
> successful. You might want to look into the "gpart" tool for this.
>
> On Wed, 2009-07-15 at 10:53 +0200, Tom Woezel wrote:
> > Now here is my question is, it possible to create a new MGS and new
> > MDTs and somehow connect the old OSTs to them?
>
> No. There is nothing on the OSTs that indicate what file an object
> belongs to. This is why we are adamant about MDT storage being reliable
> and backed up.

That isn't totally correct - there is an xattr on each OST object that contains the MDT inode number and stripe index for that object. That doesn't help much in this case, because you will still have a whole filesystem of objects without filenames.

> Without knowing the whole story of your MDT/RAID saga, I'd say gpart is
> your best bet.

Recovering the MDS data using gpart seems like a better idea. If the corruption is in the RAID layout itself, that is a harder issue to fix.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
On Wed, 2009-07-15 at 09:53 -0400, Andreas Dilger wrote:
>
> That isn't totally correct - there is an xattr on each OST object that
> contains the MDT inode number and stripe index for that object.

Ahhh. Yes, I do recall that too. I think because I was so fixed on the "object->filename" resolution that the OP was looking for, I didn't think the inode xattr would be terribly useful to him. But indeed, the stripe index could be helpful in ordering objects in striped files.

Thanks for the reminder, Andreas.

b.
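To show why the stripe index matters, here is a minimal sketch of putting a striped file back together once its objects have been identified and sorted by stripe index. It assumes plain round-robin striping with a fixed stripe size and a file without holes; reading the xattr itself is not shown. The function name and the `stripe_size` parameter are hypothetical, and the real stripe size of the filesystem would have to be known or guessed.

```python
def reassemble_striped(objects, stripe_size):
    """Interleave fixed-size chunks from objects given in stripe-index order.

    objects is a list of bytes-like values, one per OST object, ordered by
    stripe index. Chunk k of the file is taken from object k % len(objects)
    at offset (k // len(objects)) * stripe_size.
    """
    if not objects or stripe_size <= 0:
        return b""
    out = bytearray()
    chunk_no = 0
    while True:
        obj = objects[chunk_no % len(objects)]
        start = (chunk_no // len(objects)) * stripe_size
        chunk = obj[start:start + stripe_size]
        if not chunk:
            # First exhausted object marks the end of a file without holes.
            break
        out += chunk
        chunk_no += 1
    return bytes(out)


# Two objects holding an 18-byte file striped in 4-byte chunks.
obj0 = b"AAAACCCCEE"   # chunks 0, 2, 4
obj1 = b"BBBBDDDD"     # chunks 1, 3
rebuilt = reassemble_striped([obj0, obj1], 4)
```

For real objects the same arithmetic would be applied to files read in chunks rather than to in-memory byte strings, and sparse files would need extra care because an empty chunk no longer means end of file.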
Thank you all for the answers so far.

The main problem with the RAID is that the partition table has been overwritten. I don't know how it happened, but it seems that another machine could see the RAID device during a kickstart installation and used it (some of our kickstart files use the clearpart --all --initlabel option).

However, I was able to recover a partition table with gpart (I tried TestDisk before and it didn't work) and will now see if I can somehow recover the MGS and MDTs.

Regards
Tom
-----------------------------------------------------------------
Tom Woezel | DCS Contractor in DMO/OTS/SOS Group
Office 2001 ESO/IPP | System Administrator
Tel.:+49-89-32006-184 | Fax.:+49-89-32006-677
| Address:
| European Southern Observatory
mailto:twoezel at it.dcs.ch | Karl-Schwarzschild-Strasse 2
| D-85748 Garching bei Munchen, Germany
web: http://www.dcs.ch | http://www.eso.org
-----------------------------------------------------------------
Hello Tom,

On Wednesday 15 July 2009, Tom Woezel wrote:
> Thank you all for the answers so far. The main problem with the raid
> is that the partition table has been overwritten. I don't know how it
> happened but it seems that another machine could see the raid device
> during a kickstart installation and used it (some of our kickstart
> files use the clearpart --all --initlabel option).
>
> However I was able to recover a partition table with gpart (tried
> TestDisk befor and didn't work) and will now see if I can somehow
> recover the MGS and MDTs.

Have you been able to recover your data? We (DDN) are presently recovering a reformatted MDT. Compared to that, your case sounds rather minor (unless you can't assemble a raid5 or raid6). I'm going to update bug#19904 with the new tools we have, which are based on MDT directory entry recovery. If you still can't recover anything, those tools might be helpful for you, too.

Cheers,
Bernd
--
Bernd Schubert
DataDirect Networks