We do a lot of fluid simulations at my university, but on a similar
note I would like to know what the Lustre experts would do in
particular simulated scenarios...

The environment is this:
30 Servers (All Linux)
1000+ Clients (All Linux)

30 Servers
1 MDS
30 OSTs each with 2TB of storage

No failover capabilities.

Scenario 1:
Your client is trying to mount the Lustre filesystem using the lustre module,
and it hangs. What do you do?

Scenario 2:
Your MDS won't mount. It says, "The server is already running".
You try to mount it a couple of times and it still doesn't mount.

Scenario 3:
An OST/OSS reboots due to a power outage. Some files are striped on it,
and some aren't. What happens? What should be done for minimal outage?

Scenario 4:
lctl dl shows some devices in the "ST" state. What does that mean, and how
do I clear it?

I know some of these scenarios may be ambiguous, but please let me
know which so I can elaborate further. I am eventually planning to
wiki this for future reference and for other Lustre newbies.

If anyone else has any other scenarios, please don't be shy, ask
away. We can create a good troubleshooting doc similar to the
operations manual.

TIA
Mag Gam wrote:
> We do a lot of fluid simulations at my university, but on a similar
> note I would like to know what the Lustre experts would do in
> particular simulated scenarios...
>
> The environment is this:
> 30 Servers (All Linux)
> 1000+ Clients (All Linux)
>
> 30 Servers
> 1 MDS
> 30 OSTs each with 2TB of storage
>
> No failover capabilities.
>
> Scenario 1:
> Your client is trying to mount the Lustre filesystem using the lustre module,
> and it hangs. What do you do?

Answer 0 to all questions:
"Read the Lustre Manual. File doc bugs in Lustre Bugzilla if there's a part
you don't understand, or a part missing."

Answer 1 to all your questions:
"Check syslogs/consoles on the impacted clients.
Check syslogs/consoles on _all_ Lustre servers.
Pay careful attention to timestamps.
Work backwards to the first error."

Is the problem restricted to one client or seen by multiple clients?
If multiple clients, start with the network; use lctl ping to check Lustre
connectivity.
If a single client, it's generally a client config/network config issue.

> Scenario 2:
> Your MDS won't mount. It says, "The server is already running".
> You try to mount it a couple of times and it still doesn't mount.

Be certain the server is not already running.
Be certain no hung mount processes exist.
Unload all Lustre modules (the lustre_rmmod script will do this).
Retry and -> answer 1.

> Scenario 3:
> An OST/OSS reboots due to a power outage. Some files are striped on it,
> and some aren't. What happens? What should be done for minimal outage?

- Clients can be mounted with a dead OST using the exclude option to the
  mount command. lfs getstripe can be run from clients to find files
  on the bad OST. See answer 0 for the detailed process.

> Scenario 4:
> lctl dl shows some devices in the "ST" state. What does that mean, and how
> do I clear it?

ST = stopped.
Clear this by cleaning up all devices (answer 0)
or restarting the stopped devices.
This usually indicates an error/issue with the stopped device, so see
answer 1.

> I know some of these scenarios may be ambiguous, but please let me
> know which so I can elaborate further. I am eventually planning to
> wiki this for future reference and for other Lustre newbies.

Please contribute to wiki.lustre.org - there is considerable information
there already, and a decent existing structure.

> If anyone else has any other scenarios, please don't be shy, ask
> away. We can create a good troubleshooting doc similar to the
> operations manual.

Again, please file doc bugs at bugzilla.lustre.org and contribute to
wiki.lustre.org. Hope this helps!
cliffw

> TIA
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
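
The "answer 1" routine above (collect logs from every node, watch the timestamps, work back to the first error) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the host names and log lines in LOGS are made-up placeholders, the lines are assumed to start with a classic syslog "Mon DD HH:MM:SS" stamp, and the year is passed in because that stamp does not carry one.

```python
from datetime import datetime

# Made-up placeholder data: real hostnames and log text will differ.
LOGS = {
    "client17": ["Aug  7 10:02:11 client17 kernel: LustreError: <placeholder text>"],
    "mds1": [
        "Aug  7 09:58:40 mds1 kernel: Lustre: <placeholder informational text>",
        "Aug  7 10:01:55 mds1 kernel: LustreError: <placeholder text>",
    ],
    "oss07": ["Aug  7 10:01:02 oss07 kernel: LustreError: <placeholder text>"],
}


def parse_stamp(line, year):
    """Parse the leading 'Mon DD HH:MM:SS' of a syslog-style line."""
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def first_error(logs, year, marker="LustreError"):
    """Return (timestamp, host, line) for the earliest line containing marker."""
    hits = [
        (parse_stamp(line, year), host, line)
        for host, lines in logs.items()
        for line in lines
        if marker in line
    ]
    return min(hits) if hits else None


when, host, line = first_error(LOGS, 2008)
print(host, when.isoformat())
```

With the placeholder data, the earliest matching line comes from oss07, which is where one would start reading. Clock skew between nodes is ignored here; on a real cluster it matters.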
CliffW:

This helps out a lot!

We still have problems identifying devices. We don't know what their
numbers are (I have been using lctl dl), and I don't know how to activate
or deactivate them.

Do you have an example?

TIA

On Thu, Aug 7, 2008 at 10:59 AM, Cliff White <Cliff.White at sun.com> wrote:
> ST = stopped.
> Clear this by cleaning up all devices (answer 0)
> or restarting the stopped devices.
> Usually indicates an error/issue with the stopped device, so see
> answer 1.
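
On finding device numbers: as far as I know, each line of lctl dl output starts with the device number, followed by the state, the device type, and the device name. A minimal Python sketch that picks out devices that are not UP might look like this. The SAMPLE text is invented for illustration (names and UUIDs are placeholders), and the column layout is an assumption to check against your own output.

```python
from typing import NamedTuple

# Invented sample in the assumed "number state type name uuid refcount" layout.
SAMPLE = """\
  0 UP mgc MGC10.0.0.1@tcp 11111111-aaaa-bbbb-cccc-000000000001 5
  1 UP lov demo-clilov-ffff0001 22222222-aaaa-bbbb-cccc-000000000002 4
  2 UP osc demo-OST0000-osc-ffff0001 22222222-aaaa-bbbb-cccc-000000000002 5
  3 ST osc demo-OST0001-osc-ffff0001 22222222-aaaa-bbbb-cccc-000000000002 5
"""


class Device(NamedTuple):
    number: int
    state: str
    type: str
    name: str


def parse_device_list(text):
    """Turn 'lctl dl'-style text into Device records, skipping odd lines."""
    devices = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        devices.append(Device(int(fields[0]), fields[1], fields[2], fields[3]))
    return devices


def not_up(devices):
    """Devices whose state column is anything other than UP (for example ST)."""
    return [d for d in devices if d.state != "UP"]


for dev in not_up(parse_device_list(SAMPLE)):
    print(dev.number, dev.state, dev.name)
```

The number in the first column is, to my understanding, what the manual's lctl --device <number> deactivate and activate commands take; please confirm against the manual for your Lustre version before running them.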
Mag Gam wrote:
> CliffW:
>
> This helps out a lot!
>
> We still have problems identifying devices. We don't know what their
> numbers are (I have been using lctl dl), and I don't know how to activate
> or deactivate them.
>
> Do you have an example?

Yup:
http://manual.lustre.org/manual/LustreManual16_HTML/KnowledgeBase.html#50544717_84403
I think the .pdf version has more details.
cliffw
I am trying to track logged errors upstream, from the error to the
file that may have been affected.

What is the easy (and not so dangerous) way to:

1. Derive the OST inode from an OST object?
   Take the OST object ID modulo 32 to get the directory on the OST,
   cd into O/0/d$modulo_number,
   then run debug.ldiskfs (stat) on the file (the OST object);
   that displays the inode of the object on the OST.

2. Derive the MDS inode from the OST inode?
   Use a nice tool that takes the OST inode and gives me the MDS inode, or
   decode, using the source code, the extended attributes
   that appear as a hex string on the "fid =" line of the output
   from the debugfs step above.

3. Derive the filename from the MDS inode?
   Run debug.ldiskfs (ncheck) on the MDS inode;
   that displays the filename.

PS: debug.ldiskfs is used with the -c option so that it loads faster.

--
}}}===============>> LLNL
James E. Harm (Jim); jharm at llnl.gov
System Administrator, ICCD Clusters
(925) 422-4018 Page: 423-7705x57152
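
The modulo-32 step in item 1 can be written down as a tiny Python helper. This is a sketch of the layout as described in the message above (O/0/d<object ID mod 32>/<object ID>); the function name, the default arguments, and the example object ID are made up for illustration.

```python
def ost_object_path(object_id, group=0, subdirs=32):
    """Relative path of an OST object, per the O/<group>/d<id % 32>/<id> layout."""
    return f"O/{group}/d{object_id % subdirs}/{object_id}"


# Arbitrary example ID: 1234567 % 32 == 7, so the object lives under d7.
print(ost_object_path(1234567))
```

The resulting path is what would be handed to the stat request in the debugfs step described in item 1.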
Jim Harm wrote:
| I am trying to track logged errors upstream from the error to the
| file that may have been affected.
|
| What is the easy(and not so dangerous) way to:
|
| 1. derive OST inode from OST object?
| OST object modulo 32 for directory on OST
| then run debug.ldiskfs(stat) the file(ost object),
| after cd into O/0/d$modulo_number,
| that displays inode of object on the OST
|

Jim,
We have a rudimentary tool that I developed here at LLNL that
does what I think you want here.
You asked about getting an OST inode from an OST object. All you have
to do is stat the file using debugfs to get at that information.
What I think you want is something a bit more tricky.

We had an incident here where fsck found some corruption and
moved some OST objects into lost+found. One nice thing about
Lustre is that it stores extended attributes about the file with
the inode.

We have a tool here called eadump.ldiskfs that reads and decodes the
extended attribute information for an OST object. This tells you
what the object ID should be for the file as well as what the MDS
inode should be (this also answers your #2 below)...=)

EG:
| eadump.ldiskfs -d /dev/sdc -i 105906277
Name: trusted.fid Value: MDSINO: 112108525 GEN: 1401146486 STRIPEIDX: 1 OBJID: 10942568 GROUP: 0

| 2. derive MDS inode from OST inode?
| use a tool that is nice uses OST inode and gives me the mds inode or
| decode using source code the extended attributes
| that are in some hex string that is in the output
| from the debugfs step above at "fid =" line.
|
| 3.derive filename from MDS inode?
| run debug.ldiskfs(ncheck) the MDS inode
| that displays the filename.

Using ncheck in debugfs is the only way I know of to get at this information.
This is a SLOW process since it has to rumble through the whole filesystem for it.
You should also note that this filename may not be the only one pointing to
that inode.

|
| PS; debug.ldiskfs used with -c option to load faster.
|
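
If the eadump.ldiskfs output needs to be consumed by a script, the single line shown above can be split into fields with a short Python sketch. eadump.ldiskfs is an LLNL in-house tool, so the only thing known about its output format here is that one sample line; the parser below assumes every field after "Value:" is a KEY: integer pair, and the function name is made up.

```python
import re

# The one sample line quoted in the message above.
SAMPLE = (
    "Name: trusted.fid Value: MDSINO: 112108525 GEN: 1401146486 "
    "STRIPEIDX: 1 OBJID: 10942568 GROUP: 0"
)


def parse_eadump_fid(line):
    """Return the KEY: integer pairs that follow 'Value:' as a dict."""
    _, _, value = line.partition("Value:")
    return {key: int(num) for key, num in re.findall(r"(\w+):\s*(\d+)", value)}


fid = parse_eadump_fid(SAMPLE)
print(fid["MDSINO"], fid["OBJID"])
```

Here fid["MDSINO"] is the MDS inode to feed to the ncheck step, and fid["OBJID"] is the object ID the OST file should carry.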
On Aug 08, 2008 14:19 -0700, Herb Wartens wrote:
> We have a rudimentary tool that I developed here at LLNL that
> does what I think you want here.
> You asked for getting an OST inode from an ost object. All you have
> to do is stat the file using debugfs to get at that information.
> What I think you want is something a bit more tricky.
>
> We had an incident here where the fsck found some corruption and
> moved some OST objects into the lost+found. One nice thing about
> Lustre is that it stores extended attributes about the file with
> the inode.
>
> We have a tool here called eadump.ldiskfs that reads and decodes the
> extended attribute information for an ost object. This tells you
> what the object id should be for the file as well as what the mds
> inode should be as well (This also answers your #2 below)...=)
>
> EG:
> | eadump.ldiskfs -d /dev/sdc -i 105906277
> Name: trusted.fid Value: MDSINO: 112108525 GEN: 1401146486 STRIPEIDX: 1 OBJID: 10942568 GROUP: 0

Note that there is also a new tool, ll_recover_lost_found_objs, in 1.6.6 (also
in bugzilla) that will move objects from lost+found back into place in
O/0/d*, including rebuilding the directory structure there if it was broken
for some reason. It will also (AFAIR) print out the MDS inode number.

> | 2. derive MDS inode from OST inode?
> | use a tool that is nice uses OST inode and gives me the mds inode or
> | decode using source code the extended attributes
> | that are in some hex string that is in the output
> | from the debugfs step above at "fid =" line.
> |
> | 3.derive filename from MDS inode?
> | run debug.ldiskfs(ncheck) the MDS inode
> | that displays the filename.
>
> Using ncheck in debugfs is the only way I know of to get at this information.
> This is a SLOW process since it has to rumble through the filesystem for it.
> You should also note that this filename may not be the only one pointing to
> that inode.

Right. There is a discussion underway about storing the filename(s) in
the inode itself to allow this kind of operation to be done in O(path_parts)
instead of O(number_of_inodes * path_parts). This is also needed for things
like changelog generation.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
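
To make the O(path_parts) versus O(number_of_inodes * path_parts) point concrete, here is a toy Python model. It is not how debugfs or Lustre are implemented: the tree, the inode numbers, and all function names are invented, and each inode is assumed to have a single name (no hard links). It only contrasts a scan over every directory entry with following a stored (parent, name) back-pointer.

```python
# Toy directory tree: {directory inode: {entry name: child inode}}.
TREE = {
    2: {"home": 11, "scratch": 12},
    11: {"alice": 21},
    21: {"run.dat": 31},
    12: {},
}
ROOT = 2


def path_by_scan(tree, root, target):
    """Scan-style lookup: walk directory entries until the target inode turns up."""
    stack = [(root, "")]
    while stack:
        inode, path = stack.pop()
        for name, child in tree.get(inode, {}).items():
            child_path = f"{path}/{name}"
            if child == target:
                return child_path
            stack.append((child, child_path))
    return None


def build_backpointers(tree):
    """Model of 'name stored with the inode': child inode -> (parent inode, name)."""
    return {
        child: (parent, name)
        for parent, entries in tree.items()
        for name, child in entries.items()
    }


def path_by_backpointer(back, root, target):
    """Follow back-pointers to the root: one step per path component."""
    parts = []
    inode = target
    while inode != root:
        inode, name = back[inode]
        parts.append(name)
    return "/" + "/".join(reversed(parts))


print(path_by_scan(TREE, ROOT, 31))
print(path_by_backpointer(build_backpointers(TREE), ROOT, 31))
```

Both calls give the same path, but the scan may visit every entry in the filesystem while the back-pointer walk takes one step per path component. Supporting several names per inode, as the "filename(s)" wording implies, would need a list of back-pointers rather than a single pair.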