thr3ads.net - Lustre discuss - [Lustre-discuss] simulations [Aug 2008]

Mag Gam

2008-Aug-07 01:50 UTC

[Lustre-discuss] simulations

We do a lot of fluid simulations at my university, but on a similar
note I would like to know what the Lustre experts would do in
particular simulated scenarios...

The environment is this:
30 Servers (All Linux)
1000+ Clients (All Linux)

30 Servers
1 MDS
30 OSTs each with 2TB of storage

No failover capabilities.


Scenario 1:
Your client is trying to mount a Lustre filesystem using the lustre module,
and it hangs. What do you do?

Scenario 2:
Your MDS won't mount. It says, "The server is already running".
You try to mount it a couple of times and it still won't.

Scenario 3:
An OST/OSS reboots due to a power outage. Some files are striped on it,
and some aren't. What happens? What should you do to minimize the outage?

Scenario 4:
lctl dl shows some devices in the "ST" state. What does that mean, and how
do I clear it?


I know some of these scenarios may be ambiguous, but please let me
know which so I can further elaborate. I am eventually planning to
wiki this for future reference and other Lustre newbies.

If anyone else has any other scenarios, please don't be shy and ask
away. We can create a good troubleshooting doc similar to the
operations manual.


TIA

Cliff White

2008-Aug-07 17:59 UTC

[Lustre-discuss] simulations

Mag Gam wrote:
> We do a lot of fluid simulations at my university, but on a similar
> note I would like to know what the Lustre experts will do in
> particular simulated scenarios...
> 
> The environment is this:
> 30 Servers (All Linux)
> 1000+ Clients (All Linux)
> 
> 30 Servers
> 1 MDS
> 30 OSTs each with 2TB of storage
> 
> No fail over capabilities.
> 
> 
> Scenario 1:
> Your client is trying to mount lustre filesystem using lustre module,
> and it hung. Do what?

Answer 0 to all questions:
"Read the Lustre Manual. File doc bugs in Lustre Bugzilla if there's a
part you don't understand, or a part missing"

Answer 1 for all your questions.
"Check syslogs/consoles on the impacted clients.
Check syslogs/consoles on _all_ Lustre servers.
Pay careful attention to timestamps.
Work backwards to the first error."

Is the problem restricted to one client or seen by multiple clients?
If multiple clients, start with the network, use lctl ping to check 
lustre connectivity.
If a single client, it's generally a client config/network config issue.
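A minimal sketch of the connectivity check mentioned above, assuming you keep a list of your servers' NIDs somewhere. The NIDs below are hypothetical, and PING_CMD is an illustrative variable added only so the loop can be dry-run on a machine without Lustre.

```shell
#!/bin/sh
# Sketch: check LNET connectivity from a client to each Lustre server.
# Assumption: NIDS holds your servers' NIDs (the values here are made up).
NIDS="10.0.0.1@tcp0 10.0.0.2@tcp0"

# Command used to test one NID; override for a dry run (e.g. PING_CMD=true).
PING_CMD="${PING_CMD:-lctl ping}"

# Print "ok <nid>" or "FAIL <nid>" for every NID given as an argument.
# Returns non-zero if at least one NID did not answer.
check_nids() {
    failed=0
    for nid in "$@"; do
        if $PING_CMD "$nid" >/dev/null 2>&1; then
            echo "ok $nid"
        else
            echo "FAIL $nid"
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}

# Usage on a client:
#   check_nids $NIDS
```

If every server fails from several clients, the network is the likely suspect; if only one client reports failures, look at that client's configuration first.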
> Scenario 2:
> Your MDS won't mount up. Its saying, "The server is already running".
> You try to mount it up couple of times and still its not
Be certain the server is not already running.
Be certain no hung mount processes exist.
Unload all Lustre modules (the lustre_rmmod script will do this).
Retry, and if it still fails, see answer 1.
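The checks above could be scripted roughly as follows. This is only a sketch: the device and mount point are hypothetical, and the function names are made up for illustration. It assumes a mounted Lustre target shows up in /proc/mounts with filesystem type "lustre".

```shell
#!/bin/sh
# Hypothetical MDT device and mount point; substitute your own.
MDT_DEV=/dev/sdb
MDT_MNT=/mnt/mdt

# List Lustre mounts on this node as "device mountpoint".
# Reads /proc/mounts by default; a file in the same format can be passed instead.
lustre_mounts() {
    awk '$3 == "lustre" { print $1, $2 }' "${1:-/proc/mounts}"
}

# List processes whose command line looks like a Lustre mount in progress.
mount_procs() {
    ps -eo pid,stat,args | grep '[m]ount.*lustre'
}

# Run the checks, unload the modules, then retry the mount.
retry_mds_mount() {
    if [ -n "$(lustre_mounts)" ]; then
        echo "Lustre is still mounted on this node" >&2
        return 1
    fi
    if mount_procs; then
        echo "mount processes still exist (listed above)" >&2
        return 1
    fi
    lustre_rmmod || return 1
    mount -t lustre "$MDT_DEV" "$MDT_MNT"
}
```

If the retry fails again, the syslog messages from this attempt are the place to start (answer 1).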
> 
> Scenario 3:
> OST/OSS reboots due to a power outage. Some files are striped on this,
> and some aren't What happens? What to do for minimal outage?
- Clients can be mounted with a dead OST using the exclude option to
the mount command. lfs getstripe can be run from clients to find files
on the bad OST. See answer 0 for the detailed process.
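As a rough illustration of the two steps above: the filesystem name, MGS NID, mount point and function names below are all hypothetical, and the lfs option spelling is the one from the 1.6-era manual, so check "lfs help getstripe" on your version before relying on it.

```shell
#!/bin/sh
# Hypothetical names: filesystem "testfs", MGS at "mgs@tcp0".
FSNAME=testfs
MGSNID=mgs@tcp0
MNT=/mnt/testfs

# OST target names have the form <fsname>-OST<4-digit hex index>.
ost_name() {
    printf '%s-OST%04x\n' "$1" "$2"
}

# Mount a client while excluding the dead OST ($1 = OST index).
mount_excluding() {
    mount -t lustre -o "exclude=$(ost_name "$FSNAME" "$1")" \
        "$MGSNID:/$FSNAME" "$MNT"
}

# List files that have objects on that OST ($1 = OST index).
files_on_ost() {
    lfs getstripe -r --obd "$(ost_name "$FSNAME" "$1")_UUID" "$MNT"
}

# Usage, for a dead OST with index 3:
#   mount_excluding 3
#   files_on_ost 3
```

Files with no objects on the excluded OST stay fully readable; the list from files_on_ost tells users which files to expect I/O errors on until the OST is back.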
> Scenario 4:
> lctl dl shows some devices in "ST" state. What does that mean, and how
> do I clear it?
ST = stopped.
Clear this by cleaning up all devices (answer 0)
or restarting the stopped devices.
This usually indicates an error/issue with the stopped device, so see
answer 1.
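To pick out the stopped devices quickly, a small filter over the "lctl dl" output may help. This sketch assumes the usual column layout (device number, state, type, name, UUID, refcount); the function name is made up for illustration.

```shell
#!/bin/sh
# Print "devno type name" for every device in the ST (stopped) state.
# Reads "lctl dl" output on stdin.
stopped_devices() {
    awk '$2 == "ST" { print $1, $3, $4 }'
}

# Usage on a server or client:
#   lctl dl | stopped_devices
```

The device names it prints are what to search for in the syslogs when working backwards to the first error.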
> 
> I know some of these scenarios may be ambiguous, but please let me
> know which so I can further elaborate. I am eventually planning to
> wiki this for future reference and other lustre newbies.
Please contribute to wiki.lustre.org - there is considerable information
there already, and a decent existing structure.
>
> If anyone else has any other scenarios, please don't be shy and ask
> away. We can create a good trouble shooting doc similar to the
> operations manual.
Again, please file doc bugs at bugzilla.lustre.org and contribute to 
wiki.lustre.org, hope this helps!
cliffw
> 
> 
> TIA
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss

Mag Gam

2008-Aug-08 04:50 UTC

[Lustre-discuss] simulations

CliffW:

This helps out a lot!

We still have problems determining devices. We don't know what their
numbers are (I have been using lctl dl), and I don't know how to activate
or deactivate them.


Do you have an example?


TIA


Cliff White

2008-Aug-08 17:45 UTC

[Lustre-discuss] simulations

Mag Gam wrote:
> CliffW:
>
> This helps out a lot!
>
> We still have problems determining devices. We don't know what their
> numbers are (I been using lctl dl), but I don't know how to activate
> or deactivate them.
>
>
> Do you have an example?

Yup
http://manual.lustre.org/manual/LustreManual16_HTML/KnowledgeBase.html#50544717_84403

I think the .pdf version has more details.
cliffw
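For readers without the manual at hand, a minimal sketch of the activate/deactivate step: look up the device number by name in the "lctl dl" listing, then pass it to lctl --device. The function names and the device name in the usage line are hypothetical; take the real name from your own "lctl dl" output.

```shell
#!/bin/sh
# Print the device number of the device named $1.
# Reads "lctl dl" output on stdin (columns: devno state type name uuid refcount).
devno_for() {
    awk -v name="$1" '$4 == name { print $1 }'
}

# $1 = device name as shown by "lctl dl", $2 = activate | deactivate
set_device_state() {
    devno=$(lctl dl | devno_for "$1")
    if [ -z "$devno" ]; then
        echo "no device named $1" >&2
        return 1
    fi
    lctl --device "$devno" "$2"
}

# Usage (hypothetical device name):
#   set_device_state testfs-OST0003-osc deactivate
```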

Jim Harm

2008-Aug-08 21:03 UTC

[Lustre-discuss] simulations

I am trying to track logged errors upstream from the error to the 
file that may have been affected.

What is the easy (and not so dangerous) way to:

1. Derive the OST inode from an OST object?
Take the OST object ID modulo 32 to get the directory on the OST,
then, after cd'ing into O/0/d$modulo_number,
run debug.ldiskfs (stat) on the file (the OST object);
that displays the inode of the object on the OST.

2. Derive the MDS inode from the OST inode?
Use a nice tool that takes the OST inode and gives me the MDS inode, or
use the source code to decode the extended attributes,
which are in a hex string in the output
from the debugfs step above, at the "fid =" line.

3. Derive the filename from the MDS inode?
Run debug.ldiskfs (ncheck) on the MDS inode;
that displays the filename.

PS: debug.ldiskfs is used with the -c option to load faster.
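Steps 1 and 3 might be wrapped up as below. This is a sketch under assumptions: it calls plain debugfs (from the Lustre-patched e2fsprogs) rather than the site-local debug.ldiskfs wrapper, it only handles object group 0, and the function names are made up for illustration.

```shell
#!/bin/sh
# Path of an OST object inside the OST's backing filesystem (group 0),
# using the "object ID modulo 32" rule for the directory.
ost_object_path() {
    echo "O/0/d$(( $1 % 32 ))/$1"
}

# Step 1: show the object's inode details.
# $1 = OST block device, $2 = object ID
# -c opens the device in catastrophic (read-only) mode, -R runs one request.
stat_ost_object() {
    debugfs -c -R "stat $(ost_object_path "$2")" "$1"
}

# Step 3: map an MDS inode number to a path name (slow on large filesystems).
# $1 = MDS block device, $2 = MDS inode number
mds_inode_to_name() {
    debugfs -c -R "ncheck $2" "$1"
}

# Usage (hypothetical device):
#   stat_ost_object /dev/sdc 10942568
```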

-- 
}}}===============>>  LLNL
James E. Harm (Jim); jharm at llnl.gov
System Administrator, ICCD Clusters
(925) 422-4018 Page: 423-7705x57152

Herb Wartens

2008-Aug-08 21:19 UTC

[Lustre-discuss] Extended Attributes

Jim Harm wrote:
| I am trying to track logged errors upstream from the error to the
| file that may have been affected.
|
| What is the easy(and not so dangerous) way to:
|
| 1. derive OST inode from OST object?
| OST object modulo 32 for directory on OST
| then run debug.ldiskfs(stat) the file(ost object),
| after cd into O/0/d$modulo_number,
| that displays inode of object on the OST
|

Jim,
We have a rudimentary tool that I developed here at LLNL that
does what I think you want here.
You asked for getting an OST inode from an ost object.  All you have
to do is stat the file using debugfs to get at that information.
What I think you want is something a bit more tricky.

We had an incident here where the fsck found some corruption and
moved some OST objects into the lost+found.  One nice thing about
Lustre is that it stores extended attributes about the file with
the inode.

We have a tool here called eadump.ldiskfs that reads and decodes the
extended attribute information for an OST object.  This tells you
what the object ID should be for the file as well as what the MDS
inode should be (this also answers your #2 below)...=)

EG:
| eadump.ldiskfs -d /dev/sdc -i 105906277
Name: trusted.fid Value: MDSINO: 112108525 GEN: 1401146486 STRIPEIDX: 1 OBJID: 10942568 GROUP: 0
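If you need individual values from that output in a script, a small awk filter is enough. This sketch assumes the output is a single line of "KEY: value" pairs exactly as in the example above; eadump.ldiskfs is a site-local tool, so its format may differ elsewhere, and the function name is made up for illustration.

```shell
#!/bin/sh
# Print the value that follows "KEY:" in an eadump.ldiskfs output line on stdin.
# $1 = key without the colon, e.g. MDSINO or OBJID
ea_field() {
    awk -v key="$1:" '{
        for (i = 1; i < NF; i++)
            if ($i == key) { print $(i + 1); exit }
    }'
}

# Usage:
#   eadump.ldiskfs -d /dev/sdc -i 105906277 | ea_field MDSINO
```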

| 2. derive MDS inode from OST inode?
| use a tool that is nice uses OST inode and gives me the mds inode or
| decode using source code the extended attributes
| that are in some hex string that is in the output
| from the debugfs step above at "fid =" line.
|
| 3.derive filename from MDS inode?
| run debug.ldiskfs(ncheck) the MDS inode
| that displays the filename.

Using ncheck in debugfs is the only way I know of to get at this information.
This is a SLOW process since it has to rumble through the filesystem for it.
You should also note that this filename may not be the only one pointing to
that inode.

|
| PS; debug.ldiskfs used with -c option to load faster.
|

Andreas Dilger

2008-Aug-18 10:11 UTC

[Lustre-discuss] Extended Attributes

On Aug 08, 2008  14:19 -0700, Herb Wartens wrote:
> We have a rudimentary tool that I developed here at LLNL that
> does what I think you want here.
> You asked for getting an OST inode from an ost object.  All you have
> to do is stat the file using debugfs to get at that information.
> What I think you want is something a bit more tricky.
> 
> We had an incident here where the fsck found some corruption and
> moved some OST objects into the lost+found.  One nice thing about
> Lustre is that it stores extended attributes about the file with
> the inode.
> 
> We have a tool here called eadump.ldiskfs that reads and decodes the
> extended attribute information for an ost object.  This tells you
> what the object id should be for the file as well as what the mds
> inode should be as well (This also answers your #2 below)...=)
> 
> EG:
> | eadump.ldiskfs -d /dev/sdc -i 105906277
> Name: trusted.fid Value: MDSINO: 112108525 GEN: 1401146486 STRIPEIDX: 1 OBJID: 10942568 GROUP: 0

Note that there is also a new tool ll_recover_lost_found_objs in 1.6.6
(also in bugzilla) that will move objects from lost+found back into
place in O/0/d*, including rebuilding the directory structure there if
it was broken for some reason.  It will also (AFAIR) print out the
MDS inode number.
> | 2. derive MDS inode from OST inode?
> | use a tool that is nice uses OST inode and gives me the mds inode or
> | decode using source code the extended attributes
> | that are in some hex string that is in the output
> | from the debugfs step above at "fid =" line.
> |
> | 3.derive filename from MDS inode?
> | run debug.ldiskfs(ncheck) the MDS inode
> | that displays the filename.
> 
> Using ncheck in debugfs is the only way I know of to get at this information.
> This is a SLOW process since it has to rumble through the filesystem for it.
> You should also note that this filename may not be the only one pointing to
> that inode.
Right.  There is a discussion underway about storing the filename(s) in
the inode itself to allow this kind of operation to be done in O(path_parts)
instead of O(number_of_inodes * path_parts).  This is also needed for
things like changelog generation.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
