On our cluster, which has been running Lustre for about 1 month, I have 1 MDT/MGS and 1 OSS with 2 OSTs.

Our cluster uses all GigE and has about 608 nodes / 1854 cores.

We have a lot of jobs that die and/or go into high I/O wait; strace shows processes stuck in fstat().

The big problem (I think), and the one I would like some feedback on, is that of these 608 nodes, 209 of them have the string "This client was evicted by" in dmesg.

Is this normal for clients to be dropped like this? Is there some tuning that needs to be done to the server to carry this many nodes out of the box? We are using a default Lustre install with GigE.

Brock Palen
Center for Advanced Computing
brockp at umich.edu
(734)936-1985
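(For reference, a count like the one above can be gathered with something along these lines; this is only a sketch, and it assumes pdsh is installed and a node list exists in /etc/machines, neither of which is part of the original setup.)

    # Count clients whose kernel log mentions an eviction.
    pdsh -w ^/etc/machines 'dmesg | grep -c "This client was evicted by"' 2>/dev/null \
      | awk -F: '$2 > 0 {n++} END {print n+0, "nodes report evictions"}'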
Hi Brock,

On Monday 04 February 2008 07:11:11 am Brock Palen wrote:
> on our cluster that has been running lustre for about 1 month. I have
> 1 MDT/MGS and 1 OSS with 2 OST's.
>
> Our cluster uses all Gige and has about 608 nodes 1854 cores.

This seems to be a lot of clients for only one OSS (and thus for only one GigE link to the OSS).

> We have a lot of jobs that die, and/or go into high IO wait, strace
> shows processes stuck in fstat().
>
> The big problem is (i think) I would like some feedback on it that of
> these 608 nodes 209 of them have in dmesg the string
>
> "This client was evicted by"
>
> Is this normal for clients to be dropped like this?

I'm not an expert here, but evictions typically occur when a client hasn't been seen for a certain period by the OSS/MDS. This is often related to network problems. Considering your number of clients, if they all do I/O operations on the filesystem concurrently, maybe your Ethernet switches are the bottleneck and have to drop packets. Is your GigE network working fine outside of Lustre?

To eliminate networking issues from the equation, you can try to lctl ping your MDS and OSS from a freshly evicted node, and see what you get. (lctl ping <your-oss-nid>)

> Is there some
> tuning that needs to be done to the server to carry this many nodes
> out of the box? We are using default lustre install with Gige.

Do your MDS or OSS show any particularly high load or memory usage? Do you see any Lustre-related error messages in their logs?

Cheers,
--
Kilian
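(As a sketch of that suggestion, run on a recently evicted node; <mds-nid> and <oss-nid> are placeholders for your own NIDs, and this assumes lctl ping returns non-zero when the ping fails.)

    # Ping both servers over LNET from this client and flag failures.
    for nid in <mds-nid> <oss-nid>; do
        lctl ping $nid || echo "lctl ping $nid FAILED on $(hostname)"
    done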
> Hi Brock,
>
> On Monday 04 February 2008 07:11:11 am Brock Palen wrote:
>> Our cluster uses all Gige and has about 608 nodes 1854 cores.
>
> This seems to be a lot of clients for only one OSS (and thus for only
> one GigE link to the OSS).

It's more for evaluation; the 'real' file system is an NFS file system provided by an OnStor Bobcat, so anything is an improvement. The cluster IS too big, but there isn't a person at the university who is willing to pay for anything other than more cluster nodes. Enough with politics.

> I'm not an expert here, but evictions typically occur when a client
> hasn't been seen for a certain period by the OSS/MDS. This is often
> related to network problems. [...]
>
> To eliminate networking issues from the equation, you can try to lctl
> ping your MDS and OSS from a freshly evicted node, and see what you
> get. (lctl ping <your-oss-nid>)

I just had another node get evicted while running code, causing the code to lock up. This time it was the MDS that evicted it. Pinging works though:

[root at nyx350 ~]# lctl ping 141.212.30.184 at tcp
12345-0 at lo
12345-141.212.30.184 at tcp

Recovery is slow; this client has been evicted for about 10 minutes. I have attached the output of lctl dk from the client and some syslog messages from the MDS.

> Do your MDS or OSS show any particularly high load or memory usage? Do
> you see any Lustre-related error messages in their logs?

Nope, both servers have 2GB RAM, and load is almost 0. No swapping.

Thanks

-------------- next part --------------
A non-text attachment was scrubbed...
Name: client.err
Type: application/octet-stream
Size: 27024 bytes
Desc: not available
Url: http://lists.lustre.org/pipermail/lustre-discuss/attachments/20080204/3d7714b2/attachment-0004.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mds.log
Type: application/octet-stream
Size: 997 bytes
Desc: not available
Url: http://lists.lustre.org/pipermail/lustre-discuss/attachments/20080204/3d7714b2/attachment-0005.obj
On Monday 04 February 2008 10:17:37 am Brock Palen wrote:
> The cluster IS too big, but there isn't a person at the university who is
> willing to pay for anything other than more cluster nodes. Enough
> with politics.

That's the first time I hear that a cluster is too big; people usually complain about the contrary. :) But the second part sounds very, very familiar... Anyway.

> I just had another node get evicted while running code, causing the
> code to lock up. This time it was the MDS that evicted it. Pinging
> works though:
>
> [root at nyx350 ~]# lctl ping 141.212.30.184 at tcp
> 12345-0 at lo
> 12345-141.212.30.184 at tcp

Ok.

> I have attached the output of lctl dk from the client and some
> syslog messages from the MDS.

(recover.c:188:ptlrpc_request_handle_notconn()) import nobackup-MDT0000-mdc-000001012bd27c00 of nobackup-MDT0000_UUID at 141.212.30.184@tcp abruptly disconnected: reconnecting
(import.c:133:ptlrpc_set_import_discon()) nobackup-MDT0000-mdc-000001012bd27c00: Connection to service nobackup-MDT0000 via nid 141.212.30.184 at tcp was lost;

I will let the Lustre people comment on this, but this sure looks like a network problem to me.

Is there any information you can get out of the switches (logs, dropped packets, retries, stats, anything)?

> Nope, both servers have 2GB RAM, and load is almost 0. No swapping.

Do you see dropped packets or errors in your ifconfig output, on the servers and/or clients?

Cheers,
--
Kilian
On Feb 4, 2008, at 1:43 PM, Kilian CAVALOTTI wrote:
> I will let the Lustre people comment on this, but this sure looks like a
> network problem to me.
>
> Is there any information you can get out of the switches (logs, dropped
> packets, retries, stats, anything)?

The client shows 107 dropped packets. The servers have none. I think you're right; the client is the same client that caused problems a week earlier by losing connections to the OSS, and is now losing the connection to the MDT.

I have asked networking to look at the counters between the Force10 and the Cisco. Lustre doesn't care about frames at 6000 MTU, right?

> Do you see dropped packets or errors in your ifconfig output, on the
> servers and/or clients?
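(To answer that question across many nodes at once, something like the following would do; again a sketch, assuming pdsh, a node list in /etc/machines, and that the Lustre interface is eth0 with a driver that supports ethtool -S.)

    # Interface-level drop/error counters as seen by the kernel, non-clean nodes only.
    pdsh -w ^/etc/machines "ifconfig eth0 | grep -E 'RX packets|TX packets'" \
      | grep -v 'errors:0 dropped:0'
    # On a single host: driver/firmware counters, often more detailed.
    ethtool -S eth0 | grep -iE 'drop|err|discard'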
Which version of lustre do you use? Server and clients same version and same os? Which one?

Harald

On Monday 04 February 2008 04:11 pm, Brock Palen wrote:
> on our cluster that has been running lustre for about 1 month. I have
> 1 MDT/MGS and 1 OSS with 2 OST's.
> [...]
> Is this normal for clients to be dropped like this? Is there some
> tuning that needs to be done to the server to carry this many nodes
> out of the box? We are using default lustre install with Gige.

--
Harald van Pee

Helmholtz-Institut fuer Strahlen- und Kernphysik der Universitaet Bonn
On Feb 04, 2008 13:17 -0500, Brock Palen wrote:
>> This seems to be a lot of clients for only one OSS (and thus for only
>> one GigE link to the OSS).
>
> It's more for evaluation; the 'real' file system is an NFS file system
> provided by an OnStor Bobcat, so anything is an improvement. [...]

I'd suggest increasing the lustre timeout, to avoid eviction if the system is overloaded:

Temporarily (on the MDS, OSS, and all client nodes):
[root at mds]# sysctl -w lustre.timeout=300

If this helps you can set it permanently on the MGS (MDS) node:
mgs> lctl conf_param testfs-MDT0000.sys.timeout=300

replacing "testfs" with the actual name of your filesystem.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
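(If applying the temporary setting to several hundred clients by hand is impractical, a rough sketch along these lines works; pdsh and the host-list path are assumptions, not part of the advice above.)

    # Push the temporary timeout to every node, then verify it took effect everywhere.
    pdsh -w ^/etc/machines 'sysctl -w lustre.timeout=300'
    pdsh -w ^/etc/machines 'sysctl -n lustre.timeout' | awk '{print $2}' | sort | uniq -c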
Just some clarifying questions:

> I'd suggest increasing the lustre timeout, to avoid eviction if the
> system is overloaded:

Is this more for clients being overloaded? (Say the user is swapping some.)

> Temporarily (on the MDS, OSS, and all client nodes):
> [root at mds]# sysctl -w lustre.timeout=300
>
> If this helps you can set it permanently on the MGS (MDS) node:
> mgs> lctl conf_param testfs-MDT0000.sys.timeout=300

When changing options like this, does it take effect only for newly mounted clients, or is it forced onto all currently mounted clients? Should this only be done on a 'down' filesystem, or can conf_param values be changed while live?

Thanks
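(For what it's worth, the value a node is actually using at any moment can simply be read back with the same knob Andreas mentioned, e.g.:)

    # Read the timeout currently in effect on this node.
    sysctl -n lustre.timeout
    # equivalently:
    cat /proc/sys/lustre/timeout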
> Which version of lustre do you use?
> Server and clients same version and same os? which one?

lustre-1.6.4.1

The servers (OSS and MDS/MGS) use the RHEL4 RPM from lustre.org:
2.6.9-55.0.9.EL_lustre.1.6.4.1smp

The clients run patchless RHEL4:
2.6.9-67.0.1.ELsmp

One set of clients is on a 10.x network while the servers and the other half of the clients are on a 141. network; because we are using the tcp network type, we have not set up any lnet routes. I don't think this should cause a problem, but I include the information for clarity. We do route 10.x on campus.
Craig Tierney
2008-Feb-04 21:52 UTC
[Lustre-discuss] Question about building Lustre, correct version of GCC
I am trying to build lustre-1.6.4.2 on my system, and I am reading through the documentation to figure out how to do it. I am reading version 1.6_man_v1.9 of the Operations manual.

On page 31, regarding compiler choice, it says:

    Compiler Choice
    The compiler must be greater than GCC version 3.3.4. Currently,
    GCC v4.0 is not supported. GCC v3.3.4 has been used to successfully
    compile all of the pre-packaged releases made available by CFS, and it
    is the only officially-supported compiler. Your mileage may vary with
    other compilers, or even with other versions of GCC.

    NOTE:
    GCC v3.3.4 was used to build 2.6 series kernels.

So, which is it? Is 3.3.4 the right compiler, or does it have to be "greater than" 3.3.4?

Has anyone built Lustre using CentOS 5.X? I am trying to get Lustre working with 5.1, and have downgraded the kernel for simplicity. Using a vanilla 2.6.18 kernel, I have been able to build Lustre and mount some basic filesystems, but I have not tested it thoroughly enough to say it works.

Craig

--
Craig Tierney (craig.tierney at noaa.gov)
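(A rough sketch of the kind of build described above, for reference only; the kernel source path is made up, and the exact configure options should be checked against the 1.6 Operations manual rather than taken from here.)

    # Build Lustre against an already-configured kernel source tree (path is an assumption).
    cd lustre-1.6.4.2
    ./configure --with-linux=/usr/src/linux-2.6.18
    make
    make rpms    # or 'make install' directly on the build host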
Andreas Dilger
2008-Feb-04 23:19 UTC
[Lustre-discuss] Question about building Lustre, correct version of GCC
On Feb 04, 2008 14:52 -0700, Craig Tierney wrote:
> On page 31, regarding compiler choice, it says:
> [...]
> So, which is it? Is 3.3.4 the right compiler, or does it have to be
> "greater than" 3.3.4?

The right answer today is "Lustre is built with the kernel shipped with the distro". The documentation needs to be updated.

> Has anyone built Lustre using CentOS 5.X? I am trying to get Lustre working
> with 5.1, and have downgraded the kernel for simplicity. Using a vanilla
> 2.6.18 kernel, I have been able to build Lustre and mount some basic filesystems,
> but I have not tested it thoroughly enough to say it works.

Why not use the RHEL5 2.6.18 kernel?

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
The timeouts fixed the random evictions. The problem we were trying to solve in the first place is still there, though. In talking with the user of the code, the problem is related to a similar problem in another code.

One code is from NOAA, the other is S3D from Sandia (I think).

Both these codes write one file per process (NetCDF for one, Tecplot for the other). When the code has finished with an iteration, it copies/tars/cpios the files to another location. This is where the job will hang *some* times. Most of the time it works, but with enough iterations of this method a job will hang at some point. The job does not die; it just hangs.

The NOAA code does the mv+cpio in its PBS script. The S3D code uses system() to run tar. In the end they have the same behavior.

Has anyone seen similar behavior?

Brock Palen
Center for Advanced Computing
brockp at umich.edu
(734)936-1985
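(In case it helps anyone reproduce the hang outside the real applications, a minimal stand-in for the same pattern might look like this; the paths, file sizes, and counts are entirely made up.)

    # Write many per-process output files on Lustre, archive them elsewhere, repeat.
    cd /lustre/scratch/hangtest
    for iter in $(seq 1 100); do
        for f in $(seq 1 64); do
            dd if=/dev/zero of=out.$iter.$f bs=1M count=4 2>/dev/null
        done
        tar cf /tmp/iter.$iter.tar out.$iter.* && rm -f out.$iter.* /tmp/iter.$iter.tar
        echo "iteration $iter done"
    done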
Brock Palen wrote:
> Both these codes write one file per process (NetCDF for one,
> Tecplot for the other). When the code has finished with an iteration,
> it copies/tars/cpios the files to another location. This is where the
> job will hang *some* times. [...] The job does not die; it just hangs.
>
> Has anyone seen similar behavior?

If a client gets an eviction from the server, it might be triggered by:

1) the server did not get the client pinger msg in a long time.
2) the client is too busy to handle the server lock cancel req.
3) the client cancelled the lock, but the network just dropped the cancel reply to the server.
4) the server is too busy to handle the lock cancel reply from the client, or is blocked somewhere.

It seems there are a lot of metadata operations in your job. I guess your eviction might be caused by the latter 2 reasons. If you could provide the process stack trace on the MDS when the job died, it might help us to figure out what is going on there.

WangDi
> If a client gets an eviction from the server, it might be triggered by:
>
> 1) the server did not get the client pinger msg in a long time.
> 2) the client is too busy to handle the server lock cancel req.

Clients show a load of 4.2 (4 cores total, 1 process per core).

> 3) the client cancelled the lock, but the network just dropped the
> cancel reply to the server.

I see a very small number (6339) of dropped packets on the interfaces of the OSS. Links between the switches show no errors.

> 4) the server is too busy to handle the lock cancel reply from the
> client, or is blocked somewhere.

I started paying attention to the OSS more once you said this; sometimes I see the CPU use of socknal_sd00 get to 100%. Is this process used to keep all the obd_pings going?

Both the OSS and the MDS/MGS are SMP systems and run single interfaces. If I dual-homed the servers, would that create another socknal process for lnet?

> It seems there are a lot of metadata operations in your job. I guess
> your eviction might be caused by the latter 2 reasons. If you could
> provide the process stack trace on the MDS when the job died, it might
> help us to figure out what is going on there.
>
> WangDi
On Tue, Feb 05, 2008 at 11:01:47AM -0500, Brock Palen wrote:
> The timeouts fixed the random evictions. The problem we were trying
> to solve in the first place is still there, though. [...]
>
> Has anyone seen similar behavior?

We have seen evictions several times, and I noticed that it's worth investigating them. You can get evictions from bad applications, e.g. if lots of nodes write a few bytes each to a shared file. One time the reason was a Tecplot routine, and the user reported that it includes bad code (in preutil.c).

Regards,
Roland

--
Roland Laifer
Rechenzentrum, Universitaet Karlsruhe (TH), D-76128 Karlsruhe, Germany
Email: Roland.Laifer at rz.uni-karlsruhe.de
Phone: +49 721 608 4861, Fax: +49 721 32550
Web: www.rz.uni-karlsruhe.de/personen/roland.laifer
I was able to catch a client and server in the act.

Client dmesg:

eth0: no IPv6 routers present
Lustre: nobackup-MDT0000-mdc-000001012bd39800: Connection to service nobackup-MDT0000 via nid 141.212.30.184 at tcp was lost; in progress operations using this service will wait for recovery to complete.
LustreError: 167-0: This client was evicted by nobackup-MDT0000; in progress operations using this service will fail.
LustreError: 2757:0:(client.c:519:ptlrpc_import_delay_req()) @@@ IMP_INVALID req at 00000100cfce6800 x3216741/t0 o101->nobackup-MDT0000_UUID at 141.212.30.184@tcp:12 lens 448/768 ref 1 fl Rpc:/0/0 rc 0/0
LustreError: 2757:0:(mdc_locks.c:423:mdc_finish_enqueue()) ldlm_cli_enqueue: -108
LustreError: 2822:0:(file.c:97:ll_close_inode_openhandle()) inode 11237379 mdc close failed: rc = -108
LustreError: 2822:0:(client.c:519:ptlrpc_import_delay_req()) @@@ IMP_INVALID req at 000001002966d000 x3216837/t0 o35->nobackup-MDT0000_UUID at 141.212.30.184@tcp:12 lens 296/448 ref 1 fl Rpc:/0/0 rc 0/0
LustreError: 2822:0:(client.c:519:ptlrpc_import_delay_req()) Skipped 95 previous similar messages
LustreError: 2822:0:(file.c:97:ll_close_inode_openhandle()) inode 11270746 mdc close failed: rc = -108
LustreError: 2757:0:(mdc_locks.c:423:mdc_finish_enqueue()) ldlm_cli_enqueue: -108
LustreError: 2757:0:(mdc_locks.c:423:mdc_finish_enqueue()) Skipped 30 previous similar messages
LustreError: 2822:0:(file.c:97:ll_close_inode_openhandle()) Skipped 62 previous similar messages
LustreError: 2757:0:(dir.c:258:ll_get_dir_page()) lock enqueue: rc: -108
LustreError: 2757:0:(dir.c:412:ll_readdir()) error reading dir 11239903/324715747 page 2: rc -108
Lustre: nobackup-MDT0000-mdc-000001012bd39800: Connection restored to service nobackup-MDT0000 using nid 141.212.30.184 at tcp.
LustreError: 11-0: an error occurred while communicating with 141.212.30.184 at tcp. The mds_close operation failed with -116
LustreError: 11-0: an error occurred while communicating with 141.212.30.184 at tcp. The mds_close operation failed with -116
LustreError: 11-0: an error occurred while communicating with 141.212.30.184 at tcp. The mds_close operation failed with -116
LustreError: 2834:0:(file.c:97:ll_close_inode_openhandle()) inode 11270686 mdc close failed: rc = -116
LustreError: 2834:0:(file.c:97:ll_close_inode_openhandle()) Skipped 40 previous similar messages
LustreError: 11-0: an error occurred while communicating with 141.212.30.184 at tcp. The mds_close operation failed with -116
LustreError: 2728:0:(file.c:97:ll_close_inode_openhandle()) inode 11240591 mdc close failed: rc = -116

MDT dmesg:

LustreError: 9042:0:(ldlm_lib.c:1442:target_send_reply_msg()) @@@ processing error (-107) req at 000001002b52b000 x445020/t0 o400-><?>@<?>:-1 lens 128/0 ref 0 fl Interpret:/0/0 rc -107/0
LustreError: 0:0:(ldlm_lockd.c:210:waiting_locks_callback()) ### lock callback timer expired: evicting client 2faf3c9e-26fb-64b7-ca6c-7c5b09374e67 at NET_0x200000aa4008d_UUID nid 10.164.0.141 at tcp ns: mds-nobackup-MDT0000_UUID lock: 00000100476df240/0xbc269e05c512de3a lrc: 1/0,0 mode: CR/CR res: 11240142/324715850 bits 0x5 rrc: 2 type: IBT flags: 20 remote: 0x4e54bc800174cd08 expref: 372 pid 26925
Lustre: 3170:0:(mds_reint.c:127:mds_finish_transno()) commit transaction for disconnected client 2faf3c9e-26fb-64b7-ca6c-7c5b09374e67: rc 0
LustreError: 27505:0:(mds_open.c:1474:mds_close()) @@@ no handle for file close ino 11239903: cookie 0xbc269e05c51912d8 req at 000001001e69c600 x3216892/t0 o35->2faf3c9e-26fb-64b7-ca6c-7c5b09374e67 at NET_0x200000aa4008d_UUID:-1 lens 296/448 ref 0 fl Interpret:/0/0 rc 0/0
LustreError: 27505:0:(ldlm_lib.c:1442:target_send_reply_msg()) @@@ processing error (-116) req at 000001001e69c600 x3216892/t0 o35->2faf3c9e-26fb-64b7-ca6c-7c5b09374e67 at NET_0x200000aa4008d_UUID:-1 lens 296/448 ref 0 fl Interpret:/0/0 rc -116/0

I hope this helps, but I see it happen more often with the OSS evicting the client.

On Feb 6, 2008, at 10:59 AM, Brock Palen wrote:
> [...]
Hello,

Brock Palen wrote:
> I was able to catch a client and server in the act.
>
> MDT dmesg:
> [...]
> LustreError: 0:0:(ldlm_lockd.c:210:waiting_locks_callback()) ### lock
> callback timer expired: evicting client
> 2faf3c9e-26fb-64b7-ca6c-7c5b09374e67 at NET_0x200000aa4008d_UUID
> nid 10.164.0.141 at tcp ns: mds-nobackup-MDT0000_UUID lock:
> 00000100476df240/0xbc269e05c512de3a lrc: 1/0,0 mode: CR/CR
> res: 11240142/324715850 bits 0x5 rrc: 2 type: IBT flags: 20
> remote: 0x4e54bc800174cd08 expref: 372 pid 26925

The client was evicted because this lock could not be released on the client in time. Could you provide the stack trace of the client at that time?

I assume increasing obd_timeout could fix your problem. Or maybe you should wait for the 1.6.5 release, which includes a new feature, adaptive timeouts, that will adjust the timeout value according to the network congestion and server load. It should help with your problem.

> [...]
> I hope this helps, but I see it happen more often with the OSS
> evicting the client.
Brock Palen
Center for Advanced Computing
brockp at umich.edu
(734)936-1985

On Feb 7, 2008, at 11:09 PM, Tom.Wang wrote:
> The client was evicted because this lock could not be released on the
> client in time. Could you provide the stack trace of the client at
> that time?
>
> I assume increasing obd_timeout could fix your problem. Or maybe you
> should wait for the 1.6.5 release, which includes a new feature,
> adaptive timeouts, that will adjust the timeout value according to
> the network congestion and server load. It should help with your
> problem.

Waiting for the next version of Lustre might be the best thing. I had upped the timeout a few days back, but the next day I had errors on the MDS box, so I have switched it back:

lctl conf_param nobackup-MDT0000.sys.timeout=300

I would love to give you that trace, but I don't know how to get it. Is there a debug option to turn on in the clients?
Brock Palen wrote:
> Waiting for the next version of Lustre might be the best thing. I had
> upped the timeout a few days back, but the next day I had errors on
> the MDS box, so I have switched it back:
>
> lctl conf_param nobackup-MDT0000.sys.timeout=300
>
> I would love to give you that trace, but I don't know how to get it.
> Is there a debug option to turn on in the clients?

You can get that by echo t > /proc/sysrq-trigger on the client.
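(If the resulting trace scrolls out of the kernel ring buffer, something like the following captures it to a file; the buffer size given to dmesg is a guess and may need to be larger on a busy node.)

    # Dump all task stacks into the kernel log, then save the log to a file.
    echo t > /proc/sysrq-trigger
    dmesg -s 1048576 > /tmp/sysrq-trace.$(hostname).txt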
> I would love to give you that trace, but I don't know how to get it.
> Is there a debug option to turn on in the clients?
>
>> You can get that by echo t > /proc/sysrq-trigger on the client.

Cool command; the output from the client is attached. The four m45_amp214_om processes are the application that hung when working off of Lustre; you can see they are stuck in I/O state.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: trace
Type: application/octet-stream
Size: 117493 bytes
Desc: not available
Url: http://lists.lustre.org/pipermail/lustre-discuss/attachments/20080208/99f28028/attachment-0002.obj
Hello,

m45_amp214_om D 0000000000000000     0  2587      1         31389  2586 (NOTLB)
00000101f6b435f8 0000000000000006 000001022c7fc030 0000000000000001
00000100080f1a40 0000000000000246 00000101f6b435a8 0000000380136025
00000102270a1030 00000000000000d0
Call Trace:
<ffffffffa0216e79>{:lnet:LNetPut+1689} <ffffffff8030e45f>{__down+147}
<ffffffff80134659>{default_wake_function+0} <ffffffff8030ff7d>{__down_failed+53}
<ffffffffa04292e1>{:lustre:.text.lock.file+5}
<ffffffffa044b12e>{:lustre:ll_mdc_blocking_ast+798}
<ffffffffa02c8eb8>{:ptlrpc:ldlm_resource_get+456}
<ffffffffa02c3bbb>{:ptlrpc:ldlm_cancel_callback+107}
<ffffffffa02da615>{:ptlrpc:ldlm_cli_cancel_local+213}
<ffffffffa02c3c48>{:ptlrpc:ldlm_lock_addref_internal_nolock+56}
<ffffffffa02c3dbc>{:ptlrpc:search_queue+284}
<ffffffffa02dbc03>{:ptlrpc:ldlm_cancel_list+99}
<ffffffffa02dc113>{:ptlrpc:ldlm_cancel_lru_local+915}
<ffffffffa02ca293>{:ptlrpc:ldlm_resource_putref+435}
<ffffffffa02dc2c9>{:ptlrpc:ldlm_prep_enqueue_req+313}
<ffffffffa0394e6f>{:mdc:mdc_enqueue+1023} <ffffffffa02c1035>{:ptlrpc:lock_res_and_lock+53}
<ffffffffa0268730>{:obdclass:class_handle2object+224}
<ffffffffa02c5fea>{:ptlrpc:__ldlm_handle2lock+794}
<ffffffffa02c106f>{:ptlrpc:unlock_res_and_lock+31}
<ffffffffa02c5c03>{:ptlrpc:ldlm_lock_decref_internal+595}
<ffffffffa02c156c>{:ptlrpc:ldlm_lock_add_to_lru+140}
<ffffffffa02c1035>{:ptlrpc:lock_res_and_lock+53}
<ffffffffa02c6f0a>{:ptlrpc:ldlm_lock_decref+154}
<ffffffffa039617d>{:mdc:mdc_intent_lock+685}
<ffffffffa044ae10>{:lustre:ll_mdc_blocking_ast+0}
<ffffffffa02d85f0>{:ptlrpc:ldlm_completion_ast+0}
<ffffffffa044ae10>{:lustre:ll_mdc_blocking_ast+0}
<ffffffffa02d85f0>{:ptlrpc:ldlm_completion_ast+0}
<ffffffffa044b64b>{:lustre:ll_prepare_mdc_op_data+139}
<ffffffffa0418a32>{:lustre:ll_intent_file_open+450}
<ffffffffa044ae10>{:lustre:ll_mdc_blocking_ast+0}
<ffffffff80192006>{__d_lookup+287}
<ffffffffa0419724>{:lustre:ll_file_open+2100}
<ffffffffa0428a18>{:lustre:ll_inode_permission+184}
<ffffffff80179bdb>{sys_access+349} <ffffffff8017a1ee>{__dentry_open+201}
<ffffffff8017a3a9>{filp_open+95} <ffffffff80179bdb>{sys_access+349}
<ffffffff801f00b5>{strncpy_from_user+74} <ffffffff8017a598>{sys_open+57}
<ffffffff8011026a>{system_call+126}

It seems the blocking_ast process was blocked here. Could you dump lustre/llite/namei.o with objdump -S lustre/llite/namei.o and send it to me?

Thanks,
WangDi

Brock Palen wrote:
> Cool command; the output from the client is attached. The four
> m45_amp214_om processes are the application that hung when working
> off of Lustre; you can see they are stuck in I/O state.
> [...]
Sure, Attached, note though, we rebuilt our lustre source for another box that uses the largesmp kernel. but it used the same options and compiler. -------------- next part -------------- A non-text attachment was scrubbed... Name: objdump Type: application/octet-stream Size: 354530 bytes Desc: not available Url : http://lists.lustre.org/pipermail/lustre-discuss/attachments/20080208/abfba097/attachment-0002.obj -------------- next part -------------- Brock Palen Center for Advanced Computing brockp at umich.edu (734)936-1985 On Feb 8, 2008, at 2:47 PM, Tom.Wang wrote:> Hello, > > m45_amp214_om D 0000000000000000 0 2587 1 31389 > 2586 (NOTLB) > 00000101f6b435f8 0000000000000006 000001022c7fc030 0000000000000001 > 00000100080f1a40 0000000000000246 00000101f6b435a8 > 0000000380136025 > 00000102270a1030 00000000000000d0 > Call Trace:<ffffffffa0216e79>{:lnet:LNetPut+1689} <ffffffff8030e45f> > {__down+147} > <ffffffff80134659>{default_wake_function+0} <ffffffff8030ff7d> > {__down_failed+53} > <ffffffffa04292e1>{:lustre:.text.lock.file+5} > <ffffffffa044b12e>{:lustre:ll_mdc_blocking_ast+798} > <ffffffffa02c8eb8>{:ptlrpc:ldlm_resource_get+456} > <ffffffffa02c3bbb>{:ptlrpc:ldlm_cancel_callback+107} > <ffffffffa02da615>{:ptlrpc:ldlm_cli_cancel_local+213} > <ffffffffa02c3c48>{:ptlrpc:ldlm_lock_addref_internal_nolock+56} > <ffffffffa02c3dbc>{:ptlrpc:search_queue+284} > <ffffffffa02dbc03>{:ptlrpc:ldlm_cancel_list+99} > <ffffffffa02dc113>{:ptlrpc:ldlm_cancel_lru_local+915} > <ffffffffa02ca293>{:ptlrpc:ldlm_resource_putref+435} > <ffffffffa02dc2c9>{:ptlrpc:ldlm_prep_enqueue_req+313} > <ffffffffa0394e6f>{:mdc:mdc_enqueue+1023} <ffffffffa02c1035> > {:ptlrpc:lock_res_and_lock+53} > <ffffffffa0268730>{:obdclass:class_handle2object+224} > <ffffffffa02c5fea>{:ptlrpc:__ldlm_handle2lock+794} > <ffffffffa02c106f>{:ptlrpc:unlock_res_and_lock+31} > <ffffffffa02c5c03>{:ptlrpc:ldlm_lock_decref_internal+595} > <ffffffffa02c156c>{:ptlrpc:ldlm_lock_add_to_lru+140} > <ffffffffa02c1035>{:ptlrpc:lock_res_and_lock+53} > <ffffffffa02c6f0a>{:ptlrpc:ldlm_lock_decref+154} > <ffffffffa039617d>{:mdc:mdc_intent_lock+685} > <ffffffffa044ae10>{:lustre:ll_mdc_blocking_ast+0} > <ffffffffa02d85f0>{:ptlrpc:ldlm_completion_ast+0} > <ffffffffa044ae10>{:lustre:ll_mdc_blocking_ast+0} > <ffffffffa02d85f0>{:ptlrpc:ldlm_completion_ast+0} > <ffffffffa044b64b>{:lustre:ll_prepare_mdc_op_data+139} > <ffffffffa0418a32>{:lustre:ll_intent_file_open+450} > <ffffffffa044ae10>{:lustre:ll_mdc_blocking_ast+0} > <ffffffff80192006>{__d_lookup+287} > <ffffffffa0419724>{:lustre:ll_file_open+2100} > <ffffffffa0428a18>{:lustre:ll_inode_permission+184} > <ffffffff80179bdb>{sys_access+349} <ffffffff8017a1ee> > {__dentry_open+201} > <ffffffff8017a3a9>{filp_open+95} <ffffffff80179bdb>{sys_access > +349} > <ffffffff801f00b5>{strncpy_from_user+74} <ffffffff8017a598> > {sys_open+57} > <ffffffff8011026a>{system_call+126} > > It seems blocking_ast process was blocked here. Could you dump the > lustre/llite/namei.o by objdump -S lustre/llite/namei.o and send > to me? 
> > Thanks > WangDi > > Brock Palen wrote: >>>> On Feb 7, 2008, at 11:09 PM, Tom.Wang wrote: >>>>>> MDT dmesg: >>>>>> >>>>>> LustreError: 9042:0:(ldlm_lib.c:1442:target_send_reply_msg()) >>>>>> @@@ processing error (-107) req at 000001002b >>>>>> 52b000 x445020/t0 o400-><?>@<?>:-1 lens 128/0 ref 0 fl >>>>>> Interpret:/0/0 rc -107/0 >>>>>> LustreError: 0:0:(ldlm_lockd.c:210:waiting_locks_callback()) >>>>>> ### lock callback timer expired: evicting cl >>>>>> ient 2faf3c9e-26fb-64b7- >>>>>> ca6c-7c5b09374e67 at NET_0x200000aa4008d_UUID nid >>>>>> 10.164.0.141 at tcp ns: mds-nobackup >>>>>> -MDT0000_UUID lock: 00000100476df240/0xbc269e05c512de3a lrc: >>>>>> 1/0,0 mode: CR/CR res: 11240142/324715850 bi >>>>>> ts 0x5 rrc: 2 type: IBT flags: 20 remote: 0x4e54bc800174cd08 >>>>>> expref: 372 pid 26925 >>>>>> >>>>> The client was evicted because of this lock can not be released >>>>> on client >>>>> on time. Could you provide the stack strace of client at that >>>>> time? >>>>> >>>>> I assume increase obd_timeout could fix your problem. Then maybe >>>>> you should wait 1.6.5 released, including a new feature >>>>> adaptive_timeout, >>>>> which will adjust the timeout value according to the network >>>>> congestion >>>>> and server load. And it should help your problem. >>>> >>>> Waiting for the next version of lustre might be the best thing. >>>> I had upped the timeout a few days back but the next day i had >>>> errors on the MDS box. I have switched it back: >>>> >>>> lctl conf_param nobackup-MDT0000.sys.timeout=300 >>>> >>>> I would love to give you that trace but I don''t know how to get >>>> it. Is there a debug option to turn on in the clients? >>> You can get that by echo t > /proc/sysrq-trigger on client. >>> >> Cool command, output of the client is attached. The four >> processes m45_amp214_om, is the application that hung when >> working off of luster. you can see its stuck in IO state. >> >>> >>> >>> >>> >>> >> --------------------------------------------------------------------- >> --- >> >> _______________________________________________ >> Lustre-discuss mailing list >> Lustre-discuss at lists.lustre.org >> http://lists.lustre.org/mailman/listinfo/lustre-discuss > > >
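A quick sketch of the objdump step WangDi asks for above, assuming the modules were built with debug info and that the command is run from the top of the Lustre source tree the client modules came from (the path and output name are examples):

    cd /path/to/lustre-1.6-source                       # the tree used to build this client's modules
    objdump -S lustre/llite/namei.o > namei.objdump     # interleave source with disassembly; attach the result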
Hi, Aha, this is bug has been fixed in 14360. https://bugzilla.lustre.org/show_bug.cgi?id=14360 The patch there should fix your problem, which should be released in 1.6.5 Thanks Brock Palen wrote:> Sure, Attached, note though, we rebuilt our lustre source for another > box that uses the largesmp kernel. but it used the same options and > compiler. > > > Brock Palen > Center for Advanced Computing > brockp at umich.edu > (734)936-1985 > > > On Feb 8, 2008, at 2:47 PM, Tom.Wang wrote: > >> Hello, >> >> m45_amp214_om D 0000000000000000 0 2587 1 31389 >> 2586 (NOTLB) >> 00000101f6b435f8 0000000000000006 000001022c7fc030 0000000000000001 >> 00000100080f1a40 0000000000000246 00000101f6b435a8 >> 0000000380136025 >> 00000102270a1030 00000000000000d0 >> Call Trace:<ffffffffa0216e79>{:lnet:LNetPut+1689} >> <ffffffff8030e45f>{__down+147} >> <ffffffff80134659>{default_wake_function+0} >> <ffffffff8030ff7d>{__down_failed+53} >> <ffffffffa04292e1>{:lustre:.text.lock.file+5} >> <ffffffffa044b12e>{:lustre:ll_mdc_blocking_ast+798} >> <ffffffffa02c8eb8>{:ptlrpc:ldlm_resource_get+456} >> <ffffffffa02c3bbb>{:ptlrpc:ldlm_cancel_callback+107} >> <ffffffffa02da615>{:ptlrpc:ldlm_cli_cancel_local+213} >> <ffffffffa02c3c48>{:ptlrpc:ldlm_lock_addref_internal_nolock+56} >> <ffffffffa02c3dbc>{:ptlrpc:search_queue+284} >> <ffffffffa02dbc03>{:ptlrpc:ldlm_cancel_list+99} >> <ffffffffa02dc113>{:ptlrpc:ldlm_cancel_lru_local+915} >> <ffffffffa02ca293>{:ptlrpc:ldlm_resource_putref+435} >> <ffffffffa02dc2c9>{:ptlrpc:ldlm_prep_enqueue_req+313} >> <ffffffffa0394e6f>{:mdc:mdc_enqueue+1023} >> <ffffffffa02c1035>{:ptlrpc:lock_res_and_lock+53} >> <ffffffffa0268730>{:obdclass:class_handle2object+224} >> <ffffffffa02c5fea>{:ptlrpc:__ldlm_handle2lock+794} >> <ffffffffa02c106f>{:ptlrpc:unlock_res_and_lock+31} >> <ffffffffa02c5c03>{:ptlrpc:ldlm_lock_decref_internal+595} >> <ffffffffa02c156c>{:ptlrpc:ldlm_lock_add_to_lru+140} >> <ffffffffa02c1035>{:ptlrpc:lock_res_and_lock+53} >> <ffffffffa02c6f0a>{:ptlrpc:ldlm_lock_decref+154} >> <ffffffffa039617d>{:mdc:mdc_intent_lock+685} >> <ffffffffa044ae10>{:lustre:ll_mdc_blocking_ast+0} >> <ffffffffa02d85f0>{:ptlrpc:ldlm_completion_ast+0} >> <ffffffffa044ae10>{:lustre:ll_mdc_blocking_ast+0} >> <ffffffffa02d85f0>{:ptlrpc:ldlm_completion_ast+0} >> <ffffffffa044b64b>{:lustre:ll_prepare_mdc_op_data+139} >> <ffffffffa0418a32>{:lustre:ll_intent_file_open+450} >> <ffffffffa044ae10>{:lustre:ll_mdc_blocking_ast+0} >> <ffffffff80192006>{__d_lookup+287} >> <ffffffffa0419724>{:lustre:ll_file_open+2100} >> <ffffffffa0428a18>{:lustre:ll_inode_permission+184} >> <ffffffff80179bdb>{sys_access+349} >> <ffffffff8017a1ee>{__dentry_open+201} >> <ffffffff8017a3a9>{filp_open+95} >> <ffffffff80179bdb>{sys_access+349} >> <ffffffff801f00b5>{strncpy_from_user+74} >> <ffffffff8017a598>{sys_open+57} >> <ffffffff8011026a>{system_call+126} >> >> It seems blocking_ast process was blocked here. Could you dump the >> lustre/llite/namei.o by objdump -S lustre/llite/namei.o and send to me? 
>> >> Thanks >> WangDi >> >> Brock Palen wrote: >>>>> On Feb 7, 2008, at 11:09 PM, Tom.Wang wrote: >>>>>>> MDT dmesg: >>>>>>> >>>>>>> LustreError: 9042:0:(ldlm_lib.c:1442:target_send_reply_msg()) >>>>>>> @@@ processing error (-107) req at 000001002b >>>>>>> 52b000 x445020/t0 o400-><?>@<?>:-1 lens 128/0 ref 0 fl >>>>>>> Interpret:/0/0 rc -107/0 >>>>>>> LustreError: 0:0:(ldlm_lockd.c:210:waiting_locks_callback()) ### >>>>>>> lock callback timer expired: evicting cl >>>>>>> ient >>>>>>> 2faf3c9e-26fb-64b7-ca6c-7c5b09374e67 at NET_0x200000aa4008d_UUID >>>>>>> nid 10.164.0.141 at tcp ns: mds-nobackup >>>>>>> -MDT0000_UUID lock: 00000100476df240/0xbc269e05c512de3a lrc: >>>>>>> 1/0,0 mode: CR/CR res: 11240142/324715850 bi >>>>>>> ts 0x5 rrc: 2 type: IBT flags: 20 remote: 0x4e54bc800174cd08 >>>>>>> expref: 372 pid 26925 >>>>>>> >>>>>> The client was evicted because of this lock can not be released >>>>>> on client >>>>>> on time. Could you provide the stack strace of client at that time? >>>>>> >>>>>> I assume increase obd_timeout could fix your problem. Then maybe >>>>>> you should wait 1.6.5 released, including a new feature >>>>>> adaptive_timeout, >>>>>> which will adjust the timeout value according to the network >>>>>> congestion >>>>>> and server load. And it should help your problem. >>>>> >>>>> Waiting for the next version of lustre might be the best thing. I >>>>> had upped the timeout a few days back but the next day i had >>>>> errors on the MDS box. I have switched it back: >>>>> >>>>> lctl conf_param nobackup-MDT0000.sys.timeout=300 >>>>> >>>>> I would love to give you that trace but I don''t know how to get >>>>> it. Is there a debug option to turn on in the clients? >>>> You can get that by echo t > /proc/sysrq-trigger on client. >>>> >>> Cool command, output of the client is attached. The four processes >>> m45_amp214_om, is the application that hung when working off of >>> luster. you can see its stuck in IO state. >>> >>>> >>>> >>>> >>>> >>>> >>> ------------------------------------------------------------------------ >>> >>> >>> _______________________________________________ >>> Lustre-discuss mailing list >>> Lustre-discuss at lists.lustre.org >>> http://lists.lustre.org/mailman/listinfo/lustre-discuss >> >> >> > ------------------------------------------------------------------------ > > _______________________________________________ > Lustre-discuss mailing list > Lustre-discuss at lists.lustre.org > http://lists.lustre.org/mailman/listinfo/lustre-discuss
I''m having a similar issue with lustre 1.6.4.2 and infiniband. Under load, the clients hand about every 10 minutes which is really bad for a production machine. The only way to fix the hang is to reboot the server. My users are getting extremely impatient :-/ I see this on the clients- LustreError: 2814:0:(client.c:975:ptlrpc_expire_one_request()) @@@ timeout (sent at 1202756629, 301s ago) req at ffff8100af233600 x1796079/ t0 o6->data-OST0000_UUID at 192.168.64.71@o2ib:28 lens 336/336 ref 1 fl Rpc:/0/0 rc 0/-22 Lustre: data-OST0000-osc-ffff810139ce4800: Connection to service data- OST0000 via nid 192.168.64.71 at o2ib was lost; in progress operations using this service will wait for recovery to complete. LustreError: 11-0: an error occurred while communicating with 192.168.64.71 at o2ib. The ost_connect operation failed with -16 LustreError: 11-0: an error occurred while communicating with 192.168.64.71 at o2ib. The ost_connect operation failed with -16 I''ve increased the timeout to 300seconds and it has helped marginally. -Aaron On Feb 9, 2008, at 12:06 AM, Tom.Wang wrote:> Hi, > Aha, this is bug has been fixed in 14360. > > https://bugzilla.lustre.org/show_bug.cgi?id=14360 > > The patch there should fix your problem, which should be released in > 1.6.5 > > Thanks > > Brock Palen wrote: >> Sure, Attached, note though, we rebuilt our lustre source for >> another >> box that uses the largesmp kernel. but it used the same options and >> compiler. >> >> >> Brock Palen >> Center for Advanced Computing >> brockp at umich.edu >> (734)936-1985 >> >> >> On Feb 8, 2008, at 2:47 PM, Tom.Wang wrote: >> >>> Hello, >>> >>> m45_amp214_om D 0000000000000000 0 2587 1 31389 >>> 2586 (NOTLB) >>> 00000101f6b435f8 0000000000000006 000001022c7fc030 0000000000000001 >>> 00000100080f1a40 0000000000000246 00000101f6b435a8 >>> 0000000380136025 >>> 00000102270a1030 00000000000000d0 >>> Call Trace:<ffffffffa0216e79>{:lnet:LNetPut+1689} >>> <ffffffff8030e45f>{__down+147} >>> <ffffffff80134659>{default_wake_function+0} >>> <ffffffff8030ff7d>{__down_failed+53} >>> <ffffffffa04292e1>{:lustre:.text.lock.file+5} >>> <ffffffffa044b12e>{:lustre:ll_mdc_blocking_ast+798} >>> <ffffffffa02c8eb8>{:ptlrpc:ldlm_resource_get+456} >>> <ffffffffa02c3bbb>{:ptlrpc:ldlm_cancel_callback+107} >>> <ffffffffa02da615>{:ptlrpc:ldlm_cli_cancel_local+213} >>> <ffffffffa02c3c48>{:ptlrpc:ldlm_lock_addref_internal_nolock+56} >>> <ffffffffa02c3dbc>{:ptlrpc:search_queue+284} >>> <ffffffffa02dbc03>{:ptlrpc:ldlm_cancel_list+99} >>> <ffffffffa02dc113>{:ptlrpc:ldlm_cancel_lru_local+915} >>> <ffffffffa02ca293>{:ptlrpc:ldlm_resource_putref+435} >>> <ffffffffa02dc2c9>{:ptlrpc:ldlm_prep_enqueue_req+313} >>> <ffffffffa0394e6f>{:mdc:mdc_enqueue+1023} >>> <ffffffffa02c1035>{:ptlrpc:lock_res_and_lock+53} >>> <ffffffffa0268730>{:obdclass:class_handle2object+224} >>> <ffffffffa02c5fea>{:ptlrpc:__ldlm_handle2lock+794} >>> <ffffffffa02c106f>{:ptlrpc:unlock_res_and_lock+31} >>> <ffffffffa02c5c03>{:ptlrpc:ldlm_lock_decref_internal+595} >>> <ffffffffa02c156c>{:ptlrpc:ldlm_lock_add_to_lru+140} >>> <ffffffffa02c1035>{:ptlrpc:lock_res_and_lock+53} >>> <ffffffffa02c6f0a>{:ptlrpc:ldlm_lock_decref+154} >>> <ffffffffa039617d>{:mdc:mdc_intent_lock+685} >>> <ffffffffa044ae10>{:lustre:ll_mdc_blocking_ast+0} >>> <ffffffffa02d85f0>{:ptlrpc:ldlm_completion_ast+0} >>> <ffffffffa044ae10>{:lustre:ll_mdc_blocking_ast+0} >>> <ffffffffa02d85f0>{:ptlrpc:ldlm_completion_ast+0} >>> <ffffffffa044b64b>{:lustre:ll_prepare_mdc_op_data+139} >>> 
<ffffffffa0418a32>{:lustre:ll_intent_file_open+450} >>> <ffffffffa044ae10>{:lustre:ll_mdc_blocking_ast+0} >>> <ffffffff80192006>{__d_lookup+287} >>> <ffffffffa0419724>{:lustre:ll_file_open+2100} >>> <ffffffffa0428a18>{:lustre:ll_inode_permission+184} >>> <ffffffff80179bdb>{sys_access+349} >>> <ffffffff8017a1ee>{__dentry_open+201} >>> <ffffffff8017a3a9>{filp_open+95} >>> <ffffffff80179bdb>{sys_access+349} >>> <ffffffff801f00b5>{strncpy_from_user+74} >>> <ffffffff8017a598>{sys_open+57} >>> <ffffffff8011026a>{system_call+126} >>> >>> It seems blocking_ast process was blocked here. Could you dump the >>> lustre/llite/namei.o by objdump -S lustre/llite/namei.o and send >>> to me? >>> >>> Thanks >>> WangDi >>> >>> Brock Palen wrote: >>>>>> On Feb 7, 2008, at 11:09 PM, Tom.Wang wrote: >>>>>>>> MDT dmesg: >>>>>>>> >>>>>>>> LustreError: 9042:0:(ldlm_lib.c:1442:target_send_reply_msg()) >>>>>>>> @@@ processing error (-107) req at 000001002b >>>>>>>> 52b000 x445020/t0 o400-><?>@<?>:-1 lens 128/0 ref 0 fl >>>>>>>> Interpret:/0/0 rc -107/0 >>>>>>>> LustreError: 0:0:(ldlm_lockd.c:210:waiting_locks_callback()) >>>>>>>> ### >>>>>>>> lock callback timer expired: evicting cl >>>>>>>> ient >>>>>>>> 2faf3c9e-26fb-64b7-ca6c-7c5b09374e67 at NET_0x200000aa4008d_UUID >>>>>>>> nid 10.164.0.141 at tcp ns: mds-nobackup >>>>>>>> -MDT0000_UUID lock: 00000100476df240/0xbc269e05c512de3a lrc: >>>>>>>> 1/0,0 mode: CR/CR res: 11240142/324715850 bi >>>>>>>> ts 0x5 rrc: 2 type: IBT flags: 20 remote: 0x4e54bc800174cd08 >>>>>>>> expref: 372 pid 26925 >>>>>>>> >>>>>>> The client was evicted because of this lock can not be released >>>>>>> on client >>>>>>> on time. Could you provide the stack strace of client at that >>>>>>> time? >>>>>>> >>>>>>> I assume increase obd_timeout could fix your problem. Then maybe >>>>>>> you should wait 1.6.5 released, including a new feature >>>>>>> adaptive_timeout, >>>>>>> which will adjust the timeout value according to the network >>>>>>> congestion >>>>>>> and server load. And it should help your problem. >>>>>> >>>>>> Waiting for the next version of lustre might be the best >>>>>> thing. I >>>>>> had upped the timeout a few days back but the next day i had >>>>>> errors on the MDS box. I have switched it back: >>>>>> >>>>>> lctl conf_param nobackup-MDT0000.sys.timeout=300 >>>>>> >>>>>> I would love to give you that trace but I don''t know how to get >>>>>> it. Is there a debug option to turn on in the clients? >>>>> You can get that by echo t > /proc/sysrq-trigger on client. >>>>> >>>> Cool command, output of the client is attached. The four >>>> processes >>>> m45_amp214_om, is the application that hung when working off of >>>> luster. you can see its stuck in IO state. 
>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>> ------------------------------------------------------------------------ >>>> >>>> >>>> _______________________________________________ >>>> Lustre-discuss mailing list >>>> Lustre-discuss at lists.lustre.org >>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss >>> >>> >>> >> ------------------------------------------------------------------------ >> >> _______________________________________________ >> Lustre-discuss mailing list >> Lustre-discuss at lists.lustre.org >> http://lists.lustre.org/mailman/listinfo/lustre-discuss > > _______________________________________________ > Lustre-discuss mailing list > Lustre-discuss at lists.lustre.org > http://lists.lustre.org/mailman/listinfo/lustre-discussAaron Knister Associate Systems Analyst Center for Ocean-Land-Atmosphere Studies (301) 595-7000 aaron at iges.org
Aaron Knister wrote:> I'm having a similar issue with lustre 1.6.4.2 and infiniband. Under > load, the clients hang about every 10 minutes, which is really bad for > a production machine. The only way to fix the hang is to reboot the > server. My users are getting extremely impatient :-/ > > I see this on the clients- > > LustreError: 2814:0:(client.c:975:ptlrpc_expire_one_request()) @@@ > timeout (sent at 1202756629, 301s ago) req at ffff8100af233600 > x1796079/t0 o6->data-OST0000_UUID at 192.168.64.71@o2ib:28 lens 336/336 > ref 1 fl Rpc:/0/0 rc 0/-22 It means the OST could not respond to the request (unlink, o6) within 300 seconds, so the client disconnected its import to the OST and is trying to reconnect. Does this disconnection always happen when doing unlinks? Could you please post a process trace and the console messages of the OST at that time? Thanks WangDi> Lustre: data-OST0000-osc-ffff810139ce4800: Connection to service > data-OST0000 via nid 192.168.64.71 at o2ib was lost; in progress > operations using this service will wait for recovery to complete. > LustreError: 11-0: an error occurred while communicating with > 192.168.64.71 at o2ib. The ost_connect operation failed with -16 > LustreError: 11-0: an error occurred while communicating with > 192.168.64.71 at o2ib. The ost_connect operation failed with -16 > > I've increased the timeout to 300 seconds and it has helped marginally. > > -Aaron >> > > > >
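One way to gather what WangDi asks for here, as a sketch (run as root on the OSS while a client is hung; filenames are arbitrary):

    echo t > /proc/sysrq-trigger          # dump every task's stack on the OSS
    dmesg > /tmp/oss-console.txt          # save the console/ring-buffer messages, including the task dump
    lctl dk /tmp/oss-lustre-debug.txt     # optionally also dump the Lustre debug log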
Aaron Knister wrote:> I'm having a similar issue with lustre 1.6.4.2 and infiniband. Under > load, the clients hang about every 10 minutes, which is really bad for > a production machine. The only way to fix the hang is to reboot the > server. My users are getting extremely impatient :-/ > > I see this on the clients- > > LustreError: 2814:0:(client.c:975:ptlrpc_expire_one_request()) @@@ > timeout (sent at 1202756629, 301s ago) req at ffff8100af233600 x1796079/ > t0 o6->data-OST0000_UUID at 192.168.64.71@o2ib:28 lens 336/336 ref 1 fl > Rpc:/0/0 rc 0/-22 > Lustre: data-OST0000-osc-ffff810139ce4800: Connection to service data- > OST0000 via nid 192.168.64.71 at o2ib was lost; in progress operations > using this service will wait for recovery to complete. > LustreError: 11-0: an error occurred while communicating with > 192.168.64.71 at o2ib. The ost_connect operation failed with -16 > LustreError: 11-0: an error occurred while communicating with > 192.168.64.71 at o2ib. The ost_connect operation failed with -16 > > I've increased the timeout to 300 seconds and it has helped marginally. Hi Aaron; We set the timeout to a big number (1000 secs) on our 400-node cluster (mostly o2ib, some tcp clients). Until we did this, we had loads of evictions. In our case, it solved the problem. Cheers, Craig
>> I've increased the timeout to 300 seconds and it has helped >> marginally. > > Hi Aaron; > > We set the timeout to a big number (1000 secs) on our 400-node cluster > (mostly o2ib, some tcp clients). Until we did this, we had loads > of evictions. In our case, it solved the problem. This feels excessive. But at this point I guess I'll try it.> > Cheers, > Craig > _______________________________________________ > Lustre-discuss mailing list > Lustre-discuss at lists.lustre.org > http://lists.lustre.org/mailman/listinfo/lustre-discuss > >
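For anyone following the thread, raising the timeout uses the same conf_param syntax quoted earlier; a sketch, with an illustrative value and "nobackup" standing in for the local filesystem name:

    # on the MGS node:
    lctl conf_param nobackup-MDT0000.sys.timeout=1000
    # on a client, the value currently in effect should show up at the 1.6.x proc path (check your build):
    cat /proc/sys/lustre/timeout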
So far it's helped. If this doesn't fix it, I'm going to apply the patch mentioned here - https://bugzilla.lustre.org/attachment.cgi?id=14006&action=edit I'll let you know how it goes. If you'd like a copy of the patched version, let me know. Are you running RHEL/SLES? What version of the OS and Lustre? -Aaron On Feb 11, 2008, at 4:17 PM, Brock Palen wrote:>>> I've increased the timeout to 300 seconds and it has helped >>> marginally. >> >> Hi Aaron; >> >> We set the timeout to a big number (1000 secs) on our 400-node cluster >> (mostly o2ib, some tcp clients). Until we did this, we had loads >> of evictions. In our case, it solved the problem. > > This feels excessive. But at this point I guess I'll try it. > >> >> Cheers, >> Craig >> _______________________________________________ >> Lustre-discuss mailing list >> Lustre-discuss at lists.lustre.org >> http://lists.lustre.org/mailman/listinfo/lustre-discuss >> >> >Aaron Knister Associate Systems Analyst Center for Ocean-Land-Atmosphere Studies (301) 595-7000 aaron at iges.org
RHEL4 x86_64, lustre-1.6.4.1. I'll wait and see if this helps. We won't be running any patched kernels outside of the OSS/MGS/MDS ever. Brock Palen Center for Advanced Computing brockp at umich.edu (734)936-1985 On Feb 11, 2008, at 4:48 PM, Aaron Knister wrote:> So far it's helped. If this doesn't fix it I'm going to apply the > patch mentioned here - https://bugzilla.lustre.org/attachment.cgi? > id=14006&action=edit I'll let you know how it goes. If you'd like a > copy of the patched version let me know. Are you running RHEL/SLES? > what version of the OS and lustre? > > -Aaron > > On Feb 11, 2008, at 4:17 PM, Brock Palen wrote: > >>>> I've increased the timeout to 300 seconds and it has helped >>>> marginally. >>> >>> Hi Aaron; >>> >>> We set the timeout to a big number (1000 secs) on our 400-node cluster >>> (mostly o2ib, some tcp clients). Until we did this, we had loads >>> of evictions. In our case, it solved the problem. >> >> This feels excessive. But at this point I guess I'll try it. >> >>> >>> Cheers, >>> Craig >>> _______________________________________________ >>> Lustre-discuss mailing list >>> Lustre-discuss at lists.lustre.org >>> http://lists.lustre.org/mailman/listinfo/lustre-discuss >>> >>> >> > > Aaron Knister > Associate Systems Analyst > Center for Ocean-Land-Atmosphere Studies > > (301) 595-7000 > aaron at iges.org > > > > > >
Hi Aaron, FYI, the patch in 14360 is unlikely to help your problem, since the problem here seems to be that the OST load is too high or the OST is stuck somewhere, so we need more information. Actually, we have met some similar problems with unlink before. If increasing obd_timeout helps you, that is good. But if it is not much trouble, could you provide a stack trace and the console messages of the OST at that time? That will help us figure out what happened there. Thanks WangDi Aaron Knister wrote:> So far it's helped. If this doesn't fix it I'm going to apply the > patch mentioned here - https://bugzilla.lustre.org/attachment.cgi?id=14006&action=edit > I'll let you know how it goes. If you'd like a copy of the patched > version let me know. Are you running RHEL/SLES? what version of the OS > and lustre? > > -Aaron > > On Feb 11, 2008, at 4:17 PM, Brock Palen wrote: > > >>>> I've increased the timeout to 300 seconds and it has helped >>>> marginally. >>>> >>> Hi Aaron; >>> >>> We set the timeout to a big number (1000 secs) on our 400-node cluster >>> (mostly o2ib, some tcp clients). Until we did this, we had loads >>> of evictions. In our case, it solved the problem. >>> >> This feels excessive. But at this point I guess I'll try it. >> >> >>> Cheers, >>> Craig >>> _______________________________________________ >>> Lustre-discuss mailing list >>> Lustre-discuss at lists.lustre.org >>> http://lists.lustre.org/mailman/listinfo/lustre-discuss >>> >>> >>> > > Aaron Knister > Associate Systems Analyst > Center for Ocean-Land-Atmosphere Studies > > (301) 595-7000 > aaron at iges.org > > > > > _______________________________________________ > Lustre-discuss mailing list > Lustre-discuss at lists.lustre.org > http://lists.lustre.org/mailman/listinfo/lustre-discuss >
Aaron, We are running 1.6.3 with some patches that we applied by hand after rummaging through the Lustre bugzilla database. We run CentOS 5.0 on the servers and 4.5 on the clients with an updated kernel. # uname -a Linux submit.ufhpc 2.6.18-8.1.14.el5Lustre #1 SMP Fri Oct 12 15:51:56 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux We also run OFED 1.2, which we built with Lustre by configuring IB out of the CentOS kernel entirely and then installing OFED. We then build the Lustre modules against the resulting kernel and IB modules. We are pretty stable right now and are very pleased with Lustre. It took a little work to get there with the base 1.6.3 release, which we needed for o2ib nids, but it has worked out for us so far. BTW, we agree with the post of a few days ago. We think the Lustre team has done a fantastic job for the open source community. Thanks, Charlie Taylor UF HPC Center On Feb 11, 2008, at 4:48 PM, Aaron Knister wrote:> So far it's helped. If this doesn't fix it I'm going to apply the > patch mentioned here - https://bugzilla.lustre.org/attachment.cgi? > id=14006&action=edit > I'll let you know how it goes. If you'd like a > copy of the patched > version let me know. Are you running RHEL/SLES? what version of the OS > and lustre? > > -Aaron > > On Feb 11, 2008, at 4:17 PM, Brock Palen wrote: > >>>> I've increased the timeout to 300 seconds and it has helped >>>> marginally. >>> >>> Hi Aaron; >>> >>> We set the timeout to a big number (1000 secs) on our 400-node cluster >>> (mostly o2ib, some tcp clients). Until we did this, we had loads >>> of evictions. In our case, it solved the problem. >> >> This feels excessive. But at this point I guess I'll try it. >> >>> >>> Cheers, >>> Craig >>> _______________________________________________ >>> Lustre-discuss mailing list >>> Lustre-discuss at lists.lustre.org >>> http://lists.lustre.org/mailman/listinfo/lustre-discuss >>> >>> >> > > Aaron Knister > Associate Systems Analyst > Center for Ocean-Land-Atmosphere Studies > > (301) 595-7000 > aaron at iges.org > > > > > _______________________________________________ > Lustre-discuss mailing list > Lustre-discuss at lists.lustre.org > http://lists.lustre.org/mailman/listinfo/lustre-discuss
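A rough sketch of the kind of build Charlie describes; the option names are from memory of the 1.6-era configure script and the paths are placeholders, so verify them against ./configure --help in your own tree:

    cd lustre-1.6.x
    ./configure --with-linux=/usr/src/kernels/2.6.18-8.1.14.el5-x86_64 \
                --with-o2ib=/usr/src/ofa_kernel    # point at the OFED kernel source tree
    make rpms                                      # or plain make followed by a module install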
I found that something is getting overloaded some place. If i just go start and stop a job over and over quickly the client will lose contact with one of the servers, ether OST or MDT. Would more ram in the servers help? I dont see a high load or IO wait, but both servers are older (dual 1.4Ghz amd) with only 2 gb of memory. Brock Palen Center for Advanced Computing brockp at umich.edu (734)936-1985 On Feb 8, 2008, at 2:47 PM, Tom.Wang wrote:> Hello, > > m45_amp214_om D 0000000000000000 0 2587 1 31389 > 2586 (NOTLB) > 00000101f6b435f8 0000000000000006 000001022c7fc030 0000000000000001 > 00000100080f1a40 0000000000000246 00000101f6b435a8 > 0000000380136025 > 00000102270a1030 00000000000000d0 > Call Trace:<ffffffffa0216e79>{:lnet:LNetPut+1689} <ffffffff8030e45f> > {__down+147} > <ffffffff80134659>{default_wake_function+0} <ffffffff8030ff7d> > {__down_failed+53} > <ffffffffa04292e1>{:lustre:.text.lock.file+5} > <ffffffffa044b12e>{:lustre:ll_mdc_blocking_ast+798} > <ffffffffa02c8eb8>{:ptlrpc:ldlm_resource_get+456} > <ffffffffa02c3bbb>{:ptlrpc:ldlm_cancel_callback+107} > <ffffffffa02da615>{:ptlrpc:ldlm_cli_cancel_local+213} > <ffffffffa02c3c48>{:ptlrpc:ldlm_lock_addref_internal_nolock+56} > <ffffffffa02c3dbc>{:ptlrpc:search_queue+284} > <ffffffffa02dbc03>{:ptlrpc:ldlm_cancel_list+99} > <ffffffffa02dc113>{:ptlrpc:ldlm_cancel_lru_local+915} > <ffffffffa02ca293>{:ptlrpc:ldlm_resource_putref+435} > <ffffffffa02dc2c9>{:ptlrpc:ldlm_prep_enqueue_req+313} > <ffffffffa0394e6f>{:mdc:mdc_enqueue+1023} <ffffffffa02c1035> > {:ptlrpc:lock_res_and_lock+53} > <ffffffffa0268730>{:obdclass:class_handle2object+224} > <ffffffffa02c5fea>{:ptlrpc:__ldlm_handle2lock+794} > <ffffffffa02c106f>{:ptlrpc:unlock_res_and_lock+31} > <ffffffffa02c5c03>{:ptlrpc:ldlm_lock_decref_internal+595} > <ffffffffa02c156c>{:ptlrpc:ldlm_lock_add_to_lru+140} > <ffffffffa02c1035>{:ptlrpc:lock_res_and_lock+53} > <ffffffffa02c6f0a>{:ptlrpc:ldlm_lock_decref+154} > <ffffffffa039617d>{:mdc:mdc_intent_lock+685} > <ffffffffa044ae10>{:lustre:ll_mdc_blocking_ast+0} > <ffffffffa02d85f0>{:ptlrpc:ldlm_completion_ast+0} > <ffffffffa044ae10>{:lustre:ll_mdc_blocking_ast+0} > <ffffffffa02d85f0>{:ptlrpc:ldlm_completion_ast+0} > <ffffffffa044b64b>{:lustre:ll_prepare_mdc_op_data+139} > <ffffffffa0418a32>{:lustre:ll_intent_file_open+450} > <ffffffffa044ae10>{:lustre:ll_mdc_blocking_ast+0} > <ffffffff80192006>{__d_lookup+287} > <ffffffffa0419724>{:lustre:ll_file_open+2100} > <ffffffffa0428a18>{:lustre:ll_inode_permission+184} > <ffffffff80179bdb>{sys_access+349} <ffffffff8017a1ee> > {__dentry_open+201} > <ffffffff8017a3a9>{filp_open+95} <ffffffff80179bdb>{sys_access > +349} > <ffffffff801f00b5>{strncpy_from_user+74} <ffffffff8017a598> > {sys_open+57} > <ffffffff8011026a>{system_call+126} > > It seems blocking_ast process was blocked here. Could you dump the > lustre/llite/namei.o by objdump -S lustre/llite/namei.o and send > to me? 
> > Thanks > WangDi > > Brock Palen wrote: >>>> On Feb 7, 2008, at 11:09 PM, Tom.Wang wrote: >>>>>> MDT dmesg: >>>>>> >>>>>> LustreError: 9042:0:(ldlm_lib.c:1442:target_send_reply_msg()) >>>>>> @@@ processing error (-107) req at 000001002b >>>>>> 52b000 x445020/t0 o400-><?>@<?>:-1 lens 128/0 ref 0 fl >>>>>> Interpret:/0/0 rc -107/0 >>>>>> LustreError: 0:0:(ldlm_lockd.c:210:waiting_locks_callback()) >>>>>> ### lock callback timer expired: evicting cl >>>>>> ient 2faf3c9e-26fb-64b7- >>>>>> ca6c-7c5b09374e67 at NET_0x200000aa4008d_UUID nid >>>>>> 10.164.0.141 at tcp ns: mds-nobackup >>>>>> -MDT0000_UUID lock: 00000100476df240/0xbc269e05c512de3a lrc: >>>>>> 1/0,0 mode: CR/CR res: 11240142/324715850 bi >>>>>> ts 0x5 rrc: 2 type: IBT flags: 20 remote: 0x4e54bc800174cd08 >>>>>> expref: 372 pid 26925 >>>>>> >>>>> The client was evicted because of this lock can not be released >>>>> on client >>>>> on time. Could you provide the stack strace of client at that >>>>> time? >>>>> >>>>> I assume increase obd_timeout could fix your problem. Then maybe >>>>> you should wait 1.6.5 released, including a new feature >>>>> adaptive_timeout, >>>>> which will adjust the timeout value according to the network >>>>> congestion >>>>> and server load. And it should help your problem. >>>> >>>> Waiting for the next version of lustre might be the best thing. >>>> I had upped the timeout a few days back but the next day i had >>>> errors on the MDS box. I have switched it back: >>>> >>>> lctl conf_param nobackup-MDT0000.sys.timeout=300 >>>> >>>> I would love to give you that trace but I don''t know how to get >>>> it. Is there a debug option to turn on in the clients? >>> You can get that by echo t > /proc/sysrq-trigger on client. >>> >> Cool command, output of the client is attached. The four >> processes m45_amp214_om, is the application that hung when >> working off of luster. you can see its stuck in IO state. >> >>> >>> >>> >>> >>> >> --------------------------------------------------------------------- >> --- >> >> _______________________________________________ >> Lustre-discuss mailing list >> Lustre-discuss at lists.lustre.org >> http://lists.lustre.org/mailman/listinfo/lustre-discuss > > >
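To answer the RAM/load question above empirically, it may be enough to watch both servers while the job start/stop loop runs; a sketch using standard tools (nothing Lustre-specific, the interval is arbitrary):

    vmstat 5        # watch free memory, swap activity and the run queue
    iostat -x 5     # watch per-disk utilisation and service times (from the sysstat package)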
Brock Palen wrote:> I found that something is getting overloaded some place. If i just > go start and stop a job over and over quickly the client will lose > contact with one of the servers, ether OST or MDT. > >Server might be stuck somewhere. It should depend on what does the job do if you start and stop it over and over? Will it create, then unlink a lot of file if you start and stop the job? Whether the memory will help your problem depend on what triggers this server stuck. Could you find some console error msg when they are stuck? Usually memory is more helpful on MDS, if you have a large number of clients and big directory in your system. Memory on OST is also helpful, but not directly for read and write. I think you can find the reason of this easily on the list, and there are many discussion about the hardware requirement for lustre before. Thanks WangDi> Would more ram in the servers help? I dont see a high load or IO > wait, but both servers are older (dual 1.4Ghz amd) with only 2 gb of > memory. > > Brock Palen > Center for Advanced Computing > brockp at umich.edu > (734)936-1985 > > > On Feb 8, 2008, at 2:47 PM, Tom.Wang wrote: > > >> Hello, >> >> m45_amp214_om D 0000000000000000 0 2587 1 31389 >> 2586 (NOTLB) >> 00000101f6b435f8 0000000000000006 000001022c7fc030 0000000000000001 >> 00000100080f1a40 0000000000000246 00000101f6b435a8 >> 0000000380136025 >> 00000102270a1030 00000000000000d0 >> Call Trace:<ffffffffa0216e79>{:lnet:LNetPut+1689} <ffffffff8030e45f> >> {__down+147} >> <ffffffff80134659>{default_wake_function+0} <ffffffff8030ff7d> >> {__down_failed+53} >> <ffffffffa04292e1>{:lustre:.text.lock.file+5} >> <ffffffffa044b12e>{:lustre:ll_mdc_blocking_ast+798} >> <ffffffffa02c8eb8>{:ptlrpc:ldlm_resource_get+456} >> <ffffffffa02c3bbb>{:ptlrpc:ldlm_cancel_callback+107} >> <ffffffffa02da615>{:ptlrpc:ldlm_cli_cancel_local+213} >> <ffffffffa02c3c48>{:ptlrpc:ldlm_lock_addref_internal_nolock+56} >> <ffffffffa02c3dbc>{:ptlrpc:search_queue+284} >> <ffffffffa02dbc03>{:ptlrpc:ldlm_cancel_list+99} >> <ffffffffa02dc113>{:ptlrpc:ldlm_cancel_lru_local+915} >> <ffffffffa02ca293>{:ptlrpc:ldlm_resource_putref+435} >> <ffffffffa02dc2c9>{:ptlrpc:ldlm_prep_enqueue_req+313} >> <ffffffffa0394e6f>{:mdc:mdc_enqueue+1023} <ffffffffa02c1035> >> {:ptlrpc:lock_res_and_lock+53} >> <ffffffffa0268730>{:obdclass:class_handle2object+224} >> <ffffffffa02c5fea>{:ptlrpc:__ldlm_handle2lock+794} >> <ffffffffa02c106f>{:ptlrpc:unlock_res_and_lock+31} >> <ffffffffa02c5c03>{:ptlrpc:ldlm_lock_decref_internal+595} >> <ffffffffa02c156c>{:ptlrpc:ldlm_lock_add_to_lru+140} >> <ffffffffa02c1035>{:ptlrpc:lock_res_and_lock+53} >> <ffffffffa02c6f0a>{:ptlrpc:ldlm_lock_decref+154} >> <ffffffffa039617d>{:mdc:mdc_intent_lock+685} >> <ffffffffa044ae10>{:lustre:ll_mdc_blocking_ast+0} >> <ffffffffa02d85f0>{:ptlrpc:ldlm_completion_ast+0} >> <ffffffffa044ae10>{:lustre:ll_mdc_blocking_ast+0} >> <ffffffffa02d85f0>{:ptlrpc:ldlm_completion_ast+0} >> <ffffffffa044b64b>{:lustre:ll_prepare_mdc_op_data+139} >> <ffffffffa0418a32>{:lustre:ll_intent_file_open+450} >> <ffffffffa044ae10>{:lustre:ll_mdc_blocking_ast+0} >> <ffffffff80192006>{__d_lookup+287} >> <ffffffffa0419724>{:lustre:ll_file_open+2100} >> <ffffffffa0428a18>{:lustre:ll_inode_permission+184} >> <ffffffff80179bdb>{sys_access+349} <ffffffff8017a1ee> >> {__dentry_open+201} >> <ffffffff8017a3a9>{filp_open+95} <ffffffff80179bdb>{sys_access >> +349} >> <ffffffff801f00b5>{strncpy_from_user+74} <ffffffff8017a598> >> {sys_open+57} >> <ffffffff8011026a>{system_call+126} >> >> It seems 
blocking_ast process was blocked here. Could you dump the >> lustre/llite/namei.o by objdump -S lustre/llite/namei.o and send >> to me? >> >> Thanks >> WangDi >> >> Brock Palen wrote: >> >>>>> On Feb 7, 2008, at 11:09 PM, Tom.Wang wrote: >>>>> >>>>>>> MDT dmesg: >>>>>>> >>>>>>> LustreError: 9042:0:(ldlm_lib.c:1442:target_send_reply_msg()) >>>>>>> @@@ processing error (-107) req at 000001002b >>>>>>> 52b000 x445020/t0 o400-><?>@<?>:-1 lens 128/0 ref 0 fl >>>>>>> Interpret:/0/0 rc -107/0 >>>>>>> LustreError: 0:0:(ldlm_lockd.c:210:waiting_locks_callback()) >>>>>>> ### lock callback timer expired: evicting cl >>>>>>> ient 2faf3c9e-26fb-64b7- >>>>>>> ca6c-7c5b09374e67 at NET_0x200000aa4008d_UUID nid >>>>>>> 10.164.0.141 at tcp ns: mds-nobackup >>>>>>> -MDT0000_UUID lock: 00000100476df240/0xbc269e05c512de3a lrc: >>>>>>> 1/0,0 mode: CR/CR res: 11240142/324715850 bi >>>>>>> ts 0x5 rrc: 2 type: IBT flags: 20 remote: 0x4e54bc800174cd08 >>>>>>> expref: 372 pid 26925 >>>>>>> >>>>>>> >>>>>> The client was evicted because of this lock can not be released >>>>>> on client >>>>>> on time. Could you provide the stack strace of client at that >>>>>> time? >>>>>> >>>>>> I assume increase obd_timeout could fix your problem. Then maybe >>>>>> you should wait 1.6.5 released, including a new feature >>>>>> adaptive_timeout, >>>>>> which will adjust the timeout value according to the network >>>>>> congestion >>>>>> and server load. And it should help your problem. >>>>>> >>>>> Waiting for the next version of lustre might be the best thing. >>>>> I had upped the timeout a few days back but the next day i had >>>>> errors on the MDS box. I have switched it back: >>>>> >>>>> lctl conf_param nobackup-MDT0000.sys.timeout=300 >>>>> >>>>> I would love to give you that trace but I don''t know how to get >>>>> it. Is there a debug option to turn on in the clients? >>>>> >>>> You can get that by echo t > /proc/sysrq-trigger on client. >>>> >>>> >>> Cool command, output of the client is attached. The four >>> processes m45_amp214_om, is the application that hung when >>> working off of luster. you can see its stuck in IO state. >>> >>> >>>> >>>> >>>> >>>> >>> --------------------------------------------------------------------- >>> --- >>> >>> _______________________________________________ >>> Lustre-discuss mailing list >>> Lustre-discuss at lists.lustre.org >>> http://lists.lustre.org/mailman/listinfo/lustre-discuss >>> >> >> > > _______________________________________________ > Lustre-discuss mailing list > Lustre-discuss at lists.lustre.org > http://lists.lustre.org/mailman/listinfo/lustre-discuss >