ashok bharat bayana
2008-Feb-12 05:48 UTC
[Lustre-discuss] Contents of Lustre-discuss digest...
Hi,
I just want to know whether there are any alternative file systems to HP SFS. I have heard of the Cluster Gateway from PolyServe. Can anybody please help me find out more about this Cluster Gateway?

Thanks and Regards,
Ashok Bharat

-----Original Message-----
From: lustre-discuss-bounces at lists.lustre.org on behalf of lustre-discuss-request at lists.lustre.org
Sent: Tue 2/12/2008 11:05 AM
To: lustre-discuss at lists.lustre.org
Subject: Lustre-discuss Digest, Vol 25, Issue 19

Send Lustre-discuss mailing list submissions to
	lustre-discuss at lists.lustre.org

To subscribe or unsubscribe via the World Wide Web, visit
	http://lists.lustre.org/mailman/listinfo/lustre-discuss
or, via email, send a message with subject or body 'help' to
	lustre-discuss-request at lists.lustre.org

You can reach the person managing the list at
	lustre-discuss-owner at lists.lustre.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Lustre-discuss digest..."

Today's Topics:

   1. Re: multihomed clients ignoring lnet options (Cliff White)
   2. Re: multihomed clients ignoring lnet options (Joe Little)
   3. Re: multihomed clients ignoring lnet options (Steden Klaus)
   4. Re: Lustre-discuss Digest, Vol 25, Issue 17 (ashok bharat bayana)

----------------------------------------------------------------------

Message: 1
Date: Mon, 11 Feb 2008 20:00:10 -0800
From: Cliff White <Cliff.White at Sun.COM>
Subject: Re: [Lustre-discuss] multihomed clients ignoring lnet options
To: Aaron Knister <aaron at iges.org>
Cc: lustre-discuss at lists.lustre.org

Aaron Knister wrote:
> I believe that's correct. The NIDs of the various server components
> are stored on the filesystem itself.

Yes, and you can always see them with
tunefs.lustre --print <device>

cliffw

> On Feb 10, 2008, at 12:58 AM, Joe Little wrote:
>
>> Never mind... The problem was resolved by recreating the MGS and
>> the OSTs again, using the same parameters on the server. I was able to
>> change the parameters and still have the servers working, but my guess
>> is that those options are permanently etched into the filesystem.
>>
>> On Feb 9, 2008 8:16 PM, Joe Little <jmlittle at gmail.com> wrote:
>>> I have all of my servers and clients using eth1 for the tcp lustre
>>> lnet.
>>>
>>> All have modprobe.conf entries of:
>>>
>>> options lnet networks="tcp0(eth1)"
>>>
>>> and all report with "lctl list_nids" that they are using the IP
>>> address associated with that interface (a net 192.168.200.x address).
>>>
>>> However, when my client connects, it ignores the above and goes with
>>> eth0 for routing, even though the mds/mgs is on that network range:
>>>
>>> client dmesg:
>>>
>>> Lustre: 4756:0:(module.c:382:init_libcfs_module()) maximum lustre
>>> stack 8192
>>> Lustre: Added LNI 192.168.200.100 at tcp [8/256]
>>> Lustre: Accept secure, port 988
>>> Lustre: OBD class driver, info at clusterfs.com
>>> Lustre Version: 1.6.4.2
>>> Build Version:
>>> 1.6.4.2-19691231190000-PRISTINE-.cache.build.BUILD.lustre-
>>> kernel-2.6.9.lustre.linux-2.6.9-55.0.9.EL_lustre.1.6.4.2smp
>>> Lustre: Lustre Client File System; info at clusterfs.com
>>> LustreError: 4799:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error
>>> -104 reading HELLO from 192.168.2.201
>>> LustreError: 11b-b: Connection to 192.168.2.201 at tcp at host
>>> 192.168.2.201 on port 988 was reset: is it running a compatible
>>> version of Lustre and is 192.168.2.201 at tcp one of its NIDs?
>>>
>>> server dmesg:
>>> LustreError: 120-3: Refusing connection from 192.168.2.192 for
>>> 192.168.2.201 at tcp: No matching NI
>>
>> _______________________________________________
>> Lustre-discuss mailing list
>> Lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>
> Aaron Knister
> Associate Systems Analyst
> Center for Ocean-Land-Atmosphere Studies
>
> (301) 595-7000
> aaron at iges.org
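For anyone reproducing the multihome problem above, a minimal sketch of the configuration and checks being discussed; the interface name and device path are only examples:

  # /etc/modprobe.conf on every server and client: restrict LNET to eth1
  options lnet networks="tcp0(eth1)"

  # after reloading the Lustre modules, confirm the NID sits on the eth1 address
  lctl list_nids

  # on a server, show the NIDs and parameters stored on a target
  tunefs.lustre --print /dev/sdb1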
------------------------------

Message: 2
Date: Mon, 11 Feb 2008 20:51:20 -0800
From: "Joe Little" <jmlittle at gmail.com>
Subject: Re: [Lustre-discuss] multihomed clients ignoring lnet options
To: "Cliff White" <Cliff.White at sun.com>
Cc: lustre-discuss at lists.lustre.org

On Feb 11, 2008 8:00 PM, Cliff White <Cliff.White at sun.com> wrote:
> Aaron Knister wrote:
>> I believe that's correct. The NIDs of the various server components
>> are stored on the filesystem itself.
>
> Yes, and you can always see them with
> tunefs.lustre --print <device>
>
> cliffw

Is there any way for anyone to change them after the fact?

------------------------------

Message: 3
Date: Mon, 11 Feb 2008 20:53:41 -0800
From: "Steden Klaus" <Klaus.Steden at thomson.net>
Subject: Re: [Lustre-discuss] multihomed clients ignoring lnet options
To: <jmlittle at gmail.com>, <Cliff.White at sun.com>
Cc: lustre-discuss at lists.lustre.org

If you have root, you can change them using tunefs.lustre after the
file system has been shut down.

I've done this a number of times to test various lnet configs.

Klaus

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss at lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
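A rough sketch of the kind of change Klaus is describing, on a Lustre 1.6 server with the target unmounted; the mount point, device path, and MGS NID here are placeholders, and this shows only the common case of pointing a target at a new MGS NID:

  umount /mnt/ost0                       # the target must be offline first
  tunefs.lustre --print /dev/sdb1        # inspect the currently stored parameters
  tunefs.lustre --erase-params --mgsnode=192.168.200.10@tcp0 --writeconf /dev/sdb1
  mount -t lustre /dev/sdb1 /mnt/ost0

Since --writeconf regenerates the configuration logs, the usual writeconf procedure from the Lustre manual (all targets unmounted, MDT before OSTs) should be followed.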
------------------------------

Message: 4
Date: Tue, 12 Feb 2008 11:15:18 +0530
From: "ashok bharat bayana" <ashok.bharat.bayana at iiitb.ac.in>
Subject: Re: [Lustre-discuss] Lustre-discuss Digest, Vol 25, Issue 17
To: <lustre-discuss at lists.lustre.org>

Hi,
I just want to know whether there are any alternative file systems to HP SFS. I have heard of the Cluster Gateway from PolyServe. Can anybody please help me find out more about this Cluster Gateway?

Thanks and Regards,
Ashok Bharat

-----Original Message-----
From: lustre-discuss-bounces at lists.lustre.org on behalf of lustre-discuss-request at lists.lustre.org
Sent: Tue 2/12/2008 3:18 AM
To: lustre-discuss at lists.lustre.org
Subject: Lustre-discuss Digest, Vol 25, Issue 17

Send Lustre-discuss mailing list submissions to
	lustre-discuss at lists.lustre.org

To subscribe or unsubscribe via the World Wide Web, visit
	http://lists.lustre.org/mailman/listinfo/lustre-discuss
or, via email, send a message with subject or body 'help' to
	lustre-discuss-request at lists.lustre.org

You can reach the person managing the list at
	lustre-discuss-owner at lists.lustre.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Lustre-discuss digest..."

Today's Topics:

   1. Re: Benchmarking Lustre (Marty Barnaby)
   2. Re: Luster clients getting evicted (Aaron Knister)
   3. Re: Luster clients getting evicted (Tom.Wang)
   4. Re: Luster clients getting evicted (Craig Prescott)
   5. Re: rc -43: Identifier removed (Andreas Dilger)
   6. Re: Luster clients getting evicted (Brock Palen)
   7. Re: Luster clients getting evicted (Aaron Knister)

----------------------------------------------------------------------

Message: 1
Date: Mon, 11 Feb 2008 11:25:48 -0700
From: "Marty Barnaby" <mlbarna at sandia.gov>
Subject: Re: [Lustre-discuss] Benchmarking Lustre
To: "lustre-discuss at lists.lustre.org" <lustre-discuss at lists.lustre.org>

Do you have any special interests, like: writing from a true MPI job;
collective vs. independent; one-file-per-processor vs. a single, shared
file; or writing via MPI-IO vs. POSIX?

Marty Barnaby

mayur bhosle wrote:
> Hi everyone,
>
> I am a student at Georgia Tech University, and as part of a project I
> need to benchmark the Lustre file system. I did a lot of searching
> regarding possible benchmarks, but I need some advice on which
> benchmarks would be more suitable. If anyone can post a suggestion,
> that would be really helpful.
>
> Thanks in advance,
>
> Mayur

------------------------------

Message: 2
Date: Mon, 11 Feb 2008 14:16:20 -0500
From: Aaron Knister <aaron at iges.org>
Subject: Re: [Lustre-discuss] Luster clients getting evicted
To: Tom.Wang <Tom.Wang at Sun.COM>
Cc: lustre-discuss at lists.lustre.org

I'm having a similar issue with lustre 1.6.4.2 and infiniband. Under
load, the clients hang about every 10 minutes, which is really bad for
a production machine. The only way to fix the hang is to reboot the
server. My users are getting extremely impatient :-/

I see this on the clients-

LustreError: 2814:0:(client.c:975:ptlrpc_expire_one_request()) @@@
timeout (sent at 1202756629, 301s ago) req at ffff8100af233600 x1796079/
t0 o6->data-OST0000_UUID at 192.168.64.71@o2ib:28 lens 336/336 ref 1 fl
Rpc:/0/0 rc 0/-22
Lustre: data-OST0000-osc-ffff810139ce4800: Connection to service data-
OST0000 via nid 192.168.64.71 at o2ib was lost; in progress operations
using this service will wait for recovery to complete.
LustreError: 11-0: an error occurred while communicating with
192.168.64.71 at o2ib. The ost_connect operation failed with -16
LustreError: 11-0: an error occurred while communicating with
192.168.64.71 at o2ib.
The ost_connect operation failed with -16 I''ve increased the timeout to 300seconds and it has helped marginally. -Aaron On Feb 9, 2008, at 12:06 AM, Tom.Wang wrote:> Hi, > Aha, this is bug has been fixed in 14360. > > https://bugzilla.lustre.org/show_bug.cgi?id=14360 > > The patch there should fix your problem, which should be released in > 1.6.5 > > Thanks > > Brock Palen wrote: >> Sure, Attached, note though, we rebuilt our lustre source for >> another >> box that uses the largesmp kernel. but it used the same options and >> compiler. >> >> >> Brock Palen >> Center for Advanced Computing >> brockp at umich.edu >> (734)936-1985 >> >> >> On Feb 8, 2008, at 2:47 PM, Tom.Wang wrote: >> >>> Hello, >>> >>> m45_amp214_om D 0000000000000000 0 2587 1 31389 >>> 2586 (NOTLB) >>> 00000101f6b435f8 0000000000000006 000001022c7fc030 0000000000000001 >>> 00000100080f1a40 0000000000000246 00000101f6b435a8 >>> 0000000380136025 >>> 00000102270a1030 00000000000000d0 >>> Call Trace:<ffffffffa0216e79>{:lnet:LNetPut+1689} >>> <ffffffff8030e45f>{__down+147} >>> <ffffffff80134659>{default_wake_function+0} >>> <ffffffff8030ff7d>{__down_failed+53} >>> <ffffffffa04292e1>{:lustre:.text.lock.file+5} >>> <ffffffffa044b12e>{:lustre:ll_mdc_blocking_ast+798} >>> <ffffffffa02c8eb8>{:ptlrpc:ldlm_resource_get+456} >>> <ffffffffa02c3bbb>{:ptlrpc:ldlm_cancel_callback+107} >>> <ffffffffa02da615>{:ptlrpc:ldlm_cli_cancel_local+213} >>> <ffffffffa02c3c48>{:ptlrpc:ldlm_lock_addref_internal_nolock+56} >>> <ffffffffa02c3dbc>{:ptlrpc:search_queue+284} >>> <ffffffffa02dbc03>{:ptlrpc:ldlm_cancel_list+99} >>> <ffffffffa02dc113>{:ptlrpc:ldlm_cancel_lru_local+915} >>> <ffffffffa02ca293>{:ptlrpc:ldlm_resource_putref+435} >>> <ffffffffa02dc2c9>{:ptlrpc:ldlm_prep_enqueue_req+313} >>> <ffffffffa0394e6f>{:mdc:mdc_enqueue+1023} >>> <ffffffffa02c1035>{:ptlrpc:lock_res_and_lock+53} >>> <ffffffffa0268730>{:obdclass:class_handle2object+224} >>> <ffffffffa02c5fea>{:ptlrpc:__ldlm_handle2lock+794} >>> <ffffffffa02c106f>{:ptlrpc:unlock_res_and_lock+31} >>> <ffffffffa02c5c03>{:ptlrpc:ldlm_lock_decref_internal+595} >>> <ffffffffa02c156c>{:ptlrpc:ldlm_lock_add_to_lru+140} >>> <ffffffffa02c1035>{:ptlrpc:lock_res_and_lock+53} >>> <ffffffffa02c6f0a>{:ptlrpc:ldlm_lock_decref+154} >>> <ffffffffa039617d>{:mdc:mdc_intent_lock+685} >>> <ffffffffa044ae10>{:lustre:ll_mdc_blocking_ast+0} >>> <ffffffffa02d85f0>{:ptlrpc:ldlm_completion_ast+0} >>> <ffffffffa044ae10>{:lustre:ll_mdc_blocking_ast+0} >>> <ffffffffa02d85f0>{:ptlrpc:ldlm_completion_ast+0} >>> <ffffffffa044b64b>{:lustre:ll_prepare_mdc_op_data+139} >>> <ffffffffa0418a32>{:lustre:ll_intent_file_open+450} >>> <ffffffffa044ae10>{:lustre:ll_mdc_blocking_ast+0} >>> <ffffffff80192006>{__d_lookup+287} >>> <ffffffffa0419724>{:lustre:ll_file_open+2100} >>> <ffffffffa0428a18>{:lustre:ll_inode_permission+184} >>> <ffffffff80179bdb>{sys_access+349} >>> <ffffffff8017a1ee>{__dentry_open+201} >>> <ffffffff8017a3a9>{filp_open+95} >>> <ffffffff80179bdb>{sys_access+349} >>> <ffffffff801f00b5>{strncpy_from_user+74} >>> <ffffffff8017a598>{sys_open+57} >>> <ffffffff8011026a>{system_call+126} >>> >>> It seems blocking_ast process was blocked here. Could you dump the >>> lustre/llite/namei.o by objdump -S lustre/llite/namei.o and send >>> to me? 
>>> >>> Thanks >>> WangDi >>> >>> Brock Palen wrote: >>>>>> On Feb 7, 2008, at 11:09 PM, Tom.Wang wrote: >>>>>>>> MDT dmesg: >>>>>>>> >>>>>>>> LustreError: 9042:0:(ldlm_lib.c:1442:target_send_reply_msg()) >>>>>>>> @@@ processing error (-107) req at 000001002b >>>>>>>> 52b000 x445020/t0 o400-><?>@<?>:-1 lens 128/0 ref 0 fl >>>>>>>> Interpret:/0/0 rc -107/0 >>>>>>>> LustreError: 0:0:(ldlm_lockd.c:210:waiting_locks_callback()) >>>>>>>> ### >>>>>>>> lock callback timer expired: evicting cl >>>>>>>> ient >>>>>>>> 2faf3c9e-26fb-64b7-ca6c-7c5b09374e67 at NET_0x200000aa4008d_UUID >>>>>>>> nid 10.164.0.141 at tcp ns: mds-nobackup >>>>>>>> -MDT0000_UUID lock: 00000100476df240/0xbc269e05c512de3a lrc: >>>>>>>> 1/0,0 mode: CR/CR res: 11240142/324715850 bi >>>>>>>> ts 0x5 rrc: 2 type: IBT flags: 20 remote: 0x4e54bc800174cd08 >>>>>>>> expref: 372 pid 26925 >>>>>>>> >>>>>>> The client was evicted because of this lock can not be released >>>>>>> on client >>>>>>> on time. Could you provide the stack strace of client at that >>>>>>> time? >>>>>>> >>>>>>> I assume increase obd_timeout could fix your problem. Then maybe >>>>>>> you should wait 1.6.5 released, including a new feature >>>>>>> adaptive_timeout, >>>>>>> which will adjust the timeout value according to the network >>>>>>> congestion >>>>>>> and server load. And it should help your problem. >>>>>> >>>>>> Waiting for the next version of lustre might be the best >>>>>> thing. I >>>>>> had upped the timeout a few days back but the next day i had >>>>>> errors on the MDS box. I have switched it back: >>>>>> >>>>>> lctl conf_param nobackup-MDT0000.sys.timeout=300 >>>>>> >>>>>> I would love to give you that trace but I don''t know how to get >>>>>> it. Is there a debug option to turn on in the clients? >>>>> You can get that by echo t > /proc/sysrq-trigger on client. >>>>> >>>> Cool command, output of the client is attached. The four >>>> processes >>>> m45_amp214_om, is the application that hung when working off of >>>> luster. you can see its stuck in IO state. >>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>> ------------------------------------------------------------------------ >>>> >>>> >>>> _______________________________________________ >>>> Lustre-discuss mailing list >>>> Lustre-discuss at lists.lustre.org >>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss >>> >>> >>> >> ------------------------------------------------------------------------ >> >> _______________________________________________ >> Lustre-discuss mailing list >> Lustre-discuss at lists.lustre.org >> http://lists.lustre.org/mailman/listinfo/lustre-discuss > > _______________________________________________ > Lustre-discuss mailing list > Lustre-discuss at lists.lustre.org > http://lists.lustre.org/mailman/listinfo/lustre-discussAaron Knister Associate Systems Analyst Center for Ocean-Land-Atmosphere Studies (301) 595-7000 aaron at iges.org ------------------------------ Message: 3 Date: Mon, 11 Feb 2008 15:04:05 -0500 From: "Tom.Wang" <Tom.Wang at Sun.COM> Subject: Re: [Lustre-discuss] Luster clients getting evicted To: Aaron Knister <aaron at iges.org> Cc: lustre-discuss at lists.lustre.org Message-ID: <47B0AA35.7070303 at sun.com> Content-Type: text/plain; format=flowed; charset=ISO-8859-1 Aaron Knister wrote:> I''m having a similar issue with lustre 1.6.4.2 and infiniband. Under > load, the clients hand about every 10 minutes which is really bad for > a production machine. The only way to fix the hang is to reboot the > server. 
My users are getting extremely impatient :-/
>
> I see this on the clients-
>
> LustreError: 2814:0:(client.c:975:ptlrpc_expire_one_request()) @@@
> timeout (sent at 1202756629, 301s ago) req at ffff8100af233600
> x1796079/t0 o6->data-OST0000_UUID at 192.168.64.71@o2ib:28 lens 336/336
> ref 1 fl Rpc:/0/0 rc 0/-22

It means the OST could not respond to the request (unlink, o6) within 300 seconds,
so the client disconnects the import to the OST and tries to reconnect.
Does this disconnection always happen when doing an unlink? Could you
please post a process trace and the console messages of the OST at that time?

Thanks
WangDi

> Lustre: data-OST0000-osc-ffff810139ce4800: Connection to service
> data-OST0000 via nid 192.168.64.71 at o2ib was lost; in progress
> operations using this service will wait for recovery to complete.
> LustreError: 11-0: an error occurred while communicating with
> 192.168.64.71 at o2ib. The ost_connect operation failed with -16
> LustreError: 11-0: an error occurred while communicating with
> 192.168.64.71 at o2ib. The ost_connect operation failed with -16
>
> I've increased the timeout to 300seconds and it has helped marginally.
>
> -Aaron

------------------------------

Message: 4
Date: Mon, 11 Feb 2008 15:19:21 -0500
From: Craig Prescott <prescott at hpc.ufl.edu>
Subject: Re: [Lustre-discuss] Luster clients getting evicted
To: Aaron Knister <aaron at iges.org>
Cc: "Tom.Wang" <Tom.Wang at Sun.COM>, lustre-discuss at lists.lustre.org

Aaron Knister wrote:
> I'm having a similar issue with lustre 1.6.4.2 and infiniband. Under
> load, the clients hang about every 10 minutes, which is really bad for
> a production machine. The only way to fix the hang is to reboot the
> server. My users are getting extremely impatient :-/
>
> I've increased the timeout to 300seconds and it has helped marginally.

Hi Aaron;

We set the timeout to a big number (1000 secs) on our 400 node cluster
(mostly o2ib, some tcp clients). Until we did this, we had loads
of evictions. In our case, it solved the problem.

Cheers,
Craig
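For reference, the timeout Craig is describing is the single obd timeout tunable already mentioned earlier in this thread; a small sketch of raising it, where the filesystem name and the value are only examples:

  # on the MGS, raise the system-wide timeout for the filesystem "nobackup"
  lctl conf_param nobackup-MDT0000.sys.timeout=1000

  # on any node, check the value currently in effect
  cat /proc/sys/lustre/timeout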
------------------------------

Message: 5
Date: Mon, 11 Feb 2008 14:11:45 -0700
From: Andreas Dilger <adilger at sun.com>
Subject: Re: [Lustre-discuss] rc -43: Identifier removed
To: Per Lundqvist <perl at nsc.liu.se>
Cc: Lustre Discuss <lustre-discuss at lists.lustre.org>

On Feb 11, 2008 17:04 +0100, Per Lundqvist wrote:
> I got this error today when testing a newly set up 1.6 filesystem:
>
> n50 1% cd /mnt/test
> n50 2% ls
> ls: reading directory .: Identifier removed
>
> n50 3% ls -alrt
> total 8
> ?---------  ? ?    ?       ?            ? dir1
> ?---------  ? ?    ?       ?            ? dir2
> drwxr-xr-x  4 root root 4096 Feb  8 15:46 ../
> drwxr-xr-x  4 root root 4096 Feb 11 15:11 ./
>
> n50 4% stat .
>   File: `.'
>   Size: 4096          Blocks: 8          IO Block: 4096   directory
> Device: b438c888h/-1271347064d  Inode: 27616681  Links: 2
> Access: (0755/drwxr-xr-x)  Uid: ( 1120/   faxen)   Gid: (  500/     nsc)
> Access: 2008-02-11 16:11:48.336621154 +0100
> Modify: 2008-02-11 15:11:27.000000000 +0100
> Change: 2008-02-11 15:11:31.352841294 +0100
>
> This seems to happen almost all the time when I am running as a
> specific user on this system. Note that the stat call always works... I
> haven't yet been able to reproduce this problem when running as my own
> user.

EIDRM (Identifier removed) means that your MDS has a user database
(/etc/passwd and /etc/group) that is missing the particular user ID.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
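A quick sketch of the check Andreas is pointing at: the UID and GID shown in the stat output above have to resolve on the MDS, not just on the clients. The user and group names are simply the ones from the listing:

  # run on the MDS; each of these should return an entry for the affected user
  getent passwd faxen
  getent group nsc
  id faxen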
------------------------------

Message: 6
Date: Mon, 11 Feb 2008 16:17:37 -0500
From: Brock Palen <brockp at umich.edu>
Subject: Re: [Lustre-discuss] Luster clients getting evicted
To: Craig Prescott <prescott at hpc.ufl.edu>
Cc: "Tom.Wang" <Tom.Wang at Sun.COM>, lustre-discuss at lists.lustre.org

>> I've increased the timeout to 300seconds and it has helped
>> marginally.
>
> Hi Aaron;
>
> We set the timeout to a big number (1000 secs) on our 400 node cluster
> (mostly o2ib, some tcp clients). Until we did this, we had loads
> of evictions. In our case, it solved the problem.

This feels excessive. But at this point I guess I'll try it.

> Cheers,
> Craig

------------------------------

Message: 7
Date: Mon, 11 Feb 2008 16:48:05 -0500
From: Aaron Knister <aaron at iges.org>
Subject: Re: [Lustre-discuss] Luster clients getting evicted
To: Brock Palen <brockp at umich.edu>
Cc: "Tom.Wang" <Tom.Wang at Sun.COM>, lustre-discuss at lists.lustre.org

So far it's helped. If this doesn't fix it I'm going to apply the
patch mentioned here - https://bugzilla.lustre.org/attachment.cgi?id=14006&action=edit
I'll let you know how it goes. If you'd like a copy of the patched
version let me know. Are you running RHEL/SLES? What version of the OS
and lustre?

-Aaron

On Feb 11, 2008, at 4:17 PM, Brock Palen wrote:

>>>> I've increased the timeout to 300seconds and it has helped
>>>> marginally.
>>>
>>> Hi Aaron;
>>>
>>> We set the timeout to a big number (1000 secs) on our 400 node cluster
>>> (mostly o2ib, some tcp clients). Until we did this, we had loads
>>> of evictions. In our case, it solved the problem.
>>
>> This feels excessive. But at this point I guess I'll try it.

Aaron Knister
Associate Systems Analyst
Center for Ocean-Land-Atmosphere Studies

(301) 595-7000
aaron at iges.org

------------------------------

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss at lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss

End of Lustre-discuss Digest, Vol 25, Issue 17
**********************************************

------------------------------

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss at lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss

End of Lustre-discuss Digest, Vol 25, Issue 19
**********************************************
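For anyone following the eviction thread quoted above, the client-side task trace that Tom Wang asked for can be captured with the sysrq trigger he mentions; a small sketch, with only standard paths assumed:

  # enable the sysrq interface if it is not already on
  echo 1 > /proc/sys/kernel/sysrq

  # dump the state and stack of every task into the kernel log
  echo t > /proc/sysrq-trigger

  # collect the resulting traces for the mailing list
  dmesg > /tmp/client-task-dump.txt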
ashok bharat bayana wrote:
> Hi,
> I just want to know whether there are any alternative file systems to HP SFS.
> I have heard of the Cluster Gateway from PolyServe. Can anybody please help me find out more about this Cluster Gateway?

PolyServe is now owned by HP, so I would ask there.

cliffw