David Noriega
2012-Feb-01 18:57 UTC
[Lustre-discuss] Thread might be hung, Heavy IO Load messages
As of late I've been seeing a lot of these messages:

Lustre: Service thread pid 22974 was inactive for 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Pid: 22974, comm: ll_ost_io_233
Call Trace:
[<ffffffff8006e1db>] do_gettimeofday+0x40/0x90
[<ffffffff8001546f>] sync_buffer+0x0/0x3f
[<ffffffff800637ea>] io_schedule+0x3f/0x67
[<ffffffff800154aa>] sync_buffer+0x3b/0x3f
[<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
[<ffffffff8001546f>] sync_buffer+0x0/0x3f
[<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
[<ffffffff800a0ae0>] wake_bit_function+0x0/0x23
[<ffffffff88945bc8>] bh_submit_read+0x58/0x70 [ldiskfs]
[<ffffffff88945ef8>] read_block_bitmap+0xc8/0x1c0 [ldiskfs]
[<ffffffff88968101>] ldiskfs_mb_free_blocks+0x191/0x5d0 [ldiskfs]
[<ffffffff8894a0d1>] ldiskfs_mark_iloc_dirty+0x411/0x480 [ldiskfs]
[<ffffffff88030d09>] do_get_write_access+0x4f9/0x530 [jbd]
[<ffffffff80007691>] find_get_page+0x21/0x51
[<ffffffff80010c54>] __find_get_block_slow+0x2f/0xf7
[<ffffffff88946d4d>] ldiskfs_free_blocks+0x8d/0xe0 [ldiskfs]
[<ffffffff8895ee66>] ldiskfs_ext_remove_space+0x3a6/0x740 [ldiskfs]
[<ffffffff88960401>] ldiskfs_ext_truncate+0x161/0x1f0 [ldiskfs]
[<ffffffff8894c881>] ldiskfs_truncate+0xc1/0x610 [ldiskfs]
[<ffffffff800cd34f>] unmap_mapping_range+0x59/0x204
[<ffffffff8894a0d1>] ldiskfs_mark_iloc_dirty+0x411/0x480 [ldiskfs]
[<ffffffff800cdd9d>] vmtruncate+0xa2/0xc9
[<ffffffff800417a6>] inode_setattr+0x22/0x104
[<ffffffff8894df3b>] ldiskfs_setattr+0x1eb/0x270 [ldiskfs]
[<ffffffff889ca037>] fsfilt_ldiskfs_setattr+0x1a7/0x250 [fsfilt_ldiskfs]
[<ffffffff889e5551>] filter_version_get_check+0x91/0x2a0 [obdfilter]
[<ffffffff800645ab>] __down_write_nested+0x12/0x92
[<ffffffff885b1378>] cfs_alloc+0x68/0xc0 [libcfs]
[<ffffffff889f2bfb>] filter_destroy+0xd9b/0x1fb0 [obdfilter]
[<ffffffff886f0bc0>] ldlm_blocking_ast+0x0/0x2a0 [ptlrpc]
[<ffffffff886f42a0>] ldlm_completion_ast+0x0/0x880 [ptlrpc]
[<ffffffff88703549>] ldlm_srv_pool_recalc+0x79/0x220 [ptlrpc]
[<ffffffff88719924>] lustre_msg_add_version+0x34/0x110 [ptlrpc]
[<ffffffff8871c62a>] lustre_pack_reply_flags+0x86a/0x950 [ptlrpc]
[<ffffffff886dba4c>] ldlm_resource_putref+0x34c/0x3c0 [ptlrpc]
[<ffffffff886d68d2>] ldlm_lock_put+0x372/0x3d0 [ptlrpc]
[<ffffffff8871c739>] lustre_pack_reply+0x29/0xb0 [ptlrpc]
[<ffffffff889a4050>] ost_destroy+0x660/0x790 [ost]
[<ffffffff88718a78>] lustre_msg_check_version_v2+0x8/0x20 [ptlrpc]
[<ffffffff887188c5>] lustre_msg_get_opc+0x35/0xf0 [ptlrpc]
[<ffffffff889ada26>] ost_handle+0x1556/0x55b0 [ost]
[<ffffffff800dbcaa>] free_block+0x126/0x143
[<ffffffff800dbeec>] __drain_alien_cache+0x51/0x66
[<ffffffff88725c37>] ptlrpc_server_handle_request+0xaa7/0x1150 [ptlrpc]
[<ffffffff800470f3>] try_to_wake_up+0x472/0x484
[<ffffffff80062ff8>] thread_return+0x62/0xfe
[<ffffffff8008b4a5>] __wake_up_common+0x3e/0x68
[<ffffffff88729698>] ptlrpc_main+0x1258/0x1420 [ptlrpc]
[<ffffffff8008d07b>] default_wake_function+0x0/0xe
[<ffffffff800b7a9c>] audit_syscall_exit+0x336/0x362
[<ffffffff8005dfb1>] child_rip+0xa/0x11
[<ffffffff88728440>] ptlrpc_main+0x0/0x1420 [ptlrpc]
[<ffffffff8005dfa7>] child_rip+0x0/0x11

or

Pid: 13507, comm: ll_ost_io_205

LustreError: dumping log to /tmp/lustre-log.1328117954.22974
Lustre: lustre-OST0001: slow journal start 34s due to heavy IO load
Lustre: Skipped 1 previous similar message
Lustre: lustre-OST0001: slow brw_start 34s due to heavy IO load
Lustre: Skipped 1 previous similar message
Lustre: lustre-OST0001: slow journal start 36s due to heavy IO load
Lustre: Skipped 6 previous similar messages
Lustre: lustre-OST0001: slow journal start 37s due to heavy IO load
Lustre: Skipped 1 previous similar message
Lustre: lustre-OST0001: slow commitrw commit 37s due to heavy IO load
Lustre: lustre-OST0001: slow i_mutex 34s due to heavy IO load
Lustre: Skipped 1 previous similar message
Lustre: lustre-OST0001: slow i_mutex 34s due to heavy IO load
Lustre: lustre-OST0000: slow setattr 44s due to heavy IO load
Lustre: lustre-OST0000: slow setattr 44s due to heavy IO load
Lustre: lustre-OST0001: slow setattr 106s due to heavy IO load
Lustre: Skipped 2 previous similar messages
Lustre: lustre-OST0001: slow setattr 106s due to heavy IO load

On the MDS I see the following:

LustreError: 5255:0:(lov_request.c:692:lov_update_create_set()) error creating fid 0x234ca921 sub-object on OST idx 3/4: rc = -107
LustreError: 5255:0:(lov_request.c:692:lov_update_create_set()) error creating fid 0xd03d1 sub-object on OST idx 3/4: rc = -107
LustreError: 5255:0:(lov_request.c:692:lov_update_create_set()) error creating fid 0x14e2441 sub-object on OST idx 3/4: rc = -107
LustreError: 5255:0:(lov_request.c:692:lov_update_create_set()) error creating fid 0xc1861 sub-object on OST idx 3/4: rc = -107
LustreError: 5255:0:(lov_request.c:692:lov_update_create_set()) error creating fid 0xd0581 sub-object on OST idx 3/4: rc = -107
LustreError: 5255:0:(lov_request.c:692:lov_update_create_set()) error creating fid 0xd0272 sub-object on OST idx 3/4: rc = -107
LustreError: 5255:0:(lov_request.c:692:lov_update_create_set()) error creating fid 0xc8042 sub-object on OST idx 3/4: rc = -107
LustreError: 5255:0:(lov_request.c:692:lov_update_create_set()) error creating fid 0xc805c sub-object on OST idx 3/4: rc = -107
LustreError: 5255:0:(lov_request.c:692:lov_update_create_set()) error creating fid 0xc805c sub-object on OST idx 3/4: rc = -107
LustreError: 5255:0:(lov_request.c:692:lov_update_create_set()) error creating fid 0xd024d sub-object on OST idx 3/4: rc = -107
LustreError: 5255:0:(lov_request.c:692:lov_update_create_set()) error creating fid 0xd0311 sub-object on OST idx 3/4: rc = -107
LustreError: 5255:0:(lov_request.c:692:lov_update_create_set()) error creating fid 0xd0316 sub-object on OST idx 3/4: rc = -107
LustreError: 5255:0:(lov_request.c:692:lov_update_create_set()) error creating fid 0xd032a sub-object on OST idx 3/4: rc = -107
LustreError: 5255:0:(lov_request.c:692:lov_update_create_set()) error creating fid 0xd032b sub-object on OST idx 3/4: rc = -107
LustreError: 5255:0:(lov_request.c:692:lov_update_create_set()) error creating fid 0xc8fd6 sub-object on OST idx 3/4: rc = -107
LustreError: 5255:0:(lov_request.c:692:lov_update_create_set()) error creating fid 0xc1847 sub-object on OST idx 3/4: rc = -107
LustreError: 5255:0:(lov_request.c:692:lov_update_create_set()) error creating fid 0x234ca903 sub-object on OST idx 3/4: rc = -107
LustreError: 5255:0:(lov_request.c:692:lov_update_create_set()) error creating fid 0x234ca904 sub-object on OST idx 3/4: rc = -107

Lustre: Service thread pid 25775 was inactive for 222.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Lustre: Skipped 3 previous similar messages
Pid: 25775, comm: ll_mdt_117
Call Trace:
[<ffffffff800638ab>] schedule_timeout+0x8a/0xad
[<ffffffff80097d9f>] process_timeout+0x0/0x5
[<ffffffff88809c75>] osc_create+0xc75/0x13d0 [osc]
[<ffffffff8008d07b>] default_wake_function+0x0/0xe
[<ffffffff888b7cbb>] qos_remedy_create+0x45b/0x570 [lov]
[<ffffffff888b1be3>] lov_fini_create_set+0x243/0x11e0 [lov]
[<ffffffff888a5982>] lov_create+0x1552/0x1860 [lov]
[<ffffffff888a65b6>] lov_iocontrol+0x926/0xf0f [lov]
[<ffffffff8008d07b>] default_wake_function+0x0/0xe
[<ffffffff88a6140a>] mds_finish_open+0x1fea/0x43e0 [mds]
[<ffffffff88030d09>] do_get_write_access+0x4f9/0x530 [jbd]
[<ffffffff889740d1>] ldiskfs_mark_iloc_dirty+0x411/0x480 [ldiskfs]
[<ffffffff88974796>] ldiskfs_mark_inode_dirty+0x136/0x160 [ldiskfs]
[<ffffffff889740d1>] ldiskfs_mark_iloc_dirty+0x411/0x480 [ldiskfs]
[<ffffffff88a6845e>] mds_open+0x2cce/0x35f8 [mds]
[<ffffffff887cddbf>] ksocknal_find_conn_locked+0xcf/0x1f0 [ksocklnd]
[<ffffffff887cfef5>] ksocknal_alloc_tx+0x1f5/0x2a0 [ksocklnd]
[<ffffffff88a3ef89>] mds_reint_rec+0x1d9/0x2b0 [mds]
[<ffffffff88a6ac72>] mds_open_unpack+0x312/0x430 [mds]
[<ffffffff88a31e7a>] mds_reint+0x35a/0x420 [mds]
[<ffffffff88a30d8a>] fixup_handle_for_resent_req+0x5a/0x2c0 [mds]
[<ffffffff88a3bbfc>] mds_intent_policy+0x4ac/0xc80 [mds]
[<ffffffff887058b6>] ldlm_resource_putref+0x1b6/0x3c0 [ptlrpc]
[<ffffffff88702eb6>] ldlm_lock_enqueue+0x186/0xb20 [ptlrpc]
[<ffffffff886ff7fd>] ldlm_lock_create+0x9bd/0x9f0 [ptlrpc]
[<ffffffff88727720>] ldlm_server_blocking_ast+0x0/0x83d [ptlrpc]
[<ffffffff88724849>] ldlm_handle_enqueue+0xbf9/0x1210 [ptlrpc]
[<ffffffff88a3ab20>] mds_handle+0x4130/0x4d60 [mds]
[<ffffffff88633be5>] lnet_match_blocked_msg+0x375/0x390 [lnet]
[<ffffffff88748705>] lustre_msg_get_conn_cnt+0x35/0xf0 [ptlrpc]
[<ffffffff8874fc37>] ptlrpc_server_handle_request+0xaa7/0x1150 [ptlrpc]
[<ffffffff800470f3>] try_to_wake_up+0x472/0x484
[<ffffffff8008b4a5>] __wake_up_common+0x3e/0x68
[<ffffffff88753698>] ptlrpc_main+0x1258/0x1420 [ptlrpc]
[<ffffffff8008d07b>] default_wake_function+0x0/0xe
[<ffffffff800b7a9c>] audit_syscall_exit+0x336/0x362
[<ffffffff8005dfb1>] child_rip+0xa/0x11
[<ffffffff88752440>] ptlrpc_main+0x0/0x1420 [ptlrpc]
[<ffffffff8005dfa7>] child_rip+0x0/0x11

LustreError: dumping log to /tmp/lustre-log.1328058813.25775
Lustre: Service thread pid 25775 completed after 223.60s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
Lustre: Skipped 3 previous similar messages

What do these messages mean?

--
David Noriega
System Administrator
Computational Biology Initiative
High Performance Computing Center
University of Texas at San Antonio
One UTSA Circle
San Antonio, TX 78249
Office: BSE 3.112
Phone: 210-458-7100
http://www.cbi.utsa.edu
Carlos Thomaz
2012-Feb-01 19:04 UTC
[Lustre-discuss] Thread might be hung, Heavy IO Load messages
Hi David,

You may be facing the same issue discussed in previous threads, which is the one regarding zone_reclaim_mode. Take a look at the earlier thread where Kevin and I replied to Vijesh Ek.

If you don't have access to the previous emails, look at your kernel setting for zone reclaim:

cat /proc/sys/vm/zone_reclaim_mode

It should be set to 0.

Also, look at the number of Lustre OSS service threads. It may be set too high...

Rgds.
Carlos.

--
Carlos Thomaz | HPC Systems Architect
Mobile: +1 (303) 519-0578
cthomaz at ddn.com | Skype ID: carlosthomaz
DataDirect Networks, Inc.
9960 Federal Dr., Ste 100 Colorado Springs, CO 80921
ddn.com <http://www.ddn.com/> | Twitter: @ddn_limitless <http://twitter.com/ddn_limitless> | 1.800.TERABYTE

On 2/1/12 11:57 AM, "David Noriega" <tsk133 at my.utsa.edu> wrote:

> indicates the system was overloaded (too many service threads, or
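For example, a minimal check-and-set on each client and server (standard Linux sysctl usage; the persistence step assumes your nodes read /etc/sysctl.conf at boot):

    # check the current value
    cat /proc/sys/vm/zone_reclaim_mode

    # set it to 0 at runtime
    sysctl -w vm.zone_reclaim_mode=0

    # keep it across reboots
    echo "vm.zone_reclaim_mode = 0" >> /etc/sysctl.conf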
Charles Taylor
2012-Feb-01 19:27 UTC
[Lustre-discuss] Thread might be hung, Heavy IO Load messages
You may also want to check and, if necessary, limit the lru_size on your clients. I believe there are guidelines in the ops manual. We have ~750 clients and limit ours to 600 per OST. That, combined with setting zone_reclaim_mode=0, should make a big difference.

Regards,

Charlie Taylor
UF HPC Center

On Feb 1, 2012, at 2:04 PM, Carlos Thomaz wrote:

> Hi David,
>
> You may be facing the same issue discussed in previous threads, which is
> the one regarding zone_reclaim_mode.
> [...]

Charles A. Taylor, Ph.D.
Associate Director,
UF HPC Center
(352) 392-4036
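A minimal sketch of that kind of cap, run on each client (the parameter path follows the LDLM tuning section of the 1.8 manual; the glob and the value 600 are only illustrations of Charlie's numbers, not a recommendation for your site):

    # limit the lock LRU for the OST (osc) namespaces on this client
    lctl set_param ldlm.namespaces.*osc*.lru_size=600

    # as I understand it, setting it back to 0 re-enables dynamic LRU sizing
    lctl set_param ldlm.namespaces.*osc*.lru_size=0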
David Noriega
2012-Feb-01 21:11 UTC
[Lustre-discuss] Thread might be hung, Heavy IO Load messages
zone_reclaim_mode is 0 on all clients/servers.

When changing the number of service threads or the lru_size, can these be done on the fly, or do they require a reboot of either client or server?

For my two OSSs, cat /proc/fs/lustre/ost/OSS/ost_io/threads_started gives about 300 (300 and 359), so I'm thinking try half of that and see how it goes?

Also, checking lru_size, I get different numbers from the clients (cat /proc/fs/lustre/ldlm/namespaces/*/lru_size):

Client              MDT0   OST0     OST1    OST2    OST3    MGC
head node           0      22       22      22      22      400    (only a few users logged in)
busy node           1      501      504     503     505     400    (fully loaded with jobs)
samba/nfs server    4      440070   44370   44348   26282   1600

So my understanding is that lru_size is set to auto by default, thus the varying values, but setting it manually is effectively setting a max value? Also, what does it mean to have a lower value (especially in the case of the samba/nfs server)?

On Wed, Feb 1, 2012 at 1:27 PM, Charles Taylor <taylor at hpc.ufl.edu> wrote:
> You may also want to check and, if necessary, limit the lru_size on your clients.
> We have ~750 clients and limit ours to 600 per OST.
> [...]

--
David Noriega
System Administrator
Computational Biology Initiative
High Performance Computing Center
University of Texas at San Antonio
One UTSA Circle
San Antonio, TX 78249
Office: BSE 3.112
Phone: 210-458-7100
http://www.cbi.utsa.edu
Carlos Thomaz
2012-Feb-02 00:33 UTC
[Lustre-discuss] Thread might be hung, Heavy IO Load messages
David,

The OSS service thread count is a function of your RAM size and CPUs. It's difficult to say what a good upper limit would be without knowing the size of your OSS, # of clients, storage back-end, and workload. But the good thing is you can give it a try on the fly via the lctl set_param command.

Assuming you are running Lustre 1.8, here is a good explanation of how to do it:

http://wiki.lustre.org/manual/LustreManual18_HTML/LustreProc.html#50651263_87260

Some remarks:
- Reducing the number of OSS threads may impact performance, depending on your workload.
- Unfortunately, I guess you will need to try and see what happens. I would go for 128 and analyze the behavior of your OSSs (via log files) while also keeping an eye on your workload. It seems to me that 300 is a bit too high (but again, I don't know what you have on your storage back-end or OSS configuration).

I can't tell you much about the lru_size, but as far as I understand the values are dynamic and there's not much to do other than clear the least-recently-used queue or disable LRU sizing. I can't help much on this other than pointing you to the explanation for it (see 31.2.11):

http://wiki.lustre.org/manual/LustreManual20_HTML/LustreProc.html

Regards,
Carlos

--
Carlos Thomaz | HPC Systems Architect
Mobile: +1 (303) 519-0578
cthomaz at ddn.com | Skype ID: carlosthomaz
DataDirect Networks, Inc.
9960 Federal Dr., Ste 100 Colorado Springs, CO 80921
ddn.com <http://www.ddn.com/> | Twitter: @ddn_limitless <http://twitter.com/ddn_limitless> | 1.800.TERABYTE

On 2/1/12 2:11 PM, "David Noriega" <tsk133 at my.utsa.edu> wrote:

> zone_reclaim_mode is 0 on all clients/servers.
>
> When changing the number of service threads or the lru_size, can these be
> done on the fly, or do they require a reboot of either client or server?
> [...]
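As a concrete sketch of that on-the-fly change (the parameter names mirror the /proc path David quoted, i.e. /proc/fs/lustre/ost/OSS/ost_io/*; 128 is only the trial value suggested above):

    # on each OSS: see how many I/O threads exist vs. the allowed maximum
    lctl get_param ost.OSS.ost_io.threads_started ost.OSS.ost_io.threads_max

    # cap the I/O service threads at runtime
    lctl set_param ost.OSS.ost_io.threads_max=128

Note that, as far as I know, lowering threads_max does not stop threads that have already been started; the started count only comes back down after the OSS is restarted.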
David Noriega
2012-Feb-02 15:54 UTC
[Lustre-discuss] Thread might be hung, Heavy IO Load messages
We have two OSSs, each with two quad-core AMD Opterons and 8GB of RAM, and two OSTs each (4.4T and 3.5T). Backend storage is a pair of Sun StorageTek 2540s connected with 8Gb fiber.

What about tweaking max_dirty_mb on the client side?

On Wed, Feb 1, 2012 at 6:33 PM, Carlos Thomaz <cthomaz at ddn.com> wrote:
> I would go for 128 and analyze the behavior of your OSSs (via log files)
> [...]

--
David Noriega
System Administrator
Computational Biology Initiative
High Performance Computing Center
University of Texas at San Antonio
One UTSA Circle
San Antonio, TX 78249
Office: BSE 3.112
Phone: 210-458-7100
http://www.cbi.utsa.edu
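For what it's worth, max_dirty_mb can at least be inspected and changed on the fly on a client (osc.*.max_dirty_mb is the per-OSC dirty cache limit; the 32 below is purely illustrative, not a recommendation):

    # per-OSC dirty cache limit, in MB
    lctl get_param osc.*.max_dirty_mb

    # example of lowering it
    lctl set_param osc.*.max_dirty_mb=32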
David Noriega
2012-Feb-02 16:05 UTC
[Lustre-discuss] Thread might be hung, Heavy IO Load messages
On a side note, what about increasing the MDS service threads? Checking that, it's running at its max of 128.

On Thu, Feb 2, 2012 at 9:54 AM, David Noriega <tsk133 at my.utsa.edu> wrote:
> We have two OSSs, each with two quad-core AMD Opterons and 8GB of RAM,
> and two OSTs each (4.4T and 3.5T). Backend storage is a pair of Sun
> StorageTek 2540s connected with 8Gb fiber.
>
> What about tweaking max_dirty_mb on the client side?
> [...]

--
David Noriega
System Administrator
Computational Biology Initiative
High Performance Computing Center
University of Texas at San Antonio
One UTSA Circle
San Antonio, TX 78249
Office: BSE 3.112
Phone: 210-458-7100
http://www.cbi.utsa.edu
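If it turns out more than 128 is needed, my understanding (worth double-checking against the 1.8 manual section Carlos linked) is that the MDS thread count is set with a module option rather than on the fly, e.g. in /etc/modprobe.conf on the MDS:

    # assumed option name and illustrative value; verify in the manual before using
    options mds mds_num_threads=256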
Andreas Dilger
2012-Feb-02 18:07 UTC
[Lustre-discuss] Thread might be hung, Heavy IO Load messages
On 2012-02-02, at 8:54 AM, David Noriega wrote:
> We have two OSSs, each with two quad-core AMD Opterons and 8GB of RAM,
> and two OSTs each (4.4T and 3.5T). Backend storage is a pair of Sun
> StorageTek 2540s connected with 8Gb fiber.

Running 32-64 threads per OST is the optimum number, based on previous experience.

> What about tweaking max_dirty_mb on the client side?

Probably unrelated.

> On Wed, Feb 1, 2012 at 6:33 PM, Carlos Thomaz <cthomaz at ddn.com> wrote:
> [...]

Cheers, Andreas
--
Andreas Dilger                       Whamcloud, Inc.
Principal Engineer                   http://www.whamcloud.com/
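Applying that guideline to this setup would look roughly like the sketch below (the runtime parameter matches the proc path quoted earlier in the thread; the modprobe.conf option name is an assumption based on the 1.8 manual, so verify it there):

    # 2 OSTs per OSS at 32-64 threads each => roughly 64-128 threads per OSS
    lctl set_param ost.OSS.ost_io.threads_max=128

    # assumed persistent form, in /etc/modprobe.conf on each OSS:
    # options ost oss_num_threads=128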
David Noriega
2012-Feb-03 00:05 UTC
[Lustre-discuss] Thread might be hung, Heavy IO Load messages
I found the thread "Luster clients getting evicted", as I've also seen the "ost_connect operation failed with -16" message, and there they recommend increasing the timeout. That was for 1.6, though, and as I've read, 1.8 has a different timeout system. Reading that, would increasing at_min (currently 0) or at_max (currently 600) be best?

On Thu, Feb 2, 2012 at 12:07 PM, Andreas Dilger <adilger at whamcloud.com> wrote:
> On 2012-02-02, at 8:54 AM, David Noriega wrote:
>> We have two OSSs, each with two quad-core AMD Opterons and 8GB of RAM,
>> and two OSTs each (4.4T and 3.5T). Backend storage is a pair of Sun
>> StorageTek 2540s connected with 8Gb fiber.
>
> Running 32-64 threads per OST is the optimum number, based on previous
> experience.
>
>> What about tweaking max_dirty_mb on the client side?
>
> Probably unrelated.
> [...]
>
> Cheers, Andreas
> --
> Andreas Dilger                       Whamcloud, Inc.
> Principal Engineer                   http://www.whamcloud.com/

--
David Noriega
System Administrator
Computational Biology Initiative
High Performance Computing Center
University of Texas at San Antonio
One UTSA Circle
San Antonio, TX 78249
Office: BSE 3.112
Phone: 210-458-7100
http://www.cbi.utsa.edu
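For reference, a sketch of where those knobs live on 1.8 (the /proc/sys/lustre paths are what I would expect for the adaptive-timeout settings; verify them on your build, and the 40 below is purely an example value, not advice):

    # current adaptive-timeout settings (clients and servers)
    grep . /proc/sys/lustre/at_min /proc/sys/lustre/at_max /proc/sys/lustre/at_history

    # example of raising the floor at runtime
    echo 40 > /proc/sys/lustre/at_min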
Carlos Thomaz
2012-Feb-03 01:37 UTC
[Lustre-discuss] Thread might be hung, Heavy IO Load messages
I can't comment much on this (I don't have much experience tuning it), but Lustre 1.8 has a completely different timeout architecture (adaptive timeouts). I suggest you take a deep look at it first:

--
Carlos Thomaz | HPC Systems Architect
Mobile: +1 (303) 519-0578
cthomaz at ddn.com | Skype ID: carlosthomaz
DataDirect Networks, Inc.
9960 Federal Dr., Ste 100 Colorado Springs, CO 80921
ddn.com <http://www.ddn.com/> | Twitter: @ddn_limitless <http://twitter.com/ddn_limitless> | 1.800.TERABYTE

On 2/2/12 5:05 PM, "David Noriega" <tsk133 at my.utsa.edu> wrote:

> I found the thread "Luster clients getting evicted", as I've also seen
> the "ost_connect operation failed with -16" message, and there they
> recommend increasing the timeout. That was for 1.6, though, and as I've
> read, 1.8 has a different timeout system. Reading that, would increasing
> at_min (currently 0) or at_max (currently 600) be best?
> [...]
Carlos Thomaz
2012-Feb-03 01:38 UTC
[Lustre-discuss] Thread might be hung, Heavy IO Load messages
Oops... Take a look first at:

http://wiki.lustre.org/index.php/Architecture_-_Adaptive_Timeouts_-_Use_Cases

And google for "adaptive timeouts".

Carlos.

--
Carlos Thomaz | HPC Systems Architect
Mobile: +1 (303) 519-0578
cthomaz at ddn.com | Skype ID: carlosthomaz
DataDirect Networks, Inc.
9960 Federal Dr., Ste 100 Colorado Springs, CO 80921
ddn.com <http://www.ddn.com/> | Twitter: @ddn_limitless <http://twitter.com/ddn_limitless> | 1.800.TERABYTE

On 2/2/12 6:37 PM, "Carlos Thomaz" <cthomaz at ddn.com> wrote:

> I can't comment much on this (I don't have much experience tuning it), but
> Lustre 1.8 has a completely different timeout architecture (adaptive
> timeouts). I suggest you take a deep look at it first:
> [...]
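One way to watch adaptive timeouts in action, assuming the per-import and per-service "timeouts" files are present on your 1.8 build (they should be, but treat the exact parameter names as an assumption to verify):

    # on a client: current RPC service-time estimates per import
    lctl get_param osc.*.timeouts

    # on an OSS: per-service estimates
    lctl get_param ost.OSS.*.timeouts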