Dear list,
We have seen an unusually high frequency of compute node crashes these days,
after we added 10 more OSSes to our existing Lustre system.
The compute nodes crashed with logs like:
Aug 27 11:50:43 bws0202 kernel: gmond: page allocation failure. order:1, mode:0x50
Aug 27 11:50:43 bws0202 kernel: [<c0144410>] __alloc_pages+0x294/0x2a6
Aug 27 11:50:43 bws0202 kernel: [<c014443a>] __get_free_pages+0x18/0x24
Aug 27 11:50:43 bws0202 kernel: [<c0146f60>] kmem_getpages+0x1c/0xbb
Aug 27 11:50:43 bws0202 kernel: [<c0147aae>] cache_grow+0xab/0x138
Aug 27 11:50:43 bws0202 kernel: [<c0147ca0>] cache_alloc_refill+0x165/0x19d
Aug 27 11:50:43 bws0202 kernel: [<c0148074>] __kmalloc+0x76/0x88
Aug 27 11:50:43 bws0202 kernel: [<f9630359>] cfs_alloc+0x29/0x70 [libcfs]
Aug 27 11:50:43 bws0202 kernel: [<f96f1407>] ptl_send_rpc+0x197/0x1790 [ptlrpc]
Aug 27 11:50:43 bws0202 kernel: [<f96e5f24>] ptlrpc_retain_replayable_request+0x84/0x200 [ptlrpc]
Aug 27 11:50:43 bws0202 kernel: [<f96df701>] after_reply+0x5d1/0xaa0 [ptlrpc]
Aug 27 11:50:43 bws0202 kernel: [<f96fbfde>] lustre_msg_set_status+0x2e/0x120 [ptlrpc]
Aug 27 11:50:43 bws0202 kernel: [<c011e851>] __wake_up+0x29/0x3c
Aug 27 11:50:43 bws0202 kernel: [<f96e622f>] ptlrpc_queue_wait+0x18f/0x2720 [ptlrpc]
Aug 27 11:50:43 bws0202 kernel: [<f94668e0>] lnet_me_unlink+0x40/0x260 [lnet]
Aug 27 11:50:43 bws0202 kernel: [<f9700b22>] reply_in_callback+0x1d2/0x990 [ptlrpc]
Aug 27 11:50:43 bws0202 kernel: [<f96f9b2e>] lustre_msg_add_version+0xbe/0x130 [ptlrpc]
Aug 27 11:50:43 bws0202 kernel: [<f9630359>] cfs_alloc+0x29/0x70 [libcfs]
Aug 27 11:50:43 bws0202 kernel: [<f96f39a3>] lustre_pack_request_v2+0x83/0x3c0 [ptlrpc]
Aug 27 11:50:43 bws0202 kernel: [<f9696d90>] ldlm_resource_putref+0xa0/0x680 [ptlrpc]
Aug 27 11:50:43 bws0202 kernel: [<f96fbb2e>] lustre_msg_set_opc+0x2e/0x120 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<f9630359>] cfs_alloc+0x29/0x70 [libcfs]
Aug 27 11:50:44 bws0202 kernel: [<f96e97bc>] ptlrpc_next_xid+0x3c/0x50 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<f96fc21e>] lustre_msg_set_timeout+0x2e/0x100 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<f96f9726>] lustre_msg_get_type+0xd6/0x210 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<f98b24cb>] mdc_close+0x22b/0xdf0 [mdc]
Aug 27 11:50:44 bws0202 kernel: [<f97db0d3>] ll_release+0xd3/0x600 [lustre]
Aug 27 11:50:44 bws0202 kernel: [<f97ecda2>] ll_close_inode_openhandle+0x152/0xb80 [lustre]
Aug 27 11:50:44 bws0202 kernel: [<c0107ab4>] do_IRQ+0x1a2/0x1ae
Aug 27 11:50:44 bws0202 kernel: [<f97ecda2>] ll_close_inode_openhandle+0x152/0xb80 [lustre]
Aug 27 11:50:44 bws0202 kernel: [<c0107ab4>] do_IRQ+0x1a2/0x1ae
Aug 27 11:50:44 bws0202 kernel: [<f97ed8fb>] ll_mdc_real_close+0x12b/0x520 [lustre]
Aug 27 11:50:44 bws0202 kernel: [<f9842504>] ll_mdc_blocking_ast+0x224/0x950 [lustre]
Aug 27 11:50:44 bws0202 kernel: [<c02d6c60>] common_interrupt+0x18/0x20
Aug 27 11:50:44 bws0202 kernel: [<f96d3ee5>] ldlm_pool_del+0x75/0x2f0 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<f9686ac7>] ldlm_lock_destroy_nolock+0x87/0x1f0 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<f9685118>] unlock_res_and_lock+0x58/0xe0 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<f9685118>] unlock_res_and_lock+0x58/0xe0 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<f968e61b>] ldlm_cancel_callback+0x10b/0x160 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<f96e97bc>] ptlrpc_next_xid+0x3c/0x50 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<f9685045>] lock_res_and_lock+0x45/0xc0 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<f96b56b4>] ldlm_cli_cancel_local+0xa4/0x6f0 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<f96e4459>] __ptlrpc_req_finished+0x449/0x5f0 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<f96fc21e>] lustre_msg_set_timeout+0x2e/0x100 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<f96b7b77>] ldlm_cancel_list+0x137/0x360 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<f96b6412>] ldlm_cli_cancel_req+0x252/0xc60 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<c01c4172>] memmove+0xe/0x24
Aug 27 11:50:44 bws0202 kernel: [<f9685cae>] ldlm_lock_remove_from_lru+0x5e/0x210 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<f96b8206>] ldlm_cancel_lru_local+0x126/0x480 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<f96b9154>] ldlm_cli_cancel_list+0x104/0x550 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<c011de76>] find_busiest_group+0xdd/0x295
Aug 27 11:50:44 bws0202 kernel: [<f96b7da0>] ldlm_cancel_shrink_policy+0x0/0x100 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<f96b88b2>] ldlm_cancel_lru+0x72/0x330 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<c011d2f4>] try_to_wake_up+0x28e/0x299
Aug 27 11:50:44 bws0202 kernel: [<c0107ab4>] do_IRQ+0x1a2/0x1ae
Aug 27 11:50:44 bws0202 kernel: [<f96d1946>] ldlm_cli_pool_shrink+0x166/0x440 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<f96d17e0>] ldlm_cli_pool_shrink+0x0/0x440 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<f96d1cb4>] ldlm_pool_shrink+0x44/0x160 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<c011e875>] __wake_up_locked+0x11/0x13
Aug 27 11:50:44 bws0202 kernel: [<c0104fc3>] __down_trylock+0x3d/0x46
Aug 27 11:50:44 bws0202 kernel: [<c02d3867>] __down_failed_trylock+0x7/0xc
Aug 27 11:50:44 bws0202 kernel: [<f96987fd>] .text.lock.ldlm_resource+0x2d/0x90 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<f96d4641>] ldlm_pools_shrink+0x271/0x340 [ptlrpc]
Aug 27 11:50:44 bws0202 kernel: [<c0145038>] get_writeback_state+0x30/0x35
Aug 27 11:50:44 bws0202 kernel: [<c014505d>] get_dirty_limits+0x20/0xff
Aug 27 11:50:44 bws0202 kernel: [<c0145038>] get_writeback_state+0x30/0x35
Aug 27 11:50:44 bws0202 kernel: [<c014505d>] get_dirty_limits+0x20/0xff
Aug 27 11:50:44 bws0202 kernel: [<c0149a04>] shrink_slab+0xf8/0x161
Aug 27 11:50:44 bws0202 kernel: [<c014aa3c>] try_to_free_pages+0xd5/0x1bb
Aug 27 11:50:44 bws0202 kernel: [<c0144338>] __alloc_pages+0x1bc/0x2a6
Aug 27 11:50:45 bws0202 kernel: [<c014443a>] __get_free_pages+0x18/0x24
Aug 27 11:50:45 bws0202 kernel: [<c016be0e>] __pollwait+0x2d/0x95
Aug 27 11:50:45 bws0202 kernel: [<c027f579>] datagram_poll+0x25/0xcc
Aug 27 11:50:45 bws0202 kernel: [<c0279e5d>] sock_poll+0x12/0x14
Aug 27 11:50:45 bws0202 kernel: [<c016c675>] do_pollfd+0x47/0x81
Aug 27 11:50:45 bws0202 kernel: [<c016c6f9>] do_poll+0x4a/0xac
Aug 27 11:50:45 bws0202 kernel: [<c016c91d>] sys_poll+0x1c2/0x279
Aug 27 11:50:45 bws0202 kernel: [<c016bde1>] __pollwait+0x0/0x95
Aug 27 11:50:45 bws0202 kernel: [<c0126285>] sys_gettimeofday+0x53/0xac
Aug 27 11:50:45 bws0202 kernel: [<c02d6287>] syscall_call+0x7/0xb
Aug 27 11:50:45 bws0202 kernel: [<c02d007b>] unix_stream_sendmsg+0x33/0x33a
Aug 27 11:50:45 bws0202 kernel: Mem-info:
Aug 27 11:50:45 bws0202 kernel: DMA per-cpu:
...................................................................................
Aug 27 11:50:45 bws0202 kernel: Normal per-cpu:
Aug 27 11:50:45 bws0202 kernel: cpu 0 hot: low 32, high 96, batch 16
...............................................................................................................
Aug 27 11:50:46 bws0202 kernel: Free pages: 3904248kB (3890880kB HighMem)
Aug 27 11:50:46 bws0202 kernel: Active:785397 inactive:2228918 dirty:684 writeback:512 unstable:0 free:976062 slab:156542 mapped:192154 pagetables:1461
Aug 27 11:50:46 bws0202 kernel: DMA free:12504kB min:16kB low:32kB high:48kB active:0kB inactive:0kB present:16384kB pages_scanned:0 all_unreclaimable? yes
Aug 27 11:50:46 bws0202 kernel: protections[]: 0 0 0
Aug 27 11:50:46 bws0202 kernel: Normal free:864kB min:928kB low:1856kB high:2784kB active:89892kB inactive:25980kB present:901120kB pages_scanned:0 all_unreclaimable? no
Aug 27 11:50:46 bws0202 kernel: protections[]: 0 0 0
Aug 27 11:50:46 bws0202 kernel: HighMem free:3890880kB min:512kB low:1024kB high:1536kB active:3051644kB inactive:8889744kB present:16646144kB pages_scanned:0 all_unreclaimable? no
Aug 27 11:50:46 bws0202 kernel: protections[]: 0 0 0
Aug 27 11:50:46 bws0202 kernel: DMA: 2*4kB 0*8kB 3*16kB 3*32kB 3*64kB 3*128kB 2*256kB 0*512kB 1*1024kB 1*2048kB 2*4096kB = 12504kB
Aug 27 11:50:46 bws0202 kernel: Normal: 214*4kB 1*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 864kB
Aug 27 11:50:46 bws0202 kernel: HighMem: 45384*4kB 53998*8kB 35621*16kB 7967*32kB 880*64kB 814*128kB 253*256kB 94*512kB 676*1024kB 510*2048kB 108*4096kB = 3890880kB
Aug 27 11:50:46 bws0202 kernel: Swap cache: add 56, delete 56, find 0/0, race 0+0
Aug 27 11:50:46 bws0202 kernel: 0 bounce buffer pages
Aug 27 11:50:46 bws0202 kernel: Free swap: 16776984kB
Aug 27 11:50:46 bws0202 kernel: 4390912 pages of RAM
Aug 27 11:50:46 bws0202 kernel: 3964594 pages of HIGHMEM
Aug 27 11:50:46 bws0202 kernel: 232823 reserved pages
Aug 27 11:50:46 bws0202 kernel: 2673733 pages shared
Aug 27 11:50:46 bws0202 kernel: 0 pages swap cached
The compute nodes are running "lustre-1.6.5-2.6.9_55.EL.cernsmp" on a 32-bit
OS with 16 GB of memory. The servers are running
"2.6.9-67.0.22.EL_lustre.1.6.6smp" on a 64-bit OS.
Every compute node mounts two Lustre file systems: one with 20 OSSes and one
with 2 OSSes.
I have set /proc/fs/lustre/llite/*/max_cached_mb=4158 for each Lustre file
system. It seems that the compute nodes were running out of Normal-zone memory
when they crashed: the log above shows Normal free at 864kB, below its min
watermark of 928kB, while HighMem still has about 3.7 GB free.
Is it possible to limit the Normal memory that a Lustre client uses with some
tuning option?
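
To make the Normal-zone observation concrete, here is a small Python sketch
that compares each zone's free memory with its min watermark. The numbers are
copied from the zone lines in the log above; the function name
zones_below_min and the regular expression are only an illustration, not part
of any Lustre or kernel tool.

```python
import re

# Zone summary fields copied from the log above (syslog prefix removed).
ZONE_LINES = """\
DMA free:12504kB min:16kB low:32kB high:48kB
Normal free:864kB min:928kB low:1856kB high:2784kB
HighMem free:3890880kB min:512kB low:1024kB high:1536kB
"""

# Matches "<zone> free:<n>kB min:<n>kB" at the start of a line.
ZONE_RE = re.compile(r"^(\w+) free:(\d+)kB min:(\d+)kB", re.MULTILINE)


def zones_below_min(text):
    """Return the names of zones whose free memory is below 'min'."""
    below = []
    for name, free_kb, min_kb in ZONE_RE.findall(text):
        if int(free_kb) < int(min_kb):
            below.append(name)
    return below


if __name__ == "__main__":
    print(zones_below_min(ZONE_LINES))
```

With the values from this log, only the Normal zone is reported, even though
HighMem has plenty of free pages.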
Our servers experienced the same problem when the OSSes were running a 32-bit
OS. After we switched them to 64-bit, the problem has not appeared again. It is
difficult for us to switch all compute nodes to 64-bit right now.
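
As long as the clients stay on 32-bit, one rough way to watch low memory on
them might be to read LowTotal and LowFree from /proc/meminfo, which 32-bit
kernels built with highmem support report. The sketch below is only an
illustration under that assumption; the helper name low_memory_kb is made up
here, and it does not change any Lustre setting.

```python
def low_memory_kb(meminfo_text):
    """Return the LowTotal/LowFree values (in kB) found in meminfo text."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        key = key.strip()
        fields = rest.split()
        if key in ("LowTotal", "LowFree") and fields:
            values[key] = int(fields[0])
    return values


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            low = low_memory_kb(f.read())
    except OSError:
        low = {}
    if low.get("LowTotal") and "LowFree" in low:
        pct = 100.0 * low["LowFree"] / low["LowTotal"]
        print("LowFree: %d kB of %d kB (%.1f%%)"
              % (low["LowFree"], low["LowTotal"], pct))
    else:
        # Kernels without highmem (for example 64-bit) omit these fields.
        print("LowTotal/LowFree not reported by this kernel")
```

Run periodically on a compute node, this would show how close low memory is
to exhaustion before a crash.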
Best Regards
Lu Wang
--------------------------------------------------------------
Computing Center
IHEP Office: Computing Center,123
19B Yuquan Road Tel: (+86) 10 88236012-607
P.O. Box 918-7 Fax: (+86) 10 8823 6839
Beijing 100049,China Email: Lu.Wang at ihep.ac.cn
--------------------------------------------------------------