More follow-up:
I confirmed that it is indeed the Ethernet chipset (or some conflict
between it and Lustre). I have tried this with another Ethernet chipset
(Intel 82551). I don't think it's worth filing a bug report over, as
this chipset is pretty darn old (4+ years), but it's good to know there
might be an issue. I have no problem just getting rid of this older
motherboard and moving on to something more modern, but it IS a very
capable older motherboard. Any help with it would be great, but I'm
willing to chalk it all up to experience.
This only occurs during very heavy data transfer of many small files. It
does not, as I said before, happen on any of my other motherboards that
have the same hardware attached (same qla2300 card, same hard drive, same
processors, same amount of RAM). Very odd, but these things happen.
----
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(mdc_request.c:421:mdc_req2lustre_md()) OBD_MD_FLEASIZE set, but
eadatasize 0
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) kmalloc of
'request->rq_repmsg'
(1279337184 bytes) failed at /root/lustre-1.5.95/lustre/ptlrpc/niobuf.c:431
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) 14965461 total bytes allocated by
Lustre, 282644 by Portals
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(client.c:950:ptlrpc_expire_one_request()) @@@ timeout (sent at
0, 1160952932s ago)
Oct 15 18:55:32 proc2 kernel: Lustre:
4873:0:(peer.c:238:lnet_debug_peer()) 10.51.0.17@tcp 2
up 8 8 8 8 6 0
Oct 15 18:55:32 proc2 kernel: LustreError: lustre-MDT0000-mdc-f5dce000:
Connection to service lustre-MDT0000 via nid 10.51.0.17@tcp was lost; in
progress operations using this service will wait for recovery to complete.
Oct 15 18:55:32 proc2 kernel: Lustre: lustre-MDT0000-mdc-f5dce000:
Connection restored to service lustre-MDT0000 using nid 10.51.0.17@tcp.
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) @@@ RESEND:
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) kmalloc of
'request->rq_repmsg'
(1279337184 bytes) failed at /root/lustre-1.5.95/lustre/ptlrpc/niobuf.c:431
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) 14962285 total bytes allocated by
Lustre, 282496 by Portals
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) req@f2933e00 x137777/t0
o101->lustre-MDT0000_UUID@10.51.0.17@tcp:12 lens 512/1279337184 ref 1 fl
Rpc:SP/2/0 rc -11/0
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) Skipped 462 previous similar
messages
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) kmalloc of
'request->rq_repmsg'
(1279337184 bytes) failed at /root/lustre-1.5.95/lustre/ptlrpc/niobuf.c:431
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 520 previous similar messages
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) 14962285 total bytes allocated by
Lustre, 282496 by Portals
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 520 previous similar messages
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) @@@ RESEND:
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) Skipped 1172 previous similar
messages
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) kmalloc of
'request->rq_repmsg'
(1279337184 bytes) failed at /root/lustre-1.5.95/lustre/ptlrpc/niobuf.c:431
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 904 previous similar messages
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) 14962285 total bytes allocated by
Lustre, 282496 by Portals
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 904 previous similar messages
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) @@@ RESEND:
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) Skipped 1805 previous similar
messages
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) kmalloc of
'request->rq_repmsg'
(1279337184 bytes) failed at /root/lustre-1.5.95/lustre/ptlrpc/niobuf.c:431
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 1524 previous similar messages
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) 14962285 total bytes allocated by
Lustre, 282496 by Portals
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 1524 previous similar messages
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) @@@ RESEND:
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) Skipped 3059 previous similar
messages
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) 14962285 total bytes allocated by
Lustre, 282496 by Portals
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 2747 previous similar messages
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) kmalloc of
'request->rq_repmsg'
(1279337184 bytes) failed at /root/lustre-1.5.95/lustre/ptlrpc/niobuf.c:431
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 2748 previous similar messages
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) @@@ RESEND:
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) Skipped 5497 previous similar
messages
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) kmalloc of
'request->rq_repmsg'
(1279337184 bytes) failed at /root/lustre-1.5.95/lustre/ptlrpc/niobuf.c:431
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 5277 previous similar messages
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) 14962285 total bytes allocated by
Lustre, 282496 by Portals
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 5278 previous similar messages
Oct 15 18:55:32 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) req@f2933e00 x137777/t0
o101->lustre-MDT0000_UUID@10.51.0.17@tcp:12 lens 512/1279337184 ref 1 fl
Rpc:SP/2/0 rc -11/0
Oct 15 18:55:33 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) Skipped 10558 previous
similar messages
Oct 15 18:55:33 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) kmalloc of
'request->rq_repmsg'
(1279337184 bytes) failed at /root/lustre-1.5.95/lustre/ptlrpc/niobuf.c:431
Oct 15 18:55:33 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 10122 previous similar messages
Oct 15 18:55:33 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) 14962285 total bytes allocated by
Lustre, 282496 by Portals
Oct 15 18:55:33 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 10122 previous similar messages
Oct 15 18:55:33 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) @@@ RESEND:
Oct 15 18:55:33 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) Skipped 20256 previous
similar messages
Oct 15 18:55:33 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) kmalloc of
'request->rq_repmsg'
(1279337184 bytes) failed at /root/lustre-1.5.95/lustre/ptlrpc/niobuf.c:431
Oct 15 18:55:33 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 20364 previous similar messages
Oct 15 18:55:33 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) 14962285 total bytes allocated by
Lustre, 282496 by Portals
Oct 15 18:55:33 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 20364 previous similar messages
Oct 15 18:55:33 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) @@@ RESEND:
Oct 15 18:55:33 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) Skipped 40715 previous
similar messages
Oct 15 18:55:33 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) kmalloc of
'request->rq_repmsg'
(1279337184 bytes) failed at /root/lustre-1.5.95/lustre/ptlrpc/niobuf.c:431
Oct 15 18:55:33 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 40164 previous similar messages
Oct 15 18:55:33 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) 14963481 total bytes allocated by
Lustre, 283048 by Portals
Oct 15 18:55:33 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 40164 previous similar messages
Oct 15 18:55:33 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) @@@ RESEND:
Oct 15 18:55:33 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) Skipped 80303 previous
similar messages
Oct 15 18:55:34 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) kmalloc of
'request->rq_repmsg'
(1279337184 bytes) failed at /root/lustre-1.5.95/lustre/ptlrpc/niobuf.c:431
Oct 15 18:55:34 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 80141 previous similar messages
Oct 15 18:55:34 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) 14963481 total bytes allocated by
Lustre, 283048 by Portals
Oct 15 18:55:34 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 80141 previous similar messages
Oct 15 18:55:34 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) @@@ RESEND:
Oct 15 18:55:34 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) Skipped 160289 previous
similar messages
Oct 15 18:55:35 proc2 kernel: CPU1: Temperature/speed normal
Oct 15 18:55:36 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) 14963481 total bytes allocated by
Lustre, 283048 by Portals
Oct 15 18:55:36 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 140418 previous similar
messages
Oct 15 18:55:36 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) kmalloc of
'request->rq_repmsg'
(1279337184 bytes) failed at /root/lustre-1.5.95/lustre/ptlrpc/niobuf.c:431
Oct 15 18:55:36 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 140419 previous similar
messages
Oct 15 18:55:36 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) @@@ RESEND:
Oct 15 18:55:36 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) Skipped 280709 previous
similar messages
Oct 15 18:55:40 proc2 kernel: CPU1: Temperature/speed normal
Oct 15 18:55:40 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) kmalloc of
'request->rq_repmsg'
(1279337184 bytes) failed at /root/lustre-1.5.95/lustre/ptlrpc/niobuf.c:431
Oct 15 18:55:40 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 192416 previous similar
messages
Oct 15 18:55:40 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) 14963481 total bytes allocated by
Lustre, 283048 by Portals
Oct 15 18:55:40 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 192417 previous similar
messages
Oct 15 18:55:40 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) @@@ RESEND:
Oct 15 18:55:40 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) Skipped 384661 previous
similar messages
Oct 15 18:55:42 proc2 kernel: BUG: soft lockup detected on CPU#0!
Oct 15 18:55:42 proc2 kernel: <c0132e49> softlockup_tick+0x9a/0xab
<c011ecb3> update_process_times+0x39/0x5c
Oct 15 18:55:42 proc2 kernel: <c010c85f>
smp_apic_timer_interrupt+0x54/0x5a <c0103138>
apic_timer_interrupt+0x1c/0x24
Oct 15 18:55:42 proc2 kernel: <c0258f68> vsnprintf+0x30e/0x49e
<f8f7a6ba> libcfs_debug_vmsg+0x1ba/0x49e [libcfs]
Oct 15 18:55:42 proc2 kernel: <f8f78653> cdebug_va+0x87/0x8f [libcfs]
<f8f7867e> cdebug+0x23/0x29 [libcfs]
Oct 15 18:55:42 proc2 kernel: <f907ee9f> debug_req+0x1c1/0x1ca
[ptlrpc] <f906c93d> ptlrpc_check_reply+0x2db/0x4e7 [ptlrpc]
Oct 15 18:55:42 proc2 kernel: <f907145b> ptlrpc_queue_wait+0xa26/0x14d2
[ptlrpc] <f9070684> expired_request+0x0/0x117 [ptlrpc]
Oct 15 18:55:42 proc2 kernel: <f907079b> interrupted_request+0x0/0x3f
[ptlrpc] <f9070684> expired_request+0x0/0x117 [ptlrpc]
Oct 15 18:55:42 proc2 kernel: <f907079b> interrupted_request+0x0/0x3f
[ptlrpc] <f905b33f> ldlm_cli_enqueue+0x59a/0x6a3 [ptlrpc]
Oct 15 18:55:42 proc2 kernel: <f90f3bae> mdc_enqueue+0x8be/0x1208
[mdc] <f91f094e> ll_mdc_blocking_ast+0x0/0x52c [lustre]
Oct 15 18:55:42 proc2 kernel: <f90593b2> ldlm_completion_ast+0x0/0x713
[ptlrpc] <f90f4856> mdc_intent_lock+0x35e/0xaab [mdc]
Oct 15 18:55:42 proc2 kernel: <f90593b2> ldlm_completion_ast+0x0/0x713
[ptlrpc] <f91f094e> ll_mdc_blocking_ast+0x0/0x52c [lustre]
Oct 15 18:55:42 proc2 kernel: <f8f7a81e> libcfs_debug_vmsg+0x31e/0x49e
[libcfs] <f8f7a9c5> libcfs_debug_msg+0x27/0x2c [libcfs]
Oct 15 18:55:42 proc2 kernel: <f91f197b> ll_lookup_it+0x220/0x5e3
[lustre] <f91f103f> ll_i2gids+0x4d/0xdb [lustre]
Oct 15 18:55:42 proc2 kernel: <f91f1a9a> ll_lookup_it+0x33f/0x5e3
[lustre] <f91f094e> ll_mdc_blocking_ast+0x0/0x52c [lustre]
Oct 15 18:55:42 proc2 kernel: <f91cfb54> ll_inode_permission+0x0/0xa6
[lustre] <f91f20d4> ll_lookup_nd+0x173/0x358 [lustre]
Oct 15 18:55:42 proc2 kernel: <c015c0e0> __lookup_hash+0x70/0x89
<c015c7b0> open_namei+0xe6/0x5c0
Oct 15 18:55:42 proc2 kernel: <c014e70e> do_filp_open+0x1c/0x31
<c0259b7e> strncpy_from_user+0x3c/0x5b
Oct 15 18:55:42 proc2 kernel: <c014e8ab> get_unused_fd+0xaa/0xb1
<c014e985> do_sys_open+0x3c/0xac
Oct 15 18:55:42 proc2 kernel: <c014ea0b> sys_open+0x16/0x18
<c01025ff>
sysenter_past_esp+0x54/0x75
Oct 15 18:55:45 proc2 kernel: CPU1: Temperature/speed normal
Oct 15 18:55:49 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) 14963481 total bytes allocated by
Lustre, 283048 by Portals
Oct 15 18:55:49 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 303452 previous similar
messages
Oct 15 18:55:49 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) kmalloc of
'request->rq_repmsg'
(1279337184 bytes) failed at /root/lustre-1.5.95/lustre/ptlrpc/niobuf.c:431
Oct 15 18:55:49 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 303453 previous similar
messages
Oct 15 18:55:49 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) @@@ RESEND:
Oct 15 18:55:49 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) Skipped 606883 previous
similar messages
Oct 15 18:55:50 proc2 kernel: CPU1: Temperature/speed normal
Oct 15 18:56:05 proc2 last message repeated 3 times
Oct 15 18:56:05 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) kmalloc of
'request->rq_repmsg'
(1279337184 bytes) failed at /root/lustre-1.5.95/lustre/ptlrpc/niobuf.c:431
Oct 15 18:56:05 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 531039 previous similar
messages
Oct 15 18:56:05 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) 14965669 total bytes allocated by
Lustre, 285500 by Portals
Oct 15 18:56:05 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 531040 previous similar
messages
Oct 15 18:56:05 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) @@@ RESEND:
Oct 15 18:56:05 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) Skipped 1062029 previous
similar messages
Oct 15 18:56:10 proc2 kernel: CPU1: Temperature/speed normal
Oct 15 18:56:30 proc2 last message repeated 4 times
Oct 15 18:56:35 proc2 kernel: LustreError: A timeout occurred sending
data to 12345-10.51.0.17@tcp (10.51.0.17:988) the network or that node
may be down.
Oct 15 18:56:35 proc2 kernel: Lustre:
4255:0:(router.c:187:lnet_do_notify()) Upcall: NID 10.51.0.17@tcp is dead
Oct 15 18:56:35 proc2 kernel: Lustre:
15:0:(linux-debug.c:96:libcfs_run_upcall()) Invoked LNET upcall
/usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,10.51.0.17@tcp,down,1160952933
Oct 15 18:56:35 proc2 kernel: CPU1: Temperature/speed normal
Oct 15 18:56:38 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) kmalloc of
'request->rq_repmsg'
(1279337184 bytes) failed at /root/lustre-1.5.95/lustre/ptlrpc/niobuf.c:431
Oct 15 18:56:38 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 933433 previous similar
messages
Oct 15 18:56:38 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) 14966685 total bytes allocated by
Lustre, 271804 by Portals
Oct 15 18:56:38 proc2 kernel: LustreError:
4873:0:(niobuf.c:431:ptl_send_rpc()) Skipped 933433 previous similar
messages
Oct 15 18:56:38 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) req@f2933e00 x137777/t0
o101->lustre-MDT0000_UUID@10.51.0.17@tcp:12 lens 512/1279337184 ref 1 fl
Rpc:SP/2/0 rc -11/0
Oct 15 18:56:38 proc2 kernel: LustreError:
4873:0:(client.c:556:ptlrpc_check_reply()) Skipped 1866836 previous
similar messages
Oct 15 18:56:40 proc2 kernel: CPU1: Temperature/speed normal
Oct 15 18:56:45 proc2 kernel: CPU1: Temperature/speed normal
Oct 15 18:56:48 proc2 kernel: LustreError: A timeout occurred sending
data to 12345-10.51.0.18@tcp (10.51.0.18:988) the network or that node
may be down.
Oct 15 18:56:48 proc2 kernel: Lustre:
4255:0:(router.c:187:lnet_do_notify()) Upcall: NID 10.51.0.18@tcp is dead
Oct 15 18:56:48 proc2 kernel: Lustre:
15:0:(linux-debug.c:96:libcfs_run_upcall()) Invoked LNET upcall
/usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,10.51.0.18@tcp,down,1160953008
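
A possible sanity check on the impossible buffer sizes in these logs
(1279337184 in the log above, 544175136 and 1230261829 in the earlier
message): the Python sketch below is not part of Lustre, and its helper
names are made up for illustration. It prints each value as the four
bytes it would occupy in memory, assuming a little-endian x86 client
(an assumption based on the 32-bit kernel addresses in the traces).

```python
import struct

# Sizes copied from the LustreError lines in this thread.
SUSPECT_SIZES = [1279337184, 544175136, 1230261829]


def as_le_bytes(value):
    """Return the 4 bytes of a 32-bit value in little-endian (x86) order."""
    return struct.pack("<I", value)


def printable(raw):
    """Render bytes as text, replacing non-printable bytes with '.'."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


for size in SUSPECT_SIZES:
    raw = as_le_bytes(size)
    # Columns: decimal value, hex value, bytes in memory order, ASCII view.
    print(f"{size:>10}  0x{size:08x}  {raw.hex()}  {printable(raw)!r}")
```

For 544175136 and 1230261829 the four bytes read as " to " and "ENTI",
i.e. plain text rather than plausible lengths. That might hint that
payload data is landing where a length field was expected, which would
be consistent with (but does not prove) corruption somewhere on the
network path.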
David Bernick wrote:
> A followup to my last message about Lustre crashing. As I said, it only
> crashes when this motherboard (Supermicro X5DPL-IGM, Intel E7501
> chipset, 82545EM Gigabit Ethernet) is used. When I use a much more
> modern motherboard, I don't seem to have this problem. The motherboard
> is the only difference between the machines that crash and the machines
> that don't. The machines also have a qla2300 card in the PCI slot via a
> riser card, but, as I said, only one machine is unstable. I have
> replaced the motherboard just to make sure that the actual motherboard
> is not the problem. It does not seem to be. Any ideas?
>
>
>
> Oct 14 23:49:41 proc2 kernel: LustreError:
> 3931:0:(pack_generic.c:482:lustre_msg_buf_v2()) msg efeb3000 buffer[3]
> size 56 too small (required 544175136)
> Oct 14 23:49:41 proc2 kernel: LustreError:
> 3931:0:(mdc_locks.c:477:mdc_enqueue()) Missing/short eadata
> Oct 14 23:49:42 proc2 kernel: LustreError:
> 3931:0:(pack_generic.c:482:lustre_msg_buf_v2()) msg efb52400 buffer[3]
> size 56 too small (required 1230261829)
> Oct 14 23:49:42 proc2 kernel: LustreError:
> 3931:0:(mdc_locks.c:477:mdc_enqueue()) Missing/short eadata
> Oct 14 23:49:43 proc2 kernel: LustreError:
> 3931:0:(llite_lib.c:1530:ll_update_inode()) lsm mismatch for inode 22140674
> Oct 14 23:49:43 proc2 kernel: LustreError:
> 3931:0:(llite_lib.c:1531:ll_update_inode()) lli_smd:
> Oct 14 23:49:43 proc2 kernel: LustreError:
> 3931:0:(debug.c:108:dump_lsm()) lsm ef47ebc0, objid 0x151d702, maxbytes
> 0x1fffffff000, magic 0x0BD10BD0, stripe_size 1048576, stripe_count 1
> Oct 14 23:49:43 proc2 kernel: LustreError:
> 3931:0:(llite_lib.c:1533:ll_update_inode()) lsm:
> Oct 14 23:49:43 proc2 kernel: LustreError:
> 3931:0:(debug.c:108:dump_lsm()) lsm ef47eac0, objid 0x151d6fa, maxbytes
> 0x1fffffff000, magic 0x0BD10BD0, stripe_size 1048576, stripe_count 1
> Oct 14 23:49:43 proc2 kernel: LustreError:
> 3931:0:(linux-debug.c:130:lbug_with_loc()) LBUG
> Oct 14 23:49:43 proc2 kernel: Lustre:
> 3931:0:(linux-debug.c:158:libcfs_debug_dumpstack()) can't show stack:
> kernel doesn't export show_task
> Oct 14 23:49:43 proc2 kernel: <f9135a60> lbug_with_loc+0x88/0xaf
> [libcfs] <f9383682> ll_update_inode+0x352/0xb22 [lustre]
> Oct 14 23:49:43 proc2 kernel: <f93866ad> ll_prep_inode+0x194/0x80f
> [lustre] <f9357284> revalidate_it_finish+0x1f3/0x283 [lustre]
> Oct 14 23:49:43 proc2 kernel: <f93743c2>
> ll_inode_revalidate_it+0x368/0x7c0 [lustre] <f937482e>
> ll_getattr_it+0x14/0x109 [lustre]
> Oct 14 23:49:43 proc2 kernel: <f937494f> ll_getattr+0x2c/0x34 [lustre]
> <c01670a1> mntput_no_expire+0x11/0x6e
> Oct 14 23:49:43 proc2 kernel: <f9374923> ll_getattr+0x0/0x34 [lustre]
> <c0157159> vfs_getattr+0x3e/0x98
> Oct 14 23:49:43 proc2 kernel: <c0157266> vfs_fstat+0x22/0x31
> <c01577e0> sys_fstat64+0xf/0x23
> Oct 14 23:49:43 proc2 kernel: <c014e924> fd_install+0x24/0x49
> <c014e9ed> do_sys_open+0xa4/0xac
> Oct 14 23:49:43 proc2 kernel: <c014ea0b> sys_open+0x16/0x18 <c01025ff>
> sysenter_past_esp+0x54/0x75
> Oct 14 23:49:43 proc2 kernel: LustreError: dumping log to
> /tmp/lustre-log.1160884183.3931
> Oct 14 23:49:43 proc2 kernel: Lustre:
> 3931:0:(linux-debug.c:96:libcfs_run_upcall()) Invoked LNET upcall
> /usr/lib/lustre/lnet_upcall
> LBUG,/root/lustre-1.5.95/lustre/llite/llite_lib.c,ll_update_inode,1535
>
>