uselton2@llnl.gov
2007-Jan-09 17:05 UTC
[Lustre-devel] [Bug 11176] ltest OSTs seeing Out-Of-Memory condition
Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=11176

Brian started a little loop on the igs21 OSS that did the following:

    # screen
    # while [ 1 ] ; do echo "m" > /proc/sysrq-trigger; sleep 1200; done
    # ^a d   (detach from the screen session)

The loop ran from the time of a reboot on January 3rd, 2007. It shows a rapid rise in the Active pages value until it stabilizes after a couple of hours:

    2007-01-03 17:33:10 Active:60953 inactive:218430 dirty:0 writeback:0 unstable:0 free:99633 slab:58644 mapped:5598 pagetables:233
    2007-01-03 17:54:50 Active:143831 inactive:198215 dirty:0 writeback:0 unstable:0 free:35719 slab:59937 mapped:5598 pagetables:233
    2007-01-03 18:16:43 Active:216975 inactive:152993 dirty:0 writeback:0 unstable:0 free:6445 slab:61204 mapped:5548 pagetables:205
    2007-01-03 18:36:44 Active:248424 inactive:121477 dirty:0 writeback:0 unstable:0 free:6261 slab:61472 mapped:5548 pagetables:205
    2007-01-03 18:56:44 Active:266343 inactive:103574 dirty:0 writeback:0 unstable:0 free:6345 slab:61411 mapped:5548 pagetables:205
    2007-01-03 19:16:44 Active:285333 inactive:84681 dirty:26 writeback:0 unstable:0 free:6649 slab:60978 mapped:5548 pagetables:205
    2007-01-03 19:36:44 Active:296231 inactive:73783 dirty:28 writeback:0 unstable:0 free:6263 slab:61392 mapped:5548 pagetables:205
    2007-01-03 19:56:44 Active:301777 inactive:67702 dirty:38 writeback:0 unstable:0 free:6103 slab:61889 mapped:5548 pagetables:205
    2007-01-03 20:16:45 Active:306633 inactive:62612 dirty:38 writeback:0 unstable:0 free:6717 slab:61690 mapped:5551 pagetables:205
    2007-01-03 20:36:45 Active:310025 inactive:59337 dirty:26 writeback:0 unstable:0 free:6843 slab:61431 mapped:5551 pagetables:205

After that the inactive pages slowly drain away until you see numbers like the following:

    2007-01-07 23:57:45 Active:340033 inactive:13701 dirty:0 writeback:0 unstable:0 free:13593 slab:70320 mapped:5621 pagetables:206
    2007-01-08 00:17:46 Active:340068 inactive:13701 dirty:597 writeback:0 unstable:0 free:12575 slab:70980 mapped:5621 pagetables:206

At this point the OOM killer is liable to be invoked at any time. We decided to try a stepped sequence of tests to see if we could get any insight into where all the pages are disappearing to:

1) Run for a few hours with Lustre off to see if active and inactive pages stay stable.
2) Start Lustre but leave the network interfaces off while we monitor memory use.
3) Activate the network interfaces but leave the file system idle and continue to monitor.
4) Start auto-testing again and continue to monitor - we'd like some way of tracking exactly who (what kernel process) is responsible for the change in memory use; a rough sketch of one possible sampling approach follows below.
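As a rough sketch of one way to do the sampling asked for in item 4 - assuming a small user-space C tool rather than anything in Lustre, and assuming the 2.6.9-era /proc/meminfo field names - something like the following could record the same counters (in kB rather than pages) between sysrq dumps. The name memsample.c and the field list are placeholders; it does not attribute memory to a particular kernel thread, it only gives a finer-grained record to correlate against Lustre state changes.

    /*
     * memsample.c - hypothetical sketch, not a Lustre tool: log selected
     * /proc/meminfo fields with a timestamp at a fixed interval.
     *
     * Build: gcc -o memsample memsample.c
     * Run:   ./memsample 1200
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        /* Sampling interval in seconds; the default matches the sysrq loop. */
        unsigned interval = (argc > 1) ? (unsigned)atoi(argv[1]) : 1200;
        /* Field names assume the 2.6.9-era /proc/meminfo layout. */
        const char *fields[] = { "Active:", "Inactive:", "Slab:", "Mapped:" };
        size_t nfields = sizeof(fields) / sizeof(fields[0]);

        for (;;) {
            FILE *fp = fopen("/proc/meminfo", "r");
            char line[256], stamp[64];
            time_t now = time(NULL);
            size_t i;

            if (fp == NULL) {
                perror("/proc/meminfo");
                return 1;
            }
            strftime(stamp, sizeof(stamp), "%Y-%m-%d %H:%M:%S", localtime(&now));
            printf("%s", stamp);

            /* Print only the fields we care about, all on one line (kB). */
            while (fgets(line, sizeof(line), fp) != NULL) {
                for (i = 0; i < nfields; i++) {
                    size_t n = strlen(fields[i]);
                    unsigned long kb;
                    if (strncmp(line, fields[i], n) == 0 &&
                        sscanf(line + n, "%lu", &kb) == 1)
                        printf(" %s%lu kB", fields[i], kb);
                }
            }
            printf("\n");
            fflush(stdout);
            fclose(fp);
            sleep(interval);
        }
        return 0;
    }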
uselton2@llnl.gov
2007-Jan-13 12:03 UTC
[Lustre-devel] [Bug 11176] ltest OSTs seeing Out-Of-Memory condition
Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=11176

Hit this again after the update to:

    2007-01-12 16:35:12 Starting lustre: Lustre: OBD class driver Build Version: 1.4.8.0_-19691231160000-PRISTINE-.lib.modules.2.6.9-58chaos.build-2.6.9-58chaos, info@clusterfs.com

Here's the console log (igs1):

    2007-01-13 03:05:03 NMI Watchdog detected LOCKUP, CPU=1, registers:
    2007-01-13 03:05:03 CPU 1
    2007-01-13 03:05:03 Modules linked in: lquota(U) obdfilter(U) fsfilt_ldiskfs(U) ldiskfs(U) jbd(U) ost(U) ptlrpc(U) obdclass(U) lvfs(U) ksocklnd(U) lnet(U) libcfs(U) perfctr(U) i2c_i801(U) netdump(U) e752x_edac(U) edac_mc(U) i2c_dev(U) i2c_core(U) dm_mod(U) rtc(U) md(U) sd_mod(U) uhci_hcd(U) ehci_hcd(U) floppy(U) qla2300(U) qla2xxx(U) scsi_transport_fc(U) scsi_mod(U) unionfs(U) nfs(U) lockd(U) sunrpc(U) e1000(U)
    2007-01-13 03:05:03 Pid: 4528, comm: socknal_sd01 Not tainted 2.6.9-58chaos
    2007-01-13 03:05:03 RIP: 0010:[<ffffffff802edd14>] <ffffffff802edd14>{.text.lock.spinlock+17}
    2007-01-13 03:05:03 RSP: 0018:0000010072575968 EFLAGS: 00000086
    2007-01-13 03:05:03 RAX: 000001000000d8f8 RBX: 000001000000d780 RCX: 0000000000000002
    2007-01-13 03:05:03 RDX: 0000000000000002 RSI: 0000000000000001 RDI: 000001000000d780
    2007-01-13 03:05:03 RBP: 00000100025e9d80 R08: 0000000000004000 R09: 0000010068fc1030
    2007-01-13 03:05:03 R10: 0000000000000138 R11: 0000000000001d18 R12: 000001000000d780
    2007-01-13 03:05:03 R13: 0000000000000002 R14: 0000000000000000 R15: 000001000bcd6e00
    2007-01-13 03:05:03 FS: 0000002a958a0b00(0000) GS:ffffffff804e3180(0000) knlGS:0000000000000000
    2007-01-13 03:05:03 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    2007-01-13 03:05:03 CR2: 0000002a955ad000 CR3: 000000007f80a000 CR4: 00000000000006e0
    2007-01-13 03:05:03 Process socknal_sd01 (pid: 4528, threadinfo 0000010072574000, task 000001006b8854d0)
    2007-01-13 03:05:03 Stack: 0000000000000001 0000000000000006 00000100025e9e28 ffffffff80159a9d
    2007-01-13 03:05:03        00000000ffffffff 00000000ffffffff 000001007de9c000 0000000000000002
    2007-01-13 03:05:03        0000000071c35b00 0000010001038000
    2007-01-13 03:05:03 Call Trace:<ffffffff80159a9d>{free_pages_bulk+79} <ffffffff8029326c>{dev_queue_xmit+530}
    2007-01-13 03:05:03        <ffffffff80159e14>{__free_pages_ok+255} <ffffffff8015db9f>{kmem_freepages+282}
    2007-01-13 03:05:03        <ffffffff8015e5f4>{slab_destroy+124} <ffffffff8015e73a>{free_block+285}
    2007-01-13 03:05:03        <ffffffff8015e86c>{cache_flusharray+94} <ffffffff8015e519>{kfree+195}
    2007-01-13 03:05:03        <ffffffff8028dea8>{kfree_skbmem+9} <ffffffff802b4e08>{tcp_recvmsg+1589}
    2007-01-13 03:05:03        <ffffffff8028d786>{sock_common_recvmsg+48} <ffffffff8028a37c>{sock_recvmsg+284}
    2007-01-13 03:05:03        <ffffffff8012dcc5>{recalc_task_prio+337} <ffffffff801313a2>{autoremove_wake_function+0}
    2007-01-13 03:05:03        <ffffffffa02388b3>{:ksocklnd:ksocknal_lib_recv_kiov+343}
    2007-01-13 03:05:03        <ffffffffa01fcdfd>{:lnet:lnet_md_unlink+381} <ffffffffa0233bdc>{:ksocklnd:ksocknal_recv_kiov+62}
    2007-01-13 03:05:03        <ffffffffa0233daa>{:ksocklnd:ksocknal_receive+270}
    2007-01-13 03:05:03        <ffffffffa023540d>{:ksocklnd:ksocknal_process_receive+126}
    2007-01-13 03:05:03        <ffffffffa0235aa6>{:ksocklnd:ksocknal_scheduler+375}
    2007-01-13 03:05:03        <ffffffff801313a2>{autoremove_wake_function+0} <ffffffff801313a2>{autoremove_wake_function+0}
    2007-01-13 03:05:03        <ffffffff8010ff3f>{child_rip+8} <ffffffffa023592f>{:ksocklnd:ksocknal_scheduler+0}
    2007-01-13 03:05:03        <ffffffff8010ff37>{child_rip+0}
    2007-01-13 03:05:03
    2007-01-13 03:05:03 Code: 7e f9 e9 ce fc ff ff e8 c8 69 ef ff e9 4a fd ff ff e8 de 69
    2007-01-13 03:05:03 Kernel panic - not syncing: nmi watchdog
    2007-01-13 03:05:03 ----------- [cut here ] --------- [please bite here ] ---------
    2007-01-13 03:05:03 Kernel BUG at panic:74
    2007-01-13 03:05:03 invalid operand: 0000 [1] SMP
    2007-01-13 03:05:03 CPU 1
    2007-01-13 03:05:03 Modules linked in: lquota(U) obdfilter(U) fsfilt_ldiskfs(U) ldiskfs(U) jbd(U) ost(U) ptlrpc(U) obdclass(U) lvfs(U) ksocklnd(U) lnet(U) libcfs(U) perfctr(U) i2c_i801(U) netdump(U) e752x_edac(U) edac_mc(U) i2c_dev(U) i2c_core(U) dm_mod(U) rtc(U) md(U) sd_mod(U) uhci_hcd(U) ehci_hcd(U) floppy(U) qla2300(U) qla2xxx(U) scsi_transport_fc(U) scsi_mod(U) unionfs(U) nfs(U) lockd(U) sunrpc(U) e1000(U)
    2007-01-13 03:05:03 Pid: 4528, comm: socknal_sd01 Not tainted 2.6.9-58chaos
    2007-01-13 03:05:03 RIP: 0010:[<ffffffff8013370e>] <ffffffff8013370e>{panic+211}
    2007-01-13 03:05:03 RSP: 0018:000001007f80dda8 EFLAGS: 00010082
    2007-01-13 03:05:03 RAX: 000000000000002c RBX: ffffffff80301d02 RCX: 0000000000000046
    2007-01-13 03:05:03 RDX: 0000000000006bf3 RSI: 0000000000000046 RDI: ffffffff803ba8e0
    2007-01-13 03:05:03 RBP: 000001007f80df58 R08: 0000000000000007 R09: ffffffff80301d02
    2007-01-13 03:05:03 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000021
    2007-01-13 03:05:03 R13: 0000000000000002 R14: 0000000000000000 R15: 000001000bcd6e00
    2007-01-13 03:05:03 FS: 0000002a958a0b00(0000) GS:ffffffff804e3180(0000) knlGS:0000000000000000
    2007-01-13 03:05:03 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    2007-01-13 03:05:03 CR2: 0000002a955ad000 CR3: 000000007f80a000 CR4: 00000000000006e0
    2007-01-13 03:05:03 Process socknal_sd01 (pid: 4528, threadinfo 0000010072574000, task 000001006b8854d0)
    2007-01-13 03:05:03 Stack: 0000003000000008 000001007f80de88 000001007f80ddc8 0000000000000013
    2007-01-13 03:05:03        0000000000000000 0000000000000046 0000000000006bc7 0000000000000046
    2007-01-13 03:05:03        0000000000000007 ffffffff80304257
    2007-01-13 03:05:03 Call Trace:<ffffffff80110858>{show_stack+241} <ffffffff80110982>{show_registers+277}
    2007-01-13 03:05:03        <ffffffff80110c89>{die_nmi+130} <ffffffff8011a597>{nmi_watchdog_tick+216}
    2007-01-13 03:05:03        <ffffffff80111560>{default_do_nmi+122} <ffffffff8011a64d>{do_nmi+115}
    2007-01-13 03:05:03        <ffffffff8011016b>{paranoid_exit+0} <ffffffff802edd14>{.text.lock.spinlock+17}
    2007-01-13 03:05:03        <EOE> <ffffffff80159a9d>{free_pages_bulk+79}
    2007-01-13 03:05:03        <ffffffff8029326c>{dev_queue_xmit+530} <ffffffff80159e14>{__free_pages_ok+255}
    2007-01-13 03:05:03        <ffffffff8015db9f>{kmem_freepages+282} <ffffffff8015e5f4>{slab_destroy+124}
    2007-01-13 03:05:03        <ffffffff8015e73a>{free_block+285} <ffffffff8015e86c>{cache_flusharray+94}
    2007-01-13 03:05:03        <ffffffff8015e519>{kfree+195} <ffffffff8028dea8>{kfree_skbmem+9}
    2007-01-13 03:05:03        <ffffffff802b4e08>{tcp_recvmsg+1589} <ffffffff8028d786>{sock_common_recvmsg+48}
    2007-01-13 03:05:03        <ffffffff8028a37c>{sock_recvmsg+284} <ffffffff8012dcc5>{recalc_task_prio+337}
    2007-01-13 03:05:03        <ffffffff801313a2>{autoremove_wake_function+0} <ffffffffa02388b3>{:ksocklnd:ksocknal_lib_recv_kiov+343}
    2007-01-13 03:05:03        <ffffffffa01fcdfd>{:lnet:lnet_md_unlink+381} <ffffffffa0233bdc>{:ksocklnd:ksocknal_recv_kiov+62}
    2007-01-13 03:05:03        <ffffffffa0233daa>{:ksocklnd:ksocknal_receive+270}
    2007-01-13 03:05:03        <ffffffffa023540d>{:ksocklnd:ksocknal_process_receive+126}
    2007-01-13 03:05:03        <ffffffffa0235aa6>{:ksocklnd:ksocknal_scheduler+375}
    2007-01-13 03:05:03        <ffffffff801313a2>{autoremove_wake_function+0} <ffffffff801313a2>{autoremove_wake_function+0}
    2007-01-13 03:05:03        <ffffffff8010ff3f>{child_rip+8} <ffffffffa023592f>{:ksocklnd:ksocknal_scheduler+0}
    2007-01-13 03:05:03        <ffffffff8010ff37>{child_rip+0}
    2007-01-13 03:05:03
    2007-01-13 03:05:03 Code: 0f 0b 5e 22 30 80 ff ff ff ff 4a 00 31 ff e8 47 bf fe ff e8
    2007-01-13 03:05:03 RIP <ffffffff8013370e>{panic+211} RSP <000001007f80dda8>
    2007-01-13 03:05:06 CPU#1 is executing netdump.
    2007-01-13 03:05:06 < netdump activated - performing handshake with the server. >
    2007-01-13 03:05:07 NETDUMP START!
    2007-01-13 03:05:07 < handshake completed - listening for dump requests. >

I wasn't able to get a crash dump. It does look like an OOM-related event. We have made some progress in that investigation. I'll summarize after the (long) weekend on Tuesday.
adilger@clusterfs.com
2007-Jan-23 16:14 UTC
[Lustre-devel] [Bug 11176] ltest OSTs seeing Out-Of-Memory condition
Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=11176

I noticed during code review for another bug that the obdfilter code is allocating too much memory for the preallocated I/O buffers. We currently preallocate an I/O buffer per OST thread in filter_iobuf_pool_init(), and in filter_iobuf_get->filter_alloc_iobuf(). The problem is that these iobuf pools are PER OBDFILTER INSTANCE, and with many OST I/O threads this can add up. Instead, the iobuf pool can be global to the obdfilter (instead of per-instance), because it isn't possible for an OST RPC thread to be active with requests to two obdfilter instances at the same time.

Note that I misspoke on the conference call previously - the 1MB I/O buffer is in fact allocated in the OST and is not part of the above problem. Also, the obdfilter iobuf pools ARE accounted for in the Lustre memused total, so they are likely not the culprit here.
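To make the per-instance vs. global distinction concrete, here is a hypothetical user-space sketch - not the real filter_iobuf_pool_init()/filter_iobuf_get() code, with made-up names and a placeholder buffer size - of a single pool shared by all instances and indexed by I/O thread number. Sharing is safe precisely because a given OST I/O thread serves only one request, and therefore one obdfilter instance, at a time.

    /*
     * iobuf_pool_sketch.c - hypothetical illustration only; names, sizes and
     * structure are invented and do not reproduce the Lustre implementation.
     */
    #include <stdio.h>
    #include <stdlib.h>

    #define MAX_OST_THREADS 512        /* assumed cap on OST I/O threads */
    #define IOBUF_SIZE (256 * 1024)    /* placeholder preallocation size */

    /*
     * One global pool: the footprint is MAX_OST_THREADS * IOBUF_SIZE no
     * matter how many obdfilter instances exist.  A per-instance pool pays
     * that cost again for every instance, which is the growth described
     * above.
     */
    static void *global_iobuf_pool[MAX_OST_THREADS];

    /* Lazily allocate and hand out the buffer for one I/O thread. */
    static void *iobuf_get(unsigned thread_idx)
    {
        if (thread_idx >= MAX_OST_THREADS)
            return NULL;
        if (global_iobuf_pool[thread_idx] == NULL)
            global_iobuf_pool[thread_idx] = malloc(IOBUF_SIZE);
        return global_iobuf_pool[thread_idx];
    }

    static void iobuf_pool_fini(void)
    {
        unsigned i;
        for (i = 0; i < MAX_OST_THREADS; i++)
            free(global_iobuf_pool[i]);
    }

    int main(void)
    {
        /* Requests for two different "instances" handled by the same
         * thread reuse the same buffer, so there is no per-instance cost. */
        void *a = iobuf_get(0);    /* request against instance A, thread 0 */
        void *b = iobuf_get(0);    /* request against instance B, thread 0 */

        printf("thread 0 reuses one buffer: %s\n", a == b ? "yes" : "no");
        iobuf_pool_fini();
        return 0;
    }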