So I have not seen this in a long time. We appear to be getting a double
bast for a dentry lock. The second bast was received after the lockres
was freed by the fs.
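
The register dump in the quoted oops fits that picture. RAX holds 0000940513688ddd, which is not a canonical x86-64 address, and touching a non-canonical address raises a general protection fault instead of a page fault. A quick way to check this is sketched below. It assumes 48-bit virtual addresses, and the helper name is_canonical_x86_64 is made up for illustration, not something from the kernel or from ocfs2.

```python
def is_canonical_x86_64(addr: int, va_bits: int = 48) -> bool:
    """Return True if addr is a canonical x86-64 virtual address.

    Assumption: va_bits implemented address bits (48 by default).
    A canonical address has bits 63 down to va_bits-1 all equal,
    that is, all zeros (user half) or all ones (kernel half).
    """
    top = addr >> (va_bits - 1)                # bits 63..47 when va_bits=48
    all_ones = (1 << (64 - va_bits + 1)) - 1   # 17 one-bits when va_bits=48
    return top == 0 or top == all_ones


# Values copied from the register dump in the quoted oops.
registers = {
    "RAX": 0x0000940513688DDD,
    "RBX": 0xFFFF8103DB805018,
    "RDI": 0xFFFF8103DB805018,
    "RBP": 0xFFFF8103DB805200,
}

for name, value in registers.items():
    print(f"{name} {value:#018x} canonical={is_canonical_x86_64(value)}")
```

Only RAX fails the check. That is what a stale pointer read out of reused memory can look like, but it does not by itself prove the lockres was freed.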
2.6.24 is a two-year-old kernel. Upgrade to something more recent. We
have made numerous changes since then. If you want a stable distro, use
one of the enterprise kernels that we support.
Laurence Mayer wrote:
> I have seen a couple of times that when system CPU usage is running
> very high, the node panics.
>
> I would think this is not working as designed. Is there a way to
> prevent this?
>
>
> Oct 25 16:32:45 n8 kernel: [1554039.835464] general protection fault:
> 0000 [1] SMP
> Oct 25 16:32:45 n8 kernel: [1554039.835509] CPU 3
> Oct 25 16:32:45 n8 kernel: [1554039.835536] Modules linked in: nfs
> lockd nfs_acl sunrpc ocfs2 crc32c libcrc32c ipmi_devintf ipmi_si
> ipmi_msghandler ocfs2_dlmfs ocfs2_dlm ocfs2_nodemanager configfs
> iptable_filter ip_tables x_tables xfs ipv6 ib_iser rdma_cm ib_cm
> iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi
> scsi_transport_iscsi parport_pc lp parport loop serio_raw psmouse
> i2c_piix4 button i2c_core k8temp dcdbas shpchp pci_hotplug pcspkr
> evdev ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_generic usbhid hid
> pata_acpi pata_serverworks sata_svw tg3 libata ehci_hcd ohci_hcd
> scsi_mod usbcore thermal processor fan fbcon tileblit font bitblit
> softcursor fuse
> Oct 25 16:32:45 n8 kernel: [1554039.835947] Pid: 4925, comm: o2net
> Tainted: G M 2.6.24-24-server #1
> Oct 25 16:32:45 n8 kernel: [1554039.835982] RIP:
> 0010:[<ffffffff88472a7e>] [<ffffffff88472a7e>]
> :ocfs2:ocfs2_get_dentry_osb+0xe/0x20
> Oct 25 16:32:45 n8 kernel: [1554039.836060] RSP:
> 0000:ffff810414811ca8 EFLAGS: 00010282
> Oct 25 16:32:45 n8 kernel: [1554039.836093] RAX: 0000940513688ddd RBX:
> ffff8103db805018 RCX: 0000000000000005
> Oct 25 16:32:45 n8 kernel: [1554039.836145] RDX: ffff81036c912980 RSI:
> 0000000000000005 RDI: ffff8103db805018
> Oct 25 16:32:45 n8 kernel: [1554039.836197] RBP: ffff8103db805200 R08:
> ffff8103db805200 R09: ffff8103e7ae9200
> Oct 25 16:32:45 n8 kernel: [1554039.836250] R10: 000000000000004e R11:
> ffffffff8847a580 R12: 080000000005364d
> Oct 25 16:32:45 n8 kernel: [1554039.836302] R13: 0000000000000005 R14:
> 0000000000000000 R15: 000000000000001f
> Oct 25 16:32:45 n8 kernel: [1554039.836355] FS:
> 00002b9aaa791ff0(0000) GS:ffff810416d25c80(0000) knlGS:00000000f54d2b90
> Oct 25 16:32:45 n8 kernel: [1554039.836409] CS: 0010 DS: 0018 ES:
> 0018 CR0: 000000008005003b
> Oct 25 16:32:45 n8 kernel: [1554039.836442] CR2: 00002b9aab037000 CR3:
> 0000000215df3000 CR4: 00000000000006e0
> Oct 25 16:32:45 n8 kernel: [1554039.836494] DR0: 0000000000000000 DR1:
> 0000000000000000 DR2: 0000000000000000
> Oct 25 16:32:45 n8 kernel: [1554039.836546] DR3: 0000000000000000 DR6:
> 00000000ffff0ff0 DR7: 0000000000000400
> Oct 25 16:32:45 n8 kernel: [1554039.836599] Process o2net (pid: 4925,
> threadinfo ffff810414810000, task ffff8104145ae7f0)
> Oct 25 16:32:45 n8 kernel: [1554039.836653] Stack: ffffffff8847a5a6
> ffff810413c53000 0000000046c107d6 ffff8102f2437028
> Oct 25 16:32:45 n8 kernel: [1554039.836716] 0000000000000000
> ffff8103db805200 080000000005364d ffff8102f2437018
> Oct 25 16:32:45 n8 kernel: [1554039.836777] 0000000000000000
> 000000000000001f ffffffff8840aef4 000000000000012c
> Oct 25 16:32:45 n8 kernel: [1554039.836818] Call Trace:
> Oct 25 16:32:45 n8 kernel: [1554039.836874] [<ffffffff8847a5a6>]
> :ocfs2:ocfs2_blocking_ast+0x26/0x310
> Oct 25 16:32:45 n8 kernel: [1554039.836924]
> [ocfs2_dlm:dlm_proxy_ast_handler+0x824/0x830]
> :ocfs2_dlm:dlm_proxy_ast_handler+0x824/0x830
> Oct 25 16:32:45 n8 kernel: [1554039.836988]
> [ocfs2_nodemanager:do_gettimeofday+0x2f/0x2fb90] do_gettimeofday+0x2f/0xc0
> Oct 25 16:32:45 n8 kernel: [1554039.837035]
> [ocfs2_nodemanager:o2net_process_message+0x4cc/0x5b0]
> :ocfs2_nodemanager:o2net_process_message+0x4cc/0x5b0
> Oct 25 16:32:45 n8 kernel: [1554039.837096]
> [__dequeue_entity+0x3d/0x50] __dequeue_entity+0x3d/0x50
> Oct 25 16:32:45 n8 kernel: [1554039.837137]
> [ocfs2_nodemanager:o2net_recv_tcp_msg+0x65/0x80]
> :ocfs2_nodemanager:o2net_recv_tcp_msg+0x65/0x80
> Oct 25 16:32:45 n8 kernel: [1554039.839042]
> [ocfs2_nodemanager:o2net_rx_until_empty+0x38b/0x900]
> :ocfs2_nodemanager:o2net_rx_until_empty+0x38b/0x900
> Oct 25 16:32:45 n8 kernel: [1554039.839106]
> [ocfs2_nodemanager:o2net_rx_until_empty+0x0/0x900]
> :ocfs2_nodemanager:o2net_rx_until_empty+0x0/0x900
> Oct 25 16:32:45 n8 kernel: [1554039.839165]
> [run_workqueue+0xcc/0x170] run_workqueue+0xcc/0x170
> Oct 25 16:32:45 n8 kernel: [1554039.839198] [worker_thread+0x0/0x110]
> worker_thread+0x0/0x110
> Oct 25 16:32:45 n8 kernel: [1554039.839232] [worker_thread+0x0/0x110]
> worker_thread+0x0/0x110
> Oct 25 16:32:45 n8 kernel: [1554039.839266]
> [worker_thread+0xa3/0x110] worker_thread+0xa3/0x110
> Oct 25 16:32:45 n8 kernel: [1554039.839300] [<ffffffff80254510>]
> autoremove_wake_function+0x0/0x30
> Oct 25 16:32:45 n8 kernel: [1554039.839336] [worker_thread+0x0/0x110]
> worker_thread+0x0/0x110
> Oct 25 16:32:45 n8 kernel: [1554039.839370] [worker_thread+0x0/0x110]
> worker_thread+0x0/0x110
> Oct 25 16:32:45 n8 kernel: [1554039.839403] [kthread+0x4b/0x80]
> kthread+0x4b/0x80
> Oct 25 16:32:45 n8 kernel: [1554039.839438] [child_rip+0xa/0x12]
> child_rip+0xa/0x12
> Oct 25 16:32:45 n8 kernel: [1554039.839476]
> [lapic_next_event+0x0/0x10] lapic_next_event+0x0/0x10
> Oct 25 16:32:45 n8 kernel: [1554039.839512] [kthread+0x0/0x80]
> kthread+0x0/0x80
> Oct 25 16:32:45 n8 kernel: [1554039.839543] [child_rip+0x0/0x12]
> child_rip+0x0/0x12
> Oct 25 16:32:45 n8 kernel: [1554039.839575]
> Oct 25 16:32:45 n8 kernel: [1554039.839598]
> Oct 25 16:32:45 n8 kernel: [1554039.839599] Code: 48 8b 80 58 02 00 00
> c3 66 2e 0f 1f 84 00 00 00 00 00 8b 47
> Oct 25 16:32:45 n8 kernel: [1554039.839716] RIP [<ffffffff88472a7e>]
> :ocfs2:ocfs2_get_dentry_osb+0xe/0x20
> Oct 25 16:32:45 n8 kernel: [1554039.839761] RSP <ffff810414811ca8>
> Oct 25 16:32:45 n8 kernel: [1554039.840184] ---[ end trace
> 7b4f7c37a3752d22 ]---
> Oct 25 16:33:14 n8 kernel: [1554069.382728] o2net: connection to node
> n3 (num 3) at 172.16.16.203:7777 has been idle for 30.0 seconds,
> shutting it down.
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users at oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-users