We have a two-node OCFS2 cluster running Oracle 10g, and both nodes crashed.

Node 1 panicked while running iostat; the second node crashed with the error message you can see below.

I was hoping a newer version of OCFS2 is available so I could proceed with an upgrade if necessary.

Has anyone seen this problem, and has anybody resolved it?

Node 1 reboots
reboot   system boot  2.6.18-92.el5    Tue Jun 16 09:26   (02:14)

Node 2 reboots
reboot   system boot  2.6.18-92.el5    Tue Jun 16 09:29   (02:10)

Running kernel
Linux uscosprdvrtxdb02 2.6.18-92.el5 #1 SMP Tue Apr 29 13:16:15 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux

OCFS2 version installed
ocfs2console-1.4.1-1.el5
ocfs2-tools-1.4.1-1.el5
ocfs2-2.6.18-92.el5-1.4.1-1.el5

Crash analysis:

      KERNEL: /usr/lib/debug/lib/modules/2.6.18-92.el5/vmlinux
    DUMPFILE: vmcore  [PARTIAL DUMP]
        CPUS: 8
        DATE: Tue Jun 16 09:15:26 2009
      UPTIME: 2 days, 02:17:01
LOAD AVERAGE: 0.22, 0.31, 0.21
       TASKS: 570
    NODENAME: uscosprdvrtxdb02
     RELEASE: 2.6.18-92.el5
     VERSION: #1 SMP Tue Apr 29 13:16:15 EDT 2008
     MACHINE: x86_64  (2666 Mhz)
      MEMORY: 11.8 GB
       PANIC: ""
         PID: 28123
     COMMAND: "oracle"
        TASK: ffff8102e25e97e0  [THREAD_INFO: ffff8102cf0ba000]
         CPU: 3
       STATE: TASK_RUNNING (PANIC)

Kernel messages:

o2net: connection to node uscosprdvrtxdb01 (num 0) at 192.168.5.1:7000 has been idle for 60.0 seconds, shutting it down.
(0,0):o2net_idle_timer:1476 here are some times that might help debug the situation: (tmr 1245143657.942607 now 1245143717.944198 dr 1245143657.942600 adv 1245143657.942608:1245143657.942609 func (5010bc9a:505) 1245128670.144972:1245128670.144981)
o2net: no longer connected to node uscosprdvrtxdb01 (num 0) at 192.168.5.1:7000
(28123,3):dlm_do_master_request:1330 ERROR: unhandled error!
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at ...mushran/BUILD/ocfs2-1.4.1/fs/ocfs2/dlm/dlmmaster.c:1331
invalid opcode: 0000 [1] SMP
last sysfs file: /devices/pci0000:00/0000:00:05.0/0000:10:00.0/0000:11:01.0/0000:14:00.0/0000:15:00.0/irq
CPU 3
Modules linked in: nfs lockd fscache nfs_acl mptctl mptbase ipmi_si(U) ipmi_devintf(U) ipmi_msghandler(U) autofs4 hidp l2cap bluetooth ocfs2(U) ocfs2_dlmfs(U) ocfs2_dlm(U) ocfs2_nodemanager(U) configfs sunrpc hp_ilo(U) bonding ipv6 xfrm_nalgo crypto_api emcpdm(PU) emcpgpx(PU) emcpmpx(PU) emcp(PU) dm_mirror dm_multipath dm_mod video sbs backlight i2c_ec i2c_core button battery asus_acpi acpi_memhotplug ac parport_pc lp parport i5000_edac edac_mc bnx2 sg serio_raw shpchp pcspkr usb_storage lpfc scsi_transport_fc cciss(U) sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 28123, comm: oracle Tainted: P 2.6.18-92.el5 #1
RIP: 0010:[<ffffffff88652f8a>]  [<ffffffff88652f8a>] :ocfs2_dlm:dlm_do_master_request+0x2f1/0x61c
RSP: 0018:ffff8102cf0bba38  EFLAGS: 00010286
RAX: 000000000000003f RBX: 00000000fffffe00 RCX: ffffffff802ec9a8
RDX: ffffffff802ec9a8 RSI: 0000000000000000 RDI: ffffffff802ec9a0
RBP: ffff8101b98d3e40 R08: ffffffff802ec9a8 R09: 0000000000000046
R10: 0000000000000000 R11: 0000000000000080 R12: 0000000000000000
R13: ffff810316df5c00 R14: ffff810316df5c00 R15: ffff8101bc0625c0
FS:  00002b806dfccc40(0000) GS:ffff81032ff24640(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000086229b8 CR3: 00000002cf119000 CR4: 00000000000006e0
Process oracle (pid: 28123, threadinfo ffff8102cf0ba000, task ffff8102e25e97e0)
Stack: 0000000000001f01 3030303030303057 3030303030303030 3435303061323030
       0061316437626364 0000000000000000 0000000000000000 0000000000000000
       0000000000000000 000000008865344a 0000000116df5c00 0000000000000000
Call Trace:
 [<ffffffff88658669>] :ocfs2_dlm:dlm_get_lock_resource+0xa5e/0x1913
 [<ffffffff8005be70>] cache_alloc_refill+0x106/0x186
 [<ffffffff8865dde5>] :ocfs2_dlm:dlm_wait_for_recovery+0xa1/0x116
 [<ffffffff88650c46>] :ocfs2_dlm:dlmlock+0x731/0x11f9
 [<ffffffff886a5ad0>] :ocfs2:ocfs2_cluster_unlock+0x240/0x2ad
 [<ffffffff80009523>] __d_lookup+0xb0/0xff
 [<ffffffff886a17d8>] :ocfs2:ocfs2_dentry_revalidate+0x111/0x259
 [<ffffffff886a69c1>] :ocfs2:ocfs2_init_mask_waiter+0x24/0x3d
 [<ffffffff8000cb46>] do_lookup+0x65/0x1d4
 [<ffffffff886a7e00>] :ocfs2:ocfs2_cluster_lock+0x354/0x7eb
 [<ffffffff886a9a5c>] :ocfs2:ocfs2_locking_ast+0x0/0x486
 [<ffffffff886acfd2>] :ocfs2:ocfs2_blocking_ast+0x0/0x2c1
 [<ffffffff801458b9>] snprintf+0x44/0x4c
 [<ffffffff886ac242>] :ocfs2:ocfs2_rw_lock+0x10f/0x1d6
 [<ffffffff886b0159>] :ocfs2:ocfs2_file_aio_read+0x128/0x394
 [<ffffffff886a75eb>] :ocfs2:ocfs2_add_lockres_tracking+0x73/0x81
 [<ffffffff8000caa4>] do_sync_read+0xc7/0x104
 [<ffffffff886aedcc>] :ocfs2:ocfs2_init_file_private+0x4d/0x5a
 [<ffffffff8001e35e>] __dentry_open+0x101/0x1dc
 [<ffffffff8009dde2>] autoremove_wake_function+0x0/0x2e
 [<ffffffff80027338>] do_filp_open+0x2a/0x38
 [<ffffffff8000b337>] vfs_read+0xcb/0x171
 [<ffffffff800130a3>] sys_pread64+0x50/0x70
 [<ffffffff8005d229>] tracesys+0x71/0xe0
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0

Code: 0f 0b 68 de 85 66 88 c2 33 05 48 b8 00 09 00 00 01 00 00 00
RIP  [<ffffffff88652f8a>] :ocfs2_dlm:dlm_do_master_request+0x2f1/0x61c
 RSP <ffff8102cf0bba38>

Saul J. Gabay
Sr. Linux Engineer
IT Infrastructure & Operations
Herbalife International Inc.
310-410-9600 x24341
saulg at herbalife.com
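As a side note on the o2net message above: the idle interval that tripped the timer can be checked directly from the `tmr`/`now` timestamps in the `(0,0):o2net_idle_timer:1476` log line. A minimal sketch (the variable names are mine, not from o2net):

```python
# Timestamps copied from the o2net_idle_timer debug line in the report.
# "tmr" is when the idle timer was last reset; "now" is when it fired.
tmr = 1245143657.942607
now = 1245143717.944198

idle_seconds = now - tmr
print(f"connection idle for {idle_seconds:.1f} seconds")
```

The gap is just over 60 seconds, which matches o2net's 60-second idle timeout and explains why the connection to node 0 was shut down before the DLM master request failed.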
Please file a bugzilla in oss.oracle.com/bugzilla.

Saul Gabay wrote:
> We have a 2 node OCFS2 cluster running Oracle 10g, both nodes crashed.
> [...]

_______________________________________________
Ocfs2-users mailing list
Ocfs2-users at oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users
I reported this incident as a new BUG #1130.

Please treat this as urgent; it is affecting our production environment repeatedly.

Let me know if more information is needed.

Thank you,
Saul

-----Original Message-----
From: Sunil Mushran [mailto:sunil.mushran at oracle.com]
Sent: Tuesday, June 16, 2009 11:58 AM
To: Saul Gabay
Cc: ocfs2-users at oss.oracle.com; Server Ops_Linux
Subject: Re: [Ocfs2-users] OCFS2 1.4.1 DLM unhandled error

Please file a bugzilla in oss.oracle.com/bugzilla.

Saul Gabay wrote:
> We have a 2 node OCFS2 cluster running Oracle 10g, both nodes crashed.
> [...]
The error was seen in the kdump vmcore while browsing the kernel message log; that output was included in the same BUG #1130.

-----Original Message-----
From: Saul Gabay
Sent: Tuesday, June 16, 2009 12:50 PM
To: Sunil Mushran
Cc: ocfs2-users at oss.oracle.com; Server Ops_Linux
Subject: RE: [Ocfs2-users] OCFS2 1.4.1 DLM unhandled error

I reported this incident as a new BUG #1130.

Please treat this as urgent; it is affecting our production environment repeatedly.

Let me know if more information is needed.

Thank you,
Saul

-----Original Message-----
From: Sunil Mushran [mailto:sunil.mushran at oracle.com]
Sent: Tuesday, June 16, 2009 11:58 AM
To: Saul Gabay
Cc: ocfs2-users at oss.oracle.com; Server Ops_Linux
Subject: Re: [Ocfs2-users] OCFS2 1.4.1 DLM unhandled error

Please file a bugzilla in oss.oracle.com/bugzilla.

Saul Gabay wrote:
> We have a 2 node OCFS2 cluster running Oracle 10g, both nodes crashed.
> [...]
Hello Saul,

Please log a Support Request via Metalink using your Oracle CSI.

Thanks,
Herbert

Saul Gabay wrote:
> I reported this incident as a new BUG #1130.
> [...]
Confirmation

________________________________

The following Service Request has been created:

SR Number: 7561997.993
Priority: 1
SR Submitted Date: 16-Jun-2009 17:38:17 GMT

This SR will be assigned to a support analyst during normal business hours in your country.

________________________________

From: Herbert van den Bergh [mailto:herbert.van.den.bergh at oracle.com]
Sent: Tuesday, June 16, 2009 2:43 PM
To: Saul Gabay
Cc: Sunil Mushran; Server Ops_Linux; ocfs2-users at oss.oracle.com
Subject: Re: [Ocfs2-users] OCFS2 1.4.1 DLM unhandled error

Hello Saul,

Please log a Support Request via Metalink using your Oracle CSI.

Thanks,
Herbert

Saul Gabay wrote:

I reported this incident as a new BUG #1130. Please treat this as urgent; it is affecting our production environment repeatedly. Let me know if more information is needed.

Thank you,
Saul

-----Original Message-----
From: Sunil Mushran [mailto:sunil.mushran at oracle.com]
Sent: Tuesday, June 16, 2009 11:58 AM
To: Saul Gabay
Cc: ocfs2-users at oss.oracle.com; Server Ops_Linux
Subject: Re: [Ocfs2-users] OCFS2 1.4.1 DLM unhandled error

Please file a bugzilla in oss.oracle.com/bugzilla.

Saul Gabay wrote:

We have a 2-node OCFS2 cluster running Oracle 10g; both nodes crashed. Node 1 panicked while running iostat; the second node crashed with the error message you can see below. I was hoping to see a newer version of OCFS2 so I could proceed with the upgrade if necessary. Has anyone seen or resolved this problem?
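For reference, the o2cb cluster timeouts in this era were configured via /etc/sysconfig/o2cb (normally written by `service o2cb configure` and read by the o2cb init script). A sketch of the relevant settings, assuming the 60 s idle timeout implied by the log above; the exact defaults differ between ocfs2-tools releases, so treat the numbers as illustrative rather than prescriptive:

```shell
# /etc/sysconfig/o2cb -- o2cb driver settings (illustrative values, not verbatim defaults)
O2CB_ENABLED=true                # load the o2cb stack at boot
O2CB_BOOTCLUSTER=ocfs2           # cluster name to start
O2CB_HEARTBEAT_THRESHOLD=31      # disk heartbeat: (31 - 1) * 2 s = 60 s before fencing
O2CB_IDLE_TIMEOUT_MS=60000       # network idle timeout; matches the "idle for 60.0 seconds" message
O2CB_KEEPALIVE_DELAY_MS=2000     # keepalive probe interval on the o2net socket
O2CB_RECONNECT_DELAY_MS=2000     # delay before attempting to reconnect
```

Raising the idle and heartbeat timeouts can mask transient interconnect stalls, but it does not address the unhandled-error BUG in dlm_do_master_request itself, which is why the thread points at bugzilla and a Metalink SR.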