Hello All,

I've been running a 4-node OCFS2 cluster for about a month now, and recently I've had a total of 3 kernel errors on random nodes. This causes the machine to lock up and a forced power cycle is necessary to bring the machine back to life.

Our nodes all run on Fedora 8 with OCFS2 installed via yum. 3 nodes run the 2.6.25.10-47.fc8PAE kernel and one runs on 2.6.25.10-47.fc8.

The following ocfs2 packages have been installed on all nodes:

[root@web2 ~]# rpm -qa | grep ocfs2
ocfs2-tools-1.3.9-7.20080221git.fc8
ocfs2-tools-devel-1.3.9-7.20080221git.fc8
ocfs2console-1.3.9-7.20080221git.fc8

Below you'll find the output of /var/log/messages related to this kernel error. The output is similar on every machine that had this kernel error, so I'll only attach one copy.

Aug 17 06:16:23 web2 kernel: (14494,4):ocfs2_unlock_ast:2616 ERROR: Dlm passes status 24 for lock M00000000000000000b52fa00000000, unlock_action 1
Aug 17 06:16:23 web2 kernel: (14494,4):dlmunlock:685 ERROR: dlm status DLM_BADPARAM
Aug 17 06:16:23 web2 kernel: (14494,4):ocfs2_cancel_convert:2913 ERROR: Dlm error "DLM_BADPARAM" while calling dlmunlock on resource M00000000000000000b52fa00000000: invalid lock mode specified
Aug 17 06:16:23 web2 kernel: (14494,4):ocfs2_unblock_lock:2948 ERROR: status = -22
Aug 17 06:16:23 web2 kernel: (14494,4):ocfs2_process_blocked_lock:3252 ERROR: status = -22
Aug 17 06:16:23 web2 kernel: (14494,4):ocfs2_prepare_downconvert:2818 ERROR: lockres->l_level (0) <= new_level (0)
Aug 17 06:16:23 web2 kernel: ------------[ cut here ]------------
Aug 17 06:16:23 web2 kernel: kernel BUG at fs/ocfs2/dlmglue.c:2819!
Aug 17 06:16:23 web2 kernel: invalid opcode: 0000 [#1] SMP
Aug 17 06:16:23 web2 kernel: Modules linked in: ocfs2 ocfs2_dlmfs ocfs2_dlm ocfs2_nodemanager configfs crc32c libcrc32c ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi scsi_transport_iscsi sit tunnel4 ipv6 iptable_filter ip_tables x_tables dm_mirror dm_multipath dm_mod iTCO_wdt iTCO_vendor_support sg pcspkr i2c_i801 i5000_edac button edac_core i2c_core e1000e ata_piix libata 3w_xxxx sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: configfs]
Aug 17 06:16:23 web2 kernel:
Aug 17 06:16:23 web2 kernel: Pid: 14494, comm: ocfs2dc Not tainted (2.6.25.10-47.fc8PAE #1)
Aug 17 06:16:23 web2 kernel: EIP: 0060:[<f8bf6fcd>] EFLAGS: 00010096 CPU: 4
Aug 17 06:16:23 web2 kernel: EIP is at ocfs2_prepare_downconvert+0x8c/0xb3 [ocfs2]
Aug 17 06:16:23 web2 kernel: EAX: 00000059 EBX: 00000000 ECX: c0718fe4 EDX: 00000000
Aug 17 06:16:23 web2 kernel: ESI: 00000000 EDI: ed03d1c8 EBP: e9089f8c ESP: e9089f84
Aug 17 06:16:23 web2 kernel:  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Aug 17 06:16:23 web2 kernel: Process ocfs2dc (pid: 14494, ti=e9089000 task=f4012e70 task.ti=e9089000)
Aug 17 06:16:23 web2 kernel: Stack: 00000000 00000293 e9089fd0 f8bfa801 f4911000 00000001 f49112f4 00000000
Aug 17 06:16:23 web2 kernel:        ed03d1d0 00000000 00000001 00000000 f4012e70 c04377a1 e9089fbc e9089fbc
Aug 17 06:16:23 web2 kernel:        f4911000 f8bfa4f2 00000000 e9089fe0 c04376cd c0437692 00000000 00000000
Aug 17 06:16:23 web2 kernel: Call Trace:
Aug 17 06:16:23 web2 kernel:  [<f8bfa801>] ? ocfs2_downconvert_thread+0x30f/0x4c1 [ocfs2]
Aug 17 06:16:23 web2 kernel:  [<c04377a1>] ? autoremove_wake_function+0x0/0x33
Aug 17 06:16:23 web2 kernel:  [<f8bfa4f2>] ? ocfs2_downconvert_thread+0x0/0x4c1 [ocfs2]
Aug 17 06:16:23 web2 kernel:  [<c04376cd>] ? kthread+0x3b/0x62
Aug 17 06:16:23 web2 kernel:  [<c0437692>] ? kthread+0x0/0x62
Aug 17 06:16:23 web2 kernel:  [<c04057af>] ? kernel_thread_helper+0x7/0x10
Aug 17 06:16:23 web2 kernel:  ======================
Aug 17 06:16:23 web2 kernel: Code: 68 02 0b 00 00 68 29 b0 c1 f8 64 a1 04 40 79 c0 64 8b 15 00 40 79 c0 50 ff b2 c0 01 00 00 68 bc e2 c1 f8 e8 53 08 83 c7 83 c4 1c <0f> 0b eb fe 89 99 a4 00 00 00 ba 02 00 00 00 89 c8 c7 81 9c 00
Aug 17 06:16:23 web2 kernel: EIP: [<f8bf6fcd>] ocfs2_prepare_downconvert+0x8c/0xb3 [ocfs2] SS:ESP 0068:e9089f84
Aug 17 06:16:23 web2 kernel: ---[ end trace 9b60c6d036d2ec42 ]---

Any help related to this issue is greatly appreciated.

Kind regards,

Wessel Sandkuyl

NB: this is a repost from yesterday, I noticed I submitted my previous message in HTML mark-up. My apologies for that.

Thanks. This looks like a new issue. Please log it in the bugzilla.
http://oss.oracle.com/bugzilla
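
When logging the bug, it may help to attach only the oops text rather than the whole of /var/log/messages. The following is a minimal Python sketch, assuming syslog lines shaped like the ones in the report above (a "cut here" marker opening the oops and an "end trace" marker closing it). The function name `extract_oops_blocks` and its `context` parameter are made up for illustration; they are not part of any OCFS2 or kernel tooling.

```python
import re
from collections import deque

# Markers taken from the log excerpt in the report above.
CUT_HERE = "------------[ cut here ]------------"
END_TRACE = re.compile(r"---\[ end trace [0-9a-f]+ \]---")


def extract_oops_blocks(lines, context=6):
    """Collect kernel oops blocks from an iterable of syslog lines.

    Each block starts up to `context` lines before the "cut here" marker
    (so the preceding ocfs2 ERROR lines are kept) and ends at the matching
    "end trace" line. A block with no "end trace" line is dropped.
    """
    blocks = []
    recent = deque(maxlen=context)  # rolling window of pre-oops lines
    current = None                  # lines of the block being collected

    for raw in lines:
        line = raw.rstrip("\n")
        if current is None:
            if CUT_HERE in line:
                current = list(recent) + [line]
                recent.clear()
            else:
                recent.append(line)
        else:
            current.append(line)
            if END_TRACE.search(line):
                blocks.append(current)
                current = None
    return blocks
```

It could be run on each affected node, for example by passing `open("/var/log/messages")` as `lines` and printing every returned block, so that the traces from all three crashes can be attached to the same report.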