andre at digirati.com.br
2010-Oct-21 14:00 UTC
[Ocfs2-users] ocfs2_delete_inode kernel bug
Hello

I've started to have high load average problems with ocfs2 partitions used for backup. I have an "umount" process which is apparently frozen. The following messages appeared in dmesg.

[1660565.655199] (3215,3):ocfs2_query_inode_wipe:902 ERROR: Inode 16656380 (on-disk 16656380) not orphaned! Disk flags 0x9, inode flags 0x0
[1660565.689191] (3215,3):ocfs2_delete_inode:1030 ERROR: status = -17
[1660752.601603] (3215,10):ocfs2_query_inode_wipe:902 ERROR: Inode 16664401 (on-disk 16664401) not orphaned! Disk flags 0x9, inode flags 0x20
[1660752.635823] (3215,10):ocfs2_delete_inode:1030 ERROR: status = -17
[1660752.652998] (3215,10):ocfs2_query_inode_wipe:902 ERROR: Inode 16656380 (on-disk 16656380) not orphaned! Disk flags 0x9, inode flags 0x20
[1660752.686803] (3215,10):ocfs2_delete_inode:1030 ERROR: status = -17
[1660752.703880] ------------[ cut here ]------------
[1660752.720938] kernel BUG at /build/buildd/linux-2.6.32/fs/inode.c:1343!
[1660752.737933] invalid opcode: 0000 [#1] SMP
[1660752.754710] last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map
[1660752.787687] CPU 10
[1660752.803949] Modules linked in: dummy iptable_filter ip_tables x_tables ocfs2 quota_tree ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs drbd fbcon tileblit font bitblit softcursor aoe bonding joydev vga16fb vgastate lp ioatdma shpchp ixgbe mdio parport raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov usbhid hid megaraid_sas e1000e igb dca raid6_pq async_tx raid1 raid0 multipath linear [last unloaded: dummy]
[1660752.821155] Pid: 3215, comm: ocfs2_wq Not tainted 2.6.32-25-server #44-Ubuntu S5520HC
[1660752.821157] RIP: 0010:[<ffffffff8115a959>] [<ffffffff8115a959>] iput+0x69/0x70
[1660752.821165] RSP: 0018:ffff88045db49d10 EFLAGS: 00010246
[1660752.821167] RAX: 0000000000000020 RBX: ffff88007edf31c8 RCX: 0000000000000002
[1660752.821170] RDX: 0000000000000002 RSI: ffffffff81a1e680 RDI: ffff88007edf31c8
[1660752.821172] RBP: ffff88045db49d20 R08: 00000000000000eb R09: ff6320ebd7b0f807
[1660752.821174] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88007edf3120
[1660752.821176] R13: ffff880433b5c0c8 R14: 0000000000000000 R15: ffff88045d9d0000
[1660752.821179] FS: 0000000000000000(0000) GS:ffff88000aca0000(0000) knlGS:0000000000000000
[1660752.821182] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[1660752.821184] CR2: 00007fff0fd45d5c CR3: 0000000001001000 CR4: 00000000000006e0
[1660752.821186] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[1660752.821189] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[1660752.821191] Process ocfs2_wq (pid: 3215, threadinfo ffff88045db48000, task ffff88045d9d0000)
[1660752.821193] Stack:
[1660752.821194] ffff88045db49d20 ffff88007edf2d00 ffff88045db49d70 ffffffffa02b6a6c
[1660752.821198] <0> 0000000000000000 ffff88007edf31c8 ffff88045db49da0 ffff8803d58dd080
[1660752.821201] <0> ffff8803d58dd800 0000000000000000 ffff880457ed6000 ffff88045db49dc0
[1660752.821204] Call Trace:
[1660752.821227] [<ffffffffa02b6a6c>] ocfs2_recover_orphans+0xec/0x200 [ocfs2]
[1660752.821244] [<ffffffffa02b6d6f>] ocfs2_complete_recovery+0x1ef/0x610 [ocfs2]
[1660752.821261] [<ffffffffa02b6b80>] ? ocfs2_complete_recovery+0x0/0x610 [ocfs2]
[1660752.821269] [<ffffffff8107f6b7>] run_workqueue+0xc7/0x1a0
[1660752.821273] [<ffffffff8107f833>] worker_thread+0xa3/0x110
[1660752.821277] [<ffffffff81084250>] ? autoremove_wake_function+0x0/0x40
[1660752.821280] [<ffffffff8107f790>] ? worker_thread+0x0/0x110
[1660752.821284] [<ffffffff81083ed6>] kthread+0x96/0xa0
[1660752.821288] [<ffffffff810131ea>] child_rip+0xa/0x20
[1660752.821291] [<ffffffff81083e40>] ? kthread+0x0/0xa0
[1660752.821294] [<ffffffff810131e0>] ? child_rip+0x0/0x20
[1660752.821295] Code: 38 48 c7 c0 60 bf 15 81 48 85 d2 74 12 48 8b 42 20 48 c7 c2 60 bf 15 81 48 85 c0 48 0f 44 c2 48 89 df ff d0 48 83 c4 08 5b c9 c3 <0f> 0b eb fe 0f 1f 00 55 48 89 e5 41 55 41 54 53 48 83 ec 08 0f
[1660752.821316] RIP [<ffffffff8115a959>] iput+0x69/0x70
[1660752.821319] RSP <ffff88045db49d10>
[1660752.839259] ---[ end trace 96d9a8e7fe30625b ]---

The kernel version is 2.6.35.

Is there anything that can be done regarding this? If there's more information I can provide to help find the problem, I'll gladly provide it.

Thanks in advance,
Andre
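For readers triaging similar output, here is a minimal sketch (not part of the original message) that pulls the affected inode numbers and the status code out of the ocfs2 error lines and maps the code to its errno name. It assumes the script runs on Linux, where Python's errno table matches the kernel's numbering; the helper name summarize and the regular expressions are illustrative only and are tuned to the exact line format shown in this thread.

```python
import errno
import re

# Sample lines copied from the dmesg output in this thread.
LOG = """\
[1660565.655199] (3215,3):ocfs2_query_inode_wipe:902 ERROR: Inode 16656380 (on-disk 16656380) not orphaned! Disk flags 0x9, inode flags 0x0
[1660565.689191] (3215,3):ocfs2_delete_inode:1030 ERROR: status = -17
[1660752.601603] (3215,10):ocfs2_query_inode_wipe:902 ERROR: Inode 16664401 (on-disk 16664401) not orphaned! Disk flags 0x9, inode flags 0x20
[1660752.635823] (3215,10):ocfs2_delete_inode:1030 ERROR: status = -17
"""

# Hypothetical patterns, written against the line format above only.
INODE_RE = re.compile(r"ERROR: Inode (\d+) \(on-disk \d+\) not orphaned!")
STATUS_RE = re.compile(r"ERROR: status = -(\d+)")


def summarize(log):
    """Return (sorted inode numbers, sorted errno names) found in the log."""
    inodes = {int(m.group(1)) for m in INODE_RE.finditer(log)}
    codes = {int(m.group(1)) for m in STATUS_RE.finditer(log)}
    # errno.errorcode maps a positive errno value to its symbolic name.
    names = sorted(errno.errorcode.get(c, "unknown") for c in codes)
    return sorted(inodes), names


inodes, names = summarize(LOG)
print("inodes reported as not orphaned:", inodes)
print("status codes:", names)
```

On Linux, errno 17 is EEXIST, so the "status = -17" lines correspond to -EEXIST being returned after the "not orphaned" check fails.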
This does not look like a 2.6.35 kernel. The stack trace says 2.6.32, but I could not match it to a stable 2.6.32 either. It looks more like 2.6.32 plus patches. It is hard to diagnose a problem without the source.

That said, the first issue (-17) is a known one that was fixed in 2.6.34.

commit 3939fda4b389993caf8741df5739b3e49f33a263
Author: Tristan Ye <tristan.ye at oracle.com>
Date: Fri Mar 19 09:21:09 2010 +0800

    Ocfs2: Journaling i_flags and i_orphaned_slot when adding inode to orphan dir.

I cannot comment on the oops, as I could not match it to any kernel.

On 10/21/2010 07:00 AM, andre at digirati.com.br wrote:
> Hello
>
> I've started to have high load average problems with ocfs2 partitions used for backup. I have an "umount" process which is apparently frozen. The following messages appeared in dmesg.
>
> [1660565.655199] (3215,3):ocfs2_query_inode_wipe:902 ERROR: Inode 16656380 (on-disk 16656380) not orphaned! Disk flags 0x9, inode flags 0x0
> [1660565.689191] (3215,3):ocfs2_delete_inode:1030 ERROR: status = -17
> [1660752.601603] (3215,10):ocfs2_query_inode_wipe:902 ERROR: Inode 16664401 (on-disk 16664401) not orphaned! Disk flags 0x9, inode flags 0x20
> [1660752.635823] (3215,10):ocfs2_delete_inode:1030 ERROR: status = -17
> [1660752.652998] (3215,10):ocfs2_query_inode_wipe:902 ERROR: Inode 16656380 (on-disk 16656380) not orphaned! Disk flags 0x9, inode flags 0x20
> [1660752.686803] (3215,10):ocfs2_delete_inode:1030 ERROR: status = -17
> [1660752.703880] ------------[ cut here ]------------
> [1660752.720938] kernel BUG at /build/buildd/linux-2.6.32/fs/inode.c:1343!
> [1660752.737933] invalid opcode: 0000 [#1] SMP
> [1660752.754710] last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map
> [1660752.787687] CPU 10
> [1660752.803949] Modules linked in: dummy iptable_filter ip_tables x_tables ocfs2 quota_tree ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs drbd fbcon tileblit font bitblit softcursor aoe bonding joydev vga16fb vgastate lp ioatdma shpchp ixgbe mdio parport raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov usbhid hid megaraid_sas e1000e igb dca raid6_pq async_tx raid1 raid0 multipath linear [last unloaded: dummy]
> [1660752.821155] Pid: 3215, comm: ocfs2_wq Not tainted 2.6.32-25-server #44-Ubuntu S5520HC
> [1660752.821157] RIP: 0010:[<ffffffff8115a959>] [<ffffffff8115a959>] iput+0x69/0x70
> [1660752.821165] RSP: 0018:ffff88045db49d10 EFLAGS: 00010246
> [1660752.821167] RAX: 0000000000000020 RBX: ffff88007edf31c8 RCX: 0000000000000002
> [1660752.821170] RDX: 0000000000000002 RSI: ffffffff81a1e680 RDI: ffff88007edf31c8
> [1660752.821172] RBP: ffff88045db49d20 R08: 00000000000000eb R09: ff6320ebd7b0f807
> [1660752.821174] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88007edf3120
> [1660752.821176] R13: ffff880433b5c0c8 R14: 0000000000000000 R15: ffff88045d9d0000
> [1660752.821179] FS: 0000000000000000(0000) GS:ffff88000aca0000(0000) knlGS:0000000000000000
> [1660752.821182] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> [1660752.821184] CR2: 00007fff0fd45d5c CR3: 0000000001001000 CR4: 00000000000006e0
> [1660752.821186] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [1660752.821189] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [1660752.821191] Process ocfs2_wq (pid: 3215, threadinfo ffff88045db48000, task ffff88045d9d0000)
> [1660752.821193] Stack:
> [1660752.821194] ffff88045db49d20 ffff88007edf2d00 ffff88045db49d70 ffffffffa02b6a6c
> [1660752.821198] <0> 0000000000000000 ffff88007edf31c8 ffff88045db49da0 ffff8803d58dd080
> [1660752.821201] <0> ffff8803d58dd800 0000000000000000 ffff880457ed6000 ffff88045db49dc0
> [1660752.821204] Call Trace:
> [1660752.821227] [<ffffffffa02b6a6c>] ocfs2_recover_orphans+0xec/0x200 [ocfs2]
> [1660752.821244] [<ffffffffa02b6d6f>] ocfs2_complete_recovery+0x1ef/0x610 [ocfs2]
> [1660752.821261] [<ffffffffa02b6b80>] ? ocfs2_complete_recovery+0x0/0x610 [ocfs2]
> [1660752.821269] [<ffffffff8107f6b7>] run_workqueue+0xc7/0x1a0
> [1660752.821273] [<ffffffff8107f833>] worker_thread+0xa3/0x110
> [1660752.821277] [<ffffffff81084250>] ? autoremove_wake_function+0x0/0x40
> [1660752.821280] [<ffffffff8107f790>] ? worker_thread+0x0/0x110
> [1660752.821284] [<ffffffff81083ed6>] kthread+0x96/0xa0
> [1660752.821288] [<ffffffff810131ea>] child_rip+0xa/0x20
> [1660752.821291] [<ffffffff81083e40>] ? kthread+0x0/0xa0
> [1660752.821294] [<ffffffff810131e0>] ? child_rip+0x0/0x20
> [1660752.821295] Code: 38 48 c7 c0 60 bf 15 81 48 85 d2 74 12 48 8b 42 20 48 c7 c2 60 bf 15 81 48 85 c0 48 0f 44 c2 48 89 df ff d0 48 83 c4 08 5b c9 c3 <0f> 0b eb fe 0f 1f 00 55 48 89 e5 41 55 41 54 53 48 83 ec 08 0f
> [1660752.821316] RIP [<ffffffff8115a959>] iput+0x69/0x70
> [1660752.821319] RSP <ffff88045db49d10>
> [1660752.839259] ---[ end trace 96d9a8e7fe30625b ]---
>
> The kernel version is 2.6.35.
>
> Is there anything that can be done regarding this? If there's more information I can provide to help find the problem, I'll gladly provide it.
>
> Thanks in advance,
> Andre
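The version mismatch pointed out in this reply can be checked mechanically. The following is a rough sketch, not taken from the thread: it reads the release string from the "Pid:" line and the source tree version from the "kernel BUG at" path, then compares them with the version the reporter stated. The function names and regular expressions are hypothetical and assume the oops layout shown in this thread; other kernel versions print these lines differently.

```python
import re

# Two lines copied from the oops in this thread.
OOPS = """\
[1660752.720938] kernel BUG at /build/buildd/linux-2.6.32/fs/inode.c:1343!
[1660752.821155] Pid: 3215, comm: ocfs2_wq Not tainted 2.6.32-25-server #44-Ubuntu S5520HC
"""

# Hypothetical patterns for this oops layout only.
RELEASE_RE = re.compile(r"Pid: \d+, comm: \S+ .*?(\d+\.\d+\.\d+\S*) #")
BUILD_PATH_RE = re.compile(r"kernel BUG at \S*/linux-([\d.]+)/")


def base_version(release):
    """Reduce a release string such as '2.6.32-25-server' to (2, 6, 32)."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError(f"unrecognised release string: {release!r}")
    return tuple(int(part) for part in m.groups())


def versions_in_oops(oops):
    """Return (running release, build tree version); None where not found."""
    release = RELEASE_RE.search(oops)
    build = BUILD_PATH_RE.search(oops)
    return (release.group(1) if release else None,
            build.group(1) if build else None)


claimed = "2.6.35"  # the version stated in the original report
release, build = versions_in_oops(OOPS)
print("release in oops:", release)
print("build tree in oops:", build)
if base_version(release) != base_version(claimed):
    print("mismatch: report says", claimed, "but the oops shows", release)
```

On the affected machine itself, the output of uname -r would be the more direct source for the running release; the parsing above is only useful when all that is available is a pasted log.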
andre at digirati.com.br
2010-Oct-28 14:09 UTC
[Ocfs2-users] ocfs2_delete_inode kernel bug
On 28/10/2010, Joel Becker <Joel.Becker at oracle.com> wrote:
> Are you having more than one machine access the same disk
> without being in the same cluster? I would hope not, but something is
> weird here.

No, only the two machines in the cluster access the disks.

Best,
Andre