thr3ads.net - Ocfs2 devel - [Ocfs2-devel] Kernel BUG in ocfs2_get_clusters

David Weber

2013-Oct-21 07:53 UTC

[Ocfs2-devel] Kernel BUG in ocfs2_get_clusters_nocache

Hi,

we ran into a BUG() in ocfs2_get_clusters_nocache:

[Fri Oct 18 10:52:28 2013] ------------[ cut here ]------------
[Fri Oct 18 10:52:28 2013] Kernel BUG at ffffffffa028ad5a [verbose debug info unavailable]
[Fri Oct 18 10:52:28 2013] invalid opcode: 0000 [#1] SMP
[Fri Oct 18 10:52:28 2013] Modules linked in: vhost_net vhost macvtap macvlan drbd ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables ocfs2_stack_o2cb rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd fscache sunrpc bridge stp llc w83795 coretemp kvm_intel kvm lru_cache dlm sctp libcrc32c ocfs2_dlm ocfs2_dlmfs ocfs2 ocfs2_stackglue ocfs2_nodemanager configfs quota_tree snd_pcm e1000e snd_page_alloc snd_timer ixgbe snd joydev hid_generic usbmouse usbkbd psmouse usbhid soundcore iTCO_wdt i7core_edac ioatdma gpio_ich hid ptp edac_core iTCO_vendor_support i2c_i801 pcspkr mac_hid lpc_ich serio_raw ses mdio enclosure pps_core dca [last unloaded: evbug]
[Fri Oct 18 10:52:28 2013] CPU: 3 PID: 16938 Comm: qemu-system-x86 Tainted: G W    3.11.4 #1
[Fri Oct 18 10:52:28 2013] Hardware name: Supermicro X8DT6/X8DT6, BIOS 2.0c    05/15/2012
[Fri Oct 18 10:52:28 2013] task: ffff880c69b62ee0 ti: ffff88130978e000 task.ti: ffff88130978e000
[Fri Oct 18 10:52:28 2013] RIP: 0010:[<ffffffffa028ad5a>]  [<ffffffffa028ad5a>] ocfs2_get_clusters_nocache.isra.11+0x4aa/0x530 [ocfs2]
[Fri Oct 18 10:52:28 2013] RSP: 0018:ffff88130978f708  EFLAGS: 00010297
[Fri Oct 18 10:52:28 2013] RAX: 00000000000000fa RBX: 0000000000000000 RCX: 000000000012cbd4
[Fri Oct 18 10:52:28 2013] RDX: ffff880868180fe0 RSI: 000000000012cbd3 RDI: ffff880868180030
[Fri Oct 18 10:52:28 2013] RBP: ffff88130978f788 R08: 000000000012cbd4 R09: 00000000000000fc
[Fri Oct 18 10:52:28 2013] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88130978f7c8
[Fri Oct 18 10:52:28 2013] R13: ffff880868180030 R14: ffff88176cc7a000 R15: 0000000000000000
[Fri Oct 18 10:52:28 2013] FS:  00007f32c4ff9700(0000) GS:ffff8817dfc60000(0000) knlGS:0000000000000000
[Fri Oct 18 10:52:28 2013] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[Fri Oct 18 10:52:28 2013] CR2: 00007f34f4074000 CR3: 0000002c5d211000 CR4: 00000000000027e0
[Fri Oct 18 10:52:28 2013] DR0: 0000000000000001 DR1: 0000000000000002 DR2: 0000000000000001
[Fri Oct 18 10:52:28 2013] DR3: 000000000000000a DR6: 00000000ffff0ff0 DR7: 0000000000000400
[Fri Oct 18 10:52:28 2013] Stack:
[Fri Oct 18 10:52:28 2013]  ffff881300000000 0000000000000000 ffff88130978f7e4 ffff880868180000
[Fri Oct 18 10:52:28 2013]  ffff882fb66ded80 0012cbd300000001 ffff88130978f8d4 ffff8808ef23f270
[Fri Oct 18 10:52:28 2013]  ffff88130978f778 ffffffffa02969fb ffff8817dfc545b0 0000000000000000
[Fri Oct 18 10:52:28 2013] Call Trace:
[Fri Oct 18 10:52:28 2013]  [<ffffffffa02969fb>] ? ocfs2_read_inode_block_full+0x3b/0x60 [ocfs2]
[Fri Oct 18 10:52:28 2013]  [<ffffffffa028b2be>] ocfs2_get_clusters+0x23e/0x3b0 [ocfs2]
[Fri Oct 18 10:52:28 2013]  [<ffffffff8109a9ad>] ? sched_clock_cpu+0xbd/0x110
[Fri Oct 18 10:52:28 2013]  [<ffffffffa028b48a>] ocfs2_extent_map_get_blocks+0x5a/0x190 [ocfs2]
[Fri Oct 18 10:52:28 2013]  [<ffffffffa026eb3a>] ocfs2_direct_IO_get_blocks+0x5a/0x160 [ocfs2]
[Fri Oct 18 10:52:28 2013]  [<ffffffff811c87c1>] ? inode_dio_done+0x31/0x40
[Fri Oct 18 10:52:28 2013]  [<ffffffff811ea90c>] do_blockdev_direct_IO+0xdfc/0x1fb0
[Fri Oct 18 10:52:28 2013]  [<ffffffffa026eae0>] ? ocfs2_dio_end_io+0x110/0x110 [ocfs2]
[Fri Oct 18 10:52:28 2013]  [<ffffffff811ebb15>] __blockdev_direct_IO+0x55/0x60
[Fri Oct 18 10:52:28 2013]  [<ffffffffa026eae0>] ? ocfs2_dio_end_io+0x110/0x110 [ocfs2]
[Fri Oct 18 10:52:28 2013]  [<ffffffffa026e9d0>] ? ocfs2_direct_IO+0x80/0x80 [ocfs2]
[Fri Oct 18 10:52:28 2013]  [<ffffffffa026e9c3>] ocfs2_direct_IO+0x73/0x80 [ocfs2]
[Fri Oct 18 10:52:28 2013]  [<ffffffffa026eae0>] ? ocfs2_dio_end_io+0x110/0x110 [ocfs2]
[Fri Oct 18 10:52:28 2013]  [<ffffffffa026e9d0>] ? ocfs2_direct_IO+0x80/0x80 [ocfs2]
[Fri Oct 18 10:52:28 2013]  [<ffffffff81146e2b>] generic_file_aio_read+0x6bb/0x720
[Fri Oct 18 10:52:28 2013]  [<ffffffff8172168e>] ? _raw_spin_lock+0xe/0x20
[Fri Oct 18 10:52:28 2013]  [<ffffffffa02843db>] ? __ocfs2_cluster_unlock.isra.32+0x9b/0xe0 [ocfs2]
[Fri Oct 18 10:52:28 2013]  [<ffffffffa02847a9>] ? ocfs2_inode_unlock+0xb9/0x130 [ocfs2]
[Fri Oct 18 10:52:28 2013]  [<ffffffffa028dcf9>] ocfs2_file_aio_read+0xd9/0x3c0 [ocfs2]
[Fri Oct 18 10:52:28 2013]  [<ffffffff811ae425>] do_sync_readv_writev+0x65/0x90
[Fri Oct 18 10:52:28 2013]  [<ffffffff811afba2>] do_readv_writev+0xd2/0x2b0
[Fri Oct 18 10:52:28 2013]  [<ffffffff811eeda2>] ? fsnotify+0x1d2/0x2b0
[Fri Oct 18 10:52:28 2013]  [<ffffffff811ae500>] ? do_sync_write+0xb0/0xb0
[Fri Oct 18 10:52:28 2013]  [<ffffffff811f8886>] ? eventfd_write+0x1a6/0x210
[Fri Oct 18 10:52:28 2013]  [<ffffffff811afe09>] vfs_readv+0x39/0x50
[Fri Oct 18 10:52:28 2013]  [<ffffffff811b0062>] SyS_preadv+0xc2/0xd0
[Fri Oct 18 10:52:28 2013]  [<ffffffff8172a59d>] system_call_fastpath+0x1a/0x1f
[Fri Oct 18 10:52:28 2013] Code: b9 00 02 00 00 49 c7 c0 f0 8d 2f a0 48 c7 c7 b8 28 30 a0 e8 82 b1 48 e1 e9 07 fd ff ff 0f 1f 40 00 bb 01 00 00 00 e9 68 fe ff ff <0f> 0b 48 8b 55 a0 48 c7 c6 10 8e 2f a0 bb e2 ff ff ff 4c 8b 47
[Fri Oct 18 10:52:28 2013] RIP  [<ffffffffa028ad5a>] ocfs2_get_clusters_nocache.isra.11+0x4aa/0x530 [ocfs2]
[Fri Oct 18 10:52:28 2013]  RSP <ffff88130978f708>
[Fri Oct 18 10:52:28 2013] ---[ end trace 1831bd3aefe19b02 ]---

https://gist.github.com/David-Weber/f3072dd5c44a6ce593b6

(gdb) list *(ocfs2_get_clusters_nocache+0x4aa)
0xa6a is in ocfs2_get_clusters_nocache (fs/ocfs2/extent_map.c:475).
470                     goto out_hole;
471             }
472
473             rec = &el->l_recs[i];
474
475             BUG_ON(v_cluster < le32_to_cpu(rec->e_cpos));
476
477             if (!rec->e_blkno) {
478                     ocfs2_error(inode->i_sb, "Inode %lu has bad extent "
479                                 "record (%u, %u, 0)", inode->i_ino,

This has now happened for the second time, but I don't have a reproducer.
The machine is a KVM host with a dual-primary DRBD/OCFS2 setup.
The kernel is 3.11.4.

Thanks!

Cheers,
David

Goldwyn Rodrigues

2013-Oct-23 12:09 UTC

[Ocfs2-devel] Kernel BUG in ocfs2_get_clusters_nocache

Hi David,

On 10/21/2013 02:53 AM, David Weber wrote:
> we ran into a BUG() in ocfs2_get_clusters_nocache:
>
> [...]
>
> (gdb) list *(ocfs2_get_clusters_nocache+0x4aa)
> 0xa6a is in ocfs2_get_clusters_nocache (fs/ocfs2/extent_map.c:475).
> 475             BUG_ON(v_cluster < le32_to_cpu(rec->e_cpos));
>
> This has now happened for the second time, but I don't have a reproducer.
> The machine is a KVM host with a dual-primary DRBD/OCFS2 setup.
> The kernel is 3.11.4.
>
It seems your on-disk data structures are corrupted. Have you run
fsck.ocfs2 yet? If so, which errors did it fix?


-- 
Goldwyn
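
For readers following the thread, here is a minimal user-space sketch of the kind of check that fired at extent_map.c:475. It is not the kernel code: the struct and the functions `search_extent_list` and `check_record` are hypothetical simplifications, and the fields only loosely mirror `e_cpos` and `e_blkno` from the listing above. The sketch assumes the lookup first searches the leaf extent list for a record covering `v_cluster` and then validates the record it picked.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an on-disk extent record. */
struct extent_rec {
    uint32_t cpos;     /* first logical cluster covered (cf. e_cpos) */
    uint32_t clusters; /* number of clusters covered */
    uint64_t blkno;    /* physical start block; 0 is treated as invalid */
};

enum rec_status {
    REC_OK,           /* record is usable for v_cluster */
    REC_ASSERT_FAILS, /* v_cluster < cpos: where the listing has BUG_ON */
    REC_BAD_BLKNO     /* blkno == 0: where the listing calls ocfs2_error */
};

/* Return the index of the record covering v_cluster, or -1 for a hole. */
static int search_extent_list(const struct extent_rec *recs, size_t n,
                              uint32_t v_cluster)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t start = recs[i].cpos;
        uint32_t end = start + recs[i].clusters;

        if (v_cluster >= start && v_cluster < end)
            return (int)i;
    }
    return -1;
}

/* Validate a chosen record in the same order as the gdb listing. */
static enum rec_status check_record(const struct extent_rec *rec,
                                    uint32_t v_cluster)
{
    if (v_cluster < rec->cpos)
        return REC_ASSERT_FAILS;
    if (rec->blkno == 0)
        return REC_BAD_BLKNO;
    return REC_OK;
}
```

In this model a record returned by the search always satisfies `v_cluster >= cpos`, so the first check can only fail if the record seen by the check differs from the one the search matched. That is one reason an inconsistent extent list is a plausible suspect here, although the sketch cannot tell on-disk corruption apart from an in-memory problem.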
