Hi,

I got this kernel BUG on a server running multiple Ceph
cosd instances, during a heavy write load generated by
multiple Ceph clients.

The server was running the current ceph unstable kernel
(a3f5274e535 in git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git).

Please let me know what other information you need to
make this report useful.

-- Jim

BUG: unable to handle kernel NULL pointer dereference at 0000000000000100
[97221.834832] IP: [<ffffffffa075b3ab>] btrfs_drop_inode+0x10/0x36 [btrfs]
[97221.834832] PGD 198d6b067 PUD 13d79f067 PMD 0
[97221.834832] Oops: 0000 [#1] SMP
[97221.834832] last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:10:0d.0/local_cpus
[97221.834832] CPU 3
[97221.834832] Modules linked in: loop btrfs zlib_deflate ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge ]
[97221.834832]
[97221.834832] Pid: 30295, comm: cosd Not tainted 2.6.37-00017-ga3f5274 #4 0DT097/PowerEdge 1950
[97221.834832] RIP: 0010:[<ffffffffa075b3ab>] [<ffffffffa075b3ab>] btrfs_drop_inode+0x10/0x36 [btrfs]
[97221.834832] RSP: 0018:ffff8801cf205c08 EFLAGS: 00010282
[97221.834832] RAX: ffffffffa075b39b RBX: ffff88018490a3a0 RCX: 0000000000000001
[97221.834832] RDX: 0000000000000000 RSI: ffffffff819e7ea0 RDI: ffff88018490a3a0
[97221.834832] RBP: ffff8801cf205c08 R08: ffffe8ffffccefa8 R09: 0000000000000000
[97221.834832] R10: ffff8801488e9658 R11: 0000000000000000 R12: ffff88021b5c6400
[97221.834832] R13: ffff8801fad145a0 R14: ffff8801faf8c440 R15: ffff88017bab9848
[97221.834832] FS: 00007f0b011f9940(0000) GS:ffff8800cfcc0000(0000) knlGS:0000000000000000
[97221.834832] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[97221.834832] CR2: 0000000000000100 CR3: 00000001b8c89000 CR4: 00000000000006e0
[97221.834832] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[97221.834832] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[97221.834832] Process cosd (pid: 30295, threadinfo ffff8801cf204000, task ffff8801488e9610)
[97221.834832] Stack:
[97221.834832] ffff8801cf205c28 ffffffff810fd714 00000000fffffffb fffffffffffffffb
[97221.834832] ffff8801cf205cd8 ffffffffa07587e8 ffff8801cf205c48 0000000000000102
[97221.834832] 0000000fcf205c58 ffff88022f5c46a0 ffff8801d1ef8800 ffffffff8136a638
[97221.834832] Call Trace:
[97221.834832] [<ffffffff810fd714>] iput+0x5c/0x1e0
[97221.834832] [<ffffffffa07587e8>] btrfs_new_inode+0x2d3/0x2e5 [btrfs]
[97221.834832] [<ffffffff8136a638>] ? _cond_resched+0xe/0x22
[97221.834832] [<ffffffff8136ae20>] ? mutex_lock+0x16/0x3a
[97221.834832] [<ffffffffa0756da1>] ? start_transaction+0x176/0x1bc [btrfs]
[97221.834832] [<ffffffffa075d1fc>] btrfs_create+0xbb/0x1fa [btrfs]
[97221.834832] [<ffffffff810f49e2>] vfs_create+0x76/0x96
[97221.834832] [<ffffffff810f56af>] do_last+0x24d/0x4d3
[97221.834832] [<ffffffff810f5b16>] do_filp_open+0x1e1/0x4c5
[97221.834832] [<ffffffff81031061>] ? should_resched+0xe/0x2f
[97221.834832] [<ffffffff8136a638>] ? _cond_resched+0xe/0x22
[97221.834832] [<ffffffff811aa669>] ? might_fault+0xe/0x10
[97221.834832] [<ffffffff811aa753>] ? __strncpy_from_user+0x20/0x4a
[97221.834832] [<ffffffff810e9023>] do_sys_open+0x62/0xeb
[97221.834832] [<ffffffff810e90df>] sys_open+0x20/0x22
[97221.834832] [<ffffffff81002c2b>] system_call_fastpath+0x16/0x1b
[97221.834832] Code: 53 fc 94 e0 4c 89 e7 e8 f6 8a 95 e0 48 83 c4 18 5b 41 5c 41 5d 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 48 8b 97 68 fe ff ff <83> ba 00 01 00 00 00 75 12 48 8b 82 28 01 00 00 b9 01 00
[97221.834832] RIP [<ffffffffa075b3ab>] btrfs_drop_inode+0x10/0x36 [btrfs]
[97221.834832] RSP <ffff8801cf205c08>
[97221.834832] CR2: 0000000000000100
[97222.207152] ---[ end trace 32eb8bbbb4782eb8 ]---
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi,

On Wed, 2011-01-26 at 10:59 -0700, Jim Schutt wrote:
> Hi,
>
> I got this kernel BUG on a server running multiple Ceph
> cosd instances, during a heavy write load generated by
> multiple Ceph clients.
>
> The server was running the current ceph unstable kernel
> (a3f5274e535 in git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git).
>
> Please let me know what other information you need to
> make this report useful.
>
> -- Jim
>

Here's another example.

Again, please let me know what other information you need to
make this report useful.

-- Jim

[11199.532483] ------------[ cut here ]------------
[11199.536292] kernel BUG at fs/btrfs/extent-tree.c:2198!
[11199.536292] invalid opcode: 0000 [#1] SMP
[11199.536292] last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
[11199.536292] CPU 3
[11199.536292] Modules linked in: loop btrfs zlib_deflate ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge ]
[11199.536292]
[11199.536292] Pid: 1664, comm: cosd Not tainted 2.6.37-00017-ga3f5274 #4 0DT097/PowerEdge 1950
[11199.536292] RIP: 0010:[<ffffffffa0774081>] [<ffffffffa0774081>] run_clustered_refs+0x71e/0x76b [btrfs]
[11199.536292] RSP: 0018:ffff8801c90abb58 EFLAGS: 00010282
[11199.536292] RAX: 00000000fffffffb RBX: 0000000000000000 RCX: ffff8802262c5000
[11199.536292] RDX: ffff88017921e2d0 RSI: ffffea000527f690 RDI: 0000000000000001
[11199.536292] RBP: ffff8801c90abc28 R08: ffffe8ffffccefe8 R09: 0000000000000000
[11199.536292] R10: 0000000000000003 R11: ffff880227549e98 R12: ffff880140bb8f00
[11199.536292] R13: 0000000000000000 R14: ffff880181eff378 R15: ffff8802262c5000
[11199.536292] FS: 00007f5e680fc940(0000) GS:ffff8800cfcc0000(0000) knlGS:0000000000000000
[11199.536292] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[11199.536292] CR2: 00007f0e1a476260 CR3: 0000000173aa0000 CR4: 00000000000006e0
[11199.536292] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[11199.536292] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[11199.536292] Process cosd (pid: 1664, threadinfo ffff8801c90aa000, task ffff8801df12d840)
[11199.536292] Stack:
[11199.536292] 0000000000000000 0000000000000000 0000000000000001 0000000000000000
[11199.536292] ffff8801c90abc48 ffff8802262c5000 ffff8801e0a9c600 ffff880181eff378
[11199.536292] 0000000000000000 0000002600000206 ffff880181eff380 000000007921e750
[11199.536292] Call Trace:
[11199.536292] [<ffffffffa0785be0>] ? btrfs_update_inode+0xc3/0xd3 [btrfs]
[11199.536292] [<ffffffffa07741bc>] btrfs_run_delayed_refs+0xee/0x15e [btrfs]
[11199.536292] [<ffffffff810fa54d>] ? __fsnotify_update_dcache_flags+0x22/0x56
[11199.536292] [<ffffffffa07801d0>] __btrfs_end_transaction+0x6d/0x1e3 [btrfs]
[11199.536292] [<ffffffffa0780372>] btrfs_end_transaction_throttle+0x18/0x1a [btrfs]
[11199.536292] [<ffffffffa07872e1>] btrfs_create+0x1a0/0x1fa [btrfs]
[11199.536292] [<ffffffff810f49e2>] vfs_create+0x76/0x96
[11199.536292] [<ffffffff810f56af>] do_last+0x24d/0x4d3
[11199.536292] [<ffffffff810f5b16>] do_filp_open+0x1e1/0x4c5
[11199.536292] [<ffffffff81031061>] ? should_resched+0xe/0x2f
[11199.536292] [<ffffffff8136a638>] ? _cond_resched+0xe/0x22
[11199.536292] [<ffffffff811aa669>] ? might_fault+0xe/0x10
[11199.536292] [<ffffffff811aa753>] ? __strncpy_from_user+0x20/0x4a
[11199.536292] [<ffffffff810e9023>] do_sys_open+0x62/0xeb
[11199.536292] [<ffffffff810e90df>] sys_open+0x20/0x22
[11199.536292] [<ffffffff81002c2b>] system_call_fastpath+0x16/0x1b
[11199.536292] Code: 24 08 48 8b 46 40 48 89 04 24 48 8b b5 58 ff ff ff 48 8b bd 60 ff ff ff e8 61 e7 ff ff eb 08 0f 0b eb fe 0f 0b eb fe 85 c0 74 04 <0f> 0b eb fe 4c 89 e7 e8 65 ae ff ff 48 8b bd 70 ff ff ff
[11199.536292] RIP [<ffffffffa0774081>] run_clustered_refs+0x71e/0x76b [btrfs]
[11199.536292] RSP <ffff8801c90abb58>
[11199.905250] ---[ end trace b0dead1e7c3dbf7b ]---
Jan 26 11:40:32 an1 [11199.532483] ------------[ cut here ]------------
Jan 26 11:40:33 an1 [11199.536292] invalid opcode: 0000 [#1] SMP
Jan 26 11:40:33 an1 [11199.536292] last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
Jan 26 11:40:38 an1 [11199.536292] Stack:
Jan 26 11:40:38 an1 [11199.536292] Call Trace:
Jan 26 11:40:40 an1 [11199.536292] Code: 24 08 48 8b 46 40 48 89 04 24 48 8b b5 58 ff ff ff 48 8b bd 60 ff ff ff e8 61 e7 ff ff eb 08 0f 0b eb fe 0f 0b eb fe 85 c0 74 04 <0f> 0b eb fe 4c 89 e7 e8 65 ae ff ff 4
[11212.699541] btrfs: sdm2 checksum verify failed on 31928320 wanted 237BEA0B found F7B13C5E level 0
[11212.709895] btrfs: sdm2 checksum verify failed on 31928320 wanted 237BEA0B found F7B13C5E level 0
[11212.719737] btrfs: sdm2 checksum verify failed on 31928320 wanted 237BEA0B found F7B13C5E level 0
[11212.729433] ------------[ cut here ]------------
[11212.730394] kernel BUG at fs/btrfs/extent-tree.c:5789!
[11212.734157] invalid opcode: 0000 [#2] SMP
[11212.734157] last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
[11212.734157] CPU 3
[11212.734157] Modules linked in: loop btrfs zlib_deflate ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge ]
[11212.734157]
[11212.734157] Pid: 27662, comm: btrfs-cleaner Tainted: G D 2.6.37-00017-ga3f5274 #4 0DT097/PowerEdge 1950
[11212.734157] RIP: 0010:[<ffffffffa0773452>] [<ffffffffa0773452>] reada_walk_down+0x18c/0x249 [btrfs]
[11212.734157] RSP: 0018:ffff880227539be0 EFLAGS: 00010282
[11212.734157] RAX: 00000000fffffffb RBX: ffff8801cd50d750 RCX: ffff88020b993000
[11212.734157] RDX: ffff88017921e3f0 RSI: ffffea000527f690 RDI: 0000000100000090
[11212.734157] RBP: ffff880227539c80 R08: ffffe8ffffccefe8 R09: 0000000000000000
[11212.734157] R10: 0000000100a68468 R11: ffff880227549e98 R12: ffff8801d83c3000
[11212.734157] R13: 0000000000000040 R14: ffff88020b993000 R15: 00000000000000e0
[11212.734157] FS: 0000000000000000(0000) GS:ffff8800cfcc0000(0000) knlGS:0000000000000000
[11212.734157] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[11212.734157] CR2: 0000000000b92de8 CR3: 000000020e5b3000 CR4: 00000000000006e0
[11212.734157] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[11212.734157] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[11212.734157] Process btrfs-cleaner (pid: 27662, threadinfo ffff880227538000, task ffff88020ebc0000)
[11212.734157] Stack:
[11212.734157] ffff880227539bf0 0000000400000000 ffff8801cd50d750 ffff8801e0a9ca00
[11212.734157] 00000000024cd000 000010000000006b ffff88021527f880 0000000100000001
[11212.734157] ffff880227539c50 ffffffffa079c6bc ffff880225c96198 ffff8801b0cf9aa8
[11212.734157] Call Trace:
[11212.734157] [<ffffffffa079c6bc>] ? extent_buffer_uptodate+0x6c/0x8a [btrfs]
[11212.734157] [<ffffffffa0775d62>] do_walk_down+0x25b/0x395 [btrfs]
[11212.734157] [<ffffffffa076db1f>] ? btrfs_header_generation+0x1f/0x25 [btrfs]
[11212.734157] [<ffffffffa0771268>] ? walk_down_proc+0x10a/0x1d0 [btrfs]
[11212.734157] [<ffffffffa0775f1d>] walk_down_tree+0x81/0xac [btrfs]
[11212.734157] [<ffffffffa077636f>] btrfs_drop_snapshot+0x2aa/0x467 [btrfs]
[11212.734157] [<ffffffff81031049>] ? need_resched+0x23/0x2d
[11212.734157] [<ffffffff81031061>] ? should_resched+0xe/0x2f
[11212.734157] [<ffffffffa077d080>] ? cleaner_kthread+0x0/0x16b [btrfs]
[11212.734157] [<ffffffffa077f24d>] btrfs_clean_old_snapshots+0xee/0x10c [btrfs]
[11212.734157] [<ffffffffa077d177>] cleaner_kthread+0xf7/0x16b [btrfs]
[11212.734157] [<ffffffff8105b11e>] kthread+0x72/0x7a
[11212.734157] [<ffffffff810039d4>] kernel_thread_helper+0x4/0x10
[11212.734157] [<ffffffff8105b0ac>] ? kthread+0x0/0x7a
[11212.734157] [<ffffffff810039d0>] ? kernel_thread_helper+0x0/0x10
[11212.734157] Code: 01 00 00 0f 86 bb 00 00 00 8b 4d 8c 48 8b 55 80 4c 8d 4d c0 48 8b bd 78 ff ff ff 4c 8d 45 c8 4c 89 f6 e8 ec da ff ff 85 c0 74 04 <0f> 0b eb fe 48 8b 45 c8 48 85 c0 75 04 0f 0b eb fe 41 83
[11212.734157] RIP [<ffffffffa0773452>] reada_walk_down+0x18c/0x249 [btrfs]
[11212.734157] RSP <ffff880227539be0>
[11213.101484] ---[ end trace b0dead1e7c3dbf7c ]---
Jan 26 11:40:45 an1 [11212.729433] ------------[ cut here ]------------
Jan 26 11:40:45 an1 [11212.734157] invalid opcode: 0000 [#2] SMP
Jan 26 11:40:45 an1 [11212.734157] last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
Jan 26 11:40:46 an1 [11212.734157] Stack:
Jan 26 11:40:46 an1 [11212.734157] Call Trace:
Jan 26 11:40:46 an1 [11212.734157] Code: 01 00 00 0f 86 bb 00 00 00 8b 4d 8c 48 8b 55 80 4c 8d 4d c0 48 8b bd 78 ff ff ff 4c 8d 45 c8 4c 89 f6 e8 ec da ff ff 85 c0 74 04 <0f> 0b eb fe 48 8b 45 c8 48 85 c0 75 0
heavy writes as well

Jan 5 16:56:46 linuscs101 kernel: [ 3666.496742] ------------[ cut here ]------------
> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496754] WARNING: at fs/btrfs/inode.c:2143 btrfs_orphan_commit_root+0xb0/0xc0()
> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496756] Hardware name: ProLiant DL380 G5
> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496758] Modules linked in: nfsd exportfs nfs lockd nfs_acl auth_rpcgss bonding sunrpc radeon ttm drm_kms_helper drm bnx2 psmouse i5000_edac usbhid lp shpchp ipmi_si i2c_algo_bit hid edac_core parport ipmi_msghandler serio_raw i5k_amb hpilo cciss fbcon tileblit font bitblit softcursor
> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496788] Pid: 2764, comm: cosd Not tainted 2.6.37-ceph-client #1
> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496790] Call Trace:
> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496797] [<ffffffff81060dbf>] warn_slowpath_common+0x7f/0xc0
> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496800] [<ffffffff81060e1a>] warn_slowpath_null+0x1a/0x20
> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496804] [<ffffffff81273b70>] btrfs_orphan_commit_root+0xb0/0xc0
> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496807] [<ffffffff8126f1c1>] commit_fs_roots+0xa1/0x140
> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496810] [<ffffffff81270640>] btrfs_commit_transaction+0x350/0x730
> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496816] [<ffffffff81082aa0>] ? autoremove_wake_function+0x0/0x40
> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496820] [<ffffffff8129ec33>] btrfs_mksubvol+0x363/0x380
> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496823] [<ffffffff8129ed3d>] btrfs_ioctl_snap_create_transid+0xed/0x140
> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496826] [<ffffffff8129ee87>] btrfs_ioctl_snap_create+0xf7/0x140
> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496830] [<ffffffff812a0dcf>] btrfs_ioctl+0x61f/0xa20
> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496834] [<ffffffff811836da>] ? fsnotify+0x1ea/0x320
> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496839] [<ffffffff8115ce19>] do_vfs_ioctl+0xa9/0x5a0
> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496842] [<ffffffff8115d391>] sys_ioctl+0x81/0xa0
> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496847] [<ffffffff8100c042>] system_call_fastpath+0x16/0x1b
> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496850] ---[ end trace 2a6c3f752cfb5f1b ]---
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.723170] CPU 1
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.723210] Modules linked in: nfsd exportfs nfs lockd nfs_acl auth_rpcgss bonding sunrpc radeon ttm drm_kms_helper drm bnx2 psmouse i5000_edac usbhid lp shpchp ipmi_si i2c_algo_bit hid edac_core parport ipmi_msghandler serio_raw i5k_amb hpilo cciss fbcon tileblit font bitblit softcursor
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.724006]
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.724041] Pid: 2766, comm: cosd Tainted: G W 2.6.37-ceph-client #1 /ProLiant DL380 G5
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.724169] RIP: 0010:[<ffffffff81278190>] [<ffffffff81278190>] btrfs_truncate+0x510/0x530
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.724318] RSP: 0018:ffff8803d7e1bd48 EFLAGS: 00010286
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.724397] RAX: 00000000ffffffe4 RBX: ffff8803dfaf1800 RCX: ffff880406ce7090
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.724493] RDX: 0000000000000000 RSI: ffffea000e17d288 RDI: 0000000000000206
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.724592] RBP: ffff8803d7e1bdd8 R08: 0000000000000783 R09: ffff8803d7e1bb28
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.724691] R10: 00000000ffffffe4 R11: 0000000000000001 R12: ffff8803dee49f00
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.724793] R13: ffff8803d5369c10 R14: ffff8803d5369a78 R15: ffff8803d5369d38
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.724899] FS: 00007f77acfb6710(0000) GS:ffff8800cfc40000(0000) knlGS:0000000000000000
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.725019] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.725096] CR2: 00007f81cd5b8000 CR3: 00000003dfad3000 CR4: 00000000000006e0
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.725195] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.725293] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.725392] Process cosd (pid: 2766, threadinfo ffff8803d7e1a000, task ffff8803dfaf8000)
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.725549] 0000000000000000 ffffffffffffffff ffff8803d5369d78 00000000000001da
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.725695] 0000000000000fff 00000000d5369d38 0000000000001000 0000000000000000
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.725841] ffff8803d5369aa8 ffff8803d5369c10 ffff8803d7e1bdc8 0000000000000000
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.726039] [<ffffffff81104c46>] vmtruncate+0x56/0x70
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.726113] [<ffffffff8127cece>] btrfs_setattr+0x13e/0x2a0
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.726202] [<ffffffff811652c0>] notify_change+0x170/0x2e0
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.726292] [<ffffffff8114b9b4>] do_truncate+0x64/0xa0
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.726370] [<ffffffff81156d73>] ? generic_permission+0x23/0xc0
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.726460] [<ffffffff81156bd5>] ? get_write_access+0x45/0x70
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.726543] [<ffffffff8114bb39>] sys_truncate+0x149/0x150
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.726631] [<ffffffff8100c042>] system_call_fastpath+0x16/0x1b
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.727618] RSP<ffff8803d7e1bd48>
> Jan 5 17:07:45 linuscs101 kernel: [ 4325.748986] ---[ end trace 2a6c3f752cfb5f1c ]---

On 1/26/11 12:48 PM, Jim Schutt wrote:
> Here's another example.
>
> Again, please let me know what other information you need to
> make this report useful.
>
> -- Jim
The btrfs_orphan_commit_root warning is also reproducable in our ceph environment. Regards Christian 2011/1/26 Matt Weil <mweil@genome.wustl.edu>:> heavy writes as well > > Jan 5 16:56:46 linuscs101 kernel: [ 3666.496742] ------------[ cut here > ]------------ >> >> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496754] WARNING: at >> fs/btrfs/inode.c:2143 btrfs_orphan_commit_root+0xb0/0xc0() >> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496756] Hardware name: ProLiant >> DL380 G5 >> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496758] Modules linked in: nfsd >> exportfs nfs lockd nfs_acl auth_rpcgss bonding sunrpc radeon ttm >> drm_kms_helper drm bnx2 psmouse i5000_edac usbhid lp shpchp ipmi_si >> i2c_algo_bit hid edac_core parport ipmi_msghandler serio_raw i5k_amb hpilo >> cciss fbcon tileblit font bitblit softcursor >> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496788] Pid: 2764, comm: cosd >> Not tainted 2.6.37-ceph-client #1 >> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496790] Call Trace: >> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496797] [<ffffffff81060dbf>] >> warn_slowpath_common+0x7f/0xc0 >> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496800] [<ffffffff81060e1a>] >> warn_slowpath_null+0x1a/0x20 >> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496804] [<ffffffff81273b70>] >> btrfs_orphan_commit_root+0xb0/0xc0 >> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496807] [<ffffffff8126f1c1>] >> commit_fs_roots+0xa1/0x140 >> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496810] [<ffffffff81270640>] >> btrfs_commit_transaction+0x350/0x730 >> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496816] [<ffffffff81082aa0>] ? 
>> autoremove_wake_function+0x0/0x40 >> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496820] [<ffffffff8129ec33>] >> btrfs_mksubvol+0x363/0x380 >> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496823] [<ffffffff8129ed3d>] >> btrfs_ioctl_snap_create_transid+0xed/0x140 >> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496826] [<ffffffff8129ee87>] >> btrfs_ioctl_snap_create+0xf7/0x140 >> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496830] [<ffffffff812a0dcf>] >> btrfs_ioctl+0x61f/0xa20 >> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496834] [<ffffffff811836da>] ? >> fsnotify+0x1ea/0x320 >> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496839] [<ffffffff8115ce19>] >> do_vfs_ioctl+0xa9/0x5a0 >> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496842] [<ffffffff8115d391>] >> sys_ioctl+0x81/0xa0 >> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496847] [<ffffffff8100c042>] >> system_call_fastpath+0x16/0x1b >> Jan 5 16:56:46 linuscs101 kernel: [ 3666.496850] ---[ end trace >> 2a6c3f752cfb5f1b ]--- >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.723170] CPU 1 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.723210] Modules linked in: nfsd >> exportfs nfs lockd nfs_acl auth_rpcgss bonding sunrpc radeon ttm >> drm_kms_helper drm bnx2 psmouse i5000_edac usbhid lp shpchp ipmi_si >> i2c_algo_bit hid edac_core parport ipmi_msghandler serio_raw i5k_amb hpilo >> cciss fbcon tileblit font bitblit softcursor >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.724006] >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.724041] Pid: 2766, comm: cosd >> Tainted: G W 2.6.37-ceph-client #1 /ProLiant DL380 G5 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.724169] RIP: >> 0010:[<ffffffff81278190>] [<ffffffff81278190>] btrfs_truncate+0x510/0x530 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.724318] RSP: >> 0018:ffff8803d7e1bd48 EFLAGS: 00010286 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.724397] RAX: 00000000ffffffe4 >> RBX: ffff8803dfaf1800 RCX: ffff880406ce7090 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.724493] RDX: 0000000000000000 >> RSI: ffffea000e17d288 
RDI: 0000000000000206 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.724592] RBP: ffff8803d7e1bdd8 >> R08: 0000000000000783 R09: ffff8803d7e1bb28 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.724691] R10: 00000000ffffffe4 >> R11: 0000000000000001 R12: ffff8803dee49f00 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.724793] R13: ffff8803d5369c10 >> R14: ffff8803d5369a78 R15: ffff8803d5369d38 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.724899] FS: >> 00007f77acfb6710(0000) GS:ffff8800cfc40000(0000) knlGS:0000000000000000 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.725019] CS: 0010 DS: 0000 ES: >> 0000 CR0: 0000000080050033 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.725096] CR2: 00007f81cd5b8000 >> CR3: 00000003dfad3000 CR4: 00000000000006e0 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.725195] DR0: 0000000000000000 >> DR1: 0000000000000000 DR2: 0000000000000000 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.725293] DR3: 0000000000000000 >> DR6: 00000000ffff0ff0 DR7: 0000000000000400 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.725392] Process cosd (pid: >> 2766, threadinfo ffff8803d7e1a000, task ffff8803dfaf8000) >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.725549] 0000000000000000 >> ffffffffffffffff ffff8803d5369d78 00000000000001da >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.725695] 0000000000000fff >> 00000000d5369d38 0000000000001000 0000000000000000 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.725841] ffff8803d5369aa8 >> ffff8803d5369c10 ffff8803d7e1bdc8 0000000000000000 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.726039] [<ffffffff81104c46>] >> vmtruncate+0x56/0x70 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.726113] [<ffffffff8127cece>] >> btrfs_setattr+0x13e/0x2a0 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.726202] [<ffffffff811652c0>] >> notify_change+0x170/0x2e0 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.726292] [<ffffffff8114b9b4>] >> do_truncate+0x64/0xa0 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.726370] [<ffffffff81156d73>] ? 
>> generic_permission+0x23/0xc0 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.726460] [<ffffffff81156bd5>] ? >> get_write_access+0x45/0x70 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.726543] [<ffffffff8114bb39>] >> sys_truncate+0x149/0x150 >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.726631] [<ffffffff8100c042>] >> system_call_fastpath+0x16/0x1b >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.727618] RSP<ffff8803d7e1bd48> >> Jan 5 17:07:45 linuscs101 kernel: [ 4325.748986] ---[ end trace >> 2a6c3f752cfb5f1c ]--- > > > > On 1/26/11 12:48 PM, Jim Schutt wrote: >> >> Hi, >> >> On Wed, 2011-01-26 at 10:59 -0700, Jim Schutt wrote: >>> >>> Hi, >>> >>> I got this kernel BUG on a server running multiple Ceph >>> cosd instances, during a heavy write load generated by >>> multiple Ceph clients. >>> >>> The server was running the current ceph unstable kernel >>> (a3f5274e535 in >>> git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git). >>> >>> Please let me know what other information you need to >>> make this report useful. >>> >>> -- Jim >>> >> Here's another example. >> >> Again, please let me know what other information you need to >> make this report useful. >> >> -- Jim >> >> [11199.532483] ------------[ cut here ]------------ >> [11199.536292] kernel BUG at fs/btrfs/extent-tree.c:2198! 
>> [11199.536292] invalid opcode: 0000 [#1] SMP >> [11199.536292] last sysfs file: >> /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map >> [11199.536292] CPU 3 >> [11199.536292] Modules linked in: loop btrfs zlib_deflate ipt_MASQUERADE >> iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack >> ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge ] >> [11199.536292] >> [11199.536292] Pid: 1664, comm: cosd Not tainted 2.6.37-00017-ga3f5274 #4 >> 0DT097/PowerEdge 1950 >> [11199.536292] RIP: 0010:[<ffffffffa0774081>] [<ffffffffa0774081>] >> run_clustered_refs+0x71e/0x76b [btrfs] >> [11199.536292] RSP: 0018:ffff8801c90abb58 EFLAGS: 00010282 >> [11199.536292] RAX: 00000000fffffffb RBX: 0000000000000000 RCX: >> ffff8802262c5000 >> [11199.536292] RDX: ffff88017921e2d0 RSI: ffffea000527f690 RDI: >> 0000000000000001 >> [11199.536292] RBP: ffff8801c90abc28 R08: ffffe8ffffccefe8 R09: >> 0000000000000000 >> [11199.536292] R10: 0000000000000003 R11: ffff880227549e98 R12: >> ffff880140bb8f00 >> [11199.536292] R13: 0000000000000000 R14: ffff880181eff378 R15: >> ffff8802262c5000 >> [11199.536292] FS: 00007f5e680fc940(0000) GS:ffff8800cfcc0000(0000) >> knlGS:0000000000000000 >> [11199.536292] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b >> [11199.536292] CR2: 00007f0e1a476260 CR3: 0000000173aa0000 CR4: >> 00000000000006e0 >> [11199.536292] DR0: 0000000000000000 DR1: 0000000000000000 DR2: >> 0000000000000000 >> [11199.536292] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: >> 0000000000000400 >> [11199.536292] Process cosd (pid: 1664, threadinfo ffff8801c90aa000, task >> ffff8801df12d840) >> [11199.536292] Stack: >> [11199.536292] 0000000000000000 0000000000000000 0000000000000001 >> 0000000000000000 >> [11199.536292] ffff8801c90abc48 ffff8802262c5000 ffff8801e0a9c600 >> ffff880181eff378 >> [11199.536292] 0000000000000000 0000002600000206 ffff880181eff380 >> 000000007921e750 >> [11199.536292] Call Trace: >> [11199.536292] [<ffffffffa0785be0>] 
? btrfs_update_inode+0xc3/0xd3 >> [btrfs] >> [11199.536292] [<ffffffffa07741bc>] btrfs_run_delayed_refs+0xee/0x15e >> [btrfs] >> [11199.536292] [<ffffffff810fa54d>] ? >> __fsnotify_update_dcache_flags+0x22/0x56 >> [11199.536292] [<ffffffffa07801d0>] __btrfs_end_transaction+0x6d/0x1e3 >> [btrfs] >> [11199.536292] [<ffffffffa0780372>] >> btrfs_end_transaction_throttle+0x18/0x1a [btrfs] >> [11199.536292] [<ffffffffa07872e1>] btrfs_create+0x1a0/0x1fa [btrfs] >> [11199.536292] [<ffffffff810f49e2>] vfs_create+0x76/0x96 >> [11199.536292] [<ffffffff810f56af>] do_last+0x24d/0x4d3 >> [11199.536292] [<ffffffff810f5b16>] do_filp_open+0x1e1/0x4c5 >> [11199.536292] [<ffffffff81031061>] ? should_resched+0xe/0x2f >> [11199.536292] [<ffffffff8136a638>] ? _cond_resched+0xe/0x22 >> [11199.536292] [<ffffffff811aa669>] ? might_fault+0xe/0x10 >> [11199.536292] [<ffffffff811aa753>] ? __strncpy_from_user+0x20/0x4a >> [11199.536292] [<ffffffff810e9023>] do_sys_open+0x62/0xeb >> [11199.536292] [<ffffffff810e90df>] sys_open+0x20/0x22 >> [11199.536292] [<ffffffff81002c2b>] system_call_fastpath+0x16/0x1b >> [11199.536292] Code: 24 08 48 8b 46 40 48 89 04 24 48 8b b5 58 ff ff ff 48 >> 8b bd 60 ff ff ff e8 61 e7 ff ff eb 08 0f 0b eb fe 0f 0b eb fe 85 c0 74 >> 04<0f> 0b eb fe 4c 89 e7 e8 65 ae ff ff 48 8b bd 70 ff ff ff >> [11199.536292] RIP [<ffffffffa0774081>] run_clustered_refs+0x71e/0x76b >> [btrfs] >> [11199.536292] RSP<ffff8801c90abb58> >> [11199.905250] ---[ end trace b0dead1e7c3dbf7b ]--- >> Jan 26 11:40:32 an1 [11199.532483] ------------[ cut here ]------------ >> Jan 26 11:40:33 an1 [11199.536292] invalid opcode: 0000 [#1] SMP >> Jan 26 11:40:33 an1 [11199.536292] last sysfs file: >> /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map >> Jan 26 11:40:38 an1 [11199.536292] Stack: >> Jan 26 11:40:38 an1 [11199.536292] Call Trace: >> Jan 26 11:40:40 an1 [11199.536292] Code: 24 08 48 8b 46 40 48 89 04 24 48 >> 8b b5 58 ff ff ff 48 8b bd 60 ff ff ff e8 61 e7 ff ff eb 08 0f 0b eb fe 0f 
>> 0b eb fe 85 c0 74 04<0f> 0b eb fe 4c 89 e7 e8 65 ae ff ff 4 >> [11212.699541] btrfs: sdm2 checksum verify failed on 31928320 wanted >> 237BEA0B found F7B13C5E level 0 >> [11212.709895] btrfs: sdm2 checksum verify failed on 31928320 wanted >> 237BEA0B found F7B13C5E level 0 >> [11212.719737] btrfs: sdm2 checksum verify failed on 31928320 wanted >> 237BEA0B found F7B13C5E level 0 >> [11212.729433] ------------[ cut here ]------------ >> [11212.730394] kernel BUG at fs/btrfs/extent-tree.c:5789! >> [11212.734157] invalid opcode: 0000 [#2] SMP >> [11212.734157] last sysfs file: >> /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map >> [11212.734157] CPU 3 >> [11212.734157] Modules linked in: loop btrfs zlib_deflate ipt_MASQUERADE >> iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack >> ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge ] >> [11212.734157] >> [11212.734157] Pid: 27662, comm: btrfs-cleaner Tainted: G D >> 2.6.37-00017-ga3f5274 #4 0DT097/PowerEdge 1950 >> [11212.734157] RIP: 0010:[<ffffffffa0773452>] [<ffffffffa0773452>] >> reada_walk_down+0x18c/0x249 [btrfs] >> [11212.734157] RSP: 0018:ffff880227539be0 EFLAGS: 00010282 >> [11212.734157] RAX: 00000000fffffffb RBX: ffff8801cd50d750 RCX: >> ffff88020b993000 >> [11212.734157] RDX: ffff88017921e3f0 RSI: ffffea000527f690 RDI: >> 0000000100000090 >> [11212.734157] RBP: ffff880227539c80 R08: ffffe8ffffccefe8 R09: >> 0000000000000000 >> [11212.734157] R10: 0000000100a68468 R11: ffff880227549e98 R12: >> ffff8801d83c3000 >> [11212.734157] R13: 0000000000000040 R14: ffff88020b993000 R15: >> 00000000000000e0 >> [11212.734157] FS: 0000000000000000(0000) GS:ffff8800cfcc0000(0000) >> knlGS:0000000000000000 >> [11212.734157] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b >> [11212.734157] CR2: 0000000000b92de8 CR3: 000000020e5b3000 CR4: >> 00000000000006e0 >> [11212.734157] DR0: 0000000000000000 DR1: 0000000000000000 DR2: >> 0000000000000000 >> [11212.734157] DR3: 
0000000000000000 DR6: 00000000ffff0ff0 DR7: >> 0000000000000400 >> [11212.734157] Process btrfs-cleaner (pid: 27662, threadinfo >> ffff880227538000, task ffff88020ebc0000) >> [11212.734157] Stack: >> [11212.734157] ffff880227539bf0 0000000400000000 ffff8801cd50d750 >> ffff8801e0a9ca00 >> [11212.734157] 00000000024cd000 000010000000006b ffff88021527f880 >> 0000000100000001 >> [11212.734157] ffff880227539c50 ffffffffa079c6bc ffff880225c96198 >> ffff8801b0cf9aa8 >> [11212.734157] Call Trace: >> [11212.734157] [<ffffffffa079c6bc>] ? extent_buffer_uptodate+0x6c/0x8a >> [btrfs] >> [11212.734157] [<ffffffffa0775d62>] do_walk_down+0x25b/0x395 [btrfs] >> [11212.734157] [<ffffffffa076db1f>] ? btrfs_header_generation+0x1f/0x25 >> [btrfs] >> [11212.734157] [<ffffffffa0771268>] ? walk_down_proc+0x10a/0x1d0 [btrfs] >> [11212.734157] [<ffffffffa0775f1d>] walk_down_tree+0x81/0xac [btrfs] >> [11212.734157] [<ffffffffa077636f>] btrfs_drop_snapshot+0x2aa/0x467 >> [btrfs] >> [11212.734157] [<ffffffff81031049>] ? need_resched+0x23/0x2d >> [11212.734157] [<ffffffff81031061>] ? should_resched+0xe/0x2f >> [11212.734157] [<ffffffffa077d080>] ? cleaner_kthread+0x0/0x16b [btrfs] >> [11212.734157] [<ffffffffa077f24d>] btrfs_clean_old_snapshots+0xee/0x10c >> [btrfs] >> [11212.734157] [<ffffffffa077d177>] cleaner_kthread+0xf7/0x16b [btrfs] >> [11212.734157] [<ffffffff8105b11e>] kthread+0x72/0x7a >> [11212.734157] [<ffffffff810039d4>] kernel_thread_helper+0x4/0x10 >> [11212.734157] [<ffffffff8105b0ac>] ? kthread+0x0/0x7a >> [11212.734157] [<ffffffff810039d0>] ? 
kernel_thread_helper+0x0/0x10 >> [11212.734157] Code: 01 00 00 0f 86 bb 00 00 00 8b 4d 8c 48 8b 55 80 4c 8d >> 4d c0 48 8b bd 78 ff ff ff 4c 8d 45 c8 4c 89 f6 e8 ec da ff ff 85 c0 74 >> 04<0f> 0b eb fe 48 8b 45 c8 48 85 c0 75 04 0f 0b eb fe 41 83 >> [11212.734157] RIP [<ffffffffa0773452>] reada_walk_down+0x18c/0x249 >> [btrfs] >> [11212.734157] RSP<ffff880227539be0> >> [11213.101484] ---[ end trace b0dead1e7c3dbf7c ]--- >> Jan 26 11:40:45 an1 [11212.729433] ------------[ cut here ]------------ >> Jan 26 11:40:45 an1 [11212.734157] invalid opcode: 0000 [#2] SMP >> Jan 26 11:40:45 an1 [11212.734157] last sysfs file: >> /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map >> Jan 26 11:40:46 an1 [11212.734157] Stack: >> Jan 26 11:40:46 an1 [11212.734157] Call Trace: >> Jan 26 11:40:46 an1 [11212.734157] Code: 01 00 00 0f 86 bb 00 00 00 8b 4d >> 8c 48 8b 55 80 4c 8d 4d c0 48 8b bd 78 ff ff ff 4c 8d 45 c8 4c 89 f6 e8 ec >> da ff ff 85 c0 74 04<0f> 0b eb fe 48 8b 45 c8 48 85 c0 75 0 >> >> -- >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi, I got this kernel BUG on a server running multiple Ceph cosd instances. I'm not sure what was going on at the time, as I just noticed this on my serial console for this node. It looks like another example of the truncate issue in Matt Weil's report. Please let me know what other information is needed to make this report useful. Thanks -- Jim an4 login: [62397.925080] ------------[ cut here ]------------ [62397.926012] kernel BUG at fs/btrfs/inode.c:6403! [62397.926012] invalid opcode: 0000 [#1] SMP [62397.926012] last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map [62397.926012] CPU 1 [62397.926012] Modules linked in: loop btrfs zlib_deflate ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge ] [62397.994828] [62397.994828] Pid: 10514, comm: cosd Not tainted 2.6.37-00017-ga3f5274 #4 0DT097/PowerEdge 1950 [62397.994828] RIP: 0010:[<ffffffffa07834ff>] [<ffffffffa07834ff>] btrfs_truncate+0x444/0x47a [btrfs] [62397.994828] RSP: 0018:ffff8801a2e61d48 EFLAGS: 00010286 [62397.994828] RAX: 00000000ffffffe4 RBX: ffff88018c9c3a50 RCX: ffff8802136e9240 [62397.994828] RDX: ffff8802136e97e0 RSI: ffffea00074402f8 RDI: 0000000000000090 [62397.994828] RBP: ffff8801a2e61dd8 R08: ffffe8ffffc4ebe8 R09: 00000001e2a6a8c0 [62397.994828] R10: 0000000000000008 R11: 0000000000000016 R12: ffff8801e2a6a8c0 [62397.994828] R13: 0000000000000000 R14: ffff88018c9c3a50 R15: ffff880223b56800 [62397.994828] FS: 00007f6122b2e940(0000) GS:ffff8800cfc40000(0000) knlGS:0000000000000000 [62397.994828] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [62397.994828] CR2: 00007f7c1c7580a0 CR3: 00000001fc864000 CR4: 00000000000006e0 [62397.994828] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [62397.994828] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [62397.994828] Process cosd (pid: 10514, threadinfo ffff8801a2e60000, task ffff8801da311610) 
[62397.994828] Stack: [62397.994828] 0000000000000000 0000000000000000 0000000000000000 ffffffff00001000 [62397.994828] ffff88018c9c38b8 ffff88018c9c3a50 ffff88018c9c3b78 0000000000000000 [62397.994828] ffff88018c9c38e8 ffff88018c9c3b78 0000000000000000 ffffffff810b4960 [62397.994828] Call Trace: [62397.994828] [<ffffffff810b4960>] ? truncate_pagecache+0x52/0x5a [62397.994828] [<ffffffff810b49ca>] vmtruncate+0x44/0x50 [62397.994828] [<ffffffffa078482c>] btrfs_setattr+0x205/0x24e [btrfs] [62397.994828] [<ffffffff810fe7fc>] notify_change+0x194/0x285 [62397.994828] [<ffffffff810e9c0a>] do_truncate+0x71/0x90 [62397.994828] [<ffffffff810f34f1>] ? generic_permission+0x1c/0x91 [62397.994828] [<ffffffff810f3317>] ? get_write_access+0x1d/0x47 [62397.994828] [<ffffffff810e9df7>] sys_truncate+0x112/0x124 [62397.994828] [<ffffffff81002c2b>] system_call_fastpath+0x16/0x1b [62397.994828] Code: 83 7e 5c 00 74 13 4c 89 f6 4c 89 e7 e8 00 cd ff ff 85 c0 74 04 0f 0b eb fe 4c 89 f2 4c 89 fe 4c 89 e7 e8 22 f6 ff ff 85 c0 74 04 <0f> 0b eb fe 4c 89 fe 4c 89 e7 49 8b 5c 24 20 e8 47 9e ff [62397.994828] RIP [<ffffffffa07834ff>] btrfs_truncate+0x444/0x47a [btrfs] [62397.994828] RSP <ffff8801a2e61d48> Jan 27 08:47:39 [62398.251586] ---[ end trace c4d86802177b259b ]--- an4 [62397.925080] ------------[ cut here ]------------ Jan 27 08:47:39 an4 [62397.926012] invalid opcode: 0000 [#1] SMP Jan 27 08:47:39 an4 [62397.926012] last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map Jan 27 08:47:39 an4 [62397.994828] Stack: Jan 27 08:47:39 an4 [62397.994828] Call Trace: Jan 27 08:47:39 an4 [62397.994828] Code: 83 7e 5c 00 74 13 4c 89 f6 4c 89 e7 e8 00 cd ff ff 85 c0 74 04 0f 0b eb fe 4c 89 f2 4c 89 fe 4c 89 e7 e8 22 f6 ff ff 85 c0 74 04 <0f> 0b eb fe 4c 89 fe 4c 89 e7 49 8b 5 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html