Hi,

today I hit this bug. The kernel is v3.1-rc10 + josef from today, the workload is a ceph osd.

Best Regards,
Martin

[28997.273289] ------------[ cut here ]------------
[28997.282916] kernel BUG at fs/btrfs/inode.c:1163!
[28997.290863] invalid opcode: 0000 [#1] SMP
[28997.290863] CPU 0
[28997.290863] Modules linked in: radeon ttm drm_kms_helper drm psmouse sp5100_tco i2c_piix4 i2c_algo_bit serio_raw edac_core k8temp edac_mce_amd shpchp lp parport pata_atiixp btrfs zlib_deflate e1000e libcrc32c ahci libahci
[28997.290863]
[28997.290863] Pid: 1220, comm: ceph-osd Tainted: G W 3.1.0-rc10+ #2 MICRO-STAR INTERNATIONAL CO., LTD MS-96B3/MS-96B3
[28997.290863] RIP: 0010:[<ffffffffa0094f17>] [<ffffffffa0094f17>] run_delalloc_nocow+0x7a7/0x7c0 [btrfs]
[28997.290863] RSP: 0018:ffff880117357a78 EFLAGS: 00010206
[28997.290863] RAX: 000000000000002f RBX: ffff880116b12a20 RCX: ffff880117357a38
[28997.290863] RDX: ffff880000000000 RSI: 0000000000000496 RDI: ffff8801003851e0
[28997.290863] RBP: ffff880117357b78 R08: 0000000000000497 R09: ffff880117357a28
[28997.290863] R10: 0000000000000030 R11: 0000000000000000 R12: 0000000000011d3b
[28997.290863] R13: 0000000000011d3b R14: ffff8801003851e0 R15: 0000000000300000
[28997.290863] FS: 00007ff45ae7b700(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
[28997.507960] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[28997.507960] CR2: 00007ff450a20000 CR3: 0000000114b75000 CR4: 00000000000006f0
[28997.507960] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[28997.507960] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[28997.507960] Process ceph-osd (pid: 1220, threadinfo ffff880117356000, task ffff880115260000)
[28997.507960] Stack:
[28997.507960] ffff880117357aa8 ffffffff81156e90 ffff880104413af0 ffff880104413af0
[28997.507960] ffff880117550030 ffff880117357bf0 ffff880117550028 ffff880117550020
[28997.507960] 0000000000400000 0000000100400000 ffff880117357d14 00ffffffa00a973e
[28997.507960] Call Trace:
[28997.507960] [<ffffffff81156e90>] ? kmem_cache_free+0x20/0x100
[28997.507960] [<ffffffffa0095264>] run_delalloc_range+0x334/0x380 [btrfs]
[28997.507960] [<ffffffffa00abc85>] __extent_writepage+0x5b5/0x6f0 [btrfs]
[28997.507960] [<ffffffff812e526d>] ? radix_tree_gang_lookup_tag_slot+0x8d/0xd0
[28997.507960] [<ffffffffa00abfea>] extent_write_cache_pages.clone.19.clone.26+0x22a/0x3a0 [btrfs]
[28997.507960] [<ffffffffa00ac3a5>] extent_writepages+0x45/0x60 [btrfs]
[28997.507960] [<ffffffffa00903e0>] ? acls_after_inode_item+0xc0/0xc0 [btrfs]
[28997.507960] [<ffffffff81182ade>] ? vfsmount_lock_local_unlock+0x1e/0x30
[28997.507960] [<ffffffffa008fa27>] btrfs_writepages+0x27/0x30 [btrfs]
[28997.507960] [<ffffffff81118161>] do_writepages+0x21/0x40
[28997.507960] [<ffffffff8110e2cb>] __filemap_fdatawrite_range+0x5b/0x60
[28997.507960] [<ffffffff8110f1d3>] filemap_fdatawrite_range+0x13/0x20
[28997.507960] [<ffffffff81192c99>] sys_sync_file_range+0x149/0x180
[28997.835220] [<ffffffff815f05c2>] system_call_fastpath+0x16/0x1b
[28997.835220] Code: 8b 7d 80 e8 dc 9e 00 00 41 b9 04 00 00 00 e9 3d fe ff ff 4d 89 ef 41 bc 01 00 00 00 48 c7 45 a8 ff ff ff ff e9 5c fb ff ff 0f 0b <0f> 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 0b 66 0f 1f 84 00
[28997.835220] RIP [<ffffffffa0094f17>] run_delalloc_nocow+0x7a7/0x7c0 [btrfs]
[28997.835220] RSP <ffff880117357a78>
[28997.927402] ---[ end trace a0a1c4a13d975229 ]---

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On Tue, Oct 18, 2011 at 10:04:01PM +0200, Martin Mailand wrote:
> [28997.273289] ------------[ cut here ]------------
> [28997.282916] kernel BUG at fs/btrfs/inode.c:1163!

1119         fi = btrfs_item_ptr(leaf, path->slots[0],
1120                             struct btrfs_file_extent_item);
1121         extent_type = btrfs_file_extent_type(leaf, fi);
1122
1123         if (extent_type == BTRFS_FILE_EXTENT_REG ||
1124             extent_type == BTRFS_FILE_EXTENT_PREALLOC) {
...
1158         } else if (extent_type == BTRFS_FILE_EXTENT_INLINE) {
1159                 extent_end = found_key.offset +
1160                         btrfs_file_extent_inline_len(leaf, fi);
1161                 extent_end = ALIGN(extent_end, root->sectorsize);
1162         } else {
1163                 BUG_ON(1);
1164         }

The rc10 kernel sources point to this; can you please verify it in your
sources? If it's really this one, it means that an unhandled extent_type
was read from the b-tree leaf, which could be a corruption. (The value is
obtained directly from the type field of the file extent item, line 1121.)

It would be interesting to know the value of 'extent_type' at the time of
the crash. If it's e.g. -1, that could point to a real bug, some unhandled
corner case in truncate, for example.

> [28997.507960] Call Trace:
> [28997.507960] [<ffffffffa00903e0>] ? acls_after_inode_item+0xc0/0xc0 [btrfs]

... a corruption caused by overflow of xattrs/acls into inode item bytes?

As ceph stresses xattrs very well, I wouldn't be surprised by that.

david

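To make the failing branch easier to follow, here is a small user-space model of it. It is only a sketch: the numeric type values are the ones defined in btrfs's ctree.h, but extent_end_for() and align_up() are made-up names, and the end computation for regular/prealloc extents (elided as "..." in the listing) is an assumption. Where the kernel hits BUG_ON(1), the sketch returns -1 instead.

```c
#include <assert.h>
#include <stdint.h>

/* On-disk file extent types, values as in btrfs's ctree.h. */
enum {
	FILE_EXTENT_INLINE   = 0,
	FILE_EXTENT_REG      = 1,
	FILE_EXTENT_PREALLOC = 2,
};

/* Round x up to a multiple of a (a power of two), like the kernel's ALIGN(). */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/*
 * Hypothetical model of the branch at inode.c:1123-1164.
 * Stores the extent end in *end and returns 0 for a known type;
 * returns -1 for any other value, which is where the kernel BUGs.
 */
static int extent_end_for(unsigned type, uint64_t key_offset, uint64_t len,
			  uint64_t sectorsize, uint64_t *end)
{
	switch (type) {
	case FILE_EXTENT_REG:
	case FILE_EXTENT_PREALLOC:
		/* Assumption: key offset plus extent length (elided in the listing). */
		*end = key_offset + len;
		return 0;
	case FILE_EXTENT_INLINE:
		/* Inline data is rounded up to the sector size (lines 1159-1161). */
		*end = align_up(key_offset + len, sectorsize);
		return 0;
	default:
		return -1;
	}
}
```

With a 4096-byte sector size, a 100-byte inline extent at offset 0 ends at 4096, while a type byte such as 0xff takes the -1 path, i.e. the path that ends in the BUG reported here.
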
On 19.10.2011 11:49, David Sterba wrote:
> On Tue, Oct 18, 2011 at 10:04:01PM +0200, Martin Mailand wrote:
>> [28997.273289] ------------[ cut here ]------------
>> [28997.282916] kernel BUG at fs/btrfs/inode.c:1163!
>
> 1119         fi = btrfs_item_ptr(leaf, path->slots[0],
> 1120                             struct btrfs_file_extent_item);
> 1121         extent_type = btrfs_file_extent_type(leaf, fi);
> 1122
> 1123         if (extent_type == BTRFS_FILE_EXTENT_REG ||
> 1124             extent_type == BTRFS_FILE_EXTENT_PREALLOC) {
> ...
> 1158         } else if (extent_type == BTRFS_FILE_EXTENT_INLINE) {
> 1159                 extent_end = found_key.offset +
> 1160                         btrfs_file_extent_inline_len(leaf, fi);
> 1161                 extent_end = ALIGN(extent_end, root->sectorsize);
> 1162         } else {
> 1163                 BUG_ON(1);
> 1164         }
>
> rc10 kernel sources point to this, can you please verify it in your
> sources? if it's really this one, that means that it's an unhandled
> extent_type read from the b-tree leaf and could be a corruption. (the
> value is directly obtained from file extent type item, line 1121)
>

Yep, that's the same in my source.

> It would be interesting what's the value of 'extent_type' at the time of
> crash, if it's eg -1 that could point to a real bug, some unhandled
> corner case in truncate, for example.
>

How can I do that?

>> [28997.507960] Call Trace:
>> [28997.507960] [<ffffffffa00903e0>] ? acls_after_inode_item+0xc0/0xc0 [btrfs]
>
> ... a corruption caused by overflow of xattrs/acls into inode item bytes?
>
> As ceph stresses xattrs very well, I wouldn't be surprised by that.
>
>
> david

Martin,

> workload is a ceph osd.

I tried to play with ceph here, without complete success so far.

Any idea what was being done on the system at the time of the problem?
And is there any specific command that could trigger this again?

Thanks,
anand

> Best Regards,
> Martin
>
> [28997.273289] ------------[ cut here ]------------
> [28997.282916] kernel BUG at fs/btrfs/inode.c:1163!
> [...]

Hi Anand,

I changed the replication level of the rbd pool from one to two:

ceph osd pool set rbd size 2

The bug then happened during the sync, but today I could not reproduce
it, so I do not have a test case for you.

Best Regards,
martin

On 19.10.2011 17:02, Anand Jain wrote:
> I tried to play with ceph here and not a complete success yet.
>
> any idea what was done on the system at the time of the problem ?
> and any specific command that could trigger this again ?
> Thanks.
> anand

On Wed, Oct 19, 2011 at 02:59:45PM +0200, Martin Mailand wrote:
> On 19.10.2011 11:49, David Sterba wrote:
> > It would be interesting what's the value of 'extent_type' at the time of
> > crash, if it's eg -1 that could point to a real bug, some unhandled
> > corner case in truncate, for example.
> >
>
> How can I do that?

Something like this would print the type information at runtime:

---
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1160,6 +1160,11 @@ next_slot:
 				btrfs_file_extent_inline_len(leaf, fi);
 			extent_end = ALIGN(extent_end, root->sectorsize);
 		} else {
+			printk(KERN_CRIT "btrfs: unhandled extent type %d, key=[%llu,%u,%llu]\n",
+				extent_type,
+				(unsigned long long)found_key.objectid,
+				(unsigned)found_key.type,
+				(unsigned long long)found_key.offset);
 			BUG_ON(1);
 		}
 out_check:
---

The key seems to be relevant information to print as well, but the rest
of what is stored in the leaf item could be just garbage.

The btrfs-debug-tree utility prints all the tree items, but as I saw just
now, in the case of an extent it skips an unknown type (the same one where
the kernel BUGs), which means that the other extent data are glued to the
previous output and thus cannot be spotted easily.

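The message format can be tried out in user space before rebuilding the kernel. A minimal sketch, assuming a made-up struct demo_key and helper format_unhandled_extent(); snprintf stands in for printk here, and the arguments are passed in the same order as the conversions in the format string (extent type first, then the three key fields).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the (objectid, type, offset) key of the item. */
struct demo_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* Format the debug line into buf; returns the length reported by snprintf. */
static int format_unhandled_extent(char *buf, size_t size, int extent_type,
				   const struct demo_key *key)
{
	assert(buf != NULL && key != NULL);
	return snprintf(buf, size,
			"btrfs: unhandled extent type %d, key=[%llu,%u,%llu]\n",
			extent_type,
			(unsigned long long)key->objectid,
			(unsigned)key->type,
			(unsigned long long)key->offset);
}
```

Swapping any of the arguments against the format string compiles with at most a warning but prints misleading numbers, which is easy to overlook in a one-off debugging patch.
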
To fix that, apply these changes to the progs' print-tree.c:

---
--- a/print-tree.c
+++ b/print-tree.c
@@ -138,7 +138,7 @@ static void print_file_extent_item(struct extent_buffer *eb,
 		  btrfs_file_extent_inline_len(eb, fi),
 		  btrfs_file_extent_compression(eb, fi));
 		return;
-	}
+	} else
 	if (extent_type == BTRFS_FILE_EXTENT_PREALLOC) {
 		printf("\t\tprealloc data disk byte %llu nr %llu\n",
 		  (unsigned long long)btrfs_file_extent_disk_bytenr(eb, fi),
@@ -147,6 +147,8 @@ static void print_file_extent_item(struct extent_buffer *eb,
 		  (unsigned long long)btrfs_file_extent_offset(eb, fi),
 		  (unsigned long long)btrfs_file_extent_num_bytes(eb, fi));
 		return;
+	} else if (extent_type != BTRFS_FILE_EXTENT_REG) {
+		printf("*** ERROR unknown extent type %d\n", extent_type);
 	}
 	printf("\t\textent data disk byte %llu nr %llu\n",
 	  (unsigned long long)btrfs_file_extent_disk_bytenr(eb, fi),
---

and run the utility on your fs.

david