Peter Marheine
2012-Aug-14 18:20 UTC
Hung I/O, Kernel BUG with corrupt leaf (bad key order)
Hi all,

I'm running btrfs in a 3-disk RAID1 configuration. After a hard power-off, I'm seeing a lot of hung I/O tasks on this volume, apparently due to a corrupt leaf. I first noticed the problem on kernel 3.4.7, and it's persisted with 3.4.8. Relevant parts of the kernel log follow.

[ 85.179621] block group 38684065792 has an wrong amount of free space
[ 85.179667] btrfs: failed to load free space cache for block group 38684065792
[ 136.969477] btrfs: corrupt leaf, bad key order: block=1478255230976,root=1, slot=26
[ 136.998953] btrfs: corrupt leaf, bad key order: block=1478255230976,root=1, slot=26
[ 137.000492] btrfs: corrupt leaf, bad key order: block=1478255230976,root=1, slot=26
[ 137.000708] btrfs: corrupt leaf, bad key order: block=1478255230976,root=1, slot=26
[ 153.912922] btrfs: corrupt leaf, bad key order: block=1478255230976,root=1, slot=26
[ 153.913020] ------------[ cut here ]------------
[ 153.913055] kernel BUG at fs/btrfs/inode.c:828!
[ 153.913087] invalid opcode: 0000 [#1] PREEMPT SMP
[ 153.913142] CPU 1
[ 153.913155] Modules linked in: nfsd exportfs arc4 snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_pcm ath5k ath microcode i915 video i2c_algo_bit acpi_cpufreq drm_kms_helper mperf mac80211 cfg80211 i2c_i801 rfkill serio_raw drm processor evdev snd_page_alloc snd_timer snd coretemp soundcore mei(C) psmouse pcspkr e1000e iTCO_wdt i2c_core button iTCO_vendor_support intel_agp intel_gtt nfs nfs_acl lockd auth_rpcgss sunrpc fscache dm_mod floppy btrfs crc32c libcrc32c zlib_deflate ext4 crc16 jbd2 mbcache uhci_hcd ehci_hcd usbcore usb_common sd_mod ahci libahci pata_marvell libata scsi_mod
[ 153.913685]
[ 153.913698] Pid: 325, comm: btrfs-transacti Tainted: G C 3.4.8-1-ARCH #1 /DG33TL
[ 153.913767] RIP: 0010:[<ffffffffa0197cd0>] [<ffffffffa0197cd0>] cow_file_range+0x3d0/0x4b0 [btrfs]
[ 153.913841] RSP: 0018:ffff8801a1fb1580 EFLAGS: 00010246
[ 153.913873] RAX: ffff88019cd38000 RBX: ffff8801a1fb18e8 RCX: 000000000000ffff
[ 153.913911] RDX: ffff88019d8bb800 RSI: ffffea00060d0040 RDI: ffff88017dff47f0
[ 153.913951] RBP: ffff8801a1fb1640 R08: ffff8801a1fb18d4 R09: ffff8801a1fb18e8
[ 153.913990] R10: 0000000000010000 R11: 0000000000000001 R12: 0000000000000000
[ 153.914029] R13: 0000000000000000 R14: 0000000000001000 R15: ffff88017dff47f0
[ 153.914068] FS: 0000000000000000(0000) GS:ffff8801abc80000(0000) knlGS:0000000000000000
[ 153.914112] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 153.914144] CR2: 00007f085106b000 CR3: 0000000198736000 CR4: 00000000000007e0
[ 153.914182] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 153.914221] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 153.914261] Process btrfs-transacti (pid: 325, threadinfo ffff8801a1fb0000, task ffff88019cd7b790)
[ 153.914308] Stack:
[ 153.914322] 0000000000000000 ffff880162624b60 0000000000000286 0000000000000003
[ 153.914377] 000000000000ffff ffff88017dff4620 ffff8801a1fb15f0 ffffea00060d0040
[ 153.914431] ffff8801a1fb15f0 ffff88019d8bb800 ffff8801a09ad360 ffff8801a1fb18d4
[ 153.914485] Call Trace:
[ 153.914516] [<ffffffffa01b687f>] ? free_extent_buffer+0x2f/0x70 [btrfs]
[ 153.914565] [<ffffffffa0198173>] run_delalloc_nocow+0x3c3/0x950 [btrfs]
[ 153.914615] [<ffffffffa0198a31>] run_delalloc_range+0x331/0x3a0 [btrfs]
[ 153.914665] [<ffffffffa01b52f1>] __extent_writepage+0x341/0x7c0 [btrfs]
[ 153.914715] [<ffffffffa01b5a52>] extent_write_cache_pages.isra.26.constprop.44+0x2e2/0x3e0 [btrfs]
[ 153.914775] [<ffffffffa01b5da5>] extent_writepages+0x45/0x60 [btrfs]
[ 153.914823] [<ffffffffa0194330>] ? btrfs_writepage+0x70/0x70 [btrfs]
[ 153.914871] [<ffffffffa01b191e>] ? free_extent_state+0x1e/0x30 [btrfs]
[ 153.914919] [<ffffffffa0193338>] btrfs_writepages+0x28/0x30 [btrfs]
[ 153.916201] [<ffffffff81118082>] do_writepages+0x22/0x50
[ 153.916315] [<ffffffff8110d5fb>] __filemap_fdatawrite_range+0x5b/0x60
[ 153.916315] [<ffffffff8110d61f>] filemap_fdatawrite+0x1f/0x30
[ 153.920013] [<ffffffff8110d665>] filemap_write_and_wait+0x35/0x60
[ 153.920013] [<ffffffffa01cf622>] __btrfs_write_out_cache+0x792/0x9a0 [btrfs]
[ 153.920013] [<ffffffffa0175b25>] ? __find_space_info+0x85/0xa0 [btrfs]
[ 153.920013] [<ffffffffa017f28b>] ? btrfs_run_delayed_refs+0x1cb/0x450 [btrfs]
[ 153.920013] [<ffffffffa01cf8c5>] btrfs_write_out_cache+0x95/0xf0 [btrfs]
[ 153.920013] [<ffffffffa017fa2f>] btrfs_write_dirty_block_groups+0x51f/0x5f0 [btrfs]
[ 153.920013] [<ffffffffa01e9b2a>] commit_cowonly_roots+0xec/0x1c6 [btrfs]
[ 153.920013] [<ffffffffa0190895>] btrfs_commit_transaction+0x575/0xaa0 [btrfs]
[ 153.920013] [<ffffffff81073b50>] ? abort_exclusive_wait+0xb0/0xb0
[ 153.920013] [<ffffffffa0188e15>] transaction_kthread+0x235/0x2b0 [btrfs]
[ 153.920013] [<ffffffffa0188be0>] ? btrfs_alloc_root+0x50/0x50 [btrfs]
[ 153.920013] [<ffffffff810731c3>] kthread+0x93/0xa0
[ 153.920013] [<ffffffff8146bfa4>] kernel_thread_helper+0x4/0x10
[ 153.920013] [<ffffffff81073130>] ? kthread_freezable_should_stop+0x70/0x70
[ 153.920013] [<ffffffff8146bfa0>] ? gs_change+0x13/0x13
[ 153.920013] Code: ff 48 8b 75 88 48 8b 7d 80 41 89 c0 b9 a3 03 00 00 48 c7 c2 63 10 1f a0 41 89 c6 e8 ab 3e fd ff eb 2a 66 0f 1f 84 00 00 00 00 00 <0f> 0b 48 8b 75 88 48 8b 7d 80 41 89 c0 b9 7d 03 00 00 48 c7 c2
[ 153.920013] RIP [<ffffffffa0197cd0>] cow_file_range+0x3d0/0x4b0 [btrfs]
[ 153.920013] RSP <ffff8801a1fb1580>
[ 153.920330] ---[ end trace 462486d382b33cae ]---

Btrfsck on this volume prints a lot of messages about incorrect backrefs, and eventually fails out due to bad key ordering:

backpointer mismatch on [823847440384 1204224]
owner ref check failed [823847440384 1204224]
ref mismatch on [823848644608 1269760] extent item 1, found 0
Incorrect local backref count on 823848644608 root 5 owner 136598 offset 0 found 0 wanted 1 back 0xa6 cc9a0
backpointer mismatch on [823848644608 1269760]
owner ref check failed [823848644608 1269760]
ref mismatch on [823849914368 1662976] extent item 1, found 0
Incorrect local backref count on 823849914368 root 5 owner 136599 offset 0 found 0 wanted 1 back 0xa6 ccc00
backpointer mismatch on [823849914368 1662976]
owner ref check failed [823849914368 1662976]
ref mismatch on [823851577344 1585152] extent item 1, found 0
Incorrect local backref count on 823851577344 root 5 owner 136600 offset 0 found 0 wanted 1 back 0xa6 cd0c0
backpointer mismatch on [823851577344 1585152]
owner ref check failed [823851577344 1585152]
ref mismatch on [823853162496 1585152] extent item 1, found 0
Incorrect local backref count on 823853162496 root 5 owner 136601 offset 0 found 0 wanted 1 back 0xa6 cd580
backpointer mismatch on [823853162496 1585152]
owner ref check failed [823853162496 1585152]
ref mismatch on [823854747648 1777664] extent item 1, found 0
Incorrect local backref count on 823854747648 root 5 owner 136602 offset 0 found 0 wanted 1 back 0xa6cd450
backpointer mismatch on [823854747648 1777664]
owner ref check failed [823854747648 1777664]
owner ref check failed [1478255230976 4096]
Errors found in extent allocation tree
checking fs roots
bad key ordering 26 27
btrfsck: btrfsck.c:873: count_csum_range: Assertion `!(ret < 0)' failed.

Is there some way to fix this corruption? I noticed what looks like the same problem in an earlier message on the list ("btrfs unmountable after failed suspend", February 7), but with no resolution. I have offline backups, but recovering those in their entirety will take some time, so a solution that doesn't require wiping the entire FS would be preferred.

--
Peter Marheine
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
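For readers unfamiliar with the "bad key order" message: items in a btrfs leaf are sorted by key, and a key is the triple (objectid, type, offset), compared in that order. The following is a minimal Python sketch of that kind of ordering check, offered only as an illustration. It is not the kernel's or btrfsck's code, and the `Key` type, the `first_bad_slot` helper, and the sample keys are all hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class Key(NamedTuple):
    """Hypothetical stand-in for a btrfs key; fields compare in this order."""
    objectid: int
    type: int
    offset: int


def first_bad_slot(keys: Sequence[Key]) -> Optional[int]:
    """Return the first slot whose key is not strictly smaller than the
    key in the next slot, or None if the keys are correctly ordered."""
    for slot in range(len(keys) - 1):
        # NamedTuple comparison is lexicographic: objectid, type, offset.
        if keys[slot] >= keys[slot + 1]:
            return slot
    return None


# Made-up keys for illustration only; not taken from the damaged volume.
good_leaf = [Key(256, 1, 0), Key(256, 12, 256), Key(257, 1, 0)]
bad_leaf = [Key(256, 1, 0), Key(257, 1, 0), Key(256, 12, 256)]

print(first_bad_slot(good_leaf))
print(first_bad_slot(bad_leaf))
```

Whether a given tool reports the first or the second slot of an out-of-order pair is a detail of that tool; the sketch simply returns the first of the two.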
Peter Marheine
2012-Aug-15 01:29 UTC
Re: Hung I/O, Kernel BUG with corrupt leaf (bad key order)
> Is there some way to fix this corruption? I noticed what looks like
> the same problem in an earlier message on the list ("btrfs unmountable
> after failed suspend", February 7), but with no resolution. I have
> offline backups, but recovering those in their entirety will take some
> time, so a solution that doesn't require wiping the entire FS would be
> preferred.

I did some further investigation into the problem, and I have determined the problematic directory (by seeing where `ls -R` hangs). If I skip the corrupt directory, everything works properly, but attempting to list its contents causes the entire volume to stop responding.

At this point I'd like to simply unlink the corrupt directory (without enumerating it). Is that possible, or should I just image the volume minus the corrupt directory and recreate my fs?
David Sterba
2012-Aug-22 15:01 UTC
Re: Hung I/O, Kernel BUG with corrupt leaf (bad key order)
On Tue, Aug 14, 2012 at 01:20:36PM -0500, Peter Marheine wrote:
> Hi all,
>
> I'm running btrfs in a 3-disk RAID1 configuration. After a hard
> power-off, I'm seeing a lot of hung I/O tasks on this volume,
> apparently due to a corrupt leaf. I first noticed the problem on
> kernel 3.4.7, and it's persisted with 3.4.8. Relevant parts of the
> kernel log follow.

What was the filesystem activity when the power-off happened?

>
> [ 85.179621] block group 38684065792 has an wrong amount of free space
> [ 85.179667] btrfs: failed to load free space cache for block group
> 38684065792
> [ 136.969477] btrfs: corrupt leaf, bad key order:
> block=1478255230976,root=1, slot=26
> [ 136.998953] btrfs: corrupt leaf, bad key order:
> block=1478255230976,root=1, slot=26
> [ 137.000492] btrfs: corrupt leaf, bad key order:
> block=1478255230976,root=1, slot=26
> [ 137.000708] btrfs: corrupt leaf, bad key order:
> block=1478255230976,root=1, slot=26
> [ 153.912922] btrfs: corrupt leaf, bad key order:
> block=1478255230976,root=1, slot=26
> [ 153.913020] ------------[ cut here ]------------
> [ 153.913055] kernel BUG at fs/btrfs/inode.c:828!

 809 static noinline int cow_file_range(struct inode *inode,
 810                                    struct page *locked_page,
 811                                    u64 start, u64 end, int *page_started,
 812                                    unsigned long *nr_written,
 813                                    int unlock)
 814 {
 [...]
 828         BUG_ON(btrfs_is_free_space_inode(root, inode));

Together with the 'block group' warning above, this seems to be the bug that Liu Bo fixed with the patches

  Btrfs: fix a bug of writting free space cache with nodatacow option
  Btrfs: fix a bug of writting free space cache during balance
  Btrfs: fix btrfs_is_free_space_inode to recognize btree inode

that should appear in 3.6.

You can try to mount with 'nospace_cache' or 'clear_cache' to see whether redoing the space cache from scratch makes a difference, but I'm afraid the bad keys will remain and would have to be removed via offline fsck.

david
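As a rough illustration of the mount suggestion above, the sketch below builds (but does not run) the corresponding `mount` command lines. The device `/dev/sdX`, the mountpoint `/mnt/volume`, and the `mount_command` helper are placeholders and assumptions, not details from the thread. Actually running the printed commands would require root and an unmounted filesystem.

```python
# The two mount options named in the reply above.
SPACE_CACHE_OPTIONS = ("nospace_cache", "clear_cache")


def mount_command(device, mountpoint, option):
    """Build the argv for mounting a btrfs volume with one of the options.

    Hypothetical helper: it only constructs the command, it never runs it.
    """
    if option not in SPACE_CACHE_OPTIONS:
        raise ValueError("unexpected mount option: " + option)
    return ["mount", "-t", "btrfs", "-o", option, device, mountpoint]


# Dry run: print the commands instead of executing them.
for option in SPACE_CACHE_OPTIONS:
    print(" ".join(mount_command("/dev/sdX", "/mnt/volume", option)))
```

Neither option touches the corrupt leaf itself, which matches the caveat in the reply that the bad keys would still have to be dealt with by an offline check.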