Hello,

Somehow my subvolume with /home got corrupted. When I booted the machine this morning (after a perfectly normal shutdown) it gave me a bunch of kernel errors. I found out that if I comment out my /home entry in fstab, it boots ok. So / is not corrupted. I then booted from the live CD and set "clear_cache" for /home instead of "inode_cache,space_cache":

/dev/disk/by-label/btrfs-root / btrfs defaults,noatime,inode_cache,space_cache 0 0
/dev/disk/by-label/btrfs-root /var/lib/btrfs-root btrfs defaults,noatime,subvolid=0 0 0
#/dev/disk/by-label/btrfs-root /home btrfs defaults,noatime,subvol=__home-new,inode_cache,space_cache 0 0
/dev/disk/by-label/btrfs-root /home btrfs defaults,noatime,subvol=__home-new,clear_cache 0 0
/var/lib/btrfs-root/boot /boot none bind 0 0

Then I could mount the /home subvolume.

I also found the corrupted file:

? -????????? ? ? ? ? ? 13.4.4.40.js

Whenever I try to access it I get an Input/output error and the following error in kernel.log:

Oct 10 10:38:03 yukikaze kernel: [34592.275080] parent transid verify failed on 105930436608 wanted 58565 found 134248
Oct 10 10:38:03 yukikaze kernel: [34592.275161] BUG: scheduling while atomic: ls/2545/0x00000002
Oct 10 10:38:03 yukikaze kernel: [34592.275166] Modules linked in: ipv6 loop usb_storage uas radeon snd_hda_codec_hdmi ttm snd_hda_codec_via drm_kms_helper ppdev sg snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd edac_core soundcore sp5100_tco r8169 drm firewire_ohci firewire_core i2c_algo_bit i2c_piix4 i2c_core edac_mce_amd parport_pc shpchp parport pci_hotplug pcspkr evdev mii serio_raw k10temp psmouse asus_atk0110 snd_page_alloc crc_itu_t wmi button powernow_k8 processor mperf sr_mod cdrom sd_mod pata_acpi usbhid hid ohci_hcd pata_atiixp ahci libahci libata ehci_hcd scsi_mod usbcore
Oct 10 10:38:03 yukikaze kernel: [34592.275268] Pid: 2545, comm: ls Not tainted 3.0.6-aya1 #3
Oct 10 10:38:03 yukikaze kernel: [34592.275273] Call Trace:
Oct 10 10:38:03 yukikaze kernel: [34592.275288] [<ffffffff8143fd33>] __schedule_bug+0x5f/0x64
Oct 10 10:38:03 yukikaze kernel: [34592.275298] [<ffffffff81447c89>] __schedule+0x7c9/0x980
Oct 10 10:38:03 yukikaze kernel: [34592.275310] [<ffffffff812705e7>] ? submit_bio+0x87/0x110
Oct 10 10:38:03 yukikaze kernel: [34592.275320] [<ffffffff81009e29>] ? read_tsc+0x9/0x20
Oct 10 10:38:03 yukikaze kernel: [34592.275329] [<ffffffff8107e7bd>] ? ktime_get_ts+0xad/0xe0
Oct 10 10:38:03 yukikaze kernel: [34592.275338] [<ffffffff810eb550>] ? __lock_page+0x70/0x70
Oct 10 10:38:03 yukikaze kernel: [34592.275346] [<ffffffff8104ac6f>] schedule+0x3f/0x60
Oct 10 10:38:03 yukikaze kernel: [34592.275354] [<ffffffff81447fbf>] io_schedule+0x8f/0xd0
Oct 10 10:38:03 yukikaze kernel: [34592.275362] [<ffffffff810eb55e>] sleep_on_page+0xe/0x20
Oct 10 10:38:03 yukikaze kernel: [34592.275370] [<ffffffff8144876f>] __wait_on_bit+0x5f/0x90
Oct 10 10:38:03 yukikaze kernel: [34592.275379] [<ffffffff810eb748>] wait_on_page_bit+0x78/0x80
Oct 10 10:38:03 yukikaze kernel: [34592.275388] [<ffffffff81074140>] ? autoremove_wake_function+0x40/0x40
Oct 10 10:38:03 yukikaze kernel: [34592.275397] [<ffffffff81210902>] read_extent_buffer_pages+0x412/0x480
Oct 10 10:38:03 yukikaze kernel: [34592.275405] [<ffffffff811e4410>] ? verify_parent_transid+0x240/0x240
Oct 10 10:38:03 yukikaze kernel: [34592.275414] [<ffffffff811e529a>] btree_read_extent_buffer_pages.isra.61+0x8a/0xc0
Oct 10 10:38:03 yukikaze kernel: [34592.275422] [<ffffffff811e6bf1>] read_tree_block+0x41/0x60
Oct 10 10:38:03 yukikaze kernel: [34592.275431] [<ffffffff811cbaab>] read_block_for_search.isra.33+0x1fb/0x500
Oct 10 10:38:03 yukikaze kernel: [34592.275439] [<ffffffff811cb0bd>] ? generic_bin_search.constprop.35+0x17d/0x1f0
Oct 10 10:38:03 yukikaze kernel: [34592.275447] [<ffffffff811cb214>] ? bin_search+0xe4/0x130
Oct 10 10:38:03 yukikaze kernel: [34592.275454] [<ffffffff811ceb48>] btrfs_search_slot+0x358/0x900
Oct 10 10:38:03 yukikaze kernel: [34592.275464] [<ffffffff811e310f>] btrfs_lookup_inode+0x2f/0xa0
Oct 10 10:38:03 yukikaze kernel: [34592.275473] [<ffffffff811f6e38>] btrfs_iget+0x108/0x4d0
Oct 10 10:38:03 yukikaze kernel: [34592.275482] [<ffffffff811e0b7f>] ? btrfs_lookup_dir_item+0xdf/0x110
Oct 10 10:38:03 yukikaze kernel: [34592.275491] [<ffffffff811f78f3>] btrfs_lookup_dentry+0x383/0x480
Oct 10 10:38:03 yukikaze kernel: [34592.275499] [<ffffffff811367b9>] ? kmem_cache_alloc+0x149/0x160
Oct 10 10:38:03 yukikaze kernel: [34592.275508] [<ffffffff811f7a06>] btrfs_lookup+0x16/0x30
Oct 10 10:38:03 yukikaze kernel: [34592.275515] [<ffffffff811561d5>] d_alloc_and_lookup+0x45/0x90
Oct 10 10:38:03 yukikaze kernel: [34592.275524] [<ffffffff811632b5>] ? d_lookup+0x35/0x60
Oct 10 10:38:03 yukikaze kernel: [34592.275531] [<ffffffff81157a3e>] do_lookup+0x29e/0x310
Oct 10 10:38:03 yukikaze kernel: [34592.275538] [<ffffffff811586bc>] path_lookupat+0x11c/0x700
Oct 10 10:38:03 yukikaze kernel: [34592.275546] [<ffffffff81158cd1>] do_path_lookup+0x31/0xc0
Oct 10 10:38:03 yukikaze kernel: [34592.275553] [<ffffffff8115a909>] user_path_at+0x59/0xa0
Oct 10 10:38:03 yukikaze kernel: [34592.275561] [<ffffffff8102f8f0>] ? do_page_fault+0x1c0/0x4d0
Oct 10 10:38:03 yukikaze kernel: [34592.275570] [<ffffffff8114fd64>] vfs_fstatat+0x44/0x70
Oct 10 10:38:03 yukikaze kernel: [34592.275578] [<ffffffff810677fd>] ? do_sigaction+0x12d/0x1f0
Oct 10 10:38:03 yukikaze kernel: [34592.275586] [<ffffffff8114fdcb>] vfs_stat+0x1b/0x20
Oct 10 10:38:03 yukikaze kernel: [34592.275593] [<ffffffff8114ff0a>] sys_newstat+0x1a/0x40
Oct 10 10:38:03 yukikaze kernel: [34592.275601] [<ffffffff81067bcd>] ? sys_rt_sigaction+0x8d/0xc0
Oct 10 10:38:03 yukikaze kernel: [34592.275610] [<ffffffff8144b055>] ? page_fault+0x25/0x30
Oct 10 10:38:03 yukikaze kernel: [34592.275617] [<ffffffff8144b602>] system_call_fastpath+0x16/0x1b

My question: is it possible to delete this rogue file somehow, or to repair it? I tried to delete the directory that contained it, but got the same Input/output error.

Any help is appreciated.

I need to mention that I had the very same error a couple of months ago, with about 30 files in my /home getting corrupted this way. I had to create a new subvolume for /home (__home-new) and restore the missing files from backup. When I tried to delete the corrupted subvolume it gave me a bunch of kernel errors, but when I repeated the command, it completed ok. However, on reboot the space from this subvolume was not recovered. I tried to balance the subvolume after that, but after a couple of hours I am getting only the note about 22 extents in my kernel.log

Oct 10 11:03:22 yukikaze kernel: [36111.396313] btrfs: found 22 extents
Oct 10 11:03:27 yukikaze kernel: [36116.922236] btrfs: found 22 extents
Oct 10 11:03:33 yukikaze kernel: [36122.922488] btrfs: found 22 extents

and no relocation messages. So I think it got stuck :(

thanks
~dima

---
archlinux
Linux yukikaze 3.0.6-aya1 #3 SMP PREEMPT Sat Oct 8 19:01:41 JST 2011 x86_64 AMD Athlon(tm) II X4 635 Processor AuthenticAMD GNU/Linux
the latest btrfs-tools

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi,

On Mon, Oct 10, 2011 at 02:14:26AM +0000, dima wrote:
> Somehow my subvolume with /home got corrupted. When I booted the machine this
> morning (after perfectly normal shutdown) it gave me a bunch of kernel errors.

That's very strange. If it was a perfectly normal shutdown, I don't see how this could happen. External disk damage or bad RAM would seem like a convenient excuse :)

> I found out that if I comment out my /home entry in fstab, it would
> boot ok. So the / is not corrupted. I then booted from the live CD and
> set "clear_cache" for /home instead of "inode_cache,space_cache"
>
> /dev/disk/by-label/btrfs-root / btrfs
> defaults,noatime,inode_cache,space_cache 0 0
> /dev/disk/by-label/btrfs-root /var/lib/btrfs-root btrfs
> defaults,noatime,subvolid=0 0 0
> #/dev/disk/by-label/btrfs-root /home btrfs
> defaults,noatime,subvol=__home-new,inode_cache,space_cache 0 0
> /dev/disk/by-label/btrfs-root /home btrfs
> defaults,noatime,subvol=__home-new,clear_cache 0 0
> /var/lib/btrfs-root/boot /boot none bind 0 0
>
> Then I could mount the /home subvolume.
>
> I also found the corrupted file
> ? -????????? ? ? ? ? ? 13.4.4.40.js

Chromium cache? Somebody recently reported a problem there. I wonder what this browser does to the filesystem ... :)

> Whenever I try to access it I am getting Input/output error and the following
> error in the kernel.log
>
>
> Oct 10 10:38:03 yukikaze kernel: [34592.275080] parent transid verify failed on
> 105930436608 wanted 58565 found 134248
> Oct 10 10:38:03 yukikaze kernel: [34592.275161] BUG: scheduling while atomic:
> ls/2545/0x00000002

This bug is in most cases only a consequence of some btrfs BUG_ON; please try to find it in your logs or reproduce the problem. The 'parent transid verify' problem may cause a BUG_ON further up the call stack.

> My question - is it possible to delete this rogue file somehow or repair it?
> I tried to delete the directory that contained it, but got the same Input/output
> error.

Fsck to the rescue! Or, you can try Josef's repair [1] proggy to retrieve the data from the volume (AFAIK it should work around the parent transid problem). If all other files are fine, you can rebuild /home from that.

david

[1] git://github.com/josefbacik/btrfs-progs.git
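For anyone following David's suggestion to look through the logs, here is a minimal Python sketch (not part of any btrfs tooling) that pulls the two interesting message types out of kernel log lines: the "parent transid verify failed" message and any "kernel BUG at" location. The function name and the field names (bytenr, wanted, found) are made up for this sketch, and it assumes each log message sits on one line, i.e. not hard-wrapped by a mail client.

```python
import re

# Pattern for the btrfs message quoted in this thread. The group names
# are labels chosen for this sketch, not official btrfs terminology.
TRANSID_RE = re.compile(
    r"parent transid verify failed on (?P<bytenr>\d+) "
    r"wanted (?P<wanted>\d+) found (?P<found>\d+)"
)
# Matches lines such as "kernel BUG at fs/btrfs/inode.c:3024!".
# It does not match "BUG: scheduling while atomic", which is the
# follow-up symptom rather than the original BUG_ON.
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")


def scan_kernel_log(lines):
    """Collect transid failures and BUG locations from unwrapped log lines."""
    transid_failures = []
    bug_sites = []
    for line in lines:
        match = TRANSID_RE.search(line)
        if match:
            transid_failures.append(
                {name: int(value) for name, value in match.groupdict().items()}
            )
        match = BUG_RE.search(line)
        if match:
            bug_sites.append((match.group("file"), int(match.group("line"))))
    return transid_failures, bug_sites


if __name__ == "__main__":
    sample = [
        "Oct 10 10:38:03 yukikaze kernel: [34592.275080] parent transid "
        "verify failed on 105930436608 wanted 58565 found 134248",
        "Oct 10 14:03:13 yukikaze kernel: [ 9836.993261] kernel BUG at "
        "fs/btrfs/inode.c:3024!",
    ]
    failures, bugs = scan_kernel_log(sample)
    print(failures)
    print(bugs)
```

In practice the lines would come from the kernel log file rather than a hard-coded sample, e.g. by passing an open file object to scan_kernel_log().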
Thanks David,
The last shutdown was clean, but I had to powercycle several times this month.
I am also mounting a swapfile via a loop device, so maybe this also adds to the
instability.
The corrupt file is a firefox source file
(mozilla-central/js/src/tests/e4x/XML/13.4.4.40.js). Interestingly, I
did not touch this file or rebuild firefox for about 3-4 days, so I have
no idea why it suddenly got corrupted.
When trying to remove the directory containing this file I am getting:
Oct 10 14:03:13 yukikaze kernel: [ 9836.993172] ------------[ cut here ]------------
Oct 10 14:03:13 yukikaze kernel: [ 9836.993261] kernel BUG at fs/btrfs/inode.c:3024!
Oct 10 14:03:13 yukikaze kernel: [ 9836.993340] invalid opcode: 0000 [#1] PREEMPT SMP
Oct 10 14:03:13 yukikaze kernel: [ 9836.993438] CPU 0
Oct 10 14:03:13 yukikaze kernel: [ 9836.993474] Modules linked in: reiserfs usb_storage uas ipv6 loop snd_hda_codec_hdmi snd_hda_codec_via sg snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd sp5100_tco i2c_piix4 radeon ttm drm_kms_helper drm i2c_algo_bit firewire_ohci psmouse ppdev shpchp evdev serio_raw pcspkr firewire_core pci_hotplug i2c_core edac_core soundcore snd_page_alloc asus_atk0110 k10temp edac_mce_amd parport_pc parport crc_itu_t r8169 mii button wmi powernow_k8 processor mperf usbhid hid sr_mod cdrom sd_mod pata_acpi ohci_hcd ehci_hcd pata_atiixp ahci libahci libata scsi_mod usbcore
Oct 10 14:03:13 yukikaze kernel: [ 9836.994630]
Oct 10 14:03:13 yukikaze kernel: [ 9836.994662] Pid: 3043, comm: rm Not tainted 3.0.6-aya1 #3 System manufacturer System Product Name/M4A785TD-V EVO
Oct 10 14:03:13 yukikaze kernel: [ 9836.994840] RIP: 0010:[<ffffffff811f5221>] [<ffffffff811f5221>] btrfs_unlink+0xd1/0xe0
Oct 10 14:03:13 yukikaze kernel: [ 9836.994983] RSP: 0018:ffff8800a616fe28  EFLAGS: 00010282
Oct 10 14:03:13 yukikaze kernel: [ 9836.995070] RAX: 00000000fffffffe RBX: ffff8801178f6240 RCX: 000000000331d8c0
Oct 10 14:03:13 yukikaze kernel: [ 9836.995185] RDX: 000000000331d880 RSI: 0000000000018dc0 RDI: ffffea0003d28130
Oct 10 14:03:13 yukikaze kernel: [ 9836.995301] RBP: ffff8800a616fe58 R08: ffffffff811c7dda R09: 0000000000000000
Oct 10 14:03:13 yukikaze kernel: [ 9836.995416] R10: 0000000000000000 R11: 0000000000000001 R12: 00000000fffffffe
Oct 10 14:03:13 yukikaze kernel: [ 9836.995530] R13: ffff880096fb05c8 R14: ffff8801186ad800 R15: ffff8800426bbf88
Oct 10 14:03:13 yukikaze kernel: [ 9836.995646] FS:  00007f54a0d6e700(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
Oct 10 14:03:13 yukikaze kernel: [ 9836.995777] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Oct 10 14:03:13 yukikaze kernel: [ 9836.995870] CR2: 0000000001ddf0b8 CR3: 00000001081d9000 CR4: 00000000000006f0
Oct 10 14:03:13 yukikaze kernel: [ 9836.995984] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct 10 14:03:13 yukikaze kernel: [ 9836.996099] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Oct 10 14:03:13 yukikaze kernel: [ 9836.996113] Process rm (pid: 3043, threadinfo ffff8800a616e000, task ffff8800967e1d00)
Oct 10 14:03:13 yukikaze kernel: [ 9836.996113] Stack:
Oct 10 14:03:13 yukikaze kernel: [ 9836.996113]  0000000000000000 ffff880012f8b300 0000000000000000 ffff880096fb05c8
Oct 10 14:03:13 yukikaze kernel: [ 9836.996113]  0000000000000000 0000000000000003 ffff8800a616fe88 ffffffff8115a42f
Oct 10 14:03:13 yukikaze kernel: [ 9836.996113]  ffff8800a616fe88 ffff880012f8b300 ffff8800426bbf88 0000000000000000
Oct 10 14:03:13 yukikaze kernel: [ 9836.996113] Call Trace:
Oct 10 14:03:13 yukikaze kernel: [ 9836.996113]  [<ffffffff8115a42f>] vfs_unlink+0x9f/0x110
Oct 10 14:03:13 yukikaze kernel: [ 9836.996113]  [<ffffffff8115a63a>] do_unlinkat+0x19a/0x1c0
Oct 10 14:03:13 yukikaze kernel: [ 9836.996113]  [<ffffffff811496b6>] ? filp_close+0x66/0x90
Oct 10 14:03:13 yukikaze kernel: [ 9836.996113]  [<ffffffff8115b332>] sys_unlinkat+0x22/0x40
Oct 10 14:03:13 yukikaze kernel: [ 9836.996113]  [<ffffffff8144b602>] system_call_fastpath+0x16/0x1b
Oct 10 14:03:13 yukikaze kernel: [ 9836.996113] Code: 5d d8 4c 8b 65 e0 4c 8b 6d e8 4c 8b 75 f0 4c 8b 7d f8 c9 c3 66 0f 1f 44 00 00 4c 89 fe 48 89 df e8 e5 cd ff ff 85 c0 74 b8 0f 0b <0f> 0b 41 89 c4 eb c9 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 57
Oct 10 14:03:13 yukikaze kernel: [ 9836.996113] RIP  [<ffffffff811f5221>] btrfs_unlink+0xd1/0xe0
Oct 10 14:03:13 yukikaze kernel: [ 9836.996113]  RSP <ffff8800a616fe28>
Oct 10 14:03:13 yukikaze kernel: [ 9837.023860] ---[ end trace 771cebd6df5534bd ]---
I ran btrfsck from the latest btrfs-tools.
After printing
        item 33 key (150121906176 EXTENT_ITEM 4096) itemoff 2234 itemsize 51
                extent refs 1 gen 33099 flags 2
                tree block key (1215402 1 0) level 0
                tree block backref root 257
(i.e. very early, about 4-5 seconds after I started checking)
it gave me an error:
failed to find block number 150121762816
Unless I touch this file, the FS is fully functional.
Yes, I can create a new subvolume of course, but as I mentioned before, there is
a big chance that the corrupted one will not be deleted cleanly and my disk gets
bloated even more with junk data I can do nothing about.
thanks
~dima
On Mon, Oct 10, 2011 at 11:03:34AM +0000, dima wrote:
> The last shutdown was clean, but I had to powercycle several times this month.
> I am also mounting a swapfile via loop device, so maybe this also adds up to
> instability.
>
> The corrupt file is a firefox source file
> (mozilla-central/js/src/tests/e4x/XML/13.4.4.40.js). Interesting thing that I
> did not touch this file or rebuild firefox for about 3-4 days, so I do not have
> any idea why it got corrupted suddenly.
>
> When trying to remove the directory containing this file I am getting:
>
> Oct 10 14:03:13 yukikaze kernel: [ 9836.993172] ------------[ cut here
> ]------------
> Oct 10 14:03:13 yukikaze kernel: [ 9836.993261] kernel BUG at
> fs/btrfs/inode.c:3024!

fixed by:

commit b532402e4d147e4f409c4e7f50d4413e8450101d
Author: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Date:   Tue Jul 19 07:27:20 2011 +0000

    Btrfs: return error to caller when btrfs_unlink() failes

    When btrfs_unlink_inode() and btrfs_orphan_add() in btrfs_unlink()
    are error, the error code is returned to the caller instead
    of BUG_ON().

david
David Sterba wrote:
>> Then I could mount the /home subvolume.
>>
>> I also found the corrupted file
>> ? -????????? ? ? ? ? ? 13.4.4.40.js
>
> Chromium cache? Somebody recently reported a problem there. I wonder
> what this browser does to the filesystem ... :)

If you meant me by "someone": No, my problem was not related to chromium usage - the problems only arose there because of a previous "cp --reflink" issue while I continued browsing. ;-)

So it is pure coincidence, because browser caches are a probable destination for write access, while my system came to a complete halt due to a browsing-unrelated file operation (cp --reflink).

Regards,
Kai
Oh, I see. The fix is not in 3.0.x but on the master branch. I will need the latest 3.1 RC. I will try this.

Thanks David
I have upgraded to 3.1 rc8. I created a new subvolume for /home, copied the files there from the old subvolume and deleted the old subvolume. It looks like the space has been reclaimed fine.

Though when running btrfsck I am still getting the same error:

failed to find block number 150121762816
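The create/copy/delete sequence described above can be sketched as a small Python helper that only builds the commands without running them. This is an illustration under assumptions, not the exact commands used in this thread: the function name and the target subvolume name "__home-3" are hypothetical, and the top-level path /var/lib/btrfs-root is taken from the fstab earlier in the thread, where subvolid=0 is mounted. A file that returns Input/output error will presumably still fail to copy and would have to come from backup.

```python
import shlex


def home_migration_commands(top, old, new):
    """Return argv lists for: create new subvolume, copy data, delete old one.

    top is the mount point of the top-level volume (subvolid=0);
    old and new are subvolume names directly below it.
    """
    old_path = f"{top}/{old}"
    new_path = f"{top}/{new}"
    return [
        ["btrfs", "subvolume", "create", new_path],
        # "old/." copies the directory contents, including dotfiles,
        # and -a preserves ownership, permissions and timestamps.
        ["cp", "-a", f"{old_path}/.", f"{new_path}/"],
        ["btrfs", "subvolume", "delete", old_path],
    ]


if __name__ == "__main__":
    # "__home-3" is a made-up name for the replacement subvolume.
    for argv in home_migration_commands(
        "/var/lib/btrfs-root", "__home-new", "__home-3"
    ):
        print(shlex.join(argv))
```

Printing the commands first makes it easy to review them before running anything as root; the subvol= option in the /home fstab entry would then need to point at the new subvolume.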