Hello!

I tried to "cp --reflink" a huge file (about 80G, a VMware disk image). After about a minute my PC started thrashing the hard disk, and a few minutes later the command returned with an out of memory message. I could no longer open terminals in KDE Konsole to investigate dmesg, start new programs, or log out. Hard disk access was somehow blocked. Opening a new terminal within Konsole showed, in red letters, "unable to start /bin/bash" after a few seconds.

I rebooted using the reset button; Alt+Print+S didn't seem to sync anything to the disk, although I tried it. The system booted up just fine: some error messages about unusable free space caches came up, but it reached the login manager. However, I can no longer log in: KDE startup freezes the system. If I ssh into the box first, I can see some dmesg output related to "bad blocks" and some "transid" errors.

So I ran a scrub. This is what I get:

# jupiter btrfs-progs-unstable [git:integration-20110805] # ./btrfs scr start -B /mnt/btrfs
ERROR: scrubbing /mnt/btrfs failed for device id 1 (Input/output error)
scrub canceled for 493dacb5-0397-4b47-bd18-c2b2349c9958
        scrub started at Sat Oct 8 00:41:59 2011 and was aborted after 6113 seconds
        total bytes scrubbed: 586.67GB with 40 errors
        error details: verify=40
        corrected errors: 0, uncorrectable errors: 20, unverified errors: 0

jupiter ~ # while true; do dmesg -c; sleep 1; done
[13686.476028] zcache: destroyed pool id=0
[13844.990091] device fsid 493dacb5-0397-4b47-bd18-c2b2349c9958 devid 1 transid 45326 /dev/sdd3
[13844.990374] btrfs: use lzo compression
[13845.074088] btrfs: disk space caching is enabled
[13852.481822] zcache: created ephemeral tmem pool, id=0
[19577.056022] btrfs: unable to fixup at 641086156800
[19577.056211] btrfs: unable to fixup at 641086160896
[19577.056383] btrfs: unable to fixup at 641086164992
[19577.056555] btrfs: unable to fixup at 641086169088
[19577.056733] btrfs: unable to fixup at 641086173184
[19577.858378] btrfs: unable to fixup at 641086156800
[19577.858566] btrfs: unable to fixup at 641086160896
[19577.858736] btrfs: unable to fixup at 641086164992
[19577.858909] btrfs: unable to fixup at 641086169088
[19577.859083] btrfs: unable to fixup at 641086173184
[19986.054338] verify_parent_transid: 310 callbacks suppressed
[19986.054343] parent transid verify failed on 641086156800 wanted 43863 found 43873
[19986.054559] parent transid verify failed on 641086156800 wanted 43863 found 43873
[19986.054904] parent transid verify failed on 641086156800 wanted 43863 found 43873
[19986.062448] parent transid verify failed on 641086156800 wanted 43863 found 43873
[19986.062455] parent transid verify failed on 641086156800 wanted 43863 found 43873

I was able to rsync my /home to the original partition from which I created my btrfs about 2 weeks ago, so this is not a complete disaster, but many of these were logged to dmesg while doing so:

[10902.814420] btrfs no csum found for inode 445127 start 58392576
[10902.815153] btrfs no csum found for inode 445127 start 58396672
[10902.815951] btrfs no csum found for inode 445127 start 58400768
[10902.816692] btrfs no csum found for inode 445127 start 58404864
[10902.817430] btrfs no csum found for inode 445127 start 58408960
[10902.818168] btrfs no csum found for inode 445127 start 58413056
[10902.818904] btrfs no csum found for inode 445127 start 58417152
[10902.819683] btrfs no csum found for inode 445127 start 58421248
[10902.820421] btrfs no csum found for inode 445127 start 58425344
[10902.821154] btrfs no csum found for inode 445127 start 58429440
[10902.821887] btrfs no csum found for inode 445127 start 58433536
[10902.822673] btrfs no csum found for inode 445127 start 58437632
[10902.823414] btrfs no csum found for inode 445127 start 58441728
[10902.824151] btrfs no csum found for inode 445127 start 58445824
[10902.824889] btrfs no csum found for inode 445127 start 58449920
[10902.825716] btrfs no csum found for inode 445127 start 58454016
[10960.325903] verify_parent_transid: 470 callbacks suppressed
[10960.325908] parent transid verify failed on 641086173184 wanted 43863 found 43873
[10960.326129] parent transid verify failed on 641086173184 wanted 43863 found 43873
[10960.326319] parent transid verify failed on 641086173184 wanted 43863 found 43873
[10960.334898] parent transid verify failed on 641086173184 wanted 43863 found 43873
[10960.334906] parent transid verify failed on 641086173184 wanted 43863 found 43873
[10960.334912] btrfs no csum found for inode 288125 start 8912896
[10960.335131] parent transid verify failed on 641086173184 wanted 43863 found 43873
[10960.335322] parent transid verify failed on 641086173184 wanted 43863 found 43873
[10960.335518] parent transid verify failed on 641086173184 wanted 43863 found 43873
[10960.335840] parent transid verify failed on 641086173184 wanted 43863 found 43873
[10960.335849] parent transid verify failed on 641086173184 wanted 43863 found 43873
[10960.335854] btrfs no csum found for inode 288125 start 8916992
[10960.336643] btrfs no csum found for inode 288125 start 8921088
[10960.337413] btrfs no csum found for inode 288125 start 8925184
[10960.338169] btrfs no csum found for inode 288125 start 8929280
[10960.338920] btrfs no csum found for inode 288125 start 8933376
[10960.339702] btrfs no csum found for inode 288125 start 8937472
[10960.340440] btrfs no csum found for inode 288125 start 8941568
[10960.341179] btrfs no csum found for inode 288125 start 8945664
[10960.341916] btrfs no csum found for inode 288125 start 8949760
[10960.342746] btrfs no csum found for inode 288125 start 8953856
[10960.343483] btrfs no csum found for inode 288125 start 8957952
[10960.344222] btrfs no csum found for inode 288125 start 8962048

I'm on Gentoo, using gentoo-sources 3.0.4 on the backup system (the btrfs system runs 3.0.6):

# uname -a
Linux jupiter 3.0.4-gentoo #1 SMP Sat Oct 1 17:20:43 CEST 2011 i686 Intel(R) Pentium(R) 4 CPU 3.20GHz GenuineIntel GNU/Linux

The btrfs partition consists of multiple subvolumes. I would not lose my /home, since I was able to sync it, but I would lose the rest of the file system, which includes a complete Gentoo installation and some big data files that are recoverable only by investing much time. So I'd love to get rid of the problems scrub complains about. I don't mind having to delete some files that I can probably recover easily, but I can find no way to identify these files. Is there a way to map the above error messages to file system paths?

Just for completeness, here's my fstab with mount options (the relevant part):

/dev/sdd3 / btrfs compress=lzo,autodefrag,subvol=root 0 1
/dev/sdd3 /home btrfs compress=lzo,autodefrag,subvol=home 0 2
/dev/sdd3 /usr/portage btrfs compress=lzo,autodefrag,subvol=portage 0 2
/dev/sdd3 /usr/src btrfs compress=lzo,autodefrag,subvol=usr-src 0 2
/dev/sdd3 /tmp btrfs compress=lzo,autodefrag,subvol=tmp,nodev,nosuid 0 2
/dev/sdd3 /var/tmp btrfs compress=lzo,autodefrag,subvol=var-tmp,nodev,nosuid 0 2
/dev/sdd3 /mnt/btrfs-subvol-0 btrfs compress=lzo,subvolid=0,autodefrag,noauto 0 2

I got rid of the free space caching issues by mounting with clear_cache. The system's memory is stable (memtest86 does not report errors). The hard disk is fresh from the factory, with no errors reported by smartctl and no sector errors reported in dmesg.

When the problem with "cp --reflink" occurred, I know there was a backtrace in dmesg related to btrfs, but I was not able to capture it.

The btrfs was created with metadata mirroring and user data striping, although the fs currently sits on a single disk only. I planned to add more disks once btrfs proves stable for me.

Thanks in advance,
Kai
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
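One possible way to approach the path-mapping question for the "no csum found for inode N" messages is sketched below. It is only a sketch under assumptions: the helper names extract_csum_inodes and resolve_inodes are made up for illustration, the affected subvolume is assumed to still be mountable, and walking its directory tree is assumed not to trip over the damaged metadata. It does not help with the "parent transid verify failed" messages, which name block addresses rather than inodes.

```shell
#!/bin/sh
# Hypothetical helpers, not part of btrfs-progs.

# Read kernel log text on stdin (one message per line) and print each
# inode number named in a "no csum found for inode N" message once.
extract_csum_inodes() {
    sed -n 's/.*no csum found for inode \([0-9][0-9]*\) .*/\1/p' | sort -un
}

# Read inode numbers on stdin and print the matching paths below one
# mounted subvolume. Inode numbers on btrfs are only unique within a
# subvolume, so -xdev is used to keep find from descending into others.
resolve_inodes() {
    subvol_root=$1
    while read -r inode; do
        find "$subvol_root" -xdev -inum "$inode" -print
    done
}
```

With the fstab above, a run against the home subvolume could look like "dmesg | extract_csum_inodes | resolve_inodes /home". Because the same inode number can exist in several subvolumes, a hit only identifies a candidate file; the subvolume in which the reads happened has to be known from context.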
Hello again!

2011/10/8 Kai Krakow <hurikhan77+btrfs@gmail.com>:
> I tried to "cp --reflink" a huge file (about 80G, a VMware disk
> image). It took maybe about 1 minute when my PC started thrashing the
> hard disk, some minutes later the command returned with an out of
> memory message.
[...]
> So I'd love to
> get rid of the problems scrub complains about. I don't mind if I would
> have to delete some files which I can probably recover easily. But I
> can simply find no way to identify these files. Is there a way to map
> the above error messages to file system pathes?

I figured out that most of the affected files are located in the Google Chromium cache. Running "cat" on all files there triggered the csum errors in dmesg, but the system stayed stable. I decided to simply delete the cache; however, that gets "rm" killed. Here's the dmesg output from the "rm -Rf .cache/chromium/*" session:

[ 637.297845] verify_parent_transid: 470 callbacks suppressed
[ 637.297852] parent transid verify failed on 641086160896 wanted 43863 found 43873
[ 637.298833] parent transid verify failed on 641086160896 wanted 43863 found 43873
[ 637.315081] parent transid verify failed on 641086160896 wanted 43863 found 43873
[ 637.345259] parent transid verify failed on 641086160896 wanted 43863 found 43873
[ 637.345269] parent transid verify failed on 641086160896 wanted 43863 found 43873
[ 637.345296] BUG: unable to handle kernel NULL pointer dereference at 0000001c
[ 637.345306] IP: [<c1142c2f>] btrfs_print_leaf+0xd/0x886
[ 637.345319] *pde = 00000000
[ 637.345324] Oops: 0000 [#1] SMP
[ 637.345330] Modules linked in: af_packet vmnet vmblock vsock vmci vmmon lm90 it87 hwmon_vid hwmon fuse rfcomm bnep snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss reiserfs zram(C) mperf loop emu10k1_gp sidewinder joydev nfs lockd auth_rpcgss nfs_acl sunrpc ipv6 tcp_cubic nvidia(P) snd_usb_audio snd_usbmidi_lib i82875p_edac usb_storage usbhid 8250_pnp gspca_sonixj gspca_main videodev firewire_ohci edac_core uas hid firewire_core sr_mod cdrom sg ns558 btusb analog evdev pcspkr ne2k_pci 8390 floppy gameport i2c_i801 8250 parport_pc serial_core parport e1000 intel_agp snd_mpu401 snd_mpu401_uart thermal bluetooth crc16 crc_itu_t fan processor button intel_gtt agpgart unix [last unloaded: microcode]
[ 637.345434]
[ 637.345439] Pid: 2465, comm: btrfs-delayed-m Tainted: P A C 3.0.6-gentoo #1 /8KNXP
[ 637.345446] EIP: 0060:[<c1142c2f>] EFLAGS: 00010286 CPU: 1
[ 637.345450] EIP is at btrfs_print_leaf+0xd/0x886
[ 637.345454] EAX: f506a800 EBX: f506a800 ECX: 00418335 EDX: 00000000
[ 637.345457] ESI: 00000000 EDI: f4c40850 EBP: fffffffb ESP: f49d7d30
[ 637.345460] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 637.345464] Process btrfs-delayed-m (pid: 2465, ti=f49d6000 task=f4707800 task.ti=f49d6000)
[ 637.345467] Stack:
[ 637.345469] f499e840 f505e000 000000b2 00000000 00000000 ffffffff 00000000 00000000
[ 637.345478] 00000100 00000000 0006cac6 00000000 c113b1ec c101e917 f506a800 c1350707
[ 637.345487] 11d22000 00000128 00060000 00008050 2811d220 a8000001 00040000 00000000
[ 637.345496] Call Trace:
[ 637.345502] [<c113b1ec>] ? update_block_group.clone.51+0x2b3/0x2e4
[ 637.345508] [<c101e917>] ? need_resched+0x11/0x1a
[ 637.345513] [<c1350707>] ? _cond_resched+0x5/0x18
[ 637.345518] [<c113b7e1>] ? __btrfs_free_extent+0x397/0x7e3
[ 637.345523] [<c113e818>] ? run_clustered_refs+0x839/0x869
[ 637.345528] [<c11b66f8>] ? rb_erase+0x14d/0x1f0
[ 637.345532] [<c113e805>] ? run_clustered_refs+0x826/0x869
[ 637.345537] [<c101d84e>] ? kmap_atomic_prot+0x23/0x96
[ 637.345542] [<c11356fc>] ? btrfs_search_slot+0x3e8/0x452
[ 637.345547] [<c113e909>] ? btrfs_run_delayed_refs+0xc1/0x144
[ 637.345552] [<c114d8a0>] ? __btrfs_end_transaction+0x70/0x19b
[ 637.345556] [<c114d9df>] ? btrfs_end_transaction_dmeta+0x14/0x18
[ 637.345561] [<c118d3dd>] ? btrfs_async_run_delayed_node_done+0x14d/0x1a0
[ 637.345567] [<c1174d3f>] ? worker_loop+0x10a/0x393
[ 637.345571] [<c1174c35>] ? btrfs_queue_worker+0x1f1/0x1f1
[ 637.345576] [<c103b98a>] ? kthread+0x63/0x68
[ 637.345580] [<c103b927>] ? kthread_worker_fn+0x10f/0x10f
[ 637.345585] [<c135263e>] ? kernel_thread_helper+0x6/0xd
[ 637.345588] Code: da d4 20 00 83 c4 2c 5b 5e 5f 5d c3 53 e8 ab ac ed ff 8a 58 64 e8 58 ab ed ff 88 d8 5b c3 55 57 56 53 83 ec 60 89 c3 89 54 24 2c <8b> 42 1c e8 8a ac ed ff 8b 50 60 89 54 24 48 e8 33 ab ed ff 8b
[ 637.345633] EIP: [<c1142c2f>] btrfs_print_leaf+0xd/0x886 SS:ESP 0068:f49d7d30
[ 637.345640] CR2: 000000000000001c
[ 637.345647] ---[ end trace 640af837f79e8469 ]---
[ 639.058404] parent transid verify failed on 641086160896 wanted 43863 found 43873
[ 639.058670] parent transid verify failed on 641086160896 wanted 43863 found 43873
[ 639.059074] parent transid verify failed on 641086160896 wanted 43863 found 43873
[ 639.067414] parent transid verify failed on 641086160896 wanted 43863 found 43873
[ 639.067423] parent transid verify failed on 641086160896 wanted 43863 found 43873
[ 639.067453] BUG: unable to handle kernel NULL pointer dereference at 0000001c
[ 639.067462] IP: [<c1142c2f>] btrfs_print_leaf+0xd/0x886
[ 639.067476] *pde = 00000000
[ 639.067481] Oops: 0000 [#2] SMP
[ 639.067487] Modules linked in: af_packet vmnet vmblock vsock vmci vmmon lm90 it87 hwmon_vid hwmon fuse rfcomm bnep snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss reiserfs zram(C) mperf loop emu10k1_gp sidewinder joydev nfs lockd auth_rpcgss nfs_acl sunrpc ipv6 tcp_cubic nvidia(P) snd_usb_audio snd_usbmidi_lib i82875p_edac usb_storage usbhid 8250_pnp gspca_sonixj gspca_main videodev firewire_ohci edac_core uas hid firewire_core sr_mod cdrom sg ns558 btusb analog evdev pcspkr ne2k_pci 8390 floppy gameport i2c_i801 8250 parport_pc serial_core parport e1000 intel_agp snd_mpu401 snd_mpu401_uart thermal bluetooth crc16 crc_itu_t fan processor button intel_gtt agpgart unix [last unloaded: microcode]
[ 639.067584]
[ 639.067590] Pid: 4241, comm: rm Tainted: P DA C 3.0.6-gentoo #1 /8KNXP
[ 639.067596] EIP: 0060:[<c1142c2f>] EFLAGS: 00010286 CPU: 1
[ 639.067600] EIP is at btrfs_print_leaf+0xd/0x886
[ 639.067604] EAX: f506a800 EBX: f506a800 ECX: 0042f40d EDX: 00000000
[ 639.067607] ESI: 00000000 EDI: f4c400e0 EBP: fffffffb ESP: ec6c5ce4
[ 639.067610] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 639.067614] Process rm (pid: 4241, ti=ec6c4000 task=f3731c00 task.ti=ec6c4000)
[ 639.067616] Stack:
[ 639.067618] f49f9780 f505e000 000000b2 00000000 00000000 ffffffff 00000000 00000000
[ 639.067627] 00000100 00000000 0006cac7 00000000 c113b1ec c101e917 f506a800 c1350707
[ 639.067636] 11d6c000 00000128 00001000 00008050 2811d6c0 a8000001 00040000 00000000
[ 639.067645] Call Trace:
[ 639.067651] [<c113b1ec>] ? update_block_group.clone.51+0x2b3/0x2e4
[ 639.067657] [<c101e917>] ? need_resched+0x11/0x1a
[ 639.067662] [<c1350707>] ? _cond_resched+0x5/0x18
[ 639.067667] [<c113b7e1>] ? __btrfs_free_extent+0x397/0x7e3
[ 639.067672] [<c113e818>] ? run_clustered_refs+0x839/0x869
[ 639.067676] [<c113e818>] ? run_clustered_refs+0x839/0x869
[ 639.067681] [<c10963a7>] ? kfree+0x88/0x90
[ 639.067685] [<c113e818>] ? run_clustered_refs+0x839/0x869
[ 639.067690] [<c11857c8>] ? btrfs_delayed_ref_lock+0x2c/0x74
[ 639.067694] [<c113e805>] ? run_clustered_refs+0x826/0x869
[ 639.067700] [<c113e909>] ? btrfs_run_delayed_refs+0xc1/0x144
[ 639.067704] [<c1350707>] ? _cond_resched+0x5/0x18
[ 639.067709] [<c114d8a0>] ? __btrfs_end_transaction+0x70/0x19b
[ 639.067713] [<c114da21>] ? btrfs_end_transaction+0x11/0x15
[ 639.067718] [<c1157509>] ? btrfs_evict_inode+0x172/0x1e5
[ 639.067723] [<c10ac392>] ? evict+0x52/0xe1
[ 639.067727] [<c10a580c>] ? do_unlinkat+0xca/0x10a
[ 639.067733] [<c10c0e00>] ? fsnotify_find_inode_mark+0x17/0x1d
[ 639.067737] [<c109aa57>] ? filp_close+0x56/0x5f
[ 639.067743] [<c1352093>] ? sysenter_do_call+0x12/0x22
[ 639.067745] Code: da d4 20 00 83 c4 2c 5b 5e 5f 5d c3 53 e8 ab ac ed ff 8a 58 64 e8 58 ab ed ff 88 d8 5b c3 55 57 56 53 83 ec 60 89 c3 89 54 24 2c <8b> 42 1c e8 8a ac ed ff 8b 50 60 89 54 24 48 e8 33 ab ed ff 8b
[ 639.067791] EIP: [<c1142c2f>] btrfs_print_leaf+0xd/0x886 SS:ESP 0068:ec6c5ce4
[ 639.067797] CR2: 000000000000001c
[ 639.067805] ---[ end trace 640af837f79e846a ]---

Now every file access (no matter where in the file system) makes the shell freeze, and the process cannot be killed. So, effectively, my system is frozen again. :-(

Any fix for this?

Greetings,
Kai
Hello again!

> [ 637.345296] BUG: unable to handle kernel NULL pointer dereference at 0000001c
> [ 637.345306] IP: [<c1142c2f>] btrfs_print_leaf+0xd/0x886

I tried to fix this by returning early from this function when "l" is NULL, as was done here:

http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg10697.html

But now I get:

[ 397.416733] parent transid verify failed on 641086169088 wanted 43863 found 43873
[ 397.417015] parent transid verify failed on 641086169088 wanted 43863 found 43873
[ 397.417289] parent transid verify failed on 641086169088 wanted 43863 found 43873
[ 397.424886] parent transid verify failed on 641086169088 wanted 43863 found 43873
[ 397.424895] parent transid verify failed on 641086169088 wanted 43863 found 43873
[ 397.424900] ------------[ cut here ]------------
[ 397.424910] WARNING: at fs/btrfs/extent-tree.c:4462 __btrfs_free_extent+0x3a6/0x7e3()
[ 397.424913] Hardware name:
[ 397.424915] Modules linked in: af_packet vmnet vmblock vsock vmci vmmon lm90 it87 hwmon_vid hwmon fuse bnep rfcomm snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss reiserfs zram(C) mperf loop emu10k1_gp sidewinder joydev nfs lockd auth_rpcgss nfs_acl sunrpc ipv6 tcp_cubic nvidia(P) snd_usb_audio snd_usbmidi_lib usb_storage gspca_sonixj usbhid sr_mod gspca_main hid cdrom videodev uas sg 8250_pnp btusb firewire_ohci i82875p_edac ne2k_pci 8390 bluetooth analog e1000 firewire_core ns558 gameport 8250 serial_core parport_pc i2c_i801 pcspkr snd_mpu401 floppy edac_core crc_itu_t crc16 evdev fan intel_agp processor button thermal parport snd_mpu401_uart intel_gtt agpgart unix [last unloaded: microcode]
[ 397.424997] Pid: 3873, comm: rm Tainted: P A C 3.0.6-gentoo #2
[ 397.425000] Call Trace:
[ 397.425008] [<c1028cec>] ? warn_slowpath_common+0x7c/0x8f
[ 397.425012] [<c113b7f0>] ? __btrfs_free_extent+0x3a6/0x7e3
[ 397.425016] [<c113b7f0>] ? __btrfs_free_extent+0x3a6/0x7e3
[ 397.425021] [<c1028d1a>] ? warn_slowpath_null+0x1b/0x1f
[ 397.425025] [<c113b7f0>] ? __btrfs_free_extent+0x3a6/0x7e3
[ 397.425031] [<c1096332>] ? kfree+0x13/0x90
[ 397.425035] [<c11857d0>] ? btrfs_delayed_ref_lock+0x2c/0x74
[ 397.425040] [<c113e805>] ? run_clustered_refs+0x826/0x869
[ 397.425046] [<c113681b>] ? btrfs_del_items+0x2f2/0x2fc
[ 397.425050] [<c113e909>] ? btrfs_run_delayed_refs+0xc1/0x144
[ 397.425055] [<c114d8a8>] ? __btrfs_end_transaction+0x70/0x19b
[ 397.425059] [<c114da29>] ? btrfs_end_transaction+0x11/0x15
[ 397.425064] [<c115756d>] ? btrfs_evict_inode+0x1ce/0x1e5
[ 397.425068] [<c10ac392>] ? evict+0x52/0xe1
[ 397.425073] [<c10a580c>] ? do_unlinkat+0xca/0x10a
[ 397.425078] [<c10c0e00>] ? fsnotify_find_inode_mark+0x17/0x1d
[ 397.425083] [<c109aa57>] ? filp_close+0x56/0x5f
[ 397.425089] [<c1352093>] ? sysenter_do_call+0x12/0x22
[ 397.425092] ---[ end trace b71c480eb63758b5 ]---
[ 397.425096] btrfs unable to find ref byte nr 1271613026304 parent 0 root 256 owner 445125 offset 2097152
[ 397.425108] BUG: unable to handle kernel NULL pointer dereference at 0000000c
[ 397.425114] IP: [<c1164530>] btrfs_item_size+0xe/0xb0
[ 397.425122] *pde = 00000000
[ 397.425125] Oops: 0000 [#1] SMP
[ 397.425129] Modules linked in: af_packet vmnet vmblock vsock vmci vmmon lm90 it87 hwmon_vid hwmon fuse bnep rfcomm snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss reiserfs zram(C) mperf loop emu10k1_gp sidewinder joydev nfs lockd auth_rpcgss nfs_acl sunrpc ipv6 tcp_cubic nvidia(P) snd_usb_audio snd_usbmidi_lib usb_storage gspca_sonixj usbhid sr_mod gspca_main hid cdrom videodev uas sg 8250_pnp btusb firewire_ohci i82875p_edac ne2k_pci 8390 bluetooth analog e1000 firewire_core ns558 gameport 8250 serial_core parport_pc i2c_i801 pcspkr snd_mpu401 floppy edac_core crc_itu_t crc16 evdev fan intel_agp processor button thermal parport snd_mpu401_uart intel_gtt agpgart unix [last unloaded: microcode]
[ 397.425200]
[ 397.425204] Pid: 3873, comm: rm Tainted: P AWC 3.0.6-gentoo #2 /8KNXP
[ 397.425210] EIP: 0060:[<c1164530>] EFLAGS: 00010292 CPU: 1
[ 397.425214] EIP is at btrfs_item_size+0xe/0xb0
[ 397.425217] EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000065
[ 397.425221] ESI: 00000065 EDI: f4c3dd90 EBP: 0000007a ESP: f4a75d1c
[ 397.425224] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 397.425228] Process rm (pid: 3873, ti=f4a74000 task=f4b06000 task.ti=f4a74000)
[ 397.425230] Stack:
[ 397.425232] b71c480e c1028cf1 c13eb382 c1554038 0000116e c113b7f0 a88f56d4 00000000
[ 397.425241] 00000000 f4c3dd90 00000000 00000000 f4c3dd90 fffffffb c113b883 c13fcf0d
[ 397.425250] 120af000 00000128 00000000 00000000 00000100 00000000 0006cac5 00000000
[ 397.425259] Call Trace:
[ 397.425264] [<c1028cf1>] ? warn_slowpath_common+0x81/0x8f
[ 397.425269] [<c113b7f0>] ? __btrfs_free_extent+0x3a6/0x7e3
[ 397.425273] [<c113b883>] ? __btrfs_free_extent+0x439/0x7e3
[ 397.425278] [<c1096332>] ? kfree+0x13/0x90
[ 397.425282] [<c11857d0>] ? btrfs_delayed_ref_lock+0x2c/0x74
[ 397.425286] [<c113e805>] ? run_clustered_refs+0x826/0x869
[ 397.425292] [<c113681b>] ? btrfs_del_items+0x2f2/0x2fc
[ 397.425297] [<c113e909>] ? btrfs_run_delayed_refs+0xc1/0x144
[ 397.425301] [<c114d8a8>] ? __btrfs_end_transaction+0x70/0x19b
[ 397.425306] [<c114da29>] ? btrfs_end_transaction+0x11/0x15
[ 397.425310] [<c115756d>] ? btrfs_evict_inode+0x1ce/0x1e5
[ 397.425314] [<c10ac392>] ? evict+0x52/0xe1
[ 397.425318] [<c10a580c>] ? do_unlinkat+0xca/0x10a
[ 397.425322] [<c10c0e00>] ? fsnotify_find_inode_mark+0x17/0x1d
[ 397.425326] [<c109aa57>] ? filp_close+0x56/0x5f
[ 397.425331] [<c1352093>] ? sysenter_do_call+0x12/0x22
[ 397.425333] Code: 11 85 ed 75 10 b9 04 00 00 00 8b 54 24 28 89 d8 e8 11 9b 00 00 83 c4 2c 5b 5e 5f 5d c3 55 57 56 53 83 ec 28 89 c3 89 d6 8d 6a 15 <8b> 78 0c 85 ff 74 1e 8b 40 14 39 c5 72 17 8d 4a 19 8b 53 18 01
[ 397.425382] EIP: [<c1164530>] btrfs_item_size+0xe/0xb0 SS:ESP 0068:f4a75d1c
[ 397.425388] CR2: 000000000000000c
[ 397.425392] ---[ end trace b71c480eb63758b6 ]---

This makes me think the problem is buried much deeper, because path->nodes[0] is NULL. :-(

Please advise.

Greetings,
Kai
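To see how widespread the metadata damage is, the block addresses can be tallied from the kernel log. The following is a minimal sketch, assuming the log is available as text with one message per line; the function name summarize_bad_blocks is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print every
# distinct block address named in "parent transid verify failed on X"
# or "unable to fixup at X" messages, followed by how often it appears.
summarize_bad_blocks() {
    sed -n \
        -e 's/.*parent transid verify failed on \([0-9][0-9]*\) .*/\1/p' \
        -e 's/.*unable to fixup at \([0-9][0-9]*\).*/\1/p' |
        sort -n | uniq -c | awk '{ print $2, $1 }'
}
```

A run such as "dmesg | summarize_bad_blocks" prints one "address count" pair per line, sorted by address. In the logs quoted in this thread, every such message names one of five addresses (641086156800, 641086160896, 641086164992, 641086169088 and 641086173184), and each is exactly 4096 bytes after the previous one, so the failures all fall within one contiguous 20 KiB range.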