Hi!

I got an X server / DRM related crash or hard lockup. After I rebooted, I
tried to mount the BTRFS filesystem on my eSATA disk. It has big metadata
blocks (mkfs.btrfs -l 32768 -n 32768).

I got:

[ 43.764274] ata5: exception Emask 0x10 SAct 0x0 SErr 0x4050000 action 0xe frozen
[ 43.764278] ata5: irq_stat 0x00000040, connection status changed
[ 43.764281] ata5: SError: { PHYRdyChg CommWake DevExch }
[ 43.764287] ata5: hard resetting link
[ 46.978917] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 46.989402] ata5.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 46.989407] ata5.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[ 46.990609] ata5.00: ATA-8: Hitachi HTS545050B9A300, PB4OC60G, max UDMA/133
[ 46.990613] ata5.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
[ 46.991925] ata5.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 46.991930] ata5.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[ 46.993155] ata5.00: configured for UDMA/133
[ 47.003851] ata5: EH complete
[ 47.003958] scsi 4:0:0:0: Direct-Access ATA Hitachi HTS54505 PB4O PQ: 0 ANSI: 5
[ 47.004135] sd 4:0:0:0: [sdb] 976773168 512-byte logical blocks: (500 GB/465 GiB)
[ 47.004191] sd 4:0:0:0: [sdb] Write Protect is off
[ 47.004194] sd 4:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[ 47.004218] sd 4:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 47.050154] sdb: sdb1
[ 47.050390] sd 4:0:0:0: [sdb] Attached SCSI disk
[ 58.100217] CPU1: Package power limit notification (total events = 1)
[ 58.100220] CPU3: Package power limit notification (total events = 1)
[ 58.100221] CPU2: Package power limit notification (total events = 1)
[ 58.100225] CPU0: Package power limit notification (total events = 1)
[ 58.103689] CPU1: Package power limit normal
[ 58.103691] CPU3: Package power limit normal
[ 58.103692] CPU2: Package power limit normal
[ 58.103695] CPU0: Package power limit normal
[ 249.200560] device label daten devid 1 transid 2194 /dev/sdb1
[ 249.201186] btrfs: use lzo compression
[ 249.201192] btrfs: disk space caching is enabled
[ 249.241975] btrfs: bdev /dev/sdb1 errs: wr 0, rd 0, flush 0, corrupt 0, gen 0
[ 251.620610] ------------[ cut here ]------------
[ 251.620693] kernel BUG at fs/btrfs/inode.c:3758!
[ 251.620767] invalid opcode: 0000 [#1] PREEMPT SMP
[ 251.620842] CPU 1
[ 251.620960]
[ 251.620988] Pid: 3430, comm: mount Tainted: G O 3.5.0-rc4-tp520 #1 LENOVO 42433WG/42433WG
[ 251.621149] RIP: 0010:[<ffffffffa023a93f>] [<ffffffffa023a93f>] btrfs_evict_inode+0xcd/0x278 [btrfs]
[ 251.621289] RSP: 0018:ffff880157033a58 EFLAGS: 00010246
[ 251.621370] RAX: 0000000000000000 RBX: ffff8801c1747800 RCX: 000000000000001a
[ 251.621477] RDX: 000000000000001a RSI: 0000000000000002 RDI: ffff880157032000
[ 251.621584] RBP: ffff8800851c5d20 R08: ffff880157033978 R09: 0000000000000002
[ 251.621691] R10: 0000000000000000 R11: 0000000000000001 R12: ffffffffa027a230
[ 251.621799] R13: 0000000000008000 R14: 0000000000008000 R15: ffff8801c1740400
[ 251.621907] FS: 00007ffa1402e7e0(0000) GS:ffff88021e240000(0000) knlGS:0000000000000000
[ 251.622029] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 251.622116] CR2: ffffffffff600400 CR3: 000000014a97d000 CR4: 00000000000407e0
[ 251.622224] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 251.622332] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 251.622440] Process mount (pid: 3430, threadinfo ffff880157032000, task ffff88015983be70)
[ 251.622562] Stack:
[ 251.622597] 0000000000000000 ffff8800851c5da8 ffff8800851c5d20 ffff8800851c5d20
[ 251.622711] ffff8800851c5e18 ffffffffa027a230 ffff8800851c5d20 ffff8801c1742c00
[ 251.622825] ffff8801c1740400 ffffffff8111787c ffff88020dca8cf0 ffff8801c1747800
[ 251.622936] Call Trace:
[ 251.622982] [<ffffffff8111787c>] ? evict+0xa3/0x153
[ 251.623077] [<ffffffffa0260ef6>] ? fixup_inode_link_counts+0xd2/0xfb [btrfs]
[ 251.623201] [<ffffffffa022ee3c>] ? btrfs_read_fs_root_no_name+0x92/0x24e [btrfs]
[ 251.623331] [<ffffffffa0261db3>] ? btrfs_recover_log_trees+0x207/0x2dd [btrfs]
[ 251.623458] [<ffffffffa0260a3b>] ? replay_one_extent+0x439/0x439 [btrfs]
[ 251.623578] [<ffffffffa0230fac>] ? open_ctree+0x1354/0x1680 [btrfs]
[ 251.627492] [<ffffffff811b60b0>] ? ida_get_new_above+0x16c/0x17d
[ 251.631356] [<ffffffffa02153fa>] ? btrfs_mount+0x3cb/0x516 [btrfs]
[ 251.635197] [<ffffffff810ef373>] ? alloc_pages_current+0xb2/0xcd
[ 251.638971] [<ffffffff811078c9>] ? mount_fs+0x61/0x144
[ 251.642736] [<ffffffff8111a390>] ? vfs_kern_mount+0x62/0xe3
[ 251.646426] [<ffffffff8111aa2a>] ? do_kern_mount+0x49/0xdd
[ 251.650039] [<ffffffff8111c20f>] ? do_mount+0x68a/0x710
[ 251.653636] [<ffffffff8111c3b5>] ? sys_mount+0x80/0xba
[ 251.657204] [<ffffffff813d53b9>] ? system_call_fastpath+0x16/0x1b
[ 251.660791] Code: 00 48 83 ca ff 31 f6 48 89 ef e8 d8 05 01 00 48 8b 83 20 01 00 00 83 b8 40 0e 00 00 00 74 0e 48 8b 45 98 a8 20 0f 85 7b 01 00 00 <0f> 0b 83 7d 48 00 74 0f 83 bb f8 00 00 00 00 0f 84 66 01 00 00
[ 251.668347] RIP [<ffffffffa023a93f>] btrfs_evict_inode+0xcd/0x278 [btrfs]
[ 251.672204] RSP <ffff880157033a58>
[ 251.698474] ---[ end trace 431fcd3e91e1f4fd ]---
[ 265.799887] nepomukservices[2181]: segfault at 0 ip (null) sp 00007fff403d0ca8 error 14 in nepomukservicestub[400000+7000]

BTRFS was not mounted. After trying to mount again, I got:

merkaba:~> ps aux | grep " D" | grep -v grep
root 3446 0.0 0.0 0 0 ? D 20:22 0:00 [btrfs-transacti]
root 4666 0.0 0.0 18640 1184 tty1 D+ 20:24 0:00 mount /mnt/amazon-daten

Any hints on how to get my disk mounted?

I have a fairly recent backup, but I would prefer not to have to restore
it. It's one of my expectations for a file system: be safe on sudden write
interruptions like power loss or a crash.

Ciao,
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

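
A side note on the ps check in the message above: grep " D" matches a space followed by "D" anywhere in the line, so it can also catch a command line that happens to contain it. A minimal sketch of a stricter filter follows, assuming a procps-style ps; the helper name d_state_only is hypothetical and not part of any tool mentioned in the thread. It tests only the STAT column, so only tasks in uninterruptible sleep (state D) are printed.

```shell
# Hypothetical helper (name is an assumption): read "PID STAT COMMAND" rows
# on stdin and keep only those whose STAT field starts with "D", i.e. tasks
# in uninterruptible sleep such as the hung mount and btrfs-transacti above.
d_state_only() {
    awk '$2 ~ /^D/ { print }'
}

# Typical use on a live system; the trailing "=" after each column name
# suppresses the header line in procps ps:
#   ps -eo pid=,stat=,comm= | d_state_only
```

Reading from stdin keeps the filter usable on saved ps output as well as on a live system.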
On Monday, 25 June 2012, Martin Steigerwald wrote:
> Hi!
>
> I got an X server / DRM related crash or hard lockup. After I rebooted, I
> tried to mount the BTRFS filesystem on my eSATA disk. It has big metadata
> blocks (mkfs.btrfs -l 32768 -n 32768).
>
> I got:
> [… backtrace …]
> BTRFS was not mounted. After trying to mount again, I got:
>
> merkaba:~> ps aux | grep " D" | grep -v grep
> root 3446 0.0 0.0 0 0 ? D 20:22 0:00 [btrfs-transacti]
> root 4666 0.0 0.0 18640 1184 tty1 D+ 20:24 0:00 mount /mnt/amazon-daten
>
> Any hints on how to get my disk mounted?
>
> I have a fairly recent backup, but I would prefer not to have to restore
> it. It's one of my expectations for a file system: be safe on sudden write
> interruptions like power loss or a crash.

Well, I wanted to have my disk back ASAP, so I just tried that
btrfs-zero-log mantra again. It worked.

Hopefully the backtrace still gives you a clue about what happened. I
thought these kinds of errors were gone by now. (Yeah, I know it's still
experimental… no indoctrination requested ;-)

Thanks,
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7

On Mon, Jun 25, 2012 at 08:29:34PM +0200, Martin Steigerwald wrote:
> I got an X server / DRM related crash or hard lockup. After I rebooted, I
> tried to mount the BTRFS filesystem on my eSATA disk. It has big metadata
> blocks (mkfs.btrfs -l 32768 -n 32768).
>
> I got:
>
> [… ATA link reset and CPU power limit messages …]
> [ 249.200560] device label daten devid 1 transid 2194 /dev/sdb1
> [ 249.201186] btrfs: use lzo compression
> [ 249.201192] btrfs: disk space caching is enabled
> [ 249.241975] btrfs: bdev /dev/sdb1 errs: wr 0, rd 0, flush 0, corrupt 0, gen 0
> [ 251.620610] ------------[ cut here ]------------
> [ 251.620693] kernel BUG at fs/btrfs/inode.c:3758!

3756         if (root->fs_info->log_root_recovering) {
3757                 BUG_ON(!test_bit(BTRFS_INODE_HAS_ORPHAN_ITEM,
3758                                  &BTRFS_I(inode)->runtime_flags));
3759                 goto no_delete;
3760         }

The BUG_ON fired during log replay, which, as you already found, can be
worked around by running the zero-log utility. Another way is to mount
read-only; this skips log replay.

I think there could be a logic error, as this probably happens only
during log replay, when the orphan bit is not in sync with the link count,
but I saw that this should be handled in the fixup_inode_link_counts
call path. CCing Josef, in case he has an idea.

> [ 251.620767] invalid opcode: 0000 [#1] PREEMPT SMP
> [… registers and stack dump …]
> [ 251.622936] Call Trace:
> [ 251.622982] [<ffffffff8111787c>] ? evict+0xa3/0x153
> [ 251.623077] [<ffffffffa0260ef6>] ? fixup_inode_link_counts+0xd2/0xfb [btrfs]
> [ 251.623201] [<ffffffffa022ee3c>] ? btrfs_read_fs_root_no_name+0x92/0x24e [btrfs]
> [ 251.623331] [<ffffffffa0261db3>] ? btrfs_recover_log_trees+0x207/0x2dd [btrfs]
> [ 251.623458] [<ffffffffa0260a3b>] ? replay_one_extent+0x439/0x439 [btrfs]
> [ 251.623578] [<ffffffffa0230fac>] ? open_ctree+0x1354/0x1680 [btrfs]
> [ 251.627492] [<ffffffff811b60b0>] ? ida_get_new_above+0x16c/0x17d
> [ 251.631356] [<ffffffffa02153fa>] ? btrfs_mount+0x3cb/0x516 [btrfs]
> [ 251.635197] [<ffffffff810ef373>] ? alloc_pages_current+0xb2/0xcd
> [ 251.638971] [<ffffffff811078c9>] ? mount_fs+0x61/0x144
> [ 251.642736] [<ffffffff8111a390>] ? vfs_kern_mount+0x62/0xe3
> [ 251.646426] [<ffffffff8111aa2a>] ? do_kern_mount+0x49/0xdd
> [ 251.650039] [<ffffffff8111c20f>] ? do_mount+0x68a/0x710
> [ 251.653636] [<ffffffff8111c3b5>] ? sys_mount+0x80/0xba
> [ 251.657204] [<ffffffff813d53b9>] ? system_call_fastpath+0x16/0x1b
> [… Code bytes …]
> [ 251.668347] RIP [<ffffffffa023a93f>] btrfs_evict_inode+0xcd/0x278 [btrfs]
> [ 251.672204] RSP <ffff880157033a58>

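
The analysis in the message above starts from two things in the oops: the source location named on the "kernel BUG at" line and the btrfs frames in the call trace. A minimal sketch for pulling both out of a saved log follows. The helper names bug_site and btrfs_frames are hypothetical, and the patterns assume the dmesg layout shown in this thread (one message per line, frames written as "? name+0xOFF/0xLEN [btrfs]"); other kernel versions may format the trace differently.

```shell
# Hypothetical helpers (names are assumptions); both read dmesg text on stdin.

# Print the "file:line" named by a "kernel BUG at <file>:<line>!" message.
bug_site() {
    sed -n 's/.*kernel BUG at \([^!]*\)!.*/\1/p'
}

# Print the function name of each call-trace frame tagged with the [btrfs]
# module, in the order they appear in the log (innermost frame first).
btrfs_frames() {
    sed -n 's/.*\] ? \([A-Za-z0-9_]*\)+0x[0-9a-f]*\/0x[0-9a-f]* \[btrfs\].*/\1/p'
}

# Example, assuming the oops was saved to a file called oops.txt:
#   bug_site < oops.txt
#   btrfs_frames < oops.txt
```

Both helpers also work on quoted mail text, because the patterns ignore whatever precedes the matched part of the line.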
On 06/26/2012 06:18 AM, David Sterba wrote:
> 3756         if (root->fs_info->log_root_recovering) {
> 3757                 BUG_ON(!test_bit(BTRFS_INODE_HAS_ORPHAN_ITEM,
> 3758                                  &BTRFS_I(inode)->runtime_flags));
> 3759                 goto no_delete;
> 3760         }
>
> The BUG_ON fired during log replay, which, as you already found, can be
> worked around by running the zero-log utility. Another way is to mount
> read-only; this skips log replay.
>
> I think there could be a logic error, as this probably happens only
> during log replay, when the orphan bit is not in sync with the link count,
> but I saw that this should be handled in the fixup_inode_link_counts
> call path. CCing Josef, in case he has an idea.
>

It is a logic error, but mostly a slip of the finger by Josef, IMO... :)

I'll send a patch for it.

thanks,
liubo

On Tuesday, 26 June 2012, Liu Bo wrote:
> On 06/26/2012 06:18 AM, David Sterba wrote:
> > 3756         if (root->fs_info->log_root_recovering) {
> > 3757                 BUG_ON(!test_bit(BTRFS_INODE_HAS_ORPHAN_ITEM,
> > 3758                                  &BTRFS_I(inode)->runtime_flags));
> > 3759                 goto no_delete;
> > 3760         }
> >
> > The BUG_ON fired during log replay, which, as you already found, can be
> > worked around by running the zero-log utility. Another way is to mount
> > read-only; this skips log replay.
> >
> > I think there could be a logic error, as this probably happens only
> > during log replay, when the orphan bit is not in sync with the link count,
> > but I saw that this should be handled in the fixup_inode_link_counts
> > call path. CCing Josef, in case he has an idea.
>
> It is a logic error, but mostly a slip of the finger by Josef, IMO... :)
>
> I'll send a patch for it.

Thanks for looking into it.

Since my BTRFS is up and running again, I can't easily test a patch,
however. I'd have to unplug the disk or crash my laptop several times to
trigger it again, I bet.

--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7

On Mon, Jun 25, 2012 at 09:47:33PM -0600, Liu Bo wrote:
> On 06/26/2012 06:18 AM, David Sterba wrote:
> > 3756         if (root->fs_info->log_root_recovering) {
> > 3757                 BUG_ON(!test_bit(BTRFS_INODE_HAS_ORPHAN_ITEM,
> > 3758                                  &BTRFS_I(inode)->runtime_flags));
> > 3759                 goto no_delete;
> > 3760         }
> >
> > The BUG_ON fired during log replay, which, as you already found, can be
> > worked around by running the zero-log utility. Another way is to mount
> > read-only; this skips log replay.
> >
> > I think there could be a logic error, as this probably happens only
> > during log replay, when the orphan bit is not in sync with the link count,
> > but I saw that this should be handled in the fixup_inode_link_counts
> > call path. CCing Josef, in case he has an idea.
> >
>
> It is a logic error, but mostly a slip of the finger by Josef, IMO... :)
>
> I'll send a patch for it.

Heh, oops, sorry about that ;)

Josef