Marc MERLIN
2012-Oct-25 19:58 UTC
Need help mounting laptop corrupted root btrfs. Kernel BUG at fs/btrfs/volumes.c:3707
Howdy,

I can wait a day or maybe 2 before I have to wipe and restore from backup.
Please let me know if you have a patch against 3.6.3 you'd like me to try
to mount/recover this filesystem, or whether you'd like me to try btrfsck.

My laptop had a problem with its boot drive which prevented linux
from writing to it, and in turn caused btrfs to have incomplete writes
to it.
After reboot, the boot drive was fine, but the btrfs filesystem has
a corruption that prevents it from being mounted.

Unfortunately the mount crash prevents writing of crash data to even another
drive since linux stops before the crash data can be written to syslog.

Picture #1 shows a dump when my laptop crashed (before reboot).
btrfs no csum found for inode X start Y
http://marc.merlins.org/tmp/crash.jpg

Mounting with 3.5.0 and 3.6.3 gives the same error:

gandalfthegreat:~# mount -o recovery,skip_balance,ro /dev/mapper/bootdsk

shows
btrfs: bdev /dev/mapper/bootdsk errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
btrfs: bdev /dev/mapper/bootdsk errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
(there are 2 lines, not sure why)

kernel BUG at fs/btrfs/volumes.c:3707
int btrfs_num_copies(struct btrfs_mapping_tree *map_tree, u64 logical, u64 len)
{
        struct extent_map *em;
        struct map_lookup *map;
        struct extent_map_tree *em_tree = &map_tree->map_tree;
        int ret;

        read_lock(&em_tree->lock);
        em = lookup_extent_mapping(em_tree, logical, len);
        read_unlock(&em_tree->lock);
        BUG_ON(!em);  <---

If the snapshot helps (sorry, hard to read, but usable):
http://marc.merlins.org/tmp/btrfs_bug.jpg

Questions:
1) Any better way to get a proper dump without serial console?
(I hate to give you pictures)

2) Should I try btrfsck now, or are there other mount options than
mount -o recovery,skip_balance,ro /dev/mapper/bootdsk
I should try?

3) Want me to try btrfsck although it may make it impossible for me to
reproduce the bug and test a fix, as well as potentially break the filesystem
more (last time I tried btrfsck, it outputted thousands of lines and never converged
to a state it was happy with)

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
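On question 1 above (capturing the oops text without a serial console), one commonly used option is the kernel's netconsole facility, which sends kernel messages over UDP to another machine. The sketch below only builds the parameter string; the ports, IP addresses and interface name are hypothetical placeholders, and it assumes a wired Ethernet interface whose driver is available early enough in boot.

```shell
#!/bin/sh
# Build the value for the kernel's netconsole= parameter.
# Arguments: source port, source IP, interface, target port, target IP.
# The target MAC is left empty, which makes netconsole use broadcast.
netconsole_param() {
    printf 'netconsole=%s@%s/%s,%s@%s/\n' "$1" "$2" "$3" "$4" "$5"
}

# Hypothetical addresses and interface name; substitute real ones.
netconsole_param 6665 192.0.2.10 eth0 6666 192.0.2.20
```

The printed string can be given on the kernel command line when netconsole is built in, or as a module argument (modprobe netconsole followed by the string). A UDP listener on the target machine then receives the messages. Whether this fires early enough for a crash during the initrd mount depends on when the network driver and netconsole are loaded.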
cwillu
2012-Oct-25 20:03 UTC
Re: Need help mounting laptop corrupted root btrfs. Kernel BUG at fs/btrfs/volumes.c:3707
On Thu, Oct 25, 2012 at 1:58 PM, Marc MERLIN <marc@merlins.org> wrote:
> 3) Want me to try btrfsck although it may make it impossible for me to
> reproduce the bug and test a fix, as well as potentially break the filesystem
> more (last time I tried btrfsck, it outputted thousands of lines and never converged
> to a state it was happy with)

This looks like something btrfs-zero-log would work around (although
-o recovery should do mostly the same things).  That would destroy the
evidence though, and may just make things (slightly) worse, so I'd
wait to see if anyone suggests something better before trying it.  If
you're ultimately ending up restoring from backup though, it may save
you that effort at least.
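Since btrfs-zero-log would destroy the evidence, one way to keep it is to copy the raw device to an image file on another disk first. The following is a minimal dry-run sketch under that assumption: it only prints the two commands, the image path is hypothetical, and the target disk must have room for the whole device.

```shell
#!/bin/sh
# Print (do not run) the two steps: image the device, then zero the log.
# Arguments: block device, path for the image file (hypothetical).
preserve_then_zero_log() {
    printf 'dd if=%s of=%s bs=1M conv=noerror,sync\n' "$1" "$2"
    printf 'btrfs-zero-log %s\n' "$1"
}

preserve_then_zero_log /dev/mapper/bootdsk /mnt/other/bootdsk.img
```

Printing instead of executing leaves the decision to run either step with the person at the keyboard. The image can later be attached as a loop device if a developer asks for more information.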
Marc MERLIN
2012-Oct-25 20:12 UTC
Re: Need help mounting laptop corrupted root btrfs. Kernel BUG at fs/btrfs/volumes.c:3707
On Thu, Oct 25, 2012 at 02:03:49PM -0600, cwillu wrote:
> > 3) Want me to try btrfsck although it may make it impossible for me to
> > reproduce the bug and test a fix, as well as potentially break the filesystem
> > more (last time I tried btrfsck, it outputted thousands of lines and never converged
> > to a state it was happy with)
>
> This looks like something btrfs-zero-log would work around (although
> -o recovery should do mostly the same things).  That would destroy the
> evidence though, and may just make things (slightly) worse, so I'd
> wait to see if anyone suggests something better before trying it.  If
> you're ultimately ending up restoring from backup though, it may save
> you that effort at least.

Thanks for pointing out btrfs-zero-log, I hadn't re-read the wiki page since
this got added.
But I'll hold off at least until tomorrow morning (GMT-7).
If someone would like me to hold off a bit longer, please let me know and
I'll wait for whatever patch you'd like me to try.

As for backups, yes, I have some :) and I also have hourly, daily, weekly
btrfs subvolume snapshots, but I can't use those currently since I can't
mount the base filesystem.
If my latest snapshot is corrupted, once I know which subvolume has the
problem (I can't quite tell since the crash doesn't say which subvolume is
causing it), I can revert to the last hourly snapshot.

Thanks for your reply.
Marc
Marc MERLIN
2012-Oct-26 18:29 UTC
Re: Need help mounting laptop corrupted root btrfs. Kernel BUG at fs/btrfs/volumes.c:3707
If any devs want info out of my drive, please ask today, I really need to
fix it tomorrow.
Otherwise I'll try btrfs-zero-log, and if that doesn't work, wipe and start
over.

Marc
Marc MERLIN
2012-Oct-29 04:30 UTC
Re: Need help mounting laptop corrupted root btrfs. Kernel BUG at fs/btrfs/volumes.c:3707
On Thu, Oct 25, 2012 at 01:12:23PM -0700, Marc MERLIN wrote:
> On Thu, Oct 25, 2012 at 02:03:49PM -0600, cwillu wrote:
> > > 3) Want me to try btrfsck although it may make it impossible for me to
> > > reproduce the bug and test a fix, as well as potentially break the filesystem
> > > more (last time I tried btrfsck, it outputted thousands of lines and never converged
> > > to a state it was happy with)
> >
> > This looks like something btrfs-zero-log would work around (although
> > -o recovery should do mostly the same things).  That would destroy the
> > evidence though, and may just make things (slightly) worse, so I'd
> > wait to see if anyone suggests something better before trying it.  If
> > you're ultimately ending up restoring from backup though, it may save
> > you that effort at least.
>
> Thanks for pointing out btrfs-zero-log, I hadn't re-read the wiki page since
> this got added.
> But I'll hold off at least until tomorrow morning (GMT-7).

I'm a bit surprised that no one seems to be replying on btrfs crashes,
that's a bit worrisome. I'm willing to risk my data somewhat, but if finding
a problem doesn't help fixing the code, I'm not sure if I'm helping anymore
:-/

Since I ran out of time, I tried:
gandalfthegreat:~# btrfs-zero-log
usage: btrfs-zero-log dev
Btrfs Btrfs v0.19
gandalfthegreat:~# btrfs-zero-log /dev/mapper/bootdsk
Check tree block failed, want=7533391872, have=17347973115472321934
Check tree block failed, want=7533391872, have=17347973115472321934
Check tree block failed, want=7533391872, have=8450612919225897562
Check tree block failed, want=7533391872, have=17347973115472321934
Check tree block failed, want=7533391872, have=17347973115472321934
read block failed check_tree_block
gandalfthegreat:~#

So from here, unless someone chimes in tomorrow, I'm going to have to wipe
my filesystem and start over.
I suppose that means btrfs can likely still cause unknown and unfixable
corruption.

Marc
Chris Murphy
2012-Oct-29 05:05 UTC
Re: Need help mounting laptop corrupted root btrfs. Kernel BUG at fs/btrfs/volumes.c:3707
On Oct 25, 2012, at 2:12 PM, Marc MERLIN <marc@merlins.org> wrote:
> I also have hourly, daily, weekly
> btrfs subvolume snapshots, but I can't use those currently since I can't
> mount the base filesystem.

It might be worth unmounting it. Then remount only a snapshot from well
before the problem started, yet still current enough to be useful: use
'-o subvol=' instead of trying to mount from the top. Each subvolume
is a root directory, so it might be possible to find one that will
mount directly.

> I'm a bit surprised that no one seems to be replying on btrfs crashes,
> that's a bit worrisome. I'm willing to risk my data somewhat, but if finding
> a problem doesn't help fixing the code, I'm not sure if I'm helping anymore
> :-/

Lurking, I've learned this means you either didn't provide enough
information for anyone to go on, or the problem is known. I suspect
the former. Kernel 3.5.0 or 3.6.2 doesn't say where it came from, what
distribution, or what version of btrfs is included in that distro's
kernel. And I'm not seeing that you're using a debug kernel, which
will actually produce useful error messages.

And it's over a weekend for another thing.

Chris Murphy
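The '-o subvol=' suggestion above can be sketched as a dry run that prints one read-only mount command per candidate snapshot. The snapshot names and the /mnt/rescue mount point are hypothetical placeholders, and there is no guarantee that any subvolume mounts if the damaged block is shared by all of them.

```shell
#!/bin/sh
# Print (do not run) a read-only mount command for one named subvolume.
# Arguments: device, subvolume name, mount point.
subvol_mount_cmd() {
    printf 'mount -o ro,subvol=%s %s %s\n' "$2" "$1" "$3"
}

# Dry run: review the printed commands, then run the chosen one as root.
# The snapshot names are hypothetical placeholders.
for snap in hourly_snapshot_name daily_snapshot_name; do
    subvol_mount_cmd /dev/mapper/bootdsk "$snap" /mnt/rescue
done
```

Keeping the ro option means a successful mount does not write to the damaged filesystem, so the state stays available for debugging.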
Marc MERLIN
2012-Oct-29 17:42 UTC
Re: Need help mounting laptop corrupted root btrfs. Kernel BUG at fs/btrfs/volumes.c:3707
Hi,

Thanks for the reply and hints.

On Sun, Oct 28, 2012 at 11:05:42PM -0600, Chris Murphy wrote:
> > I also have hourly, daily, weekly
> > btrfs subvolume snapshots, but I can't use those currently since I can't
> > mount the base filesystem.
>
> It might be worth unmounting it. Then only remounting a snapshot well
> before the problem started, yet still current enough to be useful: use
> '-o subvol=' instead of trying to mount from the top. Each subvolume
> is a root directory, so it might be possible to find one that will
> mount directly.

So, I had thought about going back to an old snapshot, but the problem is
that my snapshots have pseudo random names based on cron times when they're
taken. Because I couldn't mount the root, I couldn't find the snapshot names.
Is there a way to get a list of snapshots from a btrfs FS without mounting
it?

> > I'm a bit surprised that no one seems to be replying on btrfs crashes,
> > that's a bit worrisome. I'm willing to risk my data somewhat, but if finding
> > a problem doesn't help fixing the code, I'm not sure if I'm helping anymore
> > :-/
>
> Lurking, I've learned this means you either didn't provide enough
> information for anyone to go on, or the problem is known. I suspect
> the former. Kernel 3.5.0 or 3.6.2 doesn't say where it came from, what

Fair enough. At the time I thought it didn't really matter how the bug
happened, and more that btrfs shouldn't crash my kernel when there is some
minor problem with the filesystem.
In my case, I'm convinced it was simply a problem that all the writes did
not make it to disk before the device disconnected for some unknown reason
(not related to btrfs).

> distribution, or what version of btrfs is included in that distro's

debian unstable although it didn't seem relevant since it's the kernel in
initrd that can't mount the filesystem.
Userland seems to be btrfs 0.19 as per the output I posted.

> kernel. And I'm not seeing that you're using a debug kernel, which
> will actually produce useful error messages.

Thanks for pointing that out. I'll admit that I'm not sure what kernel build
options I'm supposed to add to help. I asked about that in the past, but
never heard back.
What do you recommend I add in .config?

> And it's over a weekend for another thing.

Well, it was Thursday when I posted :)
Now, I get the general point that I have no paid support, and I'm not even
sure there is any official support for kernel.org kernels from yesterday or
last week (just a few vendor kernels).
At the same time, if brave testers are risking their data to help test the
filesystem, it's also good if they feel reassured that they'll get help or
that if their data is gone, whatever bug they found was useful to someone.

Now, there is a good ending to this story, thanks to you no less, I'll post
in another message not to bury it down there.

Thanks,
Marc
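On the question above about listing snapshots without mounting: newer btrfs-progs releases than the v0.19 userland in this thread provide "btrfs inspect-internal dump-tree", which prints tree contents straight from the device. The sketch below is a filter for that output. It assumes the root tree dump prints one "root ref ... name <subvolume>" line per subvolume, which should be checked against the installed version; the sample name is made up.

```shell
#!/bin/sh
# Read tree-dump text on stdin; print the name from each "root ref" line.
list_root_ref_names() {
    sed -n 's/^[[:space:]]*root ref key dirid [0-9]* sequence [0-9]* name \(.*\)$/\1/p'
}

# Sample input in the assumed dump format; the snapshot name is made up.
printf '\t\troot ref key dirid 256 sequence 2 name hourly_example\n' | list_root_ref_names
```

On a real device the filter would be fed from the dump, for example "btrfs inspect-internal dump-tree -t root /dev/mapper/bootdsk | list_root_ref_names". The dump may stop at the same unreadable tree block that breaks the mount, so partial output is possible.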