Hi,

I ran btrfsctl resize -r -3gb /dev/sda2 using wireless-testing.git
based on 2.6.38-rc6 and all seemed good. df reported reduced size so
I repartitioned and rebooted. Filesystem can no longer be mounted:

[10560.129038] device fsid b2408c2e83f55cc2-5f7a14e35f176484 devid 1 transid 341132 /dev/sdb2
[10560.133407] btrfs bad tree block start 0 34520006656
[10560.134031] btrfs bad tree block start 0 34520006656
[10560.134904] btrfs bad tree block start 0 34520006656
[10560.134912] btrfs: failed to read tree root on sdb2
[10560.137206] btrfs: open_ctree failed

btrfs-debug-tree and friends are similarly upset:

$ ./btrfs-debug-tree /dev/sdb2
btrfs-debug-tree: disk-io.c:739: open_ctree_fd: Assertion `!(!tree_root->node)' failed.

$ gdb --args ./btrfs-debug-tree /dev/sdb2
GNU gdb 6.8
..
(gdb) b disk-io.c:44
Breakpoint 1 at 0x8050db7: file disk-io.c, line 44.
..
Breakpoint 1, check_tree_block (root=0x946e2e8, buf=0x9471538) at disk-io.c:44
44              if (buf->start != btrfs_header_bytenr(buf))
(gdb) p buf->start
$1 = 20971520
..
Breakpoint 1, check_tree_block (root=0x946e2e8, buf=0x9472588) at disk-io.c:44
44              if (buf->start != btrfs_header_bytenr(buf))
(gdb) p buf->start
$2 = 20987904
..
Breakpoint 1, check_tree_block (root=0x946e2e8, buf=0x94735d8) at disk-io.c:44
44              if (buf->start != btrfs_header_bytenr(buf))
(gdb) p buf->start
$3 = 20983808

The above checks succeed but next time check_tree_block() is called
the check does not succeed.

Breakpoint 1, check_tree_block (root=0x946e008, buf=0x9474628) at disk-io.c:44
44              if (buf->start != btrfs_header_bytenr(buf))
(gdb) p buf->start
$4 = 34520006656
(gdb) p btrfs_header_bytenr(buf)
$5 = 0
..
(gdb) bt
#0  check_tree_block (root=0x946e008, buf=0x9474628) at disk-io.c:45
#1  0x080514fc in read_tree_block (root=0x946e008, bytenr=34520006656,
    blocksize=4096, parent_transid=341132) at disk-io.c:207
#2  0x080531a7 in open_ctree_fd (fp=7, path=0xbfef322a "/dev/sdb2",
    sb_bytenr=65536, writes=0) at disk-io.c:736
#3  0x08052a58 in open_ctree (filename=0xbfef322a "/dev/sdb2", sb_bytenr=0,
    writes=0) at disk-io.c:587
#4  0x080735cf in main (ac=1, av=0xbfef2374) at debug-tree.c:148

This is disk-io.c:

732     blocksize = btrfs_level_size(tree_root,
                                     btrfs_super_root_level(disk_super));
        generation = btrfs_super_generation(disk_super);

        tree_root->node = read_tree_block(tree_root,
                                          btrfs_super_root(disk_super),
                                          blocksize, generation);
..
188     eb = btrfs_find_create_tree_block(root, bytenr, blocksize);
..
198     ret = btrfs_map_block(&root->fs_info->mapping_tree, READ,
                              eb->start, &length, &multi, mirror_num);
..
206     ret = read_extent_from_disk(eb);
        if (ret == 0 && check_tree_block(root, eb) == 0 &&

This is the call that fails.


Where do I look next?


//Peter
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
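The comparison that fails at disk-io.c:44 can be tried outside gdb on a raw block. The following is only a sketch, not btrfs-progs code: it assumes the tree block header starts with a 32-byte checksum and a 16-byte fsid, followed by a little-endian 64-bit bytenr at offset 48, and the function names `header_bytenr` and `block_start_matches` are made up for this illustration.

```python
import struct

# Assumed start of the on-disk tree block header (illustration only):
#   csum[32], fsid[16], then a little-endian u64 bytenr at byte offset 48.
BYTENR_OFFSET = 32 + 16


def header_bytenr(block: bytes) -> int:
    """Return the logical address that a tree block records for itself."""
    (bytenr,) = struct.unpack_from("<Q", block, BYTENR_OFFSET)
    return bytenr


def block_start_matches(block: bytes, expected_start: int) -> bool:
    """Same idea as the 'buf->start != btrfs_header_bytenr(buf)' test."""
    return header_bytenr(block) == expected_start


if __name__ == "__main__":
    # A zero-filled 4096-byte block, standing in for the block gdb showed.
    zero_block = bytes(4096)
    print(header_bytenr(zero_block))                     # 0
    print(block_start_matches(zero_block, 34520006656))  # False
```

A block of zeros yields a header bytenr of 0, which is what the "bad tree block start 0 34520006656" kernel message and the gdb output `$5 = 0` both report.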
Excerpts from Peter Stuge's message of 2011-03-07 12:48:20 -0500:
> Hi,
>
> I ran btrfsctl resize -r -3gb /dev/sda2 using wireless-testing.git
> based on 2.6.38-rc6 and all seemed good. df reported reduced size so
> I repartitioned and rebooted. Filesystem can no longer be mounted:
>
> [10560.129038] device fsid b2408c2e83f55cc2-5f7a14e35f176484 devid 1 transid 341132 /dev/sdb2
> [10560.133407] btrfs bad tree block start 0 34520006656
> [10560.134031] btrfs bad tree block start 0 34520006656
> [10560.134904] btrfs bad tree block start 0 34520006656
> [10560.134912] btrfs: failed to read tree root on sdb2
> [10560.137206] btrfs: open_ctree failed
>
> btrfs-debug-tree and friends are similarly upset:

Ouch, sorry about this. Do you have details on how big the FS was and
how big the partition was before the resize?

Have you tried using fdisk to bring the partition back to the original
size?

-chris
Hi Chris,

Chris Mason wrote:
> > I ran btrfsctl resize -r -3gb /dev/sda2 using wireless-testing.git
> > based on 2.6.38-rc6 and all seemed good. df reported reduced size so
> > I repartitioned and rebooted. Filesystem can no longer be mounted:
>
> Ouch, sorry about this. Do you have details on how big the FS was
> and how big the partition was before the resize?

Not complete details I'm afraid. I only remember some of the numbers
at the end of the original partition size. Apologies for not having
included more details about the media in the first message!

It's a 64GB CF card with two partitions; one 40MB ext2 and "the rest"
is btrfs. This is the current fdisk output:

Command (m for help): p

Disk /dev/sdb: 64.0 GB, 64030244864 bytes
64 heads, 32 sectors/track, 61064 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Disk identifier: 0x001c2022

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1          40       40131   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sdb2              40       61064    62489373+  83  Linux

Command (m for help): u
Changing display/entry units to sectors

Command (m for help): p

Disk /dev/sdb: 64.0 GB, 64030244864 bytes
64 heads, 32 sectors/track, 61064 cylinders, total 125059072 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x001c2022

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1              63       80324       40131   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sdb2           80325   125059071    62489373+  83  Linux

Command (m for help): x

Expert command (m for help): p

Disk /dev/sdb: 64 heads, 32 sectors, 61064 cylinders

Nr AF  Hd Sec  Cyl  Hd Sec  Cyl      Start       Size ID
 1 00   1   1    0 254  63    4         63      80262 83
Partition 1 does not end on cylinder boundary.
 2 00  14   6   39  63  32 1023      80325  124978747 83
 3 00   0   0    0   0   0    0          0          0 00
 4 00   0   0    0   0   0    0          0          0 00

I say current, because by now I have changed the sdb2 partition
twice.


> Have you tried using fdisk to bring the partition back to the
> original size?

Almost..

After resizing I deleted the partition and then created a new one
starting at 80325, which was exactly 120000000 sectors. This is only
1.2 GB smaller than the original partition (resized -3gb) but I
wanted to avoid mistakes while calculating sectors, so I exaggerated.
The 1-or-so GB free space at the end would be enough anyway.

In any case changing the partition table shouldn't affect the
filesystem, right? Also, I changed the partition with the filesystem
mounted, so the kernel did not start using the new partition table.

When the mount failed after rebooting, I tried to do what you
suggest; I removed the partition and then created a new one which
used all available space on the card. This is the state of the card
now. However, I am 100% sure that the current size of the partition
is not exactly the same as the original partition was. Could this
partition table difference have an impact after all? Is something in
the fs calculated based on device size?


I would expect serious trouble if I made the partition smaller
*without* resizing, so that a seek within the fs could go beyond
device limits, but from gdb:ing disk-io.c it seems that zero-bytes
are where there's supposed to be a root node. So either the root node
was destroyed (uh-oh?) or code is reading from the wrong place. I
don't know which is more likely?


> > $ ./btrfs-debug-tree /dev/sdb2
> > btrfs-debug-tree: disk-io.c:739: open_ctree_fd: Assertion `!(!tree_root->node)' failed.
> >
> > $ gdb --args ./btrfs-debug-tree /dev/sdb2
..
> > Breakpoint 1, check_tree_block (root=0x946e008, buf=0x9474628) at disk-io.c:44
> > 44              if (buf->start != btrfs_header_bytenr(buf))
> > (gdb) p buf->start
> > $4 = 34520006656
> > (gdb) p btrfs_header_bytenr(buf)
> > $5 = 0
..
> > (gdb) bt
> > #0  check_tree_block (root=0x946e008, buf=0x9474628) at disk-io.c:45
> > #1  0x080514fc in read_tree_block (root=0x946e008, bytenr=34520006656,
> >     blocksize=4096, parent_transid=341132) at disk-io.c:207
> > #2  0x080531a7 in open_ctree_fd (fp=7, path=0xbfef322a "/dev/sdb2",
> >     sb_bytenr=65536, writes=0) at disk-io.c:736

The fs had >20 GB available before resize, and 19-something after.
(From memory of df output.) I haven't removed very many files from
the filesystem since it was created. I have also not used any
"advanced" features such as snapshots or subvolumes. This was the
first time I ran btrfsctl.

I appreciate the help!


//Peter
Excerpts from Peter Stuge's message of 2011-03-10 01:23:33 -0500:
> Hi Chris,
>
> Chris Mason wrote:
> > > I ran btrfsctl resize -r -3gb /dev/sda2 using wireless-testing.git
> > > based on 2.6.38-rc6 and all seemed good. df reported reduced size so
> > > I repartitioned and rebooted. Filesystem can no longer be mounted:
> >
> > Ouch, sorry about this. Do you have details on how big the FS was
> > and how big the partition was before the resize?
>
> Not complete details I'm afraid. I only remember some of the numbers
> at the end of the original partition size. Apologies for not having
> included more details about the media in the first message!
>
> It's a 64GB CF card with two partitions; one 40MB ext2 and "the rest"
> is btrfs. This is the current fdisk output:

Ok, going back to your original email, the block you're failing on is
probably right in the middle of the drive. We can't be sure without
looking at the mapping tree (which we don't have), but it is very
unlikely to be related to the boundary of your resize. More below.

>
> Command (m for help): p
>
> Disk /dev/sdb: 64.0 GB, 64030244864 bytes
> 64 heads, 32 sectors/track, 61064 cylinders
> Units = cylinders of 2048 * 512 = 1048576 bytes
> Disk identifier: 0x001c2022
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdb1               1          40       40131   83  Linux
> Partition 1 does not end on cylinder boundary.
> /dev/sdb2              40       61064    62489373+  83  Linux
>
> Command (m for help): u
> Changing display/entry units to sectors
>
> Command (m for help): p
>
> Disk /dev/sdb: 64.0 GB, 64030244864 bytes
> 64 heads, 32 sectors/track, 61064 cylinders, total 125059072 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Disk identifier: 0x001c2022
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdb1              63       80324       40131   83  Linux
> Partition 1 does not end on cylinder boundary.
> /dev/sdb2           80325   125059071    62489373+  83  Linux
>
> Command (m for help): x
>
> Expert command (m for help): p
>
> Disk /dev/sdb: 64 heads, 32 sectors, 61064 cylinders
>
> Nr AF  Hd Sec  Cyl  Hd Sec  Cyl      Start       Size ID
>  1 00   1   1    0 254  63    4         63      80262 83
> Partition 1 does not end on cylinder boundary.
>  2 00  14   6   39  63  32 1023      80325  124978747 83
>  3 00   0   0    0   0   0    0          0          0 00
>  4 00   0   0    0   0   0    0          0          0 00
>
> I say current, because by now I have changed the sdb2 partition
> twice.

Have you ever changed the start of the partition? If the start had
changed the superblock should be in the wrong place, so the mount
wouldn't have gotten this far.

>
> > Have you tried using fdisk to bring the partition back to the
> > original size?
>
> Almost..
>
> After resizing I deleted the partition and then created a new one
> starting at 80325, which was exactly 120000000 sectors. This is only
> 1.2 GB smaller than the original partition (resized -3gb) but I
> wanted to avoid mistakes while calculating sectors, so I exaggerated.
> The 1-or-so GB free space at the end would be enough anyway.
>
> In any case changing the partition table shouldn't affect the
> filesystem, right? Also, I changed the partition with the filesystem
> mounted, so the kernel did not start using the new partition table.

I'd have to repeat the test on this flash card to say for sure.
Deleting then recreating the partition with the FS mounted isn't very
high up on the list of things that get tested often, so my guess is
that's where the problem is.

>
> When the mount failed after rebooting, I tried to do what you
> suggest; I removed the partition and then created a new one which
> used all available space on the card. This is the state of the card
> now. However, I am 100% sure that the current size of the partition
> is not exactly the same as the original partition was. Could this
> partition table difference have an impact after all? Is something in
> the fs calculated based on device size?
>
>
> I would expect serious trouble if I made the partition smaller
> *without* resizing, so that a seek within the fs could go beyond
> device limits, but from gdb:ing disk-io.c it seems that zero-bytes
> are where there's supposed to be a root node. So either the root node
> was destroyed (uh-oh?) or code is reading from the wrong place. I
> don't know which is more likely?

Right, we've got a block full of zeros where they don't belong. Can
you dump the block contents with gdb please? I'd like to see if they
are all zeros or just offset slightly.

>
> > > $ ./btrfs-debug-tree /dev/sdb2
> > > btrfs-debug-tree: disk-io.c:739: open_ctree_fd: Assertion `!(!tree_root->node)' failed.
> > >
> > > $ gdb --args ./btrfs-debug-tree /dev/sdb2
> ..
> > > Breakpoint 1, check_tree_block (root=0x946e008, buf=0x9474628) at disk-io.c:44
> > > 44              if (buf->start != btrfs_header_bytenr(buf))
> > > (gdb) p buf->start
> > > $4 = 34520006656
> > > (gdb) p btrfs_header_bytenr(buf)
> > > $5 = 0
> ..
> > > (gdb) bt
> > > #0  check_tree_block (root=0x946e008, buf=0x9474628) at disk-io.c:45
> > > #1  0x080514fc in read_tree_block (root=0x946e008, bytenr=34520006656,
> > >     blocksize=4096, parent_transid=341132) at disk-io.c:207
> > > #2  0x080531a7 in open_ctree_fd (fp=7, path=0xbfef322a "/dev/sdb2",
> > >     sb_bytenr=65536, writes=0) at disk-io.c:736
>
> The fs had >20 GB available before resize, and 19-something after.
> (From memory of df output.) I haven't removed very many files from
> the filesystem since it was created. I have also not used any
> "advanced" features such as snapshots or subvolumes. This was the
> first time I ran btrfsctl.
>

Ok, we talked about power offs and barriers in a different thread, but
I didn't realize you were on a CF device. I'd want to do some tests on
this device to see how well it really reacted in power offs, but let's
do that after we pull your data somewhere safer.

-chris
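As a rough sanity check of the "middle of the drive" estimate, the failing address can be compared with the partition sizes from the fdisk output. This is a sketch under a loud assumption: it treats the logical address 34520006656 as if it were a byte offset into the partition, which, as noted above, cannot be confirmed without the mapping tree. The helper name `fraction_into_partition` is invented for this example.

```python
SECTOR_SIZE = 512  # bytes, from "Units = sectors of 1 * 512 = 512 bytes"


def fraction_into_partition(byte_offset: int, partition_sectors: int,
                            sector_size: int = SECTOR_SIZE) -> float:
    """Position of a byte offset as a fraction of the partition size."""
    return byte_offset / (partition_sectors * sector_size)


if __name__ == "__main__":
    current_sectors = 124978747   # current sdb2 size (fdisk expert output)
    shrunk_sectors = 120000000    # size chosen right after the resize
    failing = 34520006656         # address from the kernel log, ASSUMED
                                  # here to be an offset into sdb2

    print(round(fraction_into_partition(failing, current_sectors), 2))  # 0.54
    # Is the address below the end of the temporarily smaller partition?
    print(failing < shrunk_sectors * SECTOR_SIZE)                       # True
```

Under that assumption the block sits a little past the halfway point and well below the end of the 120000000-sector partition, which fits the guess that the resize boundary is not involved.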
Excerpts from Peter Stuge's message of 2011-03-10 01:23:33 -0500:
> Hi Chris,
>
> Chris Mason wrote:
> > > I ran btrfsctl resize -r -3gb /dev/sda2 using wireless-testing.git
> > > based on 2.6.38-rc6 and all seemed good. df reported reduced size so
> > > I repartitioned and rebooted. Filesystem can no longer be mounted:
> >
> > Ouch, sorry about this. Do you have details on how big the FS was
> > and how big the partition was before the resize?
>
> Not complete details I'm afraid. I only remember some of the numbers
> at the end of the original partition size. Apologies for not having
> included more details about the media in the first message!

One more question which Jens Axboe just suggested. Which tool and
which version of the tool did you use to delete the partition?

-chris
Chris Mason wrote:
> Which tool and which version of the tool did you use to delete the
> partition?

fdisk from util-linux-2.18

The non-working partition was deleted and the current one created
with fdisk from util-linux-2.14.2.


> > It's a 64GB CF card with two partitions; one 40MB ext2 and "the rest"
> > is btrfs. This is the current fdisk output:
>
> Ok, going back to your original email, the block you're failing on
> is probably right in the middle of the drive.

Right.


> We can't be sure without looking at the mapping tree (which we
> don't have),

Could we guess at where it was put?


> > I say current, because by now I have changed the sdb2 partition
> > twice.
>
> Have you ever changed the start of the partition?

No.


> If the start had changed the superblock should be in the wrong
> place, so the mount wouldn't have gotten this far.

Right.


> > In any case changing the partition table shouldn't affect the
> > filesystem, right? Also, I changed the partition with the filesystem
> > mounted, so the kernel did not start using the new partition table.
>
> I'd have to repeat the test on this flash card to say for sure.
> Deleting then recreating the partition with the FS mounted isn't
> very high up on the list of things that get tested often, so my
> guess is that's where the problem is.

As I understand it, fdisk writes the new partition table to disk, and
asks the kernel to re-read it from there. That ioctl failed, I expect
because the filesystem was mounted.


> Right, we've got a block full of zeros where they don't belong.
> Can you dump the block contents with gdb please?

Will do!


> Ok, we talked about power offs and barriers in a different thread,
> but I didn't realize you were on a CF device. I'd want to do some
> tests on this device to see how well it really reacted in power
> offs, but let's do that after we pull your data somewhere safer.

I'm of course happy to help test anything!


//Peter
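For reference, the "re-read the partition table" request described above can be sketched as follows. This assumes the ioctl in question is Linux's BLKRRPART, defined as `_IO(0x12, 95)`; the thread does not name it, and the function name `reread_partition_table` is made up here. It needs root and a real block device, so treat it as an illustration rather than a tested tool.

```python
import errno
import fcntl
import os

# BLKRRPART is _IO(0x12, 95) in the Linux headers: no direction bits and
# no size, so the request number is simply (type << 8) | nr.
BLKRRPART = (0x12 << 8) | 95


def reread_partition_table(device: str) -> bool:
    """Ask the kernel to re-read the partition table of `device`.

    Returns True on success and False if the kernel answers EBUSY, which
    is the usual result while a partition on the disk is still in use.
    """
    fd = os.open(device, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, BLKRRPART)
        return True
    except OSError as exc:
        if exc.errno == errno.EBUSY:
            return False
        raise
    finally:
        os.close(fd)
```

When the call reports EBUSY the kernel keeps using its old in-memory partition layout until the next reboot, which matches the behaviour described in the message above.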
Excerpts from Peter Stuge's message of 2011-03-10 08:45:09 -0500:
> Chris Mason wrote:
> > Which tool and which version of the tool did you use to delete the
> > partition?
>
> fdisk from util-linux-2.18

Straight from util-linux, or with distro patches?

>
> The non-working partition was deleted and the current one created
> with fdisk from util-linux-2.14.2.
>
> > > It's a 64GB CF card with two partitions; one 40MB ext2 and "the rest"
> > > is btrfs. This is the current fdisk output:
> >
> > Ok, going back to your original email, the block you're failing on
> > is probably right in the middle of the drive.
>
> Right.
>
> > We can't be sure without looking at the mapping tree (which we
> > don't have),
>
> Could we guess at where it was put?

Until you do something funky like balance the drive, there's a 1:1
mapping. The easy way to guess is to strace btrfsck and see where it
read.

>
> > > I say current, because by now I have changed the sdb2 partition
> > > twice.
> >
> > Have you ever changed the start of the partition?
>
> No.
>
> > If the start had changed the superblock should be in the wrong
> > place, so the mount wouldn't have gotten this far.
>
> Right.
>
> > > In any case changing the partition table shouldn't affect the
> > > filesystem, right? Also, I changed the partition with the filesystem
> > > mounted, so the kernel did not start using the new partition table.
> >
> > I'd have to repeat the test on this flash card to say for sure.
> > Deleting then recreating the partition with the FS mounted isn't
> > very high up on the list of things that get tested often, so my
> > guess is that's where the problem is.
>
> As I understand it, fdisk writes the new partition table to disk, and
> asks the kernel to re-read it from there. That ioctl failed, I expect
> because the filesystem was mounted.
>
> > Right, we've got a block full of zeros where they don't belong.
> > Can you dump the block contents with gdb please?
>
> Will do!
>
> > Ok, we talked about power offs and barriers in a different thread,
> > but I didn't realize you were on a CF device. I'd want to do some
> > tests on this device to see how well it really reacted in power
> > offs, but let's do that after we pull your data somewhere safer.
>
> I'm of course happy to help test anything!

Great, thanks.

-chris
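The idea of a fixed logical-to-physical mapping can be illustrated with a small sketch. It is a simplification that assumes a single device and plain, non-striped chunks; the `Chunk` type, the function name and all the numbers are hypothetical and are not taken from this filesystem or from btrfs-progs.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Chunk:
    """One simplified mapping entry: a logical range and where it lives."""
    logical_start: int
    physical_start: int
    length: int


def logical_to_physical(chunks: Iterable[Chunk], logical: int) -> Optional[int]:
    """Translate a logical address to a device offset, or None if unmapped."""
    for chunk in chunks:
        if chunk.logical_start <= logical < chunk.logical_start + chunk.length:
            # Inside one chunk the offset between the two address spaces
            # is constant, so the translation is a simple shift.
            return chunk.physical_start + (logical - chunk.logical_start)
    return None


if __name__ == "__main__":
    # Hypothetical chunk: logical 1000000..1999999 stored at 5000000 onward.
    chunks = [Chunk(logical_start=1_000_000, physical_start=5_000_000,
                    length=1_000_000)]
    print(logical_to_physical(chunks, 1_004_096))  # 5004096
    print(logical_to_physical(chunks, 2_000_000))  # None
```

Watching which offsets a tool actually reads, as suggested above with strace, gives the physical side of such a pair; subtracting the logical address then yields the shift for that chunk.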
Sorry for not following up on this until now. :( I've been busy and
have been using a backup. But I'm still very interested in restoring
the btrfs and finding this bug! Let me know if I should refresh any
details.

Chris Mason wrote:
> > In any case changing the partition table shouldn't affect the
> > filesystem, right? Also, I changed the partition with the filesystem
> > mounted, so the kernel did not start using the new partition table.
>
> I'd have to repeat the test on this flash card to say for sure.
> Deleting then recreating the partition with the FS mounted isn't very
> high up on the list of things that get tested often, so my guess is
> that's where the problem is.

I resized the filesystem, and then changed the partition. However as
expected the kernel doesn't really care about the partition change
until I reboot, since a partition on the disk was mounted. Do you
think this could be a problem still? It seems that the only way there
could be a problem is if the resize would not stay within the new
smaller size?


> > from gdb:ing disk-io.c it seems that zero-bytes are where there's
> > supposed to be a root node. So either the root node was destroyed
> > (uh-oh?) or code is reading from the wrong place. I don't know
> > which is more likely?
>
> Right, we've got a block full of zeros where they don't belong.
> Can you dump the block contents with gdb please? I'd like to
> see if they are all zeros or just offset slightly.

That block is all zeros.

/tmp/btrfs-progs-unstable $ gdb --args ./btrfs-debug-tree /dev/sdb2
GNU gdb 6.8
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
(gdb) b read_extent_from_disk
Breakpoint 1 at 0x806882d: file extent_io.c, line 671.
(gdb) r
Starting program: /tmp/btrfs-progs-unstable/btrfs-debug-tree /dev/sdb2

Breakpoint 1, read_extent_from_disk (eb=0x92a0538) at extent_io.c:671
671             ret = pread(eb->fd, eb->data, eb->len, eb->dev_bytenr);
(gdb) p eb->dev_bytenr
$1 = 20971520
(gdb) c
Continuing.

Breakpoint 1, read_extent_from_disk (eb=0x92a1588) at extent_io.c:671
671             ret = pread(eb->fd, eb->data, eb->len, eb->dev_bytenr);
(gdb) p eb->dev_bytenr
$2 = 20987904
(gdb) c
Continuing.

Breakpoint 1, read_extent_from_disk (eb=0x92a25d8) at extent_io.c:671
671             ret = pread(eb->fd, eb->data, eb->len, eb->dev_bytenr);
(gdb) p eb->dev_bytenr
$3 = 20983808
(gdb) c
Continuing.

Breakpoint 1, read_extent_from_disk (eb=0x92a3628) at extent_io.c:671
671             ret = pread(eb->fd, eb->data, eb->len, eb->dev_bytenr);
(gdb) p eb->dev_bytenr
$4 = 36675878912
(gdb) p eb->len
(gdb) c
Continuing.

Breakpoint 1, read_extent_from_disk (eb=0x92a3628) at extent_io.c:671
671             ret = pread(eb->fd, eb->data, eb->len, eb->dev_bytenr);
(gdb) p eb->dev_bytenr
$5 = 36944314368
(gdb) c
Continuing.
btrfs-debug-tree: disk-io.c:739: open_ctree_fd: Assertion `!(!tree_root->node)' failed.

Program received signal SIGABRT, Aborted.
0xb78ea424 in __kernel_vsyscall ()
(gdb) q
The program is running.  Exit anyway? (y or n) y

$ bc <<< 36944314368/4096
9019608
$ dd if=/dev/sdb2 bs=4k skip=9019608 count=1|xxd -a
0000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
*
0000ff0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.0035931 s, 1.1 MB/s

Nearby blocks are also zero:

$ dd if=/dev/sdb2 bs=4k skip=$[9019608-8] count=16|xxd -a
0000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
*
000fff0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
16+0 records in
16+0 records out
65536 bytes (66 kB) copied, 0.0128077 s, 5.1 MB/s

Further back there are some bits:

$ dd if=/dev/sdb2 bs=4k skip=$[9019608-17] count=32|xxd -a
0000000: 52a6 7b43 0000 0000 0000 0000 0000 0000  R.{C............
0000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000020: c25c f583 2e8c 40b2 8464 175f e314 7a5f  .\....@..d._..z_
0000030: 0070 8c09 0800 0000 0100 0000 0000 0001  .p..............
0000040: 1538 bb3f eb62 45f5 8956 4728 ca7b fad0  .8.?.bE..VG(.{..
0000050: 8934 0500 0000 0000 0200 0000 0000 0000  .4..............
0000060: 4b00 0000 0100 a099 ec07 0000 00a8 0070  K..............p
0000070: 0000 0000 0000 0050 0103 0800 0000 bc32  .......P.......2
0000080: 0500 0000 0000 0090 a6ec 0700 0000 a800  ................
0000090: 1000 0000 0000 0000 3043 0408 0000 00ed  ........0C......
00000a0: 3205 0000 0000 0000 50ac ec07 0000 00a8  2.......P.......
00000b0: 0020 0000 0000 0000 00d0 4003 0800 0000  . ........@.....
00000c0: be32 0500 0000 0000 0050 b1ec 0700 0000  .2.......P......
00000d0: a800 0002 0000 0000 0000 506f 0908 0000  ..........Po....
00000e0: 0084 3405 0000 0000 0000 40be ec07 0000  ..4.......@.....
00000f0: 00a8 0020 0000 0000 0000 0060 4204 0800  ... .......`B...
0000100: 0000 ed32 0500 0000 0000 0030 c1ec 0700  ...2.......0....
0000110: 0000 a800 1000 0000 0000 0000 8001 0308  ................
0000120: 0000 00bc 3205 0000 0000 0000 f0c5 ec07  ....2...........
0000130: 0000 00a8 0010 0000 0000 0000 00b0 3d03  ..............=.
0000140: 0800 0000 be32 0500 0000 0000 00d0 c8ec  .....2..........
0000150: 0700 0000 a800 1000 0000 0000 0000 c03d  ...............=
0000160: 0308 0000 00be 3205 0000 0000 0000 90cc  ......2.........
0000170: ec07 0000 00a8 0010 0000 0000 0000 0020  ............... 
0000180: 6f09 0800 0000 8434 0500 0000 0000 0010  o......4........
0000190: d7ec 0700 0000 a800 1000 0000 0000 0000  ................
00001a0: f0dd 0108 0000 009b 3205 0000 0000 0000  ........2.......
00001b0: 50e0 ec07 0000 00a8 0030 0000 0000 0000  P........0......
00001c0: 00f0 cd02 0800 0000 b332 0500 0000 0000  .........2......
00001d0: 0030 e6ec 0700 0000 a800 3000 0000 0000  .0........0.....
00001e0: 0000 006f 0908 0000 0084 3405 0000 0000  ...o......4.....
00001f0: 0000 40f0 ec07 0000 00a8 00a0 0000 0000  ..@.............
0000200: 0000 0060 7409 0800 0000 8534 0500 0000  ...`t......4....
0000210: 0000 0000 05ed 0700 0000 a800 1000 0000  ................
0000220: 0000 0000 d06e 0908 0000 0084 3405 0000  .....n......4...
0000230: 0000 0000 f031 ed07 0000 00a8 0010 0000  .....1..........
0000240: 0000 0000 00e0 be07 0800 0000 d933 0500  .............3..
0000250: 0000 0000 0020 52ed 0700 0000 a800 3000  ..... R.......0.
0000260: 0000 0000 0000 60db 0708 0000 00e0 3305  ......`.......3.
0000270: 0000 0000 0000 0068 ef07 0000 00a8 00f0  .......h........
0000280: 0800 0000 0000 00b0 8c09 0800 0000 8934  ...............4
0000290: 0500 0000 0000 0010 c101 0800 0000 a800  ................
00002a0: 1000 0000 0000 0000 0044 0908 0000 007b  .........D.....{
00002b0: 3405 0000 0000 0000 d0c6 0108 0000 00a8  4...............
00002c0: 0010 0000 0000 0000 0030 de08 0800 0000  .........0......
00002d0: 5c34 0500 0000 0000 00c0 c901 0800 0000  \4..............
00002e0: a800 1000 0000 0000 0000 2014 0508 0000  .......... .....
00002f0: 0055 3305 0000 0000 0000 90d0 0108 0000  .U3.............
0000300: 00a8 0010 0000 0000 0000 00a0 c003 0800  ................
0000310: 0000 cc32 0500 0000 0000 0010 d401 0800  ...2............
0000320: 0000 a800 1000 0000 0000 0000 608b 0908  ............`...
0000330: 0000 0088 3405 0000 0000 0000 50dc 0108  ....4.......P...
0000340: 0000 00a8 0010 0000 0000 0000 0020 4803  ............. H.
0000350: 0800 0000 be32 0500 0000 0000 00e0 e001  .....2..........
0000360: 0800 0000 a800 1000 0000 0000 0000 109f  ................
0000370: 0708 0000 00d5 3305 0000 0000 0000 70eb  ......3.......p.
0000380: 0108 0000 00a8 0010 0000 0000 0000 00a0  ................
0000390: 8a09 0800 0000 8834 0500 0000 0000 0080  .......4........
00003a0: 0602 0800 0000 a800 1000 0000 0000 0000  ................
00003b0: c085 0908 0000 0088 3405 0000 0000 0000  ........4.......
00003c0: b03a 0208 0000 00a8 0010 0000 0000 0000  .:..............
00003d0: 0070 8a09 0800 0000 8834 0500 0000 0000  .p.......4......
00003e0: 0020 8502 0800 0000 a800 1000 0000 0000  . ..............
00003f0: 0000 2047 0808 0000 0023 3405 0000 0000  .. G.....#4.....
0000400: 0000 90a6 0208 0000 00a8 0010 0000 0000  ................
0000410: 0000 00e0 8a09 0800 0000 8834 0500 0000  ...........4....
0000420: 0000 00c0 d702 0800 0000 a800 1000 0000  ................
0000430: 0000 0000 3081 0908 0000 0086 3405 0000  ....0.......4...
0000440: 0000 0000 5001 0308 0000 00a8 0010 0000  ....P...........
0000450: 0000 0000 0050 7009 0800 0000 8434 0500  .....Pp......4..
0000460: 0000 0000 00a0 1203 0800 0000 a800 1000  ................
0000470: 0000 0000 0000 b03d 0508 0000 005a 3305  .......=.....Z3.
0000480: 0000 0000 0000 7014 0308 0000 00a8 0010  ......p.........
0000490: 0000 0000 0000 0080 7809 0800 0000 8534  ........x......4
00004a0: 0500 0000 0000 0070 1703 0800 0000 a800  .......p........
00004b0: 1000 0000 0000 0000 806e 0908 0000 0084  .........n......
00004c0: 3405 0000 0000 0000 5022 0308 0000 00a8  4.......P"......
00004d0: 0010 0000 0000 0000 00d0 8203 0800 0000  ................
00004e0: c332 0500 0000 0000 0020 2503 0800 0000  .2....... %.....
00004f0: a800 1000 0000 0000 0000 e078 0908 0000  ...........x....
0000500: 0085 3405 0000 0000 0000 202e 0308 0000  ..4....... .....
0000510: 00a8 0010 0000 0000 0000 00a0 4809 0800  ............H...
0000520: 0000 7c34 0500 0000 0000 0070 3103 0800  ..|4.......p1...
0000530: 0000 a800 1000 0000 0000 0000 907f 0908  ................
0000540: 0000 0086 3405 0000 0000 0000 203d 0308  ....4....... =..
0000550: 0000 00a8 0010 0000 0000 0000 0070 6f09  .............po.
0000560: 0800 0000 8434 0500 0000 0000 00c0 4503  .....4........E.
0000570: 0800 0000 a800 1000 0000 0000 0000 2078  .............. x
0000580: 0908 0000 0085 3405 0000 0000 0000 b048  ......4........H
0000590: 0308 0000 00a8 0010 0000 0000 0000 00e0  ................
00005a0: dd08 0800 0000 5c34 0500 0000 0000 0040  ......\4.......@
00005b0: 4c03 0800 0000 a800 1000 0000 0000 0000  L...............
00005c0: f010 0708 0000 00a0 3305 0000 0000 0000  ........3.......
00005d0: 404f 0308 0000 00a8 0010 0000 0000 0000  @O..............
00005e0: 0000 c406 0800 0000 9633 0500 0000 0000  .........3......
00005f0: 0080 5203 0800 0000 a800 1000 0000 0000  ..R.............
0000600: 0000 0000 0000 0000 0000 0000 0000 0000  ................
*
001fff0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
32+0 records in
32+0 records out
131072 bytes (131 kB) copied, 0.0395934 s, 3.3 MB/s

But going forward there doesn't seem to be much:

$ dd if=/dev/sdb2 bs=4k skip=$[9019608] count=1024|xxd -a
0000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
*
03ffff0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
1024+0 records in
1024+0 records out
4194304 bytes (4.2 MB) copied, 1.62279 s, 2.6 MB/s


//Peter
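The first rows of the non-zero block found 17 blocks back look like a btrfs tree block header: the 16 bytes at offset 0x20 are the fsid from the kernel log, stored byte-wise. A minimal sketch for decoding such a header follows. It assumes the packed little-endian layout of struct btrfs_header from btrfs-progs' ctree.h (csum[32], fsid[16], bytenr, flags, chunk_tree_uuid[16], generation, owner, nritems, level); the names HEADER, Header, parse_header and SAMPLE are made up for this illustration, and SAMPLE is just the first 0x70 bytes of the dump retyped.

```python
import struct
from collections import namedtuple

# Assumed on-disk layout of struct btrfs_header (little-endian, packed):
# csum[32], fsid[16], bytenr (u64), flags (u64), chunk_tree_uuid[16],
# generation (u64), owner (u64), nritems (u32), level (u8) = 101 bytes.
HEADER = struct.Struct("<32s16sQQ16sQQIB")
Header = namedtuple(
    "Header",
    "csum fsid bytenr flags chunk_tree_uuid generation owner nritems level",
)


def parse_header(block):
    """Decode the leading 101 bytes of a (presumed) btrfs tree block."""
    return Header(*HEADER.unpack_from(block, 0))


# First 0x70 bytes of the block at device block 9019608-17, copied from
# the xxd output in this mail.
SAMPLE = bytes.fromhex(
    "52a67b43000000000000000000000000"
    "00000000000000000000000000000000"
    "c25cf5832e8c40b28464175fe3147a5f"
    "00708c09080000000100000000000001"
    "1538bb3feb6245f589564728ca7bfad0"
    "89340500000000000200000000000000"
    "4b0000000100a099ec07000000a80070"
)

h = parse_header(SAMPLE)
print("fsid      ", h.fsid.hex())
print("bytenr    ", h.bytenr)
print("generation", h.generation)
print("owner     ", h.owner)
print("nritems   ", h.nritems)
print("level     ", h.level)
```

If the layout assumption holds, the header's own bytenr is 34519937024, which is exactly 17 * 4096 below the logical address 34520006656 that fails to read, matching the 17-block distance on disk. The decoded generation is 341129 (the kernel reported transid 341132) and the owner is 2, so this would be an older extent tree node rather than the missing tree root.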
Chris Mason wrote:
> > > Which tool and which version of the tool did you use to delete the
> > > partition?
> >
> > fdisk from util-linux-2.18
>
> Straight from util-linux, or with distro patches?

Yes, a few Gentoo patches, but nothing that seems relevant:

epatch "${FILESDIR}"/${P}-ncursesw.patch
Subject: [PATCH] cfdisk: search for ncursesw/ncurses.h

Deals with building cfdisk, which I did not use. (I used plain fdisk)

epatch "${FILESDIR}"/${P}-slang.patch #326373
Subject: [PATCH] cfdisk: fix --with-slang

Fixes bug in cfdisk --with-slang, which I did not enable.

epatch "${FILESDIR}"/${P}-cfdisk-string-len.patch #328959
Subject: [PATCH] cfdisk: get_string not calculating correct limits

cfdisk string input length limit stuff.

epatch "${FILESDIR}"/${P}-falloc.patch #339432
Subject: [PATCH] fallocate: fix build failure with old linux headers

I built with new linux headers and had no problem during build.

> > > We can't be sure without looking at the mapping tree (which we
> > > don't have),
> >
> > Could we guess at where it was put?
>
> Until you do something funky like balance the drive, there's a 1:1
> mapping.  The easy way to guess is to strace btrfsck and see where
> it read.

Basically same as gdb:

open("/dev/sdb2", O_RDONLY|O_LARGEFILE) = 3
pread64(3, "p\23{\335\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\302"..., 2859, 65536) = 2859
pread64(3, "\320rS\23\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\302"..., 2859, 67108864) = 2859
pread64(3, ""..., 2859, 274877906944) = 0
open("/dev/sdb2", O_RDONLY|O_LARGEFILE) = 5
pread64(5, "p\23{\335\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\302"..., 2859, 65536) = 2859
pread64(5, "\320rS\23\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\302"..., 2859, 67108864) = 2859
pread64(5, ""..., 2859, 274877906944) = 0
pread64(5, "\353$m<\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\302"..., 4096, 20971520) = 4096
pread64(5, "\275\354\332\225\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\302"..., 4096, 20987904) = 4096
pread64(5, "\253\22#(\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\302"..., 4096, 20983808) = 4096
pread64(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096, 36675878912) = 4096
pread64(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096, 36675878912) = 4096
pread64(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096, 36944314368) = 4096
write(2, "btrfsck: disk-io.c:739: open_ctre"..., 79btrfsck: disk-io.c:739: open_ctree_fd: Assertion `!(!tree_root->node)' failed.
) = 79

I see that I made a mistake copy-pasting from gdb in the other email. I
didn't think that btrfs-debug-tree actually read at 36675878912 twice.


//Peter
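To compare a trace like this against the gdb session without reading it by eye, the pread64 offsets can be pulled out mechanically. The following is only a sketch: PREAD, parse_preads and SAMPLE are hypothetical names, the sample lines are taken from the trace above with the buffer previews shortened, and the "looks zero" flag is a heuristic that only sees the short preview strace prints, not the full 4096 bytes.

```python
import re

# Matches lines such as:
#   pread64(5, "\0\0\0"..., 4096, 36675878912) = 4096
PREAD = re.compile(
    r'pread64\(\d+, "(.*)"(?:\.\.\.)?, (\d+), (\d+)\)\s*=\s*(-?\d+)'
)


def parse_preads(text):
    """Return (offset, length, result, preview_looks_zero) per pread64 line."""
    reads = []
    for line in text.splitlines():
        m = PREAD.search(line)
        if not m:
            continue
        preview, length, offset, result = m.groups()
        # The preview is strace's escaped text; all-zero data shows up as
        # a run of the two characters backslash and '0' and nothing else.
        looks_zero = preview != "" and set(preview.split("\\0")) == {""}
        reads.append((int(offset), int(length), int(result), looks_zero))
    return reads


# Three lines from the trace, previews abbreviated for the example.
SAMPLE = r'''
pread64(5, "\253\22#(\0\0\0\0"..., 4096, 20983808) = 4096
pread64(5, "\0\0\0\0\0\0\0\0"..., 4096, 36675878912) = 4096
pread64(5, "\0\0\0\0\0\0\0\0"..., 4096, 36944314368) = 4096
'''

for offset, length, result, looks_zero in parse_preads(SAMPLE):
    print(offset, offset // 4096, length, result, looks_zero)
```

The second column printed is the offset divided by 4096, which is the value to pass to dd as skip= with bs=4k.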
Peter Stuge wrote:
> Sorry for not following up on this until now. :( I've been busy and
> have been using a backup. But I'm still very interested in restoring
> the btrfs and finding this bug! Let me know if I should refresh any
> details.

Ping?

I've created a small btrfs to see if I can learn a little about the
structure (wiki next to the terminal), but I don't really expect to be
efficient at it.

Any hints on how/where I can find my root node again? Or, if it is
unidentifiable, could it be reconstructed?


//Peter
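One way to go looking for a lost root node is a brute-force scan: walk the device in 4096-byte steps and keep every block whose header carries this filesystem's fsid and claims to belong to the root tree. The sketch below is an illustration under assumptions, not an existing btrfs-progs tool. It assumes the packed struct btrfs_header layout from ctree.h, that the root tree's owner id is 1 (BTRFS_ROOT_TREE_OBJECTID), and a 4096-byte node size; find_candidate_roots and its parameters are hypothetical names. It does not verify checksums, so hits are only candidates.

```python
import struct

# Assumed packed little-endian struct btrfs_header:
# csum[32], fsid[16], bytenr, flags, chunk_tree_uuid[16],
# generation, owner, nritems (u32), level (u8).
HEADER = struct.Struct("<32s16sQQ16sQQIB")
ROOT_TREE_OBJECTID = 1  # assumption: BTRFS_ROOT_TREE_OBJECTID in ctree.h


def find_candidate_roots(path, fsid, blocksize=4096, chunk_blocks=256):
    """Scan path for blocks that look like root tree nodes of this fsid.

    Returns a list of (device_offset, header_bytenr, generation, level),
    newest generation first. Checksums are NOT verified.
    """
    found = []
    with open(path, "rb") as dev:
        offset = 0
        while True:
            chunk = dev.read(blocksize * chunk_blocks)
            if not chunk:
                break
            # Look at the start of every block inside this chunk.
            for i in range(0, len(chunk) - HEADER.size + 1, blocksize):
                (_csum, blk_fsid, bytenr, _flags, _uuid,
                 generation, owner, _nritems, level) = HEADER.unpack_from(chunk, i)
                if blk_fsid == fsid and owner == ROOT_TREE_OBJECTID:
                    found.append((offset + i, bytenr, generation, level))
            offset += len(chunk)
    found.sort(key=lambda c: c[2], reverse=True)
    return found


# Example call (not run here; reading /dev/sdb2 needs root):
# fsid = bytes.fromhex("c25cf5832e8c40b28464175fe3147a5f")
# for dev_off, bytenr, gen, level in find_candidate_roots("/dev/sdb2", fsid):
#     print(dev_off, bytenr, gen, level)
```

A candidate with generation 341132 would correspond to the transid in the kernel log; older candidates may still describe an earlier, possibly consistent state. Whether any of them can actually be used depends on the superblock pointing at it, which this sketch does not attempt.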