Hi,

I added a new drive to an existing RAID 0 array. Every attempt to rebalance the array fails:

# btrfs filesystem balance /share/bd8
ERROR: error during balancing '/share/bd8' - Input/output error
# dmesg | tail
btrfs: found 1 extents
btrfs: relocating block group 10752513540096 flags 1
btrfs: found 5 extents
btrfs: found 5 extents
btrfs: relocating block group 10751439798272 flags 1
btrfs: found 1 extents
btrfs: found 1 extents
btrfs: relocating block group 10048138903552 flags 1
btrfs csum failed ino 365 off 221745152 csum 3391451932 private 3121065028
btrfs csum failed ino 365 off 221745152 csum 3391451932 private 3121065028

An earlier rebalance attempt had the same csum error on a different inode:

btrfs csum failed ino 312 off 221745152 csum 3391451932 private 3121065028
btrfs csum failed ino 312 off 221745152 csum 3391451932 private 3121065028

Every rebalance attempt fails the same way, but with a different inum.

Here is the array:

# btrfs filesystem show
Label: 'bd8' uuid: b39f475f-3ebf-40ea-b088-4ce7f4d4d8f4
Total devices 4 FS bytes used 7.37TB
devid 4 size 3.64TB used 52.00GB path /dev/sde
devid 1 size 3.64TB used 3.32TB path /dev/sdf1
devid 3 size 3.64TB used 2.92TB path /dev/sdc
devid 2 size 3.64TB used 2.97TB path /dev/sdb

While I didn't finish the scrub, no errors were found:

# btrfs scrub status -d /share/bd8
scrub status for b39f475f-3ebf-40ea-b088-4ce7f4d4d8f4
scrub device /dev/sdf1 (id 1) status
scrub resumed at Sun Jun 2 20:29:06 2013, running for 10360 seconds
total bytes scrubbed: 845.53GB with 0 errors
scrub device /dev/sdb (id 2) status
scrub resumed at Sun Jun 2 20:29:06 2013, running for 10360 seconds
total bytes scrubbed: 869.38GB with 0 errors
scrub device /dev/sdc (id 3) status
scrub resumed at Sun Jun 2 20:29:06 2013, running for 10360 seconds
total bytes scrubbed: 706.04GB with 0 errors
scrub device /dev/sde (id 4) history
scrub started at Sun Jun 2 12:48:36 2013 and finished after 0 seconds
total bytes scrubbed: 0.00 with 0 errors

Mount options:
/dev/sdf1 on /share/bd8 type btrfs (rw,flushoncommit)

Kernel 3.9.4

John
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On Sun, Jun 2, 2013 at 9:05 PM, John Haller <john.h.haller@gmail.com> wrote:
> Hi,
>
> I added a new drive to an existing RAID 0 array. Every
> attempt to rebalance the array fails:
> # btrfs filesystem balance /share/bd8
> btrfs csum failed ino 365 off 221745152 csum 3391451932 private 3121065028
> btrfs csum failed ino 365 off 221745152 csum 3391451932 private 3121065028
>
> An earlier rebalance attempt had the same csum error on a different inode:
> btrfs csum failed ino 312 off 221745152 csum 3391451932 private 3121065028
> btrfs csum failed ino 312 off 221745152 csum 3391451932 private 3121065028
>
> Every rebalance attempt fails the same way, but with a different inum.
>

Final scrub results:

btrfs: checksum error at logical 9524548104192 on dev /dev/sdc, sector 5252042552, root 5, inode 6754, offset 14188032000, length 4096, links 1 (path: bd6/...)
btrfs: bdev /dev/sdc errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
btrfs: unable to fixup (regular) error at logical 9524548104192 on dev /dev/sdc
btrfs: checksum error at logical 9531724152832 on dev /dev/sdc, sector 5266058272, root 5, inode 6755, offset 1801699328, length 4096, links 1 (path: bd6/...)
btrfs: bdev /dev/sdc errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
btrfs: unable to fixup (regular) error at logical 9531724152832 on dev /dev/sdc
btrfs: checksum error at logical 9628551053312 on dev /dev/sdc, sector 5455173312, root 5, inode 6757, offset 10686889984, length 4096, links 1 (path: bd6/...)
btrfs: bdev /dev/sdc errs: wr 0, rd 0, flush 0, corrupt 3, gen 0
btrfs: unable to fixup (regular) error at logical 9628551053312 on dev /dev/sdc
btrfs: checksum error at logical 9645596147712 on dev /dev/sdc, sector 5488464512, root 5, inode 6757, offset 22100770816, length 4096, links 1 (path: bd6/...)
btrfs: bdev /dev/sdc errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
btrfs: unable to fixup (regular) error at logical 9645596147712 on dev /dev/sdc
btrfs: checksum error at logical 9662878707712 on dev /dev/sdc, sector 5522219512, root 5, inode 6758, offset 1697771520, length 4096, links 1 (path: bd6/...)
btrfs: bdev /dev/sdc errs: wr 0, rd 0, flush 0, corrupt 5, gen 0
btrfs: unable to fixup (regular) error at logical 9662878707712 on dev /dev/sdc
btrfs: checksum error at logical 9967720464384 on dev /dev/sdc, sector 6117613568, root 5, inode 6767, offset 19135102976, length 4096, links 1 (path: bd6/...)
btrfs: bdev /dev/sdc errs: wr 0, rd 0, flush 0, corrupt 6, gen 0
btrfs: unable to fixup (regular) error at logical 9967720464384 on dev /dev/sdc
btrfs: checksum error at logical 10048360648704 on dev /dev/sdc, sector 6275113928, root 5, inode 6771, offset 4187852800, length 4096, links 1 (path: bd6/...)
btrfs: bdev /dev/sdc errs: wr 0, rd 0, flush 0, corrupt 7, gen 0
btrfs: unable to fixup (regular) error at logical 10048360648704 on dev /dev/sdc
btrfs: checksum error at logical 6748601905152 on dev /dev/sdb, sector 6199388792, root 5, inode 6782, offset 7338291200, length 4096, links 1 (path: bd6/...)
btrfs: bdev /dev/sdb errs: wr 0, rd 0, flush 0, corrupt 1, gen 0

Reading one of the files yields the following in dmesg:

btrfs csum failed ino 6782 off 7338291200 csum 444044750 private 1783974550

But none of these match the inodes from the original csum failures. So what is causing the problem at offset 221745152, which didn't show up in the scrub but is associated with multiple inodes?

John
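
For anyone working through a similar log, here is a minimal sketch of pulling the device, inode, file offset and path out of scrub messages like the ones above. It assumes the kernel prints checksum errors in exactly the layout quoted in this message; other kernel versions may word them differently. The name scrub_errors is a made-up helper, not a btrfs command.

```shell
#!/bin/sh
# Sketch: print "device inode offset path" for each scrub checksum error found
# in kernel log text on stdin. Assumes the message layout quoted in this thread
# ("btrfs: checksum error at logical ... (path: ...)").
# scrub_errors is a hypothetical helper name, not part of btrfs-progs.
scrub_errors() {
    sed -n 's/.*checksum error at logical [0-9]* on dev \([^,]*\), sector [0-9]*, root [0-9]*, inode \([0-9]*\), offset \([0-9]*\),.*(path: \(.*\))[[:space:]]*$/\1 \2 \3 \4/p'
}

# Demo on lines copied from the scrub output above; the "bdev ... errs" line
# is not a checksum error and is skipped.
scrub_errors <<'EOF'
btrfs: checksum error at logical 9524548104192 on dev /dev/sdc, sector 5252042552, root 5, inode 6754, offset 14188032000, length 4096, links 1 (path: bd6/...)
btrfs: bdev /dev/sdc errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
btrfs: checksum error at logical 6748601905152 on dev /dev/sdb, sector 6199388792, root 5, inode 6782, offset 7338291200, length 4096, links 1 (path: bd6/...)
EOF
```

On a live system the same function could be fed from the kernel log, for example with dmesg | scrub_errors | sort -u, to get one line per damaged file region.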

On Sun, Jun 2, 2013 at 9:05 PM, John Haller <john.h.haller@...> wrote:
> Hi,
>
> I added a new drive to an existing RAID 0 array. Every
> attempt to rebalance the array fails:
> # btrfs filesystem balance /share/bd8
> ERROR: error during balancing '/share/bd8' - Input/output error
> [...]
> Here is the array:
> # btrfs filesystem show
> Label: 'bd8' uuid: b39f475f-3ebf-40ea-b088-4ce7f4d4d8f4
> Total devices 4 FS bytes used 7.37TB
> devid 4 size 3.64TB used 52.00GB path /dev/sde
> devid 1 size 3.64TB used 3.32TB path /dev/sdf1
> devid 3 size 3.64TB used 2.92TB path /dev/sdc
> devid 2 size 3.64TB used 2.97TB path /dev/sdb
> [...]

After cleaning up the errors reported by the scrub, the balance succeeded. The dmesg failure messages from the balance were no help in finding the bad sectors; only the scrub's dmesg output pointed to the files with errors.

Now the question is why the balance left things so unbalanced compared with the above:

# btrfs scrub status /share/bd8
scrub status for b39f475f-3ebf-40ea-b088-4ce7f4d4d8f4
scrub started at Mon Jun 17 23:07:01 2013 and finished after 39209 seconds
total bytes scrubbed: 7.49TB with 0 errors
# btrfs filesystem show
Label: 'bd8' uuid: b39f475f-3ebf-40ea-b088-4ce7f4d4d8f4
Total devices 4 FS bytes used 7.49TB
devid 4 size 3.64TB used 1.99TB path /dev/sdf
devid 1 size 3.64TB used 3.32TB path /dev/sdg1
devid 3 size 3.64TB used 1.99TB path /dev/sdc
devid 2 size 3.64TB used 1.97TB path /dev/sdb

Btrfs v0.20-rc1

It appears that devid 1 was never balanced. Note that the drive numbers are different because I still have the backup device connected, which had the originals of the corrupted files. The filesystem started with devid 1, was filled to the above capacity, and the other drives were added later, so it didn't start as a RAID 0 system. The metadata is RAID 1.

John
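
To put a number on the imbalance, the listing above can be reduced to a per-device fill percentage. The following is only a sketch: it assumes the older btrfs filesystem show layout used in this thread ("devid N size X used Y path P" with TB/GB/MB suffixes), and device_usage is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: read `btrfs filesystem show` output on stdin and print how full each
# device is. Assumes the "devid N size X used Y path P" layout and the
# TB/GB/MB suffixes shown in this thread; newer tools print TiB/GiB instead.
# device_usage is a hypothetical helper name, not part of btrfs-progs.
device_usage() {
    awk '
    function to_gb(s,   n) {
        n = s + 0                        # leading number, e.g. "3.64TB" gives 3.64
        if (s ~ /TB$/) return n * 1024
        if (s ~ /MB$/) return n / 1024
        return n                         # anything else is treated as GB
    }
    $1 == "devid" && $3 == "size" && $5 == "used" {
        printf "%s devid=%s used=%.0f%%\n", $8, $2, 100 * to_gb($6) / to_gb($4)
    }'
}

# Demo on the post-balance listing from this message.
device_usage <<'EOF'
devid 4 size 3.64TB used 1.99TB path /dev/sdf
devid 1 size 3.64TB used 3.32TB path /dev/sdg1
devid 3 size 3.64TB used 1.99TB path /dev/sdc
devid 2 size 3.64TB used 1.97TB path /dev/sdb
EOF
```

With the figures above, devid 1 comes out at roughly 91% allocated while the other three sit near 54 to 55%. On a live system the input would come from btrfs filesystem show | device_usage.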

On Jun 19, 2013, at 5:55 PM, John Haller <john.h.haller@gmail.com> wrote:
>
> It appears that devid 1 was never balanced. Note that the drive
> numbers are different because I still have the backup device connected
> which had the originals of corrupted files. The filesystem started
> with devid 1, was filled to the above capacity, and the other drives
> were added later, so it didn't start as a RAID 0 system.

Did you balance with -dconvert=raid0 when you added the additional drives?

Chris Murphy
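
One way to answer that question is to look at which data profiles the filesystem currently holds. The sketch below is not a diagnosis of this array. It assumes btrfs filesystem df prints lines of the form "Data, RAID0: total=..., used=...", and that a bare "Data:" line (as older tools are believed to print) means the single profile. data_profiles is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: read `btrfs filesystem df` output on stdin and print the profile of
# every Data line. Assumed layout: "Data, PROFILE: total=..., used=...";
# a bare "Data:" line is treated as the single profile (assumption).
# data_profiles is a hypothetical helper name, not part of btrfs-progs.
data_profiles() {
    sed -n -e 's/^Data, \([^:]*\):.*/\1/p' -e 's/^Data:.*/single/p'
}

# On the live system (needs root; shown here as comments, not executed):
#   btrfs filesystem df /share/bd8 | data_profiles
# If that prints more than one profile, the data could be converted with the
# balance filter asked about above:
#   btrfs balance start -dconvert=raid0 /share/bd8
```

If the function prints both single and RAID0, some data chunks were never converted, which would be one possible reason for a device staying fuller than the rest; the thread does not confirm that this is what happened here.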