(My previous post seems to have been discarded because of the attachment size. I'm resending it without the dmesg output, which can be found at http://pastebin.com/T0J3z59j )

Hi,

yesterday I updated my kernel (clean clone from mason/btrfs-unstable.git), pulling in the single latest change I had been missing ( http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=commit;h=3f6fae9559225741c91f1320090b285da1413290 ) and adding my patch from http://patchwork.kernel.org/patch/81547/ . The previous kernel version (without my patch - could this be my fault?) had been running fine for 14 days, but after recompiling and rebooting, my dmesg output is full of messages like these, repeating with different values:

btrfs no csum found for inode 386 start 0
btrfs csum failed ino 386 extent 65191274496 csum 1851253866 wanted 0 mirror 1
btrfs csum failed ino 82619 off 8749056 csum 2686054019 private 0

Also, accessing pretty much any file ends with "read error" - only small text files remained readable. Newly written files seem to behave correctly. I have tried reverting to the version from 10.02.2010 (last change on 04.02, without my patch) that worked well before, but now it also spits out the same errors and the files remain unreadable. The dmesg output is at the pastebin link above. Is there anything more I can do to help diagnose the problem?

Regards,

Leszek 'skolima' Ciesielski
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

I have changed the btrfs code to ignore checksum failures and now I can read files correctly from the filesystem. Also, moving them onto another volume and then back into btrfs fixes the checksums and no more errors are reported for the file in question.

Quick and dirty code I used for getting my files out:

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index a11a320..d6e6aa9 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -140,8 +140,9 @@ static int check_compressed_csum(struct inode *inode,
 			       "wanted %u mirror %d\n", inode->i_ino,
 			       (unsigned long long)disk_start,
 			       csum, *cb_sum, cb->mirror_num);
-			ret = -EIO;
-			goto fail;
+			/*ret = -EIO;
+			goto fail;*/
+			printk("btrfs ignoring compressed csum mismatch");
 		}
 		cb_sum++;

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 4deb280..f1572ce 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1955,8 +1955,9 @@ static int btrfs_readpage_end_io_hook(struct page *page, u64 start, u64 end,
 	csum = btrfs_csum_data(root, kaddr + offset, csum, end - start + 1);
 	btrfs_csum_final(csum, (char *)&csum);
-	if (csum != private)
-		goto zeroit;
+	if (csum != private && printk_ratelimit())
+		printk(KERN_INFO "btrfs ignoring csum mismatch");
+//		goto zeroit;

 	kunmap_atomic(kaddr, KM_USER0);
 good:

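For readers following the log lines, the failing check compares a checksum computed over the data just read with the value stored for that block; in the reported messages the stored value ("wanted" / "private") is 0. The following is only a user-space sketch of that kind of comparison, assuming CRC32C as the data checksum. It is not the kernel code: the names `crc32c` and `block_csum_ok` are hypothetical, and the sketch makes no claim to reproduce the exact numbers in the log.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78,
 * initial value 0xFFFFFFFF, final bit inversion.
 * Slow but easy to audit; real implementations use lookup tables
 * or hardware instructions.
 */
static uint32_t crc32c(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint32_t crc = 0xFFFFFFFFu;

	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/*
 * Hypothetical helper: returns 1 if the block's computed checksum
 * equals the stored one, 0 otherwise. A stored value of 0 fails for
 * any block whose CRC32C is not 0.
 */
static int block_csum_ok(const void *block, size_t len, uint32_t stored)
{
	return crc32c(block, len) == stored;
}
```

A mismatch from a check like this says only that the data and the stored checksum disagree; it does not say which of the two is wrong, which is why ignoring the failure can still yield readable files.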
On Thu, Feb 25, 2010 at 12:52 PM, Leszek Ciesielski <skolima@gmail.com> wrote:
> I have changed the btrfs code to ignore checksum failures and now I
> can read files correctly from the filesystem. Also, moving them onto
> another volume and then back into btrfs fixes the checksums and no
> more errors are reported for the file in question.
>
> Quick and dirty code I used for getting my files out:

Yes, but did you verify your data?

On Thu, Feb 25, 2010 at 10:34:22AM +0100, Leszek Ciesielski wrote:
> Previous kernel version (without my patch - could this be my fault?)
> has been running fine for 14 days, but after recompiling and
> rebooting, my dmesg output is full of "btrfs no csum found for inode
> 386 start 0" and "btrfs csum failed ino 386 extent 65191274496 csum
> 1851253866 wanted 0 mirror 1" and "btrfs csum failed ino 82619 off
> 8749056 csum 2686054019 private 0",

I don't think your patch alone could have caused this. Has anything else strange been happening on this machine?

The fact that all your files are wrong is especially strange. Have you ever mounted this FS with mount -o nodatasum?

-chris

On Fri, Feb 26, 2010 at 1:45 AM, Chris Mason <chris.mason@oracle.com> wrote:
> Yes, but did you verify your data?

Part of the data stored on the volume consisted of video recordings - after copying out and back onto the volume, they play back fine, without video or audio glitches. I am aware this does not mean they are intact, just "good enough to work". I also had some important data there, which is backed up to another location - I will verify its integrity with rsync during the weekend.

> I don't think your patch alone could have caused this. Has anything
> else strange been happening on this machine?

Not really. The FS was created with metadata=mirror data=mirror on a single drive, then a second (larger) drive was added and the fs was rebalanced. Compression is enabled. No problems until the last kernel update. After the recovery - no new csum failures.

> The fact that all your files are wrong is especially strange. Have you
> ever mounted this FS with mount -o nodatasum?

Only once, after the problem occurred - yesterday, when I was trying to copy the data out. Only read access was performed (although the fs was mounted rw).

Regards,

Leszek 'skolima' Ciesielski

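Since playback that looks fine does not prove the recovered files are intact, the stronger check is a byte-for-byte comparison against the backup copy. The following is a minimal sketch of such a comparison for a single pair of regular files. The function name `files_identical` and its return convention are assumptions made for illustration; it is not what rsync does internally.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: compare two regular files byte for byte.
 * Returns 1 if the contents are identical, 0 if they differ,
 * and -1 if either file cannot be opened or read.
 */
static int files_identical(const char *path_a, const char *path_b)
{
	unsigned char buf_a[65536], buf_b[65536];
	FILE *fa = fopen(path_a, "rb");
	FILE *fb = fopen(path_b, "rb");
	int result = 1;

	if (!fa || !fb) {
		result = -1;
		goto out;
	}
	for (;;) {
		size_t na = fread(buf_a, 1, sizeof(buf_a), fa);
		size_t nb = fread(buf_b, 1, sizeof(buf_b), fb);

		if (ferror(fa) || ferror(fb)) {
			result = -1;	/* read error, e.g. EIO from a failed csum */
			break;
		}
		if (na != nb || memcmp(buf_a, buf_b, na) != 0) {
			result = 0;	/* different length or different bytes */
			break;
		}
		if (na == 0)
			break;		/* both files reached EOF together */
	}
out:
	if (fa)
		fclose(fa);
	if (fb)
		fclose(fb);
	return result;
}
```

Calling it with a recovered file and its backup counterpart distinguishes "identical" from "differs" from "unreadable"; the last case matters here because a checksum failure on an unpatched kernel surfaces as a read error rather than as wrong data.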
On Fri, Feb 26, 2010 at 11:51:35AM +0100, Leszek Ciesielski wrote:
> Not really. The FS was created with metadata=mirror data=mirror on a
> single drive, then a second (larger) drive was added and the fs was
> rebalanced. Compression is enabled. No problems until the last kernel
> update. After the recovery - no new csum failures.

Ok, what does btrfsck say about the FS now?

-chris

On Fri, Feb 26, 2010 at 5:19 PM, Chris Mason <chris.mason@oracle.com> wrote:
> Ok, what does btrfsck say about the FS now?

49MB of errors, 3.5MB compressed file. So I guess it's not too good ;-)

Log uploaded here: http://cid-3e1bf92365ec19a2.skydrive.live.com/self.aspx/.Public/btrfscfk.txt.gz