Hi everybody,

Today while playing around with btrfs I uncovered what must be a bug in the btrfs checksum code. My kernel log received a couple of these messages with various ino and off numbers:

btrfs csum failed ino 5098 off 524288 csum 2981133980 private 959545494
[..]

This happens on reading from the btrfs filesystem.

The funny thing is that the files are read correctly, as verified by md5sum. I have cross-checked this on another machine (with the same kernel and btrfs utils): same result. A full filesystem md5sum check showed no errors. The md5sums were computed before the data was copied to the btrfs.

So I conclude that these messages are faulty, because the data is read correctly. In addition, when you have more than one btrfs, you cannot tell from the message which fs it is referring to.

Here is my setup; maybe it has something to do with the (nowadays) unusual kernel target:

- unmodified upstream 2.6.36 kernel
- Debian Squeeze
- Standard Debian gcc 4.3.5 with target i486
- CPU AMD Geode LX800 on ALIX board
- btrfs on USB-ATA connected IDE drive Seagate Barracuda 7200.8 ST3400832A
- btrfs utils v0.19
- about 300GB of data of all sorts in 50000+ files on the fs
- data gets rsynced to another btrfs volume of 1TB when the csum errors occur on read

I hope that some of this information rings a bell for someone. If so, please let me know ;)

bye,
Andreas
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
> Today while playing around with btrfs I uncovered what must be a bug in the btrfs checksum code. My kernel log received a couple of these messages with various ino and off numbers:
>
> btrfs csum failed ino 5098 off 524288 csum 2981133980 private 959545494
> [..]
>
> This happens on reading from the btrfs filesystem.
>
> The funny thing is that the files are read correctly, as verified by md5sum. I have cross-checked this on another machine (with the same kernel and btrfs utils): same result. A full filesystem md5sum check showed no errors. The md5sums were computed before the data was copied to the btrfs.
>
> So I conclude that these messages are faulty, because the data is read correctly. In addition, when you have more than one btrfs, you cannot tell from the message which fs it is referring to.

Is this a raid1 or a dup array?
>> So I conclude that these messages are faulty, because the data is read correctly. In addition, when you have more than one btrfs, you cannot tell from the message which fs it is referring to.
>
> Is this a raid1 or a dup array?

No, a plain vanilla partition on a physical hard disk. The btrfs was made with the command "mkfs.btrfs /dev/sdc1", with no extra arguments.

bye,
A.B.
To follow up on this matter, I have created another two btrfs volumes (also plain, with no options, and also on two external USB-SATA disks), and am at the moment copying heaps of data between these two. No errors as of yet. All copies are verified by md5sum after the deed.

The volume in question can still "reliably" reproduce the csum errors on read, though. Approx. 30 csum errors occur when the whole fs is read. The data is still fine.

I can put it aside for further debugging until at most Wednesday morning. If someone wants me to run diagnostics on it, please let me know. I am glad to be of help (until Wednesday morning).

Andreas
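For anyone wanting to repeat the "checksum before the copy, compare after the copy" check described above, here is a minimal Python sketch. The function names (md5_of_file, md5_tree, mismatches) are made up for illustration; the posts only say that md5sum was used, not how it was scripted.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of one file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_tree(root):
    """Map each regular file below root (path relative to root) to its MD5 digest."""
    sums = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files; only regular files are hashed.
            if os.path.isfile(path) and not os.path.islink(path):
                sums[os.path.relpath(path, root)] = md5_of_file(path)
    return sums


def mismatches(before, after):
    """Return the sorted relative paths whose digest differs or that exist on one side only."""
    return sorted(p for p in before.keys() | after.keys()
                  if before.get(p) != after.get(p))
```

A run would compute md5_tree() on the source before copying, md5_tree() on the btrfs mount point after copying, and pass both to mismatches(). An empty list corresponds to the "no errors" result reported in this thread.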
On 1 November 2010 00:35, Andreas Bauer <ab@voltage.de> wrote:
> So I conclude that these messages are faulty, because the data is read correctly.
> In addition, when you have more than one btrfs, you cannot tell from the message
> which fs it is referring to.
>
> Is this a raid1 or a dup array?
>
> No, a plain vanilla partition on a physical hard disk. The btrfs was made with the command "mkfs.btrfs /dev/sdc1", with no extra arguments.

By default, metadata is duplicated, so it could be that btrfs is using the correct copy of the metadata after finding checksum errors in the first copy.

Daniel
--
Daniel J Blueman
On Mon, Nov 1, 2010 at 4:55 AM, Daniel J Blueman <daniel.blueman@gmail.com> wrote:
> On 1 November 2010 00:35, Andreas Bauer <ab@voltage.de> wrote:
>> So I conclude that these messages are faulty, because the data is read correctly.
>> In addition, when you have more than one btrfs, you cannot tell from the message
>> which fs it is referring to.
>>
>> Is this a raid1 or a dup array?
>>
>> No, a plain vanilla partition on a physical hard disk. The btrfs was made with the command "mkfs.btrfs /dev/sdc1", with no extra arguments.
>
> By default, metadata is duplicated, so it could be that btrfs is
> using the correct copy of the metadata after finding checksum errors
> in the first copy.

Ahhhhhhh, and that makes this make sense:

Andreas, have you checked which file(s) are giving the errors? If not, you can use "find /whatever/mountpoint -xdev -inum 5098 -print" to get the filename. And I would bet that it's small enough that it's being inlined into the metadata block group, and therefore covered under the default "dup" profile of that block group, which is why you're getting the actual file data back.
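The find command suggested above can also be scripted when many inode numbers need to be looked up. The following is a rough Python sketch of the same idea, assuming nothing beyond the standard library; find_paths_by_inode is a made-up helper name, and the st_dev comparison is only an approximation of what -xdev does.

```python
import os


def find_paths_by_inode(mountpoint, inode):
    """Return sorted paths under mountpoint whose inode number equals inode.

    Entries on a different device than mountpoint are skipped and not
    descended into, roughly like "find -xdev -inum N".
    """
    root_stat = os.lstat(mountpoint)
    root_dev = root_stat.st_dev
    matches = []
    if root_stat.st_ino == inode:
        matches.append(mountpoint)
    for dirpath, dirnames, filenames in os.walk(mountpoint):
        keep = []
        for name in dirnames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # entry vanished or is unreadable
            if st.st_dev != root_dev:
                continue  # other filesystem: neither report nor descend
            keep.append(name)
            if st.st_ino == inode:
                matches.append(path)
        dirnames[:] = keep  # prune os.walk in place
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue
            if st.st_dev == root_dev and st.st_ino == inode:
                matches.append(path)
    return sorted(matches)
```

Called as find_paths_by_inode("/whatever/mountpoint", 5098), it returns every name (including hard links) that refers to that inode. The file size can then be checked to see whether the file is small enough to be a candidate for inlining.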
On Mon, Nov 01, 2010 at 12:02:10PM CET, cwillu wrote:
> Ahhhhhhh, and that makes this make sense:
>
> Andreas, have you checked which file(s) are giving the errors? If not, you can use "find /whatever/mountpoint -xdev -inum 5098 -print" to get the filename. And I would bet that it's small enough that it's being inlined into the metadata block group, and therefore covered under the default "dup" profile of that block group, which is why you're getting the actual file data back.

Sorry to disappoint, but the files hit range from big (8 GB) to small.

I took the opportunity to compare the syslogs from both machines I tested on, and the csum ino and off values are completely different in each case.

The filesystem which showed this behaviour has now been destroyed, and in further testing I wasn't able to reproduce the bug.

To summarize:
- a btrfs about 400GB in size showed several csum errors on reading, while the data read was correct. The same thing happened when the filesystem was mounted on another machine (same kernel).
- the errors could be consistently reproduced by reading enough data.
- about 60 - 120 csum errors occurred when reading about 250 GB of data.
- the csum errors hit different inodes each time (and on each run).

As I don't have enough time at the moment to familiarize myself with the btrfs code, I have to let go of this issue at this point.

Thank you for your work.

--
A.B.
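The syslog comparison mentioned above could be automated roughly as follows. This is only a sketch: the regular expression is built from the single message quoted at the top of this thread, so other kernel versions may print a different format, and the names CsumFailure, parse_csum_failures and compare_runs are invented for illustration.

```python
import re
from collections import namedtuple

CsumFailure = namedtuple("CsumFailure", "ino off csum private")

# Pattern taken from the one log line quoted in this thread.
CSUM_RE = re.compile(
    r"btrfs csum failed ino (\d+) off (\d+) csum (\d+) private (\d+)"
)


def parse_csum_failures(log_text):
    """Extract every csum failure found in a block of kernel log text."""
    return [CsumFailure(*map(int, m.groups()))
            for m in CSUM_RE.finditer(log_text)]


def compare_runs(log_a, log_b):
    """Compare the (ino, off) pairs reported in two logs."""
    a = {(f.ino, f.off) for f in parse_csum_failures(log_a)}
    b = {(f.ino, f.off) for f in parse_csum_failures(log_b)}
    return {"only_a": a - b, "only_b": b - a, "both": a & b}
```

If the "both" set comes back empty for two full reads of the same filesystem, that matches the observation here that different inodes and offsets were hit on each run, which would point away from fixed bad blocks on the disk.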