hi all

I moved my btrfs filesystems around using "btrfs replace" and now I have errors (lots of errors):

[63724.419779] BTRFS info (device dm-12): csum failed ino 9340 off 8192 csum 717036259 private 94677163

: root; time btrfs scrub start -Bd /disks/backups
scrub device /dev/dm-11 (id 1) done
        scrub started at Sun Aug 18 15:17:50 2013 and finished after 4487 seconds
        total bytes scrubbed: 576.46GB with 261883 errors
        error details: csum=261883
        corrected errors: 0, uncorrectable errors: 261883, unverified errors: 0

I had two 2 TB disks whose data I needed to swap (/mnt on a WD-Black and /disks/backup on an HD204UI). Both had btrfs filesystems, but /disks/backup was encrypted using LUKS. I had a spare 640 GB WD-Blue disk that I plugged into a SATA dock for this operation.

I "btrfs resize"d /disks/backup to fit in 590 GB, then I "btrfs replace"d /disks/backup to a new LUKS partition on the WD-Blue disk. Then I "btrfs replace"d /mnt to the HD204UI. Then I "btrfs replace"d the backup data to a new LUKS partition on the WD-Black. I then got I/O errors reading /disks/backup.

I'm using Linux kooka 3.10-2-amd64 #1 SMP Debian 3.10.5-1 (2013-08-07) x86_64 GNU/Linux and btrfs-tools 0.19+20130315-5.

rsync: write failed on "/disks/backups/snapshot_rsync/stuart/secret/current/.purple/accounts.xml": Input/output error (5)

Lots of files on /disks/backup have errors. smartctl says PASSED for all the drives.

This is a summary of what I did:

    6  btrfs filesystem resize 580g .
    9  time btrfs balance start -musage=1 -dusage=1 . && time btrfs filesystem resize 580g .
   10  time btrfs filesystem resize 590g .
   12  cryptsetup luksOpen /dev/sdd2 640Gb
   13  time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
   14  time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
   18  cryptsetup remove _dev_sdc2
   19  fdisk /dev/sdc
   32  time btrfs replace start /dev/sdb1 /dev/sdc2 -B /mnt
   34  btrfs filesystem label /dev/dm-12
   36  btrfs filesystem label /disks/backups backups2Tb
   38  btrfs filesystem label /disks/backups
   39  cryptsetup luksFormat /dev/sdb2
   40  cryptsetup luksAddKey /dev/sdb2
   41  cryptsetup open /dev/sdb2 newbackups
   43  time btrfs replace start /dev/dm-12 /dev/dm-11 -B /disks/backups
   44  btrfs filesystem show
   45  cryptsetup status 640Gb
   46  cryptsetup remove 640Gb
   47  btrfs filesystem show
   49  btrfs filesystem resize max /disks/backups/
   54  /etc/local/backups    # errors!
   57  time btrfs scrub start -Bd /disks/backups

Lots of errors in /var/log/syslog:

Aug 18 12:27:51 kooka kernel: [54113.507151] btrfs: dev_replace from /dev/mapper/640Gb (devid 1) to /dev/dm-11) started
Aug 18 12:27:51 kooka kernel: [54113.601334] device label backups2Tb devid 1 transid 39282 /dev/dm-12
Aug 18 12:28:03 kooka kernel: [54125.020038] ata10.00: exception Emask 0x10 SAct 0x3dfe0ff0 SErr 0x780100 action 0x6
Aug 18 12:28:03 kooka kernel: [54125.020043] ata10.00: irq_stat 0x08000000
Aug 18 12:28:03 kooka kernel: [54125.020047] ata10: SError: { UnrecovData 10B8B Dispar BadCRC Handshk }
Aug 18 12:28:03 kooka kernel: [54125.020050] ata10.00: failed command: READ FPDMA QUEUED
Aug 18 12:28:03 kooka kernel: [54125.020056] ata10.00: cmd 60/18:20:c0:18:0b/00:00:00:00:00/40 tag 4 ncq 12288 in
Aug 18 12:28:03 kooka kernel: [54125.020056]          res 40/00:5c:f0:1a:0b/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Aug 18 12:28:03 kooka kernel: [54125.020059] ata10.00: status: { DRDY }
[...]
Aug 18 12:28:03 kooka kernel: [54125.020262] ata10: hard resetting link
Aug 18 12:28:03 kooka kernel: [54125.512032] ata10: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 18 12:28:03 kooka kernel: [54125.523759] ata10.00: configured for UDMA/133
Aug 18 12:28:03 kooka kernel: [54125.536380] ata10: EH complete
Aug 18 12:28:04 kooka kernel: [54125.770176] ata10.00: exception Emask 0x10 SAct 0x7fffffff SErr 0x780100 action 0x6
Aug 18 12:28:04 kooka kernel: [54125.770181] ata10.00: irq_stat 0x08000000
Aug 18 12:28:04 kooka kernel: [54125.770184] ata10: SError: { UnrecovData 10B8B Dispar BadCRC Handshk }
[...]
Aug 18 12:28:17 kooka kernel: [54138.957095] ata10.00: status: { DRDY }
Aug 18 12:28:17 kooka kernel: [54138.957100] ata10: hard resetting link
Aug 18 12:28:17 kooka kernel: [54139.448029] ata10: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Aug 18 12:28:17 kooka kernel: [54139.449972] ata10.00: configured for UDMA/133
Aug 18 12:28:17 kooka kernel: [54139.464065] ata10: EH complete
[...]
Aug 18 12:38:31 kooka kernel: [54753.527070] btrfs: checksum error at logical 52642709504 on dev /dev/dm-12, sector 104931328, root 1281, inode 42152, offset 0, length 4096, links 1 (path: XXXXX)
[...]
Aug 18 12:38:31 kooka kernel: [54753.606566] btrfs: bdev /dev/dm-12 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
[...]
Aug 18 12:38:32 kooka kernel: [54753.679513] btrfs: bdev /dev/dm-12 errs: wr 0, rd 0, flush 0, corrupt 10, gen 0
Aug 18 12:38:36 kooka kernel: [54758.076089] scrub_handle_errored_block: 15173 callbacks suppressed
[...]
Aug 18 12:38:52 kooka kernel: [54774.647414] btrfs: bdev /dev/dm-12 errs: wr 0, rd 0, flush 0, corrupt 65313, gen 0
[...]
Aug 18 15:24:03 kooka kernel: [64685.641464] btrfs: unable to fixup (regular) error at logical 52643758080 on dev /dev/dm-11

It appears that my WD-Blue or its connection is bad, but why didn't "btrfs replace" give me an error?
"btrfs replace" seems to have read bad data without checking the checksum and then written the bad data to the new disk.

ata10 is the WD-Blue:

Aug 17 21:26:19 kooka kernel: [    1.410573] ata10.00: ATA-8: WDC WD6400AAKS-00A7B2, 01.03B01, max UDMA/133

: root; sleep 2m; smartctl -a /dev/sdd
[...]
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   161   158   021    Pre-fail  Always       -       4933
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       327
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   070   070   000    Old_age   Always       -       22077
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       245
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       169
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       327
194 Temperature_Celsius     0x0022   096   090   000    Old_age   Always       -       51
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       12080
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

I guess that /disks/backup is mostly dead and that I should just reformat it. What do you think? Next time I'll watch /var/log/syslog, but I would have preferred that "btrfs replace" stop when getting errors.

thanks, Stuart
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
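[Editorial aside, not part of the original mail: the SMART table above already contains the diagnostic that matters. The attributes that distinguish a bad link from a bad disk can be pulled out of `smartctl -a` text mechanically; this is a minimal sketch where the parsing logic and the interpretation rule are my own illustration, using the values quoted above.]

```python
# Sketch: extract raw SMART attribute values from "smartctl -a" output.
# Sample lines are taken from the smartctl output quoted in the mail above.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       12080
"""

def raw_values(text):
    """Map attribute name -> raw value for lines that look like SMART rows."""
    out = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0].isdigit():
            out[fields[1]] = int(fields[9])
    return out

attrs = raw_values(SAMPLE)
# A large UDMA_CRC_Error_Count with zero reallocated/pending sectors points
# at the link (cable/connector/dock), not at the platters.
if attrs["UDMA_CRC_Error_Count"] > 0 and attrs["Reallocated_Sector_Ct"] == 0:
    print("link-level CRC errors: suspect cable/connector, not the disk")
```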
On Aug 18, 2013, at 1:12 PM, Stuart Pook <slp644161@pook.it> wrote:

> 6 btrfs filesystem resize 580g .

You first shrank a 2TB btrfs file system on a dmcrypt device to 590GB. But then you didn't resize the dm device or the partition?

> 9 time btrfs balance start -musage=1 -dusage=1 . && time btrfs filesystem resize 580g .
> 10 time btrfs filesystem resize 590g .

You followed the resize of the fs, but not the underlying devices, with a balance, then resized it two more times? This is weird, and it also makes the sequence difficult to follow.

> 13 time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
> 14 time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups

Why is this command repeated? What's with the numbering system that skips numbers?

> [...]
> Aug 18 12:28:03 kooka kernel: [54125.020262] ata10: hard resetting link
> Aug 18 12:28:03 kooka kernel: [54125.512032] ata10: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> Aug 18 12:28:03 kooka kernel: [54125.523759] ata10.00: configured for UDMA/133
> Aug 18 12:28:03 kooka kernel: [54125.536380] ata10: EH complete
> Aug 18 12:28:04 kooka kernel: [54125.770176] ata10.00: exception Emask 0x10 SAct 0x7fffffff SErr 0x780100 action 0x6
> Aug 18 12:28:04 kooka kernel: [54125.770181] ata10.00: irq_stat 0x08000000
> Aug 18 12:28:04 kooka kernel: [54125.770184] ata10: SError: { UnrecovData 10B8B Dispar BadCRC Handshk }
> [...]
> Aug 18 12:28:17 kooka kernel: [54138.957095] ata10.00: status: { DRDY }
> Aug 18 12:28:17 kooka kernel: [54138.957100] ata10: hard resetting link
> Aug 18 12:28:17 kooka kernel: [54139.448029] ata10: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> Aug 18 12:28:17 kooka kernel: [54139.449972] ata10.00: configured for UDMA/133
> Aug 18 12:28:17 kooka kernel: [54139.464065] ata10: EH complete

Bad connection, so libata is dropping the link from 3 Gbps to 1.5 Gbps.

> 199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       12080

This confirms that both ends of the cable are sensing communication problems between drive and controller. The cable needs to be replaced; likely it's the connector, not the cable itself.

> I guess that /disks/backup is mostly dead and that I should just reformat it. What do you think?

Well, I think I'd try to simplify this drastically and see if you've got a reproducible bug. The steps you've got I find mostly incoherent, so I can't try to do what you did to see if it's reproducible.

> Next time I'll watch /var/log/syslog but I would have preferred that "btrfs replace" stop when getting errors.

The errors should be self-correcting, but the mere fact they're happening means that some errors could be occurring but aren't detected. If the data is corrupted in transit, but the drive or controller didn't report a problem, then btrfs has no way of knowing it was written incorrectly. There's only so much software can do to overcome blatant hardware problems. But it seems unlikely that such a high percentage of errors would go undetected and result in so many uncorrectable errors, so there may be user error here along with a bug.

Chris Murphy
hi Chris

thanks for your reply. I was unable to save the filesystem. Even after deleting all but 4 GB I still had too many errors, so I just reformatted the device. I'm glad that it was my backups and not my data.

On 18/08/13 23:43, Chris Murphy wrote:
> On Aug 18, 2013, at 1:12 PM, Stuart Pook <slp644161@pook.it> wrote:
>
>> 6 btrfs filesystem resize 580g .
>
> You first shrank a 2TB btrfs file system on dmcrypt device to 590GB.
> But then you didn't resize the dm device or the partition?

no, I had no need to resize the dm device or partition. I just read that when doing a replace the new device must be no smaller than the old device. So I shrank the old filesystem using "btrfs filesystem resize". Once the resize worked I was able to do the replace, but I didn't try to replace before resizing.

This is what btrfs(1) says on Debian: "The targetdev needs to be same size or larger than the srcdev." I may be confused here.

>> 9 time btrfs balance start -musage=1 -dusage=1 . && time btrfs filesystem resize 580g .

I was surprised that the resize to 580 GB didn't work, so I tried a magical rebalance before doing the resize to 580 again. It still didn't work (not enough space) but a resize to 590 GB did.

>> 10 time btrfs filesystem resize 590g .

this worked

> You followed the resize of the fs, but not the underlying devices,
> with a balance, then resized it two more times?

The resize to 580 didn't work. So I did a balance. The resize to 580 still didn't work, so I resized to 590.

> This is weird, but also makes the sequence difficult to follow.

>> 13 time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
>> 14 time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
>
> Why is this command repeated? What's with the numbering system that
> skips numbers?

The command is repeated because I cancelled it by mistake by setting the filesystem to readonly. I'm not sure if I restarted it by rerunning the replace or just by remounting the filesystem read-write in another window.
I'll put all of the commands at the end of this mail.

>> Aug 18 12:28:17 kooka kernel: [54139.448029] ata10: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
>
> Bad connection so libata is dropping the link from 3 Gbps to 1.5Gbps.
>
>> 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 12080
>
> This confirms that both ends of the cable are sensing communication
> problems between drive and controller. The cable needs to be
> replaced, likely it's the connector not the cable itself.

I think that I should stop using my SATA dock with the SATA ports on my motherboard, which are probably not designed to be hot-plugged.

>> I guess that /disks/backup is mostly dead and that I should just
>> reformat it. What do you think?
>
> Well I think I'd try to simplify this drastically and see if you've
> got a reproducing bug.

I ran a badblocks scan on the raw device (not the luks device) and didn't get any errors.

> The steps you've got I find mostly incoherent, so I can't try to do
> what you did to see if it's reproducible.

yes, this was the first time I've tried this. And just to make this more difficult, some commands were typed in a different window.

>> Next time I'll watch /var/log/syslog but I would have preferred
>> that "btrfs replace" stop when getting errors.
>
> The errors should be self correcting, but the mere fact they're
> happening means that some errors could be occurring but aren't
> detected. If the data is corrupting in-transit, but the drive or
> controller didn't report a problem, then btrfs has no way of knowing
> it was written incorrectly.

The data was written to the WD-Blue (640 GB) disk and then copied off it. The only errors I saw concerned the WD-Blue. If the errors were data corruption on writing or reading the WD-Blue then I would have thought that the checksums would have told me that there was something wrong. btrfs didn't give me an I/O error until I started to read the files once the data was on a final disk.
Does "btrfs replace" check the checksums as it reads the data from the disk that is being replaced?

Just to be clear, this is the series of btrfs replace operations I did:

backups : HD204UI  -> WD-Blue
/mnt    : WD-Black -> HD204UI
backups : WD-Blue  -> WD-Black

I guess that my backups were corrupted as they were written to or read from the WD-Blue. Wouldn't the checksums have detected this problem before the data was written to the WD-Black?

> There's only so much software can do to overcome blatant hardware problems.

I was hoping to be informed of them.

> But, it seems unlikely such a high percent of errors would go
> undetected to result in so many uncorrectable errors, so there may be
> user error here along with a bug.

I'm not sure how I could have done it better. Does "btrfs replace" check that the data is correctly written to the new disk before it is removed from the old disk? Should I have used the 2 disks to make a RAID-1 and then done a scrub before removing the old disk?

Here is the complete list of commands I ran in the main terminal:

    1  cd /disks/backups/
    2  btrfs filesystem df
    3  btrfs filesystem df ,
    4*
    5  btrfs filesystem df .
    6  btrfs filesystem resize 580g .
    7  date
    8  btrfs filesystem df .
    9  time btrfs balance start -musage=1 -dusage=1 . && time btrfs filesystem resize 580g .
   10  time btrfs filesystem resize 590g .
   11  btrfs filesystem show
   12  cryptsetup luksOpen /dev/sdd2 640Gb
   13  time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
   14  time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
   15  cd /
   16  btrfs filesystem show
   17  btrfs filesystem show
   18  cryptsetup remove _dev_sdc2
   19  fdisk /dev/sdc
   20  fdisk /dev/sdc
   21  fdisk -c /dev/sdc
   22  fdisk -c=dos /dev/sdc
   23  fdisk /dev/sdc
   24  fdisk -c=dos /dev/sdc
   25  l /mnt
   26  mount /dev/sdb1 /mnt
   27  l /mnt
   28  btrfs subv list /mnt
   29  btrfs filesystem show
   30  #time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
   31  fdisk -l /dev/sdc
   32  time btrfs replace start /dev/sdb1 /dev/sdc2 -B /mnt
   33  btrfs filesystem show
   34  btrfs filesystem label /dev/dm-12
   35  btrfs filesystem label /disks/backups
   36  btrfs filesystem label /disks/backups backups2Tb
   37  btrfs filesystem show
   38  btrfs filesystem label /disks/backups
   39  cryptsetup luksFormat /dev/sdb2
   40  cryptsetup luksAddKey /dev/sdb2
   41  cryptsetup open /dev/sdb2 newbackups
   42  l /dev/mapper/newbackups
   43  time btrfs replace start /dev/dm-12 /dev/dm-11 -B /disks/backups
   44  btrfs filesystem show
   45  cryptsetup status 640Gb
   46  cryptsetup remove 640Gb
   47  btrfs filesystem show
   48  btrfs filesystem df /disks/backups/
   49  btrfs filesystem resize max /disks/backups/
   50  btrfs filesystem df /disks/backups/
   51  btrfs filesystem show
   52  vi /etc/cron.daily/storebackup
   53  vi /etc/cron.daily/stuart
   54  /etc/local/backups
   55  mount
   56  mount -o remount,rw /disks/backups/
   57  time btrfs scrub start -Bd /disks/backups
   58  smartctl -a /dev/sdb
   59  smartctl -a /dev/sdc
   60  smartctl -a /dev/sdd
   61  smartctl -t short /dev/sdd
   62  sleep 2m; smartctl -a /dev/sdd
   63  history > /tmp/root.commands

Which disk is which?
WD-Black  ata-WDC_WD2002FAEX-007BA0_WD-WCAY00589823 -> ../../sdb
HD204UI   ata-ST2000DL004_HD204UI_S2H7J90C549571    -> ../../sdc
WD-Blue   ata-WDC_WD6400AAKS-00A7B2_WD-WMASY2546840 -> ../../sdd

please let me know if I can be any clearer, thanks
Stuart
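[Editorial aside, not part of the original mail: Stuart's question about where the checksums could have caught this turns on *when* a checksum is computed. Below is a toy model of per-block checksumming, entirely my own illustration; btrfs actually uses crc32c over 4 KiB blocks, and `zlib.crc32` merely stands in for it here. The point it demonstrates: if the bus corrupts a block *after* the checksum was computed, the bad bytes land on disk next to the original, correct checksum, so nothing complains during the write, and the mismatch only surfaces on the next read or scrub.]

```python
import zlib

def write_block(data):
    """Filesystem computes the checksum once, at write time."""
    return data, zlib.crc32(data)

def bus_corrupts(data):
    """A flaky cable flips bytes in transit; the drive reports no error."""
    return b"X" + data[1:]

def read_block(stored_data, stored_csum):
    """On read, the checksum is recomputed and compared."""
    if zlib.crc32(stored_data) != stored_csum:
        raise IOError("csum failed")   # the kind of error Stuart saw in dmesg
    return stored_data

data, csum = write_block(b"backup payload")
on_disk = bus_corrupts(data)           # silent at write time
try:
    read_block(on_disk, csum)
except IOError as e:
    print(e)                           # prints: csum failed
```

This is why a scrub (or an ordinary read) on the target disk after a migration, but before wiping the source, would have caught the corruption while a good copy still existed.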
On Aug 18, 2013, at 4:35 PM, Stuart Pook <slp644161@pook.it> wrote:

>> You first shrank a 2TB btrfs file system on dmcrypt device to 590GB.
>> But then you didn't resize the dm device or the partition?
>
> no, I had no need to resize the dm device or partition.

OK, well, it's unusual to resize a file system and then not resize the containing block device. I don't know if Btrfs cares about this or not.

> I ran a badblocks scan on the raw device (not the luks device) and didn't get any errors.

badblocks depends on the drive determining a persistent read failure with a sector, and timing out before the SCSI block layer times out. Since the Linux SCSI driver timeout is 30 seconds, and most consumer drives' error recovery timeout is 120 seconds, the bus is reset before the drive has a chance to report a bad sector. So I think you're better off using smartctl -t long tests to find bad sectors on a disk.

Further, a smartctl -x may show SATA Phy Event Counters, which should have 0s or very low numbers; if not, that's also an indicator of hardware problems.

> The data was written to the WD-Blue (640Gb) disk and then copied off it. The only errors I saw concerned the WD-Blue. If the errors were data corruption on writing or reading the WD-Blue then I would have thought that the checksums would have told me that there was something wrong. btrfs didn't give me an IO error until I started to read the files when the data was on a final disk.

How does Btrfs know there's been a failure during write if the hardware hasn't detected it? Btrfs doesn't re-read everything it just wrote to the drive to confirm it was written correctly. It assumes it was, unless there's a hardware error. It wouldn't know otherwise until a Btrfs scrub is done on the written drive.

What I can't tell you is how Btrfs behaves, and whether it behaves correctly, when writing data to hardware having transient errors.
I don't know what it does when the hardware reports the error, but presumably if the hardware doesn't report an error Btrfs can't do anything about that except on the next read or scrub.

> Just to be clear. This is the series of btrfs replace I did:
>
> backups : HD204UI -> WD-Blue
> /mnt : WD-Black -> HD204UI
> backups : WD-Blue -> WD-Black
>
> I guess that my backups were corrupted as they were written to or read from the WD-Blue. Wouldn't the checksums have detected this problem before the data was written to the WD-Black?

When you first encountered the btrfs-reported csum errors, what operation was occurring?

>> There's only so much software can do to overcome blatant hardware problems.
>
> I was hoping to be informed of them

Well, you were informed of them in dmesg, by virtue of the controller having problems talking to a SATA rev 2 drive at rev 2 speed, with a negotiated fallback to rev 1 speed.

>> But, it seems unlikely such a high percent of errors would go
>> undetected to result in so many uncorrectable errors, so there may be
>> user error here along with a bug.
>
> I'm not sure how I could have done it better. Does "btrfs replace" check that the data is correctly written to the new disk before it is removed from the old disk?

That's a valid question. Hopefully someone more knowledgeable can answer what the expected error handling behavior is supposed to be.

> Should I have used the 2 disks to make a RAID-1 and then done a scrub before removing the old disk?

Good question. Possibly it's best practice to use btrfs replace with an existing raid1, rather than as a way to move a single copy of data from one disk to another. I think you'd have been better off using btrfs send and receive for this operation.

A full dmesg might also be enlightening, even if it is really long. Just put it in its own email without comment. I think pasting it out of a forum is less preferred.
Chris Murphy
This is just a comment from someone following all of this from the sidelines: I see so much going on in this procedure that it scares me. Once a single operation reaches a certain degree of complexity I get really worried, because all it takes is a single misstep and my data is gone. And that happens so easily as complexity increases and confusion sets in.

In this particular situation, my solution would probably have been to create a new btrfs partition from scratch on the new drive, simply mount the source partition/drive read-only, and rsync the data across to the target rather than trying the btrfs replace operation. That way I could have verified the target drive before erasing the source drive, and I would not have had to worry about partition sizes, encryption, etc.

That said, I am certainly thankful that this was backup data and not working data. But I think it serves as a cautionary tale about not assuming that something should be done just because it theoretically can be done. I am not really familiar with btrfs replace, but I would imagine that it is intended more for use in a raid situation than for simply moving data from one drive to another.

On 08/18/2013 05:42 PM, Chris Murphy wrote:
> On Aug 18, 2013, at 4:35 PM, Stuart Pook <slp644161@pook.it> wrote:
> [...]
> Good question. Possibly it's best practices to use btrfs replace with an existing raid1, rather than using it as a way to move a single copy of data from one disk to another. I think you'd have been better off using btrfs send and receive for this operation.
>
> A full dmesg might also be enlightening even if it is really long. Just put it in its own email without comment. I think pasting it out of forum is less preferred.
>
> Chris Murphy
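[Editorial aside, not part of the original mail: the copy-then-verify workflow suggested above can be sketched in a few lines of shell. This is a toy illustration of my own on throwaway temp directories; `cp -a` stands in for `rsync -aHAX src/ dst/`, and a real migration would of course involve mounted filesystems rather than mktemp directories.]

```shell
#!/bin/sh
# Sketch of "copy to the new location, verify, only then erase the source".
set -e

# Verify a copy byte-for-byte against its source.
verify_copy() {
    diff -r "$1" "$2" >/dev/null
}

src=$(mktemp -d); dst=$(mktemp -d)

# Stand-in source data (a real run would be the mounted backup filesystem).
mkdir -p "$src/subdir"
printf 'important backup data\n' > "$src/file1"
printf 'more data\n' > "$src/subdir/file2"

# 1. Copy (rsync -aHAX would preserve hard links, ACLs and xattrs as well;
#    cp -a is the portable stand-in used here).
cp -a "$src/." "$dst/"

# 2. Only wipe the source once the copy checks out.
if verify_copy "$src" "$dst"; then
    echo "copies match: safe to wipe the source"
fi
rm -rf "$src" "$dst"
```

The key property, unlike a one-shot replace onto flaky hardware, is that the source stays untouched until the target has been independently read back and compared.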
On Mon, 19 Aug 2013 00:35:54 +0200, Stuart Pook wrote:> hi Chris > > thanks for your reply. I was unable to save the filesystem. Even after > deleting all but 4Gb I still had too many errors so I just reformated > the device. I''m glad that it was my backups and not my data. > > On 18/08/13 23:43, Chris Murphy wrote: >> On Aug 18, 2013, at 1:12 PM, Stuart Pook <slp644161@pook.it> wrote: >> >>> 6 btrfs filesystem resize 580g . >> >> You first shrank a 2TB btrfs file system on dmcrypt device to 590GB. >> But then you didn''t resize the dm device or the partition? > > no, I had no need to resize the dm device or partition. I just read > that when doing a replace the new device must be no smaller than the old > device. So I shrunk the old device using "btrfs filesystem resize". > Once the resize worked I was able to do the replace but I didn''t try to > replace before resizing. > > This is what btrfs(1) says on Debian: "The targetdev needs to be same > size or larger than the srcdev." I may be confused here. > >>> 9 time btrfs balance start -musage=1 -dusage=1 . && time btrfs >>> filesystem resize 580g . > > I was surprised that the resize to 580Gb didn''t work so I tried a > magical rebalance before doing the resize to 580 again. It still didn''t > work (not enough space) but a resize to 590 Gb did. > >>> 10 time btrfs filesystem resize 590g . > > this worked > >> You followed the resize of the fs, but not the underlying devices, >> with a balance, then resized it two more times? > > The resize to 580 didn''t work. So I did a balance. The resize to 580 > still didn''t work so I resized to 590. > >> This is weird, but also makes the sequence difficult to follow. > >>> 13 time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups >>> 14 time btrfs replace start /dev/dm-11 /dev/dm-12-B /disks/backups > >> Why is this command repeated? What''s with the numbering system that >> skips numbers? 
> > The command is repeated because I cancelled it my mistake by setting the > filesystem to readonly. I''m not sure if I restarted it by rerunning the > replace or just by remounting the filesystem readwrite in another window. > > I''ll put all of the commands at the end of this list. > >>> Aug 18 12:28:17 kooka kernel: [54139.448029] ata10: SATA link up1.5 >>> Gbps (SStatus 113 SControl 310) >> Bad connection so libata is dropping the link from 3 Gbps to1.5Gbps. >>> 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age >>> Always - 12080 >> >> This confirms that both ends of the cable are sensing communication >> problems between drive and controller. The cable needs to be >> replaced, likely it''s the connector not the cable itself. > > I think that I should stop using my SATA dock with the SATA ports on my > motherboard which are probably not designed to be hot plugged. > >>> I guess that /disks/backup is mostly dead and that I should just >>> reformat it. What do you think? >> >> Well I think I''d try to simplify this drastically and see if you''ve >> got a reproducing bug. > > I ran a badblocks scan on the raw device (not the luks device) and > didn''t get any errors. > >> The steps you''ve got I find mostly incoherent, so I can''t try to do >> what you did to see if it''s reproducible. > > yes, this was the first time I''ve tried this. And just to make this > more difficult some commands were typed in a different window. > >>> Next time I''ll watch /var/log/syslog but I would have preferred >>> that "btrfs replace" stop when getting errors. >> >> The errors should be self correcting, but the mere fact they''re >> happening means that some errors could be occurring but aren''t >> detected. If the data is corrupting in-transit, but the drive or >> controller didn''t report a problem, then btrfs has no way of knowing >> it was written incorrectly. > > The data was written to the WD-Blue (640Gb) disk and then copied off > it. 
The only errors I saw concerned the WB-Blue. If the errors were > data corruption on writing or reading the WD-Blue then I would have > thought that the checksums would have told me that there was something > wrong. btrfs didn''t give me an IO error until I started to read the > files when the data was on a final disk. > > Does "btrfs replace" check the ckecksums as it reads the data from the > disk that is being replaced? > > Just to be clear. This is the series of btrfs replace I did: > > backups : HD204UI -> WD-Blue > /mnt : WD-Black -> HD204UI > backups : WD-Blue -> WD-Black > > I guess that my backups were corrupted was they were written to or read > from the WD-Blue. Wouldn''t the checksums have detected this problem > before the data was written to the WD-Black? > >> There''s only so much software can do to overcome blatant hardware >> problems. > > I was hoping to be informed of them > >> But, it seems unlikely such a high percent of errors would go >> undetected to result in so many uncorrectable errors, so there may be >> user error here along with a bug. > > I''m not sure how I could have done it better. Does "btrfs replace" check > that the data is correctly written to the new disk before it is removed > from the old disk? Should I have used the 2 disks to make a RAID-1 and > then done a scrub before removing the old disk? > > Here is the complete list of commands I made in the main terminal > > 1 cd /disks/backups/ > 2 btrfs filesystem df > 3 btrfs filesystem df , > 4* > 5 btrfs filesystem df . > 6 btrfs filesystem resize 580g . > 7 date > 8 btrfs filesystem df . > 9 time btrfs balance start -musage=1 -dusage=1 . && time btrfs > filesystem resize 580g . > 10 time btrfs filesystem resize 590g . 
> 11 btrfs filesystem show
> 12 cryptsetup luksOpen /dev/sdd2 640Gb
> 13 time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
> 14 time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
> 15 cd /
> 16 btrfs filesystem show
> 17 btrfs filesystem show
> 18 cryptsetup remove _dev_sdc2
> 19 fdisk /dev/sdc
> 20 fdisk /dev/sdc
> 21 fdisk -c /dev/sdc
> 22 fdisk -c=dos /dev/sdc
> 23 fdisk /dev/sdc
> 24 fdisk -c=dos /dev/sdc
> 25 l /mnt
> 26 mount /dev/sdb1 /mnt
> 27 l /mnt
> 28 btrfs subv list /mnt
> 29 btrfs filesystem show
> 30 #time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
> 31 fdisk -l /dev/sdc
> 32 time btrfs replace start /dev/sdb1 /dev/sdc2 -B /mnt
> 33 btrfs filesystem show
> 34 btrfs filesystem label /dev/dm-12
> 35 btrfs filesystem label /disks/backups
> 36 btrfs filesystem label /disks/backups backups2Tb
> 37 btrfs filesystem show
> 38 btrfs filesystem label /disks/backups
> 39 cryptsetup luksFormat /dev/sdb2
> 40 cryptsetup luksAddKey /dev/sdb2
> 41 cryptsetup open /dev/sdb2 newbackups
> 42 l /dev/mapper/newbackups
> 43 time btrfs replace start /dev/dm-12 /dev/dm-11 -B /disks/backups
> 44 btrfs filesystem show
> 45 cryptsetup status 640Gb
> 46 cryptsetup remove 640Gb
> 47 btrfs filesystem show
> 48 btrfs filesystem df /disks/backups/
> 49 btrfs filesystem resize max /disks/backups/
> 50 btrfs filesystem df /disks/backups/
> 51 btrfs filesystem show
> 52 vi /etc/cron.daily/storebackup
> 53 vi /etc/cron.daily/stuart
> 54 /etc/local/backups
> 55 mount
> 56 mount -o remount,rw /disks/backups/
> 57 time btrfs scrub start -Bd /disks/backups
> 58 smartctl -a /dev/sdb
> 59 smartctl -a /dev/sdc
> 60 smartctl -a /dev/sdd
> 61 smartctl -t short /dev/sdd
> 62 sleep 2m; smartctl -a /dev/sdd
> 63 history > /tmp/root.commands
>
> Which disk is which?
> WD-Black ata-WDC_WD2002FAEX-007BA0_WD-WCAY00589823 -> ../../sdb
> HD204UI ata-ST2000DL004_HD204UI_S2H7J90C549571 -> ../../sdc
> WD-Blue ata-WDC_WD6400AAKS-00A7B2_WD-WMASY2546840 -> ../../sdd
>
> Please let me know if I can be any clearer, thanks
> Stuart

Do you still have the kernel log files around that had been written while you ran the replace procedure? /var/log/messages*. Could you share these files (via personal mail if the files are too huge).

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
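The by-id to sdX mapping above can be regenerated at any time by resolving the stable symlinks under /dev/disk/by-id. A minimal sketch; the `resolve_ids` helper name is illustrative (not standard tooling), and the directory is parameterised only so the logic can be exercised on any directory of symlinks:

```shell
# List each by-id symlink and the device node it currently resolves to.
# In real use the directory is /dev/disk/by-id.
resolve_ids() {
    dir="${1:-/dev/disk/by-id}"
    for link in "$dir"/*; do
        [ -L "$link" ] || continue                  # skip non-symlinks
        printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
    done
}
```

Running something like this after each reboot avoids guessing which /dev/sdX a cable or dock shuffle has produced.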
Stefan Behrens <sbehrens@giantdisaster.de> wrote:
> Do you still have the kernel log files around that had been written while you ran the replace procedure? /var/log/messages*. Could you share these files (via personal mail if the files are too huge).

Yes I still have them. I'm away at the moment so I'll not be able to send them to you before Saturday. What is the maximum size that I should send to the mailing list?

Stuart
Chris Murphy <lists@colorremedies.com> wrote:
>> I ran a badblocks scan on the raw device (not the luks device) and didn't get any errors.
>
> badblocks will depend on the drive determining a persistent read failure with a sector, and timing out before the SCSI block layer times out. Since the linux SCSI driver time out is 30 seconds, and most consumer drive ECT is 120 seconds, the bus is reset before the drive has a chance to report a bad sector. So I think you're better off using smartctl -t long tests to find bad sectors on a disk.

I have no reason to think that I have bad sectors on the disk. I just wanted to see if badblocks would lead to errors due to connection or cable problems. It didn't.

> How does Btrfs know there's been a failure during write if the hardware hasn't detected it? Btrfs doesn't re-read everything it just wrote to the drive to confirm it was written correctly. It assumes it was unless there's a hardware error. It wouldn't know this until a Btrfs scrub is done on the written drive.

I was hoping that btrfs would have checked that the data was correctly copied to the new disk before it removed it from the original. This is what would have saved my filesystem.

> What I can't tell you is how Btrfs behaves, and if it behaves correctly, when writing data to hardware having transient errors. I don't know what it does when the hardware reports the error, but presumably if the hardware doesn't report an error Btrfs can't do anything about that except on the next read or scrub.

But btrfs did read the data from the WD-Blue because it copied it to the WD-Black. btrfs copied rubbish onto the WD-Black, so if it had checked the checksums as it read from the WD-Blue it would have seen that things were bad. This would already have been too late for my filesystem but it would have been good to know then rather than just get errors when I tried to read the files on the filesystem.

>> Just to be clear.
>> This is the series of btrfs replace I did:
>>
>> backups : HD204UI -> WD-Blue
>> /mnt : WD-Black -> HD204UI
>> backups : WD-Blue -> WD-Black
>>
>> I guess that my backups were corrupted as they were written to or read from the WD-Blue. Wouldn't the checksums have detected this problem before the data was written to the WD-Black?
>
> When you first encountered the btrfs reported csum errors, what operation was occurring?

When I started to read and write my backups after they had been copied to the WD-Black.

>>> There's only so much software can do to overcome blatant hardware problems.
>>
>> I was hoping to be informed of them
>
> Well you were informed of them in dmesg, by virtue of the controller having problems talking to a SATA rev 2 drive at rev 2 speed, with a negotiated fallback to rev 1 speed.

I wanted btrfs to reread the new disk before removing the old disk from the filesystem. I also do not understand why the errors, which were going into dmesg, were not received by btrfs so that it could abort the replace.

>> Does "btrfs replace" check that the data is correctly written to the new disk before it is removed from the old disk?
>
> That's a valid question. Hopefully someone more knowledgeable can answer what the expected error handling behavior is supposed to be.

It would be good if it did!

>> Should I have used the 2 disks to make a RAID-1 and then done a scrub before removing the old disk?
>
> Good question. Possibly it's best practices to use btrfs replace with an existing raid1, rather than using it as a way to move a single copy of data from one disk to another. I think you'd have been better off using btrfs send and receive for this operation.

But using send and receive would have led to downtime.

> A full dmesg might also be enlightening even if it is really long. Just put it in its own email without comment.

As soon as I get back home ...
Stuart Pook, http://www.pook.it
On Tue, 20 Aug 2013 15:52:27 +0200, slp644161 wrote:
> Stefan Behrens <sbehrens@giantdisaster.de> wrote:
>
>> Do you still have the kernel log files around that had been written while you ran the replace procedure? /var/log/messages*. Could you share these files (via personal mail if the files are too huge).
>
> Yes I still have them. I'm away at the moment so I'll not be able to send them to you before Saturday. What is the maximum size that I should send to the mailing list?
>
> Stuart

The list server blocks mails larger than 100k characters according to <http://vger.kernel.org/majordomo-info.html>.
On Tue, 20 Aug 2013 16:46:59 +0200, slp644161 wrote:
> Chris Murphy <lists@colorremedies.com> wrote:
>
>>> I ran a badblocks scan on the raw device (not the luks device) and didn't get any errors.
>>
>> badblocks will depend on the drive determining a persistent read failure with a sector, and timing out before the SCSI block layer times out. Since the linux SCSI driver time out is 30 seconds, and most consumer drive ECT is 120 seconds, the bus is reset before the drive has a chance to report a bad sector. So I think you're better off using smartctl -t long tests to find bad sectors on a disk.
>
> I have no reason to think that I have bad sectors on the disk. I just wanted to see if badblocks would lead to errors due to connection or cable problems. It didn't.
>
>> How does Btrfs know there's been a failure during write if the hardware hasn't detected it? Btrfs doesn't re-read everything it just wrote to the drive to confirm it was written correctly. It assumes it was unless there's a hardware error. It wouldn't know this until a Btrfs scrub is done on the written drive.
>
> I was hoping that btrfs would have checked that the data was correctly copied to the new disk before it removed it from the original. This is what would have saved my filesystem.
>
>> What I can't tell you is how Btrfs behaves, and if it behaves correctly, when writing data to hardware having transient errors. I don't know what it does when the hardware reports the error, but presumably if the hardware doesn't report an error Btrfs can't do anything about that except on the next read or scrub.
>
> But btrfs did read the data from the WD-Blue because it copied it to the WD-Black. btrfs copied rubbish onto the WD-Black, so if it had checked the checksums as it read from the WD-Blue it would have seen that things were bad.
> This would already have been too late for my filesystem but it would have been good to know then rather than just get errors when I tried to read the files on the filesystem.
>
>>> Just to be clear. This is the series of btrfs replace I did:
>>>
>>> backups : HD204UI -> WD-Blue
>>> /mnt : WD-Black -> HD204UI
>>> backups : WD-Blue -> WD-Black
>>>
>>> I guess that my backups were corrupted as they were written to or read from the WD-Blue. Wouldn't the checksums have detected this problem before the data was written to the WD-Black?
>>
>> When you first encountered the btrfs reported csum errors, what operation was occurring?
>
> When I started to read and write my backups after they had been copied to the WD-Black.
>
>>>> There's only so much software can do to overcome blatant hardware problems.
>>>
>>> I was hoping to be informed of them
>>
>> Well you were informed of them in dmesg, by virtue of the controller having problems talking to a SATA rev 2 drive at rev 2 speed, with a negotiated fallback to rev 1 speed.
>
> I wanted btrfs to reread the new disk before removing the old disk from the filesystem. I also do not understand why the errors, which were going into dmesg, were not received by btrfs so that it could abort the replace.
>
>>> Does "btrfs replace" check that the data is correctly written to the new disk before it is removed from the old disk?
>>
>> That's a valid question. Hopefully someone more knowledgeable can answer what the expected error handling behavior is supposed to be.
>
> It would be good if it did!

If write errors are encountered (EIO in the write completion callback), the dev replace procedure is aborted. 'btrfs replace status' can be used to see the write errors and the fact that the replace operation was canceled. If the 'btrfs replace start' task was invoked with the '-B' option (do not background), an error message is printed in this case and the exit value is set.
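Since '-B' sets the exit value on failure, a caller can gate any follow-up step (including the decision to trust the new disk) on that status. A hedged sketch: `run_checked` is a hypothetical wrapper, not part of btrfs-progs, and the device paths in the comment are the ones from this thread:

```shell
# Run a command in the foreground and only proceed if it reports success.
# "$@" is the full command line, which keeps the wrapper testable with
# any command.
run_checked() {
    if "$@"; then
        echo "replace finished cleanly"
    else
        echo "replace FAILED; see 'btrfs replace status <mount>'" >&2
        return 1
    fi
}

# Intended real-world use (illustrative devices from this thread):
# run_checked btrfs replace start -B /dev/dm-12 /dev/dm-11 /disks/backups \
#     && btrfs scrub start -Bd /disks/backups
```

Chaining the scrub behind `&&` means the verification pass simply does not run if the replace itself already reported failure.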
I spent my whole day yesterday checking the replace procedure for errors by doing things similar to what you did, including inserting write errors, using LUKS and everything. The only issue that I was able to find was that it is not checked whether the filesystem is in read-only mode when the operation starts. If it is, the operation fails at the end without giving the user an indication. If you then think that the copy operation succeeded and scratch or reuse the source drive, all data is gone. But I have not found anything else. Therefore I'm looking forward to reading through your kernel log files.

>>> Should I have used the 2 disks to make a RAID-1 and then done a scrub before removing the old disk?
>>
>> Good question. Possibly it's best practices to use btrfs replace with an existing raid1, rather than using it as a way to move a single copy of data from one disk to another. I think you'd have been better off using btrfs send and receive for this operation.
>
> But using send and receive would have led to downtime.
>
>> A full dmesg might also be enlightening even if it is really long. Just put it in its own email without comment.
>
> As soon as I get back home ...
>
> Stuart Pook, http://www.pook.it
On 20/08/13 17:16, Stefan Behrens wrote:
> I spent my whole day yesterday checking the replace procedure for errors by doing things similar to what you did, including inserting write errors, using LUKS and everything.

thanks

> The only issue that I was able to find was that it is not checked whether the filesystem is in read-only mode when the operation starts.

Note that I started the replace and then set the filesystem to readonly. The replace then stopped. I set the filesystem back to readwrite and restarted the replace.

> Therefore I'm looking forward to reading through your kernel log files.

I emailed them to Stefan Behrens & Chris Murphy. Please let me know if you did not get them (presumably because they are too big).

Stuart
On Aug 25, 2013, at 4:10 PM, Stuart Pook <slp644161@pook.it> wrote:
>
> I emailed them to Stefan Behrens & Chris Murphy. Please let me know if you did not get them (presumably because they are too big).

Observations:

1. The problems started before the start of the provided log.

2. smartd reports sdb at 100˚C. The spec sheet for WD2002FAEX is 60˚C. It's possible the raw value isn't actually ˚C so you'll need to look at smartctl -a columns VALUE, WORST and THRESH to determine if it is or has hit the threshold. Seems possible the drives are being cooked.

sdc is ST2000DL004, for which google finds this:
http://forums.seagate.com/t5/Desktop-HDD-Desktop-SSHD/BEWARE-the-so-called-Samsung-HD204UI/m-p/166856

It also looks to be running hot.

3. The first ata error seems to be 8b/10b encoding related; it could be a connector problem, a port problem, a drive problem, or a firmware bug. The Emask 0x10 implicates NCQ according to libata.h:
AC_ERR_NCQ = (1 << 10), /* marker for offending NCQ qc */

4. Hundreds of these:
ata10.00: failed command: READ FPDMA QUEUED

This implies it may be an incompatibility between this drive and the controller; possibly disabling NCQ on the drive will fix the problem (set queue depth to 1):
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/550559
https://ata.wiki.kernel.org/index.php/Libata_FAQ

echo 1 > /sys/block/sdX/device/queue_depth

I can't tell you what /dev/ node applies to ata10.00 because the log is incomplete, so I don't know which drive is giving you a hard time with NCQ. Thing is, if you disable NCQ on just one drive, it'll slow it down compared to the others. I don't know how tolerant btrfs is when devices have different speeds.

5. Tens of thousands of checksum errors on both dm-11 and dm-12.

6.
Many instances of:
btrfs: unable to fixup (regular) error at logical 53281xxxxxx on dev /dev/dm-11

So kernel messages have been screaming of bus related problems for some time, they were ignored, btrfs did what it could and reported hundreds to thousands of errors in dmesg, but the user space tools didn't warn the user that operations had effectively failed.

Chris Murphy
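The ataN-to-/dev/sdX question above can usually be answered from sysfs: on common kernels, /sys/block/sdX/device resolves to a path that contains the ata port name as a component. A sketch under that assumption; the `ata_to_dev` helper is illustrative, and the sysfs root is parameterised only so the matching logic can be tested against a fake tree:

```shell
# Print the /dev node whose sysfs device path contains the given ata port.
# Usage: ata_to_dev ata10 [sysroot]    (sysroot defaults to /sys)
ata_to_dev() {
    port="$1"; root="${2:-/sys}"
    for dev in "$root"/block/sd*; do
        link=$(readlink -f "$dev/device") || continue
        case "$link/" in
            # Match the port as a whole path component so "ata1"
            # does not match "ata10".
            */"$port"/*) echo "/dev/${dev##*/}"; return 0 ;;
        esac
    done
    return 1
}
```

With the drive identified, the `echo 1 > /sys/block/sdX/device/queue_depth` workaround mentioned above can be targeted at the right device only.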
On Aug 25, 2013, at 8:07 PM, Chris Murphy <lists@colorremedies.com> wrote:
> ata10.00: failed command: READ FPDMA QUEUED
>
> Implies it may be an incompatibility between this drive and the controller, possibly disabling NCQ on the drive will fix the problem (set queue depth to 1)

http://serverfault.com/questions/295740/ubuntu-11-04-server-crashing-failed-command-read-fpdma-queued

One person found that replacing cables solved the problem, another a BIOS update. It could also be port specific.

I'm uncertain if there's enough information to know if the data was written incorrectly, or if it's just being read incorrectly and btrfs is complaining. It might be worth putting the drive on another port with a new cable and seeing if the volume will mount ro and, if so, whether you can access your data.

Chris Murphy
On Sun, 25 Aug 2013 20:07:32 -0600, Chris Murphy wrote:
> On Aug 25, 2013, at 4:10 PM, Stuart Pook <slp644161@pook.it> wrote:
>>
>> I emailed them to Stefan Behrens & Chris Murphy. Please let me know if you did not get them (presumably because they are too big).
>
> Observations:
>
> 1. The problems started before the start of the provided log.
>
> 2. smartd reports sdb at 100˚C. The spec sheet for WD2002FAEX is 60˚C. It's possible the raw value isn't actually ˚C so you'll need to look at smartctl -a columns VALUE, WORST and THRESH to determine if it is or has hit the threshold. Seems possible the drives are being cooked.
>
> sdc is ST2000DL004, for which google finds this:
> http://forums.seagate.com/t5/Desktop-HDD-Desktop-SSHD/BEWARE-the-so-called-Samsung-HD204UI/m-p/166856
>
> It also looks to be running hot.
>
> 3. The first ata error seems to be 8b/10b encoding related; it could be a connector problem, a port problem, a drive problem, or a firmware bug. The Emask 0x10 implicates NCQ according to libata.h:
> AC_ERR_NCQ = (1 << 10), /* marker for offending NCQ qc */
>
> 4. Hundreds of these:
> ata10.00: failed command: READ FPDMA QUEUED
>
> Implies it may be an incompatibility between this drive and the controller; possibly disabling NCQ on the drive will fix the problem (set queue depth to 1):
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/550559
> https://ata.wiki.kernel.org/index.php/Libata_FAQ
>
> echo 1 > /sys/block/sdX/device/queue_depth
>
> I can't tell you what /dev/ node applies to ata10.00 because the log is incomplete, so I don't know which drive is giving you a hard time with NCQ. Thing is, if you disable NCQ on just one drive, it'll slow it down compared to the others. I don't know how tolerant btrfs is when devices have different speeds.
>
> 5. Tens of thousands of checksum errors on both dm-11 and dm-12.
>
> 6.
> Many instances of:
> btrfs: unable to fixup (regular) error at logical 53281xxxxxx on dev /dev/dm-11
>
> So kernel messages have been screaming of bus related problems for some time, they were ignored, btrfs did what it could and reported hundreds to thousands of errors in dmesg, but the user space tools didn't warn the user that operations had effectively failed.

Right, I assume that the WD6400AAKS failed to read the 250,000 blocks due to heat or SATA link issues. And in this case the user space tools should have warned and aborted the operations, because there is hope that after cooling down the disk or fixing the SATA link issues, the read errors disappear.

There is the other use case where such unrecoverable read errors are expected. This is the case when a disk is about to die. The configuration option whether to abort or continue on unrecoverable read errors is missing.

The even better solution is to implement an optional verify at the end, or a scrub run, and to only declare the operation as finished when this additional check succeeds.
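Until such a built-in verify exists, the closest approximation is to scrub immediately after the replace and refuse to retire the old disk unless the error count is zero. A sketch of the output parsing; the summary format assumed here is the one shown earlier in this thread ("total bytes scrubbed: 576.46GB with 261883 errors") and may differ in newer btrfs-progs:

```shell
# Extract the error count from a 'btrfs scrub start -B' summary read on
# stdin; prints the number, or nothing if no matching line is present.
scrub_errors() {
    sed -n 's/.*scrubbed: .* with \([0-9][0-9]*\) errors.*/\1/p'
}

# Intended use (illustrative):
#   n=$(btrfs scrub start -Bd /disks/backups | scrub_errors)
#   [ "$n" = 0 ] || echo "do not retire the source disk yet" >&2
```

Gating on the parsed count, rather than eyeballing the summary, would have caught the 261883 uncorrectable errors before the WD-Blue copy was discarded.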