hi all

I moved my btrfs filesystems around using "btrfs replace" and now I have errors (lots of errors):

[63724.419779] BTRFS info (device dm-12): csum failed ino 9340 off 8192 csum 717036259 private 94677163

: root; time btrfs scrub start -Bd /disks/backups
scrub device /dev/dm-11 (id 1) done
        scrub started at Sun Aug 18 15:17:50 2013 and finished after 4487 seconds
        total bytes scrubbed: 576.46GB with 261883 errors
        error details: csum=261883
        corrected errors: 0, uncorrectable errors: 261883, unverified errors: 0

I had two 2 TB disks whose data I needed to swap (/mnt on a WD-Black and /disks/backup on an HD204UI). Both had btrfs filesystems, but /disks/backup was encrypted using LUKS. I had a spare 640 GB WD-Blue disk that I plugged into a SATA dock for this operation.

I "btrfs resize"d /disks/backup to fit in 590 GB, then I "btrfs replace"d /disks/backup to a new LUKS partition on the WD-Blue disk. Then I "btrfs replace"d /mnt to the HD204UI. Then I "btrfs replace"d the backup data to a new LUKS partition on the WD-Black. I then got I/O errors reading /disks/backup.

I'm using Linux kooka 3.10-2-amd64 #1 SMP Debian 3.10.5-1 (2013-08-07) x86_64 GNU/Linux and btrfs-tools 0.19+20130315-5.

rsync: write failed on "/disks/backups/snapshot_rsync/stuart/secret/current/.purple/accounts.xml": Input/output error (5)

Lots of files on /disks/backup have errors. smartctl says PASSED for all the drives.

This is a summary of what I did:

    6  btrfs filesystem resize 580g .
    9  time btrfs balance start -musage=1 -dusage=1 . && time btrfs filesystem resize 580g .
   10  time btrfs filesystem resize 590g .
   12  cryptsetup luksOpen /dev/sdd2 640Gb
   13  time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
   14  time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
   18  cryptsetup remove _dev_sdc2
   19  fdisk /dev/sdc
   32  time btrfs replace start /dev/sdb1 /dev/sdc2 -B /mnt
   34  btrfs filesystem label /dev/dm-12
   36  btrfs filesystem label /disks/backups backups2Tb
   38  btrfs filesystem label /disks/backups
   39  cryptsetup luksFormat /dev/sdb2
   40  cryptsetup luksAddKey /dev/sdb2
   41  cryptsetup open /dev/sdb2 newbackups
   43  time btrfs replace start /dev/dm-12 /dev/dm-11 -B /disks/backups
   44  btrfs filesystem show
   45  cryptsetup status 640Gb
   46  cryptsetup remove 640Gb
   47  btrfs filesystem show
   49  btrfs filesystem resize max /disks/backups/
   54  /etc/local/backups    # errors!
   57  time btrfs scrub start -Bd /disks/backups

Lots of errors in /var/log/syslog:

Aug 18 12:27:51 kooka kernel: [54113.507151] btrfs: dev_replace from /dev/mapper/640Gb (devid 1) to /dev/dm-11) started
Aug 18 12:27:51 kooka kernel: [54113.601334] device label backups2Tb devid 1 transid 39282 /dev/dm-12
Aug 18 12:28:03 kooka kernel: [54125.020038] ata10.00: exception Emask 0x10 SAct 0x3dfe0ff0 SErr 0x780100 action 0x6
Aug 18 12:28:03 kooka kernel: [54125.020043] ata10.00: irq_stat 0x08000000
Aug 18 12:28:03 kooka kernel: [54125.020047] ata10: SError: { UnrecovData 10B8B Dispar BadCRC Handshk }
Aug 18 12:28:03 kooka kernel: [54125.020050] ata10.00: failed command: READ FPDMA QUEUED
Aug 18 12:28:03 kooka kernel: [54125.020056] ata10.00: cmd 60/18:20:c0:18:0b/00:00:00:00:00/40 tag 4 ncq 12288 in
Aug 18 12:28:03 kooka kernel: [54125.020056]          res 40/00:5c:f0:1a:0b/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Aug 18 12:28:03 kooka kernel: [54125.020059] ata10.00: status: { DRDY }
[...]
Aug 18 12:28:03 kooka kernel: [54125.020262] ata10: hard resetting link
Aug 18 12:28:03 kooka kernel: [54125.512032] ata10: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 18 12:28:03 kooka kernel: [54125.523759] ata10.00: configured for UDMA/133
Aug 18 12:28:03 kooka kernel: [54125.536380] ata10: EH complete
Aug 18 12:28:04 kooka kernel: [54125.770176] ata10.00: exception Emask 0x10 SAct 0x7fffffff SErr 0x780100 action 0x6
Aug 18 12:28:04 kooka kernel: [54125.770181] ata10.00: irq_stat 0x08000000
Aug 18 12:28:04 kooka kernel: [54125.770184] ata10: SError: { UnrecovData 10B8B Dispar BadCRC Handshk }
[...]
Aug 18 12:28:17 kooka kernel: [54138.957095] ata10.00: status: { DRDY }
Aug 18 12:28:17 kooka kernel: [54138.957100] ata10: hard resetting link
Aug 18 12:28:17 kooka kernel: [54139.448029] ata10: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Aug 18 12:28:17 kooka kernel: [54139.449972] ata10.00: configured for UDMA/133
Aug 18 12:28:17 kooka kernel: [54139.464065] ata10: EH complete
[...]
Aug 18 12:38:31 kooka kernel: [54753.527070] btrfs: checksum error at logical 52642709504 on dev /dev/dm-12, sector 104931328, root 1281, inode 42152, offset 0, length 4096, links 1 (path: XXXXX)
[...]
Aug 18 12:38:31 kooka kernel: [54753.606566] btrfs: bdev /dev/dm-12 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
[...]
Aug 18 12:38:32 kooka kernel: [54753.679513] btrfs: bdev /dev/dm-12 errs: wr 0, rd 0, flush 0, corrupt 10, gen 0
Aug 18 12:38:36 kooka kernel: [54758.076089] scrub_handle_errored_block: 15173 callbacks suppressed
[...]
Aug 18 12:38:52 kooka kernel: [54774.647414] btrfs: bdev /dev/dm-12 errs: wr 0, rd 0, flush 0, corrupt 65313, gen 0
[...]
Aug 18 15:24:03 kooka kernel: [64685.641464] btrfs: unable to fixup (regular) error at logical 52643758080 on dev /dev/dm-11

It appears that my WD-Blue or its connection is bad, but why didn't "btrfs replace" give me an error?
"btrfs replace" seems to have read bad data without checking the checksum and then written the bad data to the new disk.

ata10 is the WD-Blue:

Aug 17 21:26:19 kooka kernel: [    1.410573] ata10.00: ATA-8: WDC WD6400AAKS-00A7B2, 01.03B01, max UDMA/133

: root; sleep 2m; smartctl -a /dev/sdd
[...]
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   161   158   021    Pre-fail  Always       -       4933
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       327
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   070   070   000    Old_age   Always       -       22077
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       245
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       169
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       327
194 Temperature_Celsius     0x0022   096   090   000    Old_age   Always       -       51
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       12080
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

I guess that /disks/backup is mostly dead and that I should just reformat it. What do you think? Next time I'll watch /var/log/syslog, but I would have preferred that "btrfs replace" stop when getting errors.

thanks, Stuart
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
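[Editorial aside, not part of the original mail: the SMART table above already contains the diagnostic that matters. The attributes that distinguish a bad link from a bad disk can be pulled out of `smartctl -a` text mechanically; this is a minimal sketch where the parsing logic and the interpretation rule are my own illustration, using the values quoted above.]

```python
# Sketch: extract raw SMART attribute values from "smartctl -a" output.
# Sample lines are taken from the smartctl output quoted in the mail above.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       12080
"""

def raw_values(text):
    """Map attribute name -> raw value for lines that look like SMART rows."""
    out = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0].isdigit():
            out[fields[1]] = int(fields[9])
    return out

attrs = raw_values(SAMPLE)
# A large UDMA_CRC_Error_Count with zero reallocated/pending sectors points
# at the link (cable/connector/dock), not at the platters.
if attrs["UDMA_CRC_Error_Count"] > 0 and attrs["Reallocated_Sector_Ct"] == 0:
    print("link-level CRC errors: suspect cable/connector, not the disk")
```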
On Aug 18, 2013, at 1:12 PM, Stuart Pook <slp644161@pook.it> wrote:

> 6 btrfs filesystem resize 580g .

You first shrank a 2TB btrfs file system on a dmcrypt device to 590GB. But then you didn't resize the dm device or the partition?

> 9 time btrfs balance start -musage=1 -dusage=1 . && time btrfs filesystem resize 580g .
> 10 time btrfs filesystem resize 590g .

You followed the resize of the fs, but not the underlying devices, with a balance, then resized it two more times? This is weird, and it also makes the sequence difficult to follow.

> 13 time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
> 14 time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups

Why is this command repeated? What's with the numbering system that skips numbers?

> [...]
> Aug 18 12:28:03 kooka kernel: [54125.020262] ata10: hard resetting link
> Aug 18 12:28:03 kooka kernel: [54125.512032] ata10: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> Aug 18 12:28:03 kooka kernel: [54125.523759] ata10.00: configured for UDMA/133
> Aug 18 12:28:03 kooka kernel: [54125.536380] ata10: EH complete
> Aug 18 12:28:04 kooka kernel: [54125.770176] ata10.00: exception Emask 0x10 SAct 0x7fffffff SErr 0x780100 action 0x6
> Aug 18 12:28:04 kooka kernel: [54125.770181] ata10.00: irq_stat 0x08000000
> Aug 18 12:28:04 kooka kernel: [54125.770184] ata10: SError: { UnrecovData 10B8B Dispar BadCRC Handshk }
> [...]
> Aug 18 12:28:17 kooka kernel: [54138.957095] ata10.00: status: { DRDY }
> Aug 18 12:28:17 kooka kernel: [54138.957100] ata10: hard resetting link
> Aug 18 12:28:17 kooka kernel: [54139.448029] ata10: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> Aug 18 12:28:17 kooka kernel: [54139.449972] ata10.00: configured for UDMA/133
> Aug 18 12:28:17 kooka kernel: [54139.464065] ata10: EH complete

Bad connection, so libata is dropping the link from 3 Gbps to 1.5 Gbps.

> 199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       12080

This confirms that both ends of the cable are sensing communication problems between drive and controller. The cable needs to be replaced; likely it's the connector, not the cable itself.

> I guess that /disks/backup is mostly dead and that I should just reformat it. What do you think?

Well, I think I'd try to simplify this drastically and see if you've got a reproducible bug. The steps you've got I find mostly incoherent, so I can't try to do what you did to see if it's reproducible.

> Next time I'll watch /var/log/syslog but I would have preferred that "btrfs replace" stop when getting errors.

The errors should be self-correcting, but the mere fact they're happening means that some errors could be occurring but aren't detected. If the data is corrupted in transit, but the drive or controller didn't report a problem, then btrfs has no way of knowing it was written incorrectly. There's only so much software can do to overcome blatant hardware problems. But it seems unlikely that such a high percentage of errors would go undetected and result in so many uncorrectable errors, so there may be user error here along with a bug.

Chris Murphy
hi Chris

thanks for your reply. I was unable to save the filesystem. Even after deleting all but 4 GB I still had too many errors, so I just reformatted the device. I'm glad that it was my backups and not my data.

On 18/08/13 23:43, Chris Murphy wrote:
> On Aug 18, 2013, at 1:12 PM, Stuart Pook <slp644161@pook.it> wrote:
>
>> 6 btrfs filesystem resize 580g .
>
> You first shrank a 2TB btrfs file system on dmcrypt device to 590GB.
> But then you didn't resize the dm device or the partition?

no, I had no need to resize the dm device or partition. I just read that when doing a replace the new device must be no smaller than the old device. So I shrank the old filesystem using "btrfs filesystem resize". Once the resize worked I was able to do the replace, but I didn't try to replace before resizing.

This is what btrfs(1) says on Debian: "The targetdev needs to be same size or larger than the srcdev." I may be confused here.

>> 9 time btrfs balance start -musage=1 -dusage=1 . && time btrfs filesystem resize 580g .

I was surprised that the resize to 580 GB didn't work, so I tried a magical rebalance before doing the resize to 580 again. It still didn't work (not enough space) but a resize to 590 GB did.

>> 10 time btrfs filesystem resize 590g .

this worked

> You followed the resize of the fs, but not the underlying devices,
> with a balance, then resized it two more times?

The resize to 580 didn't work. So I did a balance. The resize to 580 still didn't work, so I resized to 590.

> This is weird, but also makes the sequence difficult to follow.

>> 13 time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
>> 14 time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
>
> Why is this command repeated? What's with the numbering system that
> skips numbers?

The command is repeated because I cancelled it by mistake by setting the filesystem to readonly. I'm not sure if I restarted it by rerunning the replace or just by remounting the filesystem read-write in another window.
I'll put all of the commands at the end of this mail.

>> Aug 18 12:28:17 kooka kernel: [54139.448029] ata10: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
>
> Bad connection so libata is dropping the link from 3 Gbps to 1.5Gbps.
>
>> 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 12080
>
> This confirms that both ends of the cable are sensing communication
> problems between drive and controller. The cable needs to be
> replaced, likely it's the connector not the cable itself.

I think that I should stop using my SATA dock with the SATA ports on my motherboard, which are probably not designed to be hot-plugged.

>> I guess that /disks/backup is mostly dead and that I should just
>> reformat it. What do you think?
>
> Well I think I'd try to simplify this drastically and see if you've
> got a reproducing bug.

I ran a badblocks scan on the raw device (not the luks device) and didn't get any errors.

> The steps you've got I find mostly incoherent, so I can't try to do
> what you did to see if it's reproducible.

yes, this was the first time I've tried this. And just to make this more difficult, some commands were typed in a different window.

>> Next time I'll watch /var/log/syslog but I would have preferred
>> that "btrfs replace" stop when getting errors.
>
> The errors should be self correcting, but the mere fact they're
> happening means that some errors could be occurring but aren't
> detected. If the data is corrupting in-transit, but the drive or
> controller didn't report a problem, then btrfs has no way of knowing
> it was written incorrectly.

The data was written to the WD-Blue (640 GB) disk and then copied off it. The only errors I saw concerned the WD-Blue. If the errors were data corruption on writing or reading the WD-Blue then I would have thought that the checksums would have told me that there was something wrong. btrfs didn't give me an I/O error until I started to read the files once the data was on a final disk.
Does "btrfs replace" check the checksums as it reads the data from the disk that is being replaced?

Just to be clear, this is the series of btrfs replace operations I did:

backups : HD204UI  -> WD-Blue
/mnt    : WD-Black -> HD204UI
backups : WD-Blue  -> WD-Black

I guess that my backups were corrupted as they were written to or read from the WD-Blue. Wouldn't the checksums have detected this problem before the data was written to the WD-Black?

> There's only so much software can do to overcome blatant hardware problems.

I was hoping to be informed of them.

> But, it seems unlikely such a high percent of errors would go
> undetected to result in so many uncorrectable errors, so there may be
> user error here along with a bug.

I'm not sure how I could have done it better. Does "btrfs replace" check that the data is correctly written to the new disk before it is removed from the old disk? Should I have used the 2 disks to make a RAID-1 and then done a scrub before removing the old disk?

Here is the complete list of commands I ran in the main terminal:

    1  cd /disks/backups/
    2  btrfs filesystem df
    3  btrfs filesystem df ,
    4*
    5  btrfs filesystem df .
    6  btrfs filesystem resize 580g .
    7  date
    8  btrfs filesystem df .
    9  time btrfs balance start -musage=1 -dusage=1 . && time btrfs filesystem resize 580g .
   10  time btrfs filesystem resize 590g .
   11  btrfs filesystem show
   12  cryptsetup luksOpen /dev/sdd2 640Gb
   13  time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
   14  time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
   15  cd /
   16  btrfs filesystem show
   17  btrfs filesystem show
   18  cryptsetup remove _dev_sdc2
   19  fdisk /dev/sdc
   20  fdisk /dev/sdc
   21  fdisk -c /dev/sdc
   22  fdisk -c=dos /dev/sdc
   23  fdisk /dev/sdc
   24  fdisk -c=dos /dev/sdc
   25  l /mnt
   26  mount /dev/sdb1 /mnt
   27  l /mnt
   28  btrfs subv list /mnt
   29  btrfs filesystem show
   30  #time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
   31  fdisk -l /dev/sdc
   32  time btrfs replace start /dev/sdb1 /dev/sdc2 -B /mnt
   33  btrfs filesystem show
   34  btrfs filesystem label /dev/dm-12
   35  btrfs filesystem label /disks/backups
   36  btrfs filesystem label /disks/backups backups2Tb
   37  btrfs filesystem show
   38  btrfs filesystem label /disks/backups
   39  cryptsetup luksFormat /dev/sdb2
   40  cryptsetup luksAddKey /dev/sdb2
   41  cryptsetup open /dev/sdb2 newbackups
   42  l /dev/mapper/newbackups
   43  time btrfs replace start /dev/dm-12 /dev/dm-11 -B /disks/backups
   44  btrfs filesystem show
   45  cryptsetup status 640Gb
   46  cryptsetup remove 640Gb
   47  btrfs filesystem show
   48  btrfs filesystem df /disks/backups/
   49  btrfs filesystem resize max /disks/backups/
   50  btrfs filesystem df /disks/backups/
   51  btrfs filesystem show
   52  vi /etc/cron.daily/storebackup
   53  vi /etc/cron.daily/stuart
   54  /etc/local/backups
   55  mount
   56  mount -o remount,rw /disks/backups/
   57  time btrfs scrub start -Bd /disks/backups
   58  smartctl -a /dev/sdb
   59  smartctl -a /dev/sdc
   60  smartctl -a /dev/sdd
   61  smartctl -t short /dev/sdd
   62  sleep 2m; smartctl -a /dev/sdd
   63  history > /tmp/root.commands

Which disk is which?
WD-Black  ata-WDC_WD2002FAEX-007BA0_WD-WCAY00589823 -> ../../sdb
HD204UI   ata-ST2000DL004_HD204UI_S2H7J90C549571    -> ../../sdc
WD-Blue   ata-WDC_WD6400AAKS-00A7B2_WD-WMASY2546840 -> ../../sdd

please let me know if I can be any clearer, thanks
Stuart
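[Editorial aside, not part of the original mail: Stuart's question about where the checksums could have caught this turns on *when* a checksum is computed. Below is a toy model of per-block checksumming, entirely my own illustration; btrfs actually uses crc32c over 4 KiB blocks, and `zlib.crc32` merely stands in for it here. The point it demonstrates: if the bus corrupts a block *after* the checksum was computed, the bad bytes land on disk next to the original, correct checksum, so nothing complains during the write, and the mismatch only surfaces on the next read or scrub.]

```python
import zlib

def write_block(data):
    """Filesystem computes the checksum once, at write time."""
    return data, zlib.crc32(data)

def bus_corrupts(data):
    """A flaky cable flips bytes in transit; the drive reports no error."""
    return b"X" + data[1:]

def read_block(stored_data, stored_csum):
    """On read, the checksum is recomputed and compared."""
    if zlib.crc32(stored_data) != stored_csum:
        raise IOError("csum failed")   # the kind of error Stuart saw in dmesg
    return stored_data

data, csum = write_block(b"backup payload")
on_disk = bus_corrupts(data)           # silent at write time
try:
    read_block(on_disk, csum)
except IOError as e:
    print(e)                           # prints: csum failed
```

This is why a scrub (or an ordinary read) on the target disk after a migration, but before wiping the source, would have caught the corruption while a good copy still existed.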
On Aug 18, 2013, at 4:35 PM, Stuart Pook <slp644161@pook.it> wrote:

>> You first shrank a 2TB btrfs file system on dmcrypt device to 590GB.
>> But then you didn't resize the dm device or the partition?
>
> no, I had no need to resize the dm device or partition.

OK, well, it's unusual to resize a file system and then not resize the containing block device. I don't know if Btrfs cares about this or not.

> I ran a badblocks scan on the raw device (not the luks device) and didn't get any errors.

badblocks depends on the drive determining a persistent read failure with a sector, and timing out before the SCSI block layer times out. Since the Linux SCSI driver timeout is 30 seconds, and most consumer drives' error recovery timeout is 120 seconds, the bus is reset before the drive has a chance to report a bad sector. So I think you're better off using smartctl -t long tests to find bad sectors on a disk.

Further, a smartctl -x may show SATA Phy Event Counters, which should have 0s or very low numbers; if not, that's also an indicator of hardware problems.

> The data was written to the WD-Blue (640Gb) disk and then copied off it. The only errors I saw concerned the WD-Blue. If the errors were data corruption on writing or reading the WD-Blue then I would have thought that the checksums would have told me that there was something wrong. btrfs didn't give me an IO error until I started to read the files when the data was on a final disk.

How does Btrfs know there's been a failure during write if the hardware hasn't detected it? Btrfs doesn't re-read everything it just wrote to the drive to confirm it was written correctly. It assumes it was, unless there's a hardware error. It wouldn't know otherwise until a Btrfs scrub is done on the written drive.

What I can't tell you is how Btrfs behaves, and whether it behaves correctly, when writing data to hardware having transient errors.
I don't know what it does when the hardware reports the error, but presumably if the hardware doesn't report an error Btrfs can't do anything about that except on the next read or scrub.

> Just to be clear. This is the series of btrfs replace I did:
>
> backups : HD204UI -> WD-Blue
> /mnt : WD-Black -> HD204UI
> backups : WD-Blue -> WD-Black
>
> I guess that my backups were corrupted as they were written to or read from the WD-Blue. Wouldn't the checksums have detected this problem before the data was written to the WD-Black?

When you first encountered the btrfs-reported csum errors, what operation was occurring?

>> There's only so much software can do to overcome blatant hardware problems.
>
> I was hoping to be informed of them

Well, you were informed of them in dmesg, by virtue of the controller having problems talking to a SATA rev 2 drive at rev 2 speed, with a negotiated fallback to rev 1 speed.

>> But, it seems unlikely such a high percent of errors would go
>> undetected to result in so many uncorrectable errors, so there may be
>> user error here along with a bug.
>
> I'm not sure how I could have done it better. Does "btrfs replace" check that the data is correctly written to the new disk before it is removed from the old disk?

That's a valid question. Hopefully someone more knowledgeable can answer what the expected error handling behavior is supposed to be.

> Should I have used the 2 disks to make a RAID-1 and then done a scrub before removing the old disk?

Good question. Possibly it's best practice to use btrfs replace with an existing raid1, rather than as a way to move a single copy of data from one disk to another. I think you'd have been better off using btrfs send and receive for this operation.

A full dmesg might also be enlightening, even if it is really long. Just put it in its own email without comment. I think pasting it out of a forum is less preferred.
Chris Murphy
This is just a comment from someone following all of this from the sidelines: I see so much going on in this procedure that it scares me. Once a single operation reaches a certain degree of complexity I get really worried, because all it takes is a single misstep and my data is gone. And that happens so easily as complexity increases and confusion sets in.

In this particular situation, my solution would probably have been to create a new btrfs partition from scratch on the new drive, simply mount the source partition/drive read-only, and rsync the data across to the target rather than trying the btrfs replace operation. That way I could have verified the target drive before erasing the source drive, and I would not have had to worry about partition sizes, encryption, etc.

That said, I am certainly thankful that this was backup data and not working data. But I think it serves as a cautionary tale about not assuming that something should be done just because it theoretically can be done. I am not really familiar with btrfs replace, but I would imagine that it is intended more for use in a raid situation than for simply moving data from one drive to another.

On 08/18/2013 05:42 PM, Chris Murphy wrote:
> On Aug 18, 2013, at 4:35 PM, Stuart Pook <slp644161@pook.it> wrote:
> [...]
> Good question. Possibly it's best practices to use btrfs replace with an existing raid1, rather than using it as a way to move a single copy of data from one disk to another. I think you'd have been better off using btrfs send and receive for this operation.
>
> A full dmesg might also be enlightening even if it is really long. Just put it in its own email without comment. I think pasting it out of forum is less preferred.
>
> Chris Murphy
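[Editorial aside, not part of the original mail: the copy-then-verify workflow suggested above can be sketched in a few lines of shell. This is a toy illustration of my own on throwaway temp directories; `cp -a` stands in for `rsync -aHAX src/ dst/`, and a real migration would of course involve mounted filesystems rather than mktemp directories.]

```shell
#!/bin/sh
# Sketch of "copy to the new location, verify, only then erase the source".
set -e

# Verify a copy byte-for-byte against its source.
verify_copy() {
    diff -r "$1" "$2" >/dev/null
}

src=$(mktemp -d); dst=$(mktemp -d)

# Stand-in source data (a real run would be the mounted backup filesystem).
mkdir -p "$src/subdir"
printf 'important backup data\n' > "$src/file1"
printf 'more data\n' > "$src/subdir/file2"

# 1. Copy (rsync -aHAX would preserve hard links, ACLs and xattrs as well;
#    cp -a is the portable stand-in used here).
cp -a "$src/." "$dst/"

# 2. Only wipe the source once the copy checks out.
if verify_copy "$src" "$dst"; then
    echo "copies match: safe to wipe the source"
fi
rm -rf "$src" "$dst"
```

The key property, unlike a one-shot replace onto flaky hardware, is that the source stays untouched until the target has been independently read back and compared.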
On Mon, 19 Aug 2013 00:35:54 +0200, Stuart Pook wrote:> hi Chris > > thanks for your reply. I was unable to save the filesystem. Even after > deleting all but 4Gb I still had too many errors so I just reformated > the device. I''m glad that it was my backups and not my data. > > On 18/08/13 23:43, Chris Murphy wrote: >> On Aug 18, 2013, at 1:12 PM, Stuart Pook <slp644161@pook.it> wrote: >> >>> 6 btrfs filesystem resize 580g . >> >> You first shrank a 2TB btrfs file system on dmcrypt device to 590GB. >> But then you didn''t resize the dm device or the partition? > > no, I had no need to resize the dm device or partition. I just read > that when doing a replace the new device must be no smaller than the old > device. So I shrunk the old device using "btrfs filesystem resize". > Once the resize worked I was able to do the replace but I didn''t try to > replace before resizing. > > This is what btrfs(1) says on Debian: "The targetdev needs to be same > size or larger than the srcdev." I may be confused here. > >>> 9 time btrfs balance start -musage=1 -dusage=1 . && time btrfs >>> filesystem resize 580g . > > I was surprised that the resize to 580Gb didn''t work so I tried a > magical rebalance before doing the resize to 580 again. It still didn''t > work (not enough space) but a resize to 590 Gb did. > >>> 10 time btrfs filesystem resize 590g . > > this worked > >> You followed the resize of the fs, but not the underlying devices, >> with a balance, then resized it two more times? > > The resize to 580 didn''t work. So I did a balance. The resize to 580 > still didn''t work so I resized to 590. > >> This is weird, but also makes the sequence difficult to follow. > >>> 13 time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups >>> 14 time btrfs replace start /dev/dm-11 /dev/dm-12-B /disks/backups > >> Why is this command repeated? What''s with the numbering system that >> skips numbers? 
> > The command is repeated because I cancelled it my mistake by setting the > filesystem to readonly. I''m not sure if I restarted it by rerunning the > replace or just by remounting the filesystem readwrite in another window. > > I''ll put all of the commands at the end of this list. > >>> Aug 18 12:28:17 kooka kernel: [54139.448029] ata10: SATA link up1.5 >>> Gbps (SStatus 113 SControl 310) >> Bad connection so libata is dropping the link from 3 Gbps to1.5Gbps. >>> 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age >>> Always - 12080 >> >> This confirms that both ends of the cable are sensing communication >> problems between drive and controller. The cable needs to be >> replaced, likely it''s the connector not the cable itself. > > I think that I should stop using my SATA dock with the SATA ports on my > motherboard which are probably not designed to be hot plugged. > >>> I guess that /disks/backup is mostly dead and that I should just >>> reformat it. What do you think? >> >> Well I think I''d try to simplify this drastically and see if you''ve >> got a reproducing bug. > > I ran a badblocks scan on the raw device (not the luks device) and > didn''t get any errors. > >> The steps you''ve got I find mostly incoherent, so I can''t try to do >> what you did to see if it''s reproducible. > > yes, this was the first time I''ve tried this. And just to make this > more difficult some commands were typed in a different window. > >>> Next time I''ll watch /var/log/syslog but I would have preferred >>> that "btrfs replace" stop when getting errors. >> >> The errors should be self correcting, but the mere fact they''re >> happening means that some errors could be occurring but aren''t >> detected. If the data is corrupting in-transit, but the drive or >> controller didn''t report a problem, then btrfs has no way of knowing >> it was written incorrectly. > > The data was written to the WD-Blue (640Gb) disk and then copied off > it. 
The only errors I saw concerned the WB-Blue. If the errors were > data corruption on writing or reading the WD-Blue then I would have > thought that the checksums would have told me that there was something > wrong. btrfs didn''t give me an IO error until I started to read the > files when the data was on a final disk. > > Does "btrfs replace" check the ckecksums as it reads the data from the > disk that is being replaced? > > Just to be clear. This is the series of btrfs replace I did: > > backups : HD204UI -> WD-Blue > /mnt : WD-Black -> HD204UI > backups : WD-Blue -> WD-Black > > I guess that my backups were corrupted was they were written to or read > from the WD-Blue. Wouldn''t the checksums have detected this problem > before the data was written to the WD-Black? > >> There''s only so much software can do to overcome blatant hardware >> problems. > > I was hoping to be informed of them > >> But, it seems unlikely such a high percent of errors would go >> undetected to result in so many uncorrectable errors, so there may be >> user error here along with a bug. > > I''m not sure how I could have done it better. Does "btrfs replace" check > that the data is correctly written to the new disk before it is removed > from the old disk? Should I have used the 2 disks to make a RAID-1 and > then done a scrub before removing the old disk? > > Here is the complete list of commands I made in the main terminal > > 1 cd /disks/backups/ > 2 btrfs filesystem df > 3 btrfs filesystem df , > 4* > 5 btrfs filesystem df . > 6 btrfs filesystem resize 580g . > 7 date > 8 btrfs filesystem df . > 9 time btrfs balance start -musage=1 -dusage=1 . && time btrfs > filesystem resize 580g . > 10 time btrfs filesystem resize 590g . 
> 11 btrfs filesystem show
> 12 cryptsetup luksOpen /dev/sdd2 640Gb
> 13 time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
> 14 time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
> 15 cd /
> 16 btrfs filesystem show
> 17 btrfs filesystem show
> 18 cryptsetup remove _dev_sdc2
> 19 fdisk /dev/sdc
> 20 fdisk /dev/sdc
> 21 fdisk -c /dev/sdc
> 22 fdisk -c=dos /dev/sdc
> 23 fdisk /dev/sdc
> 24 fdisk -c=dos /dev/sdc
> 25 l /mnt
> 26 mount /dev/sdb1 /mnt
> 27 l /mnt
> 28 btrfs subv list /mnt
> 29 btrfs filesystem show
> 30 #time btrfs replace start /dev/dm-11 /dev/dm-12 -B /disks/backups
> 31 fdisk -l /dev/sdc
> 32 time btrfs replace start /dev/sdb1 /dev/sdc2 -B /mnt
> 33 btrfs filesystem show
> 34 btrfs filesystem label /dev/dm-12
> 35 btrfs filesystem label /disks/backups
> 36 btrfs filesystem label /disks/backups backups2Tb
> 37 btrfs filesystem show
> 38 btrfs filesystem label /disks/backups
> 39 cryptsetup luksFormat /dev/sdb2
> 40 cryptsetup luksAddKey /dev/sdb2
> 41 cryptsetup open /dev/sdb2 newbackups
> 42 l /dev/mapper/newbackups
> 43 time btrfs replace start /dev/dm-12 /dev/dm-11 -B /disks/backups
> 44 btrfs filesystem show
> 45 cryptsetup status 640Gb
> 46 cryptsetup remove 640Gb
> 47 btrfs filesystem show
> 48 btrfs filesystem df /disks/backups/
> 49 btrfs filesystem resize max /disks/backups/
> 50 btrfs filesystem df /disks/backups/
> 51 btrfs filesystem show
> 52 vi /etc/cron.daily/storebackup
> 53 vi /etc/cron.daily/stuart
> 54 /etc/local/backups
> 55 mount
> 56 mount -o remount,rw /disks/backups/
> 57 time btrfs scrub start -Bd /disks/backups
> 58 smartctl -a /dev/sdb
> 59 smartctl -a /dev/sdc
> 60 smartctl -a /dev/sdd
> 61 smartctl -t short /dev/sdd
> 62 sleep 2m; smartctl -a /dev/sdd
> 63 history > /tmp/root.commands
>
> Which disk is which?
> WD-Black ata-WDC_WD2002FAEX-007BA0_WD-WCAY00589823 -> ../../sdb
> HD204UI ata-ST2000DL004_HD204UI_S2H7J90C549571 -> ../../sdc
> WD-Blue ata-WDC_WD6400AAKS-00A7B2_WD-WMASY2546840 -> ../../sdd
>
> Please let me know if I can be any clearer, thanks
> Stuart

Do you still have the kernel log files around that had been written while you ran the replace procedure? /var/log/messages*. Could you share these files (via personal mail if the files are too huge).

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
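The by-id to sdX mapping above can be regenerated at any time by resolving the stable symlinks under /dev/disk/by-id. A minimal sketch; the `resolve_ids` helper name is illustrative (not standard tooling), and the directory is parameterised only so the logic can be exercised on any directory of symlinks:

```shell
# List each by-id symlink and the device node it currently resolves to.
# In real use the directory is /dev/disk/by-id.
resolve_ids() {
    dir="${1:-/dev/disk/by-id}"
    for link in "$dir"/*; do
        [ -L "$link" ] || continue                  # skip non-symlinks
        printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
    done
}
```

Running something like this after each reboot avoids guessing which /dev/sdX a cable or dock shuffle has produced.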
Stefan Behrens <sbehrens@giantdisaster.de> wrote:
> Do you still have the kernel log files around that had been written while you ran the replace procedure? /var/log/messages*. Could you share these files (via personal mail if the files are too huge).

Yes I still have them. I'm away at the moment so I'll not be able to send them to you before Saturday. What is the maximum size that I should send to the mailing list?

Stuart
Chris Murphy <lists@colorremedies.com> wrote:
>> I ran a badblocks scan on the raw device (not the luks device) and didn't get any errors.
>
> badblocks will depend on the drive determining a persistent read failure with a sector, and timing out before the SCSI block layer times out. Since the linux SCSI driver time out is 30 seconds, and most consumer drive ECT is 120 seconds, the bus is reset before the drive has a chance to report a bad sector. So I think you're better off using smartctl -t long tests to find bad sectors on a disk.

I have no reason to think that I have bad sectors on the disk. I just wanted to see if badblocks would lead to errors due to connection or cable problems. It didn't.

> How does Btrfs know there's been a failure during write if the hardware hasn't detected it? Btrfs doesn't re-read everything it just wrote to the drive to confirm it was written correctly. It assumes it was unless there's a hardware error. It wouldn't know this until a Btrfs scrub is done on the written drive.

I was hoping that btrfs would have checked that the data was correctly copied to the new disk before it removed it from the original. This is what would have saved my filesystem.

> What I can't tell you is how Btrfs behaves, and if it behaves correctly, when writing data to hardware having transient errors. I don't know what it does when the hardware reports the error, but presumably if the hardware doesn't report an error Btrfs can't do anything about that except on the next read or scrub.

But btrfs did read the data from the WD-Blue because it copied it to the WD-Black. btrfs copied rubbish onto the WD-Black, so if it had checked the checksums as it read from the WD-Blue it would have seen that things were bad. This would already have been too late for my filesystem but it would have been good to know then rather than just get errors when I tried to read the files on the filesystem.

>> Just to be clear.
>> This is the series of btrfs replace I did:
>>
>> backups : HD204UI -> WD-Blue
>> /mnt : WD-Black -> HD204UI
>> backups : WD-Blue -> WD-Black
>>
>> I guess that my backups were corrupted as they were written to or read from the WD-Blue. Wouldn't the checksums have detected this problem before the data was written to the WD-Black?
>
> When you first encountered the btrfs reported csum errors, what operation was occurring?

When I started to read and write my backups after they had been copied to the WD-Black.

>>> There's only so much software can do to overcome blatant hardware problems.
>>
>> I was hoping to be informed of them
>
> Well you were informed of them in dmesg, by virtue of the controller having problems talking to a SATA rev 2 drive at rev 2 speed, with a negotiated fallback to rev 1 speed.

I wanted btrfs to reread the new disk before removing the old disk from the filesystem. I also do not understand why the errors, which were going into dmesg, were not received by btrfs so that it could abort the replace.

>> Does "btrfs replace" check that the data is correctly written to the new disk before it is removed from the old disk?
>
> That's a valid question. Hopefully someone more knowledgeable can answer what the expected error handling behavior is supposed to be.

It would be good if it did!

>> Should I have used the 2 disks to make a RAID-1 and then done a scrub before removing the old disk?
>
> Good question. Possibly it's best practices to use btrfs replace with an existing raid1, rather than using it as a way to move a single copy of data from one disk to another. I think you'd have been better off using btrfs send and receive for this operation.

But using send and receive would have led to downtime.

> A full dmesg might also be enlightening even if it is really long. Just put it in its own email without comment.

As soon as I get back home ...
Stuart Pook, http://www.pook.it
On Tue, 20 Aug 2013 15:52:27 +0200, slp644161 wrote:
> Stefan Behrens <sbehrens@giantdisaster.de> wrote:
>
>> Do you still have the kernel log files around that had been written while you ran the replace procedure? /var/log/messages*. Could you share these files (via personal mail if the files are too huge).
>
> Yes I still have them. I'm away at the moment so I'll not be able to send them to you before Saturday. What is the maximum size that I should send to the mailing list?
>
> Stuart

The list server blocks mails larger than 100k characters according to <http://vger.kernel.org/majordomo-info.html>.
On Tue, 20 Aug 2013 16:46:59 +0200, slp644161 wrote:
> Chris Murphy <lists@colorremedies.com> wrote:
>
>>> I ran a badblocks scan on the raw device (not the luks device) and didn't get any errors.
>>
>> badblocks will depend on the drive determining a persistent read failure with a sector, and timing out before the SCSI block layer times out. Since the linux SCSI driver time out is 30 seconds, and most consumer drive ECT is 120 seconds, the bus is reset before the drive has a chance to report a bad sector. So I think you're better off using smartctl -t long tests to find bad sectors on a disk.
>
> I have no reason to think that I have bad sectors on the disk. I just wanted to see if badblocks would lead to errors due to connection or cable problems. It didn't.
>
>> How does Btrfs know there's been a failure during write if the hardware hasn't detected it? Btrfs doesn't re-read everything it just wrote to the drive to confirm it was written correctly. It assumes it was unless there's a hardware error. It wouldn't know this until a Btrfs scrub is done on the written drive.
>
> I was hoping that btrfs would have checked that the data was correctly copied to the new disk before it removed it from the original. This is what would have saved my filesystem.
>
>> What I can't tell you is how Btrfs behaves, and if it behaves correctly, when writing data to hardware having transient errors. I don't know what it does when the hardware reports the error, but presumably if the hardware doesn't report an error Btrfs can't do anything about that except on the next read or scrub.
>
> But btrfs did read the data from the WD-Blue because it copied it to the WD-Black. btrfs copied rubbish onto the WD-Black, so if it had checked the checksums as it read from the WD-Blue it would have seen that things were bad.
> This would already have been too late for my filesystem but it would have been good to know then rather than just get errors when I tried to read the files on the filesystem.
>
>>> Just to be clear. This is the series of btrfs replace I did:
>>>
>>> backups : HD204UI -> WD-Blue
>>> /mnt : WD-Black -> HD204UI
>>> backups : WD-Blue -> WD-Black
>>>
>>> I guess that my backups were corrupted as they were written to or read from the WD-Blue. Wouldn't the checksums have detected this problem before the data was written to the WD-Black?
>>
>> When you first encountered the btrfs reported csum errors, what operation was occurring?
>
> When I started to read and write my backups after they had been copied to the WD-Black.
>
>>>> There's only so much software can do to overcome blatant hardware problems.
>>>
>>> I was hoping to be informed of them
>>
>> Well you were informed of them in dmesg, by virtue of the controller having problems talking to a SATA rev 2 drive at rev 2 speed, with a negotiated fallback to rev 1 speed.
>
> I wanted btrfs to reread the new disk before removing the old disk from the filesystem. I also do not understand why the errors, which were going into dmesg, were not received by btrfs so that it could abort the replace.
>
>>> Does "btrfs replace" check that the data is correctly written to the new disk before it is removed from the old disk?
>>
>> That's a valid question. Hopefully someone more knowledgeable can answer what the expected error handling behavior is supposed to be.
>
> It would be good if it did!

If write errors are encountered (EIO in the write completion callback), the dev replace procedure is aborted. 'btrfs replace status' can be used to see the write errors and the fact that the replace operation was canceled. If the 'btrfs replace start' task was invoked with the '-B' option (do not background), an error message is printed in this case and the exit value is set.
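Since '-B' sets the exit value on failure, a caller can gate any follow-up step (including the decision to trust the new disk) on that status. A hedged sketch: `run_checked` is a hypothetical wrapper, not part of btrfs-progs, and the device paths in the comment are the ones from this thread:

```shell
# Run a command in the foreground and only proceed if it reports success.
# "$@" is the full command line, which keeps the wrapper testable with
# any command.
run_checked() {
    if "$@"; then
        echo "replace finished cleanly"
    else
        echo "replace FAILED; see 'btrfs replace status <mount>'" >&2
        return 1
    fi
}

# Intended real-world use (illustrative devices from this thread):
# run_checked btrfs replace start -B /dev/dm-12 /dev/dm-11 /disks/backups \
#     && btrfs scrub start -Bd /disks/backups
```

Chaining the scrub behind `&&` means the verification pass simply does not run if the replace itself already reported failure.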
I spent my whole day yesterday checking the replace procedure for errors by doing things similar to what you did, including inserting write errors, using LUKS and everything. The only issue that I was able to find was that it is not checked whether the filesystem is in read-only mode when the operation starts. If it is, the operation fails at the end without giving the user an indication. If you then think that the copy operation succeeded and scratch or reuse the source drive, all data is gone. But I have not found anything else. Therefore I'm looking forward to reading through your kernel log files.

>>> Should I have used the 2 disks to make a RAID-1 and then done a scrub before removing the old disk?
>>
>> Good question. Possibly it's best practices to use btrfs replace with an existing raid1, rather than using it as a way to move a single copy of data from one disk to another. I think you'd have been better off using btrfs send and receive for this operation.
>
> But using send and receive would have led to downtime.
>
>> A full dmesg might also be enlightening even if it is really long. Just put it in its own email without comment.
>
> As soon as I get back home ...
>
> Stuart Pook, http://www.pook.it
On 20/08/13 17:16, Stefan Behrens wrote:
> I spent my whole day yesterday checking the replace procedure for errors by doing things similar to what you did, including inserting write errors, using LUKS and everything.

thanks

> The only issue that I was able to find was that it is not checked whether the filesystem is in read-only mode when the operation starts.

Note that I started the replace and then set the filesystem to readonly. The replace then stopped. I set the filesystem back to readwrite and restarted the replace.

> Therefore I'm looking forward to reading through your kernel log files.

I emailed them to Stefan Behrens & Chris Murphy. Please let me know if you did not get them (presumably because they are too big).

Stuart
On Aug 25, 2013, at 4:10 PM, Stuart Pook <slp644161@pook.it> wrote:
>
> I emailed them to Stefan Behrens & Chris Murphy. Please let me know if you did not get them (presumably because they are too big).

Observations:

1. The problems started before the start of the provided log.

2. smartd reports sdb at 100˚C. The spec sheet for WD2002FAEX is 60˚C. It's possible the raw value isn't actually ˚C so you'll need to look at smartctl -a columns VALUE, WORST and THRESH to determine if it is or has hit the threshold. Seems possible the drives are being cooked.

sdc is ST2000DL004, for which google finds this:
http://forums.seagate.com/t5/Desktop-HDD-Desktop-SSHD/BEWARE-the-so-called-Samsung-HD204UI/m-p/166856

It also looks to be running hot.

3. The first ata error seems to be 8b/10b encoding related; it could be a connector problem, a port problem, a drive problem, or a firmware bug. The Emask 0x10 implicates NCQ according to libata.h:
AC_ERR_NCQ = (1 << 10), /* marker for offending NCQ qc */

4. Hundreds of these:
ata10.00: failed command: READ FPDMA QUEUED

This implies it may be an incompatibility between this drive and the controller; possibly disabling NCQ on the drive will fix the problem (set queue depth to 1):
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/550559
https://ata.wiki.kernel.org/index.php/Libata_FAQ

echo 1 > /sys/block/sdX/device/queue_depth

I can't tell you what /dev/ node applies to ata10.00 because the log is incomplete, so I don't know which drive is giving you a hard time with NCQ. Thing is, if you disable NCQ on just one drive, it'll slow it down compared to the others. I don't know how tolerant btrfs is when devices have different speeds.

5. Tens of thousands of checksum errors on both dm-11 and dm-12.

6.
Many instances of:
btrfs: unable to fixup (regular) error at logical 53281xxxxxx on dev /dev/dm-11

So kernel messages have been screaming of bus related problems for some time, they were ignored, btrfs did what it could and reported hundreds to thousands of errors in dmesg, but the user space tools didn't warn the user that operations had effectively failed.

Chris Murphy
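The ataN-to-/dev/sdX question above can usually be answered from sysfs: on common kernels, /sys/block/sdX/device resolves to a path that contains the ata port name as a component. A sketch under that assumption; the `ata_to_dev` helper is illustrative, and the sysfs root is parameterised only so the matching logic can be tested against a fake tree:

```shell
# Print the /dev node whose sysfs device path contains the given ata port.
# Usage: ata_to_dev ata10 [sysroot]    (sysroot defaults to /sys)
ata_to_dev() {
    port="$1"; root="${2:-/sys}"
    for dev in "$root"/block/sd*; do
        link=$(readlink -f "$dev/device") || continue
        case "$link/" in
            # Match the port as a whole path component so "ata1"
            # does not match "ata10".
            */"$port"/*) echo "/dev/${dev##*/}"; return 0 ;;
        esac
    done
    return 1
}
```

With the drive identified, the `echo 1 > /sys/block/sdX/device/queue_depth` workaround mentioned above can be targeted at the right device only.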
On Aug 25, 2013, at 8:07 PM, Chris Murphy <lists@colorremedies.com> wrote:
> ata10.00: failed command: READ FPDMA QUEUED
>
> Implies it may be an incompatibility between this drive and the controller, possibly disabling NCQ on the drive will fix the problem (set queue depth to 1)

http://serverfault.com/questions/295740/ubuntu-11-04-server-crashing-failed-command-read-fpdma-queued

One person found that replacing cables solved the problem, another a BIOS update. It could also be port specific.

I'm uncertain if there's enough information to know if the data was written incorrectly, or if it's just being read incorrectly and btrfs is complaining. It might be worth putting the drive on another port with a new cable and seeing if the volume will mount ro and, if so, whether you can access your data.

Chris Murphy
On Sun, 25 Aug 2013 20:07:32 -0600, Chris Murphy wrote:
> On Aug 25, 2013, at 4:10 PM, Stuart Pook <slp644161@pook.it> wrote:
>>
>> I emailed them to Stefan Behrens & Chris Murphy. Please let me know if you did not get them (presumably because they are too big).
>
> Observations:
>
> 1. The problems started before the start of the provided log.
>
> 2. smartd reports sdb at 100˚C. The spec sheet for WD2002FAEX is 60˚C. It's possible the raw value isn't actually ˚C so you'll need to look at smartctl -a columns VALUE, WORST and THRESH to determine if it is or has hit the threshold. Seems possible the drives are being cooked.
>
> sdc is ST2000DL004, for which google finds this:
> http://forums.seagate.com/t5/Desktop-HDD-Desktop-SSHD/BEWARE-the-so-called-Samsung-HD204UI/m-p/166856
>
> It also looks to be running hot.
>
> 3. The first ata error seems to be 8b/10b encoding related; it could be a connector problem, a port problem, a drive problem, or a firmware bug. The Emask 0x10 implicates NCQ according to libata.h:
> AC_ERR_NCQ = (1 << 10), /* marker for offending NCQ qc */
>
> 4. Hundreds of these:
> ata10.00: failed command: READ FPDMA QUEUED
>
> Implies it may be an incompatibility between this drive and the controller; possibly disabling NCQ on the drive will fix the problem (set queue depth to 1):
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/550559
> https://ata.wiki.kernel.org/index.php/Libata_FAQ
>
> echo 1 > /sys/block/sdX/device/queue_depth
>
> I can't tell you what /dev/ node applies to ata10.00 because the log is incomplete, so I don't know which drive is giving you a hard time with NCQ. Thing is, if you disable NCQ on just one drive, it'll slow it down compared to the others. I don't know how tolerant btrfs is when devices have different speeds.
>
> 5. Tens of thousands of checksum errors on both dm-11 and dm-12.
>
> 6.
> Many instances of:
> btrfs: unable to fixup (regular) error at logical 53281xxxxxx on dev /dev/dm-11
>
> So kernel messages have been screaming of bus related problems for some time, they were ignored, btrfs did what it could and reported hundreds to thousands of errors in dmesg, but the user space tools didn't warn the user that operations had effectively failed.

Right, I assume that the WD6400AAKS failed to read the 250,000 blocks due to heat or SATA link issues. And in this case the user space tools should have warned and aborted the operations, because there is hope that after cooling down the disk or fixing the SATA link issues, the read errors disappear.

There is the other use case where such unrecoverable read errors are expected. This is the case when a disk is about to die. The configuration option whether to abort or continue on unrecoverable read errors is missing.

The even better solution is to implement an optional verify at the end, or a scrub run, and to only declare the operation as finished when this additional check succeeds.
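Until such a built-in verify exists, the closest approximation is to scrub immediately after the replace and refuse to retire the old disk unless the error count is zero. A sketch of the output parsing; the summary format assumed here is the one shown earlier in this thread ("total bytes scrubbed: 576.46GB with 261883 errors") and may differ in newer btrfs-progs:

```shell
# Extract the error count from a 'btrfs scrub start -B' summary read on
# stdin; prints the number, or nothing if no matching line is present.
scrub_errors() {
    sed -n 's/.*scrubbed: .* with \([0-9][0-9]*\) errors.*/\1/p'
}

# Intended use (illustrative):
#   n=$(btrfs scrub start -Bd /disks/backups | scrub_errors)
#   [ "$n" = 0 ] || echo "do not retire the source disk yet" >&2
```

Gating on the parsed count, rather than eyeballing the summary, would have caught the 261883 uncorrectable errors before the WD-Blue copy was discarded.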