thr3ads.net - Btrfs devel - kernel 3.3.4 damages filesystem (?) [May 2012]

Helmut Hullen

2012-May-07 10:46 UTC

kernel 3.3.4 damages filesystem (?)

Hello,

"never change a running system" ...

For some months I have run btrfs under kernels 3.2.5 and 3.2.9 without
problems.

Yesterday I compiled kernel 3.3.4, and this morning I booted the
machine with this kernel. There may be some ugly problems.

Copying into the btrfs "directory" worked well for some files, and then
I got error messages (I did not copy them; something about "IO error"
under Samba).

Rebooting the machine with kernel 3.2.9 worked, and copying 1 file worked,
but copying more than this one file didn't work. And I can't delete this
file.

That doesn't please me - copying more than 4 TBytes wastes time and
money.

=========== configuration ================
/dev/sdc1 on /srv/MM type btrfs (rw,noatime)

/dev/sdc: SAMSUNG HD204UI: 25 °C
/dev/sdf: WDC WD30EZRX-00MMMB0: 30 °C
/dev/sdi: WDC WD30EZRX-00MMMB0: 29 °C

Data, RAID0: total=5.29TB, used=4.29TB
System, RAID1: total=8.00MB, used=352.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=149.00GB, used=5.00GB

Label: 'MMedia'  uuid: 9adfdc84-0fbe-431b-bcb1-cabb6a915e91
	Total devices 3 FS bytes used 4.29TB
	devid    3 size 2.73TB used 1.98TB path /dev/sdi1
	devid    2 size 2.73TB used 1.94TB path /dev/sdf1
	devid    1 size 1.82TB used 1.63TB path /dev/sdc1

Btrfs Btrfs v0.19

=================== boot messages, kernel related =============
[boot with kernel 3.3.4]
May  7 06:55:26 Arktur kernel: ata5: exception Emask 0x10 SAct 0x0 SErr 0x10000
action 0xe frozen
May  7 06:55:26 Arktur kernel: ata5: SError: { PHYRdyChg }
May  7 06:55:26 Arktur kernel: ata5: hard resetting link
May  7 06:55:31 Arktur kernel: ata5: COMRESET failed (errno=-19)
May  7 06:55:31 Arktur kernel: ata5: reset failed (errno=-19), retrying in 6
secs
May  7 06:55:36 Arktur kernel: ata5: hard resetting link
May  7 06:55:38 Arktur kernel: ata5: COMRESET failed (errno=-19)
May  7 06:55:38 Arktur kernel: ata5: reset failed (errno=-19), retrying in 9
secs
May  7 06:55:46 Arktur kernel: ata5: hard resetting link
May  7 06:55:47 Arktur kernel: ata5: COMRESET failed (errno=-19)
May  7 06:55:47 Arktur kernel: ata5: reset failed (errno=-19), retrying in 34
secs
May  7 06:56:21 Arktur kernel: ata5: hard resetting link
May  7 06:56:22 Arktur kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl
310)
May  7 06:56:22 Arktur kernel: ata5.00: configured for UDMA/100
May  7 06:56:22 Arktur kernel: ata5: EH complete
May  7 07:12:07 Arktur kernel: ata5.00: exception Emask 0x10 SAct 0x0 SErr
0x10000 action 0xe frozen
May  7 07:12:07 Arktur kernel: ata5: SError: { PHYRdyChg }
May  7 07:12:07 Arktur kernel: ata5.00: failed command: WRITE DMA EXT
May  7 07:12:07 Arktur kernel: ata5.00: cmd 35/00:00:00:62:50/00:04:5e:00:00/e0
tag 0 dma 524288 out
May  7 07:12:07 Arktur kernel:          res d8/d8:d8:d8:d8:d8/d8:d8:d8:d8:d8/d8
Emask 0x12 (ATA bus error)
May  7 07:12:07 Arktur kernel: ata5.00: status: { Busy }
May  7 07:12:07 Arktur kernel: ata5.00: error: { ICRC UNC IDNF }
May  7 07:12:07 Arktur kernel: ata5: hard resetting link
May  7 07:12:13 Arktur kernel: ata5: link is slow to respond, please be patient
(ready=-19)
May  7 07:12:15 Arktur kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl
310)
May  7 07:12:15 Arktur kernel: ata5.00: failed to IDENTIFY (I/O error,
err_mask=0x100)
May  7 07:12:15 Arktur kernel: ata5.00: revalidation failed (errno=-5)
May  7 07:12:20 Arktur kernel: ata5: hard resetting link
May  7 07:12:20 Arktur kernel: ata5: COMRESET failed (errno=-19)
May  7 07:12:20 Arktur kernel: ata5: reset failed (errno=-19), retrying in 10
secs
May  7 07:12:30 Arktur kernel: ata5: hard resetting link
May  7 07:12:30 Arktur kernel: ata5: COMRESET failed (errno=-19)
May  7 07:12:30 Arktur kernel: ata5: reset failed (errno=-19), retrying in 10
secs
May  7 07:12:40 Arktur kernel: ata5: hard resetting link
May  7 07:12:42 Arktur kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl
310)
May  7 07:12:43 Arktur kernel: ata5.00: configured for UDMA/100
May  7 07:12:43 Arktur kernel: ata5: EH complete
May  7 07:12:43 Arktur kernel: ata5.00: exception Emask 0x10 SAct 0x0 SErr
0x10000 action 0xe frozen
May  7 07:12:43 Arktur kernel: ata5: SError: { PHYRdyChg }
May  7 07:12:43 Arktur kernel: ata5.00: failed command: WRITE DMA EXT
May  7 07:12:43 Arktur kernel: ata5.00: cmd 35/00:00:00:72:50/00:04:5e:00:00/e0
tag 0 dma 524288 out
May  7 07:12:43 Arktur kernel:          res d0/d0:d0:d0:d0:d0/d0:d0:d0:d0:d0/d0
Emask 0x12 (ATA bus error)
May  7 07:12:43 Arktur kernel: ata5.00: status: { Busy }
May  7 07:12:43 Arktur kernel: ata5.00: error: { ICRC UNC IDNF }
May  7 07:12:43 Arktur kernel: ata5: hard resetting link
May  7 07:12:49 Arktur kernel: ata5: link is slow to respond, please be patient
(ready=-19)
May  7 07:12:50 Arktur kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl
310)
May  7 07:12:51 Arktur kernel: ata5.00: configured for UDMA/100
May  7 07:12:51 Arktur kernel: ata5: EH complete
May  7 07:12:51 Arktur kernel: ata5: exception Emask 0x10 SAct 0x0 SErr 0x10000
action 0xe frozen
May  7 07:12:51 Arktur kernel: ata5: SError: { PHYRdyChg }
May  7 07:12:51 Arktur kernel: ata5: hard resetting link
May  7 07:12:54 Arktur kernel: ata5: COMRESET failed (errno=-19)
May  7 07:12:54 Arktur kernel: ata5: reset failed (errno=-19), retrying in 7
secs
May  7 07:13:01 Arktur kernel: ata5: hard resetting link
May  7 07:13:04 Arktur kernel: ata5: COMRESET failed (errno=-19)
May  7 07:13:04 Arktur kernel: ata5: reset failed (errno=-19), retrying in 7
secs
May  7 07:13:11 Arktur kernel: ata5: hard resetting link
May  7 07:13:14 Arktur kernel: ata5: COMRESET failed (errno=-19)
May  7 07:13:14 Arktur kernel: ata5: reset failed (errno=-19), retrying in 33
secs
May  7 07:13:46 Arktur kernel: ata5: hard resetting link
May  7 07:13:47 Arktur kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl
310)
May  7 07:13:47 Arktur kernel: ata5.00: failed to read native max address
(err_mask=0x100)
May  7 07:13:47 Arktur kernel: ata5.00: HPA support seems broken, skipping HPA
handling
May  7 07:13:47 Arktur kernel: ata5.00: revalidation failed (errno=-5)
May  7 07:13:52 Arktur kernel: ata5: hard resetting link
May  7 07:13:53 Arktur kernel: ata5: COMRESET failed (errno=-19)
May  7 07:13:53 Arktur kernel: ata5: reset failed (errno=-19), retrying in 9
secs
May  7 07:14:02 Arktur kernel: ata5: hard resetting link
May  7 07:14:05 Arktur kernel: ata5: COMRESET failed (errno=-19)
May  7 07:14:05 Arktur kernel: ata5: reset failed (errno=-19), retrying in 8
secs
May  7 07:14:12 Arktur kernel: ata5: hard resetting link
May  7 07:14:14 Arktur kernel: ata5: COMRESET failed (errno=-19)
May  7 07:14:14 Arktur kernel: ata5: reset failed (errno=-19), retrying in 33
secs
May  7 07:14:47 Arktur kernel: ata5: hard resetting link
May  7 07:14:47 Arktur kernel: ata5: COMRESET failed (errno=-19)
May  7 07:14:47 Arktur kernel: ata5: reset failed, giving up
May  7 07:14:47 Arktur kernel: ata5.00: disabled
May  7 07:14:47 Arktur kernel: ata5: exception Emask 0x10 SAct 0x0 SErr 0x10000
action 0xe frozen t4
May  7 07:14:47 Arktur kernel: ata5: SError: { PHYRdyChg }
May  7 07:14:47 Arktur kernel: ata5: hard resetting link
May  7 07:14:47 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
May  7 07:14:47 Arktur kernel: sd 5:0:0:0: [sdf] killing request
May  7 07:14:47 Arktur kernel: sd 5:0:0:0: [sdf] Unhandled error code
May  7 07:14:47 Arktur kernel: sd 5:0:0:0: [sdf]  Result: hostbyte=0x01
driverbyte=0x00
May  7 07:14:47 Arktur kernel: sd 5:0:0:0: [sdf] CDB: cdb[0]=0x28: 28 00 d0 d1
07 20 00 00 08 00
May  7 07:14:47 Arktur kernel: end_request: I/O error, dev sdf, sector
3503359776
May  7 07:14:48 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
May  7 07:14:48 Arktur kernel: end_request: I/O error, dev sdf, sector 0
May  7 07:14:48 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
May  7 07:14:49 Arktur kernel: lost page write due to I/O error on sdf1
May  7 07:14:49 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
May  7 07:14:49 Arktur kernel: lost page write due to I/O error on sdf1
May  7 07:14:49 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
May  7 07:14:49 Arktur kernel: lost page write due to I/O error on sdf1
May  7 07:14:50 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
May  7 07:14:54 Arktur kernel: ata5: link is slow to respond, please be patient
(ready=-19)
May  7 07:14:57 Arktur kernel: ata5: COMRESET failed (errno=-16)
May  7 07:14:57 Arktur kernel: ata5: hard resetting link
May  7 07:15:01 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
May  7 07:15:03 Arktur kernel: ata5: link is slow to respond, please be patient
(ready=-19)
May  7 07:15:07 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
May  7 07:15:07 Arktur kernel: ata5: COMRESET failed (errno=-19)
May  7 07:15:07 Arktur kernel: ata5: reset failed (errno=-19), retrying in 1
secs
May  7 07:15:07 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
May  7 07:15:07 Arktur kernel: ata5: hard resetting link
May  7 07:15:07 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
May  7 07:15:12 Arktur kernel: ata5: COMRESET failed (errno=-19)
May  7 07:15:12 Arktur kernel: ata5: reset failed (errno=-19), retrying in 31
secs
May  7 07:15:19 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
May  7 07:15:19 Arktur kernel: end_request: I/O error, dev sdf, sector 0
May  7 07:15:19 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
May  7 07:15:19 Arktur kernel: lost page write due to I/O error on sdf1
May  7 07:15:19 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
May  7 07:15:19 Arktur kernel: lost page write due to I/O error on sdf1
May  7 07:15:19 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
May  7 07:15:19 Arktur kernel: lost page write due to I/O error on sdf1
May  7 07:15:22 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
May  7 07:15:42 Arktur kernel: ata5: hard resetting link
May  7 07:15:44 Arktur kernel: ata5: COMRESET failed (errno=-19)
May  7 07:15:44 Arktur kernel: ata5: reset failed, giving up
May  7 07:15:44 Arktur kernel: ata5: exception Emask 0x10 SAct 0x0 SErr 0x10000
action 0xe frozen t3
May  7 07:15:44 Arktur kernel: ata5: SError: { PHYRdyChg }
May  7 07:15:44 Arktur kernel: ata5: hard resetting link
May  7 07:15:44 Arktur kernel: ata5: COMRESET failed (errno=-19)
May  7 07:15:44 Arktur kernel: ata5: reset failed (errno=-19), retrying in 10
secs
May  7 07:15:49 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
May  7 07:15:50 Arktur kernel: end_request: I/O error, dev sdf, sector 0
May  7 07:15:50 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
May  7 07:15:50 Arktur kernel: lost page write due to I/O error on sdf1
May  7 07:15:50 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
May  7 07:15:50 Arktur kernel: lost page write due to I/O error on sdf1
May  7 07:15:50 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
May  7 07:15:50 Arktur kernel: lost page write due to I/O error on sdf1
May  7 07:15:54 Arktur kernel: ata5: hard resetting link
May  7 07:15:55 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
May  7 07:15:59 Arktur kernel: ata5: link is slow to respond, please be patient
(ready=-19)
May  7 07:16:04 Arktur kernel: ata5: COMRESET failed (errno=-16)
May  7 07:16:04 Arktur kernel: ata5: hard resetting link
May  7 07:16:05 Arktur kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl
310)
May  7 07:16:05 Arktur kernel: ata5.00: ATA-8: WDC WD30EZRX-00MMMB0, 80.00A80,
max UDMA/133
May  7 07:16:05 Arktur kernel: ata5.00: 5860533168 sectors, multi 0: LBA48 NCQ
(depth 0/32)
May  7 07:16:05 Arktur kernel: ata5.00: configured for UDMA/100
May  7 07:16:05 Arktur kernel: ata5: EH complete
May  7 07:16:05 Arktur kernel: ata5.00: detaching (SCSI 5:0:0:0)
May  7 07:16:05 Arktur kernel: sd 5:0:0:0: [sdf] Synchronizing SCSI cache
May  7 07:16:20 Arktur kernel: end_request: I/O error, dev sdf, sector 0
May  7 07:16:20 Arktur kernel: lost page write due to I/O error on sdf1
May  7 07:22:05 Arktur kernel: sd 5:0:0:0: timing out command, waited 360s
May  7 07:28:05 Arktur kernel: sd 5:0:0:0: timing out command, waited 360s
May  7 07:34:05 Arktur kernel: sd 5:0:0:0: timing out command, waited 360s
May  7 07:34:05 Arktur kernel: sd 5:0:0:0: [sdf]  Result: hostbyte=0x00
driverbyte=0x00
May  7 07:34:05 Arktur kernel: sd 5:0:0:0: [sdf] Stopping disk
May  7 07:37:05 Arktur kernel: sd 5:0:0:0: timing out command, waited 180s
May  7 07:37:05 Arktur kernel: sd 5:0:0:0: [sdf] START_STOP FAILED
May  7 07:37:05 Arktur kernel: sd 5:0:0:0: [sdf]  Result: hostbyte=0x00
driverbyte=0x00
May  7 07:37:06 Arktur kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl
310)
May  7 07:37:07 Arktur kernel: ata5.00: configured for UDMA/100
May  7 07:37:07 Arktur kernel: scsi 5:0:0:0: Direct-Access     ATA      WDC
WD30EZRX-00M 80.0 PQ: 0 ANSI: 5

May  7 10:47:22 Arktur kernel: lost page write due to I/O error on sdf1

May  7 11:11:21 Arktur kernel: lost page write due to I/O error on sdf1
May  7 11:12:07 Arktur kernel: lost page write due to I/O error on sdf1

[reboot with kernel 3.2.9]

May  7 11:15:25 Arktur kernel: ata5.00: configured for UDMA/100
May  7 11:15:25 Arktur kernel: scsi 5:0:0:0: Direct-Access     ATA      WDC
WD30EZRX-00M 80.0 PQ: 0 ANSI: 5
May  7 11:15:26 Arktur kernel: sd 5:0:0:0: [sdf] 5860533168 512-byte logical
blocks: (3.00 TB/2.72 TiB)
May  7 11:15:26 Arktur kernel: sd 5:0:0:0: [sdf] 4096-byte physical blocks
May  7 11:15:26 Arktur kernel: sd 5:0:0:0: [sdf] Write Protect is off
May  7 11:15:26 Arktur kernel: sd 5:0:0:0: [sdf] Mode Sense: 00 3a 00 00
May  7 11:15:26 Arktur kernel: sd 5:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
May  7 11:15:26 Arktur kernel:  sdf: sdf1
May  7 11:15:26 Arktur kernel: sd 5:0:0:0: [sdf] Attached SCSI disk
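
The CDB logged at 07:14:47 above can be decoded by hand to check that it names the same sector as the following "end_request" line. A minimal Python sketch, assuming the usual SCSI READ(10) and READ(16) command layouts; the function name decode_read_cdb is made up for this illustration and is not part of any tool:

```python
def decode_read_cdb(cdb_hex):
    """Return (opcode, lba, block_count) for a SCSI READ(10)/READ(16) CDB.

    cdb_hex is the space-separated hex dump as printed in the kernel log.
    Only opcodes 0x28 and 0x88 are handled; this is a sketch, not a
    general CDB parser.
    """
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    opcode = cdb[0]
    if opcode == 0x28:      # READ(10): 4-byte LBA, 2-byte transfer length
        lba = int.from_bytes(cdb[2:6], "big")
        count = int.from_bytes(cdb[7:9], "big")
    elif opcode == 0x88:    # READ(16): 8-byte LBA, 4-byte transfer length
        lba = int.from_bytes(cdb[2:10], "big")
        count = int.from_bytes(cdb[10:14], "big")
    else:
        raise ValueError("unsupported opcode 0x%02x" % opcode)
    return opcode, lba, count


# CDB taken from the "[sdf] CDB: cdb[0]=0x28" line in the log above
opcode, lba, count = decode_read_cdb("28 00 d0 d1 07 20 00 00 08 00")
print("opcode=0x%02x lba=%d blocks=%d" % (opcode, lba, count))
```

For this CDB the decoded LBA is 3503359776 with a length of 8 blocks, which agrees with the sector number in the "end_request: I/O error" message.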

=============  dmesg output ==============
btrfs: free space inode generation (0) did not match free space cache  
generation (36740)
btrfs: space cache generation (36727) does not match inode (36747)
btrfs: failed to load free space cache for block group 9193084223488
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata5.00: BMDMA2 stat 0x80d0009
ata5.00: failed command: READ DMA EXT
ata5.00: cmd 25/00:80:00:b3:d7/00:00:02:01:00/e0 tag 0 dma 65536 in
         res 51/40:6f:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error)
ata5.00: status: { DRDY ERR }
ata5.00: error: { UNC }
ata5.00: configured for UDMA/100
ata5: EH complete
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata5.00: BMDMA2 stat 0x80d0009
ata5.00: failed command: READ DMA EXT
ata5.00: cmd 25/00:80:00:b3:d7/00:00:02:01:00/e0 tag 0 dma 65536 in
         res 51/40:6f:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error)
ata5.00: status: { DRDY ERR }
ata5.00: error: { UNC }
ata5.00: configured for UDMA/100
ata5: EH complete
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata5.00: BMDMA2 stat 0x80d0009
ata5.00: failed command: READ DMA EXT
ata5.00: cmd 25/00:80:00:b3:d7/00:00:02:01:00/e0 tag 0 dma 65536 in
         res 51/40:6f:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error)
ata5.00: status: { DRDY ERR }
ata5.00: error: { UNC }
ata5.00: configured for UDMA/100
ata5: EH complete
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata5.00: BMDMA2 stat 0x80d0009
ata5.00: failed command: READ DMA EXT
ata5.00: cmd 25/00:80:00:b3:d7/00:00:02:01:00/e0 tag 0 dma 65536 in
         res 51/40:6f:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error)
ata5.00: status: { DRDY ERR }
ata5.00: error: { UNC }
ata5.00: configured for UDMA/100
ata5: EH complete
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata5.00: BMDMA2 stat 0x80d0009
ata5.00: failed command: READ DMA EXT
ata5.00: cmd 25/00:80:00:b3:d7/00:00:02:01:00/e0 tag 0 dma 65536 in
         res 51/40:6f:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error)
ata5.00: status: { DRDY ERR }
ata5.00: error: { UNC }
ata5.00: configured for UDMA/100
ata5: EH complete
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata5.00: BMDMA2 stat 0x80d0009
ata5.00: failed command: READ DMA EXT
ata5.00: cmd 25/00:80:00:b3:d7/00:00:02:01:00/e0 tag 0 dma 65536 in
         res 51/40:6f:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error)
ata5.00: status: { DRDY ERR }
ata5.00: error: { UNC }
ata5.00: configured for UDMA/100
sd 5:0:0:0: [sdf] Unhandled sense code
sd 5:0:0:0: [sdf]  Result: hostbyte=0x00 driverbyte=0x08
sd 5:0:0:0: [sdf]  Sense Key : 0x3 [current] [descriptor]
Descriptor sense data with sense descriptors (in hex):
        72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 01
        02 d7 b3 08
sd 5:0:0:0: [sdf]  ASC=0x11 ASCQ=0x4
sd 5:0:0:0: [sdf] CDB: cdb[0]=0x88: 88 00 00 00 00 01 02 d7 b3 00 00 00 00 80 00
00
end_request: I/O error, dev sdf, sector 4342657800
ata5: EH complete
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata5.00: BMDMA2 stat 0x80d0009
ata5.00: failed command: READ DMA EXT
ata5.00: cmd 25/00:08:08:b3:d7/00:00:02:01:00/e0 tag 0 dma 4096 in
         res 51/40:08:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error)
ata5.00: status: { DRDY ERR }
ata5.00: error: { UNC }
ata5.00: configured for UDMA/100
ata5: EH complete
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata5.00: BMDMA2 stat 0x80d0009
ata5.00: failed command: READ DMA EXT
ata5.00: cmd 25/00:08:08:b3:d7/00:00:02:01:00/e0 tag 0 dma 4096 in
         res 51/40:08:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error)
ata5.00: status: { DRDY ERR }
ata5.00: error: { UNC }
ata5.00: configured for UDMA/100
ata5: EH complete
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata5.00: BMDMA2 stat 0x80d0009
ata5.00: failed command: READ DMA EXT
ata5.00: cmd 25/00:08:08:b3:d7/00:00:02:01:00/e0 tag 0 dma 4096 in
         res 51/40:08:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error)
ata5.00: status: { DRDY ERR }
ata5.00: error: { UNC }
ata5.00: configured for UDMA/100
ata5: EH complete
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata5.00: BMDMA2 stat 0x80d0009
ata5.00: failed command: READ DMA EXT
ata5.00: cmd 25/00:08:08:b3:d7/00:00:02:01:00/e0 tag 0 dma 4096 in
         res 51/40:08:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error)
ata5.00: status: { DRDY ERR }
ata5.00: error: { UNC }
ata5.00: configured for UDMA/100
ata5: EH complete
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata5.00: BMDMA2 stat 0x80d0009
ata5.00: failed command: READ DMA EXT
ata5.00: cmd 25/00:08:08:b3:d7/00:00:02:01:00/e0 tag 0 dma 4096 in
         res 51/40:08:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error)
ata5.00: status: { DRDY ERR }
ata5.00: error: { UNC }
ata5.00: configured for UDMA/100
ata5: EH complete
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata5.00: BMDMA2 stat 0x80d0009
ata5.00: failed command: READ DMA EXT
ata5.00: cmd 25/00:08:08:b3:d7/00:00:02:01:00/e0 tag 0 dma 4096 in
         res 51/40:08:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error)
ata5.00: status: { DRDY ERR }
ata5.00: error: { UNC }
ata5.00: configured for UDMA/100
sd 5:0:0:0: [sdf] Unhandled sense code
sd 5:0:0:0: [sdf]  Result: hostbyte=0x00 driverbyte=0x08
sd 5:0:0:0: [sdf]  Sense Key : 0x3 [current] [descriptor]
Descriptor sense data with sense descriptors (in hex):
        72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 01
        02 d7 b3 08
sd 5:0:0:0: [sdf]  ASC=0x11 ASCQ=0x4
sd 5:0:0:0: [sdf] CDB: cdb[0]=0x88: 88 00 00 00 00 01 02 d7 b3 08 00 00 00 08 00
00
end_request: I/O error, dev sdf, sector 4342657800
ata5: EH complete
btrfs: error reading free space cache
BUG: unable to handle kernel NULL pointer dereference at 00000001
IP: [<c1295c36>] io_ctl_drop_pages+0x26/0x50
*pdpt = 0000000029712001 *pde = 0000000000000000
Oops: 0002 [#1]
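
The repeated "res" lines above all point at one sector. As a rough check, the 48-bit LBA can be rebuilt from the register dump. The sketch below assumes the libata field order status/error:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device; the helper name lba48_from_taskfile is hypothetical:

```python
def lba48_from_taskfile(tf):
    """Extract the 48-bit LBA from a libata "cmd"/"res" register dump.

    tf looks like "51/40:6f:08:b3:d7/00:00:02:01:00/f0". The assumed
    field order is
      status/error:nsect:lbal:lbam:lbah/
      hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
    """
    _, low, high, _ = tf.split("/")
    # Fields 2..4 of each group hold the low and high LBA bytes.
    lbal, lbam, lbah = (int(x, 16) for x in low.split(":")[2:5])
    hob_lbal, hob_lbam, hob_lbah = (int(x, 16) for x in high.split(":")[2:5])
    return (hob_lbah << 40 | hob_lbam << 32 | hob_lbal << 24
            | lbah << 16 | lbam << 8 | lbal)


# "res" value from the READ DMA EXT failures above
print(lba48_from_taskfile("51/40:6f:08:b3:d7/00:00:02:01:00/f0"))
```

The result is 4342657800, the same sector that the "end_request: I/O error, dev sdf" lines report. Applied to the matching "cmd" line, the same helper gives the start of the 128-sector request, 4342657792.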

============== syslogd output (kernel 3.2.9, after the 3.3.4 try) ==============
Message from syslogd@Arktur at Mon May    7  11:21:55 2012
Arktur kernel: Oops: 0002 [#1]
Message from syslogd@Arktur at Mon May    7  11:21:56 2012
es existieren nur
Arktur kernel:   Process flush-btrfs-l  (pid:   51 ti=e9f12000)
ò at Mon May
ò at Mon May
ti=e9f12000 task=f6882a50 task. 11:21:56 2012   .. . 11:21:56 2012   ...

Message from syslogd@Arktur at Mon May    7  11:21:56 2012   ...
Arktur kernel: Code: c3 8d 74 26 00 55 89 e5 56 53 3e 8d 74 26 00 89 c6 e8 2f ff
ff ff 8b 5e 1c 85 db 7e 30 31 db 8d b6 00 00 00 00 8b 46 0c 8b 04 98 <80>
60 01
fe 8b 46 0c 8b 04 98 e8 1b 96 df ff 8b 46 0c 8b 04 98

Message from syslogd@Arktur at Mon May    7  11:21:56 2012   ...
Arktur kernel:   EIP:   [<c1295c36>]   io_ctl_drop_pages+0x26/0x50 SS:ESP
0068:e9fl396

Message from syslogd@Arktur at Mon May 7 11:21:56 2012
Arktur kernel: CR2: 0000000000000001
Arktur:- #

=========================================================
The 3 btrfs disks are connected via a SiI 3114 SATA-PCI-Controller.
Only 1 of the 3 disks seems to be damaged.

=========================================================
Can I repair the system? Or do I have to copy it to a set of other disks?

Best regards!
Helmut
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Fajar A. Nugraha

2012-May-07 10:58 UTC

Re: kernel 3.3.4 damages filesystem (?)

On Mon, May 7, 2012 at 5:46 PM, Helmut Hullen <Hullen@t-online.de> wrote:
> For some months I run btrfs unter kernel 3.2.5 and 3.2.9, without
> problems.
>
> Yesterday I compiled kernel 3.3.4, and this morning I started the
> machine with this kernel. There may be some ugly problems.
> Data, RAID0: total=5.29TB, used=4.29TB

RAID0? Yikes!

> System, RAID1: total=8.00MB, used=352.00KB
> System: total=4.00MB, used=0.00
> Metadata, RAID1: total=149.00GB, used=5.00GB
>
> Label: 'MMedia'  uuid: 9adfdc84-0fbe-431b-bcb1-cabb6a915e91
>        Total devices 3 FS bytes used 4.29TB
>        devid    3 size 2.73TB used 1.98TB path /dev/sdi1
>        devid    2 size 2.73TB used 1.94TB path /dev/sdf1
>        devid    1 size 1.82TB used 1.63TB path /dev/sdc1
>
> May  7 06:55:26 Arktur kernel: ata5: exception Emask 0x10 SAct 0x0 SErr 0x10000 action 0xe frozen
> May  7 06:55:26 Arktur kernel: ata5: SError: { PHYRdyChg }
> May  7 06:55:26 Arktur kernel: ata5: hard resetting link
> May  7 06:55:31 Arktur kernel: ata5: COMRESET failed (errno=-19)
> May  7 06:55:31 Arktur kernel: ata5: reset failed (errno=-19), retrying in 6 secs
> May  7 07:15:19 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
> May  7 07:15:19 Arktur kernel: end_request: I/O error, dev sdf, sector 0
> May  7 07:15:19 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
> May  7 07:15:19 Arktur kernel: lost page write due to I/O error on sdf1

That looks like a bad disk to me, and it shouldn't be related to the
kernel version you use.

Your best chance might be:
- unmount the fs
- get another disk to replace /dev/sdf, and copy the content over with
dd_rescue. ATA resets can be a PITA, so you might be better off
moving the failed disk to a USB external adapter, and doing some creative
combination of plugging and unplugging while selectively skipping bad
sectors manually (by passing "-s" to dd_rescue).
- reboot, with the bad disk unplugged
- (optional) run "btrfs filesystem scrub" (you might need to build
btrfs-progs manually from git source), or simply read the entire fs
(e.g. using tar to /dev/null, or whatever). It should check the
checksum of all files and print out which files are damaged (either in
stdout or syslog).

I don't think there's anything you can do to recover the damaged files
(other than restore from backup), but at least you know which files
are NOT damaged.
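
The "read the entire fs" step above could be sketched in Python as follows. This is only an illustration, assuming that a block failing its checksum surfaces to userspace as a read error (an OSError); the function name find_unreadable_files and the default chunk size are made up for the example:

```python
import os


def find_unreadable_files(root, chunk_size=1 << 20):
    """Read every regular file under root and collect those that fail.

    Returns a list of (path, error_message) tuples. The data itself is
    thrown away; only whether each read succeeds is of interest.
    """
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks, sockets, device nodes, ...
            try:
                with open(path, "rb") as f:
                    while f.read(chunk_size):
                        pass  # discard the data, keep reading to the end
            except OSError as exc:
                failures.append((path, str(exc)))
    return failures


# Possible use on the mount point from this thread (not run here):
#   for path, err in find_unreadable_files("/srv/MM"):
#       print(path, err)
```

Files that do not show up in the returned list were read to the end without an error, which is the "at least you know which files are NOT damaged" part.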

-- 
Fajar

Hugo Mills

2012-May-07 10:59 UTC

Re: kernel 3.3.4 damages filesystem (?)

On Mon, May 07, 2012 at 12:46:00PM +0200, Helmut Hullen wrote:
> Hello,
> 
> "never change a running system" ...
> 
> For some months I run btrfs under kernel 3.2.5 and 3.2.9, without
> problems.
> 
> Yesterday I compiled kernel 3.3.4, and this morning I started the
> machine with this kernel. There may be some ugly problems.
> 
> Copying something into the btrfs "directory" worked well for some files,
> and then I got error messages (I've not copied them, something with "IO
> error" under Samba).
> 
> Rebooting the machine with kernel 3.2.9 worked, copying 1 file worked,
> but copying more than this file didn't work. And I can't delete this
> file.
> 
> That doesn't please me - copying more than 4 TBytes wastes time and
> money.
> 
> =========== configuration ================
> /dev/sdc1 on /srv/MM type btrfs (rw,noatime)
> 
> /dev/sdc: SAMSUNG HD204UI: 25 °C
> /dev/sdf: WDC WD30EZRX-00MMMB0: 30 °C
> /dev/sdi: WDC WD30EZRX-00MMMB0: 29 °C
> 
> Data, RAID0: total=5.29TB, used=4.29TB
> System, RAID1: total=8.00MB, used=352.00KB
> System: total=4.00MB, used=0.00
> Metadata, RAID1: total=149.00GB, used=5.00GB
> 
> Label: 'MMedia'  uuid: 9adfdc84-0fbe-431b-bcb1-cabb6a915e91
> 	Total devices 3 FS bytes used 4.29TB
> 	devid    3 size 2.73TB used 1.98TB path /dev/sdi1
> 	devid    2 size 2.73TB used 1.94TB path /dev/sdf1
> 	devid    1 size 1.82TB used 1.63TB path /dev/sdc1
> 
> Btrfs Btrfs v0.19
> 
> =================== boot messages, kernel related =============
> [boot with kernel 3.3.4]
> May  7 06:55:26 Arktur kernel: ata5: exception Emask 0x10 SAct 0x0 SErr 0x10000 action 0xe frozen
> May  7 06:55:26 Arktur kernel: ata5: SError: { PHYRdyChg }
> May  7 06:55:26 Arktur kernel: ata5: hard resetting link
> May  7 06:55:31 Arktur kernel: ata5: COMRESET failed (errno=-19)
> May  7 06:55:31 Arktur kernel: ata5: reset failed (errno=-19), retrying in 6 secs
> May  7 06:55:36 Arktur kernel: ata5: hard resetting link
> May  7 06:55:38 Arktur kernel: ata5: COMRESET failed (errno=-19)
> May  7 06:55:38 Arktur kernel: ata5: reset failed (errno=-19), retrying in 9 secs
> May  7 06:55:46 Arktur kernel: ata5: hard resetting link
> May  7 06:55:47 Arktur kernel: ata5: COMRESET failed (errno=-19)
> May  7 06:55:47 Arktur kernel: ata5: reset failed (errno=-19), retrying in 34 secs
> May  7 06:56:21 Arktur kernel: ata5: hard resetting link
> May  7 06:56:22 Arktur kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> May  7 06:56:22 Arktur kernel: ata5.00: configured for UDMA/100
> May  7 06:56:22 Arktur kernel: ata5: EH complete
> May  7 07:12:07 Arktur kernel: ata5.00: exception Emask 0x10 SAct 0x0 SErr 0x10000 action 0xe frozen
> May  7 07:12:07 Arktur kernel: ata5: SError: { PHYRdyChg }
> May  7 07:12:07 Arktur kernel: ata5.00: failed command: WRITE DMA EXT
> May  7 07:12:07 Arktur kernel: ata5.00: cmd 35/00:00:00:62:50/00:04:5e:00:00/e0 tag 0 dma 524288 out
> May  7 07:12:07 Arktur kernel:          res d8/d8:d8:d8:d8:d8/d8:d8:d8:d8:d8/d8 Emask 0x12 (ATA bus error)
> May  7 07:12:07 Arktur kernel: ata5.00: status: { Busy }
> May  7 07:12:07 Arktur kernel: ata5.00: error: { ICRC UNC IDNF }

   This is a hardware error. You have a device that's either dead or
dying. (Given the number of errors, probably already dead).

> May  7 07:12:07 Arktur kernel: ata5: hard resetting link
> =========================================================
> The 3 btrfs disks are connected via a SiI 3114 SATA-PCI-Controller.
> Only 1 of the 3 disks seems to be damaged.
> 
> =========================================================
> Can I repair the system? Or have I to copy it to a set of other disks?

   If you have RAID-1 or RAID-10 on both data and metadata, then you
_should_ in theory just be able to remove the dead disk (physically),
then btrfs dev add a new one, btrfs dev del missing, and balance.
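
Spelled out, the sequence above might look like the following dry-run sketch, which only builds and prints the commands and executes nothing. The device name /dev/sdX1 is a placeholder, the helper name is made up, and mounting with "-o degraded" is an assumption about how the filesystem would be mounted while a device is missing. Older btrfs-progs may want "btrfs filesystem balance" in place of "btrfs balance start". None of this helps when the data profile is RAID0, as it is in this thread.

```python
def replace_dead_disk_plan(surviving_dev, new_dev, mountpoint):
    """Return the command sequence as argv lists; nothing is executed."""
    return [
        # Assumption: mount with one device absent
        ["mount", "-o", "degraded", surviving_dev, mountpoint],
        # "btrfs dev add a new one"
        ["btrfs", "device", "add", new_dev, mountpoint],
        # "btrfs dev del missing"
        ["btrfs", "device", "delete", "missing", mountpoint],
        # "and balance"
        ["btrfs", "balance", "start", mountpoint],
    ]


# /dev/sdc1 and /srv/MM come from the original report; /dev/sdX1 is hypothetical.
for argv in replace_dead_disk_plan("/dev/sdc1", "/dev/sdX1", "/srv/MM"):
    print(" ".join(argv))
```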

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==  PGP
key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
                        --- argc, argv, argh! ---

Helmut Hullen

2012-May-07 12:06 UTC

Re: kernel 3.3.4 damages filesystem (?)

Hello, Fajar,

You wrote on 07.05.12:

>> For some months I run btrfs under kernel 3.2.5 and 3.2.9, without
>> problems.
>>
>> Yesterday I compiled kernel 3.3.4, and this morning I started the
>> machine with this kernel. There may be some ugly problems.
>> Data, RAID0: total=5.29TB, used=4.29TB

> Raid0? Yaiks!

Why not?
Do you know the price of one 3-TByte disk?
The data isn't irreproducible, in this case.

[...]
>> May  7 07:15:19 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
>> May  7 07:15:19 Arktur kernel: end_request: I/O error, dev sdf, sector 0
>> May  7 07:15:19 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
>> May  7 07:15:19 Arktur kernel: lost page write due to I/O error on sdf1

> That looks like a bad disk to me, and it shouldn't be related to the
> kernel version you use.

But why does it happen just when I change the kernel?
(Yes - I know: Murphy works reliably ...)

> Your best chance might be:
> - unmount the fs
> - get another disk to replace /dev/sdf, copy the content over with
> dd_rescue. Ata resets can be a PITA, so you might be better off
> moving the failed disk to a usb external adapter, and do some
> creative combination of plug-unplug and selectively skip bad sectors
> manually (by passing "-s" to dd_rescue).

Hmmm - I'll give it a try ...

> - reboot, with the bad disk unplugged
> - (optional) run "btrfs filesystem scrub" (you might need to build
> btrfs-progs manually from git source).

The last time I tried this command (some months ago), it produced a
completely unusable system of disks/partitions ...

> or simply read the entire fs
> (e.g. using tar to /dev/null, or whatever). It should check the
> checksum of all files and print out which files are damaged (either
> in stdout or syslog).

And that's the other thing to try - I already had to use it for another
disk (also WD, but only 2 TByte - I could watch how it died ...).

Best regards!
Helmut

Helmut Hullen

2012-May-07 12:15 UTC

Re: kernel 3.3.4 damages filesystem (?)

Hello, Hugo,

You wrote on 07.05.12:

>> Yesterday I compiled kernel 3.3.4, and this morning I started the
>> machine with this kernel. There may be some ugly problems.
>>
>> Copying something into the btrfs "directory" worked well for some
>> files, and then I got error messages (I've not copied them,
>> something with "IO error" under Samba).
[...]
>> Data, RAID0: total=5.29TB, used=4.29TB
>> System, RAID1: total=8.00MB, used=352.00KB
>> System: total=4.00MB, used=0.00
>> Metadata, RAID1: total=149.00GB, used=5.00GB
>>
>> Label: 'MMedia'  uuid: 9adfdc84-0fbe-431b-bcb1-cabb6a915e91
>> 	Total devices 3 FS bytes used 4.29TB
>> 	devid    3 size 2.73TB used 1.98TB path /dev/sdi1
>> 	devid    2 size 2.73TB used 1.94TB path /dev/sdf1
>> 	devid    1 size 1.82TB used 1.63TB path /dev/sdc1
>>
>> Btrfs Btrfs v0.19
>>
>> =================== boot messages, kernel related =============
>> [boot with kernel 3.3.4]
>> May  7 06:55:26 Arktur kernel: ata5: exception Emask 0x10 SAct 0x0 SErr 0x10000 action 0xe frozen
>> May  7 06:55:26 Arktur kernel: ata5: SError: { PHYRdyChg }
>> May  7 06:55:26 Arktur kernel: ata5: hard resetting link

>    This is a hardware error. You have a device that's either dead or
> dying. (Given the number of errors, probably already dead).

It seems to be undecided which state it is in ...

>> Can I repair the system? Or have I to copy it to a set of other
>> disks?

>    If you have RAID-1 or RAID-10 on both data and metadata, then you
> _should_ in theory just be able to remove the dead disk (physically),
> then btrfs dev add a new one, btrfs dev del missing, and balance.

I haven't - I have a kind of copy/backup in the neighbourhood.

Best regards!
Helmut
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Liu Bo

2012-May-07 12:53 UTC

Re: kernel 3.3.4 damages filesystem (?)

On 05/07/2012 06:46 PM, Helmut Hullen wrote:
> btrfs: error reading free space cache
> BUG: unable to handle kernel NULL pointer dereference at 00000001
> IP: [<c1295c36>] io_ctl_drop_pages+0x26/0x50
> *pdpt = 0000000029712001 *pde = 0000000000000000
> Oops: 0002 [#1]


Could you please try this and show us the results?

diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 202008e..ae514ad 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -296,7 +296,9 @@ static void io_ctl_free(struct io_ctl *io_ctl)
 static void io_ctl_unmap_page(struct io_ctl *io_ctl)
 {
 	if (io_ctl->cur) {
-		kunmap(io_ctl->page);
+		WARN_ON(!io_ctl->page);
+		if (io_ctl->page)
+			kunmap(io_ctl->page);
 		io_ctl->cur = NULL;
 		io_ctl->orig = NULL;
 	}

Helmut Hullen

2012-May-07 13:34 UTC

Re: kernel 3.3.4 damages filesystem (?)

Hello, Hugo,

You wrote on 07.05.12:
>> =================== boot messages, kernel related =============
>>
>> [boot with kernel 3.3.4]
>> May  7 06:55:26 Arktur kernel: ata5: exception Emask 0x10 SAct 0x0
>> SErr 0x10000 action 0xe frozen
>> May  7 06:55:26 Arktur kernel: ata5: SError: { PHYRdyChg }
>> May  7 06:55:26 Arktur kernel: ata5: hard resetting link
[...]
>    This is a hardware error. You have a device that's either dead or
> dying. (Given the number of errors, probably already dead).
It's dead - R.I.P.

I've tried it with a SATA-USB-adapter - that adapter produces dmesg
lines when connecting or disconnecting.

And this particular drive doesn't report anything now. Shit.

Best regards!
Helmut

Hugo Mills

2012-May-07 14:05 UTC

Re: kernel 3.3.4 damages filesystem (?)

On Mon, May 07, 2012 at 03:34:00PM +0200, Helmut Hullen wrote:
> Hello, Hugo,
>
> You wrote on 07.05.12:
>
> >> =================== boot messages, kernel related =============
> >>
> >> [boot with kernel 3.3.4]
> >> May  7 06:55:26 Arktur kernel: ata5: exception Emask 0x10 SAct 0x0
> >> SErr 0x10000 action 0xe frozen
> >> May  7 06:55:26 Arktur kernel: ata5: SError: { PHYRdyChg }
> >> May  7 06:55:26 Arktur kernel: ata5: hard resetting link
>
> [...]
>
> >    This is a hardware error. You have a device that's either dead or
> > dying. (Given the number of errors, probably already dead).
>
> It's dead - R.I.P.
>
> I've tried it with a SATA-USB-adapter - that adapter produces dmesg
> lines when connecting or disconnecting.
>
> And this particular drive doesn't report anything now. Shit.
   Sorry to be the bearer of bad news. I don't think we can point the
finger at btrfs here.

   It looks like you've lost most of your data -- losing a RAID-0
stripe across the whole FS isn't likely to have left much of it
intact. If you've got the space (or the money to get it), mkfs.btrfs
-m raid1 -d raid1 would have saved you here.

[ Incidentally, thinking about it, the failure coming at a kernel
   upgrade could well be down to the additional stress of the
   power-down/reboot finally pushing a bad drive over the edge. ]

   In sympathy,
   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
    --- But somewhere along the line, it seems / That pimp became ---    
                       cool,  and punk mainstream.

Helmut Hullen

2012-May-07 16:36 UTC

Re: kernel 3.3.4 damages filesystem (?)

Hello, Hugo,

You wrote on 07.05.12:
>> It's dead - R.I.P.
>    Sorry to be the bearer of bad news. I don't think we can point the
> finger at btrfs here.
a) you know what to do with the bearer?
b) I like such errors - completely independent, but simultaneous.
>    It looks like you've lost most of your data -- losing a RAID-0
> stripe across the whole FS isn't likely to have left much of it
> intact.
I'm just going back to ext4 - then one broken disk doesn't disturb the
contents of the other disks.

The data is not very valuable - DVB video mpegs. Most of the files are
repeated over and over.
> If you've got the space (or the money to get it), mkfs.btrfs
> -m raid1 -d raid1 would have saved you here.
About 400 ... 500 Euro for backing up videos? Not necessary.

(No: I don't count the minutes and hours working with the system ...)
> [ Incidentally, thinking about it, the failure coming at a kernel
>    upgrade could well be down to the additional stress of the
>    power-down/reboot finally pushing a bad drive over the edge. ]
Just now it's again an "open system"; I had to wobble the cables too ...

Maybe the SATA-PCI-controller needs to be replaced too ...

Best regards!
Helmut

Felix Blanke

2012-May-07 17:13 UTC

Re: kernel 3.3.4 damages filesystem (?)

On 5/7/12 6:36 PM, Helmut Hullen wrote:
> Hello, Hugo,
>
> You wrote on 07.05.12:
>
>>> It's dead - R.I.P.
>
>>     Sorry to be the bearer of bad news. I don't think we can point the
>> finger at btrfs here.
>
> a) you know what to do with the bearer?
> b) I like such errors - completely independent, but simultaneous.
>
>>     It looks like you've lost most of your data -- losing a RAID-0
>> stripe across the whole FS isn't likely to have left much of it
>> intact.
>
> I'm just going back to ext4 - then one broken disk doesn't disturb the
> contents of the other disks.
?! If you use raid0, one broken disk will always disturb the contents of
the other disks; that is what raid0 does, no matter what filesystem you
use. You could easily use btrfs in the "normal" or raid1 mode. Btrfs is
still in development and you can often blame it for a corrupt
filesystem, but in this case it's simply "raid0 -> 1 disk dies -> data
is gone".
>
> The data is not very valuable - DVB video mpegs. Most of the files are
> repeated over and over.
>
>> If you've got the space (or the money to get it), mkfs.btrfs
>> -m raid1 -d raid1 would have saved you here.
>
> About 400 ... 500 Euro for backing up videos? Not necessary.
>
> (No: I don't count the minutes and hours working with the system ...)
>
>> [ Incidentally, thinking about it, the failure coming at a kernel
>>     upgrade could well be down to the additional stress of the
>>     power-down/reboot finally pushing a bad drive over the edge. ]
>
> Just now it's again an "open system"; I had to wobble the cables too ...
>
> Maybe the SATA-PCI-controller needs to be replaced too ...
>
> Best regards!
> Helmut

Helmut Hullen

2012-May-07 17:52 UTC

Re: kernel 3.3.4 damages filesystem (?)

Hello, Felix,

You wrote on 07.05.12:
>> I'm just going back to ext4 - then one broken disk doesn't disturb
>> the contents of the other disks.
> ?! If you use raid0 one broken disk will always disturb the contents
> of the other disks, that is what raid0 does, no matter what
> filesystem you use.
Yes - I know. But btrfs promises that I can add bigger disks and delete
smaller disks "on the fly". For something like a video collection, which
will keep growing, that is an interesting feature. And such a (big)
collection doesn't need a "grandfather-father-son" backup; it's not
critical data.

With a file system like ext2/3/4 I can work with several directories
which are mounted together, but (as said before) one broken disk doesn't
disturb the others.

Best regards!
Helmut

Hugo Mills

2012-May-07 18:00 UTC

Re: kernel 3.3.4 damages filesystem (?)

On Mon, May 07, 2012 at 07:52:00PM +0200, Helmut Hullen wrote:
> Hello, Felix,
>
> You wrote on 07.05.12:
>
> >> I'm just going back to ext4 - then one broken disk doesn't disturb
> >> the contents of the other disks.
>
> > ?! If you use raid0 one broken disk will always disturb the contents
> > of the other disks, that is what raid0 does, no matter what
> > filesystem you use.
>
> Yes - I know. But btrfs promises that I can add bigger disks and delete
> smaller disks "on the fly". For something like a video collection, which
> will keep growing, that is an interesting feature. And such a (big)
> collection doesn't need a "grandfather-father-son" backup; it's not
> critical data.
>
> With a file system like ext2/3/4 I can work with several directories
> which are mounted together, but (as said before) one broken disk doesn't
> disturb the others.
   mkfs.btrfs -m raid1 -d single should give you that.

   There may be a kernel patch you need to stop it doing the silly
single → raid0 "upgrade" automatically, as well.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
                       ---   __(_''>  Squeak!   ---

Helmut Hullen

2012-May-07 18:25 UTC

Re: kernel 3.3.4 damages filesystem (?)

Hello, Hugo,

You wrote on 07.05.12:
>> With a file system like ext2/3/4 I can work with several directories
>> which are mounted together, but (as said before) one broken disk
>> doesn't disturb the others.
>    mkfs.btrfs -m raid1 -d single should give you that.
What's the difference from

     mkfs.btrfs -m raid1 -d raid0

(which is what I used last time)?

Best regards!
Helmut

Hugo Mills

2012-May-07 18:44 UTC

Re: kernel 3.3.4 damages filesystem (?)

On Mon, May 07, 2012 at 08:25:00PM +0200, Helmut Hullen wrote:
> Hello, Hugo,
>
> You wrote on 07.05.12:
>
> >> With a file system like ext2/3/4 I can work with several directories
> >> which are mounted together, but (as said before) one broken disk
> >> doesn't disturb the others.
>
> >    mkfs.btrfs -m raid1 -d single should give you that.
>
> What's the difference from
>
>      mkfs.btrfs -m raid1 -d raid0
 - RAID-0 stripes each piece of data across all the disks.
 - single puts data on one disk at a time.

   So, on three disks (each disk running horizontally), the FS will
allocate block groups this way for RAID-0:

Disk 1:   | A1 | B1 | C1 |...
Disk 2:   | A2 | B2 | C2 |...
Disk 3:   | A3 | B3 | C3 |...

where each chunk, e.g. A2, is 1G in size. Then data is striped across
all of the An chunks (a single block group of size 3G) in 64k
sub-stripes, until block group A is filled up, and then it'll move on
to another block group.

   For "single" allocation on the same disks, you will instead get:

Disk 1:  | A  | D  | G  |...
Disk 2:  | B  | E  | H  |...
Disk 3:  | C  | F  | I  |...

where, again, each chunk is 1G in size. Data written to the FS will
live in one of the chunks, overflowing to some other chunk when
there's no more space.

   With large files, you've still got a chance that (some of) the data
from the file will be on more than one disk, but it's a much much
better situation than you'd have with RAID-0.

   Of course, you still need RAID-1 metadata, so that when a disk does
go bang, you still have all the filesystem structures you need to read
the remaining data. :)
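
   The difference between the two layouts can be illustrated with a
small toy model. This is only a sketch under simplifying assumptions,
not the real btrfs allocator: it treats the data area as one linear
address space, uses the 1G chunk and 64k sub-stripe sizes mentioned
above, and hands chunks out round-robin as in the diagrams. The
function names are made up for this illustration.

```python
# Toy model of the RAID-0 and "single" layouts sketched above.
# Assumption: a linear address space, round-robin placement, 0-based disks.
STRIPE_KB = 64                 # RAID-0 sub-stripe size
CHUNK_KB = 1024 * 1024         # 1G chunk, expressed in KB


def raid0_disks_for_file(offset_kb, size_kb, n_disks):
    """Disks that hold at least one 64k sub-stripe of the file."""
    first = offset_kb // STRIPE_KB
    last = (offset_kb + size_kb - 1) // STRIPE_KB
    return {stripe % n_disks for stripe in range(first, last + 1)}


def single_disks_for_file(offset_kb, size_kb, n_disks):
    """Disks that hold at least one 1G chunk touched by the file."""
    first = offset_kb // CHUNK_KB
    last = (offset_kb + size_kb - 1) // CHUNK_KB
    return {chunk % n_disks for chunk in range(first, last + 1)}


def survives(disks, dead_disk):
    """A file is fully readable only if no part of it was on the dead disk."""
    return dead_disk not in disks


if __name__ == "__main__":
    video_kb = 700 * 1024      # a 700M recording written at offset 0
    print("raid0 :", sorted(raid0_disks_for_file(0, video_kb, 3)))
    print("single:", sorted(single_disks_for_file(0, video_kb, 3)))
```

   In this model any file larger than two sub-stripes touches every
disk under RAID-0, while under "single" a file is hit only if one of
its chunks happened to live on the failed disk.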

   In fact, this is probably a good argument for having the option to
put back the old allocator algorithm, which would have ensured that
the first disk would fill up completely first before it touched the
next one...

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
          --- ...  one ping(1) to rule them all, and in the ---          
                         darkness bind(2) them.

Daniel Lee

2012-May-07 19:30 UTC

Re: kernel 3.3.4 damages filesystem (?)

On 05/07/2012 10:52 AM, Helmut Hullen wrote:
> Hello, Felix,
>
> You wrote on 07.05.12:
>
>>> I'm just going back to ext4 - then one broken disk doesn't disturb
>>> the contents of the other disks.
>
>> ?! If you use raid0 one broken disk will always disturb the contents
>> of the other disks, that is what raid0 does, no matter what
>> filesystem you use.
>
> Yes - I know. But btrfs promises that I can add bigger disks and delete
> smaller disks "on the fly". For something like a video collection, which
> will keep growing, that is an interesting feature. And such a (big)
> collection doesn't need a "grandfather-father-son" backup; it's not
> critical data.
>
> With a file system like ext2/3/4 I can work with several directories
> which are mounted together, but (as said before) one broken disk doesn't
> disturb the others.
>
How can you do that with ext2/3/4? If you mean create several different
filesystems and mount them separately then that's very different from
your current situation. What you did in this case is comparable to
creating a raid0 array out of your disks. I don't see how an ext
filesystem is going to work any better if one of the disks drops out
than with a btrfs filesystem. Using -d single isn't going to be of much
use in this case either, because that's like spanning an LVM volume over
several disks and then putting ext over that; it's pretty
nondeterministic how much you'll actually save should a large chunk of
the filesystem suddenly disappear.

It sounds like what you're thinking of is creating several separate ext
filesystems and then just mounting them separately. There's nothing
inherently special about doing this with ext; you can do the same
thing with btrfs and it would amount to about the same level of
protection (potentially more if you consider [meta]data checksums
important but potentially less if you feel that ext is more robust for
whatever reason).

If you want to survive losing a single disk without the (absolute) fear
of the whole filesystem breaking you have to have some sort of
redundancy either by separating filesystems or using some version of
raid other than raid0. I suppose the volume management of btrfs is sort
of confusing at the moment but when btrfs promises you can remove disks
"on the fly" it doesn't mean you can just unplug disks from a raid0
without telling btrfs to put that data elsewhere first.

Helmut Hullen

2012-May-07 20:21 UTC

Re: kernel 3.3.4 damages filesystem (?)

Hello, Daniel,

You wrote on 07.05.12:
>> Yes - I know. But btrfs promises that I can add bigger disks and
>> delete smaller disks "on the fly". For something like a video
>> collection, which will keep growing, that is an interesting feature.
>> And such a (big) collection doesn't need a "grandfather-father-son"
>> backup; it's not critical data.
>>
>> With a file system like ext2/3/4 I can work with several directories
>> which are mounted together, but (as said before) one broken disk
>> doesn't disturb the others.
> How can you do that with ext2/3/4? If you mean create several
> different filesystems and mount them separately then that's very
> different from your current situation. What you did in this case is
> comparable to creating a raid0 array out of your disks. I don't see
> how an ext filesystem is going to work any better if one of the disks
> drops out than with a btrfs filesystem.
  mkfs.btrfs  -m raid1 -d raid0

with 3 disks gives me a "cluster" which looks like 1 disk/partition/
directory.
If one disk fails nothing is usable.

(Yes - I've read Hugo's explanation of "-d single", I'll try this way)

With ext2/3/4 I mount 2 disks/partitions into the first disk. If one
disk fails, the contents of the 2 other disks are still readable.
> It sounds like what you're thinking of is creating several separate
> ext filesystems and then just mounting them separately.
Yes - that's the old way. It's reliable but "ugly".
> There's nothing inherently special about doing this with ext, you can
> do the same thing with btrfs and it would amount to about the same
> level of protection (potentially more if you consider [meta]data
> checksums important but potentially less if you feel that ext is more
> robust for whatever reason).
No - as just mentioned: there's a big difference when one disk fails.
> If you want to survive losing a single disk without the (absolute)
> fear of the whole filesystem breaking you have to have some sort of
> redundancy either by separating filesystems or using some version of
> raid other than raid0.
No - for some years now I have used a kind of outsourced backup. A copy
of all data is on a bundle of disks somewhere in the neighbourhood. As
mentioned: the data isn't business critical, it's just "nice to have".
It's not worth something like raid1 or so (with twice the costs of a non
raid solution).
> I suppose the volume management of btrfs is
> sort of confusing at the moment but when btrfs promises you can
> remove disks "on the fly" it doesn't mean you can just unplug disks
> from a raid0 without telling btrfs to put that data elsewhere first.
No - it's not confusing. It only needs a kind of recipe and much time:

        btrfs device add ...
        btrfs filesystem balance ... (perhaps not necessary)
        btrfs device delete ...
        btrfs filesystem balance ... (perhaps not necessary)

No intellectual challenge.
And completely different to "hot pluggable".
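
The recipe above could be spelled out roughly as follows. This is only a
dry-run sketch: it builds and prints the command lines without executing
anything. The argument order (device first, mount point last) is assumed
from the usual btrfs-progs usage, /dev/sdX1 is a placeholder for the new
disk, and the helper name is made up for this illustration.

```python
def replace_disk_commands(mount_point, new_dev, old_dev):
    """Return the add/balance/delete/balance recipe as argument lists.

    Nothing is executed here. "device delete" migrates data off old_dev,
    so it only helps while that disk is still readable.
    """
    return [
        ["btrfs", "device", "add", new_dev, mount_point],
        ["btrfs", "filesystem", "balance", mount_point],   # perhaps not necessary
        ["btrfs", "device", "delete", old_dev, mount_point],
        ["btrfs", "filesystem", "balance", mount_point],   # perhaps not necessary
    ]


if __name__ == "__main__":
    # Mount point and old device taken from the configuration at the top
    # of the thread; /dev/sdX1 is a hypothetical new, bigger disk.
    for cmd in replace_disk_commands("/srv/MM", "/dev/sdX1", "/dev/sdc1"):
        print(" ".join(cmd))
```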

Best regards!
Helmut

Daniel Lee

2012-May-07 20:51 UTC

Re: kernel 3.3.4 damages filesystem (?)

On 05/07/2012 01:21 PM, Helmut Hullen wrote:
> Hello, Daniel,
>
> You wrote on 07.05.12:
>
>>> Yes - I know. But btrfs promises that I can add bigger disks and
>>> delete smaller disks "on the fly". For something like a video
>>> collection, which will keep growing, that is an interesting feature.
>>> And such a (big) collection doesn't need a "grandfather-father-son"
>>> backup; it's not critical data.
>>>
>>> With a file system like ext2/3/4 I can work with several directories
>>> which are mounted together, but (as said before) one broken disk
>>> doesn't disturb the others.
>
>> How can you do that with ext2/3/4? If you mean create several
>> different filesystems and mount them separately then that's very
>> different from your current situation. What you did in this case is
>> comparable to creating a raid0 array out of your disks. I don't see
>> how an ext filesystem is going to work any better if one of the disks
>> drops out than with a btrfs filesystem.
>
>    mkfs.btrfs  -m raid1 -d raid0
>
> with 3 disks gives me a "cluster" which looks like 1 disk/partition/
> directory.
> If one disk fails nothing is usable.
How is that different from putting ext on top of a raid0?
>
> (Yes - I've read Hugo's explanation of "-d single", I'll try this way)
>
> With ext2/3/4 I mount 2 disks/partitions into the first disk. If one
> disk fails, the contents of the 2 other disks are still readable.
There is nothing that prevents you from using this strategy with btrfs.
>
>> It sounds like what you're thinking of is creating several separate
>> ext filesystems and then just mounting them separately.
>
> Yes - that's the old way. It's reliable but "ugly".
>
>> There's nothing inherently special about doing this with ext, you can
>> do the same thing with btrfs and it would amount to about the same
>> level of protection (potentially more if you consider [meta]data
>> checksums important but potentially less if you feel that ext is more
>> robust for whatever reason).
>
> No - as just mentioned: there's a big difference when one disk fails.
No there isn't.
>
>> If you want to survive losing a single disk without the (absolute)
>> fear of the whole filesystem breaking you have to have some sort of
>> redundancy either by separating filesystems or using some version of
>> raid other than raid0.
>
> No - for some years now I have used a kind of outsourced backup. A copy
> of all data is on a bundle of disks somewhere in the neighbourhood. As
> mentioned: the data isn't business critical, it's just "nice to have".
> It's not worth something like raid1 or so (with twice the costs of a non
> raid solution).
>
>> I suppose the volume management of btrfs is
>> sort of confusing at the moment but when btrfs promises you can
>> remove disks "on the fly" it doesn't mean you can just unplug disks
>> from a raid0 without telling btrfs to put that data elsewhere first.
>
> No - it's not confusing. It only needs a kind of recipe and much time:
>
>          btrfs device add ...
>          btrfs filesystem balance ... (perhaps not necessary)
>          btrfs device delete ...
>          btrfs filesystem balance ... (perhaps not necessary)
>
> No intellectual challenge.
> And completely different to "hot pluggable".
This is no different from any raid0 or spanning disk setup that allows
growing/shrinking of the array.

Helmut Hullen

2012-May-07 21:17 UTC

Re: kernel 3.3.4 damages filesystem (?)

Hello, Daniel,

You wrote on 07.05.12:
>>    mkfs.btrfs  -m raid1 -d raid0
>>
>> with 3 disks gives me a "cluster" which looks like 1 disk/partition/
>> directory.
>> If one disk fails nothing is usable.
> How is that different from putting ext on top of a raid0?
Classic raid0 doesn't allow deleting/removing disks from a cluster.
>> With ext2/3/4 I mount 2 disks/partitions into the first disk. If one
>> disk fails, the contents of the 2 other disks are still readable.
> There is nothing that prevents you from using this strategy with
> btrfs.
How?
I've tried many installations of btrfs, sometimes 1 disk failed, and
then the data on all other disks was inaccessible.

Best regards!
Helmut

cwillu

2012-May-07 21:27 UTC

Re: kernel 3.3.4 damages filesystem (?)

On Mon, May 7, 2012 at 3:17 PM, Helmut Hullen <Hullen@t-online.de> wrote:
> Hello, Daniel,
>
> You wrote on 07.05.12:
>
>>>    mkfs.btrfs  -m raid1 -d raid0
>>>
>>> with 3 disks gives me a "cluster" which looks like 1 disk/partition/
>>> directory.
>>> If one disk fails nothing is usable.
>
>> How is that different from putting ext on top of a raid0?
>
> Classic raid0 doesn't allow deleting/removing disks from a cluster.
>
>>> With ext2/3/4 I mount 2 disks/partitions into the first disk. If one
>>> disk fails, the contents of the 2 other disks are still readable.
>
>> There is nothing that prevents you from using this strategy with
>> btrfs.
>
> How?
> I've tried many installations of btrfs, sometimes 1 disk failed, and
> then the data on all other disks was inaccessible.
"With ext2/3/4 I mount 2 disks/partitions into the first disk. If one
disk fails, the contents of the 2 other disks are still readable."

There's nothing stopping you from using 3 btrfs filesystems mounted in
the same way as you would 3 ext4 filesystems.

Martin Steigerwald

2012-May-07 22:07 UTC

Re: kernel 3.3.4 damages filesystem (?)

On Monday, 7 May 2012, Helmut Hullen wrote:
> > If you want to survive losing a single disk without the (absolute)
> > fear of the whole filesystem breaking you have to have some sort of
> > redundancy either by separating filesystems or using some version of
> > raid other than raid0.
>
> No - for some years now I have used a kind of outsourced backup. A copy
> of all data is on a bundle of disks somewhere in the neighbourhood. As
> mentioned: the data isn't business critical, it's just "nice to
> have". It's not worth something like raid1 or so (with twice the costs
> of a non raid solution).
That's not true when you use BTRFS RAID1 with three disks. BTRFS will only
store each chunk on two different drives then, not on all three. So it is
not twice the cost but, given that all three drives have the same capacity,
about one and a half times the cost.
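
As a rough worked example of the space involved: the sketch below
estimates usable capacity when every chunk is stored on exactly two
different drives. The rule used, min(total / 2, total - largest), is an
approximation assumed here for illustration, not something taken from
the btrfs code, and the function name is made up.

```python
def raid1_usable(sizes_tb):
    """Approximate usable space (TB) with two copies of every chunk.

    Each copy needs a partner on a different drive, so the result is
    capped both by half the raw total and by what the smaller drives
    can mirror of the largest one.
    """
    total = sum(sizes_tb)
    largest = max(sizes_tb)
    return min(total / 2, total - largest)


if __name__ == "__main__":
    # Three equal 2 TB drives.
    print(raid1_usable([2.0, 2.0, 2.0]))
    # The three drives listed at the start of the thread.
    print(round(raid1_usable([1.82, 2.73, 2.73]), 2))
```

Under this approximation three equal drives give one and a half drives
of usable space, and the three drives listed at the start of the thread
(1.82, 2.73 and 2.73 TB) would give about 3.64 TB, which is less than
the 4.29 TB reported as used there.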

Consider the time to recover the files from the outsourced backup. Maybe it 
does make up the money you would have to spend for one additional 
harddisk.

Anyway, I agree with the others responding to your post that this one 
harddisk died and I do not see a kernel version related issue. Any striped 
RAID 0 would have failed in that case.

And you can use three BTRFS filesystems the same way as three Ext4 
filesystems if you prefer such a setup if the time spent for restoring the 
backup does not make up the cost for one additional disk for you.

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

Helmut Hullen

2012-May-08 07:39 UTC

Re: kernel 3.3.4 damages filesystem (?)

Hello, Martin,

You wrote on 08.05.12:
>> No - for some years now I have used a kind of outsourced backup. A
>> copy of all data is on a bundle of disks somewhere in the
>> neighbourhood. As mentioned: the data isn't business critical, it's
>> just "nice to have". It's not worth something like raid1 or so (with
>> twice the costs of a non raid solution).
> That's not true when you use BTRFS RAID1 with three disks. BTRFS will
> only store each chunk on two different drives then, not on all three.
> So it is not twice the cost but, given that all three drives have the
> same capacity, about one and a half times the cost.
> Consider the time to recover the files from the outsourced backup.
> Maybe it does make up the money you would have to spend for one
> additional harddisk.
I have considered it, many times. And the result is unchanged: no RAID1.
It doesn't replace a real backup.
> Anyway, I agree with the others responding to your post that this one
> harddisk died and I do not see a kernel version related issue. Any
> striped RAID 0 would have failed in that case.
Yes - I wrote yesterday that the disk is dead. One of three disks.
I'm in the process of restoring the three disks from backup.
> And you can use three BTRFS filesystems the same way as three Ext4
> filesystems if you prefer such a setup if the time spent for
> restoring the backup does not make up the cost for one additional
> disk for you.
But where's the gain? If a disk fails I have a lot of tools for
repairing an ext2/3/4 system.

Best regards!
Helmut

Fajar A. Nugraha

2012-May-08 07:44 UTC

Re: kernel 3.3.4 damages filesystem (?)

On Tue, May 8, 2012 at 2:39 PM, Helmut Hullen <Hullen@t-online.de> wrote:
>> And you can use three BTRFS filesystems the same way as three Ext4
>> filesystems if you prefer such a setup if the time spent for
>> restoring the backup does not make up the cost for one additional
>> disk for you.
>
> But where's the gain? If a disk fails I have a lot of tools for
> repairing an ext2/3/4 system.
It won't work if you use it in RAID0 (e.g. with LVM spanning three
disks, then use ext4 on top of the LV). Which is basically the same
thing that you did (using btrfs in raid0 mode).

As others said, if your only concern is "if a disk is dead, I want to
be able to access data on other disks", then simply use btrfs as three
different fs, mounted on three directories.

btrfs will shine when:
- you need checksum and self-healing in raid10 mode
- you have lots of small files
- you have highly compressible content
- you need snapshot/clone feature

Since you don't need any of these, IMHO it's actually better if you just
use ext4.

-- 
Fajar

Helmut Hullen

2012-May-08 10:00 UTC

Re: kernel 3.3.4 damages filesystem (?)

Hello, Fajar,

You wrote on 08.05.12:
>>> And you can use three BTRFS filesystems the same way as three Ext4
>>> filesystems if you prefer such a setup if the time spent for
>>> restoring the backup does not make up the cost for one additional
>>> disk for you.
>>
>> But where's the gain? If a disk fails I have a lot of tools for
>> repairing an ext2/3/4 system.
> It won't work if you use it in RAID0 (e.g. with LVM spanning three
> disks, then use ext4 on top of the LV).
But when I use ext2/3/4 I need neither RAID0 nor LVM.
> As others said, if your only concern is "if a disk is dead, I want to
> be able to access data on other disks", then simply use btrfs as
> three different fs, mounted on three directories.
But then I don't particularly need btrfs.
> btrfs will shine when:
> - you need checksum and self-healing in raid10 mode
> - you have lots of small files
> - you have highly compressible content
> - you need snapshot/clone feature
For my video collection (mpeg2) none of these applies ...

The only advantage I see with btrfs is

        adding a bigger disk
        deleting/removing a smaller disk

with really simple commands.

Best regards!
Helmut

Clemens Eisserer

2012-May-08 10:41 UTC

Re: kernel 3.3.4 damages filesystem (?)

Hi Helmut,
>> But where's the gain? If a disk fails I have a lot of tools for
>> repairing an ext2/3/4 system.
Nope, when a disk in your ext4 raid0 array fails, you are just as doomed.

> But when I use ext2/3/4 I neither need RAID0 nor do I need LVM.
You can use btrfs, without using its raid capabilities.
Face it, you used an experimental filesystem and you configured it the
wrong way.
Btrfs is not the one to blame here.

- Clemens

Helmut Hullen

2012-May-08 13:13 UTC

Re: kernel 3.3.4 damages filesystem (?)

Hello, Clemens,

You wrote on 08.05.12:
>>> But where's the gain? If a disk fails I have a lot of tools for
>>> repairing an ext2/3/4 system.
> Nope, when a disk in your ext4 raid0 array fails, you are just as
> doomed.
Why should I use RAID0 with a bundle of ext2/3/4? Mounting on/in the  
directory tree does the job.

Best regards!
Helmut

Felix Blanke

2012-May-08 13:44 UTC

Re: kernel 3.3.4 damages filesystem (?)

On 5/8/12 3:13 PM, Helmut Hullen wrote:
> Hello, Clemens,
>
> You wrote on 08.05.12:
>
>>>> But where's the gain? If a disk fails I have a lot of tools for
>>>> repairing an ext2/3/4 system.
>
>> Nope, when a disk in your ext4 raid0 array fails, you are just as
>> doomed.
>
> Why should I use RAID0 with a bundle of ext2/3/4? Mounting on/in the
> directory tree does the job.
Nobody told you that you should do it. What EVERYBODY here is telling
you: the problem you have right now would be the same damn problem, no
matter what fs you use. Every fs will be unusable if you lose one
disk in a raid0 setup. That's all we have been trying to tell you for
the last 15 mails :)

If you don't see any benefit in using btrfs, then simply don't use it :)
Again: you misconfigured your fs if you never wanted to use raid0.
Don't blame the fs, blame yourself.
>
> Best regards!
> Helmut

Hugo Mills

2012-May-08 13:52 UTC

Re: kernel 3.3.4 damages filesystem (?)

On Tue, May 08, 2012 at 03:44:12PM +0200, Felix Blanke wrote:
> On 5/8/12 3:13 PM, Helmut Hullen wrote:
> >Hello, Clemens,
> >
> >You wrote on 08.05.12:
> >
> >>>>But where's the gain? If a disk fails I have a lot of tools for
> >>>>repairing an ext2/3/4 system.
> >
> >>Nope, when a disk in your ext4 raid0 array fails, you are just as
> >>doomed.
> >
> >Why should I use RAID0 with a bundle of ext2/3/4? Mounting on/in the
> >directory tree does the job.
> 
> Nobody told you that you should do it. What EVERYBODY here is
> telling you: The problem you have right now would be the same damn
> problem, no matter what fs you use. Every fs will be unusable
> if you lose one disk in a raid0 setup. That's all we have been trying
> to tell you for the last 15 mails :)
   I think he's got the point by now. Can we stop this thread now,
please? It doesn't seem to be serving any further purpose.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==  PGP
key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
             --- No names... I want to remain anomalous. ---

Helmut Hullen

2012-May-08 16:53 UTC

Re: kernel 3.3.4 damages filesystem (?)

Hello, Felix,

You wrote on 08.05.12:
>> Why should I use RAID0 with a bundle of ext2/3/4? Mounting on/in the
>> directory tree does the job.
> Nobody told you that you should do it. What EVERYBODY here is telling
> you: The problem you have right now would be the same damn problem,
> no matter what fs you use. Every fs will be unusable if you
> lose one disk in a raid0 setup. That's all we have been trying to tell
> you for the last 15 mails :)
> If you don't see any benefit in using btrfs, then simply don't use it
I still hope for a benefit when I use btrfs.

As I've written many times: I want a system for my video collection
which allows

        adding a bigger disk
        deleting/removing a smaller disk

with simple commands.

btrfs seems to be able to do that (and I have tested this job many
times). But with my configuration "mkfs.btrfs -m raid1 -d raid0" I've
(again) seen that all data vanishes when 1 disk fails.

I'll try Hugo's proposal "mkfs.btrfs -m raid1 -d single".
And I hope that it doesn't make all disks unreadable when 1 disk fails.

Best regards!
Helmut

Felix Blanke

2012-May-08 17:24 UTC

Re: kernel 3.3.4 damages filesystem (?)

On 5/8/12 6:53 PM, Helmut Hullen wrote:
 > Hello, Felix,
 >
 > You wrote on 08.05.12:
 >
 >>> Why should I use RAID0 with a bundle of ext2/3/4? Mounting on/in the
 >>> directory tree does the job.
 >
 >> Nobody told you that you should do it. What EVERYBODY here is telling
 >> you: The problem you have right now would be the same damn problem,
 >> no matter what fs you use. Every fs will be unusable if you
 >> lose one disk in a raid0 setup. That's all we have been trying to tell
 >> you for the last 15 mails :)
 >
 >> If you don't see any benefit in using btrfs, then simply don't use it
 >
 > I still hope for a benefit when I use btrfs.
 >
 > As I've written many times: I want a system for my video collection
 > which allows
 >
 >          adding a bigger disk
 >          deleting/removing a smaller disk
 >
 > with simple commands.
 >
 > btrfs seems to be able to do that (and I have tested this job many
 > times). But with my configuration "mkfs.btrfs -m raid1 -d raid0" I've
 > (again) seen that all data vanishes when 1 disk fails.
 >
 > I'll try Hugo's proposal "mkfs.btrfs -m raid1 -d single".
 > And I hope that it doesn't make all disks unreadable when 1 disk fails.

Maybe you should read up on the different RAID levels before
you use them?

http://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_0

RAID0 will always be that way: one disk dies, the filesystem is gone.
That's more or less the definition of RAID0 :)
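
To make the point concrete, here is a minimal Python sketch of round-robin striping. It is a simplified model and not btrfs code: the 64 KiB stripe element size and the helper name `devices_touched` are assumptions chosen for illustration.

```python
STRIPE_SIZE = 64 * 1024  # assumed stripe element size in bytes (illustrative)


def devices_touched(offset, length, num_devices, stripe_size=STRIPE_SIZE):
    """Return the set of device indexes holding any byte of [offset, offset+length)."""
    if length <= 0:
        return set()
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    # In this model, stripe element k lives on device k % num_devices.
    count = min(last - first + 1, num_devices)
    return {(first + i) % num_devices for i in range(count)}


# A 700 MiB video striped over three disks has pieces on every disk,
# so losing any single disk damages it.
print(sorted(devices_touched(0, 700 * 1024 * 1024, 3)))  # → [0, 1, 2]
```

Under these assumptions, only files smaller than one stripe element can sit on a single disk; anything the size of a video file spans all of them.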

@"-d single"

Is it really possible to remove a disk from btrfs (created with -d
single) without losing the data on that disk? Is there a way to tell
balance to copy all the data from this disk to the other disks (ofc if
there is enough free space on them)?

 >
 > Best regards!
 > Helmut

Helmut Hullen

2012-May-08 18:29 UTC

Re: kernel 3.3.4 damages filesystem (?)

Hello, Felix,

You wrote on 08.05.12:
>> As I've written many times: I want a system for my video collection
>> which allows
>>
>>          adding a bigger disk
>>          deleting/removing a smaller disk
>>
>> with simple commands.
>>
>> btrfs seems to be able to do that (and I have tested this job many
>> times). But with my configuration "mkfs.btrfs -m raid1 -d raid0"
>> I've (again) seen that all data vanishes when 1 disk fails.
>>
>> I'll try Hugo's proposal "mkfs.btrfs -m raid1 -d single".
>> And I hope that it doesn't make all disks unreadable when 1 disk
>> fails.
[...]
> @"-d single"
> Is it really possible to remove a disk from btrfs (created with -d
> single) without losing the data on that disk?
When the system is configured with

        mkfs.btrfs -m raid1 -d raid0

then the above shown way is possible, it works (now) as expected.
Ok - it needs some time.

And I have already said on this mailing list that I'll try the option
"-d single".
> Is there a way to tell
> balance to copy all the data from this disk to the other disks (ofc
> if there is enough free space on them)?
As I've written some hours ago: I run

        btrfs fi balance ...

after adding and after deleting a disk. Maybe it's not necessary.
Especially it seems not to be necessary after adding a disk.

Best regards!
Helmut

Felix Blanke

2012-May-08 18:41 UTC

Re: kernel 3.3.4 damages filesystem (?)

On 5/8/12 8:29 PM, Helmut Hullen wrote:
> Hello, Felix,
>
> You wrote on 08.05.12:
>
>>> As I've written many times: I want a system for my video collection
>>> which allows
>>>
>>>           adding a bigger disk
>>>           deleting/removing a smaller disk
>>>
>>> with simple commands.
>>>
>>> btrfs seems to be able to do that (and I have tested this job many
>>> times). But with my configuration "mkfs.btrfs -m raid1 -d raid0"
>>> I've (again) seen that all data vanishes when 1 disk fails.
>>>
>>> I'll try Hugo's proposal "mkfs.btrfs -m raid1 -d single".
>>> And I hope that it doesn't make all disks unreadable when 1 disk
>>> fails.
>
> [...]
>
>> @"-d single"
>
>> Is it really possible to remove a disk from btrfs (created with -d
>> single) without losing the data on that disk?
>
> When the system is configured with
>
>          mkfs.btrfs -m raid1 -d raid0
>
> then the above shown way is possible, it works (now) as expected.
> Ok - it needs some time.
>
> And I have already said on this mailing list that I'll try the option
> "-d single".
>
>> Is there a way to tell
>> balance to copy all the data from this disk to the other disks (ofc
>> if there is enough free space on them)?
>
> As I've written some hours ago: I run
>
>          btrfs fi balance ...
>
> after adding and after deleting a disk. Maybe it's not necessary.
> Especially it seems not to be necessary after adding a disk.
What are the steps you're doing?! If this is really possible then there
must be some sort of command that tells btrfs "Hey, I wanna remove this
disk from the fs, please copy all data to the other disks and then
remove the disk". Is there such a command? Haven't heard of one, but
that would be interesting.

Otherwise if you remove a disk from a raid0 (doesn't matter if you have
2 or 5 or x disks in the fs, btrfs should stripe across all disks) your
fs should be broken.
>
> Best regards!
> Helmut

David Sterba

2012-May-08 19:12 UTC

Re: kernel 3.3.4 damages filesystem (?)

On Tue, May 08, 2012 at 08:41:47PM +0200, Felix Blanke wrote:
> >As I've written some hours ago: I run
> >
> >         btrfs fi balance ...
> >
> >after adding and after deleting a disk. Maybe it's not necessary.
> >Especially it seems not to be necessary after adding a disk.
> 
> What are the steps you're doing?! If this is really possible then there must
> be some sort of command that tells btrfs "Hey, I wanna remove this disk from
> the fs, please copy all data to the other disks and then remove the disk".
> Is there such a command? Haven't heard of one, but that would be
> interesting.
The 'btrfs device delete' command does what you described. It is a pretty
basic command, so I'm not sure whether I missed something in this
thread.
> Otherwise if you remove a disk from a raid0 (doesn't matter if you have 2 or
> 5 or x disks in the fs, btrfs should stripe across all disks) your fs should
> be broken.
All data from the device being removed are relocated to the rest of the
device group.
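
As a rough illustration of what "relocated to the rest of the device group" implies for free space, here is a small Python model. It is only a sketch: the chunk-by-chunk loop, the "most free space first" target choice, and the name `plan_removal` are assumptions for illustration, not the kernel's actual relocation code.

```python
def plan_removal(used, capacity, victim):
    """Plan where the chunks of `victim` would go, one chunk at a time.

    used, capacity -- dicts mapping devid -> size in chunk units
    Returns a dict mapping each remaining devid -> number of chunks received.
    """
    free = {d: capacity[d] - used[d] for d in capacity if d != victim}
    if sum(free.values()) < used[victim]:
        raise RuntimeError("not enough free space on the remaining devices")
    received = {d: 0 for d in free}
    for _ in range(used[victim]):
        # Assumed policy: each chunk goes to the device with the most free space.
        target = min(free, key=lambda d: (-free[d], d))
        free[target] -= 1
        received[target] += 1
    return received


# Hypothetical sizes: remove the small devid 1 after adding a large devid 3.
print(plan_removal(used={1: 45, 2: 113, 3: 0},
                   capacity={1: 68, 2: 136, 3: 200},
                   victim=1))  # → {2: 0, 3: 45}
```

The key property the model captures is that a planned removal can only succeed when the remaining devices have room for everything stored on the departing one.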


david

Helmut Hullen

2012-May-08 19:34 UTC

Re: kernel 3.3.4 damages filesystem (?)

Hello, Felix,

You wrote on 08.05.12:
>>>>           adding a bigger disk
>>>>           deleting/removing a smaller disk
>>>>
>>>> with simple commands.
[...]
>>> Is it really possible to remove a disk from btrfs (created with -d
>>> single) without losing the data on that disk?
>>
>> When the system is configured with
>>
>>          mkfs.btrfs -m raid1 -d raid0
>>
>> then the above shown way is possible, it works (now) as expected.
>> Ok - it needs some time.
[...]
> What are the steps you're doing?! If this is really possible then
> there must be some sort of command that tells btrfs "Hey, I wanna
> remove this disk from the fs, please copy all data to the other disks
> and then remove the disk". Is there such a command? Haven't heard of
> one, but that would be interesting.
        btrfs device add /dev/$newdisk ...
        (btrfs fi balance ...)
        btrfs device delete /dev/$olddisk ...
        (btrfs fi balance ...)

I've described these simple steps many times on this mailing list.

For several kernel versions now (at least since kernel 3.2.x) it seems to
work without problems; "btrfs-progs" package from 2011-10-30.

> Otherwise if you remove a disk from a raid0 (doesn't matter if you
> have 2 or 5 or x disks in the fs, btrfs should stripe across all
> disks) your fs should be broken.

Not with btrfs ... there it works even with

  mkfs.btrfs -m raid1 -d raid0 ...

Best regards!
Helmut

Hugo Mills

2012-May-08 20:02 UTC

Re: kernel 3.3.4 damages filesystem (?)

On Tue, May 08, 2012 at 09:34:00PM +0200, Helmut Hullen wrote:
> Hello, Felix,
> 
> You wrote on 08.05.12:
> 
> >>>>           adding a bigger disk
> >>>>           deleting/removing a smaller disk
> >>>>
> >>>> with simple commands.
> 
> [...]
> 
> >>> Is it really possible to remove a disk from btrfs (created with -d
> >>> single) without losing the data on that disk?
> >>
> >> When the system is configured with
> >>
> >>          mkfs.btrfs -m raid1 -d raid0
> >>
> >> then the above shown way is possible, it works (now) as expected.
> >> Ok - it needs some time.
> 
> [...]
> 
> > What are the steps you're doing?! If this is really possible then
> > there must be some sort of command that tells btrfs "Hey, I wanna
> > remove this disk from the fs, please copy all data to the other disks
> > and then remove the disk". Is there such a command? Haven't heard of
> > one, but that would be interesting.
> 
>         btrfs device add /dev/$newdisk ...
>         (btrfs fi balance ...)
>         btrfs device delete /dev/$olddisk ...
>         (btrfs fi balance ...)
> 
> I've described these simple steps many times on this mailing list.
> 
> For several kernel versions now (at least since kernel 3.2.x) it seems to
> work without problems; "btrfs-progs" package from 2011-10-30.
> 
> 
> > Otherwise if you remove a disk from a raid0 (doesn't matter if you
> > have 2 or 5 or x disks in the fs, btrfs should stripe across all
> > disks) your fs should be broken.
> 
> 
> Not with btrfs ... there it works even with
> 
>   mkfs.btrfs -m raid1 -d raid0 ...
   There is a big difference between "orderly and planned removal of a
hard disk", and "disk goes away with no warning". This is essentially
the difference you've been talking about at cross-purposes all day.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==  PGP
key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
                 --- My karma has run over my dogma. ---

Helmut Hullen

2012-May-08 20:19 UTC

Re: kernel 3.3.4 damages filesystem (?)

Hello, Hugo,

You wrote on 08.05.12:
>>> Otherwise if you remove a disk from a raid0 (doesn't matter if you
>>> have 2 or 5 or x disks in the fs, btrfs should stripe across all
>>> disks) your fs should be broken.
>> Not with btrfs ... there it works even with
>>
>>   mkfs.btrfs -m raid1 -d raid0 ...
>    There is a big difference between "orderly and planned removal of
> a hard disk", and "disk goes away with no warning".
And I know the difference ...

When I first called for help I was looking for the failure somewhere
other than "disk is dead".
> This is essentially the difference you've been talking about at cross-
> purposes all day.
What I still hope (maybe it's impossible): when 1 disk/partition fails,
then the contents of the other disks are "somehow" restorable, and not
irrecoverable.

Best regards!
Helmut

Roman Mamedov

2012-May-08 20:56 UTC

Re: kernel 3.3.4 damages filesystem (?)

On 08 May 2012 22:19:00 +0200
Hullen@t-online.de (Helmut Hullen) wrote:
> What I still hope (maybe it's impossible): when 1 disk/partition fails,
> then the contents of the other disks are "somehow" restorable, and not
> irrecoverable.
You should look for file/directory-level tree merging, e.g. this FUSE based
virtual FS: 

  https://romanrm.ru/en/mhddfs

Or various other unionfs'es, some of which are kernel-based.

Regarding btrfs, AFAIK even "btrfs -d single" suggested above works not "per
file", but per allocation extent, so in case of one disk failure you will lose
random *parts* (extents) of random files, which in effect could mean no file
in your whole file system will remain undamaged.

-- 
With respect,
Roman

~~~~~~~~~~~~~~~~~~~~~~~~~~~
"Stallman had a printer,
with code he could not see.
So he began to tinker,
and set the software free."

Hubert Kario

2012-May-08 21:42 UTC

Re: kernel 3.3.4 damages filesystem (?)

On Tuesday 08 of May 2012 12:00:00 Helmut Hullen wrote:
> Hello, Fajar,
> 
> You wrote on 08.05.12:
> >>> And you can use three BTRFS filesystems the same way as three Ext4
> >>> filesystems if you prefer such a setup if the time spent for
> >>> restoring the backup does not make up the cost for one additional
> >>> disk for you.
> >> 
> >> But where's the gain? If a disk fails I have a lot of tools for
> >> repairing an ext2/3/4 system.
> > 
> > It won't work if you use it in RAID0 (e.g. with LVM spanning three
> > disks, then use ext4 on top of the LV).
> 
> But when I use ext2/3/4 I neither need RAID0 nor do I need LVM.
> 
> > As others said, if your only concern is "if a disk is dead, I want to
> > be able to access data on other disks", then simply use btrfs as
> > three different fs, mounted on three directories.
> 
> But then I don't especially need btrfs.
> 
> > btrfs will shine when:
> > - you need checksum and self-healing in raid10 mode
> > - you have lots of small files
> > - you have highly compressible content
> > - you need snapshot/clone feature
> 
> For my video collection (mpeg2) nothing fits ...
> 
> The only advantage I see with btrfs is
> 
>         adding a bigger disk
>         deleting/removing a smaller disk
> 
> with really simple commands.
Playing the Devil's advocate here (not that I don't use The Other Linux FS
;)

I don't see the btrfs commands as much different from
pvcreate /dev/new-disk
vgextend videos-volume-42 /dev/new-disk
pvmove /dev/old-disk /dev/new-disk
vgreduce videos-volume-42 /dev/old-disk
resize2fs /dev/videos-volume-42/logical-volume

Unlike with shrinking, there's really no room for error. Messing up those
commands will give quite clear error messages and definitely won't destroy
data (unless a hardware error occurs). And the FS on the LV is online all
the time, just like with btrfs.

The only difference is that with btrfs you can both extend and shrink the FS 
online, with ext2/3/4 you can only extend online...

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl

Helmut Hullen

2012-May-09 13:04 UTC

failed disk (was: kernel 3.3.4 damages filesystem (?))

Hello, Hugo,

You wrote on 07.05.12:
>>>    mkfs.btrfs -m raid1 -d single should give you that.
>> What's the difference to
>>
>>      mkfs.btrfs -m raid1 -d raid0
>  - RAID-0 stripes each piece of data across all the disks.
>  - single puts data on one disk at a time.
[...]

>    In fact, this is probably a good argument for having the option to
> put back the old allocator algorithm, which would have ensured that
> the first disk would fill up completely first before it touched the
> next one...
The current version seems to oscillate from disk to disk:

Copying about 160 GiByte shows

Label: none  uuid: fd0596c6-d819-42cd-bb4a-420c38d2a60b
	Total devices 2 FS bytes used 155.64GB
	devid    2 size 136.73GB used 114.00GB path /dev/sdl1
	devid    1 size 68.37GB used 45.04GB path /dev/sdk1

Btrfs Btrfs v0.19

------------------------

Watching the amount showed that both disks are filled nearly  
simultaneously.

That would be more difficult to restore ...

Best regards!
Helmut

Hugo Mills

2012-May-09 13:19 UTC

Re: failed disk (was: kernel 3.3.4 damages filesystem (?))

On Wed, May 09, 2012 at 03:04:00PM +0200, Helmut Hullen wrote:
> Hello, Hugo,
> 
> You wrote on 07.05.12:
> 
> >>>    mkfs.btrfs -m raid1 -d single should give you that.
> 
> >> What's the difference to
> >>
> >>      mkfs.btrfs -m raid1 -d raid0
> 
> >  - RAID-0 stripes each piece of data across all the disks.
> >  - single puts data on one disk at a time.
> 
> [...]
> 
> 
> >    In fact, this is probably a good argument for having the option to
> > put back the old allocator algorithm, which would have ensured that
> > the first disk would fill up completely first before it touched the
> > next one...
> 
> The current version seems to oscillate from disk to disk:
   Yes, specifically, when it's asked for n chunks to make up a block
group, the current allocator will pick the n disks with the most free
space on them. The original allocator would pick the disks with the
smallest devid (which is probably optimal for your use case -- hence
my comment above).
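
The two policies just described can be compared with a small Python model. This is a sketch under stated assumptions, not the kernel allocator: chunks are one unit (think 1 GiB) each, the "single" profile asks for one chunk at a time, disk sizes are rounded versions of the two test disks reported earlier in the thread, and the tie-breaking rule is an arbitrary choice.

```python
def allocate(sizes, chunks, policy):
    """Allocate `chunks` one-unit chunks; return the units used per device.

    sizes  -- dict mapping devid -> capacity in chunk units
    policy -- "most_free" (the current behaviour as described above) or
              "lowest_devid" (the original behaviour as described above)
    """
    used = {devid: 0 for devid in sizes}
    for _ in range(chunks):
        candidates = [d for d in sizes if used[d] < sizes[d]]
        if not candidates:
            raise RuntimeError("no space left on any device")
        if policy == "most_free":
            # Largest free space wins; ties go to the lower devid (assumption).
            pick = max(candidates, key=lambda d: (sizes[d] - used[d], -d))
        elif policy == "lowest_devid":
            pick = min(candidates)
        else:
            raise ValueError(f"unknown policy: {policy}")
        used[pick] += 1
    return used


disks = {1: 68, 2: 136}  # rounded sizes of the two test disks, in GiB
print(allocate(disks, 158, "most_free"))     # → {1: 45, 2: 113}
print(allocate(disks, 158, "lowest_devid"))  # → {1: 68, 2: 90}
```

In this model the "most free" policy first fills the larger disk until both have equal free space and then alternates, which is consistent with the nearly simultaneous filling reported for the test; the "lowest devid" policy fills one disk completely before touching the next.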
> Watching the amount showed that both disks are filled nearly  
> simultaneously.
> 
> That would be more difficult to restore ...
   If your files are small compared to the block group size (1GiB in
this case), then the odds of a file spanning block groups are small.
With files similar in size to, or larger than, a chunk, you will be
far more likely to lose some part of the file when a disk goes away.
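
A back-of-the-envelope Python sketch of that argument follows. It assumes a file is stored contiguously and starts at a uniformly random MiB-aligned offset within a chunk, which is a simplification: real extent allocation is not this regular. The function names are illustrative.

```python
CHUNK_MIB = 1024  # block group size in MiB (1 GiB, as stated above)


def chunks_touched(start_mib, size_mib, chunk_mib=CHUNK_MIB):
    """Number of chunks overlapped by a contiguous file starting at start_mib."""
    if size_mib <= 0:
        return 0
    first = start_mib // chunk_mib
    last = (start_mib + size_mib - 1) // chunk_mib
    return last - first + 1


def fraction_spanning(size_mib, chunk_mib=CHUNK_MIB):
    """Fraction of MiB-aligned start offsets at which the file crosses a chunk boundary."""
    crossing = sum(
        1 for start in range(chunk_mib)
        if chunks_touched(start, size_mib, chunk_mib) > 1
    )
    return crossing / chunk_mib


for size in (4, 700, 4096):  # a small file, a CD-sized video, a DVD-sized video
    print(size, fraction_spanning(size))
```

Under these assumptions a 4 MiB file almost never crosses a chunk boundary, a 700 MiB file does so at roughly two thirds of the possible offsets, and anything larger than a chunk always does.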

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==  PGP
key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
       --- Great oxymorons of the world, no. 6: Mature Student ---

Helmut Hullen

2012-May-09 14:25 UTC

failed disk (was: kernel 3.3.4 damages filesystem (?))

Hello, Hugo,

You wrote on 07.05.12:

[...]
>> With a file system like ext2/3/4 I can work with several directories
>> which are mounted together, but (as said before) one broken disk
>> doesn't disturb the others.
>    mkfs.btrfs -m raid1 -d single should give you that.
Just a small bug, perhaps:

created a system with

        mkfs.btrfs -m raid1 -d single /dev/sdl1
        mount /dev/sdl1 /mnt/Scsi
        btrfs device add /dev/sdk1 /mnt/Scsi
        btrfs device add /dev/sdm1 /mnt/Scsi
        (filling with data)

and

        btrfs fi df /mnt/Scsi

now tells

Data, RAID0: total=183.18GB, used=76.60GB
Data: total=80.01GB, used=79.83GB
System, DUP: total=8.00MB, used=32.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=192.74MB
Metadata: total=8.00MB, used=0.00

--------------------------------------

"Data, RAID0" confuses me (not very much ...), and the system for  
metadata (RAID1) is not told.


Best regards!
Helmut

Hugo Mills

2012-May-09 14:37 UTC

Re: failed disk (was: kernel 3.3.4 damages filesystem (?))

On Wed, May 09, 2012 at 04:25:00PM +0200, Helmut Hullen wrote:
> You wrote on 07.05.12:
> 
> [...]
> 
> >> With a file system like ext2/3/4 I can work with several directories
> >> which are mounted together, but (as said before) one broken disk
> >> doesn't disturb the others.
> 
> >    mkfs.btrfs -m raid1 -d single should give you that.
> 
> Just a small bug, perhaps:
> 
> created a system with
> 
>         mkfs.btrfs -m raid1 -d single /dev/sdl1
>         mount /dev/sdl1 /mnt/Scsi
>         btrfs device add /dev/sdk1 /mnt/Scsi
>         btrfs device add /dev/sdm1 /mnt/Scsi
>         (filling with data)
> 
> and
> 
>         btrfs fi df /mnt/Scsi
> 
> now tells
> 
> Data, RAID0: total=183.18GB, used=76.60GB
> Data: total=80.01GB, used=79.83GB
> System, DUP: total=8.00MB, used=32.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=1.00GB, used=192.74MB
> Metadata: total=8.00MB, used=0.00
> 
> --------------------------------------
> 
> "Data, RAID0" confuses me (not very much ...), and the system for
> metadata (RAID1) is not told.
   DUP is two copies of each block, but it allows the two copies to
live on the same device. It's done this because you started with a
single device, and you can't do RAID-1 on one device. The first bit of
metadata you write to it should automatically upgrade the DUP chunk to
RAID-1.

   As to the spurious "upgrade" of single to RAID-0, I thought Ilya
had stopped it doing that. What kernel version are you running?

   Out of interest, why did you do the device adds separately, instead
of just this?

# mkfs.btrfs -m raid1 -d single /dev/sdl1 /dev/sdk1 /dev/sdm1

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==  PGP
key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- Comic Sans goes into a bar,  and the barman says, "We don't ---
                         serve your type here."

Kaspar Schleiser

2012-May-09 14:46 UTC

Re: kernel 3.3.4 damages filesystem (?)

Hi,

On 05/08/2012 10:56 PM, Roman Mamedov wrote:
> Regarding btrfs, AFAIK even "btrfs -d single" suggested above works not "per
> file", but per allocation extent, so in case of one disk failure you will lose
> random *parts* (extents) of random files, which in effect could mean no file
> in your whole file system will remain undamaged.

Maybe we should evaluate the possibility of such a "one file gets on
one disk" feature.

Helmut Hullen has the use case: Many disks, totally non-critical but
nice-to-have data. If one disk dies, some *files* should be lost, not some
*random parts of all files*.

This could be accomplished by some userspace-tool that moves stuff 
around, combined with "file pinning"-support, that lets the user make 
sure a specific file is on a specific disk.
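
What such a placement policy might look like can be sketched in a few lines of Python. Everything here is hypothetical: btrfs offers no such interface in this thread, and the names `place_files` and `lost_files`, the "largest file first" order, and the "most free space" choice are assumptions made purely to illustrate the idea.

```python
def place_files(files, disks):
    """Assign each file, whole, to one disk (the one with the most free space).

    files -- dict mapping file name -> size
    disks -- dict mapping disk name -> capacity (same units as the sizes)
    Returns a dict mapping file name -> disk name.
    """
    free = dict(disks)
    placement = {}
    # Place the largest files first so that big files still find room.
    for name, size in sorted(files.items(), key=lambda kv: (-kv[1], kv[0])):
        disk = min(free, key=lambda d: (-free[d], d))
        if free[disk] < size:
            raise RuntimeError(f"no single disk can hold {name}")
        placement[name] = disk
        free[disk] -= size
    return placement


def lost_files(placement, failed_disk):
    """Names of the files that disappear when failed_disk dies."""
    return sorted(n for n, d in placement.items() if d == failed_disk)


videos = {"a.mpg": 4, "b.mpg": 3, "c.mpg": 2, "d.mpg": 1}
layout = place_files(videos, {"sdk": 6, "sdl": 6})
print(lost_files(layout, "sdk"))  # → ['a.mpg', 'd.mpg']
print(lost_files(layout, "sdl"))  # → ['b.mpg', 'c.mpg']
```

With whole-file placement, a disk failure removes a well-defined list of files and leaves every other file intact, which is the behaviour the use case asks for.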

Cheers
Kaspar


Helmut Hullen

2012-May-09 15:14 UTC

Re: failed disk

Hello, Hugo,

You wrote on 09.05.12:
>>>    mkfs.btrfs -m raid1 -d single should give you that.
>> Just a small bug, perhaps:
>>
>> created a system with
>>
>>         mkfs.btrfs -m raid1 -d single /dev/sdl1
>>         mount /dev/sdl1 /mnt/Scsi
>>         btrfs device add /dev/sdk1 /mnt/Scsi
>>         btrfs device add /dev/sdm1 /mnt/Scsi
>>         (filling with data)
>>
>> and
>>
>>         btrfs fi df /mnt/Scsi
>>
>> now tells
>>
>> Data, RAID0: total=183.18GB, used=76.60GB
>> Data: total=80.01GB, used=79.83GB
>> System, DUP: total=8.00MB, used=32.00KB
>> System: total=4.00MB, used=0.00
>> Metadata, DUP: total=1.00GB, used=192.74MB
>> Metadata: total=8.00MB, used=0.00
>>
>> --------------------------------------
>>
>> "Data, RAID0" confuses me (not very much ...), and the system for
>> metadata (RAID1) is not told.
>    DUP is two copies of each block, but it allows the two copies to
> live on the same device. It's done this because you started with a
> single device, and you can't do RAID-1 on one device. The first bit
> of metadata you write to it should automatically upgrade the DUP
> chunk to RAID-1.
Ok.

Sounds familiar - have you explained that to me many months ago?
>    As to the spurious "upgrade" of single to RAID-0, I thought Ilya
> had stopped it doing that. What kernel version are you running?
3.2.9, self made.
I could test the message with 3.3.4, but not today (if it's only an
interpretation of always the same data).
>    Out of interest, why did you do the device adds separately,
> instead of just this?
a) making the first 2 devices: I have tested both versions (one line  
with 2 devices or 2 lines with 1 device); no big difference.

But I had tested the option "-L" (labelling) too, and that makes shit
for the oneliner: both devices get the same label, and then "findfs"  
finds none of them.

The really safe way would be: deleting this option for the "mkfs.btrfs"
command and only using

        btrfs fi label <device> [<newlabel>]

b) third device: that's my usual test:
        make a cluster of 2 devices
        fill them with data
        add a third device
        delete the smallest device

Best regards!
Helmut

Hugo Mills

2012-May-09 15:33 UTC

Re: failed disk

On Wed, May 09, 2012 at 05:14:00PM +0200, Helmut Hullen wrote:
> Hello, Hugo,
> 
> You wrote on 09.05.12:
> 
> >    DUP is two copies of each block, but it allows the two copies to
> > live on the same device. It's done this because you started with a
> > single device, and you can't do RAID-1 on one device. The first bit
> > of metadata you write to it should automatically upgrade the DUP
> > chunk to RAID-1.
> 
> Ok.
> 
> Sounds familiar - have you explained that to me many months ago?
   Probably. I tend to explain this kind of thing a lot to people.
> >    As to the spurious "upgrade" of single to RAID-0, I thought Ilya
> > had stopped it doing that. What kernel version are you running?
> 
> 3.2.9, self made.
   OK, I'm pretty sure that's too old -- it will "upgrade" single to
RAID-0. You can probably turn it back to "single" using balance filters:

# btrfs fi balance -dconvert=single /mountpoint

(You may want to write at least a little data to the FS first --
balance has some slightly odd behaviour on empty filesystems).
> I could test the message with 3.3.4, but not today (if it's only an
> interpretation of always the same data).
> 
> >    Out of interest, why did you do the device adds separately,
> > instead of just this?
> 
> a) making the first 2 devices: I have tested both versions (one line  
> with 2 devices or 2 lines with 1 device); no big difference.
> 
> But I had tested the option "-L" (labelling) too, and that makes shit
> for the oneliner: both devices get the same label, and then "findfs"
> finds none of them.
   Umm... Yes, of course both devices will get the same label --
you're labelling the filesystem, not the devices. (Didn't we have this
argument some time ago?).

   I don't know what "findfs" is doing, that it can't find the
filesystem by label: you may need to run "sync" after mkfs, possibly.
> The really safe way would be: deleting this option for the "mkfs.btrfs"
> command and only using
> 
>         btrfs fi label <device> [<newlabel>]
   ... except that it''d have to take a filesystem as parameter, not a
device (see above).
> b) third device: that''s my usual test:
>         make a cluster of 2 deivces
>         fill them with data
>         add a third device
>         delete the smallest device
   What are you testing? And by "delete" do you mean "btrfs dev
delete" or "pull the cable out"?

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
           --- Quidquid latine dictum sit,  altum videtur. ---

Ilya Dryomov

2012-May-09 16:13 UTC

Re: failed disk (was: kernel 3.3.4 damages filesystem (?))

On Wed, May 09, 2012 at 03:37:35PM +0100, Hugo Mills wrote:
> On Wed, May 09, 2012 at 04:25:00PM +0200, Helmut Hullen wrote:
> > You wrote on 07.05.12:
> > 
> > [...]
> > 
> > >> With a file system like ext2/3/4 I can work with several directories
> > >> which are mounted together, but (as said before) one broken disk
> > >> doesn't disturb the others.
> > 
> > >    mkfs.btrfs -m raid1 -d single should give you that.
> > 
> > Just a small bug, perhaps:
> > 
> > created a system with
> > 
> >         mkfs.btrfs -m raid1 -d single /dev/sdl1
> >         mount /dev/sdl1 /mnt/Scsi
> >         btrfs device add /dev/sdk1 /mnt/Scsi
> >         btrfs device add /dev/sdm1 /mnt/Scsi
> >         (filling with data)
> > 
> > and
> > 
> >         btrfs fi df /mnt/Scsi
> > 
> > now tells
> > 
> > Data, RAID0: total=183.18GB, used=76.60GB
> > Data: total=80.01GB, used=79.83GB
> > System, DUP: total=8.00MB, used=32.00KB
> > System: total=4.00MB, used=0.00
> > Metadata, DUP: total=1.00GB, used=192.74MB
> > Metadata: total=8.00MB, used=0.00
> > 
> > --------------------------------------
> > 
> > "Data, RAID0" confuses me (not very much ...), and the system for
> > metadata (RAID1) is not told.
> 
>    DUP is two copies of each block, but it allows the two copies to
> live on the same device. It's done this because you started with a
> single device, and you can't do RAID-1 on one device. The first bit of

What Hugo said.  Newer mkfs.btrfs will error out if you try to do this.

> metadata you write to it should automatically upgrade the DUP chunk to
> RAID-1.

We don't "upgrade" chunks in place, only during balance.

> 
>    As to the spurious "upgrade" of single to RAID-0, I thought Ilya
> had stopped it doing that. What kernel version are you running?

I did, but again, we were doing it only as part of balance, not as part
of normal operation.

Helmut, do you have any additional data points - the output of btrfs fi
df right after you created FS or somewhere in the middle of filling it ?

Also could you please paste the output of btrfs fi show and tell us what
kernel version you are running ?
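
Editorial sketch: when collecting the data points asked for above, it can help to reduce "btrfs fi df" output to a list of profiles per block group type, so that a mix such as "Data, RAID0" next to plain "Data" stands out. The Python below parses the text format quoted in this thread. The names parse_fi_df and mixed_profiles are hypothetical, and reading a line without a profile as "single" is an assumption about the btrfs-progs output of that era.

```python
import re

# Sample text copied from the "btrfs fi df /mnt/Scsi" output quoted in this thread.
SAMPLE = """\
Data, RAID0: total=183.18GB, used=76.60GB
Data: total=80.01GB, used=79.83GB
System, DUP: total=8.00MB, used=32.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=192.74MB
Metadata: total=8.00MB, used=0.00
"""

LINE_RE = re.compile(
    r"^(?P<type>\w+)(?:, (?P<profile>\w+))?: total=(?P<total>\S+), used=(?P<used>\S+)$"
)


def parse_fi_df(text):
    """Return {block_group_type: [(profile, total, used), ...]}."""
    result = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not a "Type[, PROFILE]: ..." line
        # Assumption: a line without an explicit profile describes "single" chunks.
        profile = match.group("profile") or "single"
        result.setdefault(match.group("type"), []).append(
            (profile, match.group("total"), match.group("used"))
        )
    return result


def mixed_profiles(parsed):
    """Return the types that currently have chunks in more than one profile."""
    return sorted(
        block_type
        for block_type, rows in parsed.items()
        if len({profile for profile, _, _ in rows}) > 1
    )


if __name__ == "__main__":
    parsed = parse_fi_df(SAMPLE)
    for block_type in mixed_profiles(parsed):
        print(block_type, [profile for profile, _, _ in parsed[block_type]])
```

Running the same parser on output captured right after mkfs and again midway through filling the filesystem would show at which point the additional profile appears.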

Thanks,

		Ilya

Duncan

2012-May-09 17:32 UTC

Re: kernel 3.3.4 damages filesystem (?)

Helmut Hullen posted on Mon, 07 May 2012 12:46:00 +0200 as excerpted:

> The 3 btrfs disks are connected via a SiI 3114 SATA-PCI-Controller.
> Only 1 of the 3 disks seems to be damaged.

I don't plan to rehash the raid0/single discussion here, but here's some
perhaps useful additional information on that hardware:


For some years I've been running that same hardware, SiI 3114 SATA PCI,
on an old dual-socket 3-digit Opteron system, running for some years now
dual dual-core Opteron 290s (the highest they went, 2.8 GHz, 4 cores in
two sockets).  However, I *WAS* running them in RAID-1, 4-disk md RAID-1,
to be exact (with reiserfs, FWIW).

What's VERY interesting is that I've just returned from being offline for
several days due to severe disk-I/O hardware issues of my own -- again,
on that Sil-SATA 3114.

Most of the time I was getting full system crashes, but perhaps 25-33% of
the time it didn't fully crash the system, simply erroring out with an
eventual ATA reset.  When the system didn't crash immediately, most of
the time (about 80% I'd say) the reset would be good and I'd be back up,
but sometimes it'd repeatedly reset, occasionally not ever becoming
usable again.

As the drives are all the same quite old Seagate 300 gig drives, at about
half their rated SMART operating hours but I think well beyond the 5 year
warranty, I originally thought I'd just learned my lesson on the "don't
use all the same model or you're risking them all going out at once" rule,
but I bought a new drive (half-TB Seagate 2.5" drive; I've been thinking
about going 2.5" for a while now and this was the chance; I'll RAID it
later with at least one more, preferably a different run at least if not
a different model) and have been SLOWLY, PAINFULLY, RESETTINGLY copying
stuff over from one or another of the four RAID-1 drives.

The reset problem, however, hasn't gone away, tho it's rather reduced on
the newer hardware.

I also happened to have a 4-3.5-in-3-5.25-slot drive enclosure that
seemed to be making the problem worse, as when I first tried the new 2.5
inch retrofitted into it, the reset problem was as bad with it as with
the old drives, but when I ran it "loose", just cabled into the mobo and
power-supply directly, resets went down significantly but did NOT go away.


So... I've now concluded that I need a new controller and will probably
buy one in a day or two.

Meanwhile, I THOUGHT it was "just me" with the SIL-SATA controller, until
I happened to see the same hardware mentioned on this thread.

Now, I'm beginning to suspect that there's some new kernel DMA or storage
or perhaps xorg/mesa (AMD AGPGART, after all, handling the DMA using half
the aperture; if either the graphics or storage try writing to the wrong
half...) problem that stressed what was already aging hardware,
triggering the problem.  It's worth noting that I tried running an older
kernel and rebuilding (on Gentoo) most of
X/mesa/anything-else-I-could-think-might-be-related between older
versions that WERE working fine before and newer versions, and reverting
to older didn't help, so it's apparently NOT a direct software-only-bug.
However, what I'm wondering now is whether, as I said, software upgrades
added stress to already aging hardware, such that it tipped it over the
edge, and by the time I tried reverting, I'd already had enough crashes
etc. that my entire system was unstable, and reverting to older software
didn't help because now the hardware was unstable as well.

I'd still chalk it up to simply failing hardware, except that it's a
rather interesting coincidence that both you and I had our SIL-SATA
3114s go bad at very close to the same time.


Meanwhile, I did recently see an interesting kernel commit, either late
3.4-rc5+ or early 3.4-rc6+.  I don't want to try to track it down and
lose this post to a crash on a less than stable system, but it did
mention that AMD AGPGARTs sometimes poked holes in memory allocations and
the commit was to try to allow for that.  I'm not sure how long the bad
code had been in the kernel, but if it was introduced at say the 3.2 or
3.3 kernel, it could be that is what first started triggering the lockups
that led to more and more system instability, until now I've bought a
new drive and it looks like I'm going to need to replace the onboard
SIL-SATA.

So, some questions:

* Do you run OpenGL/Mesa at all on that system, possibly with an OpenGL 
compositing window manager?

* If so, how new is your mesa and xorg-server, and what is your video 
card/driver?

* Do you run quite new kernels, say 3.3/3.4?

* What libffi and cairo? (I did notice reverting libffi seemed to lessen
the crashing a bit, especially with firefox on my bank's SSL site, which
was where the problem first became ugly for me as I kept crashing trying
to get in to pay bills, etc, but I'm not positive that's related, or it
might be that the crashes from that likely otherwise separate bug
advanced the ATA-resets issue too.)

* Perhaps most critically, is your system an old AMD with the AGPGART?

* Also, amd64/x86_64, x86 (32), or?

FWIW, amd64, KDE 4.8 here with kwin OpenGL compositing, generally leading
edge mesa/xorg.  I run git kernels so am on pre-release 3.4 now, and was
pre-release 3.3 before that, when the problem perhaps started.  (It
seemed to get worse so I can't say for sure when it went from normal to
getting gradually worse, but for sure it wasn't back in the 3.2 era as I
was stable and happy back then.)  Radeon hd4650 card, freedomware drivers.

If any of that, especially the AGPGART, sounds familiar, we may have a
hardware-burner bug that caught us both.  If you're running a bit older
versions of all that stuff or no compositing/opengl, and have say an
nVidia card and no AMD AGPGART, it's probably simply coincidence.  But if
it's not, and we can catch and get this fixed before the folks running
older software as well upgrade and start burning their SIL-SATAs...

(FWIW, I hadn't yet upgraded to btrfs at all when the trouble started
happening here, tho I was looking at it, thus my being on the list.  I
didn't trust the two-way-only btrfs raid1 mode on my older disks and was
waiting on N-way raid1 mode, roadmapped for after raid-5/6 mode, which is
now roadmapped for 3.5...  But with a new disk, eventually to add another
for raid, I don't have that problem now, so with the upgrade I'm trying
btrfs dual-metadata single-data on a few working partitions now; backup's
still reiserfs, tho.)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


Atila

2012-May-09 18:06 UTC

Re: kernel 3.3.4 damages filesystem (?)

I don't know if this is related or not, but I updated two different
computers to Ubuntu 12.04, which uses kernel 3.2, and on both I had the
same problem: using btrfs with compress-force=lzo, after some IO stress
the filesystem became unusable, apparently stuck busy.
I'm using kernel 3.0 right now, with no such problem.

On 09-05-2012 14:32, Duncan wrote:
> Helmut Hullen posted on Mon, 07 May 2012 12:46:00 +0200 as excerpted:
>
>> The 3 btrfs disks are connected via a SiI 3114 SATA-PCI-Controller.
>> Only 1 of the 3 disks seems to be damaged.
> [...]

Helmut Hullen

2012-May-09 18:49 UTC

Re: failed disk

Hello, Hugo,

You wrote on 09.05.12:

>>>    As to the spurious "upgrade" of single to RAID-0, I thought Ilya
>>> had stopped it doing that. What kernel version are you running?
>> 3.2.9, self made.
>    OK, I'm pretty sure that's too old -- it will "upgrade" single to
> RAID-0. You can probably turn it back to "single" using balance
> filters:
> # btrfs fi balance -dconvert=single /mountpoint
> (You may want to write at least a little data to the FS first --
> balance has some slightly odd behaviour on empty filesystems).

Tomorrow ... the system is just now running "balance" after "device
delete". And that may still need 4 ... 5 hours.

>>>    Out of interest, why did you do the device adds separately,
>>> instead of just this?
>> a) making the first 2 devices: I have tested both versions (one line
>> with 2 devices or 2 lines with 1 device); no big difference.
>>
>> But I had tested the option "-L" (labelling) too, and that makes
>> shit for the oneliner: both devices get the same label, and then
>> "findfs" finds none of them.
>    Umm... Yes, of course both devices will get the same label --
> you're labelling the filesystem, not the devices. (Didn't we have
> this argument some time ago?).

Not with that special case (and that led me to misinterpreting the error
...).

>    I don't know what "findfs" is doing, that it can't find the
> filesystem by label: you may need to run "sync" after mkfs, possibly.

No - "findfs" works quite simply: if it finds exactly 1 matching label,
it reports that partition.
If it finds more or less labels it tells nothing.
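
Editorial sketch of the lookup rule as Helmut describes it: a label resolves to a partition only when exactly one partition carries it. This Python is a toy model of that description, not the real findfs from util-linux; find_by_label and the device table are hypothetical.

```python
def find_by_label(label, labels_by_device):
    """Return the single device carrying `label`, or None if zero or several match."""
    matches = sorted(dev for dev, lab in labels_by_device.items() if lab == label)
    return matches[0] if len(matches) == 1 else None


# Hypothetical table: "mkfs.btrfs -L" labels the filesystem, so both member
# devices of a two-device btrfs report the same label.
EXAMPLE_TABLE = {"/dev/sdk1": "Scsi", "/dev/sdl1": "Scsi", "/dev/sda1": "root"}

if __name__ == "__main__":
    print(find_by_label("root", EXAMPLE_TABLE))
    print(find_by_label("Scsi", EXAMPLE_TABLE))
```

Under this rule a multi-device btrfs labelled at mkfs time can never be found by label, which matches the behaviour reported above.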

>> b) third device: that's my usual test:
>>         make a cluster of 2 devices
>>         fill them with data
>>         add a third device
>>         delete the smallest device
>    What are you testing? And by "delete" do you mean "btrfs dev
> delete" or "pull the cable out"?

First a pure software delete. Tomorrow I'll reboot the system and look at
the results with

        btrfs fi show

It should show only 2 devices (that's the part which seems to work as
described at least since kernel 3.2).

By the way: it seems to be necessary to run

        btrfs fi balance ...

after "btrfs device add ..." and after "btrfs device delete ...".

Best regards!
Helmut

Helmut Hullen

2012-May-10 02:49 UTC

Re: failed disk

Hello, Hugo,

You wrote on 09.05.12:

>>         btrfs fi df /mnt/Scsi
>>
>> now tells
>>
>> Data, RAID0: total=183.18GB, used=76.60GB
>> Data: total=80.01GB, used=79.83GB
>> System, DUP: total=8.00MB, used=32.00KB
>> System: total=4.00MB, used=0.00
>> Metadata, DUP: total=1.00GB, used=192.74MB
>> Metadata: total=8.00MB, used=0.00
>>
>> --------------------------------------
>>
>> "Data, RAID0" confuses me (not very much ...), and the system for
>> metadata (RAID1) is not told.
>    DUP is two copies of each block, but it allows the two copies to
> live on the same device. It's done this because you started with a
> single device, and you can't do RAID-1 on one device. The first bit
> of metadata you write to it should automatically upgrade the DUP
> chunk to RAID-1.

It has done so - ok. Adding and removing disks/partitions works as
expected.

Best regards!
Helmut

Martin Steigerwald

2012-May-10 10:40 UTC

Re: kernel 3.3.4 damages filesystem (?)

On Wednesday, 9 May 2012, Kaspar Schleiser wrote:
> Hi,
> 
> On 05/08/2012 10:56 PM, Roman Mamedov wrote:
> > Regarding btrfs, AFAIK even "btrfs -d single" suggested above works
> > not "per file", but per allocation extent, so in case of one disk
> > failure you will lose random *parts* (extents) of random files,
> > which in effect could mean no file in your whole file system will
> > remain undamaged.
> 
> Maybe we should evaluate the possibility of such a "one file gets on
> one disk" feature.
> 
> Helmut Hullen has the use case: Many disks, totally non-critical but
> nice-to-have data. If one disk dies, some *files* should be lost, not
> some *random parts of all files*.
> 
> This could be accomplished by some userspace-tool that moves stuff
> around, combined with "file pinning"-support, that lets the user make
> sure a specific file is on a specific disk.

Yeah, basically I think that's the whole point Helmut is trying to make.

I am not sure whether that should be in userspace. It could be just an
allocation mode like "raid0" or "single". Such as "single" as in one file
is really on one disk and that's it.
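
Editorial sketch of the difference under discussion, as a toy model in Python: extents handed out round-robin across disks regardless of file boundaries, versus every file kept whole on one disk. This is not how the btrfs allocator actually works (it operates on chunks/block groups); the function names, file names and extent counts are hypothetical and only illustrate how many files a single failed disk touches in each scheme.

```python
def place_per_extent(file_extents, n_disks):
    """Spread every extent round-robin over the disks, ignoring file boundaries."""
    placement, next_disk = {}, 0
    for name, count in file_extents.items():
        disks = set()
        for _ in range(count):
            disks.add(next_disk)
            next_disk = (next_disk + 1) % n_disks
        placement[name] = disks
    return placement


def place_per_file(file_extents, n_disks):
    """Keep all extents of a file on one disk, choosing disks round-robin per file."""
    return {name: {index % n_disks} for index, name in enumerate(file_extents)}


def damaged_files(placement, failed_disk):
    """A file counts as damaged if any of its extents lived on the failed disk."""
    return sorted(name for name, disks in placement.items() if failed_disk in disks)


# Hypothetical files with their extent counts.
FILES = {"a.avi": 4, "b.avi": 3, "c.avi": 5, "d.avi": 3, "e.avi": 6, "f.avi": 3}

if __name__ == "__main__":
    print("per extent:", damaged_files(place_per_extent(FILES, 3), failed_disk=0))
    print("per file:  ", damaged_files(place_per_file(FILES, 3), failed_disk=0))
```

In this model any file with at least as many extents as there are disks ends up on every disk, so one failure damages all of them, while whole-file placement limits the loss to the files that lived on the failed disk.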

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

Helmut Hullen

2012-May-10 11:55 UTC

feature request (was: kernel 3.3.4 damages filesystem (?))

Hello, Martin,

You wrote on 10.05.12:

[...]

>> Maybe we should evaluate the possibility of such a "one file gets
>> on one disk" feature.
>>
>> Helmut Hullen has the use case: Many disks, totally non-critical but
>> nice-to-have data. If one disk dies, some *files* should be lost, not
>> some *random parts of all files*.
>>
>> This could be accomplished by some userspace-tool that moves stuff
>> around, combined with "file pinning"-support, that lets the user
>> make sure a specific file is on a specific disk.
> Yeah, basically I think that's the whole point Helmut is trying to
> make.

Yes - that's the feature which I miss ...

> I am not sure whether that should be in userspace. It could be just
> an allocation mode like "raid0" or "single". Such as "single" as in
> one file is really on one disk and that's it.

What I'm dreaming of:

I have a bundle/cluster of (e.g.) 3 disks. When I remove 1 disk
(accidentally/planned/because of disk failure) then I'd be very pleased
if the contents of the other disks were (mostly) still readable.

It's no fun restoring terabytes ...

Yes - I know: that's no backup, that doesn't replace a backup.

Best regards!
Helmut

Hubert Kario

2012-May-10 19:43 UTC

Re: kernel 3.3.4 damages filesystem (?)

On Thursday 10 of May 2012 12:40:49 Martin Steigerwald wrote:
> On Wednesday, 9 May 2012, Kaspar Schleiser wrote:
> > Hi,
> > 
> > On 05/08/2012 10:56 PM, Roman Mamedov wrote:
> > > Regarding btrfs, AFAIK even "btrfs -d single" suggested above works
> > > not "per file", but per allocation extent, so in case of one disk
> > > failure you will lose random *parts* (extents) of random files,
> > > which in effect could mean no file in your whole file system will
> > > remain undamaged.
> > 
> > Maybe we should evaluate the possibility of such a "one file gets on
> > one disk" feature.
> > 
> > Helmut Hullen has the use case: Many disks, totally non-critical but
> > nice-to-have data. If one disk dies, some *files* should be lost, not
> > some *random parts of all files*.
> > 
> > This could be accomplished by some userspace-tool that moves stuff
> > around, combined with "file pinning"-support, that lets the user make
> > sure a specific file is on a specific disk.
> 
> Yeah, basically I think that's the whole point Helmut is trying to make.
> 
> I am not sure whether that should be in userspace. It could be just an
> allocation mode like "raid0" or "single". Such as "single" as in one file
> is really on one disk and that's it.

I was thinking that "linear" would be a good name for the old-style
allocator.

Regards
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl

Hugo Mills

2012-May-10 20:15 UTC

Re: kernel 3.3.4 damages filesystem (?)

On Thu, May 10, 2012 at 09:43:58PM +0200, Hubert Kario wrote:
> On Thursday 10 of May 2012 12:40:49 Martin Steigerwald wrote:
> > On Wednesday, 9 May 2012, Kaspar Schleiser wrote:
> > > Hi,
> > > 
> > > On 05/08/2012 10:56 PM, Roman Mamedov wrote:
> > > > Regarding btrfs, AFAIK even "btrfs -d single" suggested above works
> > > > not "per file", but per allocation extent, so in case of one disk
> > > > failure you will lose random *parts* (extents) of random files,
> > > > which in effect could mean no file in your whole file system will
> > > > remain undamaged.
> > > 
> > > Maybe we should evaluate the possibility of such a "one file gets on
> > > one disk" feature.
> > > 
> > > Helmut Hullen has the use case: Many disks, totally non-critical but
> > > nice-to-have data. If one disk dies, some *files* should be lost, not
> > > some *random parts of all files*.
> > > 
> > > This could be accomplished by some userspace-tool that moves stuff
> > > around, combined with "file pinning"-support, that lets the user make
> > > sure a specific file is on a specific disk.
> > 
> > Yeah, basically I think that's the whole point Helmut is trying to make.
> > 
> > I am not sure whether that should be in userspace. It could be just an
> > allocation mode like "raid0" or "single". Such as "single" as in one file
> > is really on one disk and that's it.
> 
> I was thinking that "linear" would be a good name for the old-style
> allocator.

   Please do distinguish between the replication level (e.g. "single",
"RAID-1") and the allocator algorithm. These are distinct. Also, note
that both of those work on the scale of chunks/block groups. There is
a further consideration, which is the allocation of file data to block
groups, which is a whole different thing again (and not something I
know a great deal about), but which will also affect the desired
outcome quite a lot.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- Anyone who claims their cryptographic protocol is secure is ---   
         either a genius or a fool.  Given the genius/fool ratio         
                 for our species,  the odds aren't good.

Hubert Kario

2012-May-10 20:23 UTC

Re: kernel 3.3.4 damages filesystem (?)

On Thursday 10 of May 2012 21:15:30 Hugo Mills wrote:
> On Thu, May 10, 2012 at 09:43:58PM +0200, Hubert Kario wrote:
> > On Thursday 10 of May 2012 12:40:49 Martin Steigerwald wrote:
> > > On Wednesday, 9 May 2012, Kaspar Schleiser wrote:
> > > > Hi,
> > > > 
> > > > On 05/08/2012 10:56 PM, Roman Mamedov wrote:
> > > > > Regarding btrfs, AFAIK even "btrfs -d single" suggested above
> > > > > works not "per file", but per allocation extent, so in case of
> > > > > one disk failure you will lose random *parts* (extents) of
> > > > > random files, which in effect could mean no file in your whole
> > > > > file system will remain undamaged.
> > > > 
> > > > Maybe we should evaluate the possibility of such a "one file gets
> > > > on one disk" feature.
> > > > 
> > > > Helmut Hullen has the use case: Many disks, totally non-critical
> > > > but nice-to-have data. If one disk dies, some *files* should be
> > > > lost, not some *random parts of all files*.
> > > > 
> > > > This could be accomplished by some userspace-tool that moves stuff
> > > > around, combined with "file pinning"-support, that lets the user
> > > > make sure a specific file is on a specific disk.
> > > 
> > > Yeah, basically I think that's the whole point Helmut is trying to
> > > make.
> > > 
> > > I am not sure whether that should be in userspace. It could be just
> > > an allocation mode like "raid0" or "single". Such as "single" as in
> > > one file is really on one disk and that's it.
> > 
> > I was thinking that "linear" would be a good name for the old-style
> > allocator.
> 
>    Please do distinguish between the replication level (e.g. "single",
> "RAID-1") and the allocator algorithm. These are distinct. Also, note
> that both of those work on the scale of chunks/block groups. There is
> a further consideration, which is the allocation of file data to block
> groups, which is a whole different thing again (and not something I
> know a great deal about), but which will also affect the desired
> outcome quite a lot.

Yes, I know about that.

I was thinking more along the lines of "how to quickly restore the
availability of the old allocator".

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl
