Hello,

"never change a running system" ...

For some months I have run btrfs under kernels 3.2.5 and 3.2.9 without problems.

Yesterday I compiled kernel 3.3.4, and this morning I booted the machine with this kernel. There may be some ugly problems.

Copying something into the btrfs "directory" worked well for some files, and then I got error messages (I didn't save them; something about an "IO error" under Samba).

Rebooting the machine with kernel 3.2.9 worked, and copying 1 file worked, but copying more than this one file didn't work. And I can't delete this file.

That doesn't please me - copying more than 4 TBytes wastes time and money.

=========== configuration ================

/dev/sdc1 on /srv/MM type btrfs (rw,noatime)

/dev/sdc: SAMSUNG HD204UI: 25 °C
/dev/sdf: WDC WD30EZRX-00MMMB0: 30 °C
/dev/sdi: WDC WD30EZRX-00MMMB0: 29 °C

Data, RAID0: total=5.29TB, used=4.29TB
System, RAID1: total=8.00MB, used=352.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=149.00GB, used=5.00GB

Label: 'MMedia' uuid: 9adfdc84-0fbe-431b-bcb1-cabb6a915e91
    Total devices 3 FS bytes used 4.29TB
    devid 3 size 2.73TB used 1.98TB path /dev/sdi1
    devid 2 size 2.73TB used 1.94TB path /dev/sdf1
    devid 1 size 1.82TB used 1.63TB path /dev/sdc1

Btrfs Btrfs v0.19

=================== boot messages, kernel related =============

[boot with kernel 3.3.4]
May 7 06:55:26 Arktur kernel: ata5: exception Emask 0x10 SAct 0x0 SErr 0x10000 action 0xe frozen May 7 06:55:26 Arktur kernel: ata5: SError: { PHYRdyChg } May 7 06:55:26 Arktur kernel: ata5: hard resetting link May 7 06:55:31 Arktur kernel: ata5: COMRESET failed (errno=-19) May 7 06:55:31 Arktur kernel: ata5: reset failed (errno=-19), retrying in 6 secs May 7 06:55:36 Arktur kernel: ata5: hard resetting link May 7 06:55:38 Arktur kernel: ata5: COMRESET failed (errno=-19) May 7 06:55:38 Arktur kernel: ata5: reset failed (errno=-19), retrying in 9 secs May 7 06:55:46 Arktur kernel: ata5: hard resetting link May 7 06:55:47 Arktur kernel: ata5: COMRESET
failed (errno=-19) May 7 06:55:47 Arktur kernel: ata5: reset failed (errno=-19), retrying in 34 secs May 7 06:56:21 Arktur kernel: ata5: hard resetting link May 7 06:56:22 Arktur kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310) May 7 06:56:22 Arktur kernel: ata5.00: configured for UDMA/100 May 7 06:56:22 Arktur kernel: ata5: EH complete May 7 07:12:07 Arktur kernel: ata5.00: exception Emask 0x10 SAct 0x0 SErr 0x10000 action 0xe frozen May 7 07:12:07 Arktur kernel: ata5: SError: { PHYRdyChg } May 7 07:12:07 Arktur kernel: ata5.00: failed command: WRITE DMA EXT May 7 07:12:07 Arktur kernel: ata5.00: cmd 35/00:00:00:62:50/00:04:5e:00:00/e0 tag 0 dma 524288 out May 7 07:12:07 Arktur kernel: res d8/d8:d8:d8:d8:d8/d8:d8:d8:d8:d8/d8 Emask 0x12 (ATA bus error) May 7 07:12:07 Arktur kernel: ata5.00: status: { Busy } May 7 07:12:07 Arktur kernel: ata5.00: error: { ICRC UNC IDNF } May 7 07:12:07 Arktur kernel: ata5: hard resetting link May 7 07:12:13 Arktur kernel: ata5: link is slow to respond, please be patient (ready=-19) May 7 07:12:15 Arktur kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310) May 7 07:12:15 Arktur kernel: ata5.00: failed to IDENTIFY (I/O error, err_mask=0x100) May 7 07:12:15 Arktur kernel: ata5.00: revalidation failed (errno=-5) May 7 07:12:20 Arktur kernel: ata5: hard resetting link May 7 07:12:20 Arktur kernel: ata5: COMRESET failed (errno=-19) May 7 07:12:20 Arktur kernel: ata5: reset failed (errno=-19), retrying in 10 secs May 7 07:12:30 Arktur kernel: ata5: hard resetting link May 7 07:12:30 Arktur kernel: ata5: COMRESET failed (errno=-19) May 7 07:12:30 Arktur kernel: ata5: reset failed (errno=-19), retrying in 10 secs May 7 07:12:40 Arktur kernel: ata5: hard resetting link May 7 07:12:42 Arktur kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310) May 7 07:12:43 Arktur kernel: ata5.00: configured for UDMA/100 May 7 07:12:43 Arktur kernel: ata5: EH complete May 7 07:12:43 Arktur kernel: ata5.00: exception Emask 
0x10 SAct 0x0 SErr 0x10000 action 0xe frozen May 7 07:12:43 Arktur kernel: ata5: SError: { PHYRdyChg } May 7 07:12:43 Arktur kernel: ata5.00: failed command: WRITE DMA EXT May 7 07:12:43 Arktur kernel: ata5.00: cmd 35/00:00:00:72:50/00:04:5e:00:00/e0 tag 0 dma 524288 out May 7 07:12:43 Arktur kernel: res d0/d0:d0:d0:d0:d0/d0:d0:d0:d0:d0/d0 Emask 0x12 (ATA bus error) May 7 07:12:43 Arktur kernel: ata5.00: status: { Busy } May 7 07:12:43 Arktur kernel: ata5.00: error: { ICRC UNC IDNF } May 7 07:12:43 Arktur kernel: ata5: hard resetting link May 7 07:12:49 Arktur kernel: ata5: link is slow to respond, please be patient (ready=-19) May 7 07:12:50 Arktur kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310) May 7 07:12:51 Arktur kernel: ata5.00: configured for UDMA/100 May 7 07:12:51 Arktur kernel: ata5: EH complete May 7 07:12:51 Arktur kernel: ata5: exception Emask 0x10 SAct 0x0 SErr 0x10000 action 0xe frozen May 7 07:12:51 Arktur kernel: ata5: SError: { PHYRdyChg } May 7 07:12:51 Arktur kernel: ata5: hard resetting link May 7 07:12:54 Arktur kernel: ata5: COMRESET failed (errno=-19) May 7 07:12:54 Arktur kernel: ata5: reset failed (errno=-19), retrying in 7 secs May 7 07:13:01 Arktur kernel: ata5: hard resetting link May 7 07:13:04 Arktur kernel: ata5: COMRESET failed (errno=-19) May 7 07:13:04 Arktur kernel: ata5: reset failed (errno=-19), retrying in 7 secs May 7 07:13:11 Arktur kernel: ata5: hard resetting link May 7 07:13:14 Arktur kernel: ata5: COMRESET failed (errno=-19) May 7 07:13:14 Arktur kernel: ata5: reset failed (errno=-19), retrying in 33 secs May 7 07:13:46 Arktur kernel: ata5: hard resetting link May 7 07:13:47 Arktur kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310) May 7 07:13:47 Arktur kernel: ata5.00: failed to read native max address (err_mask=0x100) May 7 07:13:47 Arktur kernel: ata5.00: HPA support seems broken, skipping HPA handling May 7 07:13:47 Arktur kernel: ata5.00: revalidation failed (errno=-5) May 7 07:13:52 
Arktur kernel: ata5: hard resetting link May 7 07:13:53 Arktur kernel: ata5: COMRESET failed (errno=-19) May 7 07:13:53 Arktur kernel: ata5: reset failed (errno=-19), retrying in 9 secs May 7 07:14:02 Arktur kernel: ata5: hard resetting link May 7 07:14:05 Arktur kernel: ata5: COMRESET failed (errno=-19) May 7 07:14:05 Arktur kernel: ata5: reset failed (errno=-19), retrying in 8 secs May 7 07:14:12 Arktur kernel: ata5: hard resetting link May 7 07:14:14 Arktur kernel: ata5: COMRESET failed (errno=-19) May 7 07:14:14 Arktur kernel: ata5: reset failed (errno=-19), retrying in 33 secs May 7 07:14:47 Arktur kernel: ata5: hard resetting link May 7 07:14:47 Arktur kernel: ata5: COMRESET failed (errno=-19) May 7 07:14:47 Arktur kernel: ata5: reset failed, giving up May 7 07:14:47 Arktur kernel: ata5.00: disabled May 7 07:14:47 Arktur kernel: ata5: exception Emask 0x10 SAct 0x0 SErr 0x10000 action 0xe frozen t4 May 7 07:14:47 Arktur kernel: ata5: SError: { PHYRdyChg } May 7 07:14:47 Arktur kernel: ata5: hard resetting link May 7 07:14:47 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device May 7 07:14:47 Arktur kernel: sd 5:0:0:0: [sdf] killing request May 7 07:14:47 Arktur kernel: sd 5:0:0:0: [sdf] Unhandled error code May 7 07:14:47 Arktur kernel: sd 5:0:0:0: [sdf] Result: hostbyte=0x01 driverbyte=0x00 May 7 07:14:47 Arktur kernel: sd 5:0:0:0: [sdf] CDB: cdb[0]=0x28: 28 00 d0 d1 07 20 00 00 08 00 May 7 07:14:47 Arktur kernel: end_request: I/O error, dev sdf, sector 3503359776 May 7 07:14:48 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device May 7 07:14:48 Arktur kernel: end_request: I/O error, dev sdf, sector 0 May 7 07:14:48 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device May 7 07:14:49 Arktur kernel: lost page write due to I/O error on sdf1 May 7 07:14:49 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device May 7 07:14:49 Arktur kernel: lost page write due to I/O error on sdf1 May 7 07:14:49 Arktur kernel: sd 5:0:0:0: rejecting I/O to 
offline device May 7 07:14:49 Arktur kernel: lost page write due to I/O error on sdf1 May 7 07:14:50 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device May 7 07:14:54 Arktur kernel: ata5: link is slow to respond, please be patient (ready=-19) May 7 07:14:57 Arktur kernel: ata5: COMRESET failed (errno=-16) May 7 07:14:57 Arktur kernel: ata5: hard resetting link May 7 07:15:01 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device May 7 07:15:03 Arktur kernel: ata5: link is slow to respond, please be patient (ready=-19) May 7 07:15:07 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device May 7 07:15:07 Arktur kernel: ata5: COMRESET failed (errno=-19) May 7 07:15:07 Arktur kernel: ata5: reset failed (errno=-19), retrying in 1 secs May 7 07:15:07 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device May 7 07:15:07 Arktur kernel: ata5: hard resetting link May 7 07:15:07 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device May 7 07:15:12 Arktur kernel: ata5: COMRESET failed (errno=-19) May 7 07:15:12 Arktur kernel: ata5: reset failed (errno=-19), retrying in 31 secs May 7 07:15:19 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device May 7 07:15:19 Arktur kernel: end_request: I/O error, dev sdf, sector 0 May 7 07:15:19 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device May 7 07:15:19 Arktur kernel: lost page write due to I/O error on sdf1 May 7 07:15:19 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device May 7 07:15:19 Arktur kernel: lost page write due to I/O error on sdf1 May 7 07:15:19 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device May 7 07:15:19 Arktur kernel: lost page write due to I/O error on sdf1 May 7 07:15:22 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device May 7 07:15:42 Arktur kernel: ata5: hard resetting link May 7 07:15:44 Arktur kernel: ata5: COMRESET failed (errno=-19) May 7 07:15:44 Arktur kernel: ata5: reset failed, giving up May 7 07:15:44 Arktur kernel: ata5: exception Emask 0x10 
SAct 0x0 SErr 0x10000 action 0xe frozen t3 May 7 07:15:44 Arktur kernel: ata5: SError: { PHYRdyChg } May 7 07:15:44 Arktur kernel: ata5: hard resetting link May 7 07:15:44 Arktur kernel: ata5: COMRESET failed (errno=-19) May 7 07:15:44 Arktur kernel: ata5: reset failed (errno=-19), retrying in 10 secs May 7 07:15:49 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device May 7 07:15:50 Arktur kernel: end_request: I/O error, dev sdf, sector 0 May 7 07:15:50 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device May 7 07:15:50 Arktur kernel: lost page write due to I/O error on sdf1 May 7 07:15:50 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device May 7 07:15:50 Arktur kernel: lost page write due to I/O error on sdf1 May 7 07:15:50 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device May 7 07:15:50 Arktur kernel: lost page write due to I/O error on sdf1 May 7 07:15:54 Arktur kernel: ata5: hard resetting link May 7 07:15:55 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device May 7 07:15:59 Arktur kernel: ata5: link is slow to respond, please be patient (ready=-19) May 7 07:16:04 Arktur kernel: ata5: COMRESET failed (errno=-16) May 7 07:16:04 Arktur kernel: ata5: hard resetting link May 7 07:16:05 Arktur kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310) May 7 07:16:05 Arktur kernel: ata5.00: ATA-8: WDC WD30EZRX-00MMMB0, 80.00A80, max UDMA/133 May 7 07:16:05 Arktur kernel: ata5.00: 5860533168 sectors, multi 0: LBA48 NCQ (depth 0/32) May 7 07:16:05 Arktur kernel: ata5.00: configured for UDMA/100 May 7 07:16:05 Arktur kernel: ata5: EH complete May 7 07:16:05 Arktur kernel: ata5.00: detaching (SCSI 5:0:0:0) May 7 07:16:05 Arktur kernel: sd 5:0:0:0: [sdf] Synchronizing SCSI cache May 7 07:16:20 Arktur kernel: end_request: I/O error, dev sdf, sector 0 May 7 07:16:20 Arktur kernel: lost page write due to I/O error on sdf1 May 7 07:22:05 Arktur kernel: sd 5:0:0:0: timing out command, waited 360s May 7 07:28:05 Arktur kernel: sd 5:0:0:0: 
timing out command, waited 360s May 7 07:34:05 Arktur kernel: sd 5:0:0:0: timing out command, waited 360s May 7 07:34:05 Arktur kernel: sd 5:0:0:0: [sdf] Result: hostbyte=0x00 driverbyte=0x00 May 7 07:34:05 Arktur kernel: sd 5:0:0:0: [sdf] Stopping disk May 7 07:37:05 Arktur kernel: sd 5:0:0:0: timing out command, waited 180s May 7 07:37:05 Arktur kernel: sd 5:0:0:0: [sdf] START_STOP FAILED May 7 07:37:05 Arktur kernel: sd 5:0:0:0: [sdf] Result: hostbyte=0x00 driverbyte=0x00 May 7 07:37:06 Arktur kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310) May 7 07:37:07 Arktur kernel: ata5.00: configured for UDMA/100 May 7 07:37:07 Arktur kernel: scsi 5:0:0:0: Direct-Access ATA WDC WD30EZRX-00M 80.0 PQ: 0 ANSI: 5 May 7 10:47:22 Arktur kernel: lost page write due to I/O error on sdf1 May 7 11:11:21 Arktur kernel: lost page write due to I/O error on sdf1 May 7 11:12:07 Arktur kernel: lost page write due to I/O error on sdf1 [reboot with kernel 3.2.9] May 7 11:15:25 Arktur kernel: ata5.00: configured for UDMA/100 May 7 11:15:25 Arktur kernel: scsi 5:0:0:0: Direct-Access ATA WDC WD30EZRX-00M 80.0 PQ: 0 ANSI: 5 May 7 11:15:26 Arktur kernel: sd 5:0:0:0: [sdf] 5860533168 512-byte logical blocks: (3.00 TB/2.72 TiB) May 7 11:15:26 Arktur kernel: sd 5:0:0:0: [sdf] 4096-byte physical blocks May 7 11:15:26 Arktur kernel: sd 5:0:0:0: [sdf] Write Protect is off May 7 11:15:26 Arktur kernel: sd 5:0:0:0: [sdf] Mode Sense: 00 3a 00 00 May 7 11:15:26 Arktur kernel: sd 5:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn''t support DPO or FUA May 7 11:15:26 Arktur kernel: sdf: sdf1 May 7 11:15:26 Arktur kernel: sd 5:0:0:0: [sdf] Attached SCSI disk ============= dmesg output ============== btrfs: free space inode generation (0) did not match free space cache generation (36740) btrfs: space cache generation (36727) does not match inode (36747) btrfs: failed to load free space cache for block group 9193084223488 ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 
ata5.00: BMDMA2 stat 0x80d0009 ata5.00: failed command: READ DMA EXT ata5.00: cmd 25/00:80:00:b3:d7/00:00:02:01:00/e0 tag 0 dma 65536 in res 51/40:6f:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error) ata5.00: status: { DRDY ERR } ata5.00: error: { UNC } ata5.00: configured for UDMA/100 ata5: EH complete ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 ata5.00: BMDMA2 stat 0x80d0009 ata5.00: failed command: READ DMA EXT ata5.00: cmd 25/00:80:00:b3:d7/00:00:02:01:00/e0 tag 0 dma 65536 in res 51/40:6f:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error) ata5.00: status: { DRDY ERR } ata5.00: error: { UNC } ata5.00: configured for UDMA/100 ata5: EH complete ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 ata5.00: BMDMA2 stat 0x80d0009 ata5.00: failed command: READ DMA EXT ata5.00: cmd 25/00:80:00:b3:d7/00:00:02:01:00/e0 tag 0 dma 65536 in res 51/40:6f:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error) ata5.00: status: { DRDY ERR } ata5.00: error: { UNC } ata5.00: configured for UDMA/100 ata5: EH complete ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 ata5.00: BMDMA2 stat 0x80d0009 ata5.00: failed command: READ DMA EXT ata5.00: cmd 25/00:80:00:b3:d7/00:00:02:01:00/e0 tag 0 dma 65536 in res 51/40:6f:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error) ata5.00: status: { DRDY ERR } ata5.00: error: { UNC } ata5.00: configured for UDMA/100 ata5: EH complete ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 ata5.00: BMDMA2 stat 0x80d0009 ata5.00: failed command: READ DMA EXT ata5.00: cmd 25/00:80:00:b3:d7/00:00:02:01:00/e0 tag 0 dma 65536 in res 51/40:6f:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error) ata5.00: status: { DRDY ERR } ata5.00: error: { UNC } ata5.00: configured for UDMA/100 ata5: EH complete ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 ata5.00: BMDMA2 stat 0x80d0009 ata5.00: failed command: READ DMA EXT ata5.00: cmd 25/00:80:00:b3:d7/00:00:02:01:00/e0 tag 0 dma 65536 in res 51/40:6f:08:b3:d7/00:00:02:01:00/f0 
Emask 0x9 (media error) ata5.00: status: { DRDY ERR } ata5.00: error: { UNC } ata5.00: configured for UDMA/100 sd 5:0:0:0: [sdf] Unhandled sense code sd 5:0:0:0: [sdf] Result: hostbyte=0x00 driverbyte=0x08 sd 5:0:0:0: [sdf] Sense Key : 0x3 [current] [descriptor] Descriptor sense data with sense descriptors (in hex): 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 01 02 d7 b3 08 sd 5:0:0:0: [sdf] ASC=0x11 ASCQ=0x4 sd 5:0:0:0: [sdf] CDB: cdb[0]=0x88: 88 00 00 00 00 01 02 d7 b3 00 00 00 00 80 00 00 end_request: I/O error, dev sdf, sector 4342657800 ata5: EH complete ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 ata5.00: BMDMA2 stat 0x80d0009 ata5.00: failed command: READ DMA EXT ata5.00: cmd 25/00:08:08:b3:d7/00:00:02:01:00/e0 tag 0 dma 4096 in res 51/40:08:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error) ata5.00: status: { DRDY ERR } ata5.00: error: { UNC } ata5.00: configured for UDMA/100 ata5: EH complete ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 ata5.00: BMDMA2 stat 0x80d0009 ata5.00: failed command: READ DMA EXT ata5.00: cmd 25/00:08:08:b3:d7/00:00:02:01:00/e0 tag 0 dma 4096 in res 51/40:08:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error) ata5.00: status: { DRDY ERR } ata5.00: error: { UNC } ata5.00: configured for UDMA/100 ata5: EH complete ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 ata5.00: BMDMA2 stat 0x80d0009 ata5.00: failed command: READ DMA EXT ata5.00: cmd 25/00:08:08:b3:d7/00:00:02:01:00/e0 tag 0 dma 4096 in res 51/40:08:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error) ata5.00: status: { DRDY ERR } ata5.00: error: { UNC } ata5.00: configured for UDMA/100 ata5: EH complete ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 ata5.00: BMDMA2 stat 0x80d0009 ata5.00: failed command: READ DMA EXT ata5.00: cmd 25/00:08:08:b3:d7/00:00:02:01:00/e0 tag 0 dma 4096 in res 51/40:08:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error) ata5.00: status: { DRDY ERR } ata5.00: error: { UNC } ata5.00: configured for 
UDMA/100 ata5: EH complete ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 ata5.00: BMDMA2 stat 0x80d0009 ata5.00: failed command: READ DMA EXT ata5.00: cmd 25/00:08:08:b3:d7/00:00:02:01:00/e0 tag 0 dma 4096 in res 51/40:08:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error) ata5.00: status: { DRDY ERR } ata5.00: error: { UNC } ata5.00: configured for UDMA/100 ata5: EH complete ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 ata5.00: BMDMA2 stat 0x80d0009 ata5.00: failed command: READ DMA EXT ata5.00: cmd 25/00:08:08:b3:d7/00:00:02:01:00/e0 tag 0 dma 4096 in res 51/40:08:08:b3:d7/00:00:02:01:00/f0 Emask 0x9 (media error) ata5.00: status: { DRDY ERR } ata5.00: error: { UNC } ata5.00: configured for UDMA/100 sd 5:0:0:0: [sdf] Unhandled sense code sd 5:0:0:0: [sdf] Result: hostbyte=0x00 driverbyte=0x08 sd 5:0:0:0: [sdf] Sense Key : 0x3 [current] [descriptor] Descriptor sense data with sense descriptors (in hex): 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 01 02 d7 b3 08 sd 5:0:0:0: [sdf] ASC=0x11 ASCQ=0x4 sd 5:0:0:0: [sdf] CDB: cdb[0]=0x88: 88 00 00 00 00 01 02 d7 b3 08 00 00 00 08 00 00 end_request: I/O error, dev sdf, sector 4342657800 ata5: EH complete
btrfs: error reading free space cache
BUG: unable to handle kernel NULL pointer dereference at 00000001
IP: [<c1295c36>] io_ctl_drop_pages+0x26/0x50
*pdpt = 0000000029712001 *pde = 0000000000000000
Oops: 0002 [#1]

============== syslogd output ===================================
kernel 3.2.9 after the 3.3.4 try ===

Message from syslogd@Arktur at Mon May 7 11:21:55 2012
Arktur kernel: Oops: 0002 [#1]

Message from syslogd@Arktur at Mon May 7 11:21:56 2012
Arktur kernel: Process flush-btrfs-l (pid: 51 ti=e9f12000 task=f6882a50 task.ti=e9f12000)

Message from syslogd@Arktur at Mon May 7 11:21:56 2012 ...
Arktur kernel: Code: c3 8d 74 26 00 55 89 e5 56 53 3e 8d 74 26 00 89 c6 e8 2f ff ff ff 8b 5e 1c 85 db 7e 30 31 db 8d b6 00 00 00 00 8b 46 0c 8b 04 98 <80> 60 01 fe 8b 46 0c 8b 04 98 e8 1b 96 df ff 8b 46 0c 8b 04 98

Message from syslogd@Arktur at Mon May 7 11:21:56 2012 ...
Arktur kernel: EIP: [<c1295c36>] io_ctl_drop_pages+0x26/0x50 SS:ESP 0068:e9fl396

Message from syslogd@Arktur at Mon May 7 11:21:56 2012
Arktur kernel: CR2: 0000000000000001

=========================================================

The 3 btrfs disks are connected via a SiI 3114 SATA-PCI controller.
Only 1 of the 3 disks seems to be damaged.

=========================================================

Can I repair the system? Or do I have to copy it to a set of other disks?

Best regards!
Helmut

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
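As a cross-check on the logs above, the failing sector numbers can be recovered from the SCSI CDB bytes that the kernel prints. The following is a minimal sketch, assuming only the standard READ(10) (opcode 0x28) and READ(16) (opcode 0x88) layouts; the function name `decode_read_cdb` is made up for illustration and is not part of any tool mentioned in this thread.

```python
def decode_read_cdb(hex_bytes):
    """Return (start_lba, sector_count) for a READ(10) or READ(16) CDB.

    hex_bytes is the space-separated hex dump as printed in the kernel log,
    e.g. "28 00 d0 d1 07 20 00 00 08 00".
    """
    cdb = bytes.fromhex(hex_bytes)  # whitespace between byte pairs is ignored
    opcode = cdb[0]
    if opcode == 0x28:
        # READ(10): 4-byte big-endian LBA at bytes 2..5, 2-byte length at 7..8
        lba = int.from_bytes(cdb[2:6], "big")
        count = int.from_bytes(cdb[7:9], "big")
    elif opcode == 0x88:
        # READ(16): 8-byte big-endian LBA at bytes 2..9, 4-byte length at 10..13
        lba = int.from_bytes(cdb[2:10], "big")
        count = int.from_bytes(cdb[10:14], "big")
    else:
        raise ValueError("unsupported opcode 0x%02x" % opcode)
    return lba, count


if __name__ == "__main__":
    # The two CDBs that appear next to "end_request: I/O error" in the logs.
    print(decode_read_cdb("28 00 d0 d1 07 20 00 00 08 00"))
    print(decode_read_cdb("88 00 00 00 00 01 02 d7 b3 08 00 00 00 08 00 00"))
```

Both results match the sectors reported by end_request (3503359776 and 4342657800), and the 8-sector length of the second request agrees with the "dma 4096 in" shown for the failed READ DMA EXT on a disk with 512-byte logical blocks.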
On Mon, May 7, 2012 at 5:46 PM, Helmut Hullen <Hullen@t-online.de> wrote:
> For some months I run btrfs under kernel 3.2.5 and 3.2.9, without
> problems.
>
> Yesterday I compiled kernel 3.3.4, and this morning I started the
> machine with this kernel. There may be some ugly problems.

> Data, RAID0: total=5.29TB, used=4.29TB

Raid0? Yikes!

> System, RAID1: total=8.00MB, used=352.00KB
> System: total=4.00MB, used=0.00
> Metadata, RAID1: total=149.00GB, used=5.00GB
>
> Label: 'MMedia' uuid: 9adfdc84-0fbe-431b-bcb1-cabb6a915e91
> Total devices 3 FS bytes used 4.29TB
> devid 3 size 2.73TB used 1.98TB path /dev/sdi1
> devid 2 size 2.73TB used 1.94TB path /dev/sdf1
> devid 1 size 1.82TB used 1.63TB path /dev/sdc1
>

> May 7 06:55:26 Arktur kernel: ata5: exception Emask 0x10 SAct 0x0 SErr 0x10000 action 0xe frozen
> May 7 06:55:26 Arktur kernel: ata5: SError: { PHYRdyChg }
> May 7 06:55:26 Arktur kernel: ata5: hard resetting link
> May 7 06:55:31 Arktur kernel: ata5: COMRESET failed (errno=-19)
> May 7 06:55:31 Arktur kernel: ata5: reset failed (errno=-19), retrying in 6 secs

> May 7 07:15:19 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
> May 7 07:15:19 Arktur kernel: end_request: I/O error, dev sdf, sector 0
> May 7 07:15:19 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
> May 7 07:15:19 Arktur kernel: lost page write due to I/O error on sdf1

That looks like a bad disk to me, and it shouldn't be related to the kernel version you use.

Your best chance might be:
- unmount the fs
- get another disk to replace /dev/sdf, and copy the content over with dd_rescue. ATA resets can be a PITA, so you might be better off moving the failed disk to an external USB adapter and doing some creative combination of plug-unplug while selectively skipping bad sectors manually (by passing "-s" to dd_rescue).
- reboot, with the bad disk unplugged
- (optional) run "btrfs filesystem scrub" (you might need to build btrfs-progs manually from git source), or simply read the entire fs (e.g. using tar to /dev/null, or whatever). It should check the checksum of all files and print out which files are damaged (either in stdout or syslog).

I don't think there's anything you can do to recover the damaged files (other than restore from backup), but at least you will know which files are NOT damaged.

--
Fajar
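The "simply read the entire fs" step can be sketched as a small script. This is only an illustration, assuming the filesystem is mounted and that a read hitting a bad checksum or a dead device fails with an I/O error, which is how btrfs normally reports it; the function name `find_unreadable` and the example path are made up here. (As a side note, in current btrfs-progs the scrub subcommand is spelled "btrfs scrub start <mountpoint>".)

```python
import os


def find_unreadable(root, chunk_size=1 << 20):
    """Read every regular file under root and collect those that fail.

    Returns a list of (path, errno) tuples. Files that read cleanly to the
    end are not listed, so anything absent from the result was readable.
    """
    damaged = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    # Read in chunks and discard the data; we only care
                    # whether the kernel can deliver every block.
                    while fh.read(chunk_size):
                        pass
            except OSError as exc:
                damaged.append((path, exc.errno))
    return damaged


if __name__ == "__main__":
    # "/srv/MM" is the mount point from the original report.
    for path, err in find_unreadable("/srv/MM"):
        print("unreadable (errno %s): %s" % (err, path))
```

Compared with tar to /dev/null, this keeps going after an error and yields a list of damaged paths directly, at the cost of not checking anything that is not a regular file.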
On Mon, May 07, 2012 at 12:46:00PM +0200, Helmut Hullen wrote:
> Hello,
>
> "never change a running system" ...
>
> For some months I run btrfs under kernel 3.2.5 and 3.2.9, without
> problems.
>
> Yesterday I compiled kernel 3.3.4, and this morning I started the
> machine with this kernel. There may be some ugly problems.
>
> Copying something into the btrfs "directory" worked well for some files,
> and then I got error messages (I've not copied them, something with "IO
> error" under Samba).
>
> Rebooting the machine with kernel 3.2.9 worked, copying 1 file worked,
> but copying more than this file didn't work. And I can't delete this
> file.
>
> That doesn't please me - copying more than 4 TBytes wastes time and
> money.
>
> =========== configuration ================
>
> /dev/sdc1 on /srv/MM type btrfs (rw,noatime)
>
> /dev/sdc: SAMSUNG HD204UI: 25 °C
> /dev/sdf: WDC WD30EZRX-00MMMB0: 30 °C
> /dev/sdi: WDC WD30EZRX-00MMMB0: 29 °C
>
> Data, RAID0: total=5.29TB, used=4.29TB
> System, RAID1: total=8.00MB, used=352.00KB
> System: total=4.00MB, used=0.00
> Metadata, RAID1: total=149.00GB, used=5.00GB
>
> Label: 'MMedia' uuid: 9adfdc84-0fbe-431b-bcb1-cabb6a915e91
> Total devices 3 FS bytes used 4.29TB
> devid 3 size 2.73TB used 1.98TB path /dev/sdi1
> devid 2 size 2.73TB used 1.94TB path /dev/sdf1
> devid 1 size 1.82TB used 1.63TB path /dev/sdc1
>
> Btrfs Btrfs v0.19
>
> =================== boot messages, kernel related =============
>
> [boot with kernel 3.3.4]
> May 7 06:55:26 Arktur kernel: ata5: exception Emask 0x10 SAct 0x0 SErr 0x10000 action 0xe frozen
> May 7 06:55:26 Arktur kernel: ata5: SError: { PHYRdyChg }
> May 7 06:55:26 Arktur kernel: ata5: hard resetting link
> May 7 06:55:31 Arktur kernel: ata5: COMRESET failed (errno=-19)
> May 7 06:55:31 Arktur kernel: ata5: reset failed (errno=-19), retrying in 6 secs
> May 7 06:55:36 Arktur kernel: ata5: hard resetting link
> May 7 06:55:38 Arktur kernel: ata5: COMRESET failed (errno=-19)
> May 7 06:55:38 Arktur kernel: ata5: reset failed (errno=-19), retrying in 9 secs
> May 7 06:55:46 Arktur kernel: ata5: hard resetting link
> May 7 06:55:47 Arktur kernel: ata5: COMRESET failed (errno=-19)
> May 7 06:55:47 Arktur kernel: ata5: reset failed (errno=-19), retrying in 34 secs
> May 7 06:56:21 Arktur kernel: ata5: hard resetting link
> May 7 06:56:22 Arktur kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> May 7 06:56:22 Arktur kernel: ata5.00: configured for UDMA/100
> May 7 06:56:22 Arktur kernel: ata5: EH complete
> May 7 07:12:07 Arktur kernel: ata5.00: exception Emask 0x10 SAct 0x0 SErr 0x10000 action 0xe frozen
> May 7 07:12:07 Arktur kernel: ata5: SError: { PHYRdyChg }
> May 7 07:12:07 Arktur kernel: ata5.00: failed command: WRITE DMA EXT
> May 7 07:12:07 Arktur kernel: ata5.00: cmd 35/00:00:00:62:50/00:04:5e:00:00/e0 tag 0 dma 524288 out
> May 7 07:12:07 Arktur kernel: res d8/d8:d8:d8:d8:d8/d8:d8:d8:d8:d8/d8 Emask 0x12 (ATA bus error)
> May 7 07:12:07 Arktur kernel: ata5.00: status: { Busy }
> May 7 07:12:07 Arktur kernel: ata5.00: error: { ICRC UNC IDNF }

This is a hardware error. You have a device that's either dead or dying. (Given the number of errors, probably already dead).

> May 7 07:12:07 Arktur kernel: ata5: hard resetting link

> =========================================================
>
> The 3 btrfs disks are connected via a SiI 3114 SATA-PCI-Controller.
> Only 1 of the 3 disks seems to be damaged.
>
> =========================================================
>
> Can I repair the system? Or have I to copy it to a set of other disks?

If you have RAID-1 or RAID-10 on both data and metadata, then you _should_ in theory just be able to remove the dead disk (physically), then btrfs dev add a new one, btrfs dev del missing, and balance.

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- argc, argv, argh! ---
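Whether the "remove the dead disk, add a new one, delete missing, balance" route applies depends on the block-group profiles shown in the configuration section. A minimal sketch of that check follows, assuming the "Type, PROFILE: total=..." line format quoted in this thread and the usual rule of thumb that raid1 and raid10 tolerate one lost device while single, dup and raid0 tolerate none. The names `TOLERATED_FAILURES`, `profiles` and `survives_one_device_loss` are made up for illustration.

```python
import re

# Number of lost devices each profile is assumed to tolerate (rule of thumb).
TOLERATED_FAILURES = {"single": 0, "dup": 0, "raid0": 0, "raid1": 1, "raid10": 1}

# Matches e.g. "Data, RAID0: total=..." and "System: total=..." (no profile
# printed means the chunk type uses the "single" profile).
LINE_RE = re.compile(r"^(Data|System|Metadata)(?:, (\w+))?:")

# The space-usage lines from the original report.
REPORTED_USAGE = """\
Data, RAID0: total=5.29TB, used=4.29TB
System, RAID1: total=8.00MB, used=352.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=149.00GB, used=5.00GB
"""


def profiles(usage_text):
    """Map each chunk type to the set of profiles it uses."""
    found = {}
    for line in usage_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            kind = match.group(1)
            profile = (match.group(2) or "single").lower()
            found.setdefault(kind, set()).add(profile)
    return found


def survives_one_device_loss(usage_text):
    """True only if every listed profile tolerates one missing device.

    Conservative: an empty leftover "single" chunk also counts against it.
    """
    found = profiles(usage_text)
    return bool(found) and all(
        TOLERATED_FAILURES.get(profile, 0) >= 1
        for used in found.values()
        for profile in used
    )


if __name__ == "__main__":
    print(profiles(REPORTED_USAGE))
    print(survives_one_device_loss(REPORTED_USAGE))
```

For the reported filesystem the metadata is RAID1 but the data is RAID0, so the check comes out negative: the directory tree may survive the loss of one disk, but the file contents striped onto it do not.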
Hello, Fajar,

you wrote on 07.05.12:

>> For some months I run btrfs under kernel 3.2.5 and 3.2.9, without
>> problems.
>>
>> Yesterday I compiled kernel 3.3.4, and this morning I started the
>> machine with this kernel. There may be some ugly problems.

>> Data, RAID0: total=5.29TB, used=4.29TB

> Raid0? Yikes!

Why not? Do you know the price of one 3-TByte disk? The data isn't irreplaceable in this case.

[...]

>> May 7 07:15:19 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline
>> device
>> May 7 07:15:19 Arktur kernel: end_request: I/O error, dev
>> sdf, sector 0
>> May 7 07:15:19 Arktur kernel: sd 5:0:0:0: rejecting
>> I/O to offline device
>> May 7 07:15:19 Arktur kernel: lost page write
>> due to I/O error on sdf1

> That looks like a bad disk to me, and it shouldn't be related to the
> kernel version you use.

But why does it happen just when I change the kernel?
(Yes - I know: Murphy works reliably ...)

> Your best chance might be:
> - unmount the fs
> - get another disk to replace /dev/sdf, copy the content over with
> dd_rescue. ATA resets can be a PITA, so you might be better off
> moving the failed disk to a usb external adapter, and do some
> creative combination of plug-unplug and selectively skip bad sectors
> manually (by passing "-s" to dd_rescue).

Hmmm - I'll give it a try ...

> - reboot, with the bad disk unplugged
> - (optional) run "btrfs filesystem scrub" (you might need to build
> btrfs-progs manually from git source).

The last time I tried this command (some months ago) it produced a completely unusable set of disks/partitions ...

> or simply read the entire fs
> (e.g. using tar to /dev/null, or whatever). It should check the
> checksum of all files and print out which files are damaged (either
> in stdout or syslog).

And that's the other option - I had to use it for another disk (also WD, but only 2 TByte - I could watch it die ...).

Best regards!
Helmut
Hello, Hugo,

you wrote on 07.05.12:

>> Yesterday I compiled kernel 3.3.4, and this morning I started the
>> machine with this kernel. There may be some ugly problems.
>>
>> Copying something into the btrfs "directory" worked well for some
>> files, and then I got error messages (I've not copied them,
>> something with "IO error" under Samba).

[...]

>> Data, RAID0: total=5.29TB, used=4.29TB
>> System, RAID1: total=8.00MB, used=352.00KB
>> System: total=4.00MB, used=0.00
>> Metadata, RAID1: total=149.00GB, used=5.00GB
>>
>> Label: 'MMedia' uuid: 9adfdc84-0fbe-431b-bcb1-cabb6a915e91
>> Total devices 3 FS bytes used 4.29TB
>> devid 3 size 2.73TB used 1.98TB path /dev/sdi1
>> devid 2 size 2.73TB used 1.94TB path /dev/sdf1
>> devid 1 size 1.82TB used 1.63TB path /dev/sdc1
>>
>> Btrfs Btrfs v0.19
>>
>> =================== boot messages, kernel related =============
>>
>> [boot with kernel 3.3.4]
>> May 7 06:55:26 Arktur kernel: ata5: exception Emask 0x10 SAct 0x0
>> SErr 0x10000 action 0xe frozen
>> May 7 06:55:26 Arktur kernel: ata5: SError: { PHYRdyChg }
>> May 7 06:55:26 Arktur kernel: ata5: hard resetting link

> This is a hardware error. You have a device that's either dead or
> dying. (Given the number of errors, probably already dead).

It seems to be undecided which state it is in ...

>> Can I repair the system? Or have I to copy it to a set of other
>> disks?

> If you have RAID-1 or RAID-10 on both data and metadata, then you
> _should_ in theory just be able to remove the dead disk (physically),
> then btrfs dev add a new one, btrfs dev del missing, and balance.

I don't - I have a kind of copy/backup in the neighbourhood.

Best regards!
Helmut
On 05/07/2012 06:46 PM, Helmut Hullen wrote:
> btrfs: error reading free space cache
> BUG: unable to handle kernel NULL pointer dereference at 00000001
> IP: [<c1295c36>] io_ctl_drop_pages+0x26/0x50
> *pdpt = 0000000029712001 *pde = 0000000000000000
> Oops: 0002 [#1]

Could you please try this and show us the results?

diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 202008e..ae514ad 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -296,7 +296,9 @@ static void io_ctl_free(struct io_ctl *io_ctl)
 static void io_ctl_unmap_page(struct io_ctl *io_ctl)
 {
 	if (io_ctl->cur) {
-		kunmap(io_ctl->page);
+		WARN_ON(!io_ctl->page);
+		if (io_ctl->page)
+			kunmap(io_ctl->page);
 		io_ctl->cur = NULL;
 		io_ctl->orig = NULL;
 	}
Hello Hugo,

you wrote on 07.05.12:

>> =================== boot messages, kernel related =============
>>
>> [boot with kernel 3.3.4]
>> May 7 06:55:26 Arktur kernel: ata5: exception Emask 0x10 SAct 0x0
>> SErr 0x10000 action 0xe frozen
>> May 7 06:55:26 Arktur kernel: ata5: SError: { PHYRdyChg }
>> May 7 06:55:26 Arktur kernel: ata5: hard resetting link

[...]

> This is a hardware error. You have a device that's either dead or
> dying. (Given the number of errors, probably already dead).

It's dead - R.I.P.

I've tried it with a SATA-USB adapter - that adapter produces dmesg
lines when connecting or disconnecting.

And this particular drive doesn't report anything now. Shit.

Best regards!
Helmut
On Mon, May 07, 2012 at 03:34:00PM +0200, Helmut Hullen wrote:
> Hello Hugo,
>
> you wrote on 07.05.12:
>
> >> =================== boot messages, kernel related =============
> >>
> >> [boot with kernel 3.3.4]
> >> May 7 06:55:26 Arktur kernel: ata5: exception Emask 0x10 SAct 0x0
> >> SErr 0x10000 action 0xe frozen
> >> May 7 06:55:26 Arktur kernel: ata5: SError: { PHYRdyChg }
> >> May 7 06:55:26 Arktur kernel: ata5: hard resetting link
>
> [...]
>
> > This is a hardware error. You have a device that's either dead or
> > dying. (Given the number of errors, probably already dead).
>
> It's dead - R.I.P.
>
> I've tried it with a SATA-USB adapter - that adapter produces dmesg
> lines when connecting or disconnecting.
>
> And this particular drive doesn't report anything now. Shit.

Sorry to be the bearer of bad news. I don't think we can point the
finger at btrfs here.

It looks like you've lost most of your data -- losing a RAID-0
stripe across the whole FS isn't likely to have left much of it
intact. If you've got the space (or the money to get it), mkfs.btrfs
-m raid1 -d raid1 would have saved you here.

[ Incidentally, thinking about it, the failure coming at a kernel
upgrade could well be down to the additional stress of the
power-down/reboot finally pushing a bad drive over the edge. ]

In sympathy,
Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- But somewhere along the line, it seems / That pimp became ---
cool, and punk mainstream.
Hello Hugo,

you wrote on 07.05.12:

>> It's dead - R.I.P.

> Sorry to be the bearer of bad news. I don't think we can point the
> finger at btrfs here.

a) You know what to do with the bearer?
b) I like such errors - completely independent, but simultaneous.

> It looks like you've lost most of your data -- losing a RAID-0
> stripe across the whole FS isn't likely to have left much of it
> intact.

I'm just going back to ext4 - then one broken disk doesn't disturb the
contents of the other disks.

The data is not very valuable - DVB video mpegs. Most of the files are
repeated over and over.

> If you've got the space (or the money to get it), mkfs.btrfs
> -m raid1 -d raid1 would have saved you here.

About 400 ... 500 Euro for backing up videos? Not necessary.

(No: I don't count the minutes and hours working with the system ...)

> [ Incidentally, thinking about it, the failure coming at a kernel
> upgrade could well be down to the additional stress of the
> power-down/reboot finally pushing a bad drive over the edge. ]

Just now it's again an "open system"; I had to wobble the cables too ...

Maybe the SATA-PCI controller needs to be replaced too ...

Best regards!
Helmut
On 5/7/12 6:36 PM, Helmut Hullen wrote:
> Hello Hugo,
>
> you wrote on 07.05.12:
>
>>> It's dead - R.I.P.
>
>> Sorry to be the bearer of bad news. I don't think we can point the
>> finger at btrfs here.
>
> a) You know what to do with the bearer?
> b) I like such errors - completely independent, but simultaneous.
>
>> It looks like you've lost most of your data -- losing a RAID-0
>> stripe across the whole FS isn't likely to have left much of it
>> intact.
>
> I'm just going back to ext4 - then one broken disk doesn't disturb the
> contents of the other disks.

?! If you use raid0, one broken disk will always disturb the contents
of the other disks; that is what raid0 does, no matter what filesystem
you use. You could easily use btrfs with the "normal" or raid1 mode.

Btrfs is still in development and you can often blame it for a corrupt
filesystem, but in this case it's simply "raid0 -> 1 disk dies -> data
is gone".

> The data is not very valuable - DVB video mpegs. Most of the files are
> repeated over and over.
>
>> If you've got the space (or the money to get it), mkfs.btrfs
>> -m raid1 -d raid1 would have saved you here.
>
> About 400 ... 500 Euro for backing up videos? Not necessary.
>
> (No: I don't count the minutes and hours working with the system ...)
>
>> [ Incidentally, thinking about it, the failure coming at a kernel
>> upgrade could well be down to the additional stress of the
>> power-down/reboot finally pushing a bad drive over the edge. ]
>
> Just now it's again an "open system"; I had to wobble the cables too ...
>
> Maybe the SATA-PCI controller needs to be replaced too ...
>
> Best regards!
> Helmut
Hello Felix,

you wrote on 07.05.12:

>> I'm just going back to ext4 - then one broken disk doesn't disturb
>> the contents of the other disks.

> ?! If you use raid0, one broken disk will always disturb the contents
> of the other disks; that is what raid0 does, no matter what
> filesystem you use.

Yes - I know. But btrfs promises that I can add bigger disks and delete
smaller disks "on the fly". For something like a video collection,
which will keep growing, that is an interesting feature. And such a
(big) collection doesn't need a "grandfather-father-son" backup; that's
no critical data.

With a file system like ext2/3/4 I can work with several directories
which are mounted together, but (as said before) one broken disk doesn't
disturb the others.

Best regards!
Helmut
On Mon, May 07, 2012 at 07:52:00PM +0200, Helmut Hullen wrote:
> Hello Felix,
>
> you wrote on 07.05.12:
>
> >> I'm just going back to ext4 - then one broken disk doesn't disturb
> >> the contents of the other disks.
>
> > ?! If you use raid0, one broken disk will always disturb the contents
> > of the other disks; that is what raid0 does, no matter what
> > filesystem you use.
>
> Yes - I know. But btrfs promises that I can add bigger disks and delete
> smaller disks "on the fly". For something like a video collection,
> which will keep growing, that is an interesting feature. And such a
> (big) collection doesn't need a "grandfather-father-son" backup; that's
> no critical data.
>
> With a file system like ext2/3/4 I can work with several directories
> which are mounted together, but (as said before) one broken disk doesn't
> disturb the others.

mkfs.btrfs -m raid1 -d single should give you that.

There may be a kernel patch you need to stop it doing the silly
single → raid0 "upgrade" automatically, as well.

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- __(_'> Squeak! ---
Hello Hugo,

you wrote on 07.05.12:

>> With a file system like ext2/3/4 I can work with several directories
>> which are mounted together, but (as said before) one broken disk
>> doesn't disturb the others.

> mkfs.btrfs -m raid1 -d single should give you that.

What's the difference from

mkfs.btrfs -m raid1 -d raid0

(which is what I used last time)?

Best regards!
Helmut
On Mon, May 07, 2012 at 08:25:00PM +0200, Helmut Hullen wrote:
> Hello Hugo,
>
> you wrote on 07.05.12:
>
> >> With a file system like ext2/3/4 I can work with several directories
> >> which are mounted together, but (as said before) one broken disk
> >> doesn't disturb the others.
>
> > mkfs.btrfs -m raid1 -d single should give you that.
>
> What's the difference from
>
> mkfs.btrfs -m raid1 -d raid0

 - RAID-0 stripes each piece of data across all the disks.
 - single puts data on one disk at a time.

So, on three disks (each disk running horizontally), the FS will
allocate block groups this way for RAID-0:

Disk 1: | A1 | B1 | C1 |...
Disk 2: | A2 | B2 | C2 |...
Disk 3: | A3 | B3 | C3 |...

where each chunk, e.g. A2, is 1G in size. Then data is striped across
all of the An chunks (a single block group of size 3G) in 64k
sub-stripes, until block group A is filled up, and then it'll move on
to another block group.

For "single" allocation on the same disks, you will instead get:

Disk 1: | A | D | G |...
Disk 2: | B | E | H |...
Disk 3: | C | F | I |...

where, again, each chunk is 1G in size. Data written to the FS will
live in one of the chunks, overflowing to some other chunk when
there's no more space. With large files, you've still got a chance
that (some of) the data from the file will be on more than one disk,
but it's a much, much better situation than you'd have with RAID-0.

Of course, you still need RAID-1 metadata, so that when a disk does
go bang, you still have all the filesystem structures you need to read
the remaining data. :)

In fact, this is probably a good argument for having the option to
put back the old allocator algorithm, which would have ensured that
the first disk would fill up completely first before it touched the
next one...

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- ... one ping(1) to rule them all, and in the ---
darkness bind(2) them.
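The two layouts above can be turned into a toy model to see roughly what one dead disk costs in each case. This is only a sketch under simplifying assumptions that are not from the thread: files are laid out back to back in one logical address space, allocation units are handed to the disks strictly round-robin, and metadata is ignored. The function names and the 700 MiB file size are invented for illustration; the 1G chunk and 64k sub-stripe sizes are the ones quoted above.

```python
from typing import List, Set, Tuple

KIB_PER_MIB = 1024
CHUNK_KIB = 1024 * KIB_PER_MIB   # 1G chunk, as in the explanation above
STRIPE_KIB = 64                  # 64k RAID-0 sub-stripe, as in the explanation above

File = Tuple[int, int]           # (start offset in KiB, size in KiB)


def disks_touched(start_kib: int, size_kib: int, n_disks: int, unit_kib: int) -> Set[int]:
    """Return the 0-based disks that hold any part of one file.

    Assumption: allocation units of `unit_kib` are assigned round-robin,
    so unit number u lives on disk u % n_disks.
    """
    first = start_kib // unit_kib
    last = (start_kib + size_kib - 1) // unit_kib
    # After n_disks consecutive units every disk has been hit already.
    last = min(last, first + n_disks - 1)
    return {unit % n_disks for unit in range(first, last + 1)}


def surviving_files(files: List[File], n_disks: int, unit_kib: int, dead_disk: int) -> int:
    """Count the files that have no data at all on the dead disk."""
    return sum(
        1
        for start, size in files
        if dead_disk not in disks_touched(start, size, n_disks, unit_kib)
    )


def back_to_back(count: int, size_kib: int) -> List[File]:
    """Hypothetical workload: `count` equally sized files written contiguously."""
    return [(i * size_kib, size_kib) for i in range(count)]


if __name__ == "__main__":
    files = back_to_back(6, 700 * KIB_PER_MIB)
    print("RAID-0:", surviving_files(files, 3, STRIPE_KIB, dead_disk=0), "of", len(files))
    print("single:", surviving_files(files, 3, CHUNK_KIB, dead_disk=0), "of", len(files))
```

In this model any file longer than two sub-stripes touches all three disks under RAID-0, so none of the six files survives, while under "single" the two files that sit entirely on the other disks' chunks remain readable. Real allocation is less tidy than round-robin, so the counts illustrate the idea rather than predict a result.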
On 05/07/2012 10:52 AM, Helmut Hullen wrote:
> Hello Felix,
>
> you wrote on 07.05.12:
>
>>> I'm just going back to ext4 - then one broken disk doesn't disturb
>>> the contents of the other disks.
>
>> ?! If you use raid0, one broken disk will always disturb the contents
>> of the other disks; that is what raid0 does, no matter what
>> filesystem you use.
>
> Yes - I know. But btrfs promises that I can add bigger disks and delete
> smaller disks "on the fly". For something like a video collection,
> which will keep growing, that is an interesting feature. And such a
> (big) collection doesn't need a "grandfather-father-son" backup; that's
> no critical data.
>
> With a file system like ext2/3/4 I can work with several directories
> which are mounted together, but (as said before) one broken disk doesn't
> disturb the others.

How can you do that with ext2/3/4? If you mean create several
different filesystems and mount them separately, then that's very
different from your current situation. What you did in this case is
comparable to creating a raid0 array out of your disks. I don't see how
an ext filesystem is going to work any better than a btrfs filesystem
if one of the disks drops out. Using -d single isn't going to be of
much use in this case either, because that's like spanning an LVM
volume over several disks and then putting ext over that; it's pretty
nondeterministic how much you'll actually save should a large chunk of
the filesystem suddenly disappear.

It sounds like what you're thinking of is creating several separate
ext filesystems and then just mounting them separately. There's nothing
inherently special about doing this with ext; you can do the same
thing with btrfs and it would amount to about the same level of
protection (potentially more if you consider [meta]data checksums
important, but potentially less if you feel that ext is more robust for
whatever reason).

If you want to survive losing a single disk without the (absolute)
fear of the whole filesystem breaking, you have to have some sort of
redundancy, either by separating filesystems or by using some version
of raid other than raid0. I suppose the volume management of btrfs is
sort of confusing at the moment, but when btrfs promises you can remove
disks "on the fly" it doesn't mean you can just unplug disks from a
raid0 without telling btrfs to put that data elsewhere first.
Hello Daniel,

you wrote on 07.05.12:

>> Yes - I know. But btrfs promises that I can add bigger disks and
>> delete smaller disks "on the fly". For something like a video
>> collection, which will keep growing, that is an interesting feature.
>> And such a (big) collection doesn't need a "grandfather-father-son"
>> backup; that's no critical data.
>>
>> With a file system like ext2/3/4 I can work with several directories
>> which are mounted together, but (as said before) one broken disk
>> doesn't disturb the others.

> How can you do that with ext2/3/4? If you mean create several
> different filesystems and mount them separately, then that's very
> different from your current situation. What you did in this case is
> comparable to creating a raid0 array out of your disks. I don't see
> how an ext filesystem is going to work any better than a btrfs
> filesystem if one of the disks drops out.

mkfs.btrfs -m raid1 -d raid0

with 3 disks gives me a "cluster" which looks like 1 disk/partition/
directory.
If one disk fails nothing is usable.

(Yes - I've read Hugo's explanation of "-d single", I'll try this way)

With ext2/3/4 I mount 2 disks/partitions into the first disk. If one
disk fails the contents of the 2 other disks are still readable.

> It sounds like what you're thinking of is creating several separate
> ext filesystems and then just mounting them separately.

Yes - that's the old way. It's reliable but "ugly".

> There's nothing inherently special about doing this with ext; you can
> do the same thing with btrfs and it would amount to about the same
> level of protection (potentially more if you consider [meta]data
> checksums important, but potentially less if you feel that ext is more
> robust for whatever reason).

No - as just mentioned: there's a big difference when one disk fails.

> If you want to survive losing a single disk without the (absolute)
> fear of the whole filesystem breaking, you have to have some sort of
> redundancy, either by separating filesystems or by using some version
> of raid other than raid0.

No - for some years I have used a kind of outsourced backup. A copy of
all data is on a bundle of disks somewhere in the neighbourhood. As
mentioned: the data isn't business critical, it's just "nice to have".
It's not worth something like raid1 or so (with twice the cost of a
non-raid solution).

> I suppose the volume management of btrfs is
> sort of confusing at the moment, but when btrfs promises you can
> remove disks "on the fly" it doesn't mean you can just unplug disks
> from a raid0 without telling btrfs to put that data elsewhere first.

No - it's not confusing. It only needs a kind of recipe and much time:

btrfs device add ...
btrfs filesystem balance ... (perhaps not necessary)
btrfs device delete ...
btrfs filesystem balance ... (perhaps not necessary)

No intellectual challenge.
And completely different to "hot pluggable".

Best regards!
Helmut
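The four-step recipe above can be written down as a small dry-run planner. This is a sketch, not a tested procedure: the device names `/dev/sdX1` and `/dev/sdY1` are placeholders, the function name is invented, and `/srv/MM` is simply the mount point from the original report. The script only prints the commands; it does not run them.

```python
import shlex
from typing import List


def swap_disk_plan(mount_point: str, new_dev: str, old_dev: str,
                   balance: bool = False) -> List[List[str]]:
    """Build the command sequence for adding one disk and removing another.

    The balance steps are optional, matching the "perhaps not necessary"
    remarks in the recipe above.
    """
    balance_cmd = ["btrfs", "filesystem", "balance", mount_point]
    plan = [["btrfs", "device", "add", new_dev, mount_point]]
    if balance:
        plan.append(balance_cmd)
    plan.append(["btrfs", "device", "delete", old_dev, mount_point])
    if balance:
        plan.append(balance_cmd)
    return plan


if __name__ == "__main__":
    # Placeholder devices: substitute the real new and old partitions.
    for argv in swap_disk_plan("/srv/MM", "/dev/sdX1", "/dev/sdY1"):
        print(shlex.join(argv))
```

Note that `btrfs device delete` has to move the data off the old disk itself, so this only works while that disk is still readable; it is not a way to recover from a disk that has already died.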
On 05/07/2012 01:21 PM, Helmut Hullen wrote:
> Hello Daniel,
>
> you wrote on 07.05.12:
>
>>> Yes - I know. But btrfs promises that I can add bigger disks and
>>> delete smaller disks "on the fly". For something like a video
>>> collection, which will keep growing, that is an interesting feature.
>>> And such a (big) collection doesn't need a "grandfather-father-son"
>>> backup; that's no critical data.
>>>
>>> With a file system like ext2/3/4 I can work with several directories
>>> which are mounted together, but (as said before) one broken disk
>>> doesn't disturb the others.
>
>> How can you do that with ext2/3/4? If you mean create several
>> different filesystems and mount them separately, then that's very
>> different from your current situation. What you did in this case is
>> comparable to creating a raid0 array out of your disks. I don't see
>> how an ext filesystem is going to work any better than a btrfs
>> filesystem if one of the disks drops out.
>
> mkfs.btrfs -m raid1 -d raid0
>
> with 3 disks gives me a "cluster" which looks like 1 disk/partition/
> directory.
> If one disk fails nothing is usable.

How is that different from putting ext on top of a raid0?

> (Yes - I've read Hugo's explanation of "-d single", I'll try this way)
>
> With ext2/3/4 I mount 2 disks/partitions into the first disk. If one
> disk fails the contents of the 2 other disks are still readable.

There is nothing that prevents you from using this strategy with btrfs.

>> It sounds like what you're thinking of is creating several separate
>> ext filesystems and then just mounting them separately.
>
> Yes - that's the old way. It's reliable but "ugly".
>
>> There's nothing inherently special about doing this with ext; you can
>> do the same thing with btrfs and it would amount to about the same
>> level of protection (potentially more if you consider [meta]data
>> checksums important, but potentially less if you feel that ext is more
>> robust for whatever reason).
>
> No - as just mentioned: there's a big difference when one disk fails.

No there isn't.

>> If you want to survive losing a single disk without the (absolute)
>> fear of the whole filesystem breaking, you have to have some sort of
>> redundancy, either by separating filesystems or by using some version
>> of raid other than raid0.
>
> No - for some years I have used a kind of outsourced backup. A copy of
> all data is on a bundle of disks somewhere in the neighbourhood. As
> mentioned: the data isn't business critical, it's just "nice to have".
> It's not worth something like raid1 or so (with twice the cost of a
> non-raid solution).
>
>> I suppose the volume management of btrfs is
>> sort of confusing at the moment, but when btrfs promises you can
>> remove disks "on the fly" it doesn't mean you can just unplug disks
>> from a raid0 without telling btrfs to put that data elsewhere first.
>
> No - it's not confusing. It only needs a kind of recipe and much time:
>
> btrfs device add ...
> btrfs filesystem balance ... (perhaps not necessary)
> btrfs device delete ...
> btrfs filesystem balance ... (perhaps not necessary)
>
> No intellectual challenge.
> And completely different to "hot pluggable".

This is no different to any raid0 or spanning disk setup that allows
growing/shrinking of the array.
Hello Daniel,

you wrote on 07.05.12:

>> mkfs.btrfs -m raid1 -d raid0
>>
>> with 3 disks gives me a "cluster" which looks like 1 disk/partition/
>> directory.
>> If one disk fails nothing is usable.

> How is that different from putting ext on top of a raid0?

Classic raid0 doesn't allow deleting/removing disks from a cluster.

>> With ext2/3/4 I mount 2 disks/partitions into the first disk. If one
>> disk fails the contents of the 2 other disks are still readable.

> There is nothing that prevents you from using this strategy with
> btrfs.

How?
I've tried many installations of btrfs; sometimes 1 disk failed, and
then the data on all other disks was inaccessible.

Best regards!
Helmut
On Mon, May 7, 2012 at 3:17 PM, Helmut Hullen <Hullen@t-online.de> wrote:
> Hello Daniel,
>
> you wrote on 07.05.12:
>
>>> mkfs.btrfs -m raid1 -d raid0
>>>
>>> with 3 disks gives me a "cluster" which looks like 1 disk/partition/
>>> directory.
>>> If one disk fails nothing is usable.
>
>> How is that different from putting ext on top of a raid0?
>
> Classic raid0 doesn't allow deleting/removing disks from a cluster.
>
>>> With ext2/3/4 I mount 2 disks/partitions into the first disk. If one
>>> disk fails the contents of the 2 other disks are still readable.
>
>> There is nothing that prevents you from using this strategy with
>> btrfs.
>
> How?
> I've tried many installations of btrfs; sometimes 1 disk failed, and
> then the data on all other disks was inaccessible.

"With ext2/3/4 I mount 2 disks/partitions into the first disk. If one
disk fails the contents of the 2 other disks are still readable."

There's nothing stopping you from using 3 btrfs filesystems mounted in
the same way as you would 3 ext4 filesystems.
On Monday, 7 May 2012, Helmut Hullen wrote:
> > If you want to survive losing a single disk without the (absolute)
> > fear of the whole filesystem breaking, you have to have some sort of
> > redundancy, either by separating filesystems or by using some version
> > of raid other than raid0.
>
> No - for some years I have used a kind of outsourced backup. A copy of
> all data is on a bundle of disks somewhere in the neighbourhood. As
> mentioned: the data isn't business critical, it's just "nice to
> have". It's not worth something like raid1 or so (with twice the cost
> of a non-raid solution).

That's not true when you use BTRFS RAID1 with three disks. BTRFS will
then store each chunk on only two different drives, not on all three.
So it is not twice the cost but, given that all three drives have the
same capacity, about one and a half times the cost.

Consider the time to recover the files from the outsourced backup.
Maybe it makes up for the money you would have to spend on one
additional hard disk.

Anyway, I agree with the others responding to your post that this one
hard disk died, and I do not see a kernel-version-related issue. Any
striped RAID 0 would have failed in that case.

And you can use three BTRFS filesystems the same way as three Ext4
filesystems if you prefer such a setup, if the time spent restoring
the backup does not make up for the cost of one additional disk for
you.

--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
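The capacity side of this can be checked with a little arithmetic. The sketch below uses a common rule of thumb for two-copy RAID1 (half the raw space, limited by what the other disks can mirror for the largest one); that formula is an approximation supplied here, not something stated in the thread, and it ignores metadata and chunk granularity. The function names are invented; the disk sizes and the 4.29TB of data are the figures from the original report.

```python
from typing import Sequence


def usable_single(sizes_tb: Sequence[float]) -> float:
    """No redundancy for data: all raw space is usable."""
    return sum(sizes_tb)


def usable_raid1(sizes_tb: Sequence[float]) -> float:
    """Rough usable space when every chunk is stored on two different disks.

    Approximation: half of the raw total, but never more than the space
    the remaining disks can offer as mirror partners for the largest disk.
    """
    total = sum(sizes_tb)
    return min(total / 2, total - max(sizes_tb))


if __name__ == "__main__":
    equal = [2.0, 2.0, 2.0]          # three equal drives, as in the argument above
    reported = [1.82, 2.73, 2.73]    # devid sizes from the original report
    data_tb = 4.29                   # "FS bytes used" from the original report

    print("three equal 2TB drives, RAID1:", usable_raid1(equal), "TB usable")
    print("reported drives, single:", round(usable_single(reported), 2), "TB usable")
    print("reported drives, RAID1: ", round(usable_raid1(reported), 2), "TB usable")
    print("4.29TB fits in RAID1:", data_tb <= usable_raid1(reported))
```

Under these assumptions three equal drives in RAID1 give the usable space of one and a half drives, and the three reported drives would give about 3.64TB, which is less than the 4.29TB already stored, so mirroring the existing collection would have needed at least one more disk.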
Hello Martin,

you wrote on 08.05.12:

>> No - for some years I have used a kind of outsourced backup. A copy
>> of all data is on a bundle of disks somewhere in the neighbourhood.
>> As mentioned: the data isn't business critical, it's just "nice to
>> have". It's not worth something like raid1 or so (with twice the
>> cost of a non-raid solution).

> That's not true when you use BTRFS RAID1 with three disks. BTRFS will
> then store each chunk on only two different drives, not on all three.
> So it is not twice the cost but, given that all three drives have the
> same capacity, about one and a half times the cost.

> Consider the time to recover the files from the outsourced backup.
> Maybe it makes up for the money you would have to spend on one
> additional hard disk.

I have considered it, many times. And the result is unchanged: no
RAID1. It doesn't replace a real backup.

> Anyway, I agree with the others responding to your post that this one
> hard disk died, and I do not see a kernel-version-related issue. Any
> striped RAID 0 would have failed in that case.

Yes - I had written yesterday that the disk is dead. One of three
disks. I'm now restoring the three disks from backup.

> And you can use three BTRFS filesystems the same way as three Ext4
> filesystems if you prefer such a setup, if the time spent restoring
> the backup does not make up for the cost of one additional disk for
> you.

But where's the gain? If a disk fails I have a lot of tools for
repairing an ext2/3/4 system.

Best regards!
Helmut
On Tue, May 8, 2012 at 2:39 PM, Helmut Hullen <Hullen@t-online.de> wrote:
>> And you can use three BTRFS filesystems the same way as three Ext4
>> filesystems if you prefer such a setup, if the time spent restoring
>> the backup does not make up for the cost of one additional disk for
>> you.
>
> But where's the gain? If a disk fails I have a lot of tools for
> repairing an ext2/3/4 system.

It won't work if you use it in RAID0 (e.g. with LVM spanning three
disks, then ext4 on top of the LV), which is basically the same thing
that you did (using btrfs in raid0 mode).

As others said, if your only concern is "if a disk is dead, I want to
be able to access data on other disks", then simply use btrfs as three
different filesystems, mounted on three directories.

btrfs will shine when:
- you need checksums and self-healing in raid10 mode
- you have lots of small files
- you have highly compressible content
- you need the snapshot/clone feature

Since you don't need any of these, IMHO it's actually better if you
just use ext4.

--
Fajar
Hello Fajar,

you wrote on 08.05.12:

>>> And you can use three BTRFS filesystems the same way as three Ext4
>>> filesystems if you prefer such a setup, if the time spent restoring
>>> the backup does not make up for the cost of one additional disk for
>>> you.
>>
>> But where's the gain? If a disk fails I have a lot of tools for
>> repairing an ext2/3/4 system.

> It won't work if you use it in RAID0 (e.g. with LVM spanning three
> disks, then ext4 on top of the LV).

But when I use ext2/3/4 I need neither RAID0 nor LVM.

> As others said, if your only concern is "if a disk is dead, I want to
> be able to access data on other disks", then simply use btrfs as
> three different filesystems, mounted on three directories.

But then I don't particularly need btrfs.

> btrfs will shine when:
> - you need checksums and self-healing in raid10 mode
> - you have lots of small files
> - you have highly compressible content
> - you need the snapshot/clone feature

For my video collection (mpeg2) nothing fits ...

The only advantage I see with btrfs is

adding a bigger disk
deleting/removing a smaller disk

with really simple commands.

Best regards!
Helmut
Hi Helmut,

>> But where's the gain? If a disk fails I have a lot of tools for
>> repairing an ext2/3/4 system.

Nope, when a disk in your ext4 raid0 array fails, you are just as
doomed.

> But when I use ext2/3/4 I need neither RAID0 nor LVM.

You can use btrfs without using its raid capabilities.

Face it, you used an experimental filesystem and you configured it
the wrong way. Btrfs is not the one to blame here.

- Clemens
Hello Clemens,

you wrote on 08.05.12:

>>> But where's the gain? If a disk fails I have a lot of tools for
>>> repairing an ext2/3/4 system.

> Nope, when a disk in your ext4 raid0 array fails, you are just as
> doomed.

Why should I use RAID0 with a bundle of ext2/3/4? Mounting on/in the
directory tree does the job.

Best regards!
Helmut
On 5/8/12 3:13 PM, Helmut Hullen wrote:
> Hello Clemens,
>
> you wrote on 08.05.12:
>
>>>> But where's the gain? If a disk fails I have a lot of tools for
>>>> repairing an ext2/3/4 system.
>
>> Nope, when a disk in your ext4 raid0 array fails, you are just as
>> doomed.
>
> Why should I use RAID0 with a bundle of ext2/3/4? Mounting on/in the
> directory tree does the job.

Nobody told you that you should do it. What EVERYBODY here is telling
you: the problem you have right now would be the same damn problem, no
matter what fs you use. Every fs will be unusable if you lose one disk
in a raid0 setup. That's all we have been trying to tell you for the
last 15 mails :)

If you don't see any benefit in using btrfs, then simply don't use it :)

Again: you misconfigured your fs if you never wanted to use raid0.
Don't blame the fs, blame yourself.

> Best regards!
> Helmut
On Tue, May 08, 2012 at 03:44:12PM +0200, Felix Blanke wrote:
> On 5/8/12 3:13 PM, Helmut Hullen wrote:
> > Hello Clemens,
> >
> > you wrote on 08.05.12:
> >
> >>>> But where's the gain? If a disk fails I have a lot of tools for
> >>>> repairing an ext2/3/4 system.
> >
> >> Nope, when a disk in your ext4 raid0 array fails, you are just as
> >> doomed.
> >
> > Why should I use RAID0 with a bundle of ext2/3/4? Mounting on/in the
> > directory tree does the job.
>
> Nobody told you that you should do it. What EVERYBODY here is
> telling you: the problem you have right now would be the same damn
> problem, no matter what fs you use. Every fs will be unusable
> if you lose one disk in a raid0 setup. That's all we have been trying
> to tell you for the last 15 mails :)

I think he's got the point by now. Can we stop this thread now,
please? It doesn't seem to be serving any further purpose.

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- No names... I want to remain anomalous. ---
Hello Felix,

you wrote on 08.05.12:

>> Why should I use RAID0 with a bundle of ext2/3/4? Mounting on/in the
>> directory tree does the job.

> Nobody told you that you should do it. What EVERYBODY here is telling
> you: the problem you have right now would be the same damn problem,
> no matter what fs you use. Every fs will be unusable if you
> lose one disk in a raid0 setup. That's all we have been trying to tell
> you for the last 15 mails :)

> If you don't see any benefit in using btrfs, then simply don't use it

I still hope for a benefit when I use btrfs.

As I've written many times: I want a system for my video collection
which allows

adding a bigger disk
deleting/removing a smaller disk

with simple commands.

btrfs seems to be able to do that (and I have tested this job many
times). But with my configuration "mkfs.btrfs -m raid1 -d raid0" I've
(again) seen that all data vanishes when 1 disk fails.

I'll try Hugo's proposal "mkfs.btrfs -m raid1 -d single".
And I hope that it doesn't make all disks unreadable when 1 disk fails.

Best regards!
Helmut
On 5/8/12 6:53 PM, Helmut Hullen wrote:
> [...]
> btrfs seems to be able to do that (and I have tested this job many
> times). But with my configuration "mkfs.btrfs -m raid1 -d raid0" I've
> (again) seen that all data vanishes when 1 disk fails.
>
> I'll try Hugo's proposal "mkfs.btrfs -m raid1 -d single".
> And I hope that it doesn't make all disks unreadable when 1 disk fails.

Maybe you should inform yourself about the different RAID levels before
you use them?

http://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_0

RAID0 will always be that way: one disk dies, the filesystem is gone.
That's more or less the definition of RAID0 :)

@"-d single"

Is it really possible to remove a disk from btrfs (created with -d
single) without losing the data on that disk? Is there a way to tell
balance to copy all the data from this disk to the other disks (of
course only if there is enough free space on them)?
Hello Felix,

You wrote on 08.05.12:

>> As I've written many times: I want a system for my video collection
>> which allows
>>
>>     adding a bigger disk
>>     deleting/removing a smaller disk
>>
>> with simple commands.
>>
>> btrfs seems to be able to do that (and I have tested this job many
>> times). But with my configuration "mkfs.btrfs -m raid1 -d raid0"
>> I've (again) seen that all data vanishes when 1 disk fails.
>>
>> I'll try Hugo's proposal "mkfs.btrfs -m raid1 -d single".
>> And I hope that it doesn't make all disks unreadable when 1 disk
>> fails.

[...]

> @"-d single"

> Is it really possible to remove a disk from btrfs (created with -d
> single) without losing the data on that disk?

When the system is configured with

    mkfs.btrfs -m raid1 -d raid0

then the way shown above is possible; it works (now) as expected.
OK, it takes some time.

And I have already said on this mailing list that I'll try the option
"-d single".

> Is there a way to tell
> balance to copy all the data from this disk to the other disks (ofc
> if there is enough free space on them)?

As I wrote some hours ago: I run

    btrfs fi balance ...

after adding and after deleting a disk. Maybe it's not necessary.
In particular, it seems not to be necessary after adding a disk.

Best regards,
Helmut
On 5/8/12 8:29 PM, Helmut Hullen wrote:
> [...]
>> Is there a way to tell
>> balance to copy all the data from this disk to the other disks (ofc
>> if there is enough free space on them)?
>
> As I've written some hours ago: I run
>
>     btrfs fi balance ...
>
> after adding and after deleting a disk. Maybe it's not necessary.
> Especially it seems not to be necessary after adding a disk.

What are the steps you're doing?! If this is really possible then there
must be some sort of command that tells btrfs "Hey, I want to remove
this disk from the fs, please copy all data to the other disks and then
remove the disk". Is there such a command? I haven't heard of one, but
that would be interesting.

Otherwise, if you remove a disk from a raid0 (it doesn't matter whether
you have 2 or 5 or x disks in the fs, btrfs should stripe across all
disks), your fs should be broken.
On Tue, May 08, 2012 at 08:41:47PM +0200, Felix Blanke wrote:
> > As I've written some hours ago: I run
> >
> >     btrfs fi balance ...
> >
> > after adding and after deleting a disk. Maybe it's not necessary.
> > Especially it seems not to be necessary after adding a disk.
>
> What are the steps you're doing?! If this is really possible then there must
> be some sort of command that tells btrfs "Hey, I want to remove this disk from
> the fs, please copy all data to the other disks and then remove the disk".
> Is there such a command? Haven't heard of one, but that would be
> interesting.

The 'btrfs device delete' command does what you described. It is a
pretty basic command, so I'm not sure whether I missed something in
this thread.

> Otherwise if you remove a disk from a raid0 (doesn't matter if you have 2 or
> 5 or x disks in the fs, btrfs should stripe above all disks) your fs should
> be broken.

All data from the device being removed is relocated to the rest of the
device group.

david
Hello Felix,

You wrote on 08.05.12:

>>>> adding a bigger disk
>>>> deleting/removing a smaller disk
>>>>
>>>> with simple commands.

[...]

>>> Is it really possible to remove a disk from btrfs (created with -d
>>> single) without losing the data on that disk?
>>
>> When the system is configured with
>>
>>     mkfs.btrfs -m raid1 -d raid0
>>
>> then the above shown way is possible, it works (now) as expected.
>> Ok - it needs some time.

[...]

> What are the steps you're doing?! If this is really possible then
> there must be some sort of command that tells btrfs "Hey, I want to
> remove this disk from the fs, please copy all data to the other disks
> and then remove the disk". Is there such a command? Haven't heard of
> one, but that would be interesting.

    btrfs device add /dev/$newdisk ...
    (btrfs fi balance ...)
    btrfs device delete /dev/$olddisk ...
    (btrfs fi balance ...)

I've described these simple steps many times on this mailing list.

For some kernel versions now (at least since kernel 3.2.x) it seems to
work without problems; "btrfs-progs" package from 2011-10-30.

> Otherwise if you remove a disk from a raid0 (doesn't matter if you
> have 2 or 5 or x disks in the fs, btrfs should stripe above all
> disks) your fs should be broken.

Not with btrfs ... there it works even with

    mkfs.btrfs -m raid1 -d raid0 ...

Best regards,
Helmut
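The add/delete sequence from the mail above can be written down as a small dry-run helper. This is only a sketch under stated assumptions: the function name `replace_disk_plan`, the device names and the mount point are placeholders invented for illustration, and the script prints the commands rather than running them. The balance step is emitted as a comment because, as the mail says, it may not be needed.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for swapping a disk, runs nothing.
# Usage: replace_disk_plan NEW_DEVICE OLD_DEVICE MOUNTPOINT
replace_disk_plan() {
    new_dev=$1
    old_dev=$2
    mnt=$3
    # 1. Add the new (bigger) disk to the mounted filesystem.
    echo "btrfs device add $new_dev $mnt"
    # 2. Optional rebalance; the thread suggests it may be unnecessary.
    echo "# optional: btrfs filesystem balance $mnt"
    # 3. Remove the old disk; its data is relocated to the remaining disks.
    echo "btrfs device delete $old_dev $mnt"
}

# Hypothetical device names and mount point, for illustration only.
replace_disk_plan /dev/sdNEW1 /dev/sdOLD1 /mnt/videos
```

Printing the plan first makes it easy to review which device is about to be removed before anything touches the filesystem.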
On Tue, May 08, 2012 at 09:34:00PM +0200, Helmut Hullen wrote:
> [...]
> > What are the steps you're doing?! If this is really possible then
> > there must be some sort of command that tells btrfs "Hey, I want to
> > remove this disk from the fs, please copy all data to the other disks
> > and then remove the disk". Is there such a command? Haven't heard of
> > one, but that would be interesting.
>
>     btrfs device add /dev/$newdisk ...
>     (btrfs fi balance ...)
>     btrfs device delete /dev/$olddisk ...
>     (btrfs fi balance ...)
>
> [...]
>
> > Otherwise if you remove a disk from a raid0 (doesn't matter if you
> > have 2 or 5 or x disks in the fs, btrfs should stripe above all
> > disks) your fs should be broken.
>
> Not with btrfs ... there it works even with
>
>     mkfs.btrfs -m raid1 -d raid0 ...

There is a big difference between "orderly and planned removal of a
hard disk", and "disk goes away with no warning". This is essentially
the difference you've been talking about at cross-purposes all day.

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- My karma has run over my dogma. ---
Hello Hugo,

You wrote on 08.05.12:

>>> Otherwise if you remove a disk from a raid0 (doesn't matter if you
>>> have 2 or 5 or x disks in the fs, btrfs should stripe above all
>>> disks) your fs should be broken.

>> Not with btrfs ... there it works even with
>>
>>     mkfs.btrfs -m raid1 -d raid0 ...

> There is a big difference between "orderly and planned removal of
> a hard disk", and "disk goes away with no warning".

And I know the difference ...

When I first called for help I was looking for the failure somewhere
other than "disk is dead".

> This is essentially the difference you've been talking about at cross-
> purposes all day.

What I still hope for (maybe it's impossible): when 1 disk/partition
fails, the contents of the other disks are "somehow" restorable, and
not irrecoverably lost.

Best regards,
Helmut
On 08 May 2012 22:19:00 +0200 Hullen@t-online.de (Helmut Hullen) wrote:
> What I still hope (may be it's impossible): when 1 disk/partition fails,
> then the contents of the other disks is "somehow" restorable. And not
> irreproducable.

You should look for file/directory-level tree merging, e.g. this FUSE
based virtual FS: https://romanrm.ru/en/mhddfs
Or various other unionfs'es, some of which are kernel-based.

Regarding btrfs, AFAIK even "btrfs -d single" suggested above works not
"per file", but per allocation extent, so in case of one disk failure
you will lose random *parts* (extents) of random files, which in effect
could mean no file in your whole file system will remain undamaged.

--
With respect,
Roman

~~~~~~~~~~~~~~~~~~~~~~~~~~~
"Stallman had a printer,
with code he could not see.
So he began to tinker,
and set the software free."
On Tuesday 08 of May 2012 12:00:00 Helmut Hullen wrote:
> Hello Fajar,
>
> You wrote on 08.05.12:
>
> >>> And you can use three BTRFS filesystems the same way as three Ext4
> >>> filesystems if you prefer such a setup if the time spent for
> >>> restoring the backup does not make up the cost for one additional
> >>> disk for you.
> >>
> >> But where's the gain? If a disk fails I have a lot of tools for
> >> repairing an ext2/3/4 system.
> >
> > It won't work if you use it in RAID0 (e.g. with LVM spanning three
> > disks, then use ext4 on top of the LV).
>
> But when I use ext2/3/4 I neither need RAID0 nor do I need LVM.
>
> > As others said, if your only concern is "if a disk is dead, I want to
> > be able to access data on other disks", then simply use btrfs as
> > three different fs, mounted on three directories.
>
> But then I don't need especially btrfs.
>
> > btrfs will shine when:
> > - you need checksum and self-healing in raid10 mode
> > - you have lots of small files
> > - you have highly compressible content
> > - you need snapshot/clone feature
>
> For my video collection (mpeg2) nothing fits ...
>
> The only advantage I see with btrfs is
>
>     adding a bigger disk
>     deleting/removing a smaller disk
>
> with really simple commands.

Playing the Devil's advocate here (not that I don't use The Other Linux
FS ;)

I don't see the btrfs commands as much different from

    pvcreate /dev/new-disk
    vgextend videos-volume-42 /dev/new-disk
    pvmove /dev/old-disk /dev/new-disk
    vgreduce videos-volume-42 /dev/old-disk
    resize2fs /dev/videos-volume-42/logical-volume

Unlike with shrinking, there's really no room for error. Messing up
those commands will give quite clear error messages and definitely
won't destroy data (unless a hardware error occurs). And the FS on the
LV is online all the time, just like with btrfs.

The only difference is that with btrfs you can both extend and shrink
the FS online, while with ext2/3/4 you can only extend online...

Regards,
--
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl
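For comparison with the btrfs sequence, the LVM route from the mail above can be sketched as a dry-run helper as well. The function name `lvm_replace_plan` is invented for illustration, and the `lvextend` line is an assumption that is not in the original mail: as far as I can tell, the logical volume has to grow into the new space before `resize2fs` has anything to extend into. The script only prints commands.

```shell
#!/bin/sh
# Dry-run sketch of the LVM route: prints commands, runs nothing.
# Usage: lvm_replace_plan VG LV NEW_PV OLD_PV
lvm_replace_plan() {
    vg=$1
    lv=$2
    new_pv=$3
    old_pv=$4
    echo "pvcreate $new_pv"          # initialise the new disk as a PV
    echo "vgextend $vg $new_pv"      # add it to the volume group
    echo "pvmove $old_pv $new_pv"    # migrate extents off the old disk
    echo "vgreduce $vg $old_pv"      # drop the old disk from the VG
    # Assumption, not in the mail: grow the LV before growing the fs.
    echo "lvextend -l +100%FREE /dev/$vg/$lv"
    echo "resize2fs /dev/$vg/$lv"    # grow ext2/3/4 to fill the LV
}

# Names taken from the mail above; they are examples, not real devices.
lvm_replace_plan videos-volume-42 logical-volume /dev/new-disk /dev/old-disk
```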
Helmut Hullen
2012-May-09 13:04 UTC
failed disk (was: kernel 3.3.4 damages filesystem (?))
Hello Hugo,

You wrote on 07.05.12:

>>> mkfs.btrfs -m raid1 -d single should give you that.

>> What's the difference to
>>
>>     mkfs.btrfs -m raid1 -d raid0

> - RAID-0 stripes each piece of data across all the disks.
> - single puts data on one disk at a time.

[...]

> In fact, this is probably a good argument for having the option to
> put back the old allocator algorithm, which would have ensured that
> the first disk would fill up completely first before it touched the
> next one...

The current version seems to alternate from disk to disk:

Copying about 160 GiByte shows

Label: none  uuid: fd0596c6-d819-42cd-bb4a-420c38d2a60b
        Total devices 2 FS bytes used 155.64GB
        devid    2 size 136.73GB used 114.00GB path /dev/sdl1
        devid    1 size 68.37GB used 45.04GB path /dev/sdk1

Btrfs Btrfs v0.19

------------------------

Watching the amounts showed that both disks are filled nearly
simultaneously.

That would be more difficult to restore ...

Best regards,
Helmut
Hugo Mills
2012-May-09 13:19 UTC
Re: failed disk (was: kernel 3.3.4 damages filesystem (?))
On Wed, May 09, 2012 at 03:04:00PM +0200, Helmut Hullen wrote:
> Hallo, Hugo,
>
> Du meintest am 07.05.12:
>
> >>> mkfs.btrfs -m raid1 -d single should give you that.
>
> >> What's the difference to
> >>
> >>     mkfs.btrfs -m raid1 -d raid0
>
> > - RAID-0 stripes each piece of data across all the disks.
> > - single puts data on one disk at a time.
>
> [...]
>
> > In fact, this is probably a good argument for having the option to
> > put back the old allocator algorithm, which would have ensured that
> > the first disk would fill up completely first before it touched the
> > next one...
>
> The actual version seems to oscillate from disk to disk:

Yes, specifically, when it's asked for n chunks to make up a block
group, the current allocator will pick the n disks with the most free
space on them. The original allocator would pick the disks with the
smallest devid (which is probably optimal for your use case -- hence
my comment above).

> Watching the amount showed that both disks are filled nearly
> simultaneously.
>
> That would be more difficult to restore ...

If your files are small compared to the block group size (1GiB in
this case), then the odds of a file spanning block groups are small.
With files similar in size to, or larger than, a chunk, you will be
far more likely to lose some part of the file when a disk goes away.

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Great oxymorons of the world, no. 6: Mature Student ---
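The two allocation policies described above can be compared with a toy model. This is a rough sketch and not the kernel's allocator: it assumes "-d single" (one 1 GiB chunk per allocation), whole-GiB device sizes, and that ties between equally free devices go to the lowest devid, which is my assumption. The function name `simulate_alloc` and the policy names are invented for illustration.

```shell
#!/bin/sh
# Toy model of chunk placement for "-d single" data (1 GiB per chunk).
# Usage: simulate_alloc POLICY CHUNKS SIZE_DEVID1 [SIZE_DEVID2 ...]
#   POLICY is "most_free" (current behaviour as described in the mail)
#   or "lowest_devid" (original behaviour). Sizes are whole GiB.
simulate_alloc() {
    policy=$1
    chunks=$2
    shift 2
    echo "$@" | awk -v policy="$policy" -v chunks="$chunks" '{
        n = NF
        for (i = 1; i <= n; i++) { free[i] = $i; used[i] = 0 }
        for (c = 0; c < chunks; c++) {
            pick = 0
            for (i = 1; i <= n; i++) {
                if (free[i] < 1) continue            # device is full
                if (pick == 0) { pick = i; continue } # first usable devid
                # most_free: prefer a strictly emptier device;
                # lowest_devid: keep the first usable one.
                if (policy == "most_free" && free[i] > free[pick]) pick = i
            }
            if (pick == 0) break                      # filesystem is full
            free[pick]--
            used[pick]++
        }
        for (i = 1; i <= n; i++) printf "devid %d: %d GiB used\n", i, used[i]
    }'
}

# Two disks of roughly the sizes in the mail above (68 and 136 GiB).
echo "most_free:";    simulate_alloc most_free 100 68 136
echo "lowest_devid:"; simulate_alloc lowest_devid 100 68 136
```

In this model the "most_free" policy loads the larger disk first and then alternates between the two once their free space is equal, while "lowest_devid" fills the first disk completely before touching the second, which matches the behaviour discussed in the thread.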
Helmut Hullen
2012-May-09 14:25 UTC
failed disk (was: kernel 3.3.4 damages filesystem (?))
Hello Hugo,

You wrote on 07.05.12:

[...]

>> With a file system like ext2/3/4 I can work with several directories
>> which are mounted together, but (as said before) one broken disk
>> doesn't disturb the others.

> mkfs.btrfs -m raid1 -d single should give you that.

Just a small bug, perhaps:

I created a system with

    mkfs.btrfs -m raid1 -d single /dev/sdl1
    mount /dev/sdl1 /mnt/Scsi
    btrfs device add /dev/sdk1 /mnt/Scsi
    btrfs device add /dev/sdm1 /mnt/Scsi
    (filling with data)

and

    btrfs fi df /mnt/Scsi

now reports

Data, RAID0: total=183.18GB, used=76.60GB
Data: total=80.01GB, used=79.83GB
System, DUP: total=8.00MB, used=32.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=192.74MB
Metadata: total=8.00MB, used=0.00

--------------------------------------

"Data, RAID0" confuses me (not very much ...), and the profile for
metadata (RAID1) is not shown.

Best regards,
Helmut
Hugo Mills
2012-May-09 14:37 UTC
Re: failed disk (was: kernel 3.3.4 damages filesystem (?))
On Wed, May 09, 2012 at 04:25:00PM +0200, Helmut Hullen wrote:
> Du meintest am 07.05.12:
>
> [...]
>
> >> With a file system like ext2/3/4 I can work with several directories
> >> which are mounted together, but (as said before) one broken disk
> >> doesn't disturb the others.
>
> > mkfs.btrfs -m raid1 -d single should give you that.
>
> Just a small bug, perhaps:
>
> created a system with
>
>     mkfs.btrfs -m raid1 -d single /dev/sdl1
>     mount /dev/sdl1 /mnt/Scsi
>     btrfs device add /dev/sdk1 /mnt/Scsi
>     btrfs device add /dev/sdm1 /mnt/Scsi
>     (filling with data)
>
> and
>
>     btrfs fi df /mnt/Scsi
>
> now tells
>
> Data, RAID0: total=183.18GB, used=76.60GB
> Data: total=80.01GB, used=79.83GB
> System, DUP: total=8.00MB, used=32.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=1.00GB, used=192.74MB
> Metadata: total=8.00MB, used=0.00
>
> --------------------------------------
>
> "Data, RAID0" confuses me (not very much ...), and the system for
> metadata (RAID1) is not told.

DUP is two copies of each block, but it allows the two copies to
live on the same device. It's done this because you started with a
single device, and you can't do RAID-1 on one device. The first bit of
metadata you write to it should automatically upgrade the DUP chunk to
RAID-1.

As to the spurious "upgrade" of single to RAID-0, I thought Ilya
had stopped it doing that. What kernel version are you running?

Out of interest, why did you do the device adds separately, instead
of just this?

# mkfs.btrfs -m raid1 -d single /dev/sdl1 /dev/sdk1 /dev/sdm1

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Comic Sans goes into a bar, and the barman says, "We don't ---
serve your type here."
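To make a mixed-profile state like the one quoted above easier to spot, the `btrfs fi df` output can be reduced to one "type profile total" line per entry. This is a minimal sketch: the helper name `summarise_profiles` is invented, and it assumes that a line without a profile name (such as "Data:") denotes the single profile, which is how the output of this btrfs-progs version is read in the thread. It parses the text format shown above and nothing else.

```shell
#!/bin/sh
# Reads `btrfs fi df`-style lines on stdin and prints
# "<type> <profile> <total>" for each of them.
summarise_profiles() {
    awk '{
        colon = index($0, ":")
        if (colon == 0) next                  # not a "Type: ..." line
        head = substr($0, 1, colon - 1)       # e.g. "Data, RAID0"
        rest = substr($0, colon + 1)          # e.g. " total=..., used=..."
        comma = index(head, ", ")
        if (comma > 0) {
            type = substr(head, 1, comma - 1)
            profile = substr(head, comma + 2)
        } else {
            type = head
            profile = "single"                # assumption: no label = single
        }
        total = rest
        sub(/^ *total=/, "", total)           # drop the "total=" prefix
        sub(/,.*/, "", total)                 # drop ", used=..."
        print type, profile, total
    }'
}

# Sample input copied from the mail quoted above.
summarise_profiles <<'EOF'
Data, RAID0: total=183.18GB, used=76.60GB
Data: total=80.01GB, used=79.83GB
System, DUP: total=8.00MB, used=32.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=192.74MB
Metadata: total=8.00MB, used=0.00
EOF
```

Two different profiles listed for the same type (here RAID0 and single for Data) indicate that chunks with both layouts exist side by side on the filesystem.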
Hi,

On 05/08/2012 10:56 PM, Roman Mamedov wrote:
> Regarding btrfs, AFAIK even "btrfs -d single" suggested above works not "per
> file", but per allocation extent, so in case of one disk failure you will lose
> random *parts* (extents) of random files, which in effect could mean no file
> in your whole file system will remain undamaged.

Maybe we should evaluate the possibility of such a "one file goes on one
disk" feature.

Helmut Hullen has the use case: many disks, totally non-critical but
nice-to-have data. If one disk dies, some *files* should be lost, not
some *random parts of all files*.

This could be accomplished by some userspace tool that moves stuff
around, combined with "file pinning" support that lets the user make
sure a specific file is on a specific disk.

Cheers
Kaspar
Hello Hugo,

You wrote on 09.05.12:

>>> mkfs.btrfs -m raid1 -d single should give you that.

>> Just a small bug, perhaps:

[...]

>> "Data, RAID0" confuses me (not very much ...), and the system for
>> metadata (RAID1) is not told.

> DUP is two copies of each block, but it allows the two copies to
> live on the same device. It's done this because you started with a
> single device, and you can't do RAID-1 on one device. The first bit
> of metadata you write to it should automatically upgrade the DUP
> chunk to RAID-1.

Ok.

Sounds familiar - did you explain that to me many months ago?

> As to the spurious "upgrade" of single to RAID-0, I thought Ilya
> had stopped it doing that. What kernel version are you running?

3.2.9, self-built.
I could test the message with 3.3.4, but not today (if it's only an
interpretation of the same underlying data).

> Out of interest, why did you do the device adds separately,
> instead of just this?

a) making the first 2 devices: I have tested both versions (one line
with 2 devices or 2 lines with 1 device each); no big difference.

But I had tested the option "-L" (labelling) too, and that makes a mess
of the one-liner: both devices get the same label, and then "findfs"
finds neither of them.

The really safe way would be: dropping this option from the
"mkfs.btrfs" command and only using

    btrfs fi label <device> [<newlabel>]

b) third device: that's my usual test:

    make a cluster of 2 devices
    fill them with data
    add a third device
    delete the smallest device

Best regards,
Helmut
On Wed, May 09, 2012 at 05:14:00PM +0200, Helmut Hullen wrote:
> Hallo, Hugo,
>
> Du meintest am 09.05.12:
>
> > DUP is two copies of each block, but it allows the two copies to
> > live on the same device. It's done this because you started with a
> > single device, and you can't do RAID-1 on one device. The first bit
> > of metadata you write to it should automatically upgrade the DUP
> > chunk to RAID-1.
>
> Ok.
>
> Sounds familiar - have you explained that to me many months ago?

Probably. I tend to explain this kind of thing a lot to people.

> > As to the spurious "upgrade" of single to RAID-0, I thought Ilya
> > had stopped it doing that. What kernel version are you running?
>
> 3.2.9, self made.

OK, I'm pretty sure that's too old -- it will "upgrade" single to
RAID-0. You can probably turn it back to "single" using balance
filters:

# btrfs fi balance -dconvert=single /mountpoint

(You may want to write at least a little data to the FS first --
balance has some slightly odd behaviour on empty filesystems).

> I could test the message with 3.3.4, but not today (if it's only an
> interpretation of always the same data).
>
> > Out of interest, why did you do the device adds separately,
> > instead of just this?
>
> a) making the first 2 devices: I have tested both versions (one line
> with 2 devices or 2 lines with 1 device); no big difference.
>
> But I had tested the option "-L" (labelling) too, and that makes shit
> for the oneliner: both devices get the same label, and then "findfs"
> finds none of them.

Umm... Yes, of course both devices will get the same label --
you're labelling the filesystem, not the devices. (Didn't we have this
argument some time ago?).

I don't know what "findfs" is doing, that it can't find the
filesystem by label: you may need to run "sync" after mkfs, possibly.

> The really safe way would be: deleting this option for the "mkfs.btrfs"
> command and only using
>
>     btrfs fi label <device> [<newlabel>]

... except that it'd have to take a filesystem as parameter, not a
device (see above).

> b) third device: that's my usual test:
>     make a cluster of 2 deivces
>     fill them with data
>     add a third device
>     delete the smallest device

What are you testing? And by "delete" do you mean "btrfs dev
delete" or "pull the cable out"?

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Quidquid latine dictum sit, altum videtur. ---
Ilya Dryomov
2012-May-09 16:13 UTC
Re: failed disk (was: kernel 3.3.4 damages filesystem (?))
On Wed, May 09, 2012 at 03:37:35PM +0100, Hugo Mills wrote:
> On Wed, May 09, 2012 at 04:25:00PM +0200, Helmut Hullen wrote:
> > [...]
> > "Data, RAID0" confuses me (not very much ...), and the system for
> > metadata (RAID1) is not told.
>
> DUP is two copies of each block, but it allows the two copies to
> live on the same device. It's done this because you started with a
> single device, and you can't do RAID-1 on one device. The first bit of

What Hugo said. Newer mkfs.btrfs will error out if you try to do this.

> metadata you write to it should automatically upgrade the DUP chunk to
> RAID-1.

We don't "upgrade" chunks in place, only during balance.

> As to the spurious "upgrade" of single to RAID-0, I thought Ilya
> had stopped it doing that. What kernel version are you running?

I did, but again, we were doing it only as part of balance, not as part
of normal operation.

Helmut, do you have any additional data points: the output of btrfs fi
df right after you created the FS, or somewhere in the middle of
filling it? Also, could you please paste the output of btrfs fi show
and tell us what kernel version you are running?

Thanks,

Ilya
Helmut Hullen posted on Mon, 07 May 2012 12:46:00 +0200 as excerpted:

> The 3 btrfs disks are connected via a SiI 3114 SATA-PCI-Controller.
> Only 1 of the 3 disks seems to be damaged.

I don't plan to rehash the raid0/single discussion here, but here's some perhaps useful additional information on that hardware:

For some years I've been running that same hardware, SiI 3114 SATA PCI, on an old dual-socket 3-digit Opteron system, running for some years now dual dual-core Opteron 290s (the highest they went, 2.8 GHz, 4 cores in two sockets). However, I *WAS* running them in RAID-1, 4-disk md RAID-1, to be exact (with reiserfs, FWIW).

What's VERY interesting is that I've just returned from being offline for several days due to severe disk-I/O hardware issues of my own -- again, on that SIL-SATA 3114.

Most of the time I was getting full system crashes, but perhaps 25-33% of the time it didn't fully crash the system, and simply errored out with an eventual ATA reset. When the system didn't crash immediately, most of the time (about 80% I'd say) the reset would be good and I'd be back up, but sometimes it'd repeatedly reset, occasionally not ever becoming usable again.

As the drives are all the same quite old Seagate 300 gig drives, at about half their rated SMART operating hours but I think well beyond the 5 year warranty, I originally thought I'd just learned my lesson on the "don't use all the same model or you're risking them all going out at once" rule. So I bought a new drive (half-TB Seagate 2.5" drive; I've been thinking about going 2.5" for a while now and this was the chance; I'll RAID it later with at least one more, preferably a different run at least if not a different model) and have been SLOWLY, PAINFULLY, RESETTINGLY copying stuff over from one or another of the four RAID-1 drives.

The reset problem, however, hasn't gone away, tho it's rather reduced on the newer hardware.

I also happened to have a 4-3.5-in-3-5.25-slot drive enclosure that seemed to be making the problem worse, as when I first tried the new 2.5 inch retrofitted into it, the reset problem was as bad with it as with the old drives, but when I ran it "loose", just cabled into the mobo and power-supply directly, resets went down significantly but did NOT go away.

So... I've now concluded that I need a new controller and will probably buy one in a day or two.

Meanwhile, I THOUGHT it was "just me" with the SIL-SATA controller, until I happened to see the same hardware mentioned on this thread.

Now, I'm beginning to suspect that there's some new kernel DMA or storage or perhaps xorg/mesa (AMD AGPGART, after all, handling the DMA using half the aperture. If either the graphics or storage try writing to the wrong half...) problem that stressed what was already aging hardware, triggering the problem. It's worth noting that I tried running an older kernel and rebuilding (on Gentoo) most of X/mesa/anything-else-I-could-think-might-be-related between older versions that WERE working fine before and newer versions, and reverting to older didn't help, so it's apparently NOT a direct software-only bug. However, what I'm wondering now is whether, as I said, software upgrades added stress to already aging hardware, such that it tipped it over the edge, and by the time I tried reverting, I'd already had enough crashes etc. that my entire system was unstable, and reverting to older software didn't help because now the hardware was unstable as well.

I'd still chalk it up to simply failing hardware, except that it's a rather interesting coincidence that both you and I had our SIL-SATA 3114s go bad at very close to the same time.

Meanwhile, I did recently see an interesting kernel commit, either late 3.4-rc5+ or early 3.4-rc6+. I don't want to try to track it down and lose this post to a crash on a less than stable system, but it did mention that AMD AGPGARTs sometimes poked holes in memory allocations and the commit was to try to allow for that. I'm not sure how long the bad code had been in the kernel, but if it was introduced at say the 3.2 or 3.3 kernel, it could be that this is what first started triggering the lockups that led to more and more system instability, until now I've bought a new drive and it looks like I'm going to need to replace the onboard SIL-SATA.

So, some questions:

* Do you run OpenGL/Mesa at all on that system, possibly with an OpenGL compositing window manager?

* If so, how new is your mesa and xorg-server, and what is your video card/driver?

* Do you run quite new kernels, say 3.3/3.4?

* What libffi and cairo? (I did notice reverting libffi seemed to lessen the crashing a bit, especially with firefox on my bank's SSL site, which was where the problem first became ugly for me as I kept crashing trying to get in to pay bills, etc, but I'm not positive that's related, or it might be that the crashes from a likely otherwise separate bug advanced the ATA-resets issue too.)

* Perhaps most critically, is your system an old AMD with the AGPGART?

* Also, amd64/x86_64, x86 (32), or?

FWIW, amd64, KDE 4.8 here with kwin OpenGL compositing, generally leading edge mesa/xorg. I run git kernels so am on pre-release 3.4 now, and was on pre-release 3.3 before that, when the problem perhaps started. (It seemed to get worse so I can't say for sure when it went from normal to getting gradually worse, but for sure it wasn't back in the 3.2 era as I was stable and happy back then.) Radeon hd4650 card, freedomware drivers.

If any of that, especially the AGPGART, sounds familiar, we may have a hardware-burner bug that caught us both. If you're running a bit older versions of all that stuff or no compositing/opengl, and have say an nVidia card and no AMD AGPGART, it's probably simply coincidence. But if it's not, and we can catch and get this fixed before the folks running older software as well upgrade and start burning their SIL-SATAs...

(FWIW, I hadn't yet upgraded to btrfs at all when the trouble started happening here, tho I was looking at it, thus my being on the list. I didn't trust the two-way-only btrfs raid1 mode on my older disks and was waiting on N-way raid1 mode, roadmapped for after raid-5/6 mode, which is now roadmapped for 3.5... But with a new disk, eventually to add another for raid, I don't have that problem now, so with the upgrade I'm trying btrfs dual-metadata single-data on a few working partitions now; backup's still reiserfs, tho.)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
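The reset rates described above ("about 80%" of resets succeeding) are estimates; they could be checked against the kernel log. A minimal Python sketch, assuming syslog lines in the same format as the boot messages quoted at the top of this thread. The function name `count_resets` and the regular expressions are illustrative, not part of any existing tool.

```python
import re
from collections import Counter

# Patterns for kernel log lines such as:
#   May 7 06:55:26 Arktur kernel: ata5: hard resetting link
#   May 7 06:55:31 Arktur kernel: ata5: COMRESET failed (errno=-19)
# The optional ".NN" suffix covers device-level names like "ata5.00".
RESET_RE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: hard resetting link")
FAILED_RE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: COMRESET failed")


def count_resets(lines):
    """Return (resets, failures): Counters keyed by ATA port name."""
    resets, failures = Counter(), Counter()
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            resets[match.group(1)] += 1
            continue
        match = FAILED_RE.search(line)
        if match:
            failures[match.group(1)] += 1
    return resets, failures


# Four lines copied from the log excerpt at the top of the thread.
SAMPLE = """\
May 7 06:55:26 Arktur kernel: ata5: hard resetting link
May 7 06:55:31 Arktur kernel: ata5: COMRESET failed (errno=-19)
May 7 06:55:36 Arktur kernel: ata5: hard resetting link
May 7 06:56:22 Arktur kernel: ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
""".splitlines()

if __name__ == "__main__":
    resets, failures = count_resets(SAMPLE)
    for port in sorted(resets):
        print(port, "resets:", resets[port], "failed:", failures[port])
```

Feeding the function the full log (for example the lines of /var/log/messages, if that is where the system logs kernel messages) would give per-port totals, which makes it easier to see whether only one port or the whole controller is affected.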
I don't know if this is related or not, but I updated two different computers to Ubuntu 12, which uses kernel 3.2, and on both I had the same problem: using btrfs with compress-force=lzo, after some I/O stress the filesystem became unusable (some sort of busy state). I'm using kernel 3.0 right now, with no such problem.

On 09-05-2012 14:32, Duncan wrote:
> Helmut Hullen posted on Mon, 07 May 2012 12:46:00 +0200 as excerpted:
>
>> The 3 btrfs disks are connected via a SiI 3114 SATA-PCI-Controller.
>> Only 1 of the 3 disks seems to be damaged.
>
> I don't plan to rehash the raid0/single discussion here, but here's some
> perhaps useful additional information on that hardware:
>
> [...]
Hello, Hugo,

you wrote on 09.05.12:

>>> As to the spurious "upgrade" of single to RAID-0, I thought Ilya
>>> had stopped it doing that. What kernel version are you running?

>> 3.2.9, self made.

> OK, I'm pretty sure that's too old -- it will "upgrade" single to
> RAID-0. You can probably turn it back to "single" using balance
> filters:

> # btrfs fi balance -dconvert=single /mountpoint

> (You may want to write at least a little data to the FS first --
> balance has some slightly odd behaviour on empty filesystems).

"manana" ... the system is currently running "balance" after "device delete". And that may still need 4 ... 5 hours.

>>> Out of interest, why did you do the device adds separately,
>>> instead of just this?

>> a) making the first 2 devices: I have tested both versions (one line
>> with 2 devices or 2 lines with 1 device); no big difference.
>>
>> But I had tested the option "-L" (labelling) too, and that makes
>> shit for the oneliner: both devices get the same label, and then
>> "findfs" finds none of them.

> Umm... Yes, of course both devices will get the same label --
> you're labelling the filesystem, not the devices. (Didn't we have
> this argument some time ago?).

Not with that special case (and that led me to misinterpret the error ...).

> I don't know what "findfs" is doing, that it can't find the
> filesystem by label: you may need to run "sync" after mkfs, possibly.

No - "findfs" works quite simply: if it finds exactly 1 matching label, it reports the partition. If it finds more or fewer than 1, it reports nothing.

>> b) third device: that's my usual test:
>> make a cluster of 2 devices
>> fill them with data
>> add a third device
>> delete the smallest device

> What are you testing? And by "delete" do you mean "btrfs dev
> delete" or "pull the cable out"?

First a pure software delete. Tomorrow I'll reboot the system and look at the results with

    btrfs fi show

It should show only 2 devices (that's the part which seems to work as described at least since kernel 3.2).

By the way: it seems to be necessary to run

    btrfs fi balance ...

after "btrfs device add ..." and after "btrfs device delete ...".

Best regards!
Helmut
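The "findfs" behaviour described in the message above (exactly one match is reported, anything else yields nothing) can be modelled in a few lines. This is a sketch of the behaviour as described in this thread, not the actual findfs implementation; the function name `find_by_label` and the device paths and labels are hypothetical.

```python
def find_by_label(devices, label):
    """Return the single device path carrying `label`, or None.

    `devices` maps device path -> filesystem label. Zero matches and
    multiple matches are both treated as "not found", mirroring the
    behaviour described in the message above.
    """
    matches = [path for path, dev_label in devices.items() if dev_label == label]
    return matches[0] if len(matches) == 1 else None


if __name__ == "__main__":
    # Hypothetical layout: a two-device btrfs created with one mkfs call
    # and -L, so both member devices carry the same filesystem label.
    devices = {
        "/dev/sda1": "root",
        "/dev/sdb1": "Test",
        "/dev/sdc1": "Test",
    }
    print(find_by_label(devices, "root"))   # unique label
    print(find_by_label(devices, "Test"))   # label shared by two devices
    print(find_by_label(devices, "nope"))   # label not present
```

Under this model a label lookup can only succeed on a multi-device filesystem if the tool treats all member devices as one filesystem, which is the point of the remark that the label belongs to the filesystem and not to the devices.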
Hello, Hugo,

you wrote on 09.05.12:

>> btrfs fi df /mnt/Scsi
>>
>> now tells
>>
>> Data, RAID0: total=183.18GB, used=76.60GB
>> Data: total=80.01GB, used=79.83GB
>> System, DUP: total=8.00MB, used=32.00KB
>> System: total=4.00MB, used=0.00
>> Metadata, DUP: total=1.00GB, used=192.74MB
>> Metadata: total=8.00MB, used=0.00
>>
>> --------------------------------------
>>
>> "Data, RAID0" confuses me (not very much ...), and the system for
>> metadata (RAID1) is not told.

> DUP is two copies of each block, but it allows the two copies to
> live on the same device. It's done this because you started with a
> single device, and you can't do RAID-1 on one device. The first bit
> of metadata you write to it should automatically upgrade the DUP
> chunk to RAID-1.

It has done so - ok.

Adding and removing disks/partitions works as expected.

Best regards!
Helmut
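To see at a glance which chunk types and profiles are present and how full each one is, the "btrfs fi df" output quoted above can be parsed mechanically. A minimal Python sketch, assuming the line format shown in the message; the function name `parse_fi_df` is illustrative. Two assumptions are made and marked in the comments: that a line without a profile name means the "single" profile, and that the size suffixes are powers of 1024.

```python
import re

# One line of `btrfs fi df` output as quoted in the message above, e.g.
#   Data, RAID0: total=183.18GB, used=76.60GB
#   System: total=4.00MB, used=0.00
LINE_RE = re.compile(
    r"^(?P<type>\w+)(?:, (?P<profile>\w+))?: "
    r"total=(?P<total>[\d.]+)(?P<tunit>[KMGT]?B)?, "
    r"used=(?P<used>[\d.]+)(?P<uunit>[KMGT]?B)?$"
)

# Assumption: suffixes are binary multiples (1 KB = 1024 bytes).
UNITS = {None: 1, "B": 1, "KB": 2**10, "MB": 2**20, "GB": 2**30, "TB": 2**40}


def parse_fi_df(text):
    """Parse `btrfs fi df` text into a list of dicts, one per line."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank lines and anything unrecognised
        total = float(match["total"]) * UNITS[match["tunit"]]
        used = float(match["used"]) * UNITS[match["uunit"]]
        rows.append({
            "type": match["type"],
            # Assumption: no profile name on the line means "single".
            "profile": match["profile"] or "single",
            "total": total,
            "used": used,
            "fill": used / total if total else 0.0,
        })
    return rows


SAMPLE = """\
Data, RAID0: total=183.18GB, used=76.60GB
Data: total=80.01GB, used=79.83GB
System, DUP: total=8.00MB, used=32.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=192.74MB
Metadata: total=8.00MB, used=0.00
"""

if __name__ == "__main__":
    for row in parse_fi_df(SAMPLE):
        print(f"{row['type']:<9}{row['profile']:<7}{row['fill']:6.1%}")
```

Listing type and profile side by side makes a mixed state visible, such as the data chunks above being split between RAID0 and a second, almost full set without a profile name.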
On Wednesday, 9 May 2012, Kaspar Schleiser wrote:
> Hi,
>
> On 05/08/2012 10:56 PM, Roman Mamedov wrote:
> > Regarding btrfs, AFAIK even "btrfs -d single" suggested above works
> > not "per file", but per allocation extent, so in case of one disk
> > failure you will lose random *parts* (extents) of random files,
> > which in effect could mean no file in your whole file system will
> > remain undamaged.
>
> Maybe we should evaluate the possibility of such a "one file gets on
> one disk" feature.
>
> Helmut Hullen has the use case: Many disks, totally non-critical but
> nice-to-have data. If one disk dies, some *files* should be lost, not some
> *random parts of all files*.
>
> This could be accomplished by some userspace-tool that moves stuff
> around, combined with "file pinning"-support, that lets the user make
> sure a specific file is on a specific disk.

Yeah, basically I think that's the whole point Helmut is trying to make.

I am not sure whether that should be in userspace. It could be just an allocation mode like "raid0" or "single". Such as "single" as in: one file is really on one disk, and that's it.

--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
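The difference between "per extent" and "one file on one disk" placement discussed above can be illustrated with a toy model. This is a sketch only: real btrfs allocates in chunks/block groups and does not place extents at random, so the random placement, the function names, and the file and disk counts are all assumptions made for illustration.

```python
import random


def place_per_extent(files, n_disks, rng):
    """Toy model: every extent of every file lands on a random disk."""
    return {name: [rng.randrange(n_disks) for _ in range(n_extents)]
            for name, n_extents in files.items()}


def place_per_file(files, n_disks, rng):
    """Toy model: all extents of a file land on one randomly chosen disk."""
    layout = {}
    for name, n_extents in files.items():
        disk = rng.randrange(n_disks)
        layout[name] = [disk] * n_extents
    return layout


def damaged_files(layout, failed_disk):
    """Names of files that have at least one extent on the failed disk."""
    return sorted(name for name, disks in layout.items() if failed_disk in disks)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    files = {f"file{i:03d}": 20 for i in range(100)}  # 100 files, 20 extents each
    for label, place in (("per extent", place_per_extent),
                         ("per file", place_per_file)):
        layout = place(files, n_disks=3, rng=rng)
        lost = damaged_files(layout, failed_disk=0)
        print(f"{label}: {len(lost)} of {len(files)} files damaged")
```

In this model a file with many extents is very likely to touch every disk under per-extent placement, so losing one disk damages nearly all files, while per-file placement limits the damage to the files that happened to live on the failed disk.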
Helmut Hullen
2012-May-10 11:55 UTC
feature request (was: kernel 3.3.4 damages filesystem (?))
Hello, Martin,

you wrote on 10.05.12:

[...]

>> Maybe we should evaluate the possibility of such a "one file gets
>> on one disk" feature.
>>
>> Helmut Hullen has the use case: Many disks, totally non-critical but
>> nice-to-have data. If one disk dies, some *files* should be lost, not
>> some *random parts of all files*.
>>
>> This could be accomplished by some userspace-tool that moves stuff
>> around, combined with "file pinning"-support, that lets the user
>> make sure a specific file is on a specific disk.

> Yeah, basically I think that's the whole point Helmut is trying to
> make.

Yes - that's the feature which I miss ...

> I am not sure whether that should be in userspace. It could be just
> an allocation mode like "raid0" or "single". Such as "single" as in
> one file is really on one disk and that's it.

What I'm dreaming of: I have a bundle/cluster of (e.g.) 3 disks. When I remove 1 disk (accidentally/planned/because of disk failure), then I'd be very pleased if the contents of the other disks were (mostly) still readable. It's no fun restoring terabytes ...

Yes - I know: that's no backup, that doesn't replace a backup.

Best regards!
Helmut
On Thursday, 10 May 2012 12:40:49, Martin Steigerwald wrote:
> [...]
> I am not sure whether that should be in userspace. It could be just an
> allocation mode like "raid0" or "single". Such as "single" as in one file
> is really on one disk and thats it.

I was thinking that "linear" would be a good name for the old-style allocator.

Regards
--
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl
On Thu, May 10, 2012 at 09:43:58PM +0200, Hubert Kario wrote:
> On Thursday 10 of May 2012 12:40:49 Martin Steigerwald wrote:
> > [...]
> > I am not sure whether that should be in userspace. It could be just an
> > allocation mode like "raid0" or "single". Such as "single" as in one file
> > is really on one disk and thats it.
>
> I was thinking that "linear" would be good name for old style allocator.

Please do distinguish between the replication level (e.g. "single", "RAID-1") and the allocator algorithm. These are distinct. Also, note that both of those work on the scale of chunks/block groups. There is a further consideration, which is the allocation of file data to block groups, which is a whole different thing again (and not something I know a great deal about), but which will also affect the desired outcome quite a lot.

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Anyone who claims their cryptographic protocol is secure is ---
either a genius or a fool. Given the genius/fool ratio
for our species, the odds aren't good.
On Thursday, 10 May 2012 21:15:30, Hugo Mills wrote:
> [...]
> Please do distinguish between the replication level (e.g. "single",
> "RAID-1") and the allocator algorithm. These are distinct. Also, note
> that both of those work on the scale of chunks/block groups. There is
> a further consideration, which is the allocation of file data to block
> groups, which is a whole different thing again (and not something I
> know a great deal about), but which will also affect the desired
> outcome quite a lot.

Yes, I know about that.

I was thinking more along the lines of "how to quickly restore the availability of the old allocator".

Regards,
--
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl