Hello Folks,
I used to have an array of 4x4TB drives with BTRFS in raid10.
The kernel version is: 3.13-0.bpo.1-amd64
BTRFS version is: v3.14.1
When it reached about 80% of its capacity, I added another 4TB drive to the array with:
> btrfs device add /dev/sdf /mnt/backup
And started the balancing to the new drive:
> btrfs filesystem balance /mnt/backup
The balance ran for 5-6 hours before it segfaulted with a "not enough
free space" message.
Now my configuration looks like this:
> btrfs fi show /mnt/backup
Label: 'backup' uuid: ...
Total devices 5 FS bytes used 5.93TiB
devid 1 size 3.64TiB used 2.82TiB path /dev/sdd
devid 2 size 3.64TiB used 2.82TiB path /dev/sdc
devid 3 size 3.64TiB used 2.81TiB path /dev/sdb
devid 4 size 3.64TiB used 2.82TiB path /dev/sde
devid 5 size 3.64TiB used 638.50GiB path /dev/sdf
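To put numbers on the output above, here is a rough Python sketch that computes how much space on each device is still unallocated (size minus used). It assumes that "used" in this output means space already allocated to chunks rather than file data, and the helper names (`to_gib`, `unallocated`) are made up for illustration only.

```python
import re

# The devid lines copied from the "btrfs fi show" output above.
FI_SHOW = """\
devid 1 size 3.64TiB used 2.82TiB path /dev/sdd
devid 2 size 3.64TiB used 2.82TiB path /dev/sdc
devid 3 size 3.64TiB used 2.81TiB path /dev/sdb
devid 4 size 3.64TiB used 2.82TiB path /dev/sde
devid 5 size 3.64TiB used 638.50GiB path /dev/sdf
"""

# Conversion factors to GiB for the two units that appear in the output.
UNIT_GIB = {"GiB": 1.0, "TiB": 1024.0}

DEVID_RE = re.compile(
    r"\s*devid\s+\d+\s+size\s+(\S+)\s+used\s+(\S+)\s+path\s+(\S+)"
)


def to_gib(text):
    """Convert a string such as '3.64TiB' or '638.50GiB' to GiB."""
    m = re.fullmatch(r"([\d.]+)(GiB|TiB)", text)
    if m is None:
        raise ValueError(f"unexpected size format: {text!r}")
    return float(m.group(1)) * UNIT_GIB[m.group(2)]


def unallocated(fi_show):
    """Return {device path: unallocated GiB} for every devid line."""
    result = {}
    for line in fi_show.splitlines():
        m = DEVID_RE.match(line)
        if m:
            size, used, path = m.groups()
            result[path] = to_gib(size) - to_gib(used)
    return result


if __name__ == "__main__":
    for path, gib in unallocated(FI_SHOW).items():
        print(f"{path}: {gib:.2f} GiB unallocated")
```

The values are only as precise as the rounded figures that btrfs prints, so treat them as estimates.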
After this crash happened during the balancing (logs are attached at the end), the
system remounted my /mnt/backup share read-only.
At this point I started to really worry. I unmounted and remounted it manually.
At first it ran some self-checks, which took about 5 minutes; then, as iotop
showed, it continued with the balancing, which failed again in the same way. The next
time, I paused the balancing immediately after mounting (which helped).
My question is: where do I go from here? What I am going to do right now is copy
the most important data to a separate XFS drive.
What I am planning to do is:
1. Upgrade the kernel
2. Upgrade BTRFS
3. Continue the balancing.
Could someone please also explain how exactly a raid10 setup works
with an odd number of drives in btrfs?
Raid10 should be a stripe of mirrors. So is this sdf drive mirrored,
striped, or what?
Could some btrfs gurus tell me whether I should be worried about data loss because of
this?
Would I need even more free space just to add a 5th drive? If so, how much more?
Kernel logs
-----------
Oct 24 17:25:44 backup kernel: [29396.873750] btrfs: relocating block group
5162588438528 flags 65
Oct 24 17:26:09 backup kernel: [29421.594524] btrfs: found 13126 extents
Oct 24 17:26:38 backup kernel: [29450.769228] btrfs: found 13126 extents
Oct 24 17:26:39 backup kernel: [29451.345198] btrfs: relocating block group
5161514696704 flags 68
Oct 24 17:31:33 backup kernel: [29745.776810] BTRFS debug (device sdb):
run_one_delayed_ref returned -28
Oct 24 17:31:33 backup kernel: [29745.776818] ------------[ cut here
]------------
Oct 24 17:31:33 backup kernel: [29745.776847] WARNING: CPU: 1 PID: 1807 at
/build/linux-t5aGFh/linux-3.13.10/fs/btrfs/super.c:254
__btrfs_abort_transaction+0x5a/0x140 [btrfs]()
Oct 24 17:31:33 backup kernel: [29745.776849] btrfs: Transaction aborted (error
-28)
Oct 24 17:31:33 backup kernel: [29745.776851] Modules linked in: xen_gntdev
xen_evtchn xenfs xen_privcmd nfsd auth_rpcgss oid_registry nfs_acl nfs lockd
fscache sunrpc 8021q garp mrp bridge stp llc loop iTCO_wdt iTCO_vendor_support
lpc_ich radeon mfd_core processor evdev ttm drm_kms_helper drm i2c_algo_bit
coretemp rng_core serio_raw pcspkr i2c_i801 i2c_core i3000_edac thermal_sys
button shpchp edac_core ext4 crc16 mbcache jbd2 btrfs xor raid6_pq crc32c
libcrc32c dm_mod xen_pciback sg sd_mod sr_mod crc_t10dif cdrom crct10dif_common
ata_generic ahci ata_piix libahci 3w_9xxx libata scsi_mod ehci_pci uhci_hcd
ehci_hcd e1000e ptp pps_core usbcore usb_common
Oct 24 17:31:33 backup kernel: [29745.776902] CPU: 1 PID: 1807 Comm:
btrfs-transacti Not tainted 3.13-0.bpo.1-amd64 #1 Debian 3.13.10-1~bpo70+1
Oct 24 17:31:33 backup kernel: [29745.776905] Hardware name: Supermicro
PDSM4+/PDSM4+, BIOS 6.00 02/05/2007
Oct 24 17:31:33 backup kernel: [29745.776907] 0000000000000000 ffffffffa0257130
ffffffff814d16c9 ffff88006a7f3cc8
Oct 24 17:31:33 backup kernel: [29745.776911] ffffffff81060967 00000000ffffffe4
ffff880004282800 ffff88003b813ec0
Oct 24 17:31:33 backup kernel: [29745.776914] 0000000000000aaa ffffffffa0253b60
ffffffff81060a55 ffffffffa0257260
Oct 24 17:31:33 backup kernel: [29745.776918] Call Trace:
Oct 24 17:31:33 backup kernel: [29745.776926] [<ffffffff814d16c9>] ?
dump_stack+0x41/0x51
Oct 24 17:31:33 backup kernel: [29745.776931] [<ffffffff81060967>] ?
warn_slowpath_common+0x87/0xc0
Oct 24 17:31:33 backup kernel: [29745.776935] [<ffffffff81060a55>] ?
warn_slowpath_fmt+0x45/0x50
Oct 24 17:31:33 backup kernel: [29745.776946] [<ffffffffa01b73ca>] ?
__btrfs_abort_transaction+0x5a/0x140 [btrfs]
Oct 24 17:31:33 backup kernel: [29745.776959] [<ffffffffa01d2e72>] ?
btrfs_run_delayed_refs+0x372/0x530 [btrfs]
Oct 24 17:31:33 backup kernel: [29745.776974] [<ffffffffa01fa8c3>] ?
btrfs_run_ordered_operations+0x213/0x2b0 [btrfs]
Oct 24 17:31:33 backup kernel: [29745.776988] [<ffffffffa01e2fea>] ?
btrfs_commit_transaction+0x5a/0x990 [btrfs]
Oct 24 17:31:33 backup kernel: [29745.777001] [<ffffffffa01e1345>] ?
transaction_kthread+0x1c5/0x240 [btrfs]
Oct 24 17:31:33 backup kernel: [29745.777015] [<ffffffffa01e1180>] ?
open_ctree+0x1ff0/0x1ff0 [btrfs]
Oct 24 17:31:33 backup kernel: [29745.777019] [<ffffffff8108233c>] ?
kthread+0xbc/0xe0
Oct 24 17:31:33 backup kernel: [29745.777022] [<ffffffff81082280>] ?
flush_kthread_worker+0xa0/0xa0
Oct 24 17:31:33 backup kernel: [29745.777026] [<ffffffff814dee4c>] ?
ret_from_fork+0x7c/0xb0
Oct 24 17:31:33 backup kernel: [29745.777030] [<ffffffff81082280>] ?
flush_kthread_worker+0xa0/0xa0
Oct 24 17:31:33 backup kernel: [29745.777032] ---[ end trace 5de5beb31698a3c1
]---
Oct 24 17:31:33 backup kernel: [29745.777035] BTRFS error (device sdb) in
btrfs_run_delayed_refs:2730: errno=-28 No space left
Oct 24 17:31:33 backup kernel: [29745.777512] BTRFS info (device sdb): forced
readonly
Oct 24 17:31:33 backup kernel: [29745.784767] BTRFS debug (device sdb):
run_one_delayed_ref returned -28
Oct 24 17:31:33 backup kernel: [29745.784773] BTRFS error (device sdb) in
btrfs_run_delayed_refs:2730: errno=-28 No space left
Oct 24 17:35:53 backup kernel: [30005.015967] btrfs: device label backup_fs
devid 3 transid 86656 /dev/sdb
Oct 24 17:35:53 backup kernel: [30005.063903] btrfs: disk space caching is
enabled
Oct 24 17:43:01 backup kernel: [30433.356660] BTRFS debug (device sdf): unlinked
1 orphans
Oct 24 17:43:01 backup kernel: [30433.395645] btrfs: continuing balance
Oct 24 17:43:02 backup kernel: [30434.395936] btrfs: relocating block group
7434626138112 flags 65
Oct 24 17:43:17 backup kernel: [30449.104022] btrfs: found 8842 extents
Oct 24 17:43:24 backup kernel: [30456.043235] btrfs: found 8834 extents
Oct 24 17:43:24 backup kernel: [30456.580133] btrfs: relocating block group
7223098998784 flags 68
Oct 24 17:48:42 backup kernel: [30774.465707] btrfs: found 37187 extents
Oct 24 17:48:43 backup kernel: [30775.058570] btrfs: relocating block group
6782864850944 flags 68
Oct 24 17:52:16 backup kernel: [30988.070735] BTRFS debug (device sdf):
run_one_delayed_ref returned -28
Oct 24 17:52:16 backup kernel: [30988.070742] ------------[ cut here
]------------
Oct 24 17:52:16 backup kernel: [30988.070772] WARNING: CPU: 1 PID: 15920 at
/build/linux-t5aGFh/linux-3.13.10/fs/btrfs/super.c:254
__btrfs_abort_transaction+0x5a/0x140 [btrfs]()
Oct 24 17:52:16 backup kernel: [30988.070775] btrfs: Transaction aborted (error
-28)
Oct 24 17:52:16 backup kernel: [30988.070776] Modules linked in: xen_gntdev
xen_evtchn xenfs xen_privcmd nfsd auth_rpcgss oid_registry nfs_acl nfs lockd
fscache sunrpc 8021q garp mrp bridge stp llc loop iTCO_wdt iTCO_vendor_support
lpc_ich radeon mfd_core processor evdev ttm drm_kms_helper drm i2c_algo_bit
coretemp rng_core serio_raw pcspkr i2c_i801 i2c_core i3000_edac thermal_sys
button shpchp edac_core ext4 crc16 mbcache jbd2 btrfs xor raid6_pq crc32c
libcrc32c dm_mod xen_pciback sg sd_mod sr_mod crc_t10dif cdrom crct10dif_common
ata_generic ahci ata_piix libahci 3w_9xxx libata scsi_mod ehci_pci uhci_hcd
ehci_hcd e1000e ptp pps_core usbcore usb_common
Oct 24 17:52:16 backup kernel: [30988.070828] CPU: 1 PID: 15920 Comm:
btrfs-transacti Tainted: G W 3.13-0.bpo.1-amd64 #1 Debian
3.13.10-1~bpo70+1
Oct 24 17:52:16 backup kernel: [30988.070830] Hardware name: Supermicro
PDSM4+/PDSM4+, BIOS 6.00 02/05/2007
Oct 24 17:52:16 backup kernel: [30988.070833] 0000000000000000 ffffffffa0257130
ffffffff814d16c9 ffff880056d7bcc8
Oct 24 17:52:16 backup kernel: [30988.070838] ffffffff81060967 00000000ffffffe4
ffff880003c97000 ffff88006ba9abe0
Oct 24 17:52:16 backup kernel: [30988.070841] 0000000000000aaa ffffffffa0253b60
ffffffff81060a55 ffffffffa0257260
Oct 24 17:52:16 backup kernel: [30988.070845] Call Trace:
Oct 24 17:52:16 backup kernel: [30988.070853] [<ffffffff814d16c9>] ?
dump_stack+0x41/0x51
Oct 24 17:52:16 backup kernel: [30988.070858] [<ffffffff81060967>] ?
warn_slowpath_common+0x87/0xc0
Oct 24 17:52:16 backup kernel: [30988.070862] [<ffffffff81060a55>] ?
warn_slowpath_fmt+0x45/0x50
Oct 24 17:52:16 backup kernel: [30988.070873] [<ffffffffa01b73ca>] ?
__btrfs_abort_transaction+0x5a/0x140 [btrfs]
Oct 24 17:52:16 backup kernel: [30988.070886] [<ffffffffa01d2e72>] ?
btrfs_run_delayed_refs+0x372/0x530 [btrfs]
Oct 24 17:52:16 backup kernel: [30988.070901] [<ffffffffa01fa8c3>] ?
btrfs_run_ordered_operations+0x213/0x2b0 [btrfs]
Oct 24 17:52:16 backup kernel: [30988.070915] [<ffffffffa01e2fea>] ?
btrfs_commit_transaction+0x5a/0x990 [btrfs]
Oct 24 17:52:16 backup kernel: [30988.070929] [<ffffffffa01e1345>] ?
transaction_kthread+0x1c5/0x240 [btrfs]
Oct 24 17:52:16 backup kernel: [30988.070942] [<ffffffffa01e1180>] ?
open_ctree+0x1ff0/0x1ff0 [btrfs]
Oct 24 17:52:16 backup kernel: [30988.070946] [<ffffffff8108233c>] ?
kthread+0xbc/0xe0
Oct 24 17:52:16 backup kernel: [30988.070949] [<ffffffff81082280>] ?
flush_kthread_worker+0xa0/0xa0
Oct 24 17:52:16 backup kernel: [30988.070954] [<ffffffff814dee4c>] ?
ret_from_fork+0x7c/0xb0
Oct 24 17:52:16 backup kernel: [30988.070957] [<ffffffff81082280>] ?
flush_kthread_worker+0xa0/0xa0
Oct 24 17:52:16 backup kernel: [30988.070960] ---[ end trace 5de5beb31698a3c2
]---
Oct 24 17:52:16 backup kernel: [30988.070963] BTRFS error (device sdf) in
btrfs_run_delayed_refs:2730: errno=-28 No space left
Oct 24 17:52:16 backup kernel: [30988.071439] BTRFS info (device sdf): forced
readonly
Oct 24 17:52:16 backup kernel: [30988.081154] BTRFS debug (device sdf):
run_one_delayed_ref returned -28
Oct 24 17:52:16 backup kernel: [30988.081161] BTRFS error (device sdf) in
btrfs_run_delayed_refs:2730: errno=-28 No space left
Oct 24 17:55:34 backup kernel: [31186.936384] btrfs: device label backup_fs
devid 3 transid 86683 /dev/sdb
Oct 24 17:55:35 backup kernel: [31187.067619] btrfs: disk space caching is
enabled
Oct 24 18:01:23 backup kernel: [31535.301582] BTRFS debug (device sdf): unlinked
1 orphans
Oct 24 18:01:23 backup kernel: [31535.339410] btrfs: continuing balance
Oct 24 18:01:23 backup kernel: [31535.624023] btrfs: relocating block group
7438921105408 flags 68
Oct 24 18:02:37 backup kernel: [31609.293378] btrfs: found 26705 extents
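
For reference, the numeric values in the logs above can be decoded with a small Python sketch. It assumes the block group flag bits as I understand them from the kernel's btrfs headers (I have not checked them against the 3.13 source specifically), and `decode_flags` and `decode_errno` are hypothetical helper names.

```python
import errno

# Block group flag bits (assumed from the kernel's btrfs headers).
BLOCK_GROUP_BITS = {
    1 << 0: "DATA",
    1 << 1: "SYSTEM",
    1 << 2: "METADATA",
    1 << 3: "RAID0",
    1 << 4: "RAID1",
    1 << 5: "DUP",
    1 << 6: "RAID10",
}


def decode_flags(flags):
    """Return the names of all known bits set in a block group flags value."""
    return [name for bit, name in BLOCK_GROUP_BITS.items() if flags & bit]


def decode_errno(code):
    """Map a (possibly negative) kernel error code to its symbolic name."""
    return errno.errorcode[abs(code)]


if __name__ == "__main__":
    for flags in (65, 68):
        print(flags, decode_flags(flags))
    print(-28, decode_errno(-28))
```

If these bit values are right, "flags 65" is a raid10 data block group and "flags 68" is a raid10 metadata block group, and both aborts in the logs happened while a "flags 68" block group was being relocated. Error -28 is ENOSPC, which matches the "No space left" text.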
Thanks!