Hi,
Two days ago I decided to throw caution to the wind and convert my
raid1 array to raid6, for the space and redundancy benefits. I did

# btrfs fi balance start -dconvert=raid6 /media/btrfs

Eventually today the balance finished, but the conversion to raid6 was
incomplete:

# btrfs fi df /media/btrfs
Data, RAID1: total=693.00GB, used=690.47GB
Data, RAID6: total=6.36TB, used=4.35TB
System, RAID1: total=32.00MB, used=1008.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=8.00GB, used=6.04GB

A recent btrfs balance status (before finishing) said:

# btrfs balance status /media/btrfs
Balance on '/media/btrfs' is running
4289 out of about 5208 chunks balanced (4988 considered), 18% left

and at the end I have:

[164935.053643] btrfs: 693 enospc errors during balance

Here is the array:

# btrfs fi show /dev/sdb
Label: none  uuid: 743135d0-d1f5-4695-9f32-e682537749cf
        Total devices 7 FS bytes used 5.04TB
        devid    2 size 2.73TB used 2.73TB path /dev/sdh
        devid    1 size 2.73TB used 2.73TB path /dev/sdg
        devid    5 size 1.36TB used 1.31TB path /dev/sde
        devid    6 size 1.36TB used 1.31TB path /dev/sdf
        devid    4 size 1.82TB used 1.82TB path /dev/sdd
        devid    3 size 1.82TB used 1.82TB path /dev/sdc
        devid    7 size 1.82TB used 1.82TB path /dev/sdb

I'm running latest stable, plus the patch "free csums when we're done
scrubbing an extent" (otherwise I get OOM when scrubbing).

# uname -a
Linux dvanders-webserver 3.10.1+ #1 SMP Mon Jul 15 17:07:19 CEST 2013
x86_64 x86_64 x86_64 GNU/Linux

I still have plenty of free space:

# df -h /media/btrfs
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdd         14T  5.8T  2.2T  74% /media/btrfs

Any idea how I can get out of this? Thanks!
--
Dan van der Ster
Well, I'm trying a balance again with -dconvert=raid6 -dusage=5 this
time. Will report back...

On Wed, Jul 17, 2013 at 3:34 PM, Dan van der Ster <dan@vanderster.com> wrote:
> Hi,
> Two days ago I decided to throw caution to the wind and convert my
> raid1 array to raid6, for the space and redundancy benefits. I did
>
> # btrfs fi balance start -dconvert=raid6 /media/btrfs
>
> Eventually today the balance finished, but the conversion to raid6 was
> incomplete:
>
> [...]
>
> and at the end I have:
>
> [164935.053643] btrfs: 693 enospc errors during balance
>
> [...]
>
> Any idea how I can get out of this? Thanks!
> --
> Dan van der Ster
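For what it's worth: if the kernel and btrfs-progs in use understand the
"soft" balance filter, a soft converting balance is a lighter-weight retry,
since it skips chunks that already carry the target profile and only touches
the data chunks still left in RAID1. A sketch, untested on this filesystem:

# btrfs fi balance start -dconvert=raid6,soft /media/btrfs

Combining it with the usage filter, e.g. -dconvert=raid6,soft,usage=5,
should also be accepted if free space remains the limiting factor.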
On 07/17/2013 21:56, Dan van der Ster wrote:
> Well, I'm trying a balance again with -dconvert=raid6 -dusage=5 this
> time. Will report back...
>
> On Wed, Jul 17, 2013 at 3:34 PM, Dan van der Ster <dan@vanderster.com> wrote:
>> Hi,
>> Two days ago I decided to throw caution to the wind and convert my
>> raid1 array to raid6, for the space and redundancy benefits.
>> [...]
>> Any idea how I can get out of this? Thanks!

You know the limitations of the current Btrfs RAID5/6 implementation,
don't you? No protection against power loss or disk failures. No
support for scrub. These limits are explained very explicitly in the
commit message:

http://lwn.net/Articles/536038/

I'd recommend Btrfs RAID1 for the time being.
On Wed, Jul 17, 2013 at 10:53 PM, Stefan Behrens <sbehrens@giantdisaster.de> wrote:
> No protection against ... disk failures

Well, I was aware of the power loss and scrub issues, but not the lack
of protection against disk failures. That sorta defeats the purpose.
Back to RAID1 I go then... thanks :)
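Going back is the same kind of operation in reverse. Since metadata and
system chunks are still RAID1 according to the fi df output above, only
the data profile needs converting. A sketch, assuming enough unallocated
space exists for the second RAID1 copy:

# btrfs fi balance start -dconvert=raid1 /media/btrfs

RAID1 stores every extent twice, so the roughly 5TB of data will want
about 10TB of raw space; with several devices already fully allocated,
a usage-filtered balance first (as above) may again be needed to free up
room before the conversion can complete.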
So I'm reading:

* Progs support for parity rebuild. Missing drives upset the progs
  today, but the kernel does rebuild parity properly.

Is that wrong? Because that sounds like the userspace programs will bork,
but the filesystem can still be mounted and the kernel will rebuild from
parity.

On Thu, Jul 18, 2013 at 6:53 AM, Stefan Behrens <sbehrens@giantdisaster.de> wrote:
> [...]
>
> You know the limitations of the current Btrfs RAID5/6 implementation, don't
> you? No protection against power loss or disk failures. No support for
> scrub. These limits are explained very explicitly in the commit message:
>
> http://lwn.net/Articles/536038/
>
> I'd recommend Btrfs RAID1 for the time being.

--
Gareth Pye
Level 2 Judge, Melbourne, Australia
Australian MTG Forum: mtgau.com
gareth@cerberos.id.au - www.rockpaperdynamite.wordpress.com
"Dear God, I would like to file a bug report"
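For the record, the recovery path being asked about would look roughly
like the following. These are the ordinary degraded-recovery commands for
the mirrored profiles; whether the RAID5/6 code of this era survives them
with a drive missing is exactly the open question here:

# mount -o degraded /dev/sdb /media/btrfs
# btrfs device delete missing /media/btrfs

The "degraded" mount option lets the kernel assemble the filesystem
without the failed device; "device delete missing" then asks btrfs to
restore the missing redundancy onto the remaining disks, which works for
RAID1 and is the part the quoted note says the progs still trip over for
parity RAID.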
On Thu, 18 Jul 2013 09:01:26 +1000, Gareth Pye wrote:
> So I'm reading:
>
> * Progs support for parity rebuild. Missing drives upset the progs
>   today, but the kernel does rebuild parity properly.
>
> Is that wrong? Because that sounds like the userspace programs will bork,
> but the filesystem can still be mounted and the kernel will rebuild from
> parity.

"No protection", as I wrote, was wrong. The parity code is there. But the
known and documented issue is that the current code rewrites disk blocks
while they are still referenced. This is the power loss issue. Until the
improved RAID5/6 code is published, I cannot recommend using Btrfs RAID5/6
if you like your data.

And I've seen two failing disks on a Btrfs RAID6 filesystem cause a corrupt
filesystem; the log of such a case is attached. If you look at the log, sdu
and sdi fail, and some minutes later the "btrfs bad tree block start"
messages occur. The two reported EIO errors on sdz seem to be a consequence,
since no hardware errors are reported for sdz (there are no related mpt2sas
messages for sdz, and there are no Btrfs device statistics messages for
sdz). Eventually, it was not possible to mount the filesystem anymore.

May 27 17:43:45 btrfs: setting 8 feature flag
May 27 17:43:45 btrfs: use lzo compression
May 27 17:43:45 btrfs: enabling inode map caching
May 27 17:43:45 btrfs: disk space caching is enabled
May 27 17:43:45 btrfs flagging fs with big metadata feature
[...]
May 30 15:07:19 sd 6:0:18:0: [sdu] Synchronizing SCSI cache
May 30 15:07:19 sd 6:0:18:0: [sdu]
May 30 15:07:19 Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
May 30 15:07:19 mpt2sas0: removing handle(0x001c), sas_addr(0x50030480008b7bd6)
May 30 15:11:52 btrfs: bdev /dev/sdu errs: wr 0, rd 0, flush 1, corrupt 0, gen 0
May 30 15:11:52 lost page write due to I/O error on /dev/sdu
May 30 15:11:52 btrfs: bdev /dev/sdu errs: wr 1, rd 0, flush 1, corrupt 0, gen 0
[...]
May 30 15:12:56 sd 6:0:6:0: [sdi] Synchronizing SCSI cache
May 30 15:12:56 sd 6:0:6:0: [sdi]
May 30 15:12:56 Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
May 30 15:12:56 mpt2sas0: removing handle(0x0010), sas_addr(0x50030480008b7bca)
May 30 15:13:06 btrfs: bdev /dev/sdi errs: wr 0, rd 0, flush 1, corrupt 0, gen 0
May 30 15:13:07 lost page write due to I/O error on /dev/sdi
May 30 15:13:07 btrfs: bdev /dev/sdi errs: wr 1, rd 0, flush 1, corrupt 0, gen 0
[...]
May 30 15:13:36 btrfs: bdev /dev/sdu errs: wr 474, rd 0, flush 159, corrupt 0, gen 0
May 30 15:13:36 btrfs: bdev /dev/sdi errs: wr 165, rd 0, flush 56, corrupt 0, gen 0
May 30 15:14:12 btrfs bad tree block start 5363727999388487864 306642944
May 30 15:14:12 btrfs bad tree block start 6661525506344239944 306642944
May 30 15:14:12 btrfs bad tree block start 2349785282384153232 323960832
May 30 15:14:12 btrfs bad tree block start 343678018170920118 323960832
May 30 15:14:12 btrfs bad tree block start 2003655791289495711 323960832
May 30 15:14:12 btrfs bad tree block start 9327219462809738085 323960832
May 30 15:14:15 btrfs bad tree block start 10356026150932229840 323960832
May 30 15:14:15 btrfs bad tree block start 1515852799129945386 323960832
May 30 15:14:15 ------------[ cut here ]------------
May 30 15:14:15 WARNING: at fs/btrfs/super.c:254 __btrfs_abort_transaction+0xdd/0xf0 [btrfs]()
May 30 15:14:15 Hardware name: X8SIL
May 30 15:14:15 Modules linked in: mptctl mptbase btrfs raid6_pq xor bonding raid1 mpt2sas scsi_transport_sas raid_class
May 30 15:14:15 Pid: 21611, comm: btrfs-transacti Tainted: G W 3.9.0+ #82
May 30 15:14:15 Call Trace:
May 30 15:14:15  [<ffffffffa00a7d00>] ? __btrfs_abort_transaction+0xa0/0xf0 [btrfs]
May 30 15:14:15  [<ffffffff81087a0a>] warn_slowpath_common+0x7a/0xc0
May 30 15:14:15  [<ffffffff81087af1>] warn_slowpath_fmt+0x41/0x50
May 30 15:14:15  [<ffffffffa00a7d3d>] __btrfs_abort_transaction+0xdd/0xf0 [btrfs]
May 30 15:14:15  [<ffffffffa00c1f6c>] btrfs_run_delayed_refs+0x49c/0x570 [btrfs]
May 30 15:14:15  [<ffffffffa00d2922>] btrfs_commit_transaction+0x82/0xb70 [btrfs]
May 30 15:14:15  [<ffffffff810acbd0>] ? wake_up_bit+0x40/0x40
May 30 15:14:15  [<ffffffffa00ca9f5>] transaction_kthread+0x1b5/0x230 [btrfs]
May 30 15:14:15  [<ffffffffa00ca840>] ? check_leaf.isra.108+0x340/0x340 [btrfs]
May 30 15:14:15  [<ffffffff810ac616>] kthread+0xd6/0xe0
May 30 15:14:15  [<ffffffff810e6d0d>] ? trace_hardirqs_on+0xd/0x10
May 30 15:14:15  [<ffffffff810ac540>] ? kthread_create_on_node+0x130/0x130
May 30 15:14:15  [<ffffffff81994dac>] ret_from_fork+0x7c/0xb0
May 30 15:14:15  [<ffffffff810ac540>] ? kthread_create_on_node+0x130/0x130
May 30 15:14:15 ---[ end trace 66a995824fe81c3c ]---
May 30 15:14:15 BTRFS error (device sdz) in btrfs_run_delayed_refs:2630: errno=-5 IO failure
May 30 15:14:15 BTRFS info (device sdz): forced readonly
May 30 15:14:15 BTRFS error (device sdz) in btrfs_run_delayed_refs:2630: errno=-5 IO failure
[...]
May 31 09:33:43 btrfs: open /dev/sdu failed
May 31 09:33:43 btrfs: open /dev/sdi failed
May 31 09:33:43 btrfs: allowing degraded mounts
May 31 09:33:43 btrfs: disk space caching is enabled
May 31 09:33:43 btrfs: bdev /dev/sdu errs: wr 513, rd 0, flush 171, corrupt 0, gen 0
May 31 09:33:43 btrfs: bdev /dev/sdi errs: wr 204, rd 0, flush 68, corrupt 0, gen 0
May 31 09:33:43 btrfs bad tree block start 11069366888604640046 627949568
May 31 09:33:43 Failed to read block groups: -5
May 31 09:33:43 btrfs: open_ctree failed
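As an aside: the per-device error counters that appear in that log (the
"bdev /dev/sdX errs: wr ..., rd ..., flush ..., corrupt ..., gen ..."
lines) can also be queried from userspace, assuming a btrfs-progs recent
enough to have the stats subcommand:

# btrfs device stats /media/btrfs

Non-zero write or flush counters against a device are the kind of early
warning that preceded the corruption shown above, so they are worth
checking periodically on any multi-device filesystem.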