Hugo Mills posted on Thu, 14 Nov 2013 21:00:56 +0000 as excerpted:
>> Is there a formula to calculate how much space btrfs _might_ need?
>
> Not really. I'd expect to need something in the range 250-1500 GiB of
> headroom, depending on the size of the filesystem (and on the size of
> the metadata).
As a somewhat more concrete answer...
While recently doing a bit of research on something else, I came across
comments that on a large enough filesystem, data chunks default to 1 GiB,
while metadata chunks default to 256 MiB.
And we know that data mode defaults to SINGLE, while metadata mode
defaults to DUP.
So on a default single-device btrfs of several GiB or more, assuming the
files being manipulated are under 1 GiB in size, keeping an unallocated
space reserve of 1.5 GiB should be reasonable. That's enough unallocated
space to allocate one more 1 GiB data chunk, plus one more 256 MiB
metadata chunk, doubled to half a GiB due to DUP mode. Obviously in the
single-mode-metadata case, the metadata requirement would be only a
single copy, so 256 MiB for it, 1.25 GiB total unallocated, minimum.
btrfs filesystem show is the command used to see what your allocated
space for a filesystem looks like, per device. However, it doesn't note
UNALLOCATED space, only size and used (aka allocated), so an admin must
do the math to figure the unallocated space.
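As a trivial illustration of that math (the device numbers here are hypothetical, and the helper name is my own):

```python
def unallocated_gib(size_gib, used_gib):
    """btrfs filesystem show reports per-device size and used
    (i.e. allocated); unallocated is simply the difference."""
    return size_gib - used_gib

# Hypothetical device: 20 GiB total, 18.75 GiB already allocated.
print(unallocated_gib(20.0, 18.75))  # 1.25 -- already below a 1.5 GiB reserve
```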
If the files being manipulated are over a gig in size, round up to the
nearest whole GiB for the data and add another half GiB to cover the
quarter-gig DUP metadata case.
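To put the single-device reserve rule in concrete form, here's a quick Python sketch (the function name is my own invention; the 1 GiB data / 256 MiB metadata chunk sizes are the defaults discussed above):

```python
import math

def single_device_reserve_gib(largest_file_gib=1.0, dup_metadata=True):
    """Suggested minimum unallocated space on a default single-device
    btrfs: room for one more data chunk (rounded up to whole GiB when
    files exceed a GiB) plus one more 256 MiB metadata chunk, doubled
    when metadata is DUP."""
    data = max(1, math.ceil(largest_file_gib))
    metadata = 0.25 * (2 if dup_metadata else 1)
    return data + metadata

print(single_device_reserve_gib())                      # 1.5
print(single_device_reserve_gib(dup_metadata=False))    # 1.25
print(single_device_reserve_gib(largest_file_gib=2.5))  # 3.5
```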
If the filesystem is under a GiB in size, btrfs defaults to mixed
data+metadata, with chunks of 256 MiB if there's space, but apparently
rather more flexibility, in order to better utilize all available
space. At such "small" sizes[1], full allocation with nothing left to
allocate is common, but one does hope people using filesystems that size
have a good idea of what will be going on them, and that they won't
/need/ to allocate further chunks after the initial filesystem
population. And quite in contrast to the multi-TB filesystems,
rebalancing such a filesystem in order to recover lost space should be
relatively fast even on spinning rust.
For filesystems of 1 GiB up to say 10 GiB, it's a more open question,
altho at that size there's still a rather good chance that the sysadmin
has a reasonably good idea what's going on the filesystem and has
planned accordingly, with some "reasonable" level of over-allocation for
future-proofing and plan fuzziness. Rebalances should still complete in
reasonable time as well, so it shouldn't be a /huge/ problem unless the
admin simply isn't tracking the situation.
The multi-device situation adds another dimension. Apparently, except
for single mode, btrfs at this point only ever allocates in pairs (plus
raid5/6 parity chunks if applicable, and pairs of pairs in raid10 mode),
regardless of the number of devices available, which does simplify
calculations to some degree.
Btrfs's multi-device default (for >1 GiB per-device sizes, anyway) is
single data, raid1 metadata. So to reserve space for one chunk of either
type, we'd need at least 1 GiB unallocated on ONE device to allow at
least one single-mode data chunk allocation, PLUS at least 256 MiB
unallocated on each of TWO devices to cover at least one raid1-mode
metadata chunk allocation. Thus, with two devices, we'd require at least
1.25 GiB free/unallocated on one device (1 GiB data chunk plus one copy
of the 256 MiB metadata chunk) and 256 MiB on the other (the second copy
of the metadata). For a three-plus-device filesystem, that would work,
OR 256 MiB on each of two devices (for the raid1 metadata) and 1 GiB on
a third (for the data).
For raid1 data, the 1 GiB data chunks must also have two copies, each on
its own device, and the above multi-device default scenario modifies
accordingly. 2-device case: 1.25 GiB minimum unallocated on each device
(one copy each of a data and a metadata chunk). 3-device case: that, OR
1.25/1.0/0.25 GiB. 4-plus-device case: either of those, or
1.0/1.0/0.25/0.25 GiB.
For single metadata plus the default single data, we're back to the
1.25 GiB total case, in two separate chunks of 1 GiB and 256 MiB, either
on separate devices or the same device.
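The multi-device cases worked through above can be captured in one small checker. This is an illustrative model of my own, not btrfs code; it assumes 1 GiB data and 256 MiB metadata chunks, and places each copy greedily on the device with the most unallocated space (which I understand matches how btrfs picks devices):

```python
def can_allocate(free_gib, data_copies=1, meta_copies=2):
    """Could one more data chunk (1 GiB per copy) plus one more
    metadata chunk (0.25 GiB per copy) still be allocated?  Each copy
    of a chunk must land on a distinct device; copies go greedily onto
    the devices with the most unallocated space."""
    free = sorted(free_gib, reverse=True)
    if len(free) < max(data_copies, meta_copies):
        return False
    for i in range(data_copies):          # place the data-chunk copies
        free[i] -= 1.0
    if min(free[:data_copies]) < 0:
        return False
    free.sort(reverse=True)
    for i in range(meta_copies):          # then the metadata-chunk copies
        free[i] -= 0.25
    return min(free[:meta_copies]) >= 0

print(can_allocate([1.25, 0.25]))                           # True: default, 2 devices
print(can_allocate([1.25, 1.25], data_copies=2))            # True: raid1 data, 2 devices
print(can_allocate([1.0, 1.0, 0.25, 0.25], data_copies=2))  # True: raid1 data, 4 devices
print(can_allocate([1.25], meta_copies=1))                  # True: all-single, one device
print(can_allocate([1.0, 1.0], data_copies=2))              # False: no room for raid1 metadata
```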
I haven't personally played with the raid0 case as it doesn't fit my
use-case, but the wiki documentation suggests that it still allocates
chunks only in pairs, striping the data/metadata across the pair. So
we're looking at a minimum of 1 GiB on each of two separate devices for
a raid0 data chunk allocation (which would then allow two GiB of data),
and a minimum of 256 MiB on each of two separate devices for a raid0
metadata chunk allocation (which would hold half a GiB of metadata).
Permutations are, as they say, "left as an exercise for the reader." =:^)
Apparently raid10 mode is pairs of pairs, so it allocates in sets of
four. Metadata: 256 MiB on each of four separate devices, 512 MiB
metadata capacity. Data: 1 GiB on each of four separate devices, holding
2 GiB worth of data. Again, permutations "left as an exercise for the
reader."
Finally, there's the mixed data/metadata chunk mode that's the default
on <1 GiB filesystems. Default chunk sizes there are 256 MiB, with the
same pair-allocation rules for multi-device filesystems as above. But as
discussed under the single-device case, these filesystems are often
capacity-planned and fully allocated from the beginning, with no further
chunk allocation necessary once the filesystem is populated.
That leaves raid5/6. With the caveat that these raid modes aren't yet
ready for normal use (even more so than the still-experimental btrfs as
a whole, where good backups are STRONGLY RECOMMENDED; with raid5/6 mode,
REALLY expect your data to be eaten for breakfast, so do NOT use it in
present form for anything but temporary testing!)...
raid5 should work like raid0 above, but requiring one more device's
chunk reserved for the raid5 parity, thus reserving in threes with no
additional capacity over raid0. raid6 is the same but with yet another
reserved, thus reserving in fours. Again, permutations "left as an
exercise for the reader."
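Summing up the per-profile reservation patterns described above (again an illustrative sketch of my own, assuming the default 1 GiB data chunks; the "usable" figure is data capacity gained per allocation round):

```python
# (devices touched per allocation, GiB reserved per device for a data
# chunk, usable data GiB per allocation) -- per the discussion above.
DATA_PROFILES = {
    "single": (1, 1.0, 1.0),
    "raid0":  (2, 1.0, 2.0),
    "raid1":  (2, 1.0, 1.0),
    "raid10": (4, 1.0, 2.0),
    "raid5":  (3, 1.0, 2.0),  # one device's worth goes to parity
    "raid6":  (4, 1.0, 2.0),  # two devices' worth go to parity
}

for name, (devs, per_dev, usable) in DATA_PROFILES.items():
    print(f"{name:7} reserve {per_dev} GiB on {devs} device(s) "
          f"-> {usable} GiB of data")
```

Metadata follows the same patterns with 256 MiB chunks in place of 1 GiB.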
Presumably raid50/60 will be possible with little change in the code
once raid5/6 stabilize, since it's a logical combination with raid0,
with the required parallel chunk reservation 6 and 8 devices wide
respectively. But AFAIK that's not even supported at all yet, and even
if it is, it's hardly worth trying since the raid5/6 component remains
so highly unstable at this point.
And of course there's N-way mirroring on the roadmap as well, but
implementation remains some way out, beyond raid5/6 normalization. When
it comes, its parallel chunk reservation characteristics can be
predicted from the raid1 discussion above, multiplying by the N in the
N-way mirroring instead of by the hard-coded two of the current raid1
case. (This is actually a case I'm strongly interested in, 3-way
mirroring, perhaps even in the raid10 variant thus requiring six devices
minimum, but given btrfs history to date and current progress on
raid5/6, I don't expect to see it in anything like normalized form until
well into next year, perhaps a year from now, at the earliest.)
---
[1] Re <1 GiB being "small": I still can't help but think of my first
computer when I mention that, a 486-class machine with a 130 MB (128 MiB
or some such, half the size of my /boot and 1/128th the size of my main
memory, today!) hard drive, and that was the early 90s, so while I've a
bit of computer experience I'm still a relative newcomer compared to
many in the *ix community. It was several disk-upgrade generations later
when I got my first gig-sized drive, and it sure didn't seem "small" at
the time! My how times do change!
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html