Nux!
2013-Jan-26 16:44 UTC
[Gluster-users] Write failure on distributed volume with free space available
Hello, Thanks to "partner" on IRC who told me about this (quite big) problem. Apparently in a distributed setup once a brick fills up you start getting write failures. Is there a way to work around this? I would have thought gluster would check for free space before writing to a brick. It's very easy to test, I created a distributed volume from 2 uneven bricks and started to write to it; as one of them got full I started getting write failures. This is how I tested: ====================== [root at localhost gluster1]# for i in `seq 1 20`; do df -h /mnt/gluster1/; dd if=/dev/zero of=16_$i bs=16M count=1; done Filesystem Size Used Avail Use% Mounted on 192.168.192.5:/test 291M 16M 276M 6% /mnt/gluster1 1+0 records in 1+0 records out 16777216 bytes (17 MB) copied, 0.115813 s, 145 MB/s Filesystem Size Used Avail Use% Mounted on 192.168.192.5:/test 291M 32M 260M 11% /mnt/gluster1 1+0 records in 1+0 records out 16777216 bytes (17 MB) copied, 0.140746 s, 119 MB/s Filesystem Size Used Avail Use% Mounted on 192.168.192.5:/test 291M 48M 243M 17% /mnt/gluster1 1+0 records in 1+0 records out 16777216 bytes (17 MB) copied, 0.0905644 s, 185 MB/s Filesystem Size Used Avail Use% Mounted on 192.168.192.5:/test 291M 64M 227M 22% /mnt/gluster1 1+0 records in 1+0 records out 16777216 bytes (17 MB) copied, 0.088424 s, 190 MB/s Filesystem Size Used Avail Use% Mounted on 192.168.192.5:/test 291M 80M 211M 28% /mnt/gluster1 1+0 records in 1+0 records out 16777216 bytes (17 MB) copied, 0.0876373 s, 191 MB/s Filesystem Size Used Avail Use% Mounted on 192.168.192.5:/test 291M 96M 195M 33% /mnt/gluster1 1+0 records in 1+0 records out 16777216 bytes (17 MB) copied, 0.0890243 s, 188 MB/s Filesystem Size Used Avail Use% Mounted on 192.168.192.5:/test 291M 112M 179M 39% /mnt/gluster1 1+0 records in 1+0 records out 16777216 bytes (17 MB) copied, 0.0853196 s, 197 MB/s Filesystem Size Used Avail Use% Mounted on 192.168.192.5:/test 291M 128M 163M 45% /mnt/gluster1 1+0 records in 1+0 records out 16777216 bytes (17 MB) copied, 0.0923682 s, 182 MB/s Filesystem Size Used Avail Use% Mounted on 192.168.192.5:/test 291M 145M 147M 50% /mnt/gluster1 1+0 records in 1+0 records out 16777216 bytes (17 MB) copied, 0.0861475 s, 195 MB/s Filesystem Size Used Avail Use% Mounted on 192.168.192.5:/test 291M 161M 131M 56% /mnt/gluster1 dd: writing `16_10': No space left on device dd: closing output file `16_10': No space left on device Filesystem Size Used Avail Use% Mounted on 192.168.192.5:/test 291M 170M 121M 59% /mnt/gluster1 dd: writing `16_11': No space left on device dd: closing output file `16_11': No space left on device Filesystem Size Used Avail Use% Mounted on 192.168.192.5:/test 291M 170M 121M 59% /mnt/gluster1 dd: opening `16_12': No space left on device Filesystem Size Used Avail Use% Mounted on 192.168.192.5:/test 291M 170M 121M 59% /mnt/gluster1 1+0 records in 1+0 records out 16777216 bytes (17 MB) copied, 0.0842241 s, 199 MB/s Filesystem Size Used Avail Use% Mounted on 192.168.192.5:/test 291M 186M 105M 64% /mnt/gluster1 1+0 records in 1+0 records out 16777216 bytes (17 MB) copied, 0.102602 s, 164 MB/s Filesystem Size Used Avail Use% Mounted on 192.168.192.5:/test 291M 202M 89M 70% /mnt/gluster1 dd: opening `16_15': No space left on device Filesystem Size Used Avail Use% Mounted on 192.168.192.5:/test 291M 202M 89M 70% /mnt/gluster1 1+0 records in 1+0 records out 16777216 bytes (17 MB) copied, 0.0866302 s, 194 MB/s Filesystem Size Used Avail Use% Mounted on 192.168.192.5:/test 291M 219M 73M 76% /mnt/gluster1 1+0 records in 1+0 
records out 16777216 bytes (17 MB) copied, 0.0898677 s, 187 MB/s Filesystem Size Used Avail Use% Mounted on 192.168.192.5:/test 291M 235M 57M 81% /mnt/gluster1 dd: opening `16_18': No space left on device Filesystem Size Used Avail Use% Mounted on 192.168.192.5:/test 291M 235M 57M 81% /mnt/gluster1 1+0 records in 1+0 records out 16777216 bytes (17 MB) copied, 0.126375 s, 133 MB/s Filesystem Size Used Avail Use% Mounted on 192.168.192.5:/test 291M 251M 41M 87% /mnt/gluster1 dd: opening `16_20': No space left on device ====================== -- Sent from the Delta quadrant using Borg technology! Nux! www.nux.ro
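For anyone wanting to reproduce this, a sketch of how the two-brick
distributed volume described above might be created. The brick paths
and mount options are illustrative; the original post does not give
them, and a plain distributed (non-replicated) volume is the default
type when no replica count is specified:

======================
# Two bricks of deliberately uneven size on one host, combined into a
# plain distributed volume named "test".
gluster volume create test 192.168.192.5:/bricks/small 192.168.192.5:/bricks/large
gluster volume start test

# Mount it the way the test above uses it.
mount -t glusterfs 192.168.192.5:/test /mnt/gluster1
======================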
Nux!
2013-Jan-28 18:43 UTC
[Gluster-users] Write failure on distributed volume with free space available
On 26.01.2013 16:44, Nux! wrote:
> Hello,
>
> Thanks to "partner" on IRC who told me about this (quite big) problem.
> Apparently in a distributed setup once a brick fills up you start
> getting write failures. Is there a way to work around this?

Anyone?..

--
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro
Jeff Darcy
2013-Jan-28 20:14 UTC
[Gluster-users] Write failure on distributed volume with free space available
On 01/28/2013 01:43 PM, Nux! wrote:
>> Thanks to "partner" on IRC who told me about this (quite big)
>> problem. Apparently in a distributed setup once a brick fills up
>> you start getting write failures. Is there a way to work around
>> this?

The way this is supposed to work is that if a brick is full then *new*
files will be created on other bricks with more space. However,
*existing* files are not relocated, so a write that requires allocating
new space for a file on a full brick will fail. There are three ways
that you can cause files to be relocated, freeing up space (concrete
commands are sketched after this message).

(1) A full rebalance via the CLI.

(2) Targeted rebalance of a directory using a special setxattr (see my
exchange with Dan Bretherton).

(3) Manual copying. When you create the new file it will be created on
the brick with the most space, then when you complete the copy the
space used by the original will be freed.
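For reference, the concrete commands for options (1) and (3), assuming
the volume from the original post (named "test", mounted at
/mnt/gluster1) and an illustrative file name. Option (2) relies on a
DHT-internal xattr covered in the referenced exchange, so it is not
reproduced here:

======================
# (1) Full rebalance via the CLI: migrates existing files so data is
#     spread according to the volume's layout.
gluster volume rebalance test start
gluster volume rebalance test status    # repeat until the run completes

# (3) Manual copying: the copy lands on the brick with the most free
#     space; deleting the original then frees space on the full brick.
cp /mnt/gluster1/bigfile /mnt/gluster1/bigfile.copy
rm /mnt/gluster1/bigfile
mv /mnt/gluster1/bigfile.copy /mnt/gluster1/bigfile   # rename does not move data
======================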
Alexandros Soumplis
2013-Jan-28 21:58 UTC
[Gluster-users] Write failure on distributed volume with free space available
You mention "The way this is supposed to work is that if a brick is full then *new* files will be created on other bricks with more space.". I am not quite sure that this is the case when the new file is large enough to fill up the space of the chosen brick, while it would fit on another. On 28/01/2013 10:14 ??, Jeff Darcy wrote:> On 01/28/2013 01:43 PM, Nux! wrote: >>> Thanks to "partner" on IRC who told me about this (quite big) >>> problem. Apparently in a distributed setup once a brick fills up >>> you start getting write failures. Is there a way to work around >>> this? > > The way this is supposed to work is that if a brick is full then *new* > files will be created on other bricks with more space. However, > *existing* files are not relocated, so a write that requires allocating > new space for a file on a full brick will fail. There are three ways > that you can cause files to be relocated, freeing up space. > > (1) A full rebalance via the CLI. > > (2) Targeted rebalance of a directory using a special setxattr (see my > exchange with Dan Bretherton). > > (3) Manual copying. When you create the new file it will be created on > the brick with the most space, then when you complete the copy the space > used by the original will be freed. > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > http://supercolony.gluster.org/mailman/listinfo/gluster-users
Ling Ho
2013-Jan-28 22:19 UTC
[Gluster-users] Write failure on distributed volume with free space available
How "full" does it has to be before new files start getting written into the other bricks? In my recent experience, I added a new brick to an existing volume while one of the existing 4 bricks was close to full. And yet I constantly get out of space error when trying to write new files. Full rebalancing was also such as slow process that it cannot keep up with new files I need to write to the volume. I am using 3.3.0-1. ... ling On 01/28/2013 01:58 PM, Alexandros Soumplis wrote:> You mention "The way this is supposed to work is that if a brick is full > then *new* files will be created on other bricks with more space.". I am > not quite sure that this is the case when the new file is large enough > to fill up the space of the chosen brick, while it would fit on another. > > > On 28/01/2013 10:14 ??, Jeff Darcy wrote: >> On 01/28/2013 01:43 PM, Nux! wrote: >>>> Thanks to "partner" on IRC who told me about this (quite big) >>>> problem. Apparently in a distributed setup once a brick fills up >>>> you start getting write failures. Is there a way to work around >>>> this? >> The way this is supposed to work is that if a brick is full then *new* >> files will be created on other bricks with more space. However, >> *existing* files are not relocated, so a write that requires allocating >> new space for a file on a full brick will fail. There are three ways >> that you can cause files to be relocated, freeing up space. >> >> (1) A full rebalance via the CLI. >> >> (2) Targeted rebalance of a directory using a special setxattr (see my >> exchange with Dan Bretherton). >> >> (3) Manual copying. When you create the new file it will be created on >> the brick with the most space, then when you complete the copy the space >> used by the original will be freed. >> >> >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> http://supercolony.gluster.org/mailman/listinfo/gluster-users > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > http://supercolony.gluster.org/mailman/listinfo/gluster-users