André Bauer
2015-Oct-22 18:45 UTC
[Gluster-users] VM fs becomes read only when one gluster node goes down
Hi,

I have a 4 node GlusterFS 3.5.6 cluster.

My VM images are in a replicated distributed volume which is accessed from kvm/qemu via libgfapi.

The mount is against storage.domain.local, which has the IPs of all 4 Gluster nodes set in DNS.

When one of the Gluster nodes goes down (accidental reboot), a lot of the VMs get a read-only filesystem, even after the node comes back up.

How can I prevent this? I expect the VM to simply keep using the replicated copy of the file on the other node, without the filesystem going read-only.

Any hints?

Thanks in advance.

--
Regards
André Bauer
Krutika Dhananjay
2015-Oct-23 02:24 UTC
[Gluster-users] VM fs becomes read only when one gluster node goes down
Could you share the output of 'gluster volume info', and also tell us which node went down on reboot?

-Krutika

----- Original Message -----
> From: "André Bauer" <abauer at magix.net>
> To: "gluster-users" <gluster-users at gluster.org>
> Cc: gluster-devel at gluster.org
> Sent: Friday, October 23, 2015 12:15:04 AM
> Subject: [Gluster-users] VM fs becomes read only when one gluster node goes down
>
> [...]
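For reference, the requested 'gluster volume info' output for a 4 node, 2 x 2 distributed-replicate volume looks roughly like the sketch below; the volume name, hostnames, brick paths and volume ID are made-up placeholders, not values from this thread:

Volume Name: vmimages
Type: Distributed-Replicate
Volume ID: 1a2b3c4d-0000-0000-0000-000000000000
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gluster1.domain.local:/export/brick1
Brick2: gluster2.domain.local:/export/brick1
Brick3: gluster3.domain.local:/export/brick1
Brick4: gluster4.domain.local:/export/brick1

The "Type" and "Number of Bricks" lines show whether each VM image really has a replica on a second node, which is what matters for the failure described above.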
Niels de Vos
2015-Oct-26 20:56 UTC
[Gluster-users] [Gluster-devel] VM fs becomes read only when one gluster node goes down
On Thu, Oct 22, 2015 at 08:45:04PM +0200, André Bauer wrote:
> When one of the Gluster nodes goes down (accidental reboot), a lot of
> the VMs get a read-only filesystem, even after the node comes back up.
>
> How can I prevent this? I expect the VM to simply keep using the
> replicated copy of the file on the other node, without the filesystem
> going read-only.

There are at least two timeouts involved in this problem:

1. The filesystem in a VM can go read-only when the virtual disk that holds the filesystem does not respond for a while.

2. When a storage server that holds a replica of the virtual disk becomes unreachable, the Gluster client (qemu + libgfapi) waits for at most network.ping-timeout seconds before it resumes I/O.

Once a filesystem in a VM has gone read-only, you might be able to fsck it and re-mount it read-write again; that is not something the VM will do by itself.

The timeout for (1) is set in sysfs:

$ cat /sys/block/sda/device/timeout
30

30 seconds is the default for SCSI disks (sd* devices), and for testing you can change it with an echo:

# echo 300 > /sys/block/sda/device/timeout

This is not a persistent change; you can create a udev rule to apply it at boot.

Some filesystems offer a mount option that changes the behaviour after a disk error is detected. "man mount" shows the "errors" option for ext*. Changing this to "continue" is not recommended; "remount-ro" or "panic" is the safest for your data.

The timeout mentioned in (2) applies to the Gluster volume and is checked by the client. When a client writes to a replicated volume, the write needs to be acknowledged by both/all replicas. The client (libgfapi) delays the reply to the application (qemu) until the replies from both/all replicas have been received. This delay is configured as the volume option network.ping-timeout (42 seconds by default).

Now, if the VM returns block errors after 30 seconds and the client waits up to 42 seconds for recovery, there is an issue...

So, your solution could be to increase the timeout for error detection of the disks inside the VMs, and/or to decrease network.ping-timeout.

It would be interesting to know whether adapting these values prevents the read-only occurrences in your environment. If you do any testing with this, please keep me informed about the results.

Niels
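The adjustments Niels describes can be sketched as follows; the rules-file name, mount point, filesystem UUID and volume name are placeholders, not values from this thread, so adapt them to your environment.

A udev rule inside the VM to make the larger disk timeout persistent across reboots (one possible form; verify it with "udevadm test" on your distribution):

# cat /etc/udev/rules.d/99-disk-timeout.rules
ACTION=="add|change", SUBSYSTEM=="block", KERNEL=="sd[a-z]", ATTR{device/timeout}="300"

An /etc/fstab entry inside the VM that sets the ext4 "errors" option to "panic", so the kernel panics instead of remounting the filesystem read-only after a disk error:

UUID=<fs-uuid>  /data  ext4  defaults,errors=panic  0  2

Lowering the Gluster-side timeout, run once against the volume from any node in the trusted pool:

# gluster volume set <VOLNAME> network.ping-timeout 20

After the next reboot of the VM, "cat /sys/block/sda/device/timeout" should confirm whether the udev rule was applied.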