Andreas Mather
2015-Jan-08 12:01 UTC
[Gluster-users] Ensure volume is healed before node reboot
Hello!

I'm setting up a qemu/KVM cluster with glusterfs as shared storage. While testing the cluster, I constantly hit a split-brain situation on VM image files which I cannot explain.

Setup:
2 bare-metal servers running glusterfs (replica 2), both having the volume mounted, and one virtual machine whose image is located on the volume.

Steps:
1.) host1 runs the VM, host2 is idle (but fully connected, i.e. replicating)
2.) issue writes in the VM (about 10 MB, so nothing big)
3.) live migrate the VM from host1 to host2
4.) issue writes in the VM
5.) sleep 120
6.) umount the volume, shut down glusterfs, reboot host1
7.) start glusterfs, wait 30 sec, mount the volume on host1
8.) sleep 120
9.) migrate the VM from host2 to host1

Sometimes this works, but usually after I redo the whole operation a second time, or with the roles swapped (i.e. reboot host2 after the VM was migrated away), I end up with a split-brained image file.

According to
    gluster volume heal vol1 statistics
the split-brain appears after step 6.

Now, I think simply waiting for replication isn't enough, i.e. when I reboot one node, even though there weren't many writes, those writes weren't fully replicated yet. At least that's the simplest explanation I can come up with.

So what I want to ensure is that, after I migrate a VM from host1 to host2, all previous writes from host1 are fully replicated to host2 (which I would take as an indicator that it is safe to reboot host1). How can I accomplish this?

My first guess was gluster volume heal vol1 info, but I'm not sure I understand its output correctly (e.g. after resolving the split-brain by removing the image from one brick and watching it get replicated over the network, heal info still reports entries on both nodes, which makes no sense to me, since writes occur from only one node...).

Thanks,

Andreas
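A minimal sketch of the kind of pre-reboot check being asked for here, assuming the volume name vol1 from the post and that "gluster volume heal vol1 info" prints one "Number of entries: N" line per brick, as releases of that era do; the retry count and sleep interval are arbitrary illustrative values:

    #!/bin/bash
    # Sketch of a pre-reboot check: block until "gluster volume heal <vol> info"
    # reports zero pending entries on every brick. The volume name comes from
    # the post; the retry count, sleep interval, and reliance on the
    # "Number of entries: N" output lines are assumptions.
    VOL=vol1

    for attempt in $(seq 1 60); do
        out=$(gluster volume heal "$VOL" info) || { echo "heal info failed" >&2; exit 1; }
        # Sum the per-brick "Number of entries: N" lines.
        pending=$(printf '%s\n' "$out" \
            | awk '/^Number of entries:/ {sum += $4} END {print sum + 0}')
        if [ "$pending" -eq 0 ]; then
            echo "no pending heal entries on $VOL; should be safe to reboot"
            exit 0
        fi
        echo "attempt $attempt: $pending entries still pending, waiting..."
        sleep 5
    done

    echo "timed out waiting for $VOL to heal; do not reboot" >&2
    exit 1

Running "gluster volume heal vol1 info split-brain" the same way before a reboot would additionally confirm that no file is already split-brained, since an empty heal queue alone does not prove that.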
Pranith Kumar Karampuri
2015-Jan-21 19:13 UTC
[Gluster-users] Ensure volume is healed before node reboot
On 01/08/2015 05:31 PM, Andreas Mather wrote:
> I'm setting up a qemu/KVM cluster with glusterfs as shared storage.
> While testing the cluster, I constantly hit a split-brain situation on
> VM image files which I cannot explain.
> [setup and test steps snipped; quoted in full above]

Andreas,
I am sorry I took so long to respond to your query. I do not see why a split-brain would happen after step 6. You probably no longer have the logs since I am responding so late, but if you do, could you send me the logs of the setup?

Pranith
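For anyone gathering the logs being asked for: on a stock packaged install they live under /var/log/glusterfs/ (the glusterd log, one log per brick, and one per fuse mount), so a quick way to bundle them from each host might be:

    # Bundle all glusterfs logs from this host (default log directory for
    # packaged installs; adjust the path if the log location was overridden).
    tar czf gluster-logs-$(hostname -s).tar.gz /var/log/glusterfs/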