Like many, I thought I had done my research and understood what would happen when rebooting a brick/node, only to find out I was wrong. In my mind, since I had a 1x3 replica, I could do rolling reboots and the bricks would heal up afterwards. However, looking at the oVirt logs, shortly after the rebooted brick came back up all the VMs started pausing/going unresponsive. At the time I was puzzled and freaked out.

The next morning on my run I think I found the error in my logic and in my reading of the research. Once the 3rd brick came up it had to heal all the changes made to the VMs while it was down. Healing is file based, not block based, so it saw multi-GB image files that it had to recopy in full. It had to halt all writes to those files while that happened, or it would be a never-ending cycle of re-copying the large images. So the fact that most VMs went haywire isn't that odd. Based on the timing of the alerts, it does look like the 2 bricks that stayed up kept serving images until the 3rd brick came back. It did heal all images just fine.

So, knowing what I believe I now know, you can't really do what I had hoped and just reboot one brick while the VMs stay up the whole time. To achieve something like that I'd need a 2nd set of bricks I could live-storage-migrate to. Am I understanding correctly how that works?

I could also look at minimizing downtime by moving to sharding, so that a heal would only need to copy smaller files. However, I'd still potentially end up with paused VMs unless those heals were pretty quick. It's probably safest to plan downtime for the VMs, or work out a storage migration plan, if I had a real need for a high number of 9's of uptime.
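In case it helps anyone else, something like the following should show which files still have pending heals while a brick is catching up (the volume name "datastore1" is just an example; substitute your own):

    # list files with pending heals on each brick of the replica
    gluster volume heal datastore1 info

Once it reports no entries on any brick, the heal should be done.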
On 11/03/2016 2:24 AM, David Gossage wrote:
> Healing is file based, not block based, so it saw multi-GB image files
> that it had to recopy in full. It had to halt all writes to those files
> while that happened, or it would be a never-ending cycle of re-copying
> the large images. So the fact that most VMs went haywire isn't that odd.
> Based on the timing of the alerts, it does look like the 2 bricks that
> stayed up kept serving images until the 3rd brick came back. It did heal
> all images just fine.

What version are you running? 3.7.x has sharding (it breaks large files into chunks), which allows much finer-grained healing and speeds heals up a *lot*. However it can't be applied retroactively; you have to enable sharding and then copy the VM over :(

http://blog.gluster.org/2015/12/introducing-shard-translator/

In regards to rolling reboots, it can be done with replicated storage and gluster will transparently hand over client reads/writes, but for each VM image only one copy at a time can be healing, otherwise access will be blocked, as you saw.

So the recommended procedure is (rough commands at the bottom of this mail):

- Enable sharding
- Copy the VMs over
- When rebooting, wait for heals to complete before rebooting the next node

nb: I thoroughly recommend 3-way replication as you have done; it saves a lot of headaches with quorum and split brain.

--
Lindsay Mathieson
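P.S. A rough sketch of the commands for the above, from memory, so double-check against the docs; "datastore1" is just a placeholder for your actual volume name:

    # enable sharding on the volume (only affects files written after this,
    # hence the need to copy the VM images back in)
    gluster volume set datastore1 features.shard on

    # optionally set the shard size (64MB here as an example)
    gluster volume set datastore1 features.shard-block-size 64MB

    # after rebooting a node, wait until this shows no pending entries on
    # any brick before moving on to the next node
    gluster volume heal datastore1 info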