Like many, I thought I had done my research and understood what would happen when rebooting a brick/node, only to find out I was wrong. In my mind, since I had a 1x3 replica, I could do rolling reboots and the bricks would heal up afterwards. However, looking at the oVirt logs, shortly after the rebooted brick came back up all the VMs started pausing/going unresponsive. At the time I was puzzled and freaked out.

The next morning on my run I think I found the error in my logic and in my reading of the research. Once the 3rd brick came up it had to heal all the changes made to the VMs while it was down. Healing is file based, not block based, so it saw multi-GB image files that it had to recopy in full. It had to halt all writes to those files while that happened, or it would be a never-ending cycle of re-copying the large images. So the fact that most VMs went haywire isn't that odd. Based on the timing of the alerts, it does look like the 2 bricks that stayed up kept serving images until the 3rd brick came back. It did heal all images just fine.

So, knowing what I believe I now know, you can't really do what I had hoped and just reboot one brick while the VMs stay up the whole time. To achieve something like that I'd need a 2nd set of bricks I could live-storage-migrate to. Am I understanding correctly how that works?

I could also look at minimizing downtime by moving to sharding, so that a heal would only need to copy smaller files. However, I'd still potentially end up with paused VMs unless those heals were pretty quick. It's probably safest to plan downtime for the VMs, or work out a storage migration plan, if I had a real need for a high number of 9's of uptime.
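In case it helps anyone else, something like the following should show which files still have pending heals while a brick is catching up (the volume name "datastore1" is just an example; substitute your own):

    # list files with pending heals on each brick of the replica
    gluster volume heal datastore1 info

Once it reports no entries on any brick, the heal should be done.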
On 11/03/2016 2:24 AM, David Gossage wrote:
> Healing is file based, not block based, so it saw multi-GB image files
> that it had to recopy in full. It had to halt all writes to those files
> while that happened, or it would be a never-ending cycle of re-copying
> the large images. So the fact that most VMs went haywire isn't that odd.
> Based on the timing of the alerts, it does look like the 2 bricks that
> stayed up kept serving images until the 3rd brick came back. It did heal
> all images just fine.

What version are you running? 3.7.x has sharding (it breaks large files into chunks), which allows much finer-grained healing and speeds heals up a *lot*. However it can't be applied retroactively; you have to enable sharding and then copy the VM over :(

http://blog.gluster.org/2015/12/introducing-shard-translator/

In regards to rolling reboots, it can be done with replicated storage and gluster will transparently hand over client reads/writes, but for each VM image only one copy at a time can be healing, otherwise access will be blocked, as you saw.

So the recommended procedure is (rough commands at the bottom of this mail):

- Enable sharding
- Copy the VMs over
- When rebooting, wait for heals to complete before rebooting the next node

nb: I thoroughly recommend 3-way replication as you have done; it saves a lot of headaches with quorum and split brain.

--
Lindsay Mathieson
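P.S. A rough sketch of the commands for the above, from memory, so double-check against the docs; "datastore1" is just a placeholder for your actual volume name:

    # enable sharding on the volume (only affects files written after this,
    # hence the need to copy the VM images back in)
    gluster volume set datastore1 features.shard on

    # optionally set the shard size (64MB here as an example)
    gluster volume set datastore1 features.shard-block-size 64MB

    # after rebooting a node, wait until this shows no pending entries on
    # any brick before moving on to the next node
    gluster volume heal datastore1 info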