Todd Pfaff
2022-Mar-05 16:01 UTC
[Gluster-users] proper way to temporarily remove brick server from replica cluster to avoid kvm guest disruption
I have a replica volume created as:

    gluster volume create vol1 replica 4 \
        host{1,2,3,4}:/mnt/gluster/brick1/data \
        force

All hosts host{1,2,3,4} mount this volume as:

    localhost:/vol1 /mnt/gluster/vol1 glusterfs defaults

Some other hosts are trusted peers but do not contribute bricks, and they also mount vol1 in the same way:

    localhost:/vol1 /mnt/gluster/vol1 glusterfs defaults

All hosts run CentOS 7.9, and all are running glusterfs 9.4 or 9.5 from centos-release-gluster9-1.0-1.el7.noarch.

All hosts run kvm guests that use qcow2 files for root filesystems that are stored on gluster volume vol1.

This is all working well, as long as none of host{1,2,3,4} goes offline.

I want to take one of host{1,2,3,4} offline temporarily for maintenance. I'll refer to this as hostX.

I understand that hostX will need to be healed when it comes back online.

I would, of course, migrate guests from hostX to another host, in which case hostX would then only be participating as a gluster replica brick provider and serving gluster client requests.

What I've experienced is that if I take one of host{1,2,3,4} offline, this can disrupt some of the VM guests on various other hosts such that their root filesystems go read-only.

What I'm looking for here are suggestions as to how to properly take one of host{1,2,3,4} offline to avoid such disruption, or how to tune the libvirt kvm hosts and guests to be sufficiently resilient in the face of taking one gluster replica node offline.

Thanks,
Todd
Strahil Nikolov
2022-Mar-05 21:22 UTC
[Gluster-users] proper way to temporarily remove brick server from replica cluster to avoid kvm guest disruption
Hey Todd,

can you provide the output of 'gluster volume info <VOLUME>'?

Best Regards,
Strahil Nikolov
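For reference, the information requested in the reply can be collected with a small wrapper like the one below. This is only a sketch: the function name collect_volume_info is hypothetical, and the default volume name vol1 is taken from the original post. The only gluster command it runs is 'gluster volume info', as asked for in the reply.

```shell
#!/bin/sh
# Sketch: print the volume details requested in the reply.
# collect_volume_info is a hypothetical helper name; the default
# volume name (vol1) comes from the original post.
collect_volume_info() {
    volume="${1:-vol1}"

    # Bail out early on hosts where the gluster CLI is not available.
    if ! command -v gluster >/dev/null 2>&1; then
        echo "gluster CLI not found on this host" >&2
        return 1
    fi

    # Shows the volume type, brick list and reconfigured options.
    gluster volume info "$volume"
}
```

Run on any trusted peer, 'collect_volume_info vol1' prints the volume description, which can then be pasted into a reply to the list.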