Dave Sherohman
2018-Feb-13 13:33 UTC
[Gluster-users] Failover problems with gluster 3.8.8-1 (latest Debian stable)
I'm using gluster for a virt-store with 3x2 distributed/replicated
servers for 16 qemu/kvm/libvirt virtual machines using image files
stored in gluster and accessed via libgfapi. Eight of these disk images
are standalone, while the other eight are qcow2 images which all share a
single backing file.

For the most part, this is all working very well. However, one of the
gluster servers (azathoth) causes three of the standalone VMs and all 8
of the shared-backing-image VMs to fail if it goes down. Any of the
other gluster servers can go down with no problems; only azathoth causes
issues.

In addition, the kvm hosts have the gluster volume fuse mounted, and one
of them (out of five) detects an error on the gluster volume and puts
the fuse mount into read-only mode if azathoth goes down. Despite this,
libgfapi connections to the VM images continue to work normally from
that host, and the other four kvm hosts are unaffected.

It initially seemed relevant that I have the libgfapi URIs specified as
gluster://azathoth/..., but I've tried changing them to make the initial
connection via other gluster hosts and it had no effect on the problem.
Losing azathoth still took them out.

In addition to changing the mount URI, I've also manually run a heal and
a rebalance on the volume, enabled the bitrot daemons (then turned them
back off a week later, since they reported no activity in that time),
and copied one of the standalone images to a new file in case it was a
problem with the file itself. As far as I can tell, none of these
attempts changed anything.

So I'm at a loss. Is this a known type of problem? If so, how do I fix
it? If not, what's the next step to troubleshoot it?

# gluster --version
glusterfs 3.8.8 built on Jan 11 2017 14:07:11
Repository revision: git://git.gluster.com/glusterfs.git

# gluster volume status
Status of volume: palantir
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick saruman:/var/local/brick0/data        49154     0          Y       10690
Brick gandalf:/var/local/brick0/data        49155     0          Y       18732
Brick azathoth:/var/local/brick0/data       49155     0          Y       9507
Brick yog-sothoth:/var/local/brick0/data    49153     0          Y       39559
Brick cthulhu:/var/local/brick0/data        49152     0          Y       2682
Brick mordiggian:/var/local/brick0/data     49152     0          Y       39479
Self-heal Daemon on localhost               N/A       N/A        Y       9614
Self-heal Daemon on saruman.lub.lu.se       N/A       N/A        Y       15016
Self-heal Daemon on cthulhu.lub.lu.se       N/A       N/A        Y       9756
Self-heal Daemon on gandalf.lub.lu.se       N/A       N/A        Y       5962
Self-heal Daemon on mordiggian.lub.lu.se    N/A       N/A        Y       8295
Self-heal Daemon on yog-sothoth.lub.lu.se   N/A       N/A        Y       7588

Task Status of Volume palantir
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : c38e11fe-fe1b-464d-b9f5-1398441cc229
Status               : completed

--
Dave Sherohman
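P.S. For reference, the libgfapi disk definitions look roughly like the
libvirt domain XML below (the image path and target device are only
illustrative, not the real values); switching the initial connection to a
server other than azathoth is just a matter of changing the <host> element:

  <disk type='network' device='disk'>
    <driver name='qemu' type='qcow2'/>
    <source protocol='gluster' name='palantir/images/example.qcow2'>
      <host name='cthulhu' port='24007'/>
    </source>
    <target dev='vda' bus='virtio'/>
  </disk>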
Dave Sherohman
2018-Feb-15 12:20 UTC
[Gluster-users] Failover problems with gluster 3.8.8-1 (latest Debian stable)
Well, it looks like I've stumped the list, so I did a bit of additional
digging myself:

azathoth replicates with yog-sothoth, so I compared their brick
directories. `ls -R /var/local/brick0/data | md5sum` gives the same
result on both servers, so the filenames are identical in both bricks.
However, `du -s /var/local/brick0/data` shows that azathoth has about 3G
more data (445G vs 442G) than yog.

This seems consistent with my assumption that the problem is on
yog-sothoth (everything is fine with only azathoth; there are problems
with only yog-sothoth), and I am reminded that a few weeks ago,
yog-sothoth was offline for 4-5 days, although it should have been
brought back up to date once it came back online.

So, assuming that the issue is stale/missing data on yog-sothoth, is
there a way to force gluster to do a full refresh of the data from
azathoth's brick to yog-sothoth's brick? I would have expected running
heal and/or rebalance to do that sort of thing, but I've run them both
(with and without fix-layout on the rebalance) and the problem persists.

If there isn't a way to force a refresh, how risky would it be to kill
gluster on yog-sothoth, wipe everything from /var/local/brick0, and then
re-add it to the cluster as if I were replacing a physically failed
disk? It seems like that should work in principle, but it feels
dangerous to wipe the partition and rebuild, regardless.

On Tue, Feb 13, 2018 at 07:33:44AM -0600, Dave Sherohman wrote:
> [...]

--
Dave Sherohman
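P.S. For the record, these are the heal/rebalance commands I've already
run, plus the "full" variant that I'm guessing is the closest thing to a
forced refresh (command syntax as I understand it from the docs, so
corrections welcome):

  # already run, with and without fix-layout
  gluster volume rebalance palantir fix-layout start
  gluster volume rebalance palantir start
  gluster volume heal palantir

  # what I'd try next: a full crawl instead of the heal index, then check
  # whether anything is still pending or in split-brain
  gluster volume heal palantir full
  gluster volume heal palantir info
  gluster volume heal palantir info split-brain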
Alex K
2018-Feb-15 19:34 UTC
[Gluster-users] Failover problems with gluster 3.8.8-1 (latest Debian stable)
Hi,

Have you checked for any file system errors on the brick mount point? I
was once facing weird I/O errors and xfs_repair fixed the issue.

What about the heal? Does it report any pending heals?

On Feb 15, 2018 14:20, "Dave Sherohman" <dave at sherohman.org> wrote:
> [...]
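In case it helps, this is roughly what I mean (the device path is only a
placeholder, and I'm assuming /var/local/brick0 is the brick mount point;
the brick has to be taken offline and unmounted while xfs_repair runs):

  # on the suspect server, with the brick taken out of service
  umount /var/local/brick0
  xfs_repair -n /dev/sdX1     # -n = check only, report problems without fixing
  xfs_repair /dev/sdX1        # run again without -n if problems are reported
  mount /var/local/brick0

  # pending heals can be checked from any server
  gluster volume heal palantir info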