yayo (j)
2017-Jul-21 09:25 UTC
[Gluster-users] [ovirt-users] ovirt 4.1 hosted engine hyper converged on glusterfs 3.8.10 : "engine" storage domain always complains about "unsynced" elements
2017-07-20 14:48 GMT+02:00 Ravishankar N <ravishankar at redhat.com>:

> But it does say something. All these gfids of completed heals in the log
> below are for the ones that you have given the getfattr output of. So
> what is likely happening is that there is an intermittent connection
> problem between your mount and the brick process, leading to pending
> heals again after the heal gets completed, which is why the numbers are
> varying each time. You would need to check why that is the case.
> Hope this helps,
> Ravi
>
> [2017-07-20 09:58:46.573079] I [MSGID: 108026]
> [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0:
> Completed data selfheal on e6dfd556-340b-4b76-b47b-7b6f5bd74327.
> sources=[0] 1 sinks=2
> [2017-07-20 09:59:22.995003] I [MSGID: 108026]
> [afr-self-heal-metadata.c:51:__afr_selfheal_metadata_do]
> 0-engine-replicate-0: performing metadata selfheal on
> f05b9742-2771-484a-85fc-5b6974bcef81
> [2017-07-20 09:59:22.999372] I [MSGID: 108026]
> [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0:
> Completed metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81.
> sources=[0] 1 sinks=2

Hi,

But we have two Gluster volumes on the same network, and the other one
(the "Data" volume) doesn't have any problems. Why do you think there is
a network problem? How can we check this on a Gluster infrastructure?

Thank you
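One way to confirm that the pending-heal counts really are fluctuating
(rather than stuck) is to sample heal info a few times over a short
period; a minimal sketch, assuming the volume is named "engine" and the
gluster CLI is run on one of the hosts:

    # Print the per-brick pending-entry counts once a minute, five times,
    # to see whether the numbers rise and fall as heals complete.
    for i in 1 2 3 4 5; do
        date
        gluster volume heal engine info | grep 'Number of entries'
        sleep 60
    done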
Ravishankar N
2017-Jul-21 11:10 UTC
[Gluster-users] [ovirt-users] ovirt 4.1 hosted engine hyper converged on glusterfs 3.8.10 : "engine" storage domain always complains about "unsynced" elements
On 07/21/2017 02:55 PM, yayo (j) wrote:
> 2017-07-20 14:48 GMT+02:00 Ravishankar N <ravishankar at redhat.com
> <mailto:ravishankar at redhat.com>>:
>
>     But it does say something. All these gfids of completed heals in
>     the log below are for the ones that you have given the getfattr
>     output of. So what is likely happening is that there is an
>     intermittent connection problem between your mount and the brick
>     process, leading to pending heals again after the heal gets
>     completed, which is why the numbers are varying each time. You
>     would need to check why that is the case.
>     Hope this helps,
>     Ravi
>
>>     [2017-07-20 09:58:46.573079] I [MSGID: 108026]
>>     [afr-self-heal-common.c:1254:afr_log_selfheal]
>>     0-engine-replicate-0: Completed data selfheal on
>>     e6dfd556-340b-4b76-b47b-7b6f5bd74327. sources=[0] 1 sinks=2
>>     [2017-07-20 09:59:22.995003] I [MSGID: 108026]
>>     [afr-self-heal-metadata.c:51:__afr_selfheal_metadata_do]
>>     0-engine-replicate-0: performing metadata selfheal on
>>     f05b9742-2771-484a-85fc-5b6974bcef81
>>     [2017-07-20 09:59:22.999372] I [MSGID: 108026]
>>     [afr-self-heal-common.c:1254:afr_log_selfheal]
>>     0-engine-replicate-0: Completed metadata selfheal on
>>     f05b9742-2771-484a-85fc-5b6974bcef81. sources=[0] 1 sinks=2
>
> Hi,
>
> But we have two Gluster volumes on the same network, and the other one
> (the "Data" volume) doesn't have any problems. Why do you think there
> is a network problem?

Because pending self-heals come into the picture when I/O from the
clients (mounts) does not succeed on some bricks. They are mostly due to
(a) the client losing connection to some bricks (likely), or (b) the I/O
failing on the bricks themselves (unlikely). If most of the I/O is also
going to the 3rd brick (since you say the files are already present on
all bricks and I/O is successful), then it is likely to be (a).

> How can we check this on a Gluster infrastructure?

In the fuse mount logs for the engine volume, check if there are any
messages for brick disconnects. Something along the lines of
"disconnected from volname-client-x".

Just guessing here, but maybe even the 'data' volume did experience
disconnects and self-heals later and you did not observe it when you ran
heal info. See the glustershd log or mount log for self-heal completion
messages on 0-data-replicate-0 also.

Regards,
Ravi

> Thank you
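For reference, a minimal sketch of the check described above, assuming
the standard /var/log/glusterfs/ layout. The fuse mount log file name is
derived from the mount point, so the wildcard below is an assumption and
should be adjusted to whatever file is actually present on the hosts:

    # Brick disconnects seen by the engine volume's fuse mount
    # (mount log name is an assumption; match it to your mount point).
    grep -i "disconnected from" \
        /var/log/glusterfs/rhev-data-center-mnt-glusterSD-*engine*.log

    # Disconnects and self-heal completions seen by the self-heal daemon,
    # which also covers the 'data' volume (0-data-replicate-0).
    grep -iE "disconnected from|Completed (data|metadata) selfheal" \
        /var/log/glusterfs/glustershd.log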