yayo (j)
2017-Jul-21 09:25 UTC
[Gluster-users] [ovirt-users] ovirt 4.1 hosted engine hyper converged on glusterfs 3.8.10 : "engine" storage domain always complains about "unsynced" elements
2017-07-20 14:48 GMT+02:00 Ravishankar N <ravishankar at redhat.com>:

> But it does say something. All these gfids of completed heals in the log
> below are for the ones that you have given the getfattr output of. So
> what is likely happening is that there is an intermittent connection
> problem between your mount and the brick process, leading to pending
> heals again after the heal gets completed, which is why the numbers are
> varying each time. You would need to check why that is the case.
> Hope this helps,
> Ravi
>
> [2017-07-20 09:58:46.573079] I [MSGID: 108026]
> [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0:
> Completed data selfheal on e6dfd556-340b-4b76-b47b-7b6f5bd74327.
> sources=[0] 1 sinks=2
> [2017-07-20 09:59:22.995003] I [MSGID: 108026]
> [afr-self-heal-metadata.c:51:__afr_selfheal_metadata_do]
> 0-engine-replicate-0: performing metadata selfheal on
> f05b9742-2771-484a-85fc-5b6974bcef81
> [2017-07-20 09:59:22.999372] I [MSGID: 108026]
> [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0:
> Completed metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81.
> sources=[0] 1 sinks=2

Hi,

But we have two Gluster volumes on the same network, and the other one
(the "Data" volume) doesn't have any problems. Why do you think there is
a network problem? How can we check this on a Gluster infrastructure?

Thank you
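One way to confirm that the pending-heal counts really are fluctuating
(rather than stuck) is to sample heal info a few times over a short
period; a minimal sketch, assuming the volume is named "engine" and the
gluster CLI is run on one of the hosts:

    # Print the per-brick pending-entry counts once a minute, five times,
    # to see whether the numbers rise and fall as heals complete.
    for i in 1 2 3 4 5; do
        date
        gluster volume heal engine info | grep 'Number of entries'
        sleep 60
    done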
Ravishankar N
2017-Jul-21 11:10 UTC
[Gluster-users] [ovirt-users] ovirt 4.1 hosted engine hyper converged on glusterfs 3.8.10 : "engine" storage domain always complains about "unsynced" elements
On 07/21/2017 02:55 PM, yayo (j) wrote:
> 2017-07-20 14:48 GMT+02:00 Ravishankar N <ravishankar at redhat.com
> <mailto:ravishankar at redhat.com>>:
>
>     But it does say something. All these gfids of completed heals in
>     the log below are for the ones that you have given the getfattr
>     output of. So what is likely happening is that there is an
>     intermittent connection problem between your mount and the brick
>     process, leading to pending heals again after the heal gets
>     completed, which is why the numbers are varying each time. You
>     would need to check why that is the case.
>     Hope this helps,
>     Ravi
>
>>     [2017-07-20 09:58:46.573079] I [MSGID: 108026]
>>     [afr-self-heal-common.c:1254:afr_log_selfheal]
>>     0-engine-replicate-0: Completed data selfheal on
>>     e6dfd556-340b-4b76-b47b-7b6f5bd74327. sources=[0] 1 sinks=2
>>     [2017-07-20 09:59:22.995003] I [MSGID: 108026]
>>     [afr-self-heal-metadata.c:51:__afr_selfheal_metadata_do]
>>     0-engine-replicate-0: performing metadata selfheal on
>>     f05b9742-2771-484a-85fc-5b6974bcef81
>>     [2017-07-20 09:59:22.999372] I [MSGID: 108026]
>>     [afr-self-heal-common.c:1254:afr_log_selfheal]
>>     0-engine-replicate-0: Completed metadata selfheal on
>>     f05b9742-2771-484a-85fc-5b6974bcef81. sources=[0] 1 sinks=2
>
> Hi,
>
> But we have two Gluster volumes on the same network, and the other one
> (the "Data" volume) doesn't have any problems. Why do you think there
> is a network problem?

Because pending self-heals come into the picture when I/O from the
clients (mounts) does not succeed on some bricks. They are mostly due to
(a) the client losing connection to some bricks (likely), or (b) the I/O
failing on the bricks themselves (unlikely). If most of the I/O is also
going to the 3rd brick (since you say the files are already present on
all bricks and I/O is successful), then it is likely to be (a).

> How can we check this on a Gluster infrastructure?

In the fuse mount logs for the engine volume, check if there are any
messages for brick disconnects. Something along the lines of
"disconnected from volname-client-x".

Just guessing here, but maybe even the 'data' volume did experience
disconnects and self-heals later and you did not observe it when you ran
heal info. See the glustershd log or mount log for self-heal completion
messages on 0-data-replicate-0 also.

Regards,
Ravi

> Thank you
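For reference, a minimal sketch of the check described above, assuming
the standard /var/log/glusterfs/ layout. The fuse mount log file name is
derived from the mount point, so the wildcard below is an assumption and
should be adjusted to whatever file is actually present on the hosts:

    # Brick disconnects seen by the engine volume's fuse mount
    # (mount log name is an assumption; match it to your mount point).
    grep -i "disconnected from" \
        /var/log/glusterfs/rhev-data-center-mnt-glusterSD-*engine*.log

    # Disconnects and self-heal completions seen by the self-heal daemon,
    # which also covers the 'data' volume (0-data-replicate-0).
    grep -iE "disconnected from|Completed (data|metadata) selfheal" \
        /var/log/glusterfs/glustershd.log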