Micha Ober
2016-Nov-14 23:03 UTC
[Gluster-users] After upgrade from 3.4.2 to 3.8.5 - High CPU usage resulting in disconnects and split-brain
Hi,

I upgraded an installation of GlusterFS on Ubuntu 14.04.3 from version
3.4.2 to 3.8.5. A few hours after the upgrade, I noticed files in
"split-brain" state. I never had split-brain files in months of operation
before, with the old version.

Using htop, I observed the "glusterfs" process jumping from 0% to 100+%
CPU usage every now and then. Using iostat, I confirmed there is no
bottleneck on the local disks (utilization is well below 10%).

Inspecting the logfiles, it looks like clients are losing their
connection quite often:

[2016-11-14 16:34:56.685349] C [rpc-clnt-ping.c:160:rpc_clnt_ping_timer_expired] 0-gv0-client-1: server X.X.X.62:49152 has not responded in the last 42 seconds, disconnecting.
[2016-11-14 16:35:47.690348] C [rpc-clnt-ping.c:160:rpc_clnt_ping_timer_expired] 0-gv0-client-8: server X.X.X.219:49153 has not responded in the last 42 seconds, disconnecting.
[2016-11-14 17:09:33.903096] C [rpc-clnt-ping.c:160:rpc_clnt_ping_timer_expired] 0-gv0-client-7: server X.X.X.62:49153 has not responded in the last 42 seconds, disconnecting.

There are a total of 6 servers with 2 bricks each (Distribute-Replicate).

The result of a 60-second "gluster volume profile" run can be seen here:
http://pastebin.com/5WN5S63B

After upgrading, I set:

cluster.granular-entry-heal: yes
cluster.locking-scheme: granular

I have now reverted these to no/full to see if files still go
"split-brain".

Best regards
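[For reference, a minimal sketch of how the options above are set and
reverted with the standard gluster CLI. The volume name gv0 is taken from
the 0-gv0-client-N prefix in the log lines; "no" and "full" are the stock
defaults for these two options in 3.8:]

    # enable the new heal options (as done after the upgrade)
    gluster volume set gv0 cluster.granular-entry-heal yes
    gluster volume set gv0 cluster.locking-scheme granular

    # revert to the defaults (no/full) while debugging
    gluster volume set gv0 cluster.granular-entry-heal no
    gluster volume set gv0 cluster.locking-scheme full

    # list any files currently flagged as split-brain
    gluster volume heal gv0 info split-brain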
Pranith Kumar Karampuri
2016-Nov-15 18:06 UTC
[Gluster-users] After upgrade from 3.4.2 to 3.8.5 - High CPU usage resulting in disconnects and split-brain
On Tue, Nov 15, 2016 at 4:33 AM, Micha Ober <micha2k at gmail.com> wrote:

> [...]
> Inspecting the logfiles, it looks like clients are losing their
> connection quite often:
>
> [2016-11-14 16:34:56.685349] C [rpc-clnt-ping.c:160:rpc_clnt_ping_timer_expired] 0-gv0-client-1: server X.X.X.62:49152 has not responded in the last 42 seconds, disconnecting.
> [2016-11-14 16:35:47.690348] C [rpc-clnt-ping.c:160:rpc_clnt_ping_timer_expired] 0-gv0-client-8: server X.X.X.219:49153 has not responded in the last 42 seconds, disconnecting.
> [2016-11-14 17:09:33.903096] C [rpc-clnt-ping.c:160:rpc_clnt_ping_timer_expired] 0-gv0-client-7: server X.X.X.62:49153 has not responded in the last 42 seconds, disconnecting.
> [...]

You need to find out what is leading to the disconnects above. These
could be the reason for the split-brain files.

--
Pranith
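[A first pass at narrowing down the disconnects might look like the
following. This is a sketch, not a procedure from the thread: the volume
name gv0 and the server addresses come from the log lines above, and the
brick log path is the stock location on Ubuntu:]

    # verify every brick process is online and on its expected port
    gluster volume status gv0

    # the 42-second window in the messages is network.ping-timeout
    # (42s is the default); "Options Reconfigured" shows any overrides
    gluster volume info gv0

    # on the servers named in the messages (X.X.X.62, X.X.X.219), check
    # the brick logs around the disconnect timestamps; the file name
    # mirrors the brick path
    grep -i "disconnect" /var/log/glusterfs/bricks/*.log

    # rule out basic network trouble between client and brick
    ping -c 100 X.X.X.62

[Given the 100+% CPU spikes reported above, it is also worth checking
whether the disconnects coincide with those spikes: a process too busy to
answer the RPC ping can trip the timeout even on a healthy network.]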