My guess is that there is a corruption in the volume list or peer list which
has led glusterd into an infinite loop of traversing that peer/volume list,
hogging the CPU. Again, this is a guess and I've not had a chance to take a
detailed look at the logs and the strace output.
I believe that if you reboot the node again the problem will disappear.
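To make that guess more concrete, below is a minimal sketch (plain C, with
made-up struct and function names, not glusterd's actual code) of how a
peer/volume list corrupted into a cycle turns an ordinary name lookup into
an endless strcmp() loop, which would also match the __strcmp_sse42 hotspot
you reported from perf top:

/* Hypothetical illustration only -- not glusterd source. */
#include <string.h>

struct peer {
    char         name[64];
    struct peer *next;
};

/* Walks the list comparing names; it terminates only on a match or on
 * reaching NULL. If the list contains a cycle, looking up a name that is
 * not present never returns and pins one core at 100%, with nearly all
 * samples landing in strcmp(). */
static struct peer *find_peer(struct peer *head, const char *name)
{
    for (struct peer *p = head; p != NULL; p = p->next)
        if (strcmp(p->name, name) == 0)
            return p;
    return NULL;
}

int main(void)
{
    struct peer a = { "server-1", NULL };
    struct peer b = { "server-2", NULL };

    a.next = &b;
    b.next = &a;   /* simulated corruption: the tail points back to the head */

    find_peer(&a, "server-3");   /* never returns */
    return 0;
}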
On Tue, 22 Aug 2017 at 20:07, Serkan Çoban <cobanserkan at gmail.com>
wrote:
> As an addition, perf top shows 80% of the time spent in __strcmp_sse42
> (libc-2.12.so) during glusterd's 100% CPU usage.
> Hope this helps...
>
> On Tue, Aug 22, 2017 at 2:41 PM, Serkan Çoban <cobanserkan at gmail.com>
> wrote:
> > Hi there,
> >
> > I have a strange problem.
> > The Gluster version is 3.10.5, and I am testing new servers. The Gluster
> > configuration is 16+4 EC; I have three volumes, each with 1600 bricks.
> > I can successfully create the cluster and volumes without any
> > problems. I wrote data to the cluster from 100 clients for 12 hours,
> > again with no problem. But when I try to reboot a node, the glusterd
> > process hangs at 100% CPU usage and seems to do nothing; no brick
> > processes come online. You can find an strace of the glusterd process
> > for 1 minute here:
> >
> > https://www.dropbox.com/s/c7bxfnbqxze1yus/gluster_strace.out?dl=0
> >
> > Here is the glusterd logs:
> > https://www.dropbox.com/s/hkstb3mdeil9a5u/glusterd.log?dl=0
> >
> >
> > By the way, a reboot of one server completes without problems if I
> > reboot the servers before creating any volumes.
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
--
- Atin (atinm)