Hi,

We've been using GlusterFS 3.4.0 on Debian 7 for around 11 months. Every so often (once every 6 weeks or so) a server running the client and nginx accessing the Gluster storage would suffer a hard lockup of nginx and require a reboot. This wasn't such a big deal, as there were several clients, so there was never a loss to production.

However, in the last few days we've seen a huge increase in these lockups: perhaps 2-3 times a day, on multiple machines.

The bug appears to be very similar to https://bugzilla.redhat.com/show_bug.cgi?id=764743 from Gluster 3.2, and the suggestion there of first killing the Gluster process does allow a restart of nginx without a reboot (rough commands below).

The only changes to our usage of late have been stopping a rebalance process (as we're at the point where we need to add extra bricks) and breaching 80% usage on the volume (total volume size is 400TB).

Has anyone else had any experience of this bug re-emerging in Gluster 3.4?
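For reference, the workaround looks roughly like this (a rough sketch; the PID, server, volume name, and mount point below are placeholders, not our real values):

    # find the FUSE client process behind the hung mount
    ps aux | grep glusterfs
    # kill it (forcibly, since the process is wedged); nginx can then be restarted
    kill -9 <pid>
    service nginx restart
    # the old mount point may need a lazy unmount before remounting
    umount -l /mnt/gluster
    mount -t glusterfs <server>:/<volname> /mnt/gluster

Thanks,

Mark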
----- Original Message -----
> We've been using GlusterFS 3.4.0 on Debian 7 for around 11 months.
> [...]
> Has anyone else had any experience of this bug re-emerging in Gluster 3.4?

As a side thought, how practical would it be to upgrade to GlusterFS 3.4.5? There have been a lot of bug fixes (some pretty important) between 3.4.0 and 3.4.5. I have no idea if there's a fix for the problem you mention in there btw, I'm just asking in general. :)
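If you do try it, the upgrade itself should be something like this on Debian 7 (a sketch: I haven't checked the exact repo path for the 3.4.5 Debian packages, so verify it on download.gluster.org first):

    # check what's installed now
    glusterfs --version
    # point apt at the 3.4.5 wheezy repo (verify this path on download.gluster.org)
    echo "deb http://download.gluster.org/pub/gluster/glusterfs/3.4/3.4.5/Debian/apt wheezy main" \
        > /etc/apt/sources.list.d/gluster.list
    apt-get update && apt-get install glusterfs-client glusterfs-server

Regards and best wishes,

Justin Clift

--
GlusterFS - http://www.gluster.org
An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift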
Could you take a statedump of the mount process when the hang happens, using the steps at https://github.com/gluster/glusterfs/blob/master/doc/debugging/statedump.md?
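In short, it's something like this (assuming the default statedump path; the doc above has the details):

    # find the PID of the glusterfs client (mount) process
    ps aux | grep glusterfs
    # SIGUSR1 asks the process to write out a statedump
    kill -USR1 <pid>
    # the dump lands under /var/run/gluster by default,
    # named glusterdump.<pid>.dump.<timestamp>

Pranith

On 08/05/2014 02:10 AM, Mark Harrigan wrote:
> We've been using GlusterFS 3.4.0 on Debian 7 for around 11 months.
> [...]
> Has anyone else had any experience of this bug re-emerging in Gluster 3.4?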