ZHANG Cheng
2012-Nov-26 09:46 UTC
[Gluster-users] Self healing in 3.3.0 causes our 2-brick replicated cluster to freeze (client read/write timeout)
Early this morning our 2-brick replicated cluster had an outage. One of the brick servers (brick02) ran out of disk space, and by the time we responded to the disk-full alert the issue had already lasted a few hours. We reclaimed some disk space and rebooted the brick02 server, expecting it to self-heal once it came back.

It did start self-healing, but after just a couple of minutes access to the gluster filesystem froze. Tons of "nfs: server brick not responding, still trying" messages popped up in dmesg. The load average on the app servers went up to around 200 from the usual 0.10. We had to shut down the brick02 server, or stop the gluster server process on it, to get the cluster working again.

How should we deal with this issue? Thanks in advance.

Our gluster setup follows the official doc.

gluster> volume info

Volume Name: staticvol
Type: Replicate
Volume ID: fdcbf635-5faf-45d6-ab4e-be97c74d7715
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: brick01:/exports/static
Brick2: brick02:/exports/static

The underlying filesystem is xfs (on an LVM volume):
/dev/mapper/vg_node-brick on /exports/static type xfs (rw,noatime,nodiratime,nobarrier,logbufs=8)

The brick servers do not act as gluster clients. Our app servers are the gluster clients, mounting via NFS:
brick:/staticvol on /mnt/gfs-static type nfs (rw,noatime,nodiratime,vers=3,rsize=8192,wsize=8192,addr=10.10.10.51)

brick is a DNS round-robin record for brick01 and brick02.
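(For completeness, roughly how a volume and mount like this are set up. Hostnames, paths and options are the ones listed above, but this is a sketch rather than the exact commands we ran.)

    # On brick01, after peering with the other server:
    gluster peer probe brick02
    gluster volume create staticvol replica 2 transport tcp \
        brick01:/exports/static brick02:/exports/static
    gluster volume start staticvol

    # On each app server, mounting over Gluster's built-in NFSv3 server;
    # "brick" is the DNS round-robin record for brick01/brick02:
    mount -t nfs -o vers=3,noatime,nodiratime,rsize=8192,wsize=8192 \
        brick:/staticvol /mnt/gfs-static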
ZHANG Cheng
2012-Nov-29 05:24 UTC
[Gluster-users] Self healing in 3.3.0 causes our 2-brick replicated cluster to freeze (client read/write timeout)
I dug up a gluster-users mailing list thread from June 2011 at http://gluster.org/pipermail/gluster-users/2011-June/008111.html. In that post, Marco Agostini wrote:

    Craig Carl told me, three days ago:
    "That happens because Gluster's self heal is a blocking operation. We are
    working on a non-blocking self heal; we are hoping to ship it in early
    September."

It looks like even with the release of 3.3.1, self heal is still a blocking operation. I am wondering why the official Administration Guide doesn't warn readers about something this important for production operation.

On Mon, Nov 26, 2012 at 5:46 PM, ZHANG Cheng <czhang.oss at gmail.com> wrote:
> Early this morning our 2-brick replicated cluster had an outage. One of
> the brick servers (brick02) ran out of disk space.
> [...]
Bryan Whitehead
2012-Nov-29 06:48 UTC
[Gluster-users] Self healing in 3.3.0 causes our 2-brick replicated cluster to freeze (client read/write timeout)
When you mount xfs, also use the inode64 option; that will help xfs performance.

My offhand guess is that you are running into limited network bandwidth while the two bricks sync. As the network gets flooded, NFS response gets poor. Make sure you are getting full-duplex connections, or upgrade your network to 10G or (even better) InfiniBand.

On Mon, Nov 26, 2012 at 1:46 AM, ZHANG Cheng <czhang.oss at gmail.com> wrote:
> Early this morning our 2-brick replicated cluster had an outage. One of
> the brick servers (brick02) ran out of disk space.
> [...]
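A rough sketch of both suggestions, using the device, mount point and options from the original post (the NIC name eth0 is only an example):

    # On each brick server: add inode64 to the brick filesystem. On older
    # kernels inode64 cannot be enabled via remount, so unmount/mount:
    umount /exports/static
    mount -o noatime,nodiratime,nobarrier,logbufs=8,inode64 \
        /dev/mapper/vg_node-brick /exports/static

    # Check that the replication links are running full duplex:
    ethtool eth0 | grep -E 'Speed|Duplex'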
Jeff Darcy
2012-Nov-29 10:58 UTC
[Gluster-users] Self healing in 3.3.0 causes our 2-brick replicated cluster to freeze (client read/write timeout)
On 11/26/12 4:46 AM, ZHANG Cheng wrote:
> It did start self-healing, but after just a couple of minutes access to
> the gluster filesystem froze. Tons of "nfs: server brick not responding,
> still trying" messages popped up in dmesg.
> [...]

Have you checked the glustershd logs (should be in /var/log/glusterfs) on the bricks? If there's nothing useful there, a statedump would also be useful; see the "gluster volume statedump" instructions in your friendly local admin guide (section 10.4 for GlusterFS 3.3).

Most helpful of all would be a bug report with any of this information plus a description of your configuration. You can either create a new one or attach the info to an existing bug if one seems to fit. The following seems like it might be related, even though it's for virtual machines:

https://bugzilla.redhat.com/show_bug.cgi?id=881685
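A minimal sketch of those two checks, using the volume name from the original post (where the statedump files land depends on the server.statedump-path setting; see section 10.4 of the 3.3 admin guide):

    # On each brick server: the self-heal daemon log.
    less /var/log/glusterfs/glustershd.log

    # Take a statedump of the brick processes for the volume.
    gluster volume statedump staticvol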
ZHANG Cheng
2013-Jan-09 08:43 UTC
[Gluster-users] Self healing in 3.3.0 causes our 2-brick replicated cluster to freeze (client read/write timeout)
We had a planned outage yesterday which required us to shut down one of the replicated brick servers (brick02) for 30 minutes. The maintenance went smoothly, but a couple of minutes after I brought brick02 back online, our app servers' load rose to a very high number (200~300+) and they ground to a halt as a REST API backend. This is the same problem described in my previous post. If I shut down brick02 and restart the jboss-as instance that powers our REST API, the app servers' load stays at a normal level. So it looks like our app servers' file access pattern leads to file operations freezing while the gluster servers are self-healing.

Our app is a REST API backend for a mobile forum/community, so the main content is threads and posts containing pictures, shown Pinterest-style in our iOS app. For each picture URL in the JSON response, our API server's Java code does a check like this:

    public static boolean checkImage(String path) {
        File file = new File(path);
        if (null != file && file.exists() && file.length() > 0) {
            return true;
        }
        return false;
    }

Usually each such response contains about 10 to 20 pictures, which means checkImage() is called that many times per request. Because most requests ask for recently uploaded pictures, these picture files are almost certainly the kind of files requiring self-heal. Even during off-peak hours we get 0-3 such thread/post API requests per second, so sooner or later we run into the same freezing problem whenever the glusterfs servers are self-healing.

I think now I have more concrete info to file a bug report.

On Mon, Nov 26, 2012 at 5:46 PM, ZHANG Cheng <czhang.oss at gmail.com> wrote:
> Early this morning our 2-brick replicated cluster had an outage. One of
> the brick servers (brick02) ran out of disk space.
> [...]
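One way to see this from the outside while brick02 is healing, as a rough sketch (the picture path below is a placeholder for one of the recently uploaded files checkImage() would stat):

    # From an app server: time the same metadata operations checkImage() does
    # (existence + size) against a recently uploaded picture.
    time stat /mnt/gfs-static/recent/picture.jpg    # placeholder path

    # On a brick server: confirm that file is still in the heal backlog.
    gluster volume heal staticvol info | head -50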