thr3ads.net - Gluster users - [Gluster-users] Frequent glusterd restarts needed to avoid NFS performance degradation [Apr 2012]

Dan Bretherton

2012-Apr-17 23:30 UTC

[Gluster-users] Frequent glusterd restarts needed to avoid NFS performance degradation

Dear All-
I find that I have to restart glusterd every few days on my servers to 
stop NFS performance from becoming unbearably slow.  When the problem 
occurs, volumes can take several minutes to mount and there are long 
delays responding to "ls".   Mounting from a different server, i.e. one
not normally used for NFS export, results in normal NFS access speeds.  
This doesn't seem to have anything to do with load because it happens 
whether or not there is anything running on the compute servers.  Even 
when the system is mostly idle there are often a lot of glusterfsd 
processes running, and on several of the servers I looked at this 
evening there is a process called glusterfs using 100% of one CPU.  I 
can't find anything unusual in nfs.log or etc-glusterfs-glusterd.vol.log 
on the servers affected.  Restarting glusterd seems to stop this strange 
behaviour and make NFS access run smoothly again, but this usually only 
lasts for a day or two.

This behaviour is not necessarily related to the length of time since 
glusterd was started, but has more to do with the amount of work the 
GlusterFS processes on each server have to do.  I use a different server 
to export each of my 8 different volumes, and the NFS performance 
degradation seems to affect the most heavily used volumes more than the 
others.  I really need to find a solution to this problem; all I can 
think of doing is setting up a cron job on each server to restart 
glusterd every day, but I am worried about what side effects that might 
have.  I am using GlusterFS version 3.2.5.  All suggestions would be 
much appreciated.

Regards,
Dan.
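
A minimal sketch of how the symptom described above (a glusterfs process pinned at 100% of one CPU) could be checked from a script. It assumes a Linux server where `ps -eo pid,pcpu,comm` is available; the function names and the 90% threshold are illustrative and do not come from the thread. Note that `ps` reports CPU usage averaged over the lifetime of the process, so a tool such as `top` gives a better view of the current load.

```python
import subprocess


def parse_ps(output, names=("glusterfs", "glusterfsd"), min_cpu=90.0):
    """Return (pid, cpu_percent, command) tuples for the named processes
    whose reported CPU usage is at or above min_cpu."""
    busy = []
    for line in output.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        pid, cpu, comm = parts
        try:
            entry = (int(pid), float(cpu), comm.strip())
        except ValueError:
            continue  # header line or anything else that is not numeric
        if entry[2] in names and entry[1] >= min_cpu:
            busy.append(entry)
    return busy


def busy_gluster_processes(min_cpu=90.0):
    """Run ps and report GlusterFS processes at or above min_cpu percent."""
    # Assumption: procps-style ps; the column selection may differ elsewhere.
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,comm"],
        capture_output=True, text=True, check=True,
    )
    return parse_ps(result.stdout, min_cpu=min_cpu)
```

Calling `busy_gluster_processes()` on each server before and after a glusterd restart would give a simple record of whether the busy process goes away.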

Gerald Brandt

2012-Apr-18 00:06 UTC

Hi,

I run GlusterFS 3.2.5 and the only access is via NFS.  I'm running Citrix
XenServer with about 23 VMs off of it.  I haven't seen any degradation
at all.

One thing I don't have is replication or anything else set up.  The server
is ready to replicate, but I'm waiting for 3.3.

Gerald

Brian Cipriano

2012-Apr-23 14:38 UTC

Hi Dan - I've seen this problem too. I agree with everything you've 
described - seems to happen more quickly on more heavily used volumes, 
and a restart fixes it right away. I've also been considering writing a 
cronjob to fix this - have you made any progress on this, anything to 
report?

I'm running a fairly simple distributed, non-replicated volume across 
two servers. What sort of tasks are you using your gluster for? Ours is 
for a render farm, so we see a very large number of mounts/unmounts as 
render nodes mount various parts of the filesystem. I wonder if this has 
anything to do with it; is your use case anything similar?

- brian
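
One way to put numbers on "happens more quickly on more heavily used volumes" would be to time a directory listing on the NFS mount at regular intervals and log the result. The sketch below is only an illustration: the function names and the 5 second threshold are assumptions, not values from the thread, and NFS client-side caching may make repeated listings of the same directory look faster than a fresh one.

```python
import os
import time


def listing_latency(path):
    """Time one directory listing of path.

    Returns (seconds_taken, number_of_entries)."""
    start = time.monotonic()
    with os.scandir(path) as entries:
        count = sum(1 for _ in entries)
    return time.monotonic() - start, count


def is_degraded(latency_seconds, threshold_seconds=5.0):
    """Treat a listing slower than the (assumed) threshold as degraded."""
    return latency_seconds > threshold_seconds
```

Logging the output of `listing_latency` for a mount point on each export server would show whether the slowdown builds up gradually or appears suddenly.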


Dan Bretherton

2012-Apr-25 13:34 UTC

Dear Brian and Paul,
Thanks for reporting your NFS performance degradation problems; I'm glad 
I'm not the only one who has it.  My 20 node storage cluster has a 
number of fairly standard replicated-distributed volumes; I don't use 
striping.

> I've also been considering writing a cronjob
> to fix this - have you made any progress on this, anything to report?

I made my compute cluster nodes part of the storage cluster a couple of 
months ago as described here:

http://community.gluster.org/a/nfs-performance-with-fuse-client-redundancy/

A few days ago I set up a cron job to restart glusterd on the compute 
nodes every day at about 2AM.   So far there haven't been any reported 
problems and long running jobs have been unaffected.  I thought this 
would be potentially less disruptive than automatically restarting 
glusterd on the storage servers, because those do a lot more than just 
provide NFS.  I have been using the GlusterFS servers to export NFS to 
less important machines, but I now plan to use the compute nodes for all 
NFS exports in order to take advantage of the daily glusterd restart.  
This isn't an ideal situation because the compute nodes get very busy at 
times and tend to suffer more down time than the storage servers.  I 
thought about having a dedicated compute server just for GlusterFS 
exports, but I don't have enough in the budget for that at the moment.  
My other worry is that other GlusterFS related processes on the storage 
servers will slow down with use, not just NFS.

> What sort of tasks are you using your gluster for?

The compute cluster is mainly used to run various climate and 
meteorology related models and associated data analysis and processing 
applications, all reading from and writing to GlusterFS volumes.

> Ours is for a
> render farm, so we see a very large number of mounts/unmounts as render
> nodes mount various parts of the filesystem. I wonder if this has anything
> to do with it; is your use case anything similar?

I don't think our models and applications do a lot of mounting and 
unmounting; volumes usually stay mounted while compute cluster jobs are 
using the data, and there are also quite a lot of interactive shells 
keeping volumes mounted for long periods.

-Dan.
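
The daily restart described above could be wrapped in a small script so that failures are reported instead of passing silently. This is a sketch under stated assumptions, not the cron job actually used on this cluster: the restart command `service glusterd restart`, the script path, and the exact schedule are all hypothetical and would need adjusting to the distribution in use.

```python
import subprocess
import sys

# Assumption: the init system exposes the daemon as a service named
# "glusterd"; on some systems /etc/init.d/glusterd restart is used instead.
RESTART_COMMAND = ["service", "glusterd", "restart"]

# Hypothetical crontab entry for a 2AM daily restart (path is illustrative):
#   0 2 * * * /usr/bin/python3 /usr/local/sbin/restart_glusterd.py


def restart_glusterd(command=RESTART_COMMAND, run=subprocess.run):
    """Run the restart command and return its exit status.

    The run parameter exists so the function can be exercised without
    touching a real service."""
    result = run(command, capture_output=True, text=True)
    if result.returncode != 0:
        # cron typically mails anything written to stderr to the job owner.
        print(f"glusterd restart failed: {result.stderr.strip()}",
              file=sys.stderr)
    return result.returncode

# When installed as the cron script, finish with:
#   sys.exit(restart_glusterd())
```

Returning the exit status lets cron, or whatever wraps the script, notice a failed restart on a node.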

On 04/23/2012 08:00 PM, gluster-users-request at gluster.org wrote:
> Date: Mon, 23 Apr 2012 19:24:14 +0100
> From: Paul Simpson<paul at realisestudio.com>
> Subject: Re: [Gluster-users] Frequent glusterd restarts needed to
> 	avoid NFS performance degradation
> To: Brian Cipriano<bcipriano at zerovfx.com>
> Cc: gluster-users at gluster.org
> Message-ID:
> 	<CAOFxjOTGSS3mFve=EktgAZRaQz3XiZLoZU-EvEByCV6H=m1cfw at mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> just like to add that we sometimes need to restart glusterd on servers too.
>   again - on a renderfarm that hammers our 4 server dist/repl servers
> heavily.
>
> -p
