thr3ads.net - Gluster users - [Gluster-users] 120k context switches on GlsuterFS nodes [May 2017]

mabi

2017-May-17 17:20 UTC

[Gluster-users] 120k context switches on GlsuterFS nodes

I don't know exactly what kind of context switches they were, but I do know
that it is the "cs" number under "system" when you run vmstat.

I also use the Percona Linux monitoring template for Cacti
(https://www.percona.com/doc/percona-monitoring-plugins/LATEST/cacti/linux-templates.html),
which monitors context switches too. In case it is of any use: interrupts were
also quite high during that time, with peaks of up to 50k interrupts.
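
For anyone who wants to reproduce the measurement without vmstat: as far as I know, the "cs" and "in" columns are derived from the cumulative `ctxt` and `intr` counters in /proc/stat. Below is a minimal Python sketch of that idea, assuming a Linux host. The function names (`parse_proc_stat`, `per_second`, `sample`) are made up for illustration and are not part of vmstat or the Percona templates.

```python
import time


def parse_proc_stat(text):
    """Return the cumulative context-switch and interrupt counters
    found in the contents of /proc/stat."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "ctxt":
            # Total context switches since boot.
            counters["ctxt"] = int(fields[1])
        elif fields[0] == "intr":
            # The first number on the intr line is the total of all interrupts.
            counters["intr"] = int(fields[1])
    return counters


def per_second(before, after, seconds):
    """Turn two cumulative snapshots into per-second rates."""
    return {key: (after[key] - before[key]) / seconds for key in before}


def sample(interval=1.0, path="/proc/stat"):
    """Take two snapshots `interval` seconds apart and return the rates."""
    with open(path) as f:
        before = parse_proc_stat(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_proc_stat(f.read())
    return per_second(before, after, interval)
```

Calling `sample()` on one of the nodes returns a dict with `ctxt` and `intr` rates, which should be roughly comparable to the "cs" and "in" values that `vmstat 1` prints after its first line (the first line is an average since boot).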

-------- Original Message --------
Subject: Re: [Gluster-users] 120k context switches on GlsuterFS nodes
Local Time: May 17, 2017 2:37 AM
UTC Time: May 17, 2017 12:37 AM
From: ravishankar at redhat.com
To: mabi <mabi at protonmail.ch>, Gluster Users <gluster-users at gluster.org>

On 05/16/2017 11:13 PM, mabi wrote:
Today I even saw up to 400k context switches for around 30 minutes on my
two-node replica... Does anyone else see such high context switches on their
GlusterFS nodes?

I am wondering what is "normal" and if I should be worried...

-------- Original Message --------
Subject: 120k context switches on GlsuterFS nodes
Local Time: May 11, 2017 9:18 PM
UTC Time: May 11, 2017 7:18 PM
From: mabi at protonmail.ch
To: Gluster Users <gluster-users at gluster.org>

Hi,

Today I noticed that for around 50 minutes my two GlusterFS 3.8.11 nodes had a
very high number of context switches, around 120k. Usually the average is
around 1k-2k. So I checked what was happening, and there were simply more users
accessing (downloading) their files at the same time. These are directories with
typical cloud files, which means files of all sizes, ranging from a few kB to MB,
and a lot of them of course.

I have never seen such a high number of context switches in my entire life, so I
wanted to ask whether this is normal or to be expected. I do not find any signs
of errors or warnings in any log files.

Which context switches are you referring to (syscall context switches on the
bricks?)? How did you measure this?
-Ravi

My volume is a replicated volume on two nodes with ZFS as the underlying
filesystem, and the volume is mounted using FUSE on the client (the cloud
server). On that cloud server the glusterfs process was using quite a lot of
system CPU, but that server (VM) only has 2 vCPUs, so maybe I should increase
the number of vCPUs...

Any ideas or recommendations?

Regards,
M.

_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org

http://lists.gluster.org/mailman/listinfo/gluster-users

Pranith Kumar Karampuri

2017-May-17 17:37 UTC


[Gluster-users] 120k context switches on GlsuterFS nodes

+ gluster-devel

On Wed, May 17, 2017 at 10:50 PM, mabi <mabi at protonmail.ch> wrote:
> I don't know exactly what kind of context switches they were, but I do know
> that it is the "cs" number under "system" when you run vmstat.
>
> I also use the Percona Linux monitoring template for Cacti
> (https://www.percona.com/doc/percona-monitoring-plugins/LATEST/cacti/linux-templates.html),
> which monitors context switches too. In case it is of any use: interrupts
> were also quite high during that time, with peaks of up to 50k interrupts.
>
>
>
> -------- Original Message --------
> Subject: Re: [Gluster-users] 120k context switches on GlsuterFS nodes
> Local Time: May 17, 2017 2:37 AM
> UTC Time: May 17, 2017 12:37 AM
> From: ravishankar at redhat.com
> To: mabi <mabi at protonmail.ch>, Gluster Users <gluster-users at gluster.org>
>
>
> On 05/16/2017 11:13 PM, mabi wrote:
>
> Today I even saw up to 400k context switches for around 30 minutes on my
> two-node replica... Does anyone else see such high context switches on
> their GlusterFS nodes?
>
> I am wondering what is "normal" and if I should be worried...
>
>
>
>
> -------- Original Message --------
> Subject: 120k context switches on GlsuterFS nodes
> Local Time: May 11, 2017 9:18 PM
> UTC Time: May 11, 2017 7:18 PM
> From: mabi at protonmail.ch
> To: Gluster Users <gluster-users at gluster.org>
>
> Hi,
>
> Today I noticed that for around 50 minutes my two GlusterFS 3.8.11 nodes
> had a very high number of context switches, around 120k. Usually the
> average is around 1k-2k. So I checked what was happening, and there
> were simply more users accessing (downloading) their files at the same time.
> These are directories with typical cloud files, which means files of all
> sizes, ranging from a few kB to MB, and a lot of them of course.
>
> I have never seen such a high number of context switches in my entire life,
> so I wanted to ask whether this is normal or to be expected. I do not find
> any signs of errors or warnings in any log files.
>
>
> Which context switches are you referring to (syscall context switches on the
> bricks?)? How did you measure this?
> -Ravi
>
> My volume is a replicated volume on two nodes with ZFS as the underlying
> filesystem, and the volume is mounted using FUSE on the client (the cloud
> server). On that cloud server the glusterfs process was using quite a lot
> of system CPU, but that server (VM) only has 2 vCPUs, so maybe I should
> increase the number of vCPUs...
>
> Any ideas or recommendations?
>
>
>
> Regards,
> M.


-- 
Pranith

Jamie Lawrence

2017-May-17 17:49 UTC


[Gluster-users] 120k context switches on GlsuterFS nodes

> On May 17, 2017, at 10:20 AM, mabi <mabi at protonmail.ch> wrote:
>
> I don't know exactly what kind of context-switches it was but what I
> know is that it is the "cs" number under "system" when you run vmstat.
>
> Also I use the percona linux monitoring template for cacti
> (https://www.percona.com/doc/percona-monitoring-plugins/LATEST/cacti/linux-templates.html)
> which monitors context switches too. If that's of any use interrupts where
> also quite high during that time with peaks up to 50k interrupts.

You can't read or write data from the disk or send data over the network
from userspace without making system calls. System calls mean context switches.
So you should expect to see the CS number scale with load - the whole point of
Gluster is to read and write and send data over the network.
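
One way to check whether the system-wide number really comes from Gluster is to look at per-process counters. The sketch below, in Python and assuming a Linux host, reads the `voluntary_ctxt_switches` and `nonvoluntary_ctxt_switches` fields from /proc. As far as I know these counters are kept per thread, so it sums over /proc/<pid>/task/*/status to cover a multithreaded process such as glusterfs or glusterfsd. The function names are hypothetical, chosen only for this example.

```python
import glob

FIELDS = ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches")


def parse_ctxt_switches(status_text):
    """Extract the context-switch counters from the contents of a
    /proc/<pid>/status (or /proc/<pid>/task/<tid>/status) file."""
    counts = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in FIELDS:
            counts[key] = int(value.strip())
    return counts


def process_ctxt_switches(pid):
    """Sum the counters over all threads of the given process."""
    totals = {field: 0 for field in FIELDS}
    for path in glob.glob(f"/proc/{pid}/task/*/status"):
        try:
            with open(path) as f:
                counts = parse_ctxt_switches(f.read())
        except OSError:
            # The thread exited between listing and reading; skip it.
            continue
        for field, value in counts.items():
            totals[field] += value
    return totals
```

Sampling `process_ctxt_switches(pid)` twice for the Gluster processes and taking the difference shows how much of the "cs" rate they account for. A high voluntary count generally means the threads are blocking (for example waiting on I/O), while a high nonvoluntary count suggests they are being preempted while competing for CPU.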

As far as them being "excessive", I don't know how to think about
that without at least a comparison, or better, some evidence that something is
doing more work than it "should". (Or best, line numbers where
unnecessary work is being performed.)

Is there something other than a surprising number to make you think it isn't
behaving well? Did the number jump after an upgrade? Do you have other systems
doing roughly the same thing with other software that performs better? Keep in
mind that, say, a vanilla NFS or SMB server doesn't have the
inter-gluster-node overhead, and how much of that traffic there is depends on
how you've configured Gluster.

-j
