Ravishankar N
2017-May-18 05:03 UTC
[Gluster-users] 120k context switches on GlsuterFS nodes
On 05/17/2017 11:07 PM, Pranith Kumar Karampuri wrote:
> + gluster-devel
>
> On Wed, May 17, 2017 at 10:50 PM, mabi <mabi at protonmail.ch> wrote:
>
>     I don't know exactly what kind of context switches they were, but what
>     I do know is that it is the "cs" number under "system" when you run
>     vmstat.

Okay, that could be due to the syscalls themselves, or to pre-emptive
multitasking if there aren't enough CPU cores. I think the spike in the
numbers is due to more users accessing the files at the same time, as you
observed, which translates into more syscalls. You can try capturing the
gluster volume profile info the next time it occurs and correlating it with
the cs count. If you don't see any negative performance impact, I don't
think you need to be bothered much by the numbers.

HTH,
Ravi

>     Also, I use the Percona Linux monitoring template for Cacti
>     (https://www.percona.com/doc/percona-monitoring-plugins/LATEST/cacti/linux-templates.html),
>     which monitors context switches too. If that's of any use, interrupts
>     were also quite high during that time, with peaks of up to 50k.
>
>> -------- Original Message --------
>> Subject: Re: [Gluster-users] 120k context switches on GlsuterFS nodes
>> Local Time: May 17, 2017 2:37 AM
>> UTC Time: May 17, 2017 12:37 AM
>> From: ravishankar at redhat.com
>> To: mabi <mabi at protonmail.ch>, Gluster Users <gluster-users at gluster.org>
>>
>> On 05/16/2017 11:13 PM, mabi wrote:
>>> Today I even saw up to 400k context switches for around 30 minutes on
>>> my two-node replica... Does anyone else see such high context switches
>>> on their GlusterFS nodes?
>>>
>>> I am wondering what is "normal" and whether I should be worried...
>>>
>>>> -------- Original Message --------
>>>> Subject: 120k context switches on GlsuterFS nodes
>>>> Local Time: May 11, 2017 9:18 PM
>>>> UTC Time: May 11, 2017 7:18 PM
>>>> From: mabi at protonmail.ch
>>>> To: Gluster Users <gluster-users at gluster.org>
>>>>
>>>> Hi,
>>>>
>>>> Today I noticed that for around 50 minutes my two GlusterFS 3.8.11
>>>> nodes had a very high number of context switches, around 120k.
>>>> Usually the average is more like 1k-2k. So I checked what was
>>>> happening, and there were just more users accessing (downloading)
>>>> their files at the same time. These are directories with typical
>>>> cloud files, which means files of all sizes, ranging from a few kB
>>>> to several MB, and of course a lot of them.
>>>>
>>>> I have never seen such a high number of context switches, so I
>>>> wanted to ask whether this is normal or to be expected. I do not
>>>> find any signs of errors or warnings in any log files.
>>
>> Which context switches are you referring to (syscall context switches
>> on the bricks?), and how did you measure this?
>> -Ravi
>>
>>>> My volume is a replicated volume on two nodes with ZFS as the
>>>> filesystem behind it, and the volume is mounted using FUSE on the
>>>> client (the cloud server). On that cloud server the glusterfs
>>>> process was using quite a lot of system CPU, but that server (VM)
>>>> only has 2 vCPUs, so maybe I should increase the number of vCPUs...
>>>>
>>>> Any ideas or recommendations?
>>>>
>>>> Regards,
>>>> M.
>>>>
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>
> --
> Pranith
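Ravi's suggestion above can be sketched as a small capture script that records
the volume profile stats and vmstat side by side, so per-FOP call counts can be
correlated with the "cs" column afterwards. This is only a sketch: "myvol" is a
placeholder volume name, and the window length is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: capture gluster volume profile output alongside vmstat so the
# per-FOP counts can later be correlated with the "cs" column.
# "myvol" is a placeholder; replace it with your volume name.
VOL=myvol
LOG=/var/tmp/cs-capture.$(date +%s)

gluster volume profile "$VOL" start            # enable profiling (adds a little overhead)
vmstat 5 60 > "$LOG.vmstat"                    # ~5-minute window of cs numbers
gluster volume profile "$VOL" info > "$LOG.profile"   # cumulative FOP stats for the window
gluster volume profile "$VOL" stop             # turn profiling back off
echo "wrote $LOG.vmstat and $LOG.profile"
```

Run it on a brick node during the next spike; this needs a live gluster
installation, so treat it as a template rather than something to paste blindly.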
Joe Julian
2017-May-18 13:09 UTC
[Gluster-users] [Gluster-devel] 120k context switches on GlsuterFS nodes
On the other hand, tracking that stat between versions with a known test
sequence may be valuable for watching for performance regressions or
improvements.

On May 17, 2017 10:03:28 PM PDT, Ravishankar N <ravishankar at redhat.com> wrote:
> Okay, that could be due to the syscalls themselves, or to pre-emptive
> multitasking if there aren't enough CPU cores. I think the spike in the
> numbers is due to more users accessing the files at the same time, as you
> observed, which translates into more syscalls. You can try capturing the
> gluster volume profile info the next time it occurs and correlating it
> with the cs count. If you don't see any negative performance impact, I
> don't think you need to be bothered much by the numbers.
>
> HTH,
> Ravi
>
> [...]

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
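Tracking the counter across versions with a fixed test sequence can be done
straight from /proc/stat, whose cumulative "ctxt" line is what vmstat's "cs"
column is derived from. A minimal sketch (the 10-second default interval is an
arbitrary choice; run it around your known test workload):

```shell
#!/bin/sh
# Sample the kernel's cumulative context-switch counter (the "ctxt"
# line in /proc/stat) over a fixed interval and print the per-second
# rate. Run this around a known test sequence to compare versions.
INTERVAL=${INTERVAL:-10}
before=$(awk '/^ctxt/ {print $2}' /proc/stat)
sleep "$INTERVAL"
after=$(awk '/^ctxt/ {print $2}' /proc/stat)
rate=$(( (after - before) / INTERVAL ))
echo "context switches/sec: $rate"
```

Because the counter is cumulative and monotonic, deltas over the same interval
are directly comparable between runs on the same host.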
I have a single Intel Xeon E5-2620 v3 @ 2.40GHz in each node; it has 6 cores
and 12 threads, which I thought would be enough for GlusterFS. When I check my
CPU graphs, everything is pretty much idle and there are hardly any peaks on
the CPU at all. During the period of very high context switches, my CPU graphs
show the following:

1 thread was 100% busy in CPU user
1 thread was 100% busy in CPU system

leaving the other 10 of the 12 threads unused... Are there any performance
tuning parameters I need to configure in order to make better use of my CPU
cores or threads?

-------- Original Message --------
Subject: Re: [Gluster-users] 120k context switches on GlsuterFS nodes
Local Time: May 18, 2017 7:03 AM
UTC Time: May 18, 2017 5:03 AM
From: ravishankar at redhat.com
To: Pranith Kumar Karampuri <pkarampu at redhat.com>, mabi <mabi at protonmail.ch>,
Gluster Users <gluster-users at gluster.org>, Gluster Devel <gluster-devel at gluster.org>

> Okay, that could be due to the syscalls themselves, or to pre-emptive
> multitasking if there aren't enough CPU cores. I think the spike in the
> numbers is due to more users accessing the files at the same time, as you
> observed, which translates into more syscalls. You can try capturing the
> gluster volume profile info the next time it occurs and correlating it
> with the cs count. If you don't see any negative performance impact, I
> don't think you need to be bothered much by the numbers.
>
> HTH,
> Ravi
>
> [...]
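For the one-hot-thread pattern described above, the knobs usually looked at in
this era of GlusterFS are the event-thread and io-thread options. A hedged
sketch only: "myvol" is a placeholder volume name, the values shown are guesses
to experiment with (not recommendations), and the option names and current
defaults should be checked with `gluster volume set help` on your 3.8.x build
before applying anything.

```shell
# First confirm which gluster thread is pegging a core
# (per-thread CPU view of the oldest brick daemon):
top -H -p "$(pgrep -o glusterfsd)"
# alternatively: pidstat -t -p "$(pgrep -o glusterfsd)" 5

# Then spread epoll and I/O work across more threads; "myvol" is a
# placeholder, and these values are illustrative, not tuned advice:
gluster volume set myvol server.event-threads 4
gluster volume set myvol client.event-threads 4
gluster volume set myvol performance.io-thread-count 32
```

If a single thread stays saturated even after raising the event-thread counts,
that points at a serialization point inside the process rather than a lack of
cores, which is worth raising separately on the list.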