Sander Zijlstra
2015-Apr-09 12:18 UTC
[Gluster-users] GlusterFS 3.6.2, volume from 4 to 8 bricks & CPU went sky high
LS,

We have a GlusterFS cluster which consists of 4 nodes with one brick each and a distributed-replicated volume of 72 TB.

Today I extended the cluster to 8 machines and added new bricks to the volume, so it now contains 8 bricks.

I didn't start the rebalance yet, to limit the impact during the day, but to my surprise all glusterfsd processes went sky high and performance was really, really bad. So effectively I caused downtime on our storage service, which I hadn't anticipated, since I hadn't done any rebalance yet.

Can somebody explain to me why adding bricks to a volume causes this high CPU usage? I can imagine the metadata needs to be synced, but if that is so heavy, why can't I tune it?

This is my current volume setup:

Volume Name: gv0
Type: Distributed-Replicate
Volume ID: 0322f20f-e507-492b-91db-cb4c953a24eb
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: s-s35-06:/glusterfs/bricks/brick1/brick
Brick2: s-s35-07:/glusterfs/bricks/brick1/brick
Brick3: s-s35-08:/glusterfs/bricks/brick1/brick
Brick4: s-s35-09:/glusterfs/bricks/brick1/brick
Brick5: v39-app-01:/glusterfs/bricks/brick1/gv0
Brick6: v39-app-02:/glusterfs/bricks/brick1/gv0
Brick7: v39-app-03:/glusterfs/bricks/brick1/gv0
Brick8: v39-app-04:/glusterfs/bricks/brick1/gv0
Options Reconfigured:
performance.cache-size: 256MB
nfs.disable: on
geo-replication.indexing: off
geo-replication.ignore-pid-check: on
changelog.changelog: on
performance.io-thread-count: 32
performance.write-behind-window-size: 5MB

Met vriendelijke groet / kind regards,

Sander Zijlstra

| Linux Engineer | SURFsara | Science Park 140 | 1098XG Amsterdam | T +31 (0)6 43 99 12 47 | sander.zijlstra at surfsara.nl | www.surfsara.nl |

Regular day off on Friday
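For reference, a minimal sketch of the commands such an expansion typically involves on GlusterFS 3.6, assuming the standard CLI was used; the exact invocation isn't shown in the message above, and the brick paths are copied from the volume info:

    # Add the four new bricks as two replica pairs; the replica count stays 2
    gluster volume add-brick gv0 replica 2 \
        v39-app-01:/glusterfs/bricks/brick1/gv0 v39-app-02:/glusterfs/bricks/brick1/gv0 \
        v39-app-03:/glusterfs/bricks/brick1/gv0 v39-app-04:/glusterfs/bricks/brick1/gv0

    # Until a rebalance runs, existing directory layouts don't include the new
    # bricks; a fix-layout alone lets new files land on them without moving data
    gluster volume rebalance gv0 fix-layout start

    # A full rebalance (deliberately not started yet in this case) also migrates
    # existing files onto the new bricks
    gluster volume rebalance gv0 start
    gluster volume rebalance gv0 status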
Jiri Hoogeveen
2015-Apr-09 13:15 UTC
[Gluster-users] GlusterFS 3.6.2, volume from 4 to 8 bricks & CPU went sky high
Hi Sander,

It sounds to me like adding the bricks triggered self-healing, which does a scan of the bricks. Depending on the number of files on a brick, that can use a lot of CPU.

Do the logs say anything useful?

Grtz, Jiri Hoogeveen

> On 09 Apr 2015, at 14:18, Sander Zijlstra <sander.zijlstra at surfsara.nl> wrote:
>
> Can somebody explain to me why adding bricks to a volume causes this high CPU usage? I can imagine the metadata needs to be synced, but if that is so heavy, why can't I tune it?
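A minimal sketch of how one could check whether the self-heal daemon is what is driving the CPU load, assuming default log locations; these commands are not taken from the thread itself:

    # Entries queued for or currently undergoing heal
    gluster volume heal gv0 info

    # Per-brick heal counters
    gluster volume heal gv0 statistics heal-count

    # The self-heal daemon and brick logs normally show crawl/heal activity
    tail -f /var/log/glusterfs/glustershd.log
    tail -f /var/log/glusterfs/bricks/*.log

    # If the heal crawl is overwhelming the servers, it can be paused temporarily;
    # pending heals resume once the daemon is switched back on
    gluster volume set gv0 cluster.self-heal-daemon off
    gluster volume set gv0 cluster.self-heal-daemon on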