Sander Zijlstra
2015-Apr-09 12:18 UTC
[Gluster-users] GlusterFS 3.6.2, volume from 4 to 8 bricks & CPU went sky high
LS,

We have a GlusterFS cluster which consists of 4 nodes with one brick each and a distributed-replicated volume of 72 TB.

Today I extended the cluster to 8 machines and added new bricks to the volume, so it now contains 8 bricks.

I didn't start the rebalance yet, to limit the impact during the day, but to my surprise all glusterfsd processes went sky high and performance was really, really bad. So effectively I caused downtime on our storage service, which I hadn't anticipated, since I hadn't done any rebalance yet.

Can somebody explain to me why adding bricks to a volume causes this high CPU usage? I can imagine the metadata needs to be synced, but if that is so heavy, why can't I tune it?

This is my current volume setup:

Volume Name: gv0
Type: Distributed-Replicate
Volume ID: 0322f20f-e507-492b-91db-cb4c953a24eb
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: s-s35-06:/glusterfs/bricks/brick1/brick
Brick2: s-s35-07:/glusterfs/bricks/brick1/brick
Brick3: s-s35-08:/glusterfs/bricks/brick1/brick
Brick4: s-s35-09:/glusterfs/bricks/brick1/brick
Brick5: v39-app-01:/glusterfs/bricks/brick1/gv0
Brick6: v39-app-02:/glusterfs/bricks/brick1/gv0
Brick7: v39-app-03:/glusterfs/bricks/brick1/gv0
Brick8: v39-app-04:/glusterfs/bricks/brick1/gv0
Options Reconfigured:
performance.cache-size: 256MB
nfs.disable: on
geo-replication.indexing: off
geo-replication.ignore-pid-check: on
changelog.changelog: on
performance.io-thread-count: 32
performance.write-behind-window-size: 5MB

Met vriendelijke groet / kind regards,

Sander Zijlstra

| Linux Engineer | SURFsara | Science Park 140 | 1098XG Amsterdam | T +31 (0)6 43 99 12 47 | sander.zijlstra at surfsara.nl | www.surfsara.nl |

Regular day off on Friday
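For reference, a minimal sketch of the commands such an expansion typically involves on GlusterFS 3.6, assuming the standard CLI was used; the exact invocation isn't shown in the message above, and the brick paths are copied from the volume info:

    # Add the four new bricks as two replica pairs; the replica count stays 2
    gluster volume add-brick gv0 replica 2 \
        v39-app-01:/glusterfs/bricks/brick1/gv0 v39-app-02:/glusterfs/bricks/brick1/gv0 \
        v39-app-03:/glusterfs/bricks/brick1/gv0 v39-app-04:/glusterfs/bricks/brick1/gv0

    # Until a rebalance runs, existing directory layouts don't include the new
    # bricks; a fix-layout alone lets new files land on them without moving data
    gluster volume rebalance gv0 fix-layout start

    # A full rebalance (deliberately not started yet in this case) also migrates
    # existing files onto the new bricks
    gluster volume rebalance gv0 start
    gluster volume rebalance gv0 status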
Jiri Hoogeveen
2015-Apr-09 13:15 UTC
[Gluster-users] GlusterFS 3.6.2, volume from 4 to 8 bricks & CPU went sky high
Hi Sander,

It sounds to me like adding the bricks triggered self-healing, which does a scan of the bricks. Depending on the number of files on a brick, that can use a lot of CPU.

Do the logs say anything useful?

Grtz, Jiri Hoogeveen

> On 09 Apr 2015, at 14:18, Sander Zijlstra <sander.zijlstra at surfsara.nl> wrote:
>
> Can somebody explain to me why adding bricks to a volume causes this high CPU usage? I can imagine the metadata needs to be synced, but if that is so heavy, why can't I tune it?
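A minimal sketch of how one could check whether the self-heal daemon is what is driving the CPU load, assuming default log locations; these commands are not taken from the thread itself:

    # Entries queued for or currently undergoing heal
    gluster volume heal gv0 info

    # Per-brick heal counters
    gluster volume heal gv0 statistics heal-count

    # The self-heal daemon and brick logs normally show crawl/heal activity
    tail -f /var/log/glusterfs/glustershd.log
    tail -f /var/log/glusterfs/bricks/*.log

    # If the heal crawl is overwhelming the servers, it can be paused temporarily;
    # pending heals resume once the daemon is switched back on
    gluster volume set gv0 cluster.self-heal-daemon off
    gluster volume set gv0 cluster.self-heal-daemon on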