Alberto Bengoa
2019-Feb-18 16:58 UTC
[Gluster-users] High network traffic with performance.readdir-ahead on
Hello folks,

We are working on a migration from Gluster 3.8 to 5.3. Because the upgrade path is long, we decided to install new servers running version 5.3 and then migrate the clients by updating them and pointing them at the new cluster. As a bonus, we keep a rollback option in case of problems.

We made our first migration attempt today and, unfortunately, had to roll back to the old cluster. Within the first few minutes of switching clients from the old to the new cluster, we noticed unusually high network traffic on the Gluster servers (around 320 Mbps) for that time of day. Near 08:05 (our first daily peak is at 8 AM) we reached almost 1 Gbps for several minutes, and the traffic stayed really high (over 800 Mbps) up to our second daily peak (at 9 AM), when we hit 1 Gbps again.

We decided to roll the main production servers back to the old cluster and kept some servers on the new one. The network traffic dropped back to around 300 Mbps. Talking with @nbalacha (thank you again, man!) on the IRC channel, he suggested disabling the performance.readdir-ahead option, and the traffic instantly dropped to around 10 Mbps. A graph showing all these events can be found here: https://pasteboard.co/I1JR7ck.png

So, the first question here: should performance.readdir-ahead be on by default? Maybe ours isn't the best use case, because we do have hundreds of thousands of directories, and the option seems to cause far more problems than benefits.

Another thing we noticed: when we point clients running the new Gluster version (5.3) at the old cluster (version 3.8), we also run into the high-traffic scenario, even with performance.readdir-ahead already switched to "off" (the default for that version). You can see the high traffic on the old cluster here: https://pasteboard.co/I1KdTUd.png . We are aware that running clients and servers on different versions isn't recommended; we are doing that for debugging/testing purposes only.

About our setup: we have a ~1.5 TB volume running in Replicated mode (2 servers in each cluster), with around 30 clients mounting the volume through fuse.glusterfs.

# gluster volume info of new cluster

Volume Name: X
Type: Replicate
Volume ID: 1d8f7d2d-bda6-4f1c-aa10-6ad29e0b7f5e
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: fs02tmp.x.net:/var/data/glusterfs/x/brick
Brick2: fs01tmp.x.net:/var/data/glusterfs/x/brick
Options Reconfigured:
performance.readdir-ahead: off
client.event-threads: 4
server.event-threads: 4
server.allow-insecure: on
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
performance.io-thread-count: 32
performance.cache-size: 1900MB
performance.write-behind-window-size: 16MB
performance.flush-behind: on
network.ping-timeout: 10

# gluster volume info of old cluster

Volume Name: X
Type: Replicate
Volume ID: 1bd3b5d8-b10f-4c4b-a28a-06ea4cfa1d89
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: fs1.x.net:/var/local/gfs
Brick2: fs2.x.net:/var/local/gfs
Options Reconfigured:
network.ping-timeout: 10
performance.cache-size: 512MB
server.allow-insecure: on
client.bind-insecure: on

I was able to collect a profile from the new cluster and pasted it here: https://pastebin.com/ffF8RVH4 . The sad part is that I was unable to reproduce the issue after re-enabling performance.readdir-ahead. I'm not sure the clients connected to the cluster at that point could generate a workload close to the one we had this morning.
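In case it helps anyone reproduce, these are the standard CLI commands for toggling the option and grabbing a profile (volume name shortened to X, as in the volume info above; this is a sketch of the procedure, not a verbatim transcript of our session).

To toggle the option on the fly, no remount needed:

# gluster volume set X performance.readdir-ahead off
# gluster volume set X performance.readdir-ahead on

To collect a profile like the one pasted above:

# gluster volume profile X start
(let the workload run for a while)
# gluster volume profile X info
# gluster volume profile X stop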
We'll try to recreate that condition soon. I can provide more info and run more tests if you need them.

Cheers,
Alberto Bengoa