Hey folks,

We've been running VMs on qemu using a replicated gluster volume, connecting via gfapi, and things have been going well for the most part. Something we've noticed, though, is that we have problems with many concurrent disk operations and disk latency. The latency gets bad enough that the qemu process eats the CPU and the entire machine stalls. Where we've seen it worst is an apache2 server under very high load, which had to be converted to a raw disk image due to performance issues. The hypervisors are connected directly to each other over a bonded pair of 10Gb fiber modules and host the only bricks in the volume. Volume info:

Volume Name: VMARRAY
Type: Replicate
Volume ID: 67b3ad79-4b48-4597-9433-47063f90a7a0
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.9.1.1:/mnt/xfs/VMARRAY
Brick2: 10.9.1.2:/mnt/xfs/VMARRAY
Options Reconfigured:
nfs.disable: on
network.ping-timeout: 7
cluster.eager-lock: on
performance.flush-behind: on
performance.write-behind: on
performance.write-behind-window-size: 4MB
performance.cache-size: 1GB
server.allow-insecure: on
diagnostics.client-log-level: ERROR

Any advice on performance tuning for high-IOPS, low-bandwidth workloads would be appreciated.

Thanks,

Josh

(Attachment: vda-day.png, http://supercolony.gluster.org/pipermail/gluster-users/attachments/20140320/a19c453c/attachment.png)
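P.S. For concreteness, a minimal sketch of how one of the guest disks is attached over gfapi, assuming qemu's gluster:// URI support; the image name, memory size, and flags are illustrative, not our exact command line:

    # Illustrative sketch: qemu's gfapi URI takes the form
    # gluster://<server>/<volname>/<image>; "web01.img" is a made-up name.
    qemu-system-x86_64 -enable-kvm -m 4096 \
        -drive file=gluster://10.9.1.1/VMARRAY/web01.img,if=virtio,format=raw,cache=none

    # Note: qemu connects from an unprivileged port, so in addition to the
    # volume-level server.allow-insecure option, glusterd.vol also needs:
    #   option rpc-auth-allow-insecure on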
Carlos Capriotti
2014-Mar-20 23:21 UTC
[Gluster-users] Optimizing Gluster (gfapi) for high IOPS
Well, if you want to join my tests, here are a couple of sysctl options:

net.core.wmem_max = 12582912
net.core.rmem_max = 12582912
net.ipv4.tcp_rmem = 10240 87380 12582912
net.ipv4.tcp_wmem = 10240 87380 12582912
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_sack = 1
vm.swappiness = 10
vm.dirty_background_ratio = 1
net.ipv4.neigh.default.gc_thresh2 = 2048
net.ipv4.neigh.default.gc_thresh3 = 4096
net.core.netdev_max_backlog = 2500
net.ipv4.tcp_mem = 12582912 12582912 12582912

On Fri, Mar 21, 2014 at 12:05 AM, Josh Boon <gluster at joshboon.com> wrote:

> [...]
>
> Any advice on performance tuning for high-IOPS, low-bandwidth workloads
> would be appreciated.
>
> Thanks,
>
> Josh
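To try these out, a minimal sketch of persisting and loading them, assuming a distro with /etc/sysctl.d support; the file name is arbitrary and only a representative subset of the options above is shown:

    # Persist a representative subset of the settings above. The path is an
    # assumption; any /etc/sysctl.d/*.conf works on most modern distros.
    cat > /etc/sysctl.d/99-gluster-net.conf <<'EOF'
    # Bigger TCP buffers for the back-to-back bonded 10GbE link
    net.core.wmem_max = 12582912
    net.core.rmem_max = 12582912
    net.ipv4.tcp_rmem = 10240 87380 12582912
    net.ipv4.tcp_wmem = 10240 87380 12582912
    # Flush dirty pages early and keep VM memory out of swap
    vm.swappiness = 10
    vm.dirty_background_ratio = 1
    EOF

    # Load everything under /etc/sysctl.d without a reboot
    sysctl --system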