Kolasinski, Brent D.
2014-Jun-30 17:44 UTC
[Gluster-users] Slow NFS performance with Replication
Hi all,

I have been experimenting with using Gluster as a VM storage backend on VMware ESXi. We are using Gluster NFS to share out storage to VMware ESXi. Our current setup includes 2 storage servers in a 1x2 replicated pool, each with approximately 16TB of storage shared via Gluster. The NFS servers are connected via 10Gbps NICs to the ESXi systems, and we've dedicated a cross-connected link for Gluster replication between the storage servers.

After some initial testing, we are only getting approximately 160-200MBps write speeds. If we drop a brick from the volume, so replication does not take place, we start seeing writes on the order of 500-600MBps. We would expect writes in the 500MBps range with replication turned on; however, we are seeing less than half of that over 10Gbps links. We also notice that write-heavy VMs start IO-waiting quite a bit with replication turned on.

We have increased thread counts with the performance.* variables (an illustrative sketch of that kind of tuning follows this message), but that has not improved our situation. When taking VMware out of the equation (by mounting directly with an NFS client on a different server), we see the same results.

Is this normal speed for 10Gbps interconnects with a replicate volume?

Here is our current gluster config. We are running gluster 3.5.0:

Volume Name: gvol0
Type: Replicate
Volume ID: e88afc1c-50d3-4e2e-b540-4c2979219d12
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: nfs0g:/data/brick0/gvol0
Brick2: nfs1g:/data/brick0/gvol0
Options Reconfigured:
nfs.disable: 0
network.ping-timeout: 3
nfs.drc: off

----------
Brent Kolasinski
Computer Systems Engineer

Argonne National Laboratory
Decision and Information Sciences
ARM Climate Research Facility
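For reference, the performance.* tuning mentioned above is done with the gluster volume set command. The option names below are standard Gluster tunables; the values are illustrative guesses for a 10GbE setup, not the settings actually used on gvol0:

  # Illustrative values only, not the poster's actual configuration
  gluster volume set gvol0 performance.io-thread-count 32
  gluster volume set gvol0 performance.write-behind-window-size 4MB
  gluster volume set gvol0 performance.cache-size 1GB

  # Confirm what is currently applied
  gluster volume info gvol0

With Gluster NFS, the client-side performance translators (write-behind, io-cache) run inside the NFS server process on the storage nodes, so options like these affect NFS clients as well as native glusterfs mounts.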
----- Original Message -----
> From: "Brent D. Kolasinski" <bkolasinski at anl.gov>
> To: gluster-users at gluster.org
> Sent: Monday, June 30, 2014 1:44:49 PM
> Subject: [Gluster-users] Slow NFS performance with Replication
>
> Hi all,
>
> I have been experimenting with using gluster as a VM storage backend on
> VMWare ESXi. We are using Gluster NFS to share out storage to VMware
> ESXi. Our current setup includes 2 storage servers, in a 1x2 replication
> pool, each with approximately 16TB of storage shared via gluster. The NFS
> servers are connected via 10Gbps NICs to the ESXi systems, and we've
> dedicated a cross connected link for gluster replication between the
> storage servers.
>
> After some initial testing, we are only getting approximately 160-200MBps
> on write speeds. If we drop a brick from the volume, so replication does
> not take place, we start seeing writes on the order of 500-600MBps. We
> would expect the writes to be in the 500MBps range with replication turned
> on, however we are seeing less than half of that over 10Gbps links.

For sequential single-threaded writes over NFS on my 10G setup I get:

# dd if=/dev/zero of=./test.txt bs=1024k count=1000 conv=sync
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 2.32838 s, 450 MB/s

For glusterfs mounts I get ~600 MB/sec.

With replication you cut your bandwidth in half, since each write goes to both bricks. Try following the recommendations in:

http://rhsummit.files.wordpress.com/2012/03/england-rhs-performance.pdf

That should get you closer to 450.

> We also notice that write heavy VMs start IO waiting quite a bit with
> replication turned on.
>
> We have increased thread counts with the performance.* variables, but that
> has not improved our situation. When taking VMWare out of the equation
> (by mounting directly with an NFS client on a different server), we see
> the same results.
>
> Is this normal speed for 10Gbps interconnects with a replicate volume?
>
> Here is our current gluster config. We are running gluster 3.5.0:
>
> Volume Name: gvol0
> Type: Replicate
> Volume ID: e88afc1c-50d3-4e2e-b540-4c2979219d12
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: nfs0g:/data/brick0/gvol0
> Brick2: nfs1g:/data/brick0/gvol0
> Options Reconfigured:
> nfs.disable: 0
> network.ping-timeout: 3
> nfs.drc: off
>
> ----------
> Brent Kolasinski
> Computer Systems Engineer
>
> Argonne National Laboratory
> Decision and Information Sciences
> ARM Climate Research Facility
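To make the comparison above reproducible, the same dd test can be run against both access paths from a Linux client. The mount points below are made up for illustration; the server and volume names come from the volume info in the thread, and Gluster's built-in NFS server exports the volume as /gvol0 over NFSv3 only:

  # Gluster NFS (NFSv3) mount
  mount -t nfs -o vers=3,proto=tcp nfs0g:/gvol0 /mnt/gvol0-nfs

  # Native FUSE mount of the same volume
  mount -t glusterfs nfs0g:/gvol0 /mnt/gvol0-fuse

  # Same sequential write test on each path
  dd if=/dev/zero of=/mnt/gvol0-nfs/test.img bs=1024k count=1000 conv=sync
  dd if=/dev/zero of=/mnt/gvol0-fuse/test.img bs=1024k count=1000 conv=sync

On the FUSE mount the client itself writes to both bricks, so its NIC bandwidth is split in two; on the NFS mount the storage server that receives the write fans it out, sending the second copy over the cross-connect link between nfs0g and nfs1g.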