Thorsten Schade
2017-Apr-03 14:33 UTC
[Gluster-users] Gluster Performance in an Ovirt Scenario.
I run a production oVirt cluster and am trying to understand a performance issue. For background, I started with oVirt 3.6 and Gluster 3.6, and the results have stayed roughly the same across versions.

My problem of understanding is this: when an oVirt server writes to 4 (of 6) nodes in a disperse scenario, the performance should be close to that of an NFS mount - but it isn't!

All machines (Gluster and oVirt) run CentOS 7, fully updated with the newest ML (mainline) kernel. The storage backbone is a 10 GbE network.

Gluster version 3.8.10 (6 node servers, 16 GB RAM, 4 CPUs)
oVirt version 4.1 (3 node servers, 128 GB RAM, 8 CPUs)

Test 1:

The Gluster cluster: 6 machines, each with one 4 TB WD Red 5400 rpm data disk.
Simple single-disk performance:
Write: 172 MB/s

I created a disperse volume in the supported 4 + 2 configuration and applied "group virt":

Volume Name: vol01
Type: Disperse
Volume ID: ebb831b9-d65d-4583-98d7-f0b262cf124a
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: vmw-lix-135:/data/brick1-1/brick01
Brick2: vmw-lix-136:/data/brick1-1/brick01
Brick3: vmw-lix-137:/data/brick1-1/brick01
Brick4: vmw-lix-138:/data/brick1-1/brick01
Brick5: vmw-lix-139:/data/brick1-1/brick01
Brick6: vmw-lix-134:/data/brick1-1/brick01
Options Reconfigured:
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: enable
performance.low-prio-threads: 32
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on

The Gluster volume has virtual machines running on it, but with very low usage.

Performance test: dd, 10 GB, reading to /dev/null and writing from /dev/zero, run on the oVirt node servers against the Gluster mount.

1 node, dd 10 GB, multiple runs:
write: 80-95 MB/s (slow)
read: 70-80 MB/s (a second read of the same dd file can reach up to 800 MB/s - cache?)

All 3 nodes running dd concurrently:
write: 80-90 MB/s (same as a single node - slow per node, about 240 MB/s aggregate into the Gluster volume)
read: 40-55 MB/s (poor)

My conclusion:
A single write runs at 80-90 MB/s, and a single read is slower at only 70 MB/s.
Concurrent writes behave like single writes, but concurrent reads are poor.

Test 2:

I suspected a problem in my network or with the servers, so I put all 6 hard disks into one server and created 2 partitions per 4 TB disk.

Then I prepared two storage domains for the oVirt cluster:
The first 6 disk partitions were combined with mdadm into a RAID 5 and mounted as an NFS data domain in oVirt.
The other 6 disk partitions became a 4+2 disperse volume.

The disperse Gluster volume performs as before:
write: 80 MB/s
read: 70 MB/s

But the NFS mount from the mdadm RAID:

single node dd:
write: 290 MB/s
read: 700 MB/s

3 nodes running dd concurrently against the NFS mount:
write: 125-140 MB/s (~400 MB/s aggregate to the mdadm array)
read: 400-700 MB/s (~1600 MB/s aggregate from the mdadm array, near 10 GbE network speed)

On the same server and the same disks, NFS has a real performance advantage!

The CPU was not a bottleneck during the Gluster runs; I watched it with htop while the tests were running.

Can someone explain why the Gluster volume comes nowhere near the performance of the NFS mount on the mdadm RAID 5, or of the 6-node Gluster test?

Thanks

Thorsten
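The commands that created the volume above are not quoted in the post; a minimal sketch that would produce a 4+2 disperse volume like this one (exact invocation assumed, not taken from the thread) might be:

    # create a dispersed volume with 4 data and 2 redundancy bricks
    gluster volume create vol01 disperse-data 4 redundancy 2 \
        vmw-lix-135:/data/brick1-1/brick01 vmw-lix-136:/data/brick1-1/brick01 \
        vmw-lix-137:/data/brick1-1/brick01 vmw-lix-138:/data/brick1-1/brick01 \
        vmw-lix-139:/data/brick1-1/brick01 vmw-lix-134:/data/brick1-1/brick01
    # apply the virt option group (shard, eager-lock, remote-dio, etc.)
    gluster volume set vol01 group virt
    gluster volume start vol01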
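The exact dd invocations are likewise not given; a minimal sketch of such a 10 GB write/read test against the Gluster mount, with the mount path, block size, and flush/cache-drop steps as assumptions, could look like:

    # write ~10 GB of zeros to the mounted volume, flushing before dd exits
    dd if=/dev/zero of=/mnt/vol01/ddtest bs=1M count=10240 conv=fdatasync
    # drop page caches so the read is not served from RAM, then read the file back
    echo 3 > /proc/sys/vm/drop_caches
    dd if=/mnt/vol01/ddtest of=/dev/null bs=1M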
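The Test 2 setup (mdadm RAID 5 exported over NFS) is only described in prose; a rough sketch, with device names, filesystem, and export options all assumed, could be:

    # assemble six partitions into a RAID 5 array (device names are examples)
    mdadm --create /dev/md0 --level=5 --raid-devices=6 /dev/sd[b-g]1
    mkfs.xfs /dev/md0
    mount /dev/md0 /export/nfsdata
    # export the filesystem for use as an oVirt NFS storage domain
    echo "/export/nfsdata *(rw,sync,no_root_squash)" >> /etc/exports
    exportfs -ra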
Darrell Budic
2017-Apr-03 18:58 UTC
[Gluster-users] Gluster Performance in an Ovirt Scenario.
You didn't list your mount type for test 1, but it sounds like you're NFS-mounting your storage. Is this a "standard" OS-level NFS server, or a Ganesha-based NFS server?

If you're using "normal" NFS, your nodes write to one of your Gluster servers over the NFS mount, and that Gluster server then writes the data out to all the other servers as needed before acknowledging the write as complete, limiting your total throughput. The same is true for reads: the server you're talking to marshals the response from all the servers before sending it along to the client. If you use Ganesha, it may be able to read from and write to all your Gluster servers directly, which should improve your performance.

Since you're using oVirt, I would recommend Gluster-mounted volumes instead of NFS mounts. Even with the FUSE mounts currently supported, I get better behavior, because the nodes then write to all the Gluster servers at the same time, which reduces the wait time on write completions and improves throughput over the NFS case. You're also ready for native libgfapi support when oVirt enables it, something I'm looking forward to myself.

I also got some performance improvement by setting higher values for server.event-threads and client.event-threads on my volumes. This is more setup- and load-dependent, so play around with it some.

-Darrell
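The event-thread options mentioned above are set per volume with the gluster CLI; a minimal sketch, using the poster's volume name and illustrative values rather than numbers recommended in the thread:

    # raise the number of event threads on the server and client side
    gluster volume set vol01 server.event-threads 4
    gluster volume set vol01 client.event-threads 4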