Lindsay Mathieson
2015-Oct-09 00:10 UTC
[Gluster-users] Test results and Performance Tuning efforts ...
Morning, hope the following ramble is ok; I'm just examining the results of some extensive (and destructive?) testing of gluster 3.6.4 on some disks I had spare. The cluster's purpose is solely hosting qemu VMs via Proxmox 3.4.

Setup: 3 nodes, well spec'd
- 64 GB RAM
- VNB & VNG
  * CPU: E5-2620
- VNA
  * CPUs: Dual E5-2660
- Already in use as a Proxmox and Ceph cluster running 30 Windows VMs

Gluster Bricks:
- All bricks on ZFS with 4 GB RAM ZIL, 1GB SSD SLOG and 10GB SSD cache
- LZ4 compression
- Sync disabled (rough ZFS commands for this are sketched just after the brick list)

Brick 1:
- 6 VelociRaptors in RAID10 (3 mirrors)
- High performance
- Already hosting 8 VMs

Bricks 2 & 3:
- Spare external USB 1TB Toshiba drive attached via USB3
- Crap performance - about 50/100 MB/s R/W
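For reference, a rough sketch of the ZFS side of the brick setup described above (LZ4 compression, sync disabled, dedicated SLOG and cache devices). The pool name "glusterdata" and the device paths are placeholders only - substitute your own layout:

# compression and sync behaviour on the pool/dataset backing the bricks
zfs set compression=lz4 glusterdata
zfs set sync=disabled glusterdata

# dedicated SLOG (log) and L2ARC (cache) SSD devices
zpool add glusterdata log /dev/disk/by-id/<slog-ssd>
zpool add glusterdata cache /dev/disk/by-id/<cache-ssd>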
Overall impressions: pretty good. Installation is easy, and now that I've been pointed to up-to-date docs and got the hang of the commands I'm happy with the administration - vastly simpler than Ceph. The ability to access the files on the native filesystem is good for peace of mind and enables some interesting benchmark comparisons. I simulated drive failure by killing all the gluster processes on a node and it seemed to cope ok.

I would like to see better status information, such as "Heal % progress" and "Rebalance % progress".

NB: Pulling a USB external drive is a *bad* idea, as it has no TLER support and this killed an entire node; I had to hard reset it. In production I would use something like WD Red NAS drives.

Despite all the abuse I threw at it I had no problems with split brain etc., and the integration with Proxmox is excellent. When running write tests I was very pleased to see it max out my bonded 2x1GbE connections, something Ceph has never been able to do. I consistently got 110+ MB/s raw write results inside VMs.

Currently running 4 VMs off the Gluster datastore with no issues.

Benchmark results, done using CrystalDiskMark inside a Windows 7 VM with VIRTIO drivers and writeback caching enabled. I tested a Gluster replica 3 setup, replica 1, and direct off the disk (ZFS). Multiple tests were run to get a feel for average results.

Node VNB
- Replica 3
- Local brick: external USB Toshiba drive
-----------------------------------------------------------------------
CrystalDiskMark 3.0.3 x64 (C) 2007-2013 hiyohiyo
Crystal Dew World : http://crystalmark.info/
-----------------------------------------------------------------------
* MB/s = 1,000,000 byte/s [SATA/300 = 300,000,000 byte/s]

Sequential Read          :  738.642 MB/s
Sequential Write         :  114.461 MB/s
Random Read 512KB        :  720.623 MB/s
Random Write 512KB       :  115.084 MB/s
Random Read 4KB (QD=1)   :    9.684 MB/s [  2364.3 IOPS]
Random Write 4KB (QD=1)  :    2.511 MB/s [   613.0 IOPS]
Random Read 4KB (QD=32)  :   24.264 MB/s [  5923.7 IOPS]
Random Write 4KB (QD=32) :    5.685 MB/s [  1387.8 IOPS]

Test : 1000 MB [C: 70.1% (44.8/63.9 GB)] (x5)
Date : 2015/10/09 9:30:37
OS   : Windows 7 Professional N SP1 [6.1 Build 7601] (x64)

Node VNA
- Replica 1 (so no writing over ethernet)
- Local brick: high-performance VelociRaptors in RAID10

Sequential Read          :  735.224 MB/s
Sequential Write         :  718.203 MB/s
Random Read 512KB        :  888.090 MB/s
Random Write 512KB       :  453.174 MB/s
Random Read 4KB (QD=1)   :   11.808 MB/s [  2882.9 IOPS]
Random Write 4KB (QD=1)  :    4.249 MB/s [  1037.4 IOPS]
Random Read 4KB (QD=32)  :   34.787 MB/s [  8492.8 IOPS]
Random Write 4KB (QD=32) :    5.487 MB/s [  1339.5 IOPS]

Node VNA
- Direct on ZFS (no Gluster)

Sequential Read          : 2841.216 MB/s
Sequential Write         : 1568.681 MB/s
Random Read 512KB        : 1753.746 MB/s
Random Write 512KB       : 1219.437 MB/s
Random Read 4KB (QD=1)   :   26.852 MB/s [  6555.6 IOPS]
Random Write 4KB (QD=1)  :   20.930 MB/s [  5109.8 IOPS]
Random Read 4KB (QD=32)  :   58.515 MB/s [ 14286.0 IOPS]
Random Write 4KB (QD=32) :   46.303 MB/s [ 11304.3 IOPS]

Performance:

Raw read performance is excellent, averaging 700+ MB/s - I'd say the ZFS and Gluster caches are working well.

As mentioned, raw write maxed out at 110 MB/s, near the maximum ethernet speed.

Random I/O is pretty average; it could be the Toshiba drives bringing things down, though even when I took them out of the equation it wasn't much improved.

Direct off the disk was more than double the replica 1 brick in all areas, but I don't find that surprising - I expected a fair amount of overhead with a cluster fs, and a 1-brick setup is not real-world usage. I was fairly impressed that adding two bricks for replica 3 made no real difference to the read results, and the write results were obviously limited by network speed. If only I could afford 10GbE cards and a switch ...

I would like to improve the IOPS. These are the current tunables I have set - any suggestions for improvements would be much appreciated:

Volume Name: datastore1
Type: Replicate
Volume ID: 3bda2eee-54de-4540-a556-2f5d045c033a
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: vna.proxmox.softlog:/zfs_vm/datastore1
Brick2: vnb.proxmox.softlog:/glusterdata/datastore1
Brick3: vng.proxmox.softlog:/glusterdata/datastore1
Options Reconfigured:
performance.io-thread-count: 32
performance.write-behind-window-size: 32MB
performance.cache-size: 1GB
performance.cache-refresh-timeout: 4
nfs.disable: on
nfs.addr-namelookup: off
nfs.enable-ino32: on
diagnostics.client-log-level: WARNING
diagnostics.brick-log-level: WARNING
performance.write-behind: on

Thanks,

Lindsay

Sent from Mail for Windows 10
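PS: for anyone wanting to reproduce the setup, the non-default options above were applied with the usual "gluster volume set" form, e.g.:

gluster volume set datastore1 performance.io-thread-count 32
gluster volume set datastore1 performance.cache-size 1GB
gluster volume info datastore1    # lists them under "Options Reconfigured"

and while a node is catching up after a simulated failure, "gluster volume heal datastore1 info" at least lists the entries still pending heal, even if there is no % progress figure.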
Ben Turner
2015-Oct-12 21:54 UTC
[Gluster-users] Test results and Performance Tuning efforts ...
----- Original Message -----
> From: "Lindsay Mathieson" <lindsay.mathieson at gmail.com>
> To: "gluster-users" <gluster-users at gluster.org>
> Sent: Thursday, October 8, 2015 8:10:09 PM
> Subject: [Gluster-users] Test results and Performance Tuning efforts ...
>
> [...]
> I would like to improve the IOPS - these are the current tunables I have
> set - any suggestions for improvements would be much appreciated:

Random IO has vastly improved with the multi-threaded (MT) epoll introduced in 3.7; try a test on 3.7 with server and client event threads set to 4. If you want to confirm this before you upgrade, run top -H during your testing and look for a hot thread (a single thread/CPU pegged at 100%). If you see that during your runs on 3.6, then the MT epoll implementation in 3.7 will definitely help you out.
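On 3.7 that would look something like this (using your volume name, with 4 event threads each on the client and server side):

gluster volume set datastore1 client.event-threads 4
gluster volume set datastore1 server.event-threads 4

And to watch for the hot thread while a benchmark is running:

# look for a single glusterfs/glusterfsd thread pinned at ~100% CPU
top -H -p $(pgrep -d, gluster)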
-b