Ben Turner
2015-Oct-12 21:54 UTC
[Gluster-users] Test results and Performance Tuning efforts ...
----- Original Message -----
> From: "Lindsay Mathieson" <lindsay.mathieson at gmail.com>
> To: "gluster-users" <gluster-users at gluster.org>
> Sent: Thursday, October 8, 2015 8:10:09 PM
> Subject: [Gluster-users] Test results and Performance Tuning efforts ...
>
> Morning, hope the following ramble is ok, just examining the results of some
> extensive (and destructive) testing of gluster 3.6.4 on some disks I had
> spare. The cluster's purpose is solely hosting qemu VMs via Proxmox 3.4.
>
> Setup: 3 nodes, well spec'd
> - 64 GB RAM
> - VNB & VNG
>   * CPU: E5-2620
> - VNA
>   * CPUs: dual E5-2660
> - Already in use as a Proxmox and Ceph cluster running 30 Windows VMs
>
> Gluster bricks:
> - All bricks on ZFS with 4 GB RAM ZIL, 1 GB SSD SLOG and 10 GB SSD cache
> - LZ4 compression
> - Sync disabled
>
> Brick 1:
> - 6 VelociRaptors in a RAID10 (3 mirrors)
> - High performance
> - Already hosting 8 VMs
>
> Bricks 2 & 3:
> - Spare external USB 1 TB Toshiba drive attached via USB3
> - Crap performance - about 50/100 MB/s R/W
>
> Overall impressions - pretty good. Installation is easy, and now that I've
> been pointed to up-to-date docs and got the hang of the commands, I'm happy
> with the administration - vastly simpler than Ceph. The ability to access
> the files on the native filesystem is good for peace of mind and enables
> some interesting benchmark comparisons. I simulated drive failure by killing
> all the gluster processes on a node and it seemed to cope ok.
>
> I would like to see better status information, such as "Heal % progress" and
> "Rebalance % progress".
>
> NB: Pulling a USB external drive is a *bad* idea as it has no TLER support,
> and this killed an entire node; I had to hard reset it. In production I
> would use something like WD Red NAS drives.
>
> Despite all the abuse I threw at it I had no problems with split brain etc.,
> and the integration with Proxmox is excellent. When running write tests I
> was very pleased to see it max out my bonded 2x1GbE connections, something
> Ceph has never been able to do. I consistently got 110+ MB/s raw write
> results inside VMs.
>
> Currently running 4 VMs off the Gluster datastore with no issues.
>
> Benchmark results - done using CrystalDiskMark inside a Windows 7 VM, with
> VirtIO drivers and writeback enabled. I tested a Gluster replica 3 setup,
> replica 1, and direct off the disk (ZFS). Multiple tests were run to get a
> feel for average results.
>
> Node VNB
> - Replica 3
> - Local brick: external USB Toshiba drive
> - -----------------------------------------------------------------------
> - CrystalDiskMark 3.0.3 x64 (C) 2007-2013 hiyohiyo
> - Crystal Dew World : http://crystalmark.info/
> - -----------------------------------------------------------------------
> - * MB/s = 1,000,000 byte/s [SATA/300 = 300,000,000 byte/s]
> -
> - Sequential Read : 738.642 MB/s
> - Sequential Write : 114.461 MB/s
> - Random Read 512KB : 720.623 MB/s
> - Random Write 512KB : 115.084 MB/s
> - Random Read 4KB (QD=1) : 9.684 MB/s [ 2364.3 IOPS]
> - Random Write 4KB (QD=1) : 2.511 MB/s [ 613.0 IOPS]
> - Random Read 4KB (QD=32) : 24.264 MB/s [ 5923.7 IOPS]
> - Random Write 4KB (QD=32) : 5.685 MB/s [ 1387.8 IOPS]
> -
> - Test : 1000 MB [C: 70.1% (44.8/63.9 GB)] (x5)
> - Date : 2015/10/09 9:30:37
> - OS : Windows 7 Professional N SP1 [6.1 Build 7601] (x64)
>
> Node VNA
> - Replica 1 (so no writing over ethernet)
> - Local brick: high-performance VelociRaptors in RAID10
> - Sequential Read : 735.224 MB/s
> - Sequential Write : 718.203 MB/s
> - Random Read 512KB : 888.090 MB/s
> - Random Write 512KB : 453.174 MB/s
> - Random Read 4KB (QD=1) : 11.808 MB/s [ 2882.9 IOPS]
> - Random Write 4KB (QD=1) : 4.249 MB/s [ 1037.4 IOPS]
> - Random Read 4KB (QD=32) : 34.787 MB/s [ 8492.8 IOPS]
> - Random Write 4KB (QD=32) : 5.487 MB/s [ 1339.5 IOPS]
>
> Node VNA
> - Direct on ZFS (no Gluster)
> - Sequential Read : 2841.216 MB/s
> - Sequential Write : 1568.681 MB/s
> - Random Read 512KB : 1753.746 MB/s
> - Random Write 512KB : 1219.437 MB/s
> - Random Read 4KB (QD=1) : 26.852 MB/s [ 6555.6 IOPS]
> - Random Write 4KB (QD=1) : 20.930 MB/s [ 5109.8 IOPS]
> - Random Read 4KB (QD=32) : 58.515 MB/s [ 14286.0 IOPS]
> - Random Write 4KB (QD=32) : 46.303 MB/s [ 11304.3 IOPS]
>
> Performance:
>
> Raw read performance is excellent, averaging 700 MB/s - I'd say the ZFS and
> Gluster caches are working well.
>
> As mentioned, raw write maxed out at 110 MB/s, near the maximum ethernet
> speed.
>
> Random I/O is pretty average; it could be the Toshiba drives bringing things
> down, though even when I took them out of the equation it wasn't much
> improved.
>
> Direct off the disk was more than double the replica 1 brick in all areas,
> but I don't find that surprising. I expected a fair amount of overhead with
> a cluster fs, and a 1-brick setup is not real-world usage. I was fairly
> impressed that adding two bricks for replica 3 made no real difference to
> the read results, and the write results were obviously limited by network
> speed. If only I could afford 10GbE cards and a switch ...
>
> I would like to improve the IOPS - these are the current tunables I have
> set - any suggestions for improvements would be much appreciated:

Random IO has vastly improved with the multi-threaded (MT) epoll introduced
in 3.7; try a test on 3.7 with server and client event threads set to 4. If
you want to confirm this before you upgrade, run top -H during your testing
and look for a hot thread (a single thread / CPU pegged at 100%). If you see
this during your runs on 3.6, then the MT epoll implementation in 3.7 will
definitely help you out.
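If it helps, here is a rough sketch of what I mean (untested here; the
event-thread options assume a 3.7 build, and datastore1 is just your volume
name from the info below - double check option names with
"gluster volume set help" on your build):

  # raise the epoll event threads on the volume (3.7+ options)
  gluster volume set datastore1 client.event-threads 4
  gluster volume set datastore1 server.event-threads 4

  # while a benchmark is running, watch per-thread CPU on the brick
  # processes; a single thread stuck at ~100% is the 3.6 bottleneck
  top -H -p $(pgrep -d, glusterfsd)

To watch the client side, do the same against the FUSE client process
(pgrep -x glusterfs, so it doesn't also match glusterfsd).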
-b

> Volume Name: datastore1
> Type: Replicate
> Volume ID: 3bda2eee-54de-4540-a556-2f5d045c033a
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: vna.proxmox.softlog:/zfs_vm/datastore1
> Brick2: vnb.proxmox.softlog:/glusterdata/datastore1
> Brick3: vng.proxmox.softlog:/glusterdata/datastore1
> Options Reconfigured:
> performance.io-thread-count: 32
> performance.write-behind-window-size: 32MB
> performance.cache-size: 1GB
> performance.cache-refresh-timeout: 4
> nfs.disable: on
> nfs.addr-namelookup: off
> nfs.enable-ino32: on
> diagnostics.client-log-level: WARNING
> diagnostics.brick-log-level: WARNING
> performance.write-behind: on
>
> Thanks,
>
> Lindsay
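PS - as an aside, all of those reconfigured options can be changed or put
back to their defaults on the live volume while you experiment; a quick
sketch (placeholder values, same caveats as above):

  # change a tunable
  gluster volume set datastore1 performance.cache-refresh-timeout 10

  # revert one to its default
  gluster volume reset datastore1 performance.write-behind-window-size

  # see what is currently reconfigured
  gluster volume info datastore1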
Lindsay Mathieson
2015-Oct-12 22:05 UTC
[Gluster-users] Test results and Performance Tuning efforts ...
On 13 October 2015 at 07:54, Ben Turner <bturner at redhat.com> wrote:
> Random IO has vastly improved with MT epoll introduced in 3.7, try a test
> on 3.7 with server and client event threads set to 4.

I'd like to try it, but I'm running wheezy - unless the jessie repo might
work with wheezy?

> If you want to confirm this before you upgrade run top -H during your
> testing and look for a hot thread (single thread / CPU pegged at 100%). If
> you see this during your runs on 3.6 then the MT epoll implementation in
> 3.7 will definitely help you out.

I'll check that out, thanks.

--
Lindsay
Lindsay Mathieson
2015-Oct-13 01:09 UTC
[Gluster-users] Test results and Performance Tuning efforts ...
On 13 October 2015 at 07:54, Ben Turner <bturner at redhat.com> wrote:
> Random IO has vastly improved with MT epoll introduced in 3.7, try a test
> on 3.7 with server and client event threads set to 4. If you want to
> confirm this before you upgrade run top -H during your testing and look
> for a hot thread (single thread / CPU pegged at 100%). If you see this
> during your runs on 3.6 then the MT epoll implementation in 3.7 will
> definitely help you out.

Not quite 100%, but it gets up to 90%.

Oddly, the feature list seems to show multi-threaded epoll as being done in
3.6
(http://www.gluster.org/community/documentation/index.php/Features/Feature_Smallfile_Perf#multi-thread-epoll).

It seems to be causing 3.7.4 to crash regularly at the moment too :(

--
Lindsay