Xie Changlong
2017-Jul-25 07:13 UTC
[Gluster-users] [Questions] About small files performance
Dear all,

Recently I did some work to test small-file performance over the gnfsv3 transport. The following is my scenario.

#####environment#####
== 2 cluster nodes (nodeA/nodeB), each equipped with 2 x E5-2650, 128GB memory and 2 x 10GbE NICs ==
nodeA: 10.254.3.77 10.128.3.77
nodeB: 10.254.3.78 10.128.3.78
== 2 stress nodes (clientA/clientB), each equipped with 2 x E5-2650, 128GB memory and 2 x 10GbE NICs ==
clientA: 10.254.3.75
clientB: 10.254.3.76
1) 10.254.3.* is the test segment; 10.128.3.* is for cluster-internal communication.

#####vdbench setup#####
hd=default,vdbench=/root/vdbench/,user=root,shell=ssh
#hd=hd1,system=10.254.3.xx
#hd=hd2,system=10.254.3.xx
fsd=fsd1,anchor=/mnt/smalltest1/smalltest/,depth=2,width=100,openflags=o_direct,files=100,size=64k,shared=yes
fwd=format,threads=256,xfersize=xxx
fwd=default,xfersize=xxx,fileio=random,fileselect=random,rdpct=60,threads=256
#fwd=fwd1,fsd=fsd1,host=hd1
#fwd=fwd2,fsd=fsd1,host=hd2
rd=rd1,fwd=fwd*,fwdrate=max,format=restart,elapsed=600,interval=1

1) *o_direct* is used to bypass the cache.
2) More than 256 threads showed no further effect in this test.
3) ~1 million 64k files in total (width=100 at depth=2 gives 10,000 leaf directories with 100 files each, i.e. ~64GB, which fits in tmpfs).

#####volume info#####
Volume Name: ttt
Type: Replicate
Volume ID: cf23b1fe-d430-4ede-b33b-b54a2c04d080
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.128.3.77:/gluster/brick-mm
Brick2: 10.128.3.78:/gluster/brick-mm
Options Reconfigured:
performance.nfs.stat-prefetch: off
performance.nfs.quick-read: off
performance.nfs.io-threads: on
client.event-threads: 32
server.event-threads: 32
features.shard: off
nfs.trusted-sync: on
performance.cache-size: 4000MB
performance.io-thread-count: 64
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: off

Note:
1) I put 10.128.3.*:/gluster/brick-mm on tmpfs, so disk I/O latency can be ignored.
2) The key option values above are based on my experience of what performs best.
3) The mount.nfs options are left at their defaults, since the default 1M rsize/wsize and 'async' are already the best choice; I dug into other options as well, with no significant performance difference.
4) I also set performance.cache-size to 30GB, but it made no difference.
5) The network bandwidth was never saturated in any of the tests.
6) I tried 'nfs.mem-factor' and 'rpc.outstanding-rpc-limit', but gained nothing.
7) The gluster version is 3.8.4.

First I collected some data with kernel NFS for comparison; the exported directory (rw,async,no_root_squash,no_all_squash) is also on tmpfs:

[testA] nfs.client: clientA  nfs.server: nodeA  xfersize=32k -> 25000 ops
[testB] nfs.client: clientA  nfs.server: nodeA  xfersize=4k  -> 100000 ops

Then I ran the gnfsv3 tests:

[testC] gnfs.client: clientA (mount nodeA)  gnfs.server: nodeA nodeB  xfersize=32k -> 10000 ops
[testD] gnfs.client: clientA (mount nodeA), clientB (mount nodeB)  gnfs.server: nodeA nodeB  xfersize=32k -> 10000 ops

Comparing testA with testB, a small xfersize achieves far more ops, and I got the same result with gnfs. Comparing testC with testD, there seems to be a *bottleneck* in the cluster: 10000 ops looks like a hard limit to me. Am I right? Moreover, I added more stress nodes and higher thread counts, but with little effect.

We can also learn something from testA vs testC: even if gnfs itself were as efficient as kernel NFS, gluster would still have lost 60% of the ops (25000 -> 10000). I know gluster is designed for large files, but I am a little greedy and have to ask: is there any way to improve small-file performance?

Any ideas and/or challenges to these tests would be appreciated. Thanks in advance! (I have sketched the setup commands in the P.S. below, in case anyone wants to reproduce the tests.)

--
Thanks
    -Xie
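P.S. For anyone who wants to reproduce the tests: the kernel-NFS baseline (testA/testB) was just a tmpfs directory exported by knfsd. A rough sketch follows; /export/test and the tmpfs size are made-up values here, while the export options are the ones quoted above:

    # on nodeA: export a tmpfs directory via the kernel NFS server
    mkdir -p /export/test
    mount -t tmpfs -o size=100g tmpfs /export/test
    echo '/export/test *(rw,async,no_root_squash,no_all_squash)' >> /etc/exports
    exportfs -ra

    # on clientA: mount with default options (1M rsize/wsize, async),
    # at the directory that the vdbench anchor /mnt/smalltest1/smalltest/ sits under
    mkdir -p /mnt/smalltest1
    mount -t nfs -o vers=3 10.254.3.77:/export/test /mnt/smalltest1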
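For the gnfs tests (testC/testD), the volume setup looked roughly like the sketch below. The volume name, brick paths and option values are the ones from 'volume info' above; the tmpfs size and the exact subset of 'volume set' calls shown are my shorthand, not the complete list:

    # on both nodes: put the brick on tmpfs so disk latency drops out
    mkdir -p /gluster
    mount -t tmpfs -o size=100g tmpfs /gluster
    mkdir -p /gluster/brick-mm

    # on one node: create and start the 1x2 replicate volume over 10.128.3.*
    gluster volume create ttt replica 2 \
        10.128.3.77:/gluster/brick-mm 10.128.3.78:/gluster/brick-mm force
    gluster volume start ttt

    # apply the reconfigured options, for example:
    gluster volume set ttt nfs.disable off
    gluster volume set ttt nfs.trusted-sync on
    gluster volume set ttt performance.cache-size 4000MB
    gluster volume set ttt performance.io-thread-count 64
    gluster volume set ttt client.event-threads 32
    gluster volume set ttt server.event-threads 32

    # on the stress nodes: mount the gnfs export over the test segment
    mount -t nfs -o vers=3 10.254.3.77:/ttt /mnt/smalltest1   # clientA
    mount -t nfs -o vers=3 10.254.3.78:/ttt /mnt/smalltest1   # clientB (testD)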
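Finally, vdbench itself was driven from clientA. For the two-client run (testD) I uncommented the hd=/fwd= lines with the real client addresses; shell=ssh in the hd=default line means passwordless root ssh to both stress nodes is assumed. The parameter file name and output directory below are made up:

    # in small.parm, uncomment and fill in for the two-client run:
    #   hd=hd1,system=10.254.3.75
    #   hd=hd2,system=10.254.3.76
    #   fwd=fwd1,fsd=fsd1,host=hd1
    #   fwd=fwd2,fsd=fsd1,host=hd2

    /root/vdbench/vdbench -f small.parm -o /root/vdbench/output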