Hi List,

I was wondering if anyone has implemented gluster successfully in AWS, and has some tips on streamlining the process to increase throughput and possibly reduce latency. (Sorry in advance if this list has seen this problem a lot.)

My current setup is as follows:

gfs-server1 - ap-southeast-2 (AZ1)
gfs-server2 - ap-southeast-2 (AZ2)
gfs-server3 - ap-southeast-1 (AZ1) (Arbiter)
web-server1 - apsoutheast2-az1 (Mounted as gluster/nfs to gfs-server1)
web-server2 - apsoutheast2-az2 (Mounted as gluster/nfs to gfs-server2)

Using the latest 3.7 package from the Ubuntu Launchpad PPA.

I have one server in each availability zone within Australia, with the arbiter volume over in Singapore. This will hopefully act as a fallback if there is ever a problem connecting internally between the two availability zones in the same region. That assumes a scenario where each gluster server can route externally but not internally.

This is for a webserver with a lot of WordPress + Magento installations, so it has a lot of files.

I mounted the gluster volume and started copying the files across, and it was terribly slow. (See below for data)[1]

My questions are as follows:

I see from the archives and FAQs that people have sped up copies by using xargs and running multiple threads per subfolder. While this is a good idea, is there any other way to increase throughput?

Also, I did a few tests against different mount points on NFS and GlusterFS to see what the difference was, and NFS knocks the glusterfs mount out of the park. Is there a specific reason for this?

Would removing the arbiter volume, or assuming for example's sake that there was a third availability zone in ap-southeast-2 so that latency was not an issue, increase my throughput? As the gluster client has to write the data to the 2 gluster volumes and the metadata to the arbiter, would this help in reducing the time per file?
(Also a non-gluster question that no-one has to answer: has anyone tried Amazon's Elastic File System (EFS), and is it comparable to gluster?)

Thank you for reading the wall of text, and I appreciate all the hard work everyone has put into this great product.

Cheers,
Tim

[1] Data:

    time cp -Rv wordpress/ /var/gluster-nfs/dir/wordpress/
    real 165m4.445s
    user 0m0.592s
    sys 0m3.227s

    du -shc wordpress/
    374M wordpress/

    find wordpress/ | wc -l
    4955

(It works out to be on average 2 seconds per file)

NFS DD Write:

    sudo dd if=/dev/zero of=./test bs=1024
    412738+0 records in
    412738+0 records out
    422643712 bytes (423 MB) copied, 85.4381 s, 4.9 MB/s

GlusterFS DD Write (1):

    sudo dd if=/dev/zero of=./testgf bs=1024k count=10000
    12+0 records in
    12+0 records out
    12582912 bytes (13 MB) copied, 117.974 s, 107 kB/s

GlusterFS DD Write (2):

    sudo dd if=/dev/zero of=./testgf1 bs=1024 count=10000
    10000+0 records in
    10000+0 records out
    10240000 bytes (10 MB) copied, 56.8728 s, 180 kB/s
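As a cross-check of the figures in [1], the per-file time and effective throughput can be recomputed from the pasted output. The Python below is only a small sketch: the constants are copied from the cp, du, find and dd output above, while the function and variable names are made up for illustration. Note that the three dd runs used different block sizes and counts, so their rates are only loosely comparable.

```python
# Figures copied from the command output in [1]; names are illustrative only.
COPY_SECONDS = 165 * 60 + 4.445   # "real 165m4.445s"
FILE_COUNT = 4955                 # find wordpress/ | wc -l
TOTAL_MB = 374                    # du -shc wordpress/ (treated here as 374 * 10^6 bytes)


def seconds_per_file(total_seconds, files):
    """Average wall-clock time spent per copied file."""
    return total_seconds / files


def throughput_kb_per_s(total_mb, total_seconds):
    """Effective copy throughput in kB/s (1 MB = 1000 kB)."""
    return total_mb * 1000 / total_seconds


def dd_rate_mb_per_s(bytes_copied, seconds):
    """Transfer rate of a dd run in MB/s (1 MB = 10^6 bytes, as dd reports)."""
    return bytes_copied / seconds / 1e6


if __name__ == "__main__":
    print("seconds per file:", round(seconds_per_file(COPY_SECONDS, FILE_COUNT), 2))
    print("cp throughput kB/s:", round(throughput_kb_per_s(TOTAL_MB, COPY_SECONDS), 1))
    print("NFS dd MB/s:", round(dd_rate_mb_per_s(422643712, 85.4381), 2))
    print("GlusterFS dd (2) MB/s:", round(dd_rate_mb_per_s(10240000, 56.8728), 2))
```

The per-file figure agrees with the "2 seconds per file" estimate, and it suggests the copy is dominated by per-file latency rather than raw bandwidth, since the files average well under 100 kB each.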
Ravishankar N
2015-Dec-09 08:48 UTC
[Gluster-users] AWS usage in a 3 replicator set with arbiter
On 12/09/2015 12:46 PM, Tim wrote:
> Hi List,
>
> I was wondering if anyone has implemented gluster successfully in AWS,
> and has some tips on streamlining the process to increase throughput
> and possibly reduce latency. (Sorry in advance if this list has seen
> this problem a lot)
>
> My current setup is as follows;
>
> gfs-server1 - ap-southeast-2 (AZ1)
> gfs-server2 - ap-southeast-2 (AZ2)
> gfs-server3 - ap-southeast-1 (AZ1) (Arbiter)
> web-server1 - apsoutheast2-az1 (Mounted as gluster/nfs to gfs-server1)
> web-server2 - apsoutheast2-az2 (Mounted as gluster/nfs to gfs-server2)
>
> Using latest 3.7 package from the Ubuntu launchpad ppa.
>
> I have one server in each availability zone within Australia with the
> arbiter volume over in Singapore. This will hopefully act as a fall
> back if ever there is a problem connecting internally between the two
> availability zones in the same region. Assuming each gluster server
> can router externally and not internally.
>

I think when you say volumes, you actually mean bricks, i.e. 2 bricks of the arbiter volume are in Australia and the 3rd brick is in Singapore. This is not really recommended. It would be better to locate all bricks (and clients too) of a volume in the same region (you could still use different availability zones in the same region). Gluster's replication module winds every write from the client to all bricks of the replica, so the closer they are, the faster it would be.

> This is for a webserver with a lot of wordpress + magento
> installations. So it has a lot of files.
>
> I mounted the gluster volume and started copying across the files and
> it was terribly slow. (See below for data)[1]
>
> My Questions are as follows:
> I see from the archives and FAQ's that people have sped up copies by
> using xargs and having multiple threads per sub folders. While this is
> a good idea, is there any other way to increase throughput?
> Also I did a few tests against different mount points on NFS and
> GlusterFS to see what the difference was, and NFS kicks the glusterfs
> mount out of the park. Is there a specific reason for this?

For FUSE mounts, replication happens from the client machine, while for NFS it happens from the server that was used for mounting the volume. This could be the reason, since the client is farther away while the servers (2 of them at least) are in the same region.

> Would removing the arbiter volume or assuming for example sake; that
> there was a third availability zone in ap-southeast-2 so latency was
> not an issue, increase my throughput? As the gluster-client has to
> write the data to the 2 gluster volumes and the meta-data to the
> arbiter would this help in reducing the time per file?

You could see if locating all 3 servers and the clients in the same region helps improve performance.

Regards,
Ravi

> (Also a non-gluster question that no-one has to answer, has anyone
> tried Amazons' Elastic File System (EFS) and is it comparable to gluster?)
>
> Thank you for reading the wall of text, and I appreciate all the hard
> work everyone has put into this great product.
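Ravi's point that every write is wound to all bricks of the replica can be illustrated with a very rough latency model. It assumes that a replicated operation cannot complete before the farthest brick has answered, so each synchronous round trip costs at least the largest round-trip time (RTT) among the bricks. The RTT values and the number of round trips per file below are hypothetical placeholders, not measurements from this setup, and the model ignores bandwidth, locking and caching entirely.

```python
def min_op_latency_ms(brick_rtts_ms):
    """Lower bound for one replicated operation: wait for the slowest brick."""
    return max(brick_rtts_ms)


def min_copy_time_s(files, round_trips_per_file, brick_rtts_ms):
    """Lower bound for copying `files` files one after another.

    `round_trips_per_file` is an assumed count of synchronous network
    round trips per file (lookup, create, write, close, ...).
    """
    return files * round_trips_per_file * min_op_latency_ms(brick_rtts_ms) / 1000.0


if __name__ == "__main__":
    # Hypothetical RTTs in milliseconds from the client to each brick.
    same_region = [1.0, 2.0, 2.0]      # all three bricks in one region
    cross_region = [1.0, 2.0, 100.0]   # third brick in another region

    files = 4955            # file count from the original post
    round_trips = 10        # assumed, not measured

    print("same region   (s):", min_copy_time_s(files, round_trips, same_region))
    print("cross region  (s):", min_copy_time_s(files, round_trips, cross_region))
```

Under these assumptions the remote brick alone sets the floor for a serial copy of many small files, which is consistent with the suggestion to keep all bricks and clients in one region. Copying several subfolders in parallel, as in the xargs approach mentioned in the original post, hides some of this latency but does not remove it.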