I have just finished my first steps with glusterfs. Realizing in
principle what I wanted to do (including installation from source) was
astonishingly easy; however, the performance is extremely poor. Thus,
I'd appreciate comments and suggestions on what to do/try next.

* Operating system is Ubuntu 8.04.1 (32bit on servers, 64bit on client)
* glusterfs is 1.3.12, compiled from source
* transport is 100MBit Ethernet
* on the client, I tried both the distribution-provided fuse kernel
  module and the one built from fuse-2.7.3glfs10.tar.gz (<rant>which is
  a pain in the neck, as one has to patch the source (*) to make it
  compile and include the new module in the initramfs; for some reason,
  the fuse module is one of the first modules loaded by Ubuntu</rant>)

(*) see http://www.nabble.com/Compiling-fuse-2.7-on-Ubuntu-Hardy-td18590177.html
    Is there any chance of getting your patches included upstream?

At the moment I have two dedicated servers and one test client. I have
set up client-side AFR, following the example in
http://www.gluster.org/docs/index.php/Setting_up_AFR_on_two_servers_with_client_side_replication
and I include the exact volume specs with which the benchmarks were run
at the end of this mail. In principle, this worked right away, and
after finally enabling extended attributes and POSIX ACLs (**) on the
underlying filesystems on the servers, self-healing seems to work as
well (at least in the little testing I had time for).

(**) It is not clear from the documentation that these are needed, but
without -o acl I get tons of errors/warnings in the server logs! (See
the fstab sketch below.)

The setup I have described is a first test for the eventual migration
of NFS-mounted home dirs to glusterfs in order to enhance data safety
and availability (hence AFR). My problem is that for two typical use
scenarios the performance is not acceptable. My two "testcases" are a
cp -a of a source tree, including a git repository, of approx. 160MB
from the local disk to the glusterfs filesystem ("CP-A"), and a make
run in this source tree (on the remote glusterfs filesystem) when
everything is up to date ("MAKE"); i.e., after make has checked all 800
or so source files, it tells you that there is nothing to do. I am
afraid my timings speak for themselves:

                   CP-A            MAKE
  local disk       < 5sec          < 0.3sec
  NFS (100MBit)    55sec +- 2sec   < 2sec
  glusterfs (I)    4m29sec         17sec
  glusterfs (II)   4m05sec         18sec

(I) unpatched fuse.ko, (II) patched fuse.ko; both over the same network
as NFS. Note that in (II) the modified kernel module was used with the
default (distribution-provided) libfuse library.

Measurements are fairly reproducible, i.e., variations are at most +- a
few seconds. As you can see from the volume specs below, I tried some
performance options, but the timings, if anything, got worse. Based on
the above timings, I also don't think that the patched fuse.ko is worth
the pain.

Some questions:

(1) Would server-side AFR improve things? The servers can/could talk
over dedicated GBit Ethernet with jumbo frames. Judging from the
howtos, server-side AFR is much more problematic concerning (redundant)
availability of the service, but I first *have* to get performance
somewhere near the NFS levels before there is any sense in continuing.

(2) What about this libboost thing? Since the documentation makes it
sound highly experimental, I didn't even try.

(3) I admit that the plethora of performance xlators has me utterly
confused; as mentioned, my ad hoc experiments didn't help at all.

So, any hints would be very much appreciated.
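In terms of commands, the two testcases boil down to something like
this (the paths here are illustrative, not the ones actually used):

  time cp -a ~/srctree /mnt/glusterfs/srctree    # "CP-A": ~160MB tree incl. .git
  time make -C /mnt/glusterfs/srctree            # "MAKE": tree already up to date

And to make (**) concrete: on the servers this means mounting the
backing filesystems with xattr and ACL support enabled. A sketch of an
fstab entry; the device name is made up and ext3 is an assumption:

  # /etc/fstab on each server: backing store for the storage/posix volume
  /dev/sdb1  /data/export  ext3  defaults,user_xattr,acl  0  2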
I should maybe add that the single-client scenario is not realistic;
the tests just reflect typical activities of my users and myself. The
setup eventually should serve 8-10 users with homedirs of 10-50GB; some
users often move GBs of data through their homedirs. This has not been
without pain using NFS, but performance was acceptable so far. So even
if I get my test scenario up to speed, would glusterfs be up to the
real task?

Thanks in advance for any help, suggestions, pointers.

Stefan Boresch

---------------------------------------------------------------------

cat /etc/glusterfs/glusterfs-server.vol

# the physical data space
##volume brick   # watch out
volume gfs
  type storage/posix
  option directory /data/export
end-volume

## the actual exported volume
#volume gfs
#  type performance/io-threads
#  option thread-count 8
#  option cache-size 64MB
#  subvolumes brick
#end-volume

# server declaration
volume server
  type protocol/server
  subvolumes gfs
  option transport-type tcp/server   # For TCP/IP transport
  option auth.ip.gfs.allow *
end-volume

========

cat /etc/glusterfs/glusterfs-client.vol

volume brick1
  type protocol/client
  option transport-type tcp/client   # for TCP/IP transport
  option remote-host x.y.z.a         # IP address of a
  option remote-subvolume gfs        # name of the remote volume on omega
end-volume

volume brick2
  type protocol/client
  option transport-type tcp/client   # for TCP/IP transport
  option remote-host x.y.z.b         # IP address of b
  option remote-subvolume gfs        # name of the remote volume on sigma
end-volume

volume afr
  type cluster/afr
  subvolumes brick1 brick2
end-volume

## performance block for cluster   # optional!
#volume writeback
#  type performance/write-behind
#  option aggregate-size 131072
#  subvolumes afr
#end-volume

## performance block for cluster   # optional!
#volume readahead
#  type performance/read-ahead
#  option page-size 65536
#  option page-count 16
#  subvolumes writeback
#end-volume

--
Stefan Boresch
Institute for Computational Biological Chemistry
University of Vienna, Waehringerstr. 17, A-1090 Vienna, Austria
Phone: -43-1-427752715   Fax: -43-1-427752790
Hello!

First: I'm no expert on GlusterFS, I just did some testing in the last
few weeks.

2008/9/17 Stefan Boresch <stefan at mdy.univie.ac.at>:
> I have just finished my first steps with glusterfs. Realizing in
> principle what I wanted to do (including installation from source)
> was astonishingly easy; however, the performance is extremely poor.
> Thus, I'd appreciate comments and suggestions on what to do/try next.

[...]

> The setup I have described is a first test for the eventual migration
> of NFS-mounted home dirs to glusterfs in order to enhance data safety
> and availability (hence AFR). My problem is that for two typical use
> scenarios the performance is not acceptable. My two "testcases" are a
> cp -a of a source tree, including a git repository, of approx. 160MB
> from the local disk to the glusterfs filesystem ("CP-A"), and a make
> run in this source tree (on the remote glusterfs filesystem) when
> everything is up to date ("MAKE"); i.e., after make has checked all
> 800 or so source files, it tells you that there is nothing to do. I
> am afraid my timings speak for themselves:
>
>                    CP-A            MAKE
>   local disk       < 5sec          < 0.3sec
>   NFS (100MBit)    55sec +- 2sec   < 2sec
>   glusterfs (I)    4m29sec         17sec
>   glusterfs (II)   4m05sec         18sec

160MB/5s   = 32 MB/s
160MB/55s  = 2.9 MB/s
160MB/250s = 655 kB/s

You're right, there is something wrong; that's very slow.

IIRC, GlusterFS has high latency when accessing files/metadata. The new
binary protocol to be introduced in the 1.4 release should improve that:
http://www.gluster.org/docs/index.php/GlusterFS_Roadmap#GlusterFS_1.4_-_Small_File_Performance
I'd suggest trying the test release of version 1.4, or waiting a month
or two for the new stable release and then trying again.

> (I) unpatched fuse.ko, (II) patched fuse.ko; both over the same
> network as NFS. Note that in (II) the modified kernel module was used
> with the default (distribution-provided) libfuse library.
>
> Measurements are fairly reproducible, i.e., variations are at most +-
> a few seconds. As you can see from the volume specs below, I tried
> some performance options, but the timings, if anything, got worse.
> Based on the above timings, I also don't think that the patched
> fuse.ko is worth the pain.
>
> Some questions:
>
> (1) Would server-side AFR improve things? The servers can/could talk
> over dedicated GBit Ethernet with jumbo frames. Judging from the
> howtos, server-side AFR is much more problematic concerning
> (redundant) availability of the service, but I first *have* to get
> performance somewhere near the NFS levels before there is any sense
> in continuing.

I'm not sure if the numbers are comparable. On write, client-side AFR
has to send the same data twice over the network, once to each of the
two servers. On a slow network that has to have a very big impact on
performance.

Would the setup described in
http://www.gluster.org/docs/index.php/NFS_Like_Standalone_Storage_Server
be comparable to your current NFS setup? Did you try anything like
that?

Other suggested reading:
http://www.gluster.org/docs/index.php/GlusterFS_1.3_P2P_Cluster_with_Auto_Healing
http://www.gluster.org/docs/index.php/AFR_(Automatic_File_Replication)_-_Things_to_keep_in_mind_and_gotchas
There might be some information there for what you are trying to do.

IMHO, doing server-side AFR with automatic IP failover and some sort of
load balancing (e.g. telling half of the clients to use one server and
the other half to use the second one) might be a setup better suited to
your network (slow connection to clients, fast connection between
servers) without compromising on redundancy.

> I should maybe add that the single-client scenario is not realistic;
> the tests just reflect typical activities of my users and myself. The
> setup eventually should serve 8-10 users with homedirs of 10-50GB;
> some users often move GBs of data through their homedirs. This has
> not been without pain using NFS, but performance was acceptable so
> far. So even if I get my test scenario up to speed, would glusterfs
> be up to the real task?

On a "real" network it should be:
http://www.gluster.org/docs/index.php/GlusterFS_1.2.1-BENKI_Aggregated_I/O_vs_NFSv4_Benchmark
:-)
I just can't imagine who can afford that kind of setup in a real-world
scenario...

Harald Stürzebecher
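To make the server-side idea concrete, here is a rough, untested sketch
of what the spec on one server might look like in the 1.3 series. The
volume names are made up, the hosts/paths are reused from Stefan's
specs, and server B would mirror this with the remote-host swapped;
treat it as an illustration, not a verified configuration:

  # glusterfs-server.vol on server A (hypothetical sketch)
  volume local
    type storage/posix
    option directory /data/export
  end-volume

  volume remote
    type protocol/client
    option transport-type tcp/client
    option remote-host x.y.z.b         # peer server, over the GBit link
    option remote-subvolume local
  end-volume

  volume afr
    type cluster/afr
    subvolumes local remote
  end-volume

  volume server
    type protocol/server
    option transport-type tcp/server
    option auth.ip.afr.allow *
    option auth.ip.local.allow *       # the peer mounts "local" directly
    subvolumes afr local
  end-volume

Clients would then mount "afr" from one of the servers through a single
protocol/client volume, so each write crosses the slow client link only
once; the second copy travels over the GBit link between the servers.
The IP failover itself would have to come from outside GlusterFS (e.g.
something like heartbeat).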
I had a similar problem, although not as severe as yours. Mine mostly
involved a very large directory which seemed to get sent back and forth
constantly.

1.4 pre5 is running now and it seems much improved. My guess is that
there are a number of improvements in 1.4 that will solve some of the
issues you're facing. I'd try to benchmark with that and see if your
results improve.

At 07:31 AM 9/17/2008, Harald Stürzebecher wrote:
>Hello!
>
>First: I'm no expert on GlusterFS, I just did some testing in the
>last few weeks.
[...]
> 1.4 pre5 is running now and it seems much improved. My guess is that
> there are a number of improvements in 1.4 that will solve some of the
> issues you're facing.
>
> I'd try to benchmark with that and see if your results improve.

Please wait for the new 1.4 pre release, which is currently undergoing
some cleanups; we expect good improvements in performance and
stability.

avati
Hi Stefan,

We have been trying to optimize GlusterFS on our network, and our
results match your previous results. After reading your post, I tried
removing AFR, but our performance is still poor compared to NFS
(particularly for small files). Could you please post your new volume
spec file and/or any other improvements that you made to any of the
layers? That would be very helpful.

Thanks and regards
Chandranshu

> Date: Thu, 18 Sep 2008 10:47:22 +0200
> From: stefan at mdy.univie.ac.at (Stefan Boresch)
> Subject: Re: [Gluster-users] Poor performance with AFR
> To: gluster-users at gluster.org
>
> Dear everyone,
>
> thank you for your replies. I have some additional data that have
> clarified things for me a bit: I have repeated my tests without AFR
> (basically replicating the plain NFS setup).
>
> > >                    CP-A            MAKE
> > >   local disk       < 5sec          < 0.3sec
> > >   NFS (100MBit)    55sec +- 2sec   < 2sec
> > >   glusterfs (I)    4m29sec         17sec
> > >   glusterfs (II)   4m05sec         18sec
>   glusterfs w/o AFR    45sec +- 2sec   9sec      <== NEW
>
> So, most of the poor performance is due to AFR. Note that the copy is
> actually now faster than over NFS. Interestingly, make still runs
> much slower (although compared to an actual compile, this overhead
> should be negligible in practice).
>
> Also, upon running some NFS benchmarks between the two servers, I
> noticed some strange results, letting me suspect creeping hardware
> issues.
>
> So, I guess I'll (a) wait for glusterfs 1.4.x and (b) look out for
> some better hardware to test with in the meantime.
>
> Sorry if I caused confusion; I should have checked some of these
> things earlier.
>
> Best regards,
>
> Stefan Boresch
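In case it helps others while we wait: here is what I assume the client
spec without AFR reduces to. The host and volume names are taken from
the original client spec; this is a sketch only, since the actual file
has not been posted:

  # glusterfs-client.vol without replication (sketch)
  volume brick1
    type protocol/client
    option transport-type tcp/client
    option remote-host x.y.z.a         # single server, no AFR
    option remote-subvolume gfs
  end-volume

With only this one volume in the file, the client mounts brick1
directly, which should match the "plain NFS setup" described above.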