Ernie Dunbar
2015-Apr-23 17:54 UTC
[Gluster-users] Disastrous performance with rsync to mounted Gluster volume.
Hello everyone.

I've built a replicated Gluster cluster (volume info shown below) of two Dell servers on a gigabit switch, plus a second NIC on each server for replication traffic. But when I try to copy our mail store from our backup server onto the Gluster volume, I've been having nothing but trouble.

I may have messed this up right at the start: I used rsync to copy all the files straight to the Linux filesystem on the primary Gluster server, instead of copying them to an NFS or Gluster mount. Getting Gluster to synchronize those files to the second Gluster server hasn't worked out well at all; only about half the data has actually been copied to the second server. Attempts to force Gluster to synchronize the rest have all failed (Gluster appears to think the data is already synchronized). This might still be the best way of accomplishing the copy in the end, but in the meantime I've tried a different tack.

Now I'm mounting the Gluster volume over the network from the backup server, using NFS (the backup server doesn't and can't have a compatible version of GlusterFS on it; I plan to nuke it and install an OS that does support it, but first we have to get this mail store copied over!). Then I use rsync to copy only the missing files to the NFS mount and let Gluster do its own replication. This has been many, many times slower than just using rsync to copy the files directly, even allowing for the amount of data (439 GB).

CPU usage on the Gluster servers is fairly high, with a load average of about 4 on an 8-CPU system. Network usage is... well, not that high, maybe topping out around 50-70 Mbps. The same story holds whether I look at the primary, server-facing network or the secondary, Gluster-only network, so I don't think the bottleneck is there. Hard drive utilization peaks at around 40% but doesn't stay that high for long.

One possible clue may lie in Gluster's logs. I see millions of entries like this:

[2015-04-23 16:40:50.122007] I [afr-self-heal-entry.c:1909:afr_sh_entry_common_lookup_done] 0-gv2-replicate-0: <gfid:912eec51-89dc-40ea-9dfd-072404d306a2>/1355401127.H542717P24276.pop.lightspeed.ca:2,: Skipping entry self-heal because of gfid absence
[2015-04-23 16:40:50.123327] I [afr-self-heal-entry.c:1909:afr_sh_entry_common_lookup_done] 0-gv2-replicate-0: <gfid:912eec51-89dc-40ea-9dfd-072404d306a2>/1355413874.H20794P22730.pop.lightspeed.ca:2,: Skipping entry self-heal because of gfid absence
[2015-04-23 16:40:50.123705] I [afr-self-heal-entry.c:1909:afr_sh_entry_common_lookup_done] 0-gv2-replicate-0: <gfid:912eec51-89dc-40ea-9dfd-072404d306a2>/1355420013.H176322P3859.pop.lightspeed.ca:2,: Skipping entry self-heal because of gfid absence
[2015-04-23 16:40:50.124030] I [afr-self-heal-entry.c:1909:afr_sh_entry_common_lookup_done] 0-gv2-replicate-0: <gfid:912eec51-89dc-40ea-9dfd-072404d306a2>/1355429494.H263072P14676.pop.lightspeed.ca:2,: Skipping entry self-heal because of gfid absence
[2015-04-23 16:40:50.124423] I [afr-self-heal-entry.c:1909:afr_sh_entry_common_lookup_done] 0-gv2-replicate-0: <gfid:912eec51-89dc-40ea-9dfd-072404d306a2>/1355436426.H973617P29804.pop.lightspeed.ca:2,: Skipping entry self-heal because of gfid absence

These logs grow so fast that I have to truncate them every hour, or the /var partition fills up within a couple of days.
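A stopgap for that log growth, rather than truncating the files by hand every hour, could be a logrotate rule with copytruncate. This is only a sketch: the paths are GlusterFS's usual log locations, but the size limit, retention count, and the assumption that logrotate is run from an hourly cron entry are guesses to adjust.

# Hypothetical drop-in; size and retention values are assumptions
cat > /etc/logrotate.d/glusterfs <<'EOF'
/var/log/glusterfs/*.log /var/log/glusterfs/bricks/*.log {
    size 100M        # rotate any log past ~100 MB, checked each time logrotate runs
    rotate 24
    compress
    missingok
    notifempty
    copytruncate     # truncate in place so the gluster daemons keep writing to their open handles
}
EOF
# logrotate itself must then run often enough, e.g. from /etc/cron.hourly

Run hourly, that would keep /var bounded until the underlying self-heal noise is dealt with.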
And finally, I have the gluster volume info:

root@nfs1:/brick1/gv2/www3# gluster vol info gv2

Volume Name: gv2
Type: Replicate
Volume ID: fb06a044-7871-4362-b134-fb97433f89f7
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: nfs1:/brick1/gv2
Brick2: nfs2:/brick1/gv2
Options Reconfigured:
nfs.disable: off

Any help removing myself from this mess would be greatly appreciated. :)
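For the half-synced data described above, a first diagnostic (a hedged sketch; the heal subcommands available depend on the GlusterFS release installed) would be to ask the self-heal daemon what it still considers pending and to kick off a full crawl from one of the servers:

# On nfs1 or nfs2, as root; assumes the volume heal subcommands in this GlusterFS release
gluster volume status gv2        # confirm both bricks and the self-heal daemon are online
gluster volume heal gv2 info     # list entries the self-heal daemon still considers unhealed
gluster volume heal gv2 full     # ask the self-heal daemon to crawl and heal the whole volume

If the "gfid absence" messages are accurate, files that were rsynced straight onto the brick may be missing the extended attributes Gluster needs, in which case the crawl will keep skipping them.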
Ben Turner
2015-Apr-23 18:25 UTC
[Gluster-users] Disastrous performance with rsync to mounted Gluster volume.
----- Original Message -----
> From: "Ernie Dunbar" <maillist@lightspeed.ca>
> To: "Gluster Users" <gluster-users@gluster.org>
> Sent: Thursday, April 23, 2015 1:54:35 PM
> Subject: [Gluster-users] Disastrous performance with rsync to mounted Gluster volume.
>
> Hello everyone.
>
> I've built a replicated Gluster cluster (volume info shown below) of two Dell servers on a gigabit switch, plus a second NIC on each server for replication traffic. But when I try to copy our mail store from our backup server onto the Gluster volume, I've been having nothing but trouble.
>
> I may have messed this up right at the start: I used rsync to copy all the files straight to the Linux filesystem on the primary Gluster server, instead of copying them to an NFS or Gluster mount. Getting Gluster to synchronize those files to the second Gluster server hasn't worked out well at all; only about half the data has actually been copied to the second server. Attempts to force Gluster to synchronize the rest have all failed (Gluster appears to think the data is already synchronized). This might still be the best way of accomplishing the copy in the end, but in the meantime I've tried a different tack.

Gluster writes to both bricks in the pair at the same time; bricks should never be out of sync when they are both online.

> Now I'm mounting the Gluster volume over the network from the backup server, using NFS (the backup server doesn't and can't have a compatible version of GlusterFS on it; I plan to nuke it and install an OS that does support it, but first we have to get this mail store copied over!). Then I use rsync to copy only the missing files to the NFS mount and let Gluster do its own replication. This has been many, many times slower than just using rsync to copy the files directly, even allowing for the amount of data (439 GB). CPU usage on the Gluster servers is fairly high, with a load average of about 4 on an 8-CPU system. Network usage is... well, not that high, maybe topping out around 50-70 Mbps. The same story holds whether I look at the primary, server-facing network or the secondary, Gluster-only network, so I don't think the bottleneck is there. Hard drive utilization peaks at around 40% but doesn't stay that high for long.

Hmm, I am confused: are you using kernel NFS or gluster NFS to copy the data over? The only way I can think of for a file to get onto gluster without a GFID is if it was put directly on the brick without going through glusterfs. Be sure not to do work on the backend bricks; always access the data through gluster. The logs below definitely indicate a problem.

Here is what I would do:

<create gluster volume>

On the server with the data:

mount -t nfs -o vers=3 <my gluster volume> /gluster-volume
cp -R <my data> <my gluster mount>

<go home, drink beer, come back the next day>

If you need to use rsync, I would look at the --whole-file option and/or forcing it to write in larger block sizes. The rsync workload is one of the worst I can think of for glusterfs: it does a ton of stat() calls and copies files in really small blocks, which creates tons of round trips for gluster. How much data are you copying? How many files? Which version of gluster are you running?
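Building on the --whole-file suggestion above, a hedged sketch of what the copy from the backup server might look like; the server name, mount point, and source path are placeholders, while --whole-file, --inplace, and --ignore-existing are standard rsync options whose usefulness depends on the data:

# On the backup server, against the Gluster NFS export -- hypothetical paths
mkdir -p /mnt/gv2
mount -t nfs -o vers=3 nfs1:/gv2 /mnt/gv2

rsync -a --whole-file --inplace --ignore-existing \
      /backup/mailstore/ /mnt/gv2/mailstore/
# --whole-file       skip the delta algorithm and send each file in full (fewer tiny writes)
# --inplace          write directly into the destination file instead of a temp file + rename
# --ignore-existing  only copy files missing on the Gluster side, as in the attempt described above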
To address "Maybe topping out around 50-70 Mbps": that is as fast as gigabit plus replica 2 will go from a single client, because each client writes to both bricks at the same time, which cuts your throughput in half. Gigabit theoretical is 125 MB/sec, and most gigabit NICs get ~120 MB/sec in the real world; half of that is ~60 MB/sec, which is what gluster + replica 2 will do from a single client.
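One way to test the no-GFID theory above (hedged: getfattr comes from the attr package, and the path below is a placeholder, not a real file name) is to pick one of the files named in the self-heal messages and inspect its extended attributes directly on each brick.

# On nfs1 and nfs2, as root, against the brick path -- read-only check, placeholder file name
getfattr -d -m . -e hex /brick1/gv2/<one-of-the-skipped-mail-files>
# A file created through a gluster mount should carry a trusted.gfid xattr
# (plus trusted.afr.* attributes on a replicated volume); a file rsynced straight
# onto the brick typically will not, which is consistent with "gfid absence".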