Harry Mangalam
2012-Aug-10 21:16 UTC
[Gluster-users] 1/4 glusterfsd's runs amok; performance suffers;
Running GlusterFS 3.3, distributed, over IPoIB on 4 nodes, 1 brick per node. Any idea why, on one of those nodes, glusterfsd would go berserk, running up to 370% CPU and driving the load to >30? File performance on the clients slows to a crawl, though the node continued to serve out files, just very slowly. This is the second time this has happened in about a week.

I had turned on the gluster NFS services, but wasn't using them when this happened; they're now off. kill -HUP did nothing to either glusterd or glusterfsd, so I had to kill both and restart glusterd. That cleared the overload on glusterfsd and performance is back to near normal. I'm now doing a rebalance/fix-layout, which is running as expected but will take the weekend to complete.

I did notice that the affected node (pbs3) has more files than the others, though I'm not sure that this is significant:

Filesystem       Size  Used  Avail  Use%  Mounted on
pbs1:/dev/sdb    6.4T  1.9T  4.6T   29%   /bducgl
pbs2:/dev/md0    8.2T  2.4T  5.9T   30%   /bducgl
pbs3:/dev/md127  8.2T  5.9T  2.3T   73%   /bducgl  <---
pbs4:/dev/sda    6.4T  1.8T  4.6T   29%   /bducgl

--
Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
[m/c 2225] / 92697  Google Voice Multiplexer: (949) 478-4487
415 South Circle View Dr, Irvine, CA, 92697 [shipping]
MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)
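[Editor's note: a minimal sketch of the recovery and fix-layout steps described above, assuming the volume is named "bducgl" (hypothetical; substitute your own volume name). Run on the affected server node.]

# HUP had no effect, so stop both daemons outright:
pkill glusterfsd
pkill glusterd

# Restart the management daemon; it respawns the brick process (glusterfsd):
service glusterd start        # or: /etc/init.d/glusterd start

# Recalculate the directory layout across the bricks, then watch progress:
gluster volume rebalance bducgl fix-layout start
gluster volume rebalance bducgl status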
Nux!
2012-Aug-11 11:11 UTC
[Gluster-users] 1/4 glusterfsd's runs amok; performance suffers;
On 10.08.2012 22:16, Harry Mangalam wrote:
> pbs3:/dev/md127  8.2T  5.9T  2.3T  73%  /bducgl  <---

Harry,

The name of that md device (127) indicates there may be something dodgy going on there. A device shouldn't be named 127 unless some problems have occurred. Are you sure your drives are OK?

--
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro
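[Editor's note: a quick way to act on Nux's suggestion and check the md array and member drives on pbs3; the md device name comes from the df listing above, the member-disk glob is a placeholder to adjust for your layout.]

cat /proc/mdstat              # array state, degraded/rebuild status
mdadm --detail /dev/md127     # per-member status, failed/spare counts

# SMART health of each member disk (requires smartmontools):
for d in /dev/sd?; do
    smartctl -H "$d"
done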
Joe Julian
2012-Aug-11 16:56 UTC
[Gluster-users] 1/4 glusterfsd's runs amok; performance suffers;
Check your client logs. I have seen that with network issues causing disconnects.

Harry Mangalam <hjmangalam at gmail.com> wrote:

> Thanks for your comments.
>
> I use mdadm on many servers and I've seen md numbering like this a fair
> bit. Usually it occurs after another RAID has been created and the
> numbering shifts. Neil Brown (mdadm's author) seems to think it's fine,
> so I don't think that's the problem. And you're right - this is a
> Frankengluster made from a variety of chassis and controllers, and normally
> it's fine. As Brian noted, it's all the same to gluster, modulo some small
> local differences in IO performance.
>
> Re the size difference, I'll explicitly rebalance the brick after the
> fix-layout finishes, but I'm even more worried about this fantastic
> increase in CPU usage and its effect on user performance.
>
> During the fix-layout (still running), I've seen CPU usage of glusterfsd
> rise to ~400% and the loadavg go above 15 on all the servers (except pbs3,
> the one that originally had the problem). That high load does not last
> long, though (maybe a few minutes). We've just installed Nagios on these
> nodes, and I'm getting a ton of emails about load increasing and then
> decreasing on all the nodes (except pbs3). When the load goes very high
> on a server node, the user-end performance drops appreciably.
>
> hjm
>
> On Sat, Aug 11, 2012 at 4:20 AM, Brian Candler <B.Candler at pobox.com> wrote:
>
>> On Sat, Aug 11, 2012 at 12:11:39PM +0100, Nux! wrote:
>> > On 10.08.2012 22:16, Harry Mangalam wrote:
>> > > pbs3:/dev/md127 8.2T 5.9T 2.3T 73% /bducgl <---
>> >
>> > Harry,
>> >
>> > The name of that md device (127) indicates there may be something
>> > dodgy going on there. A device shouldn't be named 127 unless some
>> > problems occurred. Are you sure your drives are OK?
>>
>> I have systems with /dev/md127 all the time, and there's no problem. It
>> seems to number downwards from /dev/md127 - if I create another md array
>> on the same system it is /dev/md126.
>>
>> However, this does suggest that the nodes are not configured identically:
>> two are /dev/sda or /dev/sdb, which suggests either plain disk or hardware
>> RAID, while two are /dev/md0 or /dev/md127, which is software RAID.
>>
>> Although this could explain performance differences between the nodes, it
>> is transparent to gluster and doesn't explain why the files are unevenly
>> balanced - unless there is one huge file which happens to have been
>> allocated to this node.
>>
>> Regards,
>>
>> Brian.
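[Editor's note: a hedged sketch of Joe's "check your client logs" suggestion. The FUSE client log normally lives under /var/log/glusterfs/ and is named after the mount point, but the exact file name and message wording vary by version, so treat both as assumptions.]

LOG=/var/log/glusterfs/bducgl.log    # hypothetical name for a /bducgl mount

# Look for disconnect/reconnect churn or ping timeouts on the client side:
grep -iE 'disconnect|connection.*(refused|reset)|ping timer expired' "$LOG" | tail -n 50

# Pair that with brick-side health on each server:
gluster volume status bducgl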