Krutika Dhananjay
2014-Dec-18 03:51 UTC
[Gluster-users] glusterfs and glusterfsd process utilization extremely high
----- Original Message -----
> From: "Kyle Harris" <kyle.harris98 at gmail.com>
> To: gluster-users at gluster.org
> Sent: Thursday, December 18, 2014 4:47:35 AM
> Subject: [Gluster-users] glusterfs and glusterfsd process utilization extremely high
>
> This is a continuation of a problem that I posted about last month and that I
> am still experiencing. The original post with more detail can be found at
> http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019587.html
> . To sum up my problem, I have a freshly created 3 node replicated cluster.
> It contains roughly 135 GB of files, many of which are small. It is home to
> several web sites hosted with Apache. I am using Gluster version 3.6.1-1
> installed from RPMs, mounted from the server via the Fuse client (I have
> tried NFS but it makes no difference).
>
> When I last posted about the problem of extreme processor utilization, the
> solution I was given by Parnith was to use a file system other than EXT4
> and to turn off cluster.entry-self-heal. I am now using XFS,
> cluster.entry-self-heal is turned off, and I even turned off
> cluster.self-heal-daemon, but it made absolutely no difference. All is fine
> while the cluster is being loaded via rsync; however, the minute I
> point Apache traffic at the sites hosted on the cluster, CPU utilization of
> glusterfs and glusterfsd begins to climb to levels so high that in a matter
> of minutes it is not even possible to log on to the system. No modifications
> have been made to any of the other Gluster settings.
>
> Any additional help resolving this matter would be greatly appreciated.

Hello,

First of all, do the logs suggest anything useful?

Could you perform the following steps while the I/O is going on (this is assuming the nodes are not thrashed to the extent that it is impossible to execute these commands):

1) On the shell, on one of the nodes in the cluster, execute `gluster volume profile <volname> start`.
   Wait for a minute or two, then execute `gluster volume profile <volname> info` and collect its output.
   Wait for another minute or so, execute `gluster volume profile <volname> info` again and collect that output too.
   Could you share both outputs? You can stop the profiling once you are done using `gluster volume profile <volname> stop`.
2) Assuming it is the brick processes (glusterfsd) that are showing high CPU utilisation, is it possible to get a core dump of the processes while this is happening?

-Krutika

> --
> Regards,
> Kyle
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
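The profiling steps in the message above could be scripted roughly as follows. This is only a sketch: the function name `collect_profile`, the default 90-second wait, and the output file names are assumptions made for illustration; only the `gluster volume profile <volname> start|info|stop` commands come from the thread.

```shell
#!/bin/sh
# Sketch: capture two "profile info" snapshots while the load is running.
# collect_profile, the wait time and the file names are illustrative
# assumptions; the gluster subcommands are the ones quoted in the thread.

collect_profile() {
    volname=$1            # volume to profile
    wait_secs=${2:-90}    # pause between steps ("a minute or two")
    outdir=${3:-.}        # where to write the two snapshots

    # Start profiling; give up if the volume cannot be profiled.
    gluster volume profile "$volname" start || return 1

    sleep "$wait_secs"
    gluster volume profile "$volname" info > "$outdir/profile-1.txt"

    sleep "$wait_secs"
    gluster volume profile "$volname" info > "$outdir/profile-2.txt"

    # Stop profiling once both snapshots are collected.
    gluster volume profile "$volname" stop
}
```

Calling `collect_profile <volname>` on one of the nodes while the Apache traffic is active would leave `profile-1.txt` and `profile-2.txt` in the current directory, ready to attach to a reply.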
Kyle Harris
2014-Dec-18 13:39 UTC
[Gluster-users] glusterfs and glusterfsd process utilization extremely high
Hi Krutika, and thank you for the quick response.

I think I found the problem, and it was hiding in the logs the whole time. However, I'm still glad I started this thread as it might help someone else, and I still have a question about it.

I discovered a lot of entries similar to the following in the gluster mount log:

12-18 02:41:23.557523] I [dht-common.c:1822:dht_lookup_cbk] 0-gv0-dht: Entry /html/some_site/some_folder/asdf.php missing on subvol gv0-replicate-0

Because this log entry appeared to be merely informational, I didn't pay much attention to it. However, I began to notice many of them for one particular site that is hosted on this cluster. I finally decided to remove that site temporarily from the cluster and, much to my surprise AND delight, the problem went away! After much research, it appears that requesting files that are not present on a gluster volume is an expensive operation in terms of resource utilization, and that was causing my problem.

Obviously the solution is to have the developers fix the issues on the site, but it does bring up another question. What happens when I have a site hosted on a gluster volume and a user or link points to an incorrect URL on that site, and thus to a file that doesn't exist? Obviously that would have to happen many times in order to be a problem, but on a busy site the potential exists for a denial of service.

So my new question is this: how can this be mitigated on the gluster side so that missing files do not cause such an issue?

Thank you again for any assistance.
Kyle

--
Kyle A. Harris
Kyle at TheHarrisHome.com
615-364-6752
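The log check Kyle describes, spotting which paths generate the most "missing on subvol" entries, could be sketched as below. The function name `count_missing_entries` is hypothetical, and the log file to pass in is an assumption (the client mount log usually lives under /var/log/glusterfs/, but check your own setup); the message format matched is the one quoted in the message above.

```shell
#!/bin/sh
# Sketch: count dht_lookup_cbk "missing on subvol" entries per path.
# count_missing_entries is a hypothetical helper; the log file location
# is an assumption and must be supplied by the caller.

count_missing_entries() {
    logfile=$1

    # Keep only the DHT lookup callback lines, extract the path between
    # "Entry " and " missing on subvol", then count and rank the paths.
    grep 'dht_lookup_cbk' "$logfile" \
        | sed -n 's/.*Entry \(.*\) missing on subvol.*/\1/p' \
        | sort \
        | uniq -c \
        | sort -rn
}
```

Running `count_missing_entries <mount-log-file> | head` lists the most frequently missed paths first, which makes a single misbehaving site stand out quickly.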