Benjamin Edgar
2016-Aug-23 18:18 UTC
[Gluster-users] Memory leak with a replica 3 arbiter 1 configuration
Hi Ravi,

I saw that you updated the patch today (at http://review.gluster.org/#/c/15289/). I built an RPM of the first iteration of the patch (just the one-line change in arbiter.c, "GF_FREE (ctx->iattbuf);") and am running it on some test servers now to see whether the memory of the arbiter brick still gets out of control.

Ben

On Tue, Aug 23, 2016 at 3:38 AM, Ravishankar N <ravishankar at redhat.com> wrote:
> Hi Benjamin,
>
> On 08/23/2016 06:41 AM, Benjamin Edgar wrote:
>> I've attached a statedump of the problem brick process. Let me know if
>> there are any other logs you need.
>
> Thanks for the report! I've sent a fix at http://review.gluster.org/#/c/15289/ .
> It would be nice if you can verify whether the patch fixes the issue for you.
>
> Thanks,
> Ravi
>
>> Thanks a lot,
>> Ben
>>
>> On Mon, Aug 22, 2016 at 5:03 PM, Pranith Kumar Karampuri
>> <pkarampu at redhat.com> wrote:
>>> Could you collect a statedump of the brick process by following:
>>> https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump
>>>
>>> That should help us identify which datatype is causing the leak so we can fix it.
>>>
>>> Thanks!
>>>
>>> On Tue, Aug 23, 2016 at 2:22 AM, Benjamin Edgar <benedgar8 at gmail.com> wrote:
>>>> Hi,
>>>>
>>>> I appear to have a memory leak with a replica 3 arbiter 1 configuration
>>>> of Gluster. I have a data brick and an arbiter brick on one server, and
>>>> the last data brick on another server. The more files I write to Gluster
>>>> in this configuration, the more memory the arbiter brick process uses.
>>>>
>>>> I can reproduce the issue by first setting up a replica 3 arbiter 1
>>>> configuration and then running the following bash script, which creates
>>>> 10,000 200 kB files, deletes them, and repeats forever:
>>>>
>>>>     while true ; do
>>>>         for i in {1..10000} ; do
>>>>             dd if=/dev/urandom bs=200K count=1 of=$TEST_FILES_DIR/file$i
>>>>         done
>>>>         rm -rf $TEST_FILES_DIR/*
>>>>     done
>>>>
>>>> $TEST_FILES_DIR is a location on my Gluster mount.
>>>>
>>>> After about 3 days of running this script on one of my clusters, the
>>>> output of "top" looks like this:
>>>>
>>>>       PID USER  PR  NI    VIRT     RES   SHR S %CPU %MEM     TIME+ COMMAND
>>>>     16039 root  20   0 1397220   77720  3948 S 20.6  1.0 860:01.53 glusterfsd
>>>>     13174 root  20   0 1395824  112728  3692 S 19.6  1.5 806:07.17 glusterfs
>>>>     19961 root  20   0 2967204  2.145g  3896 S 17.3 29.0 752:10.70 glusterfsd
>>>>
>>>> As you can see, one of the brick processes is using over 2 gigabytes of
>>>> memory.
>>>>
>>>> One work-around is to kill the arbiter brick process and restart the
>>>> gluster daemon. This restarts the arbiter brick process, and its memory
>>>> usage goes back down to a reasonable level. However, I would rather not
>>>> have to kill the arbiter brick every week in production environments.
>>>>
>>>> Has anyone seen this issue before, and is there a known work-around or fix?
>>>>
>>>> Thanks,
>>>> Ben
>>>
>>> --
>>> Pranith
>>
>> --
>> Benjamin Edgar
>> Computer Science
>> University of Virginia 2015
>> (571) 338-0878

--
Benjamin Edgar
Computer Science
University of Virginia 2015
(571) 338-0878
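For reference, Pranith's statedump request quoted above can be carried out roughly as follows. This is only a minimal sketch: the volume name "testvol" is hypothetical, the default statedump directory is assumed, and the dump file naming and section layout vary by GlusterFS version, so follow the linked documentation for the authoritative steps.

    # Minimal sketch, assuming a hypothetical volume name "testvol" and the
    # default statedump directory (/var/run/gluster); file names below are
    # illustrative and differ between GlusterFS versions.

    # Ask glusterd to dump the state of every brick process of the volume:
    gluster volume statedump testvol

    # Pick up the newest dump file (match it to the arbiter brick via the
    # PID reported by "gluster volume status"):
    DUMP=$(ls -t /var/run/gluster/*.dump.* 2>/dev/null | head -n 1)

    # The memory-accounting sections carry per-datatype allocation counters,
    # which is what points at the leaking datatype:
    grep -B2 num_allocs "$DUMP" | head -n 40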
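One way to watch the arbiter brick's resident memory over time, as described in the message above, is a loop along these lines. A rough sketch: the PID is a placeholder taken from the earlier "top" output, and the real one should be looked up with "gluster volume status".

    # Rough monitoring sketch; ARBITER_PID is a placeholder taken from the
    # "top" output above -- read the real PID from "gluster volume status".
    ARBITER_PID=19961

    while true ; do
        # Append a timestamp and the brick's resident set size (in kB)
        echo "$(date '+%F %T') $(ps -o rss= -p "$ARBITER_PID")" >> /tmp/arbiter-rss.log
        sleep 300
    done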
Benjamin Edgar
2016-Aug-23 20:42 UTC
[Gluster-users] Memory leak with a replica 3 arbiter 1 configuration
My test servers have been running for about 3 hours now (with the while loop constantly writing and deleting files), and the memory usage of the arbiter brick process has not increased in the past hour. Before the patch it was increasing constantly, so it looks like adding the "GF_FREE (ctx->iattbuf);" line in arbiter.c fixed the issue. If anything changes overnight I will post an update, but I believe the fix worked!

Once this patch makes it into the master branch, how long does it usually take to be released back to 3.8?

Thanks!
Ben

On Tue, Aug 23, 2016 at 2:18 PM, Benjamin Edgar <benedgar8 at gmail.com> wrote:
> Hi Ravi,
>
> I saw that you updated the patch today (at http://review.gluster.org/#/c/15289/).
> I built an RPM of the first iteration of the patch (just the one-line change in
> arbiter.c, "GF_FREE (ctx->iattbuf);") and am running it on some test servers now
> to see whether the memory of the arbiter brick still gets out of control.
>
> Ben

--
Benjamin Edgar
Computer Science
University of Virginia 2015
(571) 338-0878
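Until a release carrying the fix is available, the work-around described earlier in the thread (kill the arbiter brick process, then restart the gluster daemon so it respawns the brick) can be scripted roughly as follows. This sketch assumes systemd and a hypothetical volume name "testvol"; adapt the service manager and names to your environment.

    # Rough sketch of the work-around from the thread: kill the arbiter brick
    # process, then restart glusterd so it brings the brick back up with fresh
    # memory usage. "testvol" is a hypothetical volume name.

    gluster volume status testvol   # note the PID of the arbiter brick

    BRICK_PID=19961                 # placeholder; substitute the PID noted above
    kill "$BRICK_PID"

    systemctl restart glusterd      # glusterd restarts the killed brick process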