Oleksandr Natalenko
2016-Jan-20 09:08 UTC
[Gluster-users] [Gluster-devel] Memory leak in GlusterFS FUSE client
Yes, there are a couple of messages like this in my logs too (I guess one
message per each remount):

===
[2016-01-18 23:42:08.742447] I [fuse-bridge.c:3875:notify_kernel_loop] 0-
glusterfs-fuse: kernel notifier loop terminated
===

On Wednesday, 20 January 2016 09:51:23 EET Xavier Hernandez wrote:
> I'm seeing a similar problem with 3.7.6.
>
> This latest statedump contains a lot of gf_fuse_mt_invalidate_node_t
> objects in fuse. Looking at the code I see they are used to send
> invalidations to kernel fuse, however this is done in a separate thread
> that writes a log message when it exits. On the system where I'm seeing
> the memory leak, I can see that message in the log files:
>
> [2016-01-18 23:04:55.384873] I [fuse-bridge.c:3875:notify_kernel_loop]
> 0-glusterfs-fuse: kernel notifier loop terminated
>
> But the volume is still working at this moment, so any future inode
> invalidations will leak memory because it was this thread that should
> have released them.
>
> Can you check if you also see this message in the mount log?
>
> It seems that this thread terminates if write returns any error other
> than ENOENT. I'm not sure if there could be any other error that can
> cause this.
>
> Xavi
>
> On 20/01/16 00:13, Oleksandr Natalenko wrote:
> > Here are more RAM usage stats and a statedump of a GlusterFS mount
> > approaching yet another OOM:
> >
> > ===
> > root 32495 1.4 88.3 4943868 1697316 ? Ssl Jan13 129:18 /usr/sbin/
> > glusterfs --volfile-server=server.example.com --volfile-id=volume
> > /mnt/volume
> > ===
> >
> > https://gist.github.com/86198201c79e927b46bd
> >
> > 1.6G of RAM just for an almost idle mount (we occasionally store
> > Asterisk recordings there). Triple OOM for 69 days of uptime.
> >
> > Any thoughts?
> >
> > On Wednesday, 13 January 2016 16:26:59 EET Soumya Koduri wrote:
> > > kill -USR1
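To make the failure mode Xavi describes easier to follow, here is a deliberately simplified sketch of the pattern in C. It is not the actual fuse-bridge.c code: every name in it (struct fuse_client, inval_node, queue_invalidation and so on) is invented for the illustration. The point is only the shape of the bug: a single consumer thread is the only place queued invalidation requests are freed, and it exits on any write error other than ENOENT, after which producers keep allocating nodes that nobody releases.

    /* Deliberately simplified sketch of the pattern described above.
     * This is NOT the actual fuse-bridge.c code; every name here is
     * invented for the illustration. */
    #include <errno.h>
    #include <pthread.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    struct inval_node {
            struct inval_node *next;
            char               buf[64];   /* serialized invalidation request */
            size_t             len;
    };

    struct fuse_client {
            int                fuse_fd;   /* /dev/fuse descriptor */
            struct inval_node *queue;     /* filled by other threads */
            pthread_mutex_t    lock;
            pthread_cond_t     cond;
    };

    /* Producer side: keeps allocating and queuing nodes, with no idea
     * whether the consumer thread is still alive. */
    static void
    queue_invalidation (struct fuse_client *fc, const char *msg, size_t len)
    {
            struct inval_node *node = calloc (1, sizeof (*node));
            if (!node)
                    return;
            node->len = len < sizeof (node->buf) ? len : sizeof (node->buf);
            memcpy (node->buf, msg, node->len);

            pthread_mutex_lock (&fc->lock);
            node->next = fc->queue;
            fc->queue = node;
            pthread_cond_signal (&fc->cond);
            pthread_mutex_unlock (&fc->lock);
    }

    static struct inval_node *
    dequeue_invalidation (struct fuse_client *fc)
    {
            pthread_mutex_lock (&fc->lock);
            while (fc->queue == NULL)
                    pthread_cond_wait (&fc->cond, &fc->lock);
            struct inval_node *node = fc->queue;
            fc->queue = node->next;
            pthread_mutex_unlock (&fc->lock);
            return node;
    }

    /* Consumer side: the only place queued nodes are ever freed. */
    static void *
    notify_kernel_loop (void *arg)
    {
            struct fuse_client *fc = arg;

            for (;;) {
                    struct inval_node *node = dequeue_invalidation (fc);
                    ssize_t rv  = write (fc->fuse_fd, node->buf, node->len);
                    int     err = errno;
                    free (node);

                    /* Exiting on any error other than ENOENT is the suspected
                     * problem: once this loop returns, producers keep queuing
                     * nodes that nobody dequeues or frees, so memory grows. */
                    if (rv < 0 && err != ENOENT)
                            break;
            }
            /* the real code logs "kernel notifier loop terminated" here */
            return NULL;
    }

Whatever the real fix turns out to be, the sketch makes the two general hardening options visible: keep the loop alive across transient errors, or make sure producers stop queuing (and release their nodes) once the notifier is gone.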
Xavier Hernandez
2016-Jan-21 09:32 UTC
[Gluster-users] [Gluster-devel] Memory leak in GlusterFS FUSE client
If this message appears way before the volume is unmounted, can you try to
mount the volume manually with this command and repeat the tests?

glusterfs --fopen-keep-cache=off --volfile-server=<server> --volfile-id=/<volume> <mount point>

This will prevent invalidation requests from being sent to the kernel, so
there shouldn't be any memory leak even if the worker thread exits
prematurely.

If that solves the problem, we could try to determine the cause of the
premature exit and solve it.

Xavi

On 20/01/16 10:08, Oleksandr Natalenko wrote:
> Yes, there are a couple of messages like this in my logs too (I guess one
> message per each remount):
>
> ===
> [2016-01-18 23:42:08.742447] I [fuse-bridge.c:3875:notify_kernel_loop] 0-
> glusterfs-fuse: kernel notifier loop terminated
> ===
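As a rough illustration of why the workaround should help, here is a hypothetical guard on the producer side. It reuses the invented names from the sketch after the previous message, and the fopen_keep_cache parameter is likewise an assumption, standing in for whatever the real mount option toggles internally: if invalidation requests are never queued, the notifier thread's premature exit has nothing left to leak.

    /* Hypothetical guard, not actual GlusterFS code.  struct fuse_client and
     * queue_invalidation() are the invented names from the earlier sketch;
     * fopen_keep_cache is assumed to reflect the --fopen-keep-cache option. */
    static void
    maybe_queue_invalidation (struct fuse_client *fc, int fopen_keep_cache,
                              const char *msg, size_t len)
    {
            if (!fopen_keep_cache) {
                    /* The kernel was never told to keep cached data across
                     * opens, so there is nothing to invalidate: drop the
                     * request instead of queuing it.  Nothing accumulates,
                     * so a dead notifier thread cannot leak anything. */
                    return;
            }
            queue_invalidation (fc, msg, len);  /* normal path: notifier frees it */
    }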
Kaleb KEITHLEY
2016-Jan-21 22:29 UTC
[Gluster-users] [Gluster-devel] Memory leak in GlusterFS FUSE client
On 01/20/2016 04:08 AM, Oleksandr Natalenko wrote:
> Yes, there are a couple of messages like this in my logs too (I guess one
> message per each remount):
>
> ===
> [2016-01-18 23:42:08.742447] I [fuse-bridge.c:3875:notify_kernel_loop] 0-
> glusterfs-fuse: kernel notifier loop terminated
> ===

Bug reports and fixes for the master and release-3.7 branches are:

master)
https://bugzilla.redhat.com/show_bug.cgi?id=1288857
http://review.gluster.org/12886

release-3.7)
https://bugzilla.redhat.com/show_bug.cgi?id=1288922
http://review.gluster.org/12887

The release-3.7 fix will be in glusterfs-3.7.7 when it's released.

I think that even with the above fixes applied there are still some issues
remaining. I have submitted additional/revised fixes on top of the above
fixes at:

master: http://review.gluster.org/13274
release-3.7: http://review.gluster.org/13275

I invite you to review the patches in gerrit (review.gluster.org).

Regards,

--
Kaleb