Soumya Koduri
2016-Feb-01 11:28 UTC
[Gluster-users] [Gluster-devel] GlusterFS FUSE client leaks summary — part I
On 02/01/2016 02:48 PM, Xavier Hernandez wrote:
> Hi,
>
> On 01/02/16 09:54, Soumya Koduri wrote:
>>
>> On 02/01/2016 01:39 PM, Oleksandr Natalenko wrote:
>>> Wait. It seems to be my bad.
>>>
>>> Before unmounting I do drop_caches (2), and the glusterfs process CPU
>>> usage goes to 100% for a while. I hadn't waited for it to drop to 0%
>>> and instead performed the unmount. It seems glusterfs is purging
>>> inodes, and that's why it uses 100% of the CPU. I've re-tested it,
>>> waiting for CPU usage to return to normal, and got no leaks.
>>>
>>> Will verify this once again and report more.
>>>
>>> BTW, if that works, how could I limit the inode cache for the FUSE
>>> client? I do not want it to go beyond 1G, for example, even if I have
>>> 48G of RAM on my server.
>>
>> It's hard-coded for now. For FUSE, the lru limit (of the inodes which
>> are not active) is (32*1024).
>
> This is not exact for the current implementation. The inode memory pool
> is configured with 32*1024 entries, but the lru limit is set to
> infinite: currently inode_table_prune() takes lru_limit == 0 as
> infinite, and the inode table created by fuse is initialized with 0.
>
> Anyway, this should not be a big problem in normal conditions. After
> having fixed the incorrect nlookup count for "." and ".." directory
> entries, when the kernel detects memory pressure and sends inode
> forgets, the memory will be released.
>
>> One of the ways to address this (which we were discussing earlier) is
>> to have an option to configure the inode cache limit.
>
> I think this will need more thinking. I've made a quick test forcing
> lru_limit to a small value, and weird errors appeared (probably from
> inodes being expected to exist when the kernel sends new requests).
> Anyway, I haven't spent time on this. I haven't tested it on master
> either.

Oh okay. Thanks for checking.

-Soumya

> Xavi
>
>> If that sounds good, we can then check on whether it has to be
>> global/volume-level, client/server/both.
>>
>> Thanks,
>> Soumya
>>
>>> 01.02.2016 09:54, Soumya Koduri wrote:
>>>> On 01/31/2016 03:05 PM, Oleksandr Natalenko wrote:
>>>>> Unfortunately, this patch doesn't help.
>>>>>
>>>>> RAM usage when "find" finishes is ~9G.
>>>>>
>>>>> Here is the statedump before drop_caches:
>>>>> https://gist.github.com/fc1647de0982ab447e20
>>>>
>>>> [mount/fuse.fuse - usage-type gf_common_mt_inode_ctx memusage]
>>>> size=706766688
>>>> num_allocs=2454051
>>>>
>>>>> And after drop_caches: https://gist.github.com/5eab63bc13f78787ed19
>>>>
>>>> [mount/fuse.fuse - usage-type gf_common_mt_inode_ctx memusage]
>>>> size=550996416
>>>> num_allocs=1913182
>>>>
>>>> There isn't a significant drop in inode contexts. One of the reasons
>>>> could be dentries holding a refcount on the inodes, which would
>>>> result in inodes not getting purged even after fuse_forget.
>>>>
>>>> pool-name=fuse:dentry_t
>>>> hot-count=32761
>>>>
>>>> If '32761' is the current active dentry count, it still doesn't seem
>>>> to match up to the inode count.
>>>>
>>>> Thanks,
>>>> Soumya
>>>>>
>>>>> And here is the Valgrind output:
>>>>> https://gist.github.com/2490aeac448320d98596
>>>>>
>>>>> On Saturday, 30 January 2016 22:56:37 EET Xavier Hernandez wrote:
>>>>>> There's another inode leak caused by an incorrect counting of
>>>>>> lookups on directory reads.
>>>>>>
>>>>>> Here's a patch that solves the problem for 3.7:
>>>>>> http://review.gluster.org/13324
>>>>>>
>>>>>> Hopefully with this patch the memory leaks should disappear.
>>>>>>
>>>>>> Xavi
>>>>>>
>>>>>> On 29.01.2016 19:09, Oleksandr Natalenko wrote:
>>>>>>> Here is an intermediate summary of the current memory leaks in the
>>>>>>> FUSE client investigation.
>>>>>>>
>>>>>>> I use the GlusterFS v3.7.6 release with the following patches:
>>>>>>>
>>>>>>> ===
>>>>>>> Kaleb S KEITHLEY (1):
>>>>>>>       fuse: use-after-free fix in fuse-bridge, revisited
>>>>>>>
>>>>>>> Pranith Kumar K (1):
>>>>>>>       mount/fuse: Fix use-after-free crash
>>>>>>>
>>>>>>> Soumya Koduri (3):
>>>>>>>       gfapi: Fix inode nlookup counts
>>>>>>>       inode: Retire the inodes from the lru list in inode_table_destroy
>>>>>>>       upcall: free the xdr* allocations
>>>>>>> ===
>>>>>>>
>>>>>>> With those patches we got the API leaks fixed (I hope; brief tests
>>>>>>> show that) and got rid of the "kernel notifier loop terminated"
>>>>>>> message. Nevertheless, the FUSE client still leaks.
>>>>>>>
>>>>>>> I have several test volumes with several million small files
>>>>>>> (100K–2M on average). I do 2 types of FUSE client testing:
>>>>>>>
>>>>>>> 1) find /mnt/volume -type d
>>>>>>> 2) rsync -av -H /mnt/source_volume/* /mnt/target_volume/
>>>>>>>
>>>>>>> And the most up-to-date results are shown below:
>>>>>>>
>>>>>>> === find /mnt/volume -type d ===
>>>>>>>
>>>>>>> Memory consumption: ~4G
>>>>>>> Statedump: https://gist.github.com/10cde83c63f1b4f1dd7a
>>>>>>> Valgrind: https://gist.github.com/097afb01ebb2c5e9e78d
>>>>>>>
>>>>>>> I guess it's fuse-bridge/fuse-resolve related.
>>>>>>>
>>>>>>> === rsync -av -H /mnt/source_volume/* /mnt/target_volume/ ===
>>>>>>>
>>>>>>> Memory consumption: ~3.3–4G
>>>>>>> Statedump (target volume): https://gist.github.com/31e43110eaa4da663435
>>>>>>> Valgrind (target volume): https://gist.github.com/f8e0151a6878cacc9b1a
>>>>>>>
>>>>>>> I guess it's DHT-related.
>>>>>>>
>>>>>>> Give me more patches to test :).

_______________________________________________
Gluster-devel mailing list
Gluster-devel at gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
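[Editor's note: Xavi's point above — that inode_table_prune() treats lru_limit == 0 as "infinite", which is how the FUSE inode table is initialized — can be illustrated with a simplified model. This is plain Python, not the actual GlusterFS C code; the class and method names are hypothetical and only mirror the behaviour described in the thread.]

```python
from collections import OrderedDict

class InodeTable:
    """Toy model of a GlusterFS-style inode table (not the real C code).

    An lru_limit of 0 is treated as "infinite": prune() never evicts,
    which mirrors why the FUSE inode table keeps growing until the
    kernel sends inode forgets under memory pressure.
    """
    def __init__(self, lru_limit):
        self.lru_limit = lru_limit
        self.lru = OrderedDict()  # inactive inodes, oldest first

    def add_inactive(self, ino):
        self.lru[ino] = True
        self.prune()

    def prune(self):
        # Mirrors the described inode_table_prune() behaviour:
        # a limit of 0 means "never prune".
        if self.lru_limit == 0:
            return 0
        purged = 0
        while len(self.lru) > self.lru_limit:
            self.lru.popitem(last=False)  # evict the oldest entry
            purged += 1
        return purged

# With lru_limit=0 (the FUSE default described above) nothing is evicted:
unbounded = InodeTable(lru_limit=0)
for i in range(100000):
    unbounded.add_inactive(i)
print(len(unbounded.lru))  # 100000 -- the cache grows without bound

# With a finite limit the table stays capped:
capped = InodeTable(lru_limit=1024)
for i in range(100000):
    capped.add_inactive(i)
print(len(capped.lru))     # 1024
```

This also hints at why a configurable limit needs care, as Xavi notes: evicting an inactive inode that the kernel still holds a reference to is exactly the kind of situation that produced the "weird errors" in his quick test.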
Serkan Çoban
2016-Feb-01 13:26 UTC
[Gluster-users] [Gluster-devel] GlusterFS FUSE client leaks summary — part I
Will those patches be available in 3.7.7?

On Mon, Feb 1, 2016 at 1:28 PM, Soumya Koduri <skoduri at redhat.com> wrote:
> [full quote of Soumya's message above snipped]

_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
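[Editor's note: a quick sanity check on the gf_common_mt_inode_ctx statedump figures quoted earlier in the thread. The numbers imply a fixed per-context allocation size, and the drop after drop_caches is modest, which matches Soumya's observation.]

```python
# Figures from the statedumps quoted earlier in the thread
# (mount/fuse.fuse, usage-type gf_common_mt_inode_ctx).
before_size, before_allocs = 706766688, 2454051  # before drop_caches
after_size,  after_allocs  = 550996416, 1913182  # after drop_caches

# Each inode context is a fixed-size allocation: exactly 288 bytes
# per context in both dumps.
print(before_size // before_allocs)  # 288
print(after_size // after_allocs)    # 288

# drop_caches released only ~22% of the contexts, consistent with
# "There isn't a significant drop in inode contexts."
drop = 1 - after_allocs / before_allocs
print(f"{drop:.1%}")                 # 22.0%
```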