Soumya Koduri
2016-Feb-01 11:28 UTC
[Gluster-users] [Gluster-devel] GlusterFS FUSE client leaks summary — part I
On 02/01/2016 02:48 PM, Xavier Hernandez wrote:
> Hi,
>
> On 01/02/16 09:54, Soumya Koduri wrote:
>>
>> On 02/01/2016 01:39 PM, Oleksandr Natalenko wrote:
>>> Wait. It seems to be my bad.
>>>
>>> Before unmounting I do drop_caches (2), and the glusterfs process CPU
>>> usage goes to 100% for a while. I hadn't waited for it to drop to 0%
>>> and instead performed the unmount. It seems glusterfs is purging
>>> inodes, and that's why it uses 100% of the CPU. I've re-tested it,
>>> waiting for CPU usage to return to normal, and got no leaks.
>>>
>>> Will verify this once again and report more.
>>>
>>> BTW, if that works, how could I limit the inode cache for the FUSE
>>> client? I do not want it to go beyond 1G, for example, even if I have
>>> 48G of RAM on my server.
>>
>> It's hard-coded for now. For FUSE, the lru limit (of the inodes which
>> are not active) is (32*1024).
>
> This is not exact for the current implementation. The inode memory pool
> is configured with 32*1024 entries, but the lru limit is set to
> infinite: currently inode_table_prune() takes lru_limit == 0 as
> infinite, and the inode table created by fuse is initialized with 0.
>
> Anyway, this should not be a big problem in normal conditions. After
> having fixed the incorrect nlookup count for "." and ".." directory
> entries, when the kernel detects memory pressure and sends inode
> forgets, the memory will be released.
>
>> One of the ways to address this (which we were discussing earlier) is
>> to have an option to configure the inode cache limit.
>
> I think this will need more thinking. I've made a quick test forcing
> lru_limit to a small value, and weird errors appeared (probably from
> inodes being expected to exist when the kernel sends new requests).
> Anyway, I haven't spent time on this. I haven't tested it on master
> either.

Oh okay. Thanks for checking.

-Soumya

> Xavi
>
>> If that sounds good, we can then check on whether it has to be
>> global/volume-level, client/server/both.
>>
>> Thanks,
>> Soumya
>>
>>> 01.02.2016 09:54, Soumya Koduri wrote:
>>>> On 01/31/2016 03:05 PM, Oleksandr Natalenko wrote:
>>>>> Unfortunately, this patch doesn't help.
>>>>>
>>>>> RAM usage when "find" finishes is ~9G.
>>>>>
>>>>> Here is the statedump before drop_caches:
>>>>> https://gist.github.com/fc1647de0982ab447e20
>>>>
>>>> [mount/fuse.fuse - usage-type gf_common_mt_inode_ctx memusage]
>>>> size=706766688
>>>> num_allocs=2454051
>>>>
>>>>> And after drop_caches: https://gist.github.com/5eab63bc13f78787ed19
>>>>
>>>> [mount/fuse.fuse - usage-type gf_common_mt_inode_ctx memusage]
>>>> size=550996416
>>>> num_allocs=1913182
>>>>
>>>> There isn't a significant drop in inode contexts. One of the reasons
>>>> could be dentries holding a refcount on the inodes, which would
>>>> result in inodes not getting purged even after fuse_forget.
>>>>
>>>> pool-name=fuse:dentry_t
>>>> hot-count=32761
>>>>
>>>> If '32761' is the current active dentry count, it still doesn't seem
>>>> to match up to the inode count.
>>>>
>>>> Thanks,
>>>> Soumya
>>>>>
>>>>> And here is the Valgrind output:
>>>>> https://gist.github.com/2490aeac448320d98596
>>>>>
>>>>> On Saturday, 30 January 2016 22:56:37 EET Xavier Hernandez wrote:
>>>>>> There's another inode leak caused by an incorrect counting of
>>>>>> lookups on directory reads.
>>>>>>
>>>>>> Here's a patch that solves the problem for 3.7:
>>>>>> http://review.gluster.org/13324
>>>>>>
>>>>>> Hopefully with this patch the memory leaks should disappear.
>>>>>>
>>>>>> Xavi
>>>>>>
>>>>>> On 29.01.2016 19:09, Oleksandr Natalenko wrote:
>>>>>>> Here is an intermediate summary of the current memory leaks in the
>>>>>>> FUSE client investigation.
>>>>>>>
>>>>>>> I use the GlusterFS v3.7.6 release with the following patches:
>>>>>>>
>>>>>>> ===
>>>>>>> Kaleb S KEITHLEY (1):
>>>>>>>       fuse: use-after-free fix in fuse-bridge, revisited
>>>>>>>
>>>>>>> Pranith Kumar K (1):
>>>>>>>       mount/fuse: Fix use-after-free crash
>>>>>>>
>>>>>>> Soumya Koduri (3):
>>>>>>>       gfapi: Fix inode nlookup counts
>>>>>>>       inode: Retire the inodes from the lru list in inode_table_destroy
>>>>>>>       upcall: free the xdr* allocations
>>>>>>> ===
>>>>>>>
>>>>>>> With those patches we got the API leaks fixed (I hope; brief tests
>>>>>>> show that) and got rid of the "kernel notifier loop terminated"
>>>>>>> message. Nevertheless, the FUSE client still leaks.
>>>>>>>
>>>>>>> I have several test volumes with several million small files
>>>>>>> (100K–2M on average). I do 2 types of FUSE client testing:
>>>>>>>
>>>>>>> 1) find /mnt/volume -type d
>>>>>>> 2) rsync -av -H /mnt/source_volume/* /mnt/target_volume/
>>>>>>>
>>>>>>> And the most up-to-date results are shown below:
>>>>>>>
>>>>>>> === find /mnt/volume -type d ===
>>>>>>>
>>>>>>> Memory consumption: ~4G
>>>>>>> Statedump: https://gist.github.com/10cde83c63f1b4f1dd7a
>>>>>>> Valgrind: https://gist.github.com/097afb01ebb2c5e9e78d
>>>>>>>
>>>>>>> I guess it's fuse-bridge/fuse-resolve related.
>>>>>>>
>>>>>>> === rsync -av -H /mnt/source_volume/* /mnt/target_volume/ ===
>>>>>>>
>>>>>>> Memory consumption: ~3.3–4G
>>>>>>> Statedump (target volume): https://gist.github.com/31e43110eaa4da663435
>>>>>>> Valgrind (target volume): https://gist.github.com/f8e0151a6878cacc9b1a
>>>>>>>
>>>>>>> I guess it's DHT-related.
>>>>>>>
>>>>>>> Give me more patches to test :).

_______________________________________________
Gluster-devel mailing list
Gluster-devel at gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
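[Editor's note: Xavi's point above — that inode_table_prune() treats lru_limit == 0 as "infinite", which is how the FUSE inode table is initialized — can be illustrated with a simplified model. This is plain Python, not the actual GlusterFS C code; the class and method names are hypothetical and only mirror the behaviour described in the thread.]

```python
from collections import OrderedDict

class InodeTable:
    """Toy model of a GlusterFS-style inode table (not the real C code).

    An lru_limit of 0 is treated as "infinite": prune() never evicts,
    which mirrors why the FUSE inode table keeps growing until the
    kernel sends inode forgets under memory pressure.
    """
    def __init__(self, lru_limit):
        self.lru_limit = lru_limit
        self.lru = OrderedDict()  # inactive inodes, oldest first

    def add_inactive(self, ino):
        self.lru[ino] = True
        self.prune()

    def prune(self):
        # Mirrors the described inode_table_prune() behaviour:
        # a limit of 0 means "never prune".
        if self.lru_limit == 0:
            return 0
        purged = 0
        while len(self.lru) > self.lru_limit:
            self.lru.popitem(last=False)  # evict the oldest entry
            purged += 1
        return purged

# With lru_limit=0 (the FUSE default described above) nothing is evicted:
unbounded = InodeTable(lru_limit=0)
for i in range(100000):
    unbounded.add_inactive(i)
print(len(unbounded.lru))  # 100000 -- the cache grows without bound

# With a finite limit the table stays capped:
capped = InodeTable(lru_limit=1024)
for i in range(100000):
    capped.add_inactive(i)
print(len(capped.lru))     # 1024
```

This also hints at why a configurable limit needs care, as Xavi notes: evicting an inactive inode that the kernel still holds a reference to is exactly the kind of situation that produced the "weird errors" in his quick test.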
Serkan Çoban
2016-Feb-01 13:26 UTC
[Gluster-users] [Gluster-devel] GlusterFS FUSE client leaks summary — part I
Will those patches be available in 3.7.7?

On Mon, Feb 1, 2016 at 1:28 PM, Soumya Koduri <skoduri at redhat.com> wrote:
> [full quote of Soumya's message above snipped]

_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
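[Editor's note: a quick sanity check on the gf_common_mt_inode_ctx statedump figures quoted earlier in the thread. The numbers imply a fixed per-context allocation size, and the drop after drop_caches is modest, which matches Soumya's observation.]

```python
# Figures from the statedumps quoted earlier in the thread
# (mount/fuse.fuse, usage-type gf_common_mt_inode_ctx).
before_size, before_allocs = 706766688, 2454051  # before drop_caches
after_size,  after_allocs  = 550996416, 1913182  # after drop_caches

# Each inode context is a fixed-size allocation: exactly 288 bytes
# per context in both dumps.
print(before_size // before_allocs)  # 288
print(after_size // after_allocs)    # 288

# drop_caches released only ~22% of the contexts, consistent with
# "There isn't a significant drop in inode contexts."
drop = 1 - after_allocs / before_allocs
print(f"{drop:.1%}")                 # 22.0%
```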