On 01/05/2016 05:56 PM, Oleksandr Natalenko wrote:
> Unfortunately, both patches didn't make any difference for me.
>
> I've patched 3.7.6 with both patches, recompiled and installed the patched
> GlusterFS package on the client side and mounted a volume with ~2M of files.
> Then I performed the usual tree traversal with a simple "find".
>
> Memory RES value went from ~130M at the moment of mounting to ~1.5G
> after traversing the volume for ~40 mins. The Valgrind log still shows lots
> of leaks. Here it is:
>
> https://gist.github.com/56906ca6e657c4ffa4a1

Looks like you had done a FUSE mount. The patches which I have pasted
below apply to gfapi/nfs-ganesha applications.

Also, to resolve the nfs-ganesha issue which I had mentioned below (in
case the Entries_HWMARK option gets changed), I have posted the fix below:
https://review.gerrithub.io/#/c/258687

Thanks,
Soumya

> Ideas?
>
> 05.01.2016 12:31, Soumya Koduri wrote:
>> I tried to debug the inode* related leaks and have seen some improvements
>> after applying the below patches when I ran the same test (but with a
>> smaller load). Could you please apply those patches & confirm the same?
>>
>> a) http://review.gluster.org/13125
>>
>> This will fix the inode & inode-ctx related leaks during unexport and at
>> program exit. Please check the valgrind output after applying the patch.
>> It should not list any inode-related memory as lost.
>>
>> b) http://review.gluster.org/13096
>>
>> The reason the change in Entries_HWMARK (in your earlier mail) didn't
>> have much effect is that the inode_nlookup count doesn't become zero
>> for those handles/inodes being closed by ganesha. Hence those inodes
>> get added to the inode lru list instead of the purge list, and are
>> forcefully purged only when the number of gfapi inode table entries
>> reaches its limit (which is 137012).
>>
>> This patch fixes those 'nlookup' counts. Please apply this patch,
>> reduce 'Entries_HWMARK' to a much lower value and check if it decreases
>> the memory consumed by the ganesha process while it is active.
>>
>> CACHEINODE {
>>     Entries_HWMark = 500;
>> }
>>
>> Note: I see an issue with nfs-ganesha during exit when the option
>> 'Entries_HWMARK' gets changed. This is not related to any of the above
>> patches (or rather Gluster) and I am currently debugging it.
>>
>> Thanks,
>> Soumya
>>
>> On 12/25/2015 11:34 PM, Oleksandr Natalenko wrote:
>>> 1. test with Cache_Size = 256 and Entries_HWMark = 4096
>>>
>>> Before find . -type f:
>>>
>>> root  3120  0.6 11.0  879120 208408 ? Ssl 17:39  0:00 /usr/bin/ganesha.nfsd
>>> -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT
>>>
>>> After:
>>>
>>> root  3120 11.4 24.3 1170076 458168 ? Ssl 17:39 13:39 /usr/bin/ganesha.nfsd
>>> -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT
>>>
>>> ~250M leak.
>>>
>>> 2. test with default values (after ganesha restart)
>>>
>>> Before:
>>>
>>> root 24937  1.3 10.4  875016 197808 ? Ssl 19:39  0:00 /usr/bin/ganesha.nfsd
>>> -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT
>>>
>>> After:
>>>
>>> root 24937  3.5 18.9 1022544 356340 ? Ssl 19:39  0:40 /usr/bin/ganesha.nfsd
>>> -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT
>>>
>>> ~159M leak.
>>>
>>> No reasonable correlation detected. The second test finished much faster
>>> than the first (I guess the server-side GlusterFS cache or the server
>>> kernel page cache is the cause).
>>>
>>> There are ~1.8M files on this test volume.
>>>
>>> On Friday, 25 December 2015 20:28:13 EET Soumya Koduri wrote:
>>>> On 12/24/2015 09:17 PM, Oleksandr Natalenko wrote:
>>>>> Another addition: it seems to be a GlusterFS API library memory leak,
>>>>> because NFS-Ganesha also consumes a huge amount of memory while doing
>>>>> an ordinary "find . -type f" via NFSv4.2 on a remote client. Here is
>>>>> the memory usage:
>>>>>
>>>>> ==
>>>>> root 5416 34.2 78.5 2047176 1480552 ? Ssl 12:02 117:54
>>>>> /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f
>>>>> /etc/ganesha/ganesha.conf -N NIV_EVENT
>>>>> ==
>>>>>
>>>>> 1.4G is too much for simple stat() :(.
>>>>>
>>>>> Ideas?
>>>>
>>>> nfs-ganesha also has a cache layer which can scale to millions of entries
>>>> depending on the number of files/directories being looked up. However,
>>>> there are parameters to tune it. So either try stat with few entries, or
>>>> add the below block to the nfs-ganesha.conf file, set low limits and
>>>> check the difference. That may help us narrow down how much memory is
>>>> actually consumed by core nfs-ganesha and gfAPI.
>>>>
>>>> CACHEINODE {
>>>>     Cache_Size(uint32, range 1 to UINT32_MAX, default 32633);      # cache size
>>>>     Entries_HWMark(uint32, range 1 to UINT32_MAX, default 100000); # max no. of entries in the cache
>>>> }
>>>>
>>>> Thanks,
>>>> Soumya
>>>>
>>>>> 24.12.2015 16:32, Oleksandr Natalenko wrote:
>>>>>> Still an actual issue for 3.7.6. Any suggestions?
>>>>>>
>>>>>> 24.09.2015 10:14, Oleksandr Natalenko wrote:
>>>>>>> In our GlusterFS deployment we've encountered something like a memory
>>>>>>> leak in the GlusterFS FUSE client.
>>>>>>>
>>>>>>> We use a replicated (x2) GlusterFS volume to store mail (exim+dovecot,
>>>>>>> maildir format). Here are the inode stats for both bricks and the
>>>>>>> mountpoint:
>>>>>>>
>>>>>>> ==
>>>>>>> Brick 1 (Server 1):
>>>>>>>
>>>>>>> Filesystem                         Inodes     IUsed     IFree      IUse% Mounted on
>>>>>>> /dev/mapper/vg_vd1_misc-lv08_mail  578768144  10954918  567813226  2%    /bricks/r6sdLV08_vd1_mail
>>>>>>>
>>>>>>> Brick 2 (Server 2):
>>>>>>>
>>>>>>> Filesystem                         Inodes     IUsed     IFree      IUse% Mounted on
>>>>>>> /dev/mapper/vg_vd0_misc-lv07_mail  578767984  10954913  567813071  2%    /bricks/r6sdLV07_vd0_mail
>>>>>>>
>>>>>>> Mountpoint (Server 3):
>>>>>>>
>>>>>>> Filesystem          Inodes     IUsed     IFree      IUse% Mounted on
>>>>>>> glusterfs.xxx:mail  578767760  10954915  567812845  2%    /var/spool/mail/virtual
>>>>>>> ==
>>>>>>>
>>>>>>> The glusterfs.xxx domain has two A records, one for each of Server 1
>>>>>>> and Server 2.
>>>>>>>
>>>>>>> Here is the volume info:
>>>>>>>
>>>>>>> ==
>>>>>>> Volume Name: mail
>>>>>>> Type: Replicate
>>>>>>> Volume ID: f564e85c-7aa6-4170-9417-1f501aa98cd2
>>>>>>> Status: Started
>>>>>>> Number of Bricks: 1 x 2 = 2
>>>>>>> Transport-type: tcp
>>>>>>> Bricks:
>>>>>>> Brick1: server1.xxx:/bricks/r6sdLV08_vd1_mail/mail
>>>>>>> Brick2: server2.xxx:/bricks/r6sdLV07_vd0_mail/mail
>>>>>>> Options Reconfigured:
>>>>>>> nfs.rpc-auth-allow: 1.2.4.0/24,4.5.6.0/24
>>>>>>> features.cache-invalidation-timeout: 10
>>>>>>> performance.stat-prefetch: off
>>>>>>> performance.quick-read: on
>>>>>>> performance.read-ahead: off
>>>>>>> performance.flush-behind: on
>>>>>>> performance.write-behind: on
>>>>>>> performance.io-thread-count: 4
>>>>>>> performance.cache-max-file-size: 1048576
>>>>>>> performance.cache-size: 67108864
>>>>>>> performance.readdir-ahead: off
>>>>>>> ==
>>>>>>>
>>>>>>> Soon enough after mounting and the exim/dovecot start, the glusterfs
>>>>>>> client process begins to consume a huge amount of RAM:
>>>>>>>
>>>>>>> ==
>>>>>>> user at server3 ~$ ps aux | grep glusterfs | grep mail
>>>>>>> root 28895 14.4 15.0 15510324 14908868 ? Ssl Sep03 4310:05
>>>>>>> /usr/sbin/glusterfs --fopen-keep-cache --direct-io-mode=disable
>>>>>>> --volfile-server=glusterfs.xxx --volfile-id=mail /var/spool/mail/virtual
>>>>>>> ==
>>>>>>>
>>>>>>> That is, ~15 GiB of RAM.
>>>>>>>
>>>>>>> We've also tried to use the mountpoint within a separate KVM VM with
>>>>>>> 2 or 3 GiB of RAM, and soon after starting the mail daemons the OOM
>>>>>>> killer took out the glusterfs client process.
>>>>>>>
>>>>>>> Mounting the same share via NFS works just fine. Also, we have much
>>>>>>> less iowait and loadavg on the client side with NFS.
>>>>>>>
>>>>>>> We've also tried to change the IO thread count and cache size in order
>>>>>>> to limit memory usage, with no luck. As you can see, the total cache
>>>>>>> size is 4 x 64 == 256 MiB (compare to 15 GiB).
>>>>>>>
>>>>>>> Enabling/disabling stat-prefetch, read-ahead and readdir-ahead didn't
>>>>>>> help either.
>>>>>>>
>>>>>>> Here are the volume memory stats:
>>>>>>>
>>>>>>> ==
>>>>>>> Memory status for volume : mail
>>>>>>> ----------------------------------------------
>>>>>>> Brick : server1.xxx:/bricks/r6sdLV08_vd1_mail/mail
>>>>>>> Mallinfo
>>>>>>> --------
>>>>>>> Arena    : 36859904
>>>>>>> Ordblks  : 10357
>>>>>>> Smblks   : 519
>>>>>>> Hblks    : 21
>>>>>>> Hblkhd   : 30515200
>>>>>>> Usmblks  : 0
>>>>>>> Fsmblks  : 53440
>>>>>>> Uordblks : 18604144
>>>>>>> Fordblks : 18255760
>>>>>>> Keepcost : 114112
>>>>>>>
>>>>>>> Mempool Stats
>>>>>>> -------------
>>>>>>> Name                                    HotCount ColdCount PaddedSizeof AllocCount MaxAlloc  Misses Max-StdAlloc
>>>>>>> ----                                    -------- --------- ------------ ---------- -------- ------- ------------
>>>>>>> mail-server:fd_t                               0      1024          108   30773120      137       0            0
>>>>>>> mail-server:dentry_t                       16110       274           84  235676148    16384 1106499         1152
>>>>>>> mail-server:inode_t                        16363        21          156  237216876    16384 1876651         1169
>>>>>>> mail-trash:fd_t                                0      1024          108          0        0       0            0
>>>>>>> mail-trash:dentry_t                            0     32768           84          0        0       0            0
>>>>>>> mail-trash:inode_t                             4     32764          156          4        4       0            0
>>>>>>> mail-trash:trash_local_t                       0        64         8628          0        0       0            0
>>>>>>> mail-changetimerecorder:gf_ctr_local_t         0        64        16540          0        0       0            0
>>>>>>> mail-changelog:rpcsvc_request_t                0         8         2828          0        0       0            0
>>>>>>> mail-changelog:changelog_local_t               0        64          116          0        0       0            0
>>>>>>> mail-bitrot-stub:br_stub_local_t               0       512           84      79204        4       0            0
>>>>>>> mail-locks:pl_local_t                          0        32          148    6812757        4       0            0
>>>>>>> mail-upcall:upcall_local_t                     0       512          108          0        0       0            0
>>>>>>> mail-marker:marker_local_t                     0       128          332      64980        3       0            0
>>>>>>> mail-quota:quota_local_t                       0        64          476          0        0       0            0
>>>>>>> mail-server:rpcsvc_request_t                   0       512         2828   45462533       34       0            0
>>>>>>> glusterfs:struct saved_frame                   0         8          124          2        2       0            0
>>>>>>> glusterfs:struct rpc_req                       0         8          588          2        2       0            0
>>>>>>> glusterfs:rpcsvc_request_t                     1         7         2828          2        1       0            0
>>>>>>> glusterfs:log_buf_t                            5       251          140       3452        6       0            0
>>>>>>> glusterfs:data_t                             242     16141           52  480115498      664       0            0
>>>>>>> glusterfs:data_pair_t                        230     16153           68  179483528      275       0            0
>>>>>>> glusterfs:dict_t                              23      4073          140  303751675      627       0            0
>>>>>>> glusterfs:call_stub_t                          0      1024         3764   45290655       34       0            0
>>>>>>> glusterfs:call_stack_t                         1      1023         1708   43598469       34       0            0
>>>>>>> glusterfs:call_frame_t                         1      4095          172  336219655      184       0            0
>>>>>>> ----------------------------------------------
>>>>>>> Brick : server2.xxx:/bricks/r6sdLV07_vd0_mail/mail
>>>>>>> Mallinfo
>>>>>>> --------
>>>>>>> Arena    : 38174720
>>>>>>> Ordblks  : 9041
>>>>>>> Smblks   : 507
>>>>>>> Hblks    : 21
>>>>>>> Hblkhd   : 30515200
>>>>>>> Usmblks  : 0
>>>>>>> Fsmblks  : 51712
>>>>>>> Uordblks : 19415008
>>>>>>> Fordblks : 18759712
>>>>>>> Keepcost : 114848
>>>>>>>
>>>>>>> Mempool Stats
>>>>>>> -------------
>>>>>>> Name                                    HotCount ColdCount PaddedSizeof AllocCount MaxAlloc  Misses Max-StdAlloc
>>>>>>> ----                                    -------- --------- ------------ ---------- -------- ------- ------------
>>>>>>> mail-server:fd_t                               0      1024          108    2373075      133       0            0
>>>>>>> mail-server:dentry_t                       14114      2270           84    3513654    16384    2300          267
>>>>>>> mail-server:inode_t                        16374        10          156    6766642    16384  194635         1279
>>>>>>> mail-trash:fd_t                                0      1024          108          0        0       0            0
>>>>>>> mail-trash:dentry_t                            0     32768           84          0        0       0            0
>>>>>>> mail-trash:inode_t                             4     32764          156          4        4       0            0
>>>>>>> mail-trash:trash_local_t                       0        64         8628          0        0       0            0
>>>>>>> mail-changetimerecorder:gf_ctr_local_t         0        64        16540          0        0       0            0
>>>>>>> mail-changelog:rpcsvc_request_t                0         8         2828          0        0       0            0
>>>>>>> mail-changelog:changelog_local_t               0        64          116          0        0       0            0
>>>>>>> mail-bitrot-stub:br_stub_local_t               0       512           84      71354        4       0            0
>>>>>>> mail-locks:pl_local_t                          0        32          148    8135032        4       0            0
>>>>>>> mail-upcall:upcall_local_t                     0       512          108          0        0       0            0
>>>>>>> mail-marker:marker_local_t                     0       128          332      65005        3       0            0
>>>>>>> mail-quota:quota_local_t                       0        64          476          0        0       0            0
>>>>>>> mail-server:rpcsvc_request_t                   0       512         2828   12882393       30       0            0
>>>>>>> glusterfs:struct saved_frame                   0         8          124          2        2       0            0
>>>>>>> glusterfs:struct rpc_req                       0         8          588          2        2       0            0
>>>>>>> glusterfs:rpcsvc_request_t                     1         7         2828          2        1       0            0
>>>>>>> glusterfs:log_buf_t                            5       251          140       3443        6       0            0
>>>>>>> glusterfs:data_t                             242     16141           52  138743429      290       0            0
>>>>>>> glusterfs:data_pair_t                        230     16153           68  126649864      270       0            0
>>>>>>> glusterfs:dict_t                              23      4073          140   20356289       63       0            0
>>>>>>> glusterfs:call_stub_t                          0      1024         3764   13678560       31       0            0
>>>>>>> glusterfs:call_stack_t                         1      1023         1708   11011561       30       0            0
>>>>>>> glusterfs:call_frame_t                         1      4095          172  125764190      193       0            0
>>>>>>> ----------------------------------------------
>>>>>>> ==
>>>>>>>
>>>>>>> So, my questions are:
>>>>>>>
>>>>>>> 1) what should one do to limit the GlusterFS FUSE client's memory usage?
>>>>>>> 2) what should one do to prevent high loadavg on the client caused by
>>>>>>>    high iowait from multiple concurrent volume users?
>>>>>>>
>>>>>>> Server/client OS is CentOS 7.1, GlusterFS server version is 3.7.3,
>>>>>>> GlusterFS client version is 3.7.4.
>>>>>>>
>>>>>>> Any additional info needed?
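For anyone trying to reproduce the RES/valgrind measurement described above, here is a
minimal sketch of the procedure. The volume name, server address and mountpoint are
placeholders taken from the thread, and the exact valgrind invocation behind the linked
gist is not shown anywhere in the thread, so the flags below are only an assumption.

# Run the FUSE client in the foreground under valgrind (assumed flags;
# volume "mail" and server "glusterfs.xxx" are placeholders).
valgrind --leak-check=full --show-leak-kinds=all \
         --log-file=/tmp/glusterfs-valgrind.log \
         /usr/sbin/glusterfs --no-daemon \
         --volfile-server=glusterfs.xxx --volfile-id=mail /mnt/mail

# In another shell: traverse the tree and watch the client's RSS grow.
find /mnt/mail -type f > /dev/null
ps -o pid,rss,vsz,cmd -C glusterfs

# Unmount cleanly so valgrind can write its leak summary on exit.
umount /mnt/mail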
Oleksandr Natalenko
2016-Jan-05 17:46 UTC
[Gluster-users] Memory leak in GlusterFS FUSE client
Correct, I used a FUSE mount. Shouldn't gfapi be used by the FUSE mount helper
(/usr/bin/glusterfs)?

On Tuesday, 5 January 2016 22:52:25 EET Soumya Koduri wrote:
> Looks like you had done a FUSE mount. The patches which I have pasted
> below apply to gfapi/nfs-ganesha applications.
>
> Also, to resolve the nfs-ganesha issue which I had mentioned below (in
> case the Entries_HWMARK option gets changed), I have posted the fix below:
> https://review.gerrithub.io/#/c/258687
>
> Thanks,
> Soumya
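A quick way to check whether the FUSE mount helper pulls in gfapi at all is to look at
its runtime linkage. This is only a sanity-check sketch; the glusterfs binary path is
taken from the question above, and the FSAL library path is an assumption that varies
by distribution.

# Does the FUSE client link against libgfapi? On typical builds it does not:
# the FUSE client drives the xlator stack via libglusterfs directly, while
# libgfapi is a separate library consumed by NFS-Ganesha, Samba, qemu, etc.
ldd /usr/bin/glusterfs | grep -i -E 'gfapi|glusterfs'

# Compare with a gfapi consumer such as nfs-ganesha's Gluster FSAL, which
# should list libgfapi.so (path below is an assumption).
ldd /usr/lib64/ganesha/libfsalgluster.so | grep -i gfapi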
Oleksandr Natalenko
2016-Jan-05 22:23 UTC
[Gluster-users] Memory leak in GlusterFS FUSE client
OK, I've repeated the same traversal test with the patched GlusterFS API, and here
is the new Valgrind log:

https://gist.github.com/17ecb16a11c9aed957f5

It still leaks.

On Tuesday, 5 January 2016 22:52:25 EET Soumya Koduri wrote:
> Looks like you had done a FUSE mount. The patches which I have pasted
> below apply to gfapi/nfs-ganesha applications.
>
> Also, to resolve the nfs-ganesha issue which I had mentioned below (in
> case the Entries_HWMARK option gets changed), I have posted the fix below:
> https://review.gerrithub.io/#/c/258687
>
> Thanks,
> Soumya
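For reference, the per-brick "Memory status" output quoted earlier in the thread comes
from the gluster CLI, and a comparable view of the FUSE client's own inode table and
memory pools can be captured with a statedump. A rough sketch, assuming the default
statedump location (paths and dump file names vary by version and configuration):

# Brick-side memory pools (the source of the "Memory status for volume : mail"
# output shown above).
gluster volume status mail mem

# Client-side: ask the running FUSE client to dump its memory accounting,
# inode table and mempool state. Dumps usually land under /var/run/gluster,
# which is an assumption here.
kill -USR1 "$(pidof glusterfs)"
ls -lt /var/run/gluster/ | head
grep -A4 'pool-name=' /var/run/gluster/*.dump.* | head -40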