Oleksandr Natalenko
2015-Dec-24 15:47 UTC
[Gluster-users] Memory leak in GlusterFS FUSE client
Another addition: it seems to be a GlusterFS API library memory leak, because NFS-Ganesha also consumes a huge amount of memory while doing an ordinary "find . -type f" via NFSv4.2 on a remote client. Here is the memory usage:

===
root 5416 34.2 78.5 2047176 1480552 ? Ssl 12:02 117:54 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT
===

1.4G is too much for a simple stat() :(.

Ideas?

24.12.2015 16:32, Oleksandr Natalenko wrote:
> Still an actual issue for 3.7.6. Any suggestions?
>
> 24.09.2015 10:14, Oleksandr Natalenko wrote:
>> In our GlusterFS deployment we've encountered something like a memory
>> leak in the GlusterFS FUSE client.
>>
>> We use a replicated (×2) GlusterFS volume to store mail (exim+dovecot,
>> maildir format). Here are inode stats for both bricks and the mountpoint:
>>
>> ===
>> Brick 1 (Server 1):
>>
>> Filesystem                         Inodes     IUsed     IFree      IUse%  Mounted on
>> /dev/mapper/vg_vd1_misc-lv08_mail  578768144  10954918  567813226  2%     /bricks/r6sdLV08_vd1_mail
>>
>> Brick 2 (Server 2):
>>
>> Filesystem                         Inodes     IUsed     IFree      IUse%  Mounted on
>> /dev/mapper/vg_vd0_misc-lv07_mail  578767984  10954913  567813071  2%     /bricks/r6sdLV07_vd0_mail
>>
>> Mountpoint (Server 3):
>>
>> Filesystem          Inodes     IUsed     IFree      IUse%  Mounted on
>> glusterfs.xxx:mail  578767760  10954915  567812845  2%     /var/spool/mail/virtual
>> ===
>>
>> The glusterfs.xxx domain has two A records, one for Server 1 and one for
>> Server 2.
>>
>> Here is the volume info:
>>
>> ===
>> Volume Name: mail
>> Type: Replicate
>> Volume ID: f564e85c-7aa6-4170-9417-1f501aa98cd2
>> Status: Started
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: server1.xxx:/bricks/r6sdLV08_vd1_mail/mail
>> Brick2: server2.xxx:/bricks/r6sdLV07_vd0_mail/mail
>> Options Reconfigured:
>> nfs.rpc-auth-allow: 1.2.4.0/24,4.5.6.0/24
>> features.cache-invalidation-timeout: 10
>> performance.stat-prefetch: off
>> performance.quick-read: on
>> performance.read-ahead: off
>> performance.flush-behind: on
>> performance.write-behind: on
>> performance.io-thread-count: 4
>> performance.cache-max-file-size: 1048576
>> performance.cache-size: 67108864
>> performance.readdir-ahead: off
>> ===
>>
>> Soon after mounting and starting exim/dovecot, the glusterfs client
>> process begins to consume a huge amount of RAM:
>>
>> ===
>> user@server3 ~$ ps aux | grep glusterfs | grep mail
>> root 28895 14.4 15.0 15510324 14908868 ? Ssl Sep03 4310:05 /usr/sbin/glusterfs --fopen-keep-cache --direct-io-mode=disable --volfile-server=glusterfs.xxx --volfile-id=mail /var/spool/mail/virtual
>> ===
>>
>> That is, ~15 GiB of RAM.
>>
>> We've also tried to use the mountpoint within a separate KVM VM with 2 or
>> 3 GiB of RAM, and soon after starting the mail daemons the OOM killer was
>> triggered for the glusterfs client process.
>>
>> Mounting the same share via NFS works just fine. Also, we have much less
>> iowait and loadavg on the client side with NFS.
>>
>> We've also tried to change the IO thread count and the cache size in order
>> to limit memory usage, with no luck. As you can see, the total cache size
>> is 4 × 64 == 256 MiB (compare to 15 GiB).
>>
>> Enabling/disabling stat-prefetch, read-ahead and readdir-ahead didn't
>> help either.
>>
>> Here are the volume memory stats:
>>
>> ===
>> Memory status for volume : mail
>> ----------------------------------------------
>> Brick : server1.xxx:/bricks/r6sdLV08_vd1_mail/mail
>> Mallinfo
>> --------
>> Arena    : 36859904
>> Ordblks  : 10357
>> Smblks   : 519
>> Hblks    : 21
>> Hblkhd   : 30515200
>> Usmblks  : 0
>> Fsmblks  : 53440
>> Uordblks : 18604144
>> Fordblks : 18255760
>> Keepcost : 114112
>>
>> Mempool Stats
>> -------------
>> Name                                     HotCount  ColdCount  PaddedSizeof  AllocCount  MaxAlloc   Misses  Max-StdAlloc
>> ----                                     --------  ---------  ------------  ----------  --------  -------  ------------
>> mail-server:fd_t                                0       1024           108    30773120       137        0             0
>> mail-server:dentry_t                        16110        274            84   235676148     16384  1106499          1152
>> mail-server:inode_t                         16363         21           156   237216876     16384  1876651          1169
>> mail-trash:fd_t                                 0       1024           108           0         0        0             0
>> mail-trash:dentry_t                             0      32768            84           0         0        0             0
>> mail-trash:inode_t                              4      32764           156           4         4        0             0
>> mail-trash:trash_local_t                        0         64          8628           0         0        0             0
>> mail-changetimerecorder:gf_ctr_local_t          0         64         16540           0         0        0             0
>> mail-changelog:rpcsvc_request_t                 0          8          2828           0         0        0             0
>> mail-changelog:changelog_local_t                0         64           116           0         0        0             0
>> mail-bitrot-stub:br_stub_local_t                0        512            84       79204         4        0             0
>> mail-locks:pl_local_t                           0         32           148     6812757         4        0             0
>> mail-upcall:upcall_local_t                      0        512           108           0         0        0             0
>> mail-marker:marker_local_t                      0        128           332       64980         3        0             0
>> mail-quota:quota_local_t                        0         64           476           0         0        0             0
>> mail-server:rpcsvc_request_t                    0        512          2828    45462533        34        0             0
>> glusterfs:struct saved_frame                    0          8           124           2         2        0             0
>> glusterfs:struct rpc_req                        0          8           588           2         2        0             0
>> glusterfs:rpcsvc_request_t                      1          7          2828           2         1        0             0
>> glusterfs:log_buf_t                             5        251           140        3452         6        0             0
>> glusterfs:data_t                              242      16141            52   480115498       664        0             0
>> glusterfs:data_pair_t                         230      16153            68   179483528       275        0             0
>> glusterfs:dict_t                               23       4073           140   303751675       627        0             0
>> glusterfs:call_stub_t                           0       1024          3764    45290655        34        0             0
>> glusterfs:call_stack_t                          1       1023          1708    43598469        34        0             0
>> glusterfs:call_frame_t                          1       4095           172   336219655       184        0             0
>> ----------------------------------------------
>> Brick : server2.xxx:/bricks/r6sdLV07_vd0_mail/mail
>> Mallinfo
>> --------
>> Arena    : 38174720
>> Ordblks  : 9041
>> Smblks   : 507
>> Hblks    : 21
>> Hblkhd   : 30515200
>> Usmblks  : 0
>> Fsmblks  : 51712
>> Uordblks : 19415008
>> Fordblks : 18759712
>> Keepcost : 114848
>>
>> Mempool Stats
>> -------------
>> Name                                     HotCount  ColdCount  PaddedSizeof  AllocCount  MaxAlloc   Misses  Max-StdAlloc
>> ----                                     --------  ---------  ------------  ----------  --------  -------  ------------
>> mail-server:fd_t                                0       1024           108     2373075       133        0             0
>> mail-server:dentry_t                        14114       2270            84     3513654     16384     2300           267
>> mail-server:inode_t                         16374         10           156     6766642     16384   194635          1279
>> mail-trash:fd_t                                 0       1024           108           0         0        0             0
>> mail-trash:dentry_t                             0      32768            84           0         0        0             0
>> mail-trash:inode_t                              4      32764           156           4         4        0             0
>> mail-trash:trash_local_t                        0         64          8628           0         0        0             0
>> mail-changetimerecorder:gf_ctr_local_t          0         64         16540           0         0        0             0
>> mail-changelog:rpcsvc_request_t                 0          8          2828           0         0        0             0
>> mail-changelog:changelog_local_t                0         64           116           0         0        0             0
>> mail-bitrot-stub:br_stub_local_t                0        512            84       71354         4        0             0
>> mail-locks:pl_local_t                           0         32           148     8135032         4        0             0
>> mail-upcall:upcall_local_t                      0        512           108           0         0        0             0
>> mail-marker:marker_local_t                      0        128           332       65005         3        0             0
>> mail-quota:quota_local_t                        0         64           476           0         0        0             0
>> mail-server:rpcsvc_request_t                    0        512          2828    12882393        30        0             0
>> glusterfs:struct saved_frame                    0          8           124           2         2        0             0
>> glusterfs:struct rpc_req                        0          8           588           2         2        0             0
>> glusterfs:rpcsvc_request_t                      1          7          2828           2         1        0             0
>> glusterfs:log_buf_t                             5        251           140        3443         6        0             0
>> glusterfs:data_t                              242      16141            52   138743429       290        0             0
>> glusterfs:data_pair_t                         230      16153            68   126649864       270        0             0
>> glusterfs:dict_t                               23       4073           140    20356289        63        0             0
>> glusterfs:call_stub_t                           0       1024          3764    13678560        31        0             0
>> glusterfs:call_stack_t                          1       1023          1708    11011561        30        0             0
>> glusterfs:call_frame_t                          1       4095           172   125764190       193        0             0
>> ----------------------------------------------
>> ===
>>
>> So, my questions are:
>>
>> 1) what should one do to limit the GlusterFS FUSE client's memory usage?
>> 2) what should one do to prevent high loadavg on the client, caused by
>> high iowait from multiple concurrent volume users?
>>
>> The server/client OS is CentOS 7.1, the GlusterFS server version is 3.7.3,
>> and the GlusterFS client version is 3.7.4.
>>
>> Any additional info needed?
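[For reference, a minimal sketch of how the figures above are typically gathered and how a dump of the FUSE client itself can be captured for leak analysis. This assumes the stock GlusterFS 3.7 CLI and its default statedump location; the PID is the one from the ps output above, and paths may differ on other setups.]

===
# Per-brick mallinfo and mempool stats, as quoted above
gluster volume status mail mem

# The options that were tuned while trying to limit client memory usage
# (same values as in the volume info above)
gluster volume set mail performance.io-thread-count 4
gluster volume set mail performance.cache-size 67108864

# Statedump of the FUSE client process itself: send SIGUSR1 to the mount
# process; the dump should appear under /var/run/gluster/ as
# glusterdump.<pid>.dump.<timestamp>
kill -USR1 28895
===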
On 12/24/2015 09:17 PM, Oleksandr Natalenko wrote:
> Another addition: it seems to be a GlusterFS API library memory leak,
> because NFS-Ganesha also consumes a huge amount of memory while doing
> an ordinary "find . -type f" via NFSv4.2 on a remote client. Here is the
> memory usage:
>
> ===
> root 5416 34.2 78.5 2047176 1480552 ? Ssl 12:02 117:54 /usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT
> ===
>
> 1.4G is too much for a simple stat() :(.
>
> Ideas?

nfs-ganesha also has a cache layer which can scale to millions of entries, depending on the number of files/directories being looked up. However, there are parameters to tune it. So either try the stat with fewer entries, or add the block below to the nfs-ganesha.conf file, set low limits and check the difference. That may help us narrow down how much memory is actually consumed by core nfs-ganesha and by gfAPI.

CACHEINODE {
    Cache_Size(uint32, range 1 to UINT32_MAX, default 32633);      # cache size
    Entries_HWMark(uint32, range 1 to UINT32_MAX, default 100000); # max number of entries in the cache
}

Thanks,
Soumya
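[For illustration, a minimal sketch of what that block might look like when placed in /etc/ganesha/ganesha.conf with deliberately low limits. The parameter names are taken from the message above; the values are arbitrary test numbers, not recommendations.]

===
# Hypothetical low limits to test whether the cache-inode layer, rather
# than gfAPI itself, is what holds the memory during "find . -type f".
CACHEINODE {
    Cache_Size = 997;          # hash table size (default 32633)
    Entries_HWMark = 10000;    # max cached entries (default 100000)
}
===

After restarting nfs-ganesha with the lowered limits, re-running the find should show whether the resident size still grows without bound.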