Hello,

Some time ago I posted about a memory leak in the client process, but that was on a very old 32-bit machine (both kernel and OS) and I found no evidence of a similar problem on our recent machines. I have since performed more tests, and I see the same problem.

Clients are 64-bit Debian 8.2 machines. The glusterfs client on these machines is compiled from source with the following options enabled:
FUSE client          : yes
Infiniband verbs     : no
epoll IO multiplex   : yes
argp-standalone      : no
fusermount           : yes
readline             : yes
georeplication       : yes
Linux-AIO            : no
Enable Debug         : no
systemtap            : no
Block Device xlator  : no
glupy                : no
Use syslog           : yes
XML output           : yes
QEMU Block formats   : no
Encryption xlator    : yes
Erasure Code xlator  : yes

I tested both the 3.6.7 and 3.6.9 versions on the client (3.6.7 is the one installed on our machines, including the servers; 3.6.9 is for testing with the latest 3.6 release).

Here are the operations on the client (also performed, with similar results, with the 3.6.7 version):
# /usr/local/sbin/glusterfs --version
glusterfs 3.6.9 built on Jul 22 2016 13:27:42
(...)
# mount -t glusterfs sto1.my.domain:BACKUP-ADMIN-DATA /zog/
# cd /usr/
# cp -Rp * /zog/TEMP/

I then monitored the memory used by the glusterfs process while the 'cp' was running (VSZ and RSS from 'ps', respectively; see the sampling sketch after the volume info below):
284740 70232
284740 70232
284876 71704
285000 72684
285136 74008
285416 75940
(...)
368684 151980
369324 153768
369836 155576
370092 156192
370092 156192

Here both sizes are stable and correspond to the end of the 'cp' command. If I start another 'cp' (even on the same directories), the size starts to increase again. If I perform an 'ls -lR' in the directory, the size also increases:
370756 192488
389964 212148
390948 213232
(here I ^C'ed the 'ls')

When doing nothing, the size does not increase, but it never decreases either (calling 'sync' does not change the situation). Sending a HUP signal to the glusterfs process also increases memory (390948 213324 -> 456484 213320). Changing the volume configuration (changing the diagnostics.client-sys-log-level value) does not change anything.

Here is the current ps:
root 17041 4.9 5.2 456484 213320 ? Ssl 13:29 1:21 /usr/local/sbin/glusterfs --volfile-server=sto1.my.domain --volfile-id=BACKUP-ADMIN-DATA /zog

Of course, unmounting/remounting falls back to the "start" size:
# umount /zog
# mount -t glusterfs sto1.my.domain:BACKUP-ADMIN-DATA /zog/
(...)
root 28741 0.3 0.7 273320 30484 ? Ssl 13:57 0:00 /usr/local/sbin/glusterfs --volfile-server=sto1.my.domain --volfile-id=BACKUP-ADMIN-DATA /zog

I had not seen this before because most of our volumes are either mounted "on demand" for some storage activities or permanently mounted but with very little activity. But clearly this memory usage drift is a long-term problem. On the old 32-bit machine I had this problem ("solved" by using NFS mounts while waiting for that old machine to be replaced), and it led to glusterfs being killed by the OS when it ran out of free memory. It happened faster than what I describe here, but it is just a question of time.

Thanks for any help with this.

Regards,
--
Y.

The corresponding volume on the servers is (if it can help):
Volume Name: BACKUP-ADMIN-DATA
Type: Replicate
Volume ID: 306d57f3-fb30-4bcc-8687-08bf0a3d7878
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: sto1.my.domain:/glusterfs/backup-admin/data
Brick2: sto2.my.domain:/glusterfs/backup-admin/data
Options Reconfigured:
diagnostics.client-sys-log-level: WARNING
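P.S.: for completeness, here is roughly how the VSZ/RSS figures above were sampled — a minimal sketch, assuming glusterfs is the only FUSE client process on the machine (otherwise use the exact PID shown by 'ps') and an arbitrary 10-second interval:

# PID=$(pidof glusterfs)
# while sleep 10; do ps -o vsz=,rss= -p "$PID"; done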
Note: I have a dev client machine, so I can perform tests or recompile the glusterfs client if it helps gather data about this. I did not test this problem against the 3.7.x version, as my 2 servers are in use and I can't upgrade them at this time, and 3.7 clients are not compatible with 3.6 servers (as far as I can tell from my tests).

--
Y.
Hello,

While checking for related problems in bug reports, I found this discussion:
https://www.gluster.org/pipermail/gluster-users/2015-September/023709.html

The memory leaks from https://bugzilla.redhat.com/show_bug.cgi?id=1126831 are supposed to be fixed in 3.7.9. Does 3.6.9 (which I'm using) also include these fixes?

In the first discussion they also mention that network.inode-lru-limit may influence the memory used. Is that true? I set the value to 512 but I am still able to go far beyond ~512MB (I reached ~1.3GB of VSZ and ~1GB of RSS). I also tested with 1024 and I am at 1171140/987024 and it is still growing slowly (performing 'cp -Rp' + 'ls -lR' on the mounted volume).

By the way, if this value can influence memory usage on clients, is it possible to set it on a per-client basis? It may depend on the memory available on each client, not on the volume itself.

Any help would be appreciated, as this prevents us from using FUSE mounts.

Regards,
--
Y.
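P.S.: for reference, the option was changed from one of the servers with the usual volume-set command — a minimal sketch, assuming the volume name from my first message; 'gluster volume info' then lists it under "Options Reconfigured":

# gluster volume set BACKUP-ADMIN-DATA network.inode-lru-limit 512
# gluster volume info BACKUP-ADMIN-DATA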