Will this kill the actual process or simply trigger the dump? Which
process should I kill? The brick process in the system or the fuse mount?

Diego

On Mon, Jul 29, 2019, 23:27 Nithya Balachandran <nbalacha at redhat.com> wrote:

> On Tue, 30 Jul 2019 at 05:44, Diego Remolina <dijuremo at gmail.com> wrote:
>
>> Unfortunately statedump crashes on both machines, even freshly rebooted.
>
> Do you see any statedump files in /var/run/gluster? This looks more like
> the gluster CLI crashed.
>
>> [root at ysmha01 ~]# gluster --print-statedumpdir
>> /var/run/gluster
>> [root at ysmha01 ~]# gluster v statedump export
>> Segmentation fault (core dumped)
>>
>> [root at ysmha02 ~]# uptime
>>  20:12:20 up 6 min, 1 user, load average: 0.72, 0.52, 0.24
>> [root at ysmha02 ~]# gluster --print-statedumpdir
>> /var/run/gluster
>> [root at ysmha02 ~]# gluster v statedump export
>> Segmentation fault (core dumped)
>>
>> I rebooted today after 40 days. Gluster was eating up just shy of 40GB
>> of RAM out of 64.
>>
>> What would you recommend as the next step?
>>
>> Diego
>>
>> On Mon, Mar 4, 2019 at 5:07 AM Poornima Gurusiddaiah <pgurusid at redhat.com>
>> wrote:
>>
>>> Could you also provide the statedump of the gluster process consuming
>>> 44G of RAM [1]. Please make sure the statedump is taken when the memory
>>> consumption is very high, in the tens of GBs, otherwise we may not be
>>> able to identify the issue. Also, I see that the cache size is 10GB; is
>>> that something you arrived at after doing some tests? It is considerably
>>> higher than normal.
>>>
>>> [1]
>>> https://docs.gluster.org/en/v3/Troubleshooting/statedump/#generate-a-statedump
>>>
>>> On Mon, Mar 4, 2019 at 12:23 AM Diego Remolina <dijuremo at gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I will not be able to test gluster-6rc because this is a production
>>>> environment and it takes several days for memory to grow a lot.
>>>>
>>>> The Samba server hosts all types of files, small and large, from small
>>>> roaming-profile files to bigger files such as Adobe suite and Autodesk
>>>> Revit projects (file sizes in the hundreds of megabytes).
>>>>
>>>> As I stated before, this same issue was present back with 3.8.x, which
>>>> I was running previously.
>>>>
>>>> The information you requested:
>>>>
>>>> [root at ysmha02 ~]# gluster v info export
>>>>
>>>> Volume Name: export
>>>> Type: Replicate
>>>> Volume ID: b4353b3f-6ef6-4813-819a-8e85e5a95cff
>>>> Status: Started
>>>> Snapshot Count: 0
>>>> Number of Bricks: 1 x 2 = 2
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: 10.0.1.7:/bricks/hdds/brick
>>>> Brick2: 10.0.1.6:/bricks/hdds/brick
>>>> Options Reconfigured:
>>>> performance.stat-prefetch: on
>>>> performance.cache-min-file-size: 0
>>>> network.inode-lru-limit: 65536
>>>> performance.cache-invalidation: on
>>>> features.cache-invalidation: on
>>>> performance.md-cache-timeout: 600
>>>> features.cache-invalidation-timeout: 600
>>>> performance.cache-samba-metadata: on
>>>> transport.address-family: inet
>>>> server.allow-insecure: on
>>>> performance.cache-size: 10GB
>>>> cluster.server-quorum-type: server
>>>> nfs.disable: on
>>>> performance.io-thread-count: 64
>>>> performance.io-cache: on
>>>> cluster.lookup-optimize: on
>>>> cluster.readdir-optimize: on
>>>> server.event-threads: 5
>>>> client.event-threads: 5
>>>> performance.cache-max-file-size: 256MB
>>>> diagnostics.client-log-level: INFO
>>>> diagnostics.brick-log-level: INFO
>>>> cluster.server-quorum-ratio: 51%
>>>>
>>>> On Fri, Mar 1, 2019 at 11:07 PM Poornima Gurusiddaiah <pgurusid at redhat.com>
>>>> wrote:
>>>>
>>>>> This high memory consumption is not normal. It looks like a memory
>>>>> leak. Is it possible to try it on a test setup with gluster-6rc? What
>>>>> kind of workload goes through the fuse mount? Large files or small
>>>>> files? We need the following information to debug further:
>>>>> - gluster volume info output
>>>>> - Statedump of the gluster fuse mount process consuming 44G of RAM.
>>>>>
>>>>> Regards,
>>>>> Poornima
>>>>>
>>>>> On Sat, Mar 2, 2019, 3:40 AM Diego Remolina <dijuremo at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I am using GlusterFS with two servers as a file server, sharing files
>>>>>> via Samba and CTDB. I cannot use the Samba vfs_glusterfs plugin due to
>>>>>> a bug in the current CentOS version of Samba, so I am mounting via
>>>>>> fuse and exporting the volume to Samba from the mount point.
>>>>>>
>>>>>> Upon initial boot, memory use on the server where Samba exports the
>>>>>> files climbs to ~10GB within a couple of hours of use. From then on,
>>>>>> there is a constant slow memory increase. In the past, with gluster
>>>>>> 3.8.x, we had to reboot the servers at around 30 days. With gluster
>>>>>> 4.1.6 we are getting up to 48 days, but RAM use is at 48GB out of
>>>>>> 64GB. Is this normal?
>>>>>>
>>>>>> The particular versions are below,
>>>>>>
>>>>>> [root at ysmha01 home]# uptime
>>>>>>  16:59:39 up 48 days, 9:56, 1 user, load average: 3.75, 3.17, 3.00
>>>>>> [root at ysmha01 home]# rpm -qa | grep gluster
>>>>>> centos-release-gluster41-1.0-3.el7.centos.noarch
>>>>>> glusterfs-server-4.1.6-1.el7.x86_64
>>>>>> glusterfs-api-4.1.6-1.el7.x86_64
>>>>>> centos-release-gluster-legacy-4.0-2.el7.centos.noarch
>>>>>> glusterfs-4.1.6-1.el7.x86_64
>>>>>> glusterfs-client-xlators-4.1.6-1.el7.x86_64
>>>>>> libvirt-daemon-driver-storage-gluster-3.9.0-14.el7_5.8.x86_64
>>>>>> glusterfs-fuse-4.1.6-1.el7.x86_64
>>>>>> glusterfs-libs-4.1.6-1.el7.x86_64
>>>>>> glusterfs-rdma-4.1.6-1.el7.x86_64
>>>>>> glusterfs-cli-4.1.6-1.el7.x86_64
>>>>>> samba-vfs-glusterfs-4.8.3-4.el7.x86_64
>>>>>> [root at ysmha01 home]# rpm -qa | grep samba
>>>>>> samba-common-tools-4.8.3-4.el7.x86_64
>>>>>> samba-client-libs-4.8.3-4.el7.x86_64
>>>>>> samba-libs-4.8.3-4.el7.x86_64
>>>>>> samba-4.8.3-4.el7.x86_64
>>>>>> samba-common-libs-4.8.3-4.el7.x86_64
>>>>>> samba-common-4.8.3-4.el7.noarch
>>>>>> samba-vfs-glusterfs-4.8.3-4.el7.x86_64
>>>>>> [root at ysmha01 home]# cat /etc/redhat-release
>>>>>> CentOS Linux release 7.6.1810 (Core)
>>>>>>
>>>>>> RAM view using top
>>>>>> Tasks: 398 total, 1 running, 397 sleeping, 0 stopped, 0 zombie
>>>>>> %Cpu(s): 7.0 us, 9.3 sy, 1.7 ni, 71.6 id, 9.7 wa, 0.0 hi, 0.8 si, 0.0 st
>>>>>> KiB Mem : 65772000 total, 1851344 free, 60487404 used, 3433252 buff/cache
>>>>>> KiB Swap:        0 total,        0 free,        0 used.  3134316 avail Mem
>>>>>>
>>>>>>   PID USER  PR  NI    VIRT    RES   SHR S  %CPU %MEM    TIME+ COMMAND
>>>>>>  9953 root  20   0 3727912 946496  3196 S 150.2  1.4 38626:27 glusterfsd
>>>>>>  9634 root  20   0   48.1g  47.2g  3184 S  96.3 75.3 29513:55 glusterfs
>>>>>> 14485 root  20   0 3404140  63780  2052 S  80.7  0.1  1590:13 glusterfs
>>>>>>
>>>>>> [root at ysmha01 ~]# gluster v status export
>>>>>> Status of volume: export
>>>>>> Gluster process                       TCP Port  RDMA Port  Online  Pid
>>>>>> ------------------------------------------------------------------------------
>>>>>> Brick 10.0.1.7:/bricks/hdds/brick     49157     0          Y       13986
>>>>>> Brick 10.0.1.6:/bricks/hdds/brick     49153     0          Y       9953
>>>>>> Self-heal Daemon on localhost         N/A       N/A        Y       14485
>>>>>> Self-heal Daemon on 10.0.1.7          N/A       N/A        Y       21934
>>>>>> Self-heal Daemon on 10.0.1.5          N/A       N/A        Y       4598
>>>>>>
>>>>>> Task Status of Volume export
>>>>>> ------------------------------------------------------------------------------
>>>>>> There are no active volume tasks
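The brick and fuse processes asked about at the top of this message can
usually be told apart from their command lines: glusterfsd processes serve
bricks, while a glusterfs process whose arguments include the volume's
mount point is the fuse client (the self-heal daemon also runs as
glusterfs, typically with glustershd in its volfile id). Cross-referencing
the top and volume status output quoted above, PID 9953 (glusterfsd) is the
brick on 10.0.1.6 and PID 14485 is the self-heal daemon, so PID 9634, the
47.2g glusterfs process, appears to be the fuse mount. A rough, generic way
to list them, largest consumers first:

# List gluster processes with resident memory and full command line,
# sorted so the biggest consumers come first.
ps -eo pid,rss,comm,args --sort=-rss | grep '[g]luster'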
On Tue, Jul 30, 2019 at 9:04 AM Diego Remolina <dijuremo at gmail.com> wrote:

> Will this kill the actual process or simply trigger the dump?

This - kill -SIGUSR1 ... - will deliver the SIGUSR1 signal to the process.
Glusterfs processes (bricks, client mounts) are implemented to take a
statedump on receiving SIGUSR1.

> Which process should I kill? The brick process in the system or the fuse
> mount?

All those processes that consume large amounts of memory.
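As a concrete sketch of that procedure, using PID 9634 (the glusterfs fuse
client from the earlier top output) purely as an example; substitute the
PID of whichever process is actually holding the memory:

# Request a statedump from the fuse mount process by sending SIGUSR1.
kill -SIGUSR1 9634

# The dump is written to the statedump directory, which can be confirmed with:
gluster --print-statedumpdir

# Then look for freshly created dump files there (client dumps are
# typically named glusterdump.<pid>.dump.<timestamp>):
ls -ltr /var/run/gluster/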
Hi Diego,

Please do the following:

gluster v get <volname> readdir-ahead

If this is enabled, please disable it and see if it helps. There was a leak
in the opendir codepath that was fixed in a later release.

Regards,
Nithya

On Tue, 30 Jul 2019 at 09:04, Diego Remolina <dijuremo at gmail.com> wrote:

> Will this kill the actual process or simply trigger the dump? Which
> process should I kill? The brick process in the system or the fuse mount?
>
> Diego
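For completeness, the check-and-disable sequence for the "export" volume in
this thread might look like the following (the full option key is
performance.readdir-ahead; this is only a sketch, so confirm the current
value before changing anything):

# Check the current setting of readdir-ahead on the volume.
gluster volume get export performance.readdir-ahead

# If it reports "on", disable it:
gluster volume set export performance.readdir-ahead off

# Verify the new value:
gluster volume get export performance.readdir-ahead

If the leak really is in that codepath, memory growth on the fuse mount
should level off after the change rather than climbing steadily.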