Hi,
Can you please specify which process has the leak? Have you taken a statedump
of the process that is leaking?
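For reference, a statedump of a brick process can be requested with `gluster volume statedump <volname>`, and a fuse client process dumps its state on SIGUSR1. A minimal sketch follows; the volume name gvol1 is taken from your output below, and the fallback PID 12345 is purely illustrative:

```shell
# Statedump of the brick processes (run on any node; dumps land in
# /var/run/gluster by default):
#   gluster volume statedump gvol1
#
# Statedump of the fuse client process (the likely suspect if memory grows
# on the client side): find its PID and send SIGUSR1.
pid=$(pgrep -f 'glusterfs.*gvol1' | head -n 1)
pid=${pid:-12345}        # 12345 is an illustrative fallback PID
# kill -USR1 "$pid"      # uncomment on the affected node to write the dump
echo "/var/run/gluster/glusterdump.$pid.dump.$(date +%s)"   # filename pattern of the dump
```

Comparing the mem-pool and num_allocs sections of a dump taken at high memory against one taken after a restart usually narrows down which translator is leaking.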
Thanks,
Sanju
On Sat, Feb 2, 2019 at 3:15 PM Pedro Costa <pedro at pmc.digital> wrote:
> Hi,
>
>
>
> I have a 3x replicated cluster running GlusterFS 4.1.7 on Ubuntu 16.04.5; all 3
> replicas are also clients hosting a Node.js/Nginx web server.
>
>
>
> The current configuration is as follows:
>
> Volume Name: gvol1
> Type: Replicate
> Volume ID: XXXXXX
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: vm000000:/srv/brick1/gvol1
> Brick2: vm000001:/srv/brick1/gvol1
> Brick3: vm000002:/srv/brick1/gvol1
> Options Reconfigured:
> cluster.self-heal-readdir-size: 2KB
> cluster.self-heal-window-size: 2
> cluster.background-self-heal-count: 20
> network.ping-timeout: 5
> disperse.eager-lock: off
> performance.parallel-readdir: on
> performance.readdir-ahead: on
> performance.rda-cache-limit: 128MB
> performance.cache-refresh-timeout: 10
> performance.nl-cache-timeout: 600
> performance.nl-cache: on
> cluster.nufa: on
> performance.enable-least-priority: off
> server.outstanding-rpc-limit: 128
> performance.strict-o-direct: on
> cluster.shd-max-threads: 12
> client.event-threads: 4
> cluster.lookup-optimize: on
> network.inode-lru-limit: 90000
> performance.md-cache-timeout: 600
> performance.cache-invalidation: on
> performance.cache-samba-metadata: on
> performance.stat-prefetch: on
> features.cache-invalidation-timeout: 600
> features.cache-invalidation: on
> storage.fips-mode-rchecksum: on
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: on
> features.utime: on
> storage.ctime: on
> server.event-threads: 4
> performance.cache-size: 256MB
> performance.read-ahead: on
> cluster.readdir-optimize: on
> cluster.strict-readdir: on
> performance.io-thread-count: 8
> server.allow-insecure: on
> cluster.read-hash-mode: 0
> cluster.lookup-unhashed: auto
> cluster.choose-local: on
>
>
> I believe there's a memory leak somewhere; usage just keeps climbing until it
> hangs one or more nodes, sometimes taking the whole cluster down.
>
>
>
> I have taken two statedumps on one of the nodes: one while memory usage was
> too high, and another just after a reboot with the app running and the volume
> fully healed.
>
>
>
> https://pmcdigital.sharepoint.com/:u:/g/EYDsNqTf1UdEuE6B0ZNVPfIBf_I-AbaqHotB1lJOnxLlTg?e=boYP09
> (high memory)
>
> https://pmcdigital.sharepoint.com/:u:/g/EWZBsnET2xBHl6OxO52RCfIBvQ0uIDQ1GKJZ1GrnviyMhg?e=wI3yaY
> (after reboot)
>
>
>
> Any help would be greatly appreciated,
>
>
>
> Kindest Regards,
>
> Pedro Maia Costa
> Senior Developer, pmc.digital
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
--
Thanks,
Sanju