Zakhar Kirpichenko
2022-Feb-08 05:14 UTC
[Gluster-users] GlusterFS 9.5 fuse mount excessive memory usage
Hi,

I've updated the github issue with more details:
https://github.com/gluster/glusterfs/issues/3206#issuecomment-1030770617

Looks like there's a memory leak.

/Z

On Sat, Feb 5, 2022 at 8:45 PM Zakhar Kirpichenko <zakhar at gmail.com> wrote:

> Hi Strahil,
>
> Many thanks for your reply! I've updated the Github issue with statedump
> files taken before and after the tar operation:
> https://github.com/gluster/glusterfs/files/8008635/glusterdump.19102.dump.zip
>
> Please disregard that the path= entries are empty; the original dumps
> contain real paths, but I deleted them as they might contain sensitive
> information.
>
> The odd thing is that the dump file is full of:
>
> 1) xlator.performance.write-behind.wb_inode entries, although the tar
> operation does not write to these files. The whole backup process is
> read-only.
>
> 2) xlator.performance.quick-read.inodectx entries, which never go away.
>
> None of this happens on other clients, which read and write from/to the
> same volume in a much more intense manner.
>
> Best regards,
> Z
>
> On Sat, Feb 5, 2022 at 11:23 AM Strahil Nikolov <hunter86_bg at yahoo.com>
> wrote:
>
>> Can you generate a statedump before and after the tar?
>> For statedump generation, you can follow
>> https://github.com/gluster/glusterfs/issues/1440#issuecomment-674051243 .
>>
>> Best Regards,
>> Strahil Nikolov
>>
>>
>> On Saturday, 5 February 2022 at 07:54:22 GMT+2, Zakhar Kirpichenko <
>> zakhar at gmail.com> wrote:
>>
>>
>> Hi!
>>
>> I opened a Github issue, https://github.com/gluster/glusterfs/issues/3206,
>> but I'm not sure how much attention issues get there, so I'm re-posting
>> here in case someone has any ideas.
>>
>> Description of problem:
>>
>> GlusterFS 9.5, 3-node cluster (2 bricks + arbiter): an attempt to tar the
>> whole filesystem (35-40 GB, 1.6 million files) on a client succeeds but
>> causes the glusterfs fuse mount process to consume 0.5+ GB of RAM. The
>> usage never goes down after tar exits.
>>
>> The exact command to reproduce the issue:
>>
>> /usr/bin/tar --use-compress-program="/bin/pigz" -cf /path/to/archive.tar.gz --warning=no-file-changed /glusterfsmount
>>
>> The output of the gluster volume info command:
>>
>> Volume Name: gvol1
>> Type: Replicate
>> Volume ID: 0292ac43-89bd-45a4-b91d-799b49613e60
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x (2 + 1) = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: 192.168.0.31:/gluster/brick1/gvol1
>> Brick2: 192.168.0.32:/gluster/brick1/gvol1
>> Brick3: 192.168.0.5:/gluster/brick1/gvol1 (arbiter)
>> Options Reconfigured:
>> performance.open-behind: off
>> cluster.readdir-optimize: off
>> cluster.consistent-metadata: on
>> features.cache-invalidation: on
>> diagnostics.count-fop-hits: on
>> diagnostics.latency-measurement: on
>> storage.fips-mode-rchecksum: on
>> performance.cache-size: 256MB
>> client.event-threads: 8
>> server.event-threads: 4
>> storage.reserve: 1
>> performance.cache-invalidation: on
>> cluster.lookup-optimize: on
>> transport.address-family: inet
>> nfs.disable: on
>> performance.client-io-threads: on
>> features.cache-invalidation-timeout: 600
>> performance.md-cache-timeout: 600
>> network.inode-lru-limit: 50000
>> cluster.shd-max-threads: 4
>> cluster.self-heal-window-size: 8
>> performance.enable-least-priority: off
>> performance.cache-max-file-size: 2MB
>>
>> The output of the gluster volume status command:
>>
>> Status of volume: gvol1
>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>> ------------------------------------------------------------------------------
>> Brick 192.168.0.31:/gluster/brick1/gvol1    49152     0          Y       1767
>> Brick 192.168.0.32:/gluster/brick1/gvol1    49152     0          Y       1696
>> Brick 192.168.0.5:/gluster/brick1/gvol1     49152     0          Y       1318
>> Self-heal Daemon on localhost               N/A       N/A        Y       1329
>> Self-heal Daemon on 192.168.0.31            N/A       N/A        Y       1778
>> Self-heal Daemon on 192.168.0.32            N/A       N/A        Y       1707
>>
>> Task Status of Volume gvol1
>>
>> ------------------------------------------------------------------------------
>> There are no active volume tasks
>>
>> The output of the gluster volume heal command:
>>
>> Brick 192.168.0.31:/gluster/brick1/gvol1
>> Status: Connected
>> Number of entries: 0
>>
>> Brick 192.168.0.32:/gluster/brick1/gvol1
>> Status: Connected
>> Number of entries: 0
>>
>> Brick 192.168.0.5:/gluster/brick1/gvol1
>> Status: Connected
>> Number of entries: 0
>>
>> The operating system / glusterfs version:
>>
>> CentOS Linux release 7.9.2009 (Core), fully up to date
>> glusterfs 9.5
>> kernel 3.10.0-1160.53.1.el7.x86_64
>>
>> The logs are basically empty since the last mount, except for the
>> mount-related messages.
>>
>> Additional info: a statedump from the client is attached to the Github
>> issue,
>> https://github.com/gluster/glusterfs/files/8004792/glusterdump.18906.dump.1643991007.gz,
>> in case someone wants to have a look.
>>
>> There was also an issue with other clients, running PHP applications with
>> lots of small files, where the glusterfs fuse mount process would very
>> quickly balloon to ~2 GB over the course of 24 hours and its performance
>> would slow to a crawl. This happened very consistently with glusterfs 8.x
>> and 9.5. I managed to resolve it at least partially by disabling
>> performance.open-behind: the memory usage either remains consistent or
>> increases at a much slower rate, which is acceptable for that use case.
>>
>> Now the issue remains on this single client, which doesn't do much other
>> than reading and archiving all files from the gluster volume once per day.
>> The glusterfs fuse mount process balloons to 0.5+ GB during the first tar
>> run and remains more or less consistent afterwards, including during
>> subsequent tar runs.
>>
>> I would very much appreciate any advice or suggestions.
>>
>> Best regards,
>> Zakhar
>> ________
>>
>>
>>
>> Community Meeting Calendar:
>>
>> Schedule -
>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>> Bridge: https://meet.google.com/cpu-eiue-hvk
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20220208/9f069d14/attachment.html>
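[Editor's note: Strahil's statedump suggestion can be scripted around the tar run. Below is a minimal sketch, assuming a Linux client where the fuse mount's pid can be found with pgrep; the mount path and the pgrep pattern are assumptions and must be adjusted. Sending SIGUSR1 to a glusterfs process makes it write a statedump (by default under /var/run/gluster), and /proc/PID/status gives the resident set size to compare before and after:]

```shell
#!/bin/sh
# Report the resident set size (RSS, in KiB) of a process, read from /proc.
rss_kib() {
    awk '/^VmRSS:/ {print $2}' "/proc/$1/status"
}

# Hypothetical usage around the backup run (mount path and pgrep pattern
# are assumptions; adjust for your client):
#
#   pid=$(pgrep -of glusterfsmount)
#   echo "RSS before: $(rss_kib "$pid") KiB"
#   kill -USR1 "$pid"     # statedump written under /var/run/gluster
#   /usr/bin/tar --use-compress-program="/bin/pigz" -cf \
#       /path/to/archive.tar.gz --warning=no-file-changed /glusterfsmount
#   kill -USR1 "$pid"     # second statedump, for comparison
#   echo "RSS after:  $(rss_kib "$pid") KiB"

# Demonstrate the helper on the current shell's own pid:
rss_kib $$
```

Diffing the two statedumps (and the two RSS figures) shows which allocator pools grew during the tar and whether they were ever released.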
Zakhar Kirpichenko
2022-Mar-31 06:57 UTC
[Gluster-users] GlusterFS 9.5 fuse mount excessive memory usage
Hi,

Any news about this? I provided very detailed test results and proof of the
issue at https://github.com/gluster/glusterfs/issues/3206 on 6 February 2022,
but haven't heard back since.

Best regards,
Zakhar

On Tue, Feb 8, 2022 at 7:14 AM Zakhar Kirpichenko <zakhar at gmail.com> wrote:

> Hi,
>
> I've updated the github issue with more details:
> https://github.com/gluster/glusterfs/issues/3206#issuecomment-1030770617
>
> Looks like there's a memory leak.
>
> /Z
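[Editor's note: one way to quantify what the statedumps show (dumps "full of" wb_inode and inodectx entries) is simply to count the bracketed section headers in each dump. A rough sketch, assuming the statedump's INI-like `[section]` header format; the sample text here is invented for illustration:]

```python
import re
from collections import Counter

def count_statedump_sections(text: str) -> Counter:
    """Count occurrences of each [section] header in a glusterfs statedump."""
    return Counter(re.findall(r"^\[([^\]]+)\]", text, flags=re.MULTILINE))

# Invented sample resembling the entries reported in the thread:
sample = """\
[xlator.performance.write-behind.wb_inode]
path=/some/file
[xlator.performance.quick-read.inodectx]
inode_ctx=0x1
[xlator.performance.write-behind.wb_inode]
path=/other/file
"""

counts = count_statedump_sections(sample)
print(counts["xlator.performance.write-behind.wb_inode"])  # 2
print(counts["xlator.performance.quick-read.inodectx"])    # 1
```

Running this over the before/after dumps makes it easy to see which xlator's per-inode contexts keep accumulating between runs.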
Strahil Nikolov
2022-Apr-02 11:32 UTC
[Gluster-users] GlusterFS 9.5 fuse mount excessive memory usage
Sadly, I can't help, but you can join the regular Gluster community meeting
and ask for feedback on the topic.

Best Regards,
Strahil Nikolov

On Thu, Mar 31, 2022 at 9:57, Zakhar Kirpichenko <zakhar at gmail.com> wrote: