Nithya Balachandran
2018-Aug-09 13:10 UTC
[Gluster-users] blocking process on FUSE mount in directory which is using quota
Hi,

Please provide the following:

1. gluster volume info
2. statedump of the fuse process when it hangs

Thanks,
Nithya

On 9 August 2018 at 18:24, mabi <mabi at protonmail.ch> wrote:

> Hello,
>
> I recently upgraded my GlusterFS replica 2+1 (arbiter) to version 3.12.12
> and now I see a weird behaviour on my client (using FUSE mount) where I
> have processes (PHP 5.6 FPM) trying to access a specific directory and then
> the process blocks. I can't kill the process either, not even with kill -9.
> I need to reboot the machine in order to get rid of these blocked processes.
>
> This directory differs from the other directories in one respect: it has
> reached its quota soft-limit, as you can see here in the output of
> gluster volume quota list:
>
> Path        Hard-limit  Soft-limit   Used    Available  Soft-limit exceeded?  Hard-limit exceeded?
> ---------------------------------------------------------------------------------------------------
> /directory  100.0GB     80%(80.0GB)  90.5GB  9.5GB      Yes                   No
>
> That does not mean that it is the quota's fault, but it might be a hint
> where to start looking. And by the way, can someone explain to me what
> the soft-limit does? Or does it not do anything special?
>
> Here is the Linux stack trace of a process blocked on that directory,
> which happened with a simple "ls -la":
>
> [Thu Aug 9 14:21:07 2018] INFO: task ls:2272 blocked for more than 120 seconds.
> [Thu Aug 9 14:21:07 2018] Not tainted 3.16.0-4-amd64 #1
> [Thu Aug 9 14:21:07 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [Thu Aug 9 14:21:07 2018] ls D ffff88017ef93200 0 2272 2268 0x00000004
> [Thu Aug 9 14:21:07 2018] ffff88017653f490 0000000000000286 0000000000013200 ffff880174d7bfd8
> [Thu Aug 9 14:21:07 2018] 0000000000013200 ffff88017653f490 ffff8800eeb3d5f0 ffff8800fefac800
> [Thu Aug 9 14:21:07 2018] ffff880174d7bbe0 ffff8800eeb3d6d0 ffff8800fefac800 ffff8800ffe1e1c0
> [Thu Aug 9 14:21:07 2018] Call Trace:
> [Thu Aug 9 14:21:07 2018] [<ffffffffa00dc50d>] ? __fuse_request_send+0xbd/0x270 [fuse]
> [Thu Aug 9 14:21:07 2018] [<ffffffff810abce0>] ? prepare_to_wait_event+0xf0/0xf0
> [Thu Aug 9 14:21:07 2018] [<ffffffffa00e0791>] ? fuse_dentry_revalidate+0x181/0x300 [fuse]
> [Thu Aug 9 14:21:07 2018] [<ffffffff811b944e>] ? lookup_fast+0x25e/0x2b0
> [Thu Aug 9 14:21:07 2018] [<ffffffff811bacc5>] ? path_lookupat+0x155/0x780
> [Thu Aug 9 14:21:07 2018] [<ffffffff81195715>] ? kmem_cache_alloc+0x75/0x480
> [Thu Aug 9 14:21:07 2018] [<ffffffffa00dfca9>] ? fuse_getxattr+0xe9/0x150 [fuse]
> [Thu Aug 9 14:21:07 2018] [<ffffffff811bb316>] ? filename_lookup+0x26/0xc0
> [Thu Aug 9 14:21:07 2018] [<ffffffff811bf594>] ? user_path_at_empty+0x54/0x90
> [Thu Aug 9 14:21:07 2018] [<ffffffff81193e08>] ? kmem_cache_free+0xd8/0x210
> [Thu Aug 9 14:21:07 2018] [<ffffffff811bf59f>] ? user_path_at_empty+0x5f/0x90
> [Thu Aug 9 14:21:07 2018] [<ffffffff811b3d46>] ? vfs_fstatat+0x46/0x90
> [Thu Aug 9 14:21:07 2018] [<ffffffff811b421d>] ? SYSC_newlstat+0x1d/0x40
> [Thu Aug 9 14:21:07 2018] [<ffffffff811d34b8>] ? SyS_lgetxattr+0x58/0x80
> [Thu Aug 9 14:21:07 2018] [<ffffffff81525d0d>] ? system_call_fast_compare_end+0x10/0x15
>
> My 3 gluster nodes are all Debian 9 and my client Debian 8.
>
> Let me know if you need more information.
>
> Best regards,
> Mabi
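On the soft-limit question: as far as I understand the Gluster quota documentation, the soft limit is a warning threshold expressed as a percentage of the hard limit, and writes are only refused at the hard limit. The arithmetic behind the quota row above can be checked with a minimal sketch. The function names and the comparison operators are my own assumptions for illustration, not Gluster code.

```python
import re

# Multipliers to express a size in GB (binary multiples assumed, illustration only).
_TO_GB = {"MB": 1 / 1024, "GB": 1.0, "TB": 1024.0}


def size_in_gb(text):
    """Parse a size string such as '90.5GB' into a float number of GB."""
    match = re.fullmatch(r"([0-9.]+)(MB|GB|TB)", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return float(match.group(1)) * _TO_GB[match.group(2)]


def quota_status(hard, soft_percent, used):
    """Recompute the derived columns of one 'gluster volume quota list' row."""
    hard_gb = size_in_gb(hard)
    used_gb = size_in_gb(used)
    soft_gb = hard_gb * soft_percent / 100  # soft limit is a percentage of the hard limit
    return {
        "soft_limit_gb": soft_gb,
        "available_gb": max(hard_gb - used_gb, 0.0),
        "soft_exceeded": used_gb > soft_gb,   # comparison chosen for illustration
        "hard_exceeded": used_gb > hard_gb,   # comparison chosen for illustration
    }


# Values taken from the /directory row quoted above.
status = quota_status("100.0GB", 80, "90.5GB")
print(status)
```

With the values from the listing, the soft limit works out to 80 GB, 9.5 GB remain, and only the soft limit is exceeded, which matches the "Yes / No" columns.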
mabi
2018-Aug-09 13:17 UTC
[Gluster-users] blocking process on FUSE mount in directory which is using quota
Hi Nithya,

Thanks for the fast answer. Here is the additional info:

1. gluster volume info

Volume Name: myvol-private
Type: Replicate
Volume ID: e7a40a1b-45c9-4d3c-bb19-0c59b4eceec5
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: gfs1a:/data/myvol-private/brick
Brick2: gfs1b:/data/myvol-private/brick
Brick3: gfs1c:/srv/glusterfs/myvol-private/brick (arbiter)
Options Reconfigured:
features.default-soft-limit: 95%
transport.address-family: inet
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
nfs.disable: on
performance.readdir-ahead: on
client.event-threads: 4
server.event-threads: 4
auth.allow: 192.168.100.92

2. Sorry, I have no clue how to take a "statedump" of a process on Linux. Which command should I use for that? And which process do you want it from: the blocked process (for example "ls")?

Regards,
M.

On August 9, 2018 3:10 PM, Nithya Balachandran <nbalacha at redhat.com> wrote:

> Hi,
>
> Please provide the following:
>
> - gluster volume info
> - statedump of the fuse process when it hangs
>
> Thanks,
> Nithya
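For anyone reading this thread later: the statedump is taken from the glusterfs mount process on the client, not from the blocked "ls". To my knowledge the Gluster statedump documentation describes sending SIGUSR1 to that process (the shell equivalent is kill -USR1 <pid>). Below is a minimal sketch of that step. The helper names are hypothetical, and matching on the binary name "glusterfs" is a heuristic I am assuming: on a server node other Gluster daemons may run under the same binary name and would match too.

```python
import os
import signal


def looks_like_fuse_client(argv):
    """Heuristic (assumption): the FUSE mount process runs a binary named 'glusterfs'.

    Brick processes run as 'glusterfsd' and are therefore not matched.
    """
    if not argv:
        return False
    return os.path.basename(argv[0]) == "glusterfs"


def find_candidate_pids(proc_root="/proc"):
    """Return the PIDs whose command line looks like a Gluster FUSE client."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as handle:
                argv = handle.read().decode(errors="replace").split("\0")
        except OSError:
            continue  # the process exited or is not readable
        if looks_like_fuse_client(argv):
            pids.append(int(entry))
    return sorted(pids)


def request_statedump(pid):
    """Ask the given glusterfs process to write a statedump (needs sufficient privileges)."""
    os.kill(pid, signal.SIGUSR1)
```

Typical use would be to run, as root on the client while the hang is occurring, `for pid in find_candidate_pids(): request_statedump(pid)`. The dump files should then appear in the statedump directory, which I believe defaults to /var/run/gluster; check your installation if nothing shows up there.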
Alvin Starr
2018-Aug-09 13:30 UTC
[Gluster-users] blocking process on FUSE mount in directory which is using quota
A while back we found an issue with PHP around file locks.
https://bugs.php.net/bug.php?id=53076
This would leave hanging processes that we could only kill by forcibly remounting the filesystems.
It's not really a PHP bug, but they are not willing to fix the code to deal with Gluster's idiosyncrasies.

--
Alvin Starr         || land: (905)513-7688
Netvel Inc.         || Cell: (416)806-0133
alvin at netvel.net ||
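A side note on spotting such hung processes: the hung-task report earlier in the thread shows "ls" in state D (uninterruptible sleep), which is consistent with kill -9 having no visible effect. The sketch below, with function names of my own choosing, extracts the state field from the contents of /proc/<pid>/stat. It splits on the last closing parenthesis because the command name itself may contain spaces or parentheses.

```python
def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")  # comm may itself contain ')'
    pid = int(stat_line[:open_paren])
    comm = stat_line[open_paren + 1:close_paren]
    state = stat_line[close_paren + 1:].split()[0]
    return pid, comm, state


def is_uninterruptible(stat_line):
    """True if the process is in state D (uninterruptible sleep)."""
    return parse_stat(stat_line)[2] == "D"


# Hypothetical stat line modelled on the blocked 'ls' (PID 2272, parent 2268).
sample = "2272 (ls) D 2268 2272 2268 0 -1 4194304"
print(parse_stat(sample))
```

Reading /proc/<pid>/stat for every numeric directory under /proc and filtering with is_uninterruptible gives a quick list of candidates to correlate with the statedump.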