On Fri, Mar 31, 2017 at 12:29 PM, Amar Tumballi <atumball at redhat.com> wrote:

> Hi Alvin,
>
> Thanks for the dump output. It helped a bit.
>
> For now, I recommend turning off the open-behind and read-ahead performance
> translators to get rid of this situation, as I noticed hung FLUSH
> operations from these translators.

Looks like I gave the wrong advice by looking at the snippet below:

[global.callpool.stack.61]
stack=0x7f6c6f628f04
uid=48
gid=48
pid=11077
unique=10048797
lk-owner=a73ae5bdb5fcd0d2
op=FLUSH
type=1
cnt=5

[global.callpool.stack.61.frame.1]
frame=0x7f6c6f793d88
ref_count=0
translator=edocs-production-write-behind
complete=0
parent=edocs-production-read-ahead
wind_from=ra_flush
wind_to=FIRST_CHILD (this)->fops->flush
unwind_to=ra_flush_cbk

[global.callpool.stack.61.frame.2]
frame=0x7f6c6f796c90
ref_count=1
translator=edocs-production-read-ahead
complete=0
parent=edocs-production-open-behind
wind_from=default_flush_resume
wind_to=FIRST_CHILD(this)->fops->flush
unwind_to=default_flush_cbk

[global.callpool.stack.61.frame.3]
frame=0x7f6c6f79b724
ref_count=1
translator=edocs-production-open-behind
complete=0
parent=edocs-production
wind_from=io_stats_flush
wind_to=FIRST_CHILD(this)->fops->flush
unwind_to=io_stats_flush_cbk

[global.callpool.stack.61.frame.4]
frame=0x7f6c6f79b474
ref_count=1
translator=edocs-production
complete=0
parent=fuse
wind_from=fuse_flush_resume
wind_to=FIRST_CHILD(this)->fops->flush
unwind_to=fuse_err_cbk

[global.callpool.stack.61.frame.5]
frame=0x7f6c6f796684
ref_count=1
translator=fuse
complete=0

Most probably, the issue is with write-behind's flush. So please turn off
write-behind and test. If you don't have any hung httpd processes after that,
please let us know.

-Amar

> On Wed, Mar 29, 2017 at 6:56 AM, Alvin Starr <alvin at netvel.net> wrote:
>
>> We are running gluster 3.8.9-1 on CentOS 7.3.1611 for the servers and on
>> the clients 3.7.11-2 on CentOS 6.8.
>>
>> We are seeing httpd processes hang in fuse_request_send or sync_page.
>>
>> These calls are from PHP 5.3.3-48 scripts.
>>
>> I am attaching a tgz file that contains the process dump from glusterfsd
>> and the hung pids along with the offending pids' stacks from
>> /proc/{pid}/stack.
>>
>> This has been a low-level annoyance for a while, but it has become a much
>> bigger issue because the number of hung processes went from a few a week to
>> a few hundred a day.
>>
>> --
>> Alvin Starr || voice: (905)513-7688
>> Netvel Inc. || Cell: (416)806-0133
>> alvin at netvel.net ||

--
Amar Tumballi (amarts)
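The reasoning above (the FLUSH is stuck in the lowest translator of the stack) can be checked mechanically. The following is only a minimal sketch, not a Gluster tool: it assumes the statedump excerpt keeps the `[section]` / `key=value` layout shown in this thread, and the function names `parse_statedump` and `deepest_translator` are hypothetical helpers made up for illustration. The sample text is abridged from the dump above.

```python
import re

# Matches a statedump section header such as [global.callpool.stack.61.frame.1]
SECTION_RE = re.compile(r"^\[(?P<name>[^\]]+)\]$")


def parse_statedump(text):
    """Parse '[section]' headers followed by key=value lines into a dict of dicts."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = SECTION_RE.match(line)
        if match:
            current = sections.setdefault(match.group("name"), {})
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    return sections


def deepest_translator(sections, stack_name):
    """Return the incomplete translator that is no other frame's parent, or None."""
    prefix = stack_name + ".frame."
    frames = [f for name, f in sections.items() if name.startswith(prefix)]
    parents = {f.get("parent") for f in frames}
    leaves = [
        f.get("translator")
        for f in frames
        if f.get("complete") == "0" and f.get("translator") not in parents
    ]
    # Only answer when the call chain has exactly one unfinished leaf.
    return leaves[0] if len(leaves) == 1 else None


# Abridged from the dump quoted in this thread (only the fields used here).
SAMPLE = """\
[global.callpool.stack.61]
op=FLUSH
cnt=5

[global.callpool.stack.61.frame.1]
translator=edocs-production-write-behind
complete=0
parent=edocs-production-read-ahead

[global.callpool.stack.61.frame.2]
translator=edocs-production-read-ahead
complete=0
parent=edocs-production-open-behind

[global.callpool.stack.61.frame.3]
translator=edocs-production-open-behind
complete=0
parent=edocs-production

[global.callpool.stack.61.frame.4]
translator=edocs-production
complete=0
parent=fuse

[global.callpool.stack.61.frame.5]
translator=fuse
complete=0
"""

sections = parse_statedump(SAMPLE)
stack = "global.callpool.stack.61"
print(sections[stack]["op"], deepest_translator(sections, stack))
# prints: FLUSH edocs-production-write-behind
```

For this stack the only incomplete frame that nothing else was wound from is write-behind, which matches the corrected diagnosis. The option itself is normally toggled with `gluster volume set <VOLNAME> performance.write-behind off`; the volume name is left as a placeholder here, since the thread never states it explicitly.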
Thanks Amar, I'll consider your recommendations. But why is performance totally
different on the two nodes? Will data be written to both nodes at the same time?
From: Amar Tumballi <atumball at redhat.com>
Sent: Friday, March 31, 2017 3:14 PM
To: Alvin Starr <alvin at netvel.net>
Cc: gluster-users at gluster.org List <gluster-users at gluster.org>
Subject: Re: [Gluster-users] hanging httpd processes.
Sorry, I replied to the wrong thread; please disregard my previous message.
From: Yong Zhang <hiscal at outlook.com>
Sent: Saturday, April 1, 2017 1:49 AM
To: Amar Tumballi <atumball at redhat.com>; Alvin Starr <alvin at netvel.net>
Cc: gluster-users at gluster.org List <gluster-users at gluster.org>
Subject: Re: [Gluster-users] hanging httpd processes.