thr3ads.net - freebsd stable - apache hanging on 8.0 AMD64 [Jan 2010]


Rainer Duffner

2010-Jan-09 05:26 UTC

apache hanging on 8.0 AMD64

Hi,

we have an "interesting" problem with FreeBSD 8.0 AMD64:

The server is a HP DL380G5 with two Harpertown-class CPUs and 8 GB RAM.
It is running MySQL, Apache (worker MPM) and PHP as CGI with Fast-CGI
and SUEXEC.
It has over 500 ZFS filesystems that comprise various customers'
websites, each running PHP as its own user.

Soon after we put this system into production, we saw httpd-processes
being stalled in the "ucond" state, leading to a total stand-still of
the apache-server (apache blocked itself somehow).
I disabled ZFS prefetching and the problem went away for a couple of
days - until yesterday, when it happened again.
Swap was unused when it happened the last time.
I switched top into "thread-mode" (M) and saw that the processes
actually seemed to be in different states (zio->i, arc_mr, tx->tx, RUN).
I cannot get any info from kstat, because when the problem happens and I
attach to one of the processes, I don't get anything back - it just sits
there.

Is there anything I can take a look at to further debug this problem?
At the time of the hang, no swap was used:

last pid:  6450;  load averages: 36.32, 30.17, 17.75    up 4+11:15:44  20:11:01
482 processes: 28 running, 452 sleeping, 1 zombie, 1 lock
CPU:     % user,     % nice,     % system,     % interrupt,     % idle
Mem: 1619M Active, 3829M Inact, 2066M Wired, 211M Cache, 827M Buf, 188M Free
Swap: 8192M Total, 8192M Free

  PID USERNAME      PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
 6011 user1          44    0 24960K  3432K RUN     1   2:50  7.08% pure-ftpd
 6038 user2          66    0   161M 18856K RUN     3   1:26  3.47% php-cgi
  716 root           46    0 32452K 13776K select  5 104:53  3.08% snmpd
 6021 user3          63    0   163M 20232K RUN     7   1:28  2.49% php-cgi
 6009 www            44    0   103M 26952K tx->tx  3   0:55  1.76% {httpd}
 6030 www            44    0   101M 26168K CPU4    7   0:57  1.66% {httpd}
 6028 www            44    0   101M 26476K tx->tx  2   0:55  1.66% {httpd}
 6030 www            44    0   101M 26168K zio->i  5   0:55  1.66% {httpd}
 6008 www            44    0   102M 26640K RUN     2   1:23  1.56% {httpd}
 6009 www            46    0   103M 26952K tx->tx  3   1:22  1.56% {httpd}
 6016 www            44    0   102M 26636K tx->tx  2   1:17  1.56% {httpd}
 6024 www            44    0   106M 26568K RUN     1   1:07  1.56% {httpd}
 5978 www            44    0   102M 26960K RUN     0   1:00  1.56% {httpd}
 6008 www            44    0   102M 26640K zio->i  7   0:55  1.56% {httpd}
 5970 www            44    0   108M 27700K arc_mr  4   0:59  1.46% {httpd}
 6024 www            44    0   106M 26568K tx->tx  5   0:50  1.46% {httpd}
 5979 www            45    0   102M 26904K zio->i  1   1:14  1.37% {httpd}
 6009 www            47    0   103M 26952K zio->i  7   1:11  1.37% {httpd}
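
To see at a glance which wait states dominate, the thread list above can be tallied by its STATE column. The following is only a minimal sketch: the function name tally_states and the fixed 11-column layout are assumptions for illustration, and the rows are copied from the top output above.

```python
from collections import Counter

# Rows copied from the top(1) thread-mode output above.
TOP_ROWS = """\
 6011 user1          44    0 24960K  3432K RUN     1   2:50  7.08% pure-ftpd
 6038 user2          66    0   161M 18856K RUN     3   1:26  3.47% php-cgi
  716 root           46    0 32452K 13776K select  5 104:53  3.08% snmpd
 6021 user3          63    0   163M 20232K RUN     7   1:28  2.49% php-cgi
 6009 www            44    0   103M 26952K tx->tx  3   0:55  1.76% {httpd}
 6030 www            44    0   101M 26168K CPU4    7   0:57  1.66% {httpd}
 6028 www            44    0   101M 26476K tx->tx  2   0:55  1.66% {httpd}
 6030 www            44    0   101M 26168K zio->i  5   0:55  1.66% {httpd}
 6008 www            44    0   102M 26640K RUN     2   1:23  1.56% {httpd}
 6009 www            46    0   103M 26952K tx->tx  3   1:22  1.56% {httpd}
 6016 www            44    0   102M 26636K tx->tx  2   1:17  1.56% {httpd}
 6024 www            44    0   106M 26568K RUN     1   1:07  1.56% {httpd}
 5978 www            44    0   102M 26960K RUN     0   1:00  1.56% {httpd}
 6008 www            44    0   102M 26640K zio->i  7   0:55  1.56% {httpd}
 5970 www            44    0   108M 27700K arc_mr  4   0:59  1.46% {httpd}
 6024 www            44    0   106M 26568K tx->tx  5   0:50  1.46% {httpd}
 5979 www            45    0   102M 26904K zio->i  1   1:14  1.37% {httpd}
 6009 www            47    0   103M 26952K zio->i  7   1:11  1.37% {httpd}
"""


def tally_states(rows, command="{httpd}"):
    """Count threads per STATE for rows whose COMMAND equals `command`.

    Assumes the column order shown above:
    PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
    """
    counts = Counter()
    for line in rows.splitlines():
        fields = line.split()
        # Skip anything that is not a complete 11-column row for `command`.
        if len(fields) != 11 or fields[10] != command:
            continue
        counts[fields[6]] += 1
    return counts


if __name__ == "__main__":
    for state, n in tally_states(TOP_ROWS).most_common():
        print(state, n)
```

For the snapshot above, 9 of the 14 visible httpd threads are sitting in the ZFS-related states tx->tx or zio->i, and one more is in arc_mr.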


I disabled all the apache-modules we don't need.

This is currently the only system of its kind we have, but we would
really like to get this fixed so we can move more of our
hosting customers to similarly configured servers.

Another detail: because every user has an access- and an
error-logfile, we had to bump FD_SETSIZE to 16384U.
We tried bumping kern.maxvnodes to larger and larger values (now at
400000, <200k are used), but it didn't really help that much. Disabling
prefetching helped a lot (only one crash in 5 days) - but we would like
to know why it actually happens and then fix it forever ;-)




Best Regards,
Rainer

Brent Jones

2010-Jan-09 08:35 UTC

apache hanging on 8.0 AMD64

From my experience with OpenSolaris and many/large ZFS file systems,
8 GB is on the low end of RAM even for simple file
serving, let alone website and database hosting.
Even though you show little to no swap used, I'd still bet there is a
lot of memory pressure from the ZFS ARC.
Can you try limiting the size of the ARC, or add more memory?
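
As a rough illustration of what limiting the ARC could look like: on FreeBSD the cap is the loader tunable vfs.zfs.arc_max, normally set in /boot/loader.conf. The sketch below only computes a candidate value and prints the corresponding line. The helper name arc_max_line and the 25% fraction are assumptions for illustration, not a sizing recommendation.

```python
def arc_max_line(ram_bytes, fraction):
    """Return a loader.conf line capping the ZFS ARC at `fraction` of RAM.

    `fraction` is a hypothetical tuning knob; the right value depends on
    how much memory MySQL, Apache and PHP need on the machine.
    """
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    # Express the cap in whole mebibytes, using the "M" size suffix.
    arc_mib = int(ram_bytes * fraction) // (1024 * 1024)
    return 'vfs.zfs.arc_max="%dM"' % arc_mib


if __name__ == "__main__":
    # 8 GB of RAM as in the original report, capped at an assumed 25%.
    print(arc_max_line(8 * 1024**3, 0.25))  # prints vfs.zfs.arc_max="2048M"
```

A line like this would take effect at the next boot. Whether 2 GB is a sensible cap for this workload would have to be found by experiment.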

On top of the 500 filesystems, do they also have snapshots under them?
What does disk I/O look like?
I've also seen Apache freeze due to SSL renegotiation (something I'm
currently struggling with), but then you would see idle workers, not
workers in any of these states.



-- 
Brent Jones
brent@servuhome.net
