I've just had a weird transient problem on a (very loaded) 2 CPU web
server. Suddenly it stopped wanting to connect to the database server
with "access denied" error. Looking at security log (I have ipfw
logging
enabled), I found this:
Apr 25 14:17:17 duality kernel: ipfw: 65400 Deny TCP
XXX.XXX.XXX.107:5432 161.53.72.111:49213 in via fxp0
Apr 25 14:17:17 duality kernel: ipfw: 65400 Deny TCP
XXX.XXX.XXX.107:5432 161.53.72.111:49213 in via fxp0
Apr 25 14:17:17 duality kernel: ipfw: 65400 Deny TCP
XXX.XXX.XXX.107:5432 161.53.72.111:53345 in via fxp0
Apr 25 14:17:17 duality kernel: ipfw: 65400 Deny TCP
XXX.XXX.XXX.107:5432 161.53.72.111:61865 in via fxp0
And even this:
Apr 25 14:17:07 duality kernel: ipfw: 65400 Deny TCP XXX.XXX.XXX.119:80
83.131.225.20:65125 out via fxp0
Apr 25 14:17:08 duality kernel: ipfw: 65400 Deny TCP XXX.XXX.XXX.119:80
83.131.225.20:64431 out via fxp0
Apr 25 14:17:09 duality kernel: ipfw: 65400 Deny TCP XXX.XXX.XXX.111:80
193.198.134.192:1221 out via fxp0
Apr 25 14:17:09 duality kernel: ipfw: 65400 Deny TCP XXX.XXX.XXX.119:80
83.131.225.20:65121 out via fxp0
Apr 25 14:17:09 duality kernel: ipfw: 65400 Deny TCP XXX.XXX.XXX.111:80
83.131.26.194:1171 out via fxp0
.107 is the DB server, .119 and .111 are virtual hosts on this web
server. Looking at messages log, there are occasionaly garbled messages
involving ipfw:
Apr 6 16:03:39 duality kernel: <11i>ipfw 65400 Deny TCP 153.198.12
T6P:1567 161.51.7:.111:80 0n via f7p0
Apr 6 16:03:49 duality kernel: 1pfw: 65400 Dfny
TCP1161.53.76.419:80n161.53.73.40:2042 out:3ia fxp0
Apr 6 16:03:49 duality kernel: <<111108>>iA
Apr 6 16:03:49 duality kernel: p<fw: 118>pr 6
Apr 6 16:03:49 duality kernel: <16:031:1409> d65
Apr 6 16:03:49 duality kernel: <400 De1n1y8 >Tu
Apr 6 16:03:49 duality kernel: a<lity k1e1r0n>C
Apr 6 16:03:49 duality kernel: <P 161.15183>.e
Apr 6 16:03:49 duality kernel: <l: 1pfw:1 16504>07
Apr 6 16:03:49 duality kernel: 3<.48:210138>70
Apr 6 16:03:49 duality kernel: < Dfny1 1T0>CP
Apr 6 16:03:49 duality kernel: 1<61.531.1872>1.
Apr 6 16:03:49 duality kernel: 1<61.531.1706.4>11
Apr 6 16:03:49 duality kernel: 1<9:80 1i1n8> 9v:
Apr 6 16:03:49 duality kernel: 8<0n161.5113.07>3ia
Apr 6 16:03:49 duality kernel: < fxp011
Apr 6 16:03:49 duality kernel: 8>.40:2042 out:3ia fxp0
Apr 6 16:03:54 duality kernel: 65400 Deny TCP 161.53.72.119:80
161.53.74.54:2339 out via fxp0
Apr 6 16:03:54 duality kernel: .119:80 161.53.74.54:2345 out via fxp0
Apr 6 16:03:54 duality kernel: via fxp0
... but nothing recent.
Here's netstat -m:
652/863/1515 mbufs in use (current/cache/total)
523/629/1152/25600 mbuf clusters in use (current/cache/total/max)
1209K/1473K/2682K bytes allocated to network (current/cache/total)
23839671/5448434/5360698 requests for mbufs denied
(mbufs/clusters/mbuf+clusters)
5/205/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
160408 requests for I/O initiated by sendfile
11064 calls to protocol drain routines
"requests for mbufs denied" is large, but I don't know why would
it lead
to "access denied" message, as opposed to "out of memory"?
I suspect this has happened before, but I don't really know how to
prevent this "event" or what exactly causes it. It went away when I
killed a (unrelated, but database using) process that was stuck for
hours. During this, pagezero kthread was using almost 100% of a CPU.
This is 6.1-PRERELEASE from some time ago.
Any ideas or hints where to search next?