Hello,

I am deploying an application-proxy firewall based on FreeBSD (www.kernun.com, though there is not much English there) and am having frequent panics of RELENG_6_1 under load. The server has IP forwarding disabled.

I've got two machines in a carp cluster and the transparent proxies use PF to get the data.

I don't know much about kernel internals and PF, but from the following backtrace I understand that the crash happens because rpool->cur on line 2158 in src/sys/contrib/pf/net/pf.c is NULL and is dereferenced. It probably shouldn't happen, yet it does.

The machines are SMP and were running an SMP kernel. The only places where pool.cur (or pool->cur) is assigned to are in pf_ioctl.c. There are some lock operations around those assignments, so the authors presumably believed the accesses are properly locked.

I ran with kern.smp.disabled=1 for a while before I put the old firewall back in place and didn't see the panic, but the time was definitely too short to convince me that it fixes the issue. Can setting debug.mpsafenet to 0 possibly also help?

I could probably band-aid this particular failure mode by returning failure instead of panicking, but the bug is probably elsewhere.

I've lost the debug kernel this backtrace came from and therefore can't take this much further :-(. Unfortunately, so far I can only reproduce the problem in production, and for obvious reasons I can't put a debug kernel there.

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x28
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xffffffff801ab528
stack pointer           = 0x10:0xffffffffb1ade650
frame pointer           = 0x10:0xffffff004cc7cc30
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 15 (swi1: net)
trap number             = 12
panic: page fault

#0  doadump () at pcpu.h:172
#1  0x0000000000000004 in ?? ()
#2  0xffffffff803d5137 in boot (howto=260) at ../../../kern/kern_shutdown.c:402
#3  0xffffffff803d58a1 in panic (fmt=0xffffff007ba32000 "@\223<A3>{") at ../../../kern/kern_shutdown.c:558
#4  0xffffffff80543b3f in trap_fatal (frame=0xffffff007ba32000, eva=18446742976272241472) at ../../../amd64/amd64/trap.c:660
#5  0xffffffff80543e5f in trap_pfault (frame=0xffffffffb1ade5a0, usermode=0) at ../../../amd64/amd64/trap.c:573
#6  0xffffffff80544113 in trap (frame={tf_rdi = 2, tf_rsi = -1098223465792, tf_rdx = -1098439497700, tf_rcx = -1314002464, tf_r8 = 0, tf_r9 = -1314002776, tf_rax = 0, tf_rbx = 0, tf_rbp = -1098223465424, tf_r10 = 1, tf_r11 = 257, tf_r12 = -1098439497700, tf_r13 = -1314002776, tf_r14 = 2, tf_r15 = -1314002464, tf_trapno = 12, tf_addr = 40, tf_flags = 216171684640539392, tf_err = 0, tf_rip = -2145733336, tf_cs = 8, tf_rflags = 66118, tf_rsp = -1314003360, tf_ss = 16}) at ../../../amd64/amd64/trap.c:352
#7  0xffffffff8052feab in calltrap () at ../../../amd64/amd64/exception.S:168
#8  0xffffffff801ab528 in pf_map_addr (af=2 '\002', r=0xffffff004cc7cac0, saddr=0xffffff003fe7681c, naddr=0xffffffffb1ade9e0, init_addr=0x0, sn=0xffffffffb1ade8a8) at ../../../contrib/pf/net/pf.c:2163
#9  0xffffffff801acab6 in pf_get_translation (pd=0xffffffffb1ade9c0, m=0xffffff0042ede900, off=20, direction=1, kif=0xffffff007b038a00, sn=0xffffffffb1ade8a8, saddr=0xffffff003fe7681c, sport=0, daddr=0xffffff003fe76820, dport=50881, naddr=0xffffffffb1ade9e0, nport=0xffffffffb1ade8b6) at ../../../contrib/pf/net/pf.c:2618
#10 0xffffffff801b315b in pf_test_tcp (rm=0xffffffffb1ade960, sm=0xffffffffb1ade950, direction=1, kif=0xffffff007b038a00, m=0xffffff0042ede900, off=20, h=0xffffff003fe76810, pd=0xffffffffb1ade9c0, am=0xffffffffb1ade968, rsm=0xffffffffb1ade970, ifq=0x2, inp=0x0) at ../../../contrib/pf/net/pf.c:3013
#11 0xffffffff801b5694 in pf_test (dir=1, ifp=0xffffff0000bee800, m0=0xffffffffb1adeaa0, eh=0xffffffffb1ade97e, inp=0x0) at ../../../contrib/pf/net/pf.c:6449
#12 0xffffffff801bafb2 in pf_check_in (arg=0x2, m=0xffffffffb1adeaa0, ifp=0xffffff004cc7cac0, dir=-1314002464, inp=0xffffffffb1ade9e0) at ../../../contrib/pf/net/pf_ioctl.c:3358
#13 0xffffffff80461c2e in pfil_run_hooks (ph=0xffffffff807e0920, mp=0xffffffffb1adeb28, ifp=0xffffff0000bee800, dir=1, inp=0x0) at ../../../net/pfil.c:139
#14 0xffffffff8048d225 in ip_input (m=0xffffff0042ede900) at ../../../netinet/ip_input.c:465
#15 0xffffffff8046180c in netisr_processqueue (ni=0xffffffff807df690) at ../../../net/netisr.c:236
#16 0xffffffff80461abd in swi_net (dummy=0x2) at ../../../net/netisr.c:349
#17 0xffffffff803bbd99 in ithread_loop (arg=0xffffff00000506a0) at ../../../kern/kern_intr.c:684
#18 0xffffffff803ba527 in fork_exit (callout=0xffffffff803bbc50 <ithread_loop>, arg=0xffffff00000506a0, frame=0xffffffffb1adec50) at ../../../kern/kern_fork.c:805
#19 0xffffffff8053020e in fork_trampoline () at ../../../amd64/amd64/exception.S:394
#20 0x0000000000000000 in ?? ()
The firewall also reports lots of PF problems during operation:

Jul 20 10:44:11 fw1 kernel: Jul 20 10:44:11 fw1 HTTP[7607]: KERN-100-E [natutil.c:770] ioctl(): Invalid argument (EINVAL=22)
Jul 20 10:44:11 fw1 kernel: Jul 20 10:44:11 fw1 HTTP[7607]: NATT-111-E add_rule(): PF ioctl DIOCADDRULE failed
Jul 20 10:44:11 fw1 kernel: Jul 20 10:44:11 fw1 HTTP[7607]: NATT-701-E addnatmap out(): Adding TCP NAT MAP from [127.0.0.1]:60860 to [212.80.76.13]:80 -> [193.179.161.10]:60860 failed
Jul 20 10:44:11 fw1 kernel: Jul 20 10:44:11 fw1 HTTP[7607]: NETL-210-E netbind(server,10): NAT binding failed

The kernel also often reports "pool_ticket: 1429 != 1430" (with increasing numbers over time).

Thank you very much for any advice.

Regards,
Michal
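The band-aid Michal mentions would look roughly like this - a minimal sketch against the top of pf_map_addr() in RELENG_6_1's sys/contrib/pf/net/pf.c. The surrounding declarations are reconstructed from the backtrace and from memory of the sources, not verified against the tree, and this is an illustration of the "fail instead of panic" idea rather than the eventual proper fix:

int
pf_map_addr(sa_family_t af, struct pf_rule *r, struct pf_addr *saddr,
    struct pf_addr *naddr, struct pf_addr *init_addr, struct pf_src_node **sn)
{
        struct pf_pool  *rpool = &r->rpool;
        struct pf_addr  *raddr, *rmask;

        /* band-aid: an empty replacement-address pool means there is
         * nothing to translate to; fail the mapping instead of
         * dereferencing a NULL rpool->cur (the fault at 0x28 above) */
        if (rpool->cur == NULL)
                return (1);
        raddr = &rpool->cur->addr.v.a.addr;
        rmask = &rpool->cur->addr.v.a.mask;

        /* ... remainder of pf_map_addr() unchanged ... */
}

The callers already treat a nonzero return as "no translation possible", so the offending packet would simply fail to match the rule instead of taking the kernel down.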
Michal Mertl wrote:
> I am deploying an application-proxy firewall based on FreeBSD
> (www.kernun.com, though there is not much English there) and am having
> frequent panics of RELENG_6_1 under load.
> [...]
> I ran with kern.smp.disabled=1 for a while before I put the old
> firewall back in place and didn't see the panic, but the time was
> definitely too short to convince me that it fixes the issue. Can
> setting debug.mpsafenet to 0 possibly also help?

Are you using user and/or group rules in your PF ruleset? If so, then you will want to set debug.mpsafenet to 0, as it's a known issue with pf(4) currently.

-Proto
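For reference, on FreeBSD 6.x debug.mpsafenet is a boot-time tunable rather than a runtime sysctl, so it has to be set from the loader; a minimal example:

# /boot/loader.conf
debug.mpsafenet="0"    # run the network stack under the Giant lock

The setting takes effect on the next boot.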
[CC'ing -pf]

On Thursday 20 July 2006 17:53, Michal Mertl wrote:
> I am deploying an application-proxy firewall based on FreeBSD
> (www.kernun.com, though there is not much English there) and am having
> frequent panics of RELENG_6_1 under load. The server has IP forwarding
> disabled.
>
> I've got two machines in a carp cluster and the transparent proxies use
> PF to get the data.

Which proxies are you using? The "pool_ticket: 1429 != 1430" messages you quote below indicate a synchronization problem within the app talking to pf via ioctls. Tickets are used to ensure atomic commits for operations that require more than one ioctl. If your proxy app runs in parallel, it might screw up the internal state and thus leave it undefined afterwards. I'll give you that this shouldn't cause a kernel problem, but if we can fix the app, we can probably find the right sanity check more easily.

> [rest of the report, backtrace, and log excerpts snipped - quoted in
> full in the original message above]

-- 
/"\  Best regards,                      | mlaier@freebsd.org
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | mlaier@EFnet
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News
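To make the ticket mechanism concrete, here is a paraphrase of the sanity check in pf_ioctl.c's DIOCADDRULE handler that produces the "pool_ticket: 1429 != 1430" messages; it is written from memory of the FreeBSD 6.x sources, so treat the exact names and the debug macro as assumptions. ticket_pabuf is a single global counter bumped by every DIOCBEGINADDRS, so if a second process slips a DIOCBEGINADDRS in between this process's DIOCBEGINADDRS and its DIOCADDRULE, the comparison fails:

case DIOCADDRULE: {
        struct pfioc_rule       *pr = (struct pfioc_rule *)addr;
        ...
        /* the pool ticket obtained at DIOCBEGINADDRS time must still
         * be current, i.e. nobody restarted the pool buffer since */
        if (pr->pool_ticket != ticket_pabuf) {
                DPFPRINTF(PF_DEBUG_MISC, ("pool_ticket: %d != %d\n",
                    pr->pool_ticket, ticket_pabuf));
                error = EBUSY;  /* transient collision; callers retry */
                break;
        }
        ...
}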
Hello,

I am the author of the proxy code you have been discussing here. I have done some investigation today and will try to summarize the results.

> Thank you. No, I am not using it and I am quite sure the proxies aren't
> doing it behind my back either. In fact there isn't a single entry in

I can confirm that the proxies do not use any user/group rules.

> the rules tables - there are only rdr rules generated on the fly by the
> proxies.

Depending on configuration, the proxies can install rdr and/or map rules. The rdr rules are used for "incoming transparency", i.e., to catch clients' connections going through the firewall, but with the target server's destination address, not the firewall's. The map rules are used for "outgoing transparency", i.e., for changing the source address of connections to servers (for example, the original client source IP can be used instead of the firewall's IP).

> Which proxies are you using? The "pool_ticket: 1429 != 1430" messages
> you quote below indicate a synchronization problem within the app
> talking to pf via ioctls. Tickets are used to ensure atomic commits for
> operations that require more than one ioctl.

This should only be a transient error. The proxies detect this situation and retry the failed operation. The idea behind this behavior was that the ruleset manipulation is done by a small number of ioctls called quickly one after another by a single function, so collisions should not occur too often. But maybe I can add exclusive locking of the PF device, which should remove all collisions among several proxy processes trying to change PF rules at the same time.

> If the proxy is using DIOCCHANGERULE (it must be the proxy, pfctl isn't
> using it at all), AND is trying to add/update a rule that requires at
> least one replacement address but contains an empty list, then this
> would cause the panic seen when that rule later matches a packet.

I think this is not the case. The proxy uses either DIOCXBEGIN + DIOCBEGINADDRS + DIOCADDADDR + DIOCADDRULE + DIOCXCOMMIT or DIOCCHANGERULE(PF_CHANGE_GET_TICKET) + DIOCBEGINADDRS + DIOCADDADDR + DIOCCHANGERULE(PF_CHANGE_ADD_TAIL). The first method is used in the first call to create the ruleset; in subsequent calls, the second method is used to modify it. But the list is never empty. If it were, the panic would occur every time, which is not happening - there are other installations (though probably not 64-bit SMP) working well. I can imagine the list becoming empty only if some other process deletes it via DIOCBEGINADDRS during pfioctl processing, after the "pcr->pool_ticket != ticket_pabuf" check. But this should be guarded by PF_LOCK. Of course, I may have made a mistake in the calling sequence of the PF ioctls. I wrote this piece of code by trial and error, using pfctl as a source of ideas, because I have not found a detailed manual for the PF API.

> Michal, can you please confirm that the patch above fixes the panic?
> The proxy will still misbehave and cause the log messages (one more
> EINVAL in this case ;), but the kernel shouldn't crash anymore.

Yes, the patch should fix the panics, but it does not solve the problem.

> This functionality of the software (using PF with anchors) is quite new

It is not so new; it has been in production use for about 9 months now.

> Anchors were introduced for this purpose, i.e. splitting the ruleset
> into separate pieces, over each of which a single process can have
> authority, so different processes don't stomp on each other's toes with
> ruleset modifications.

In fact, the possibility to split the ruleset into anchors owned by individual processes was one of our major reasons to move from IPF to PF.

> Ask them if they really need to still use DIOCCHANGERULE, as the idea
> with anchors is generally to only operate within one anchor, and
> usually flush or replace the (smaller) ruleset within.

DIOCCHANGERULE is useful for us, because each proxy process can have several redirections or mappings and it creates/deletes them incrementally, as it opens/closes individual network connections. It seems unnecessary to me to always replace the whole ruleset.

> Each anchor has its own ticket, so if you're seeing ticket mismatches,
> that means there are concurrent operations on the same anchor, even.

But the conflicts are on the pool_ticket which is, as I understand it, only one for all operations.

> They (the Kernun authors) run multiple processes for each proxy.
> Originally they used a slightly modified Apache core for their proxies
> I believe. Thus there are probably more processes using the same
> anchor.

No, there are not. The anchors are even named by the owner process ID.

> I don't really understand what they do inside - I would think that when
> there are no traffic blocking rules, there's no point in doing anything
> with PF except initial setting of the rdr rule to the proxy.

As I have mentioned above, there are dynamically created rules for outgoing transparent connections (source-address in the configuration) and for some special cases, e.g., handling FTP data connections.

-- 
Martin Beran
Senior Developer
Trusted Network Solutions, a.s. [ www.tns.cz ]
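For readers unfamiliar with the PF ioctl API, here is a sketch (not Kernun's actual code) of the first sequence described above - DIOCXBEGIN + DIOCBEGINADDRS + DIOCADDADDR + DIOCADDRULE + DIOCXCOMMIT - adding one rdr rule with a one-address pool into a per-process anchor. The ticket flow follows what pfctl does against net/pfvar.h on FreeBSD 6.x; the rule and address contents, error recovery, and the helper name add_rdr_rule() are all made up for illustration:

#include <sys/param.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <net/if.h>
#include <net/pfvar.h>
#include <err.h>
#include <string.h>

static void
add_rdr_rule(int pf, const char *anchor, const struct pf_rule *rule,
    const struct pf_pooladdr *pa)
{
        struct pfioc_trans      trans;
        struct pfioc_trans_e    te;
        struct pfioc_pooladdr   pp;
        struct pfioc_rule       pr;

        /* 1. open a transaction on the anchor's rdr ruleset; the
         *    kernel hands back a ruleset ticket in te.ticket */
        memset(&te, 0, sizeof(te));
        te.rs_num = PF_RULESET_RDR;
        strlcpy(te.anchor, anchor, sizeof(te.anchor));
        memset(&trans, 0, sizeof(trans));
        trans.size = 1;
        trans.esize = sizeof(te);
        trans.array = &te;
        if (ioctl(pf, DIOCXBEGIN, &trans) == -1)
                err(1, "DIOCXBEGIN");

        /* 2. start a fresh replacement-address pool; this bumps the
         *    single global pool ticket, which is exactly what
         *    concurrent processes race on */
        memset(&pp, 0, sizeof(pp));
        if (ioctl(pf, DIOCBEGINADDRS, &pp) == -1)
                err(1, "DIOCBEGINADDRS");

        /* 3. put the replacement address into the pool buffer */
        pp.addr = *pa;
        if (ioctl(pf, DIOCADDADDR, &pp) == -1)
                err(1, "DIOCADDADDR");

        /* 4. add the rule, presenting both tickets; a failure here
         *    means another process called DIOCBEGINADDRS in between */
        memset(&pr, 0, sizeof(pr));
        pr.ticket = te.ticket;          /* from DIOCXBEGIN */
        pr.pool_ticket = pp.ticket;     /* from DIOCBEGINADDRS */
        strlcpy(pr.anchor, anchor, sizeof(pr.anchor));
        pr.rule = *rule;
        if (ioctl(pf, DIOCADDRULE, &pr) == -1)
                err(1, "DIOCADDRULE");

        /* 5. commit the whole transaction atomically */
        if (ioctl(pf, DIOCXCOMMIT, &trans) == -1)
                err(1, "DIOCXCOMMIT");
}

Removing a rule would be done similarly, by replacing or flushing the anchor's ruleset inside another DIOCXBEGIN/DIOCXCOMMIT pair. Note that steps 2-4 are not covered by the transaction ticket, which is why the pool ticket exists at all.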
On Fri, Jul 21, 2006 at 02:15:33PM +0000, Martin Beran wrote:
> I think this is not the case. The proxy uses either DIOCXBEGIN +
> DIOCBEGINADDRS + DIOCADDADDR + DIOCADDRULE + DIOCXCOMMIT or
> DIOCCHANGERULE(PF_CHANGE_GET_TICKET) + DIOCBEGINADDRS + DIOCADDADDR +
> DIOCCHANGERULE(PF_CHANGE_ADD_TAIL). The first method is used in the
> first call to create the ruleset; in subsequent calls, the second
> method is used to modify it.

I did an experiment - repeatedly adding and deleting rules in two processes, as fast as possible. I expected EBUSY from time to time, but I also received EINVAL nondeterministically. It seems to me that when the PF ioctl() is called simultaneously by two processes, it sometimes returns EINVAL, although it should either complete the operation (the parameters are correct) or return EBUSY.

-- 
Martin Beran
Senior Developer
Trusted Network Solutions, a.s.
mobile: +420 603 820 932
[ www.tns.cz ]
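A skeleton of the experiment described above: the parent and a forked child hammer /dev/pf concurrently and classify the errno values they get back. try_add_and_delete_rule() is a hypothetical helper, not shown here - one add+delete cycle, which could be built from the add_rdr_rule() sketch earlier plus a matching flush. If the kernel-side locking were correct, only EBUSY should ever show up:

#include <sys/types.h>
#include <sys/wait.h>
#include <err.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int     try_add_and_delete_rule(int pf);        /* hypothetical helper */

static void
hammer(void)
{
        int     pf, i, ebusy = 0, einval = 0, other = 0;

        if ((pf = open("/dev/pf", O_RDWR)) == -1)
                err(1, "/dev/pf");
        for (i = 0; i < 100000; i++) {
                if (try_add_and_delete_rule(pf) == -1) {
                        if (errno == EBUSY)
                                ebusy++;        /* expected ticket collision */
                        else if (errno == EINVAL)
                                einval++;       /* should never happen */
                        else
                                other++;
                }
        }
        printf("pid %ld: EBUSY=%d EINVAL=%d other=%d\n",
            (long)getpid(), ebusy, einval, other);
        close(pf);
}

int
main(void)
{
        pid_t   child;

        if ((child = fork()) == -1)
                err(1, "fork");
        hammer();               /* parent and child race each other */
        if (child > 0)
                wait(NULL);
        return (0);
}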