On Thursday, February 11, 2021 7:57:43 AM CET Helge Oldach wrote:
> Hi Stefan,
>
> Stefan Ehmann wrote on Thu, 11 Feb 2021 02:50:35 +0100 (CET):
> > On Wednesday, February 10, 2021 7:46:25 AM CET Helge Oldach wrote:
> > > Hi,
> > >
> > > Stefan Ehmann wrote on Tue, 09 Feb 2021 23:23:32 +0100 (CET):
> > > > I'm having issues with stale TCP connections after the upgrade from
> > > > 12.2 to 13.0-BETA1.
> > > >
> > > > Symptoms:
> > > > Outgoing TCP connections no longer receive data after being idle.
> > > >
> > > > I can do more testing later, but I think these ipfw rules trigger the
> > > > problem:
> > > > - check-state
> > > > - allow tcp from me to any setup keep-state
> > > > - deny ip from any to any
> > > >
> > > > After establishing an outgoing connection (e.g., via netcat), I see a
> > > > new dynamic rule and the 300s counter running down via
> > > > # ipfw -Da list
> > > >
> > > > net.inet.ip.fw.dyn_keepalive is set to 1, so the timer should be
> > > > refreshed via keep-alive on idle connections.
> > > >
> > > > Don't know if it's deterministic, but from what I've seen so far:
> > > > - When counter gets low the first time, it is reset to 300 as
> > > >   expected.
> > > > - When the counter nears zero for the second time, the dynamic rule is
> > > >   deleted and I get ipfw denies.
> > >
> > > I am afraid I can't reproduce. I have followed your test case however
> > > I'm seeing that a TCP keepalive reliably triggers a timer refresh. For
> > > example (sleep 1 loop over ipfw -Da list | grep):
> >
> > Tested in VirtualBox with amd64.vmdk from:
> >
> > https://download.freebsd.org/ftp/releases/VM-IMAGES/13.0-BETA1/
>
> We do agree on amd64, right?
Yes, amd64.
> I precisely followed your steps (VirtualBox 6.1.18), except:
> > Terminal 1:
> > kldload ipfw
> > ipfw add check-state
> > ipfw allow tcp from me to any setup keep-state
>
> ipfw add allow tcp from me to any setup keep-state
>
> > /bin/sh (I don't speak csh)
>
> toor is my friend :-)
>
> > while true; do sleep 1; ipfw -Da list; done
>
> while sleep 1; do ipfw -Da list; done
>
> > Terminal 2:
> > nc <remote> 12345
>
> nc <remote> 80
>
> (Apache is listening on <remote>.)
>
> > On <remote> nc -l 12345 is running
>
> On <remote>: tcpdump port 80
>
> I am seeing keepalives every 5 minutes and the ipfw timer has fired
> every time, resetting the dynamic rule to 300 secs TTL. I am also seeing
> keepalives received and replied in the tcpdump. Everything according
> to the books I am afraid. My nc session is still sending after some 45
> minutes.
>
> > Updated to 187492ef639f, but nothing changed.
>
> Hmmm. I'm out of ideas. Are you 100% sure the remote session is not torn
> down routinely after something between 300-600 seconds silence?
Yes. If I remove the deny rule, the connection is still working after the
dynamic rule expired.
Thanks so far. I'll try some more testing on different hardware over the
weekend.
Initially, I've seen the problem on epair(4) devices. That should rule out
network hardware issues.
I've never seen this problem on 12.2 but I hope I can avoid a git bisect from
there.
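
For anyone repeating the test, the "sleep 1 loop over ipfw -Da list | grep" mentioned above could be sketched roughly as below. This is only an illustration, not the exact command either of us ran. The helper names dyn_ttl and watch_dyn are made up for this mail, and the filter assumes that dynamic-rule lines show the remaining lifetime in parentheses as "(NNNs)"; adjust the pattern if the output looks different on your system.

```shell
#!/bin/sh
# Hypothetical helper: read "ipfw -Da list" output on stdin and print the
# remaining lifetime (in seconds) of every dynamic rule, one per line.
# Assumption: dynamic-rule lines contain the lifetime as "(NNNs)";
# static rules have no such field and are skipped.
dyn_ttl() {
    sed -n 's/.*(\([0-9][0-9]*\)s).*/\1/p'
}

# Hypothetical helper: poll once per second until interrupted.
# Needs root and a loaded ipfw module; it is not invoked automatically.
watch_dyn() {
    while sleep 1; do
        ipfw -Da list | dyn_ttl
    done
}
```

Calling watch_dyn in terminal 1 while the nc session sits idle in terminal 2 should show the counter running down from 300 and, if the keep-alive works, jumping back up before it reaches zero.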