Hi Chris,
On Tue, Jul 29, 2025 at 02:41:53PM -0400, Chris Rapier wrote:
> I know that this is likely a very niche question but I was hoping to
> understand things a little better.
>
> Background:
> I'm implementing RFC 8305 (Happy Eyeballs) for SSH. When connecting to dual
> stack targets it will start a race between an IPv6 connection and an IPv4
> connection and use whichever connects first. To test this I set up tc
> qdisc filters to impose a 600ms delay on IPv6 connections to my target. The
> assumption was that the excessive delay would favor IPv4 connections.
FYI - some folks are working on Happy Eyeballs v3 - might be worth some
of your time to take a look at that approach as well:
https://www.ietf.org/archive/id/draft-ietf-happy-happyeyeballs-v3-01.html
> Problem:
> In my tests the IPv6 connection *always* ended up being the connection used
> even though the RTT was 600ms higher than the IPv4 connection. I then
> noticed the same issue when I was using an OpenSSH client under the same
> circumstances. If I used "ssh -4 target.host" I would still see a 600ms
> delay on the path even though a "ping -4 target.host" would return with a
> 2ms RTT. The interactive and bulk data sessions over SSH would always end up
> seeing that excessive delay. The only situation in which that was true was
> the ssh package under Ubuntu.
>
> After a bunch of testing I found out that Ubuntu reverts the IPQoS default
> changes made in commit 5ee8448a. I absolutely understand why these changes
> were made to IPQoS and I have a way to resolve the issue in my code. The
> problem is that I don't understand why I'm seeing the behaviour that I am.
> Why does setting IPQoS to lowdelay work in my, admittedly unique, situation
> while using the default of AF21 seems to produce this excessive delay across
> IPv4 connections?
>
> I set up the filter using:
> tc qdisc add dev enp0s5 root handle 1: prio
> tc qdisc add dev enp0s5 parent 1:3 handle 30: netem delay 600ms
> tc filter add dev enp0s5 parent 1: prio 3 u32 match u16 0x6000 0xf000 at 0 flowid 1:3 # Delay all IPv6
Perhaps traffic that doesn't match your filter ends up in the same class
as the delayed traffic?
https://stackoverflow.com/questions/40196730/simulate-network-latency-on-specific-port-using-tc
More examples: https://news.ycombinator.com/item?id=11345570
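
A rough sketch of how that could happen, under some assumptions: the prio
qdisc above was created without a "priomap" argument, so the default map
applies; the socket does not set SO_PRIORITY; and the packet matches no tc
filter. In that case the band is chosen from the IP TOS byte alone. The two
tables below are transcribed from tc-prio(8) and the kernel's ip_tos2prio
table and are worth checking against the running kernel. The function names
(dscp_to_tos, prio_band, class_for_tos) are made up for this illustration.

```python
# Sketch only: which band of a default 3-band prio qdisc an IPv4 packet
# selects when no filter matches and SO_PRIORITY is not set.

# Linux packet priorities (linux/pkt_sched.h)
TC_PRIO_BESTEFFORT = 0
TC_PRIO_BULK = 2
TC_PRIO_INTERACTIVE_BULK = 4
TC_PRIO_INTERACTIVE = 6

# Kernel TOS-to-priority table, indexed by (tos & 0x1e) >> 1
IP_TOS2PRIO = (
    [TC_PRIO_BESTEFFORT] * 4
    + [TC_PRIO_BULK] * 4
    + [TC_PRIO_INTERACTIVE] * 4
    + [TC_PRIO_INTERACTIVE_BULK] * 4
)

# Default priomap from tc-prio(8): index is the priority, value is the band
DEFAULT_PRIOMAP = [1, 2, 2, 2, 1, 2, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]


def dscp_to_tos(dscp):
    """DSCP occupies the upper six bits of the TOS byte."""
    return (dscp << 2) & 0xFF


def prio_band(tos, priomap=DEFAULT_PRIOMAP):
    """Band (0-based) chosen by the priomap for a given TOS byte."""
    priority = IP_TOS2PRIO[(tos & 0x1E) >> 1]
    return priomap[priority & 0x0F]


def class_for_tos(tos, handle="1:"):
    """Band N of a prio qdisc is class <handle>N+1, e.g. band 2 is 1:3."""
    return f"{handle}{prio_band(tos) + 1}"


if __name__ == "__main__":
    cases = {
        "AF21 (DSCP 18)": dscp_to_tos(18),
        "lowdelay (0x10)": 0x10,
        "unmarked (0x00)": 0x00,
    }
    for name, tos in cases.items():
        print(f"{name}: tos=0x{tos:02x} -> class {class_for_tos(tos)}")
```

If those tables match your kernel, AF21 traffic is treated as bulk and lands
in band 2, which is class 1:3, the one with netem attached. lowdelay lands in
1:1 and unmarked traffic such as a plain ping lands in 1:2. One way to check
would be to pass an explicit priomap that never selects band 2 and see
whether the delay on "ssh -4" goes away.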
> Maybe my test environment is faulty?
Unsure, possibly.
https://lpc.events/event/11/contributions/943/attachments/901/1780/inet_tos_lpc2021.pdf
In relation to commit 5ee8448a, I was disappointed to discover that
the Linux ecosystem appears to have undone efforts to improve the
interactive experience rather than fixing the root cause.
Kind regards,
Job