Maybe one of the TCP options is interfering with the out-of-order
reception the receiving end experiences.
Try disabling every option you can and repeat the test. Research why
each option is there and what it does. Some options target the other
end of the performance spectrum, window scaling for example, so they
won't provide any assistance in your situation.
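For reference, the knobs I would start with on a 2.6-era kernel look
something like this (a sketch; the names are as they appear under
/proc/sys/net/ipv4/ on my boxes, so check yours match):

  # show the current settings
  sysctl net.ipv4.tcp_window_scaling
  sysctl net.ipv4.tcp_timestamps
  sysctl net.ipv4.tcp_sack
  sysctl net.ipv4.tcp_dsack
  # then flip one at a time and re-run the test, e.g.
  sysctl -w net.ipv4.tcp_timestamps=0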
My guess would be that SACK (selective acknowledgement) is causing the
receiving end to signal the sending end to retransmit the (apparently)
lost packets it sees, when in reality those packets are delayed, not
lost, and it just doesn't know that yet. So disable SACK; on Linux try
"echo 0 > /proc/sys/net/ipv4/tcp_sack". Try this at both ends (but
maybe only your bottlenecked / teql end needs it done).
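Something like this on both hosts, assuming you have root (tcp_dsack
piggybacks on SACK, so I would clear it at the same time):

  echo 0 > /proc/sys/net/ipv4/tcp_sack
  echo 0 > /proc/sys/net/ipv4/tcp_dsack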
There is also a delayed-ACK mechanism that tries to reduce the number
of ACKs flowing the other way and also adds some additional wait to the
reception of marginally delayed data packets so they can be coalesced
before the ACK is sent back. Maybe that amount of time can be increased
to help the coalescing (providing it is kept within some small
percentage of the overall route RTT).
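I am not aware of a /proc knob for the delayed-ACK timer on a stock
kernel, but you can at least watch how many delayed ACKs and
retransmits a run generates (the counters are cumulative, so snapshot
them before and after the transfer):

  netstat -s | grep -i -E 'delayed|retrans|sack'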
If the sending end receives multiple ACK packets with the same sequence
number, it starts to conclude that the data just beyond that
acknowledged point has gone missing; after 3 in a row, sending starts
to slow down and the sender spits out another retransmission of what it
believes to be the lost packet. This is how it worked before SACK
became the default anyway; it is TCP's fast-retransmit mechanism.
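On Linux that 3-dup-ACK threshold is exposed as tcp_reordering (default
3). If your real problem is reordering between the two paths rather
than loss, bumping it up is worth a try; pick the value to taste:

  echo 10 > /proc/sys/net/ipv4/tcp_reordering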
What % of the round-trip time does the delay constitute? You talk of
1ms and 5ms deviation; if the RTT is Ethernet-like then 5ms is a long
time. All TCP timings are dynamic around what the sending side computes
the RTT to be, since the goal of sending bulk TCP data is to fill the
virtual pipeline between sender and receiver, but to do so in a way
that is co-operative with other users. Lost or delayed packets are the
principal indicator that the route is congested, and therefore the
sending side backs off. If your best RTT is 7ms and your worst 12ms,
you can't expect a few simple options to make much difference. However,
if the overall RTT is in the order of 70+ms, there may be plenty of
room to see some improvement with configuration changes.
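A quick way to put numbers on that is to ping the far end over each
path from the sending host (the addresses here are just placeholders):

  # 5ms of skew on a ~10ms RTT is ~50% of the path delay;
  # on a 70ms RTT it is under 10%.
  ping -c 20 192.0.2.1    # far end via path 1
  ping -c 20 192.0.2.2    # far end via path 2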
Can you improve the load balancing at the congested sending end? For
example, have you made sure there is only a single-packet transmit
queue at the interfaces? "ifconfig ppp0 txqueuelen 1" or some other low
number like 2 or 3. The default looks to be 64 these days, which is too
much if your teql interface also has a queue and the ppp0 interface
asks teql for another packet every time it has space for one.
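Something along these lines, assuming ppp0 and ppp1 are the two slave
links under your teql0 device:

  ifconfig ppp0 txqueuelen 1
  ifconfig ppp1 txqueuelen 1
  # or, with iproute2:
  ip link set ppp0 txqueuelen 1
  ip link set ppp1 txqueuelen 1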
Just some pointers for you.
Darryl
Li, Ji wrote:
> I am measuring the performance of one TCP connection over two
> symmetric paths. Packets are sent to the two paths alternately. I found
> that when the latency of each path is within 1ms, the overall TCP
> throughput is the *sum* of the throughput of the two paths. However,
> when the latency of the two paths increases to 5ms, the overall TCP
> throughput drops to the throughput of a *single* path. Has anyone
> studied a similar problem? What makes the performance go down?
>
> I use Fedora Core 3 and 4, teql and netem for my emulation.
>