I have a router with 3 network interfaces, as in the ASCII diagram below.
All interfaces are 100mbit. There is TCP traffic being sent from net1 to
net3 and from net2 to net3, and the TCP connections consume as much
bandwidth as possible. There is a pfifo queue on the egress interface eth0
of the core router with a limit of 10 packets.

    net1 --> (eth1) router (eth0) -> net3
                    (eth2)
                      ^
                      |
                    net2

I police traffic on the edge of net1 to 48.4375 Mbit and shape the traffic
on exit of net2 to 48.4375 Mbit. There are no packets in the queue of the
egress interface eth0 of the router at any stage (every packet is enqueued
by pfifo_enqueue() to an empty queue - I have confirmed this by adding a
counter in sch_fifo.c that is incremented every time there is a packet in
the queue when a new packet is enqueued). The delay is at a maximum of 2ms.

When I increase the policing and shaping rates to 48.4687 Mbit - an
increase of only 31.2 kbit on each rate, which is very small - some packets
are queued for a short period and some are dropped, which clears the queue.
The maximum number of packets dropped was 20 per second. But the delay goes
up to 30ms.

Check out the graphs at
http://frink.nuigalway.ie/~jlynch/queue/

I can't seem to explain this. Even if the queue were full all the time and
each packet were of maximum size, the delay imposed by queueing should be
at most 10 * 1500 * 8 / 100,000,000 seconds, which equals 1.2ms.

How can so much delay be added by such a small increase in the throughput
coming from net1 and net2?

I would appreciate it if someone could explain it to me.

By the way, I'm using a stratum 1 NTP server on the same LAN to ensure
measurement accuracy.

Jonathan
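P.S. For anyone who wants to reproduce this, the setup is roughly along the
lines of the commands below. This is only a sketch - the interface names on
the edge boxes and the burst/latency values are placeholders rather than my
exact scripts.

    # on the core router: pfifo with a 10 packet limit on the egress interface
    tc qdisc add dev eth0 root pfifo limit 10

    # on the net1 edge box: ingress policing to 48.4375 Mbit
    # (interface name assumed)
    tc qdisc add dev eth1 handle ffff: ingress
    tc filter add dev eth1 parent ffff: protocol ip prio 1 u32 \
        match ip src 0.0.0.0/0 \
        police rate 48.4375mbit burst 20k drop flowid :1

    # on the net2 edge box: egress shaping to 48.4375 Mbit, tbf used as an
    # example (interface name assumed)
    tc qdisc add dev eth0 root tbf rate 48.4375mbit burst 20k latency 10ms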
Jonathan Lynch
2005-Oct-13 16:06 UTC
Re: The effects of queueing on delay...(TX Ring Buffer the problem)
This was down to the tx buffer size on the network card I was using. It was
an Intel 82547EI gigabit card using the e1000 driver and operating at
100mbit. The tx buffer was set to 256, which caused this huge delay. The
minimum the driver lets me reduce the tx buffer size to using ethtool is
80. By reducing the tx ring buffer to 80, the delay under full link
utilisation with a maximum queue of 10 packets was reduced from 30ms to
10ms.

The 3com 3c59x vortex driver uses a tx buffer of 16. I reduced the tx ring
to 16 on the e1000 driver, but the maximum throughput I could achieve on
the interface went down.

Has anyone experimented with reducing the size of the tx buffer on this
card to get a good balance between delay and throughput?

Jonathan
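P.S. The ring size was changed with ethtool, i.e. something like the
following (the interface name is assumed to be eth0 here):

    # show the current RX/TX ring sizes reported by the driver
    ethtool -g eth0

    # shrink the TX ring to 80 descriptors
    ethtool -G eth0 tx 80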
Andy Furniss
2005-Oct-17 19:20 UTC
Re: The effects of queueing on delay...(TX Ring Buffer the problem)
Jonathan Lynch wrote:
> This was down to the tx buffer size on the network card I was using. It
> was an Intel 82547EI gigabit card using the e1000 driver and operating
> at 100mbit. The tx buffer was set to 256, which caused this huge delay.
> [snip]
> Has anyone experimented with reducing the size of the tx buffer on this
> card to get a good balance between delay and throughput?

Strange - I thought that as long as you are under rate for the link, the
most htb should burst per tick is the specified burst size.

That assumes one bulk class - more will make it worse.

Andy.
Jonathan Lynch
2005-Nov-19 23:53 UTC
Re: The effects of queueing on delay...(TX Ring Buffer the problem)
Quoting Andy Furniss <andy.furniss@dsl.pipex.com>:
> [snip]
> Strange - I thought that as long as you are under rate for the link, the
> most htb should burst per tick is the specified burst size.
>
> That assumes one bulk class - more will make it worse.

Just noticed your reply there - I have been very busy lately and haven't
checked LARTC in a while.

Say for example with a htb qdisc configured with a ceil of 100 Mbit
(overhead 24 mpu 84 mtu 1600 burst 12k cburst 12k quantum 1500), or with a
queue discipline that doesn't rate limit such as prio or red: there was a
delay of 30 ms imposed when the outgoing interface was saturated and the
tx ring size was 256. When the tx ring size was reduced to 80, the delay
was around 9ms.

The tx ring is a fifo structure. The NIC driver uses DMA to transmit
packets from the tx ring. These are worst case delays, when the tx ring is
full of maximum size FTP packets with the VoIP packet at the end. The VoIP
packet has to wait for all the FTP packets to be transmitted.

When the rate was reduced to 99Mbit, the maximum delay imposed is about
2ms. It seems that with the reduced rate there is time to clear more
packets from the TX ring... there are fewer packets in the ring, resulting
in a lower delay. But the delay increases linearly.

Also a question: when defining the following parameters (overhead 24 mpu
84 mtu 1600 burst 12k cburst 12k quantum 1500), I have them defined on all
classes and on the htb qdisc itself. Is there a minimum place where they
can be specified... ie just on the htb qdisc itself, or do they have to be
specified on all classes?

Jonathan
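P.S. At the moment the parameters are repeated on every class, roughly
like this (a sketch with an assumed device name and a simplified class
layout, not my exact script):

    tc qdisc add dev eth0 root handle 1: htb default 10
    tc class add dev eth0 parent 1: classid 1:1 htb rate 100mbit ceil 100mbit \
        overhead 24 mpu 84 mtu 1600 burst 12k cburst 12k quantum 1500
    tc class add dev eth0 parent 1:1 classid 1:10 htb rate 100mbit ceil 100mbit \
        overhead 24 mpu 84 mtu 1600 burst 12k cburst 12k quantum 1500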
Andy Furniss
2005-Dec-06 01:55 UTC
Re: The effects of queueing on delay...(TX Ring Buffer the problem)
Jonathan Lynch wrote:
> [snip]
> Say for example with a htb qdisc configured with a ceil of 100 Mbit
> (overhead 24 mpu 84 mtu 1600 burst 12k cburst 12k quantum 1500), or with
> a queue discipline that doesn't rate limit such as prio or red: there
> was a delay of 30 ms imposed when the outgoing interface was saturated
> and the tx ring size was 256. When the tx ring size was reduced to 80,
> the delay was around 9ms.

Ahh I see what you mean - reducing the buffer beyond htb. But I don't
really see why you need to, rather than reducing the htb rate so you only
have one htb burst of packets in it at a time (that assumes you only have
two classes like in your other tests - more bulk classes would be worse).

> The tx ring is a fifo structure. The NIC driver uses DMA to transmit
> packets from the tx ring. These are worst case delays, when the tx ring
> is full of maximum size FTP packets with the VoIP packet at the end.
> The VoIP packet has to wait for all the FTP packets to be transmitted.
>
> When the rate was reduced to 99Mbit, the maximum delay imposed is about
> 2ms. It seems that with the reduced rate there is time to clear more
> packets from the TX ring... there are fewer packets in the ring,
> resulting in a lower delay. But the delay increases linearly.

I agree from our previous discussion and tests that even with overheads
added you need to back off a bit more than expected, but I assumed this
was to do with either:

The 8/16 byte granularity of the lookup tables.
The fact that overhead was not designed into htb, but added later.
Timers maybe being a bit out.
Me not knowing the overhead of ethernet properly.

Making the tx buffer smaller is just a workaround for htb being over rate
for whatever reason.

> Also a question: when defining the following parameters (overhead 24
> mpu 84 mtu 1600 burst 12k cburst 12k quantum 1500)

I suppose quantum should be 1514 - as you pointed out to me previously,
since it's eth - maybe more if you are still testing on vlans. I don't
think it will make any difference in this test - I see the overhead allows
for it already (I am just stating it for the list, as the list was down
during our previous discussions about overheads).

mtu 1600 - it's a bit late to check now, but there is a size that makes
the table go from 8 to 16, and 1600 seems familiar - but 2048 comes to
mind as well :-) I would check with tc -s -d class ls and reduce it a bit
if you see /16 after burst.

> I have them defined on all classes and on the htb qdisc itself. Is
> there a minimum place where they can be specified... ie just on the htb
> qdisc itself, or do they have to be specified on all classes?

I would think so, htb has a lookup table for each rate and ceil.

Andy.
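P.S. To check, something like this (device name assumed):

    tc -s -d class show dev eth0

Look at the burst value in the output, e.g. something like
"burst 12Kb/8 mpu 84b overhead 24b" - the /8 or /16 printed after burst is
the rate table cell size, and if it shows /16 I would try a smaller mtu.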