> ... What if SFQ were to start with a minimal number of buckets, and
> track how "deep" each bucket was, then go to a larger number of bits
> (2/4 at a time?) if the buckets hit a certain depth? Theoretically,
> this would mean that "fairness" would be achieved more often in current
> collision situations but that a smaller number of buckets would be
> necessary to achieve fairness in currently low-collision situations.
>
> I haven't looked at the SFQ code in a while, so I don't know how much
> benefit this would be in terms of processing time, or even how expensive
> it would be to change hash sizes on the fly, but at a certain level of
> resolution (+/- 2-4 bits), the changes wouldn't be terribly frequent
> anyway.

A few reactions:

- The only runtime cost of lots of buckets is a small amount of storage
  for each bucket. Allocating buckets at runtime also introduces the
  problem that you could run out of space.

- There's no advantage to having many more buckets than the number of
  packets you're willing to queue, which is typically only on the order
  of a few hundred.

==== extensions

> And all the discussions tend to lead to the conclusion that there should
> be an sfq option (when the queue is created) for:
> a) how big the hash is
> b) whether to take into account source ports or not
> c) whether to take into account destination ports or not
> d) etc. :)
>
> Maybe someone who's written a qdisc would feel up to this?

I've been hoping to get to it, since I have other stuff I'd like to
incorporate into a new sfq version.

> From: Alexander Atanasov <alex@ssi.bg>
> I've done some work in this direction; it probably needs more work, and
> it's poorly tested - expect b00ms ;)
>
> This adds a new qdisc for now - esfq, which is a 100% clone of the
> original sfq:
> - You can set all sfq parameters: hash table size, queue depths,
>   queue limits.
> - You can choose from 3 hash types: original (classic), dst ip, src ip.
> Things to consider: perturbation with dst and src hashes is not
> good IMHO; you can try with perturb 0 if it causes trouble.
>
> Please see the attached files.
>
> Playing with it gives interesting results:
> higher depth -> makes flows equal slower
> small depth -> makes flows equal faster
> limit kills big delays when set at about 75-85% of depth.

I don't understand what these last three lines mean. Could you explain?

> Needs testing and measurements - that's why I made it a
> separate qdisc and not a patch over sfq; I wanted to compare both.
>
> Any feedback, good or bad, is welcome.

I'll send you my current module, also a variant of SFQ. It contains
doc that I think is worth including, also changes some of the code to
be more understandable, separates the number of packets allowed in the
queue from the number of buckets, supports the time limit (discussed
in earlier messages), controls these things via /proc, and maybe a few
other things I'm forgetting. This version does not support hashing on
different properties of the packet, because it uses a totally different
criterion for identifying "subclasses" of traffic. You can discard
that and restore the sfq hash with your modifications. I think (hope)
these changes are pretty much independent.
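[Editor's aside: a rough sketch of how the three esfq hash types differ. This is my own illustration with a toy hash, not code from the esfq patch or the kernel - only the choice of fields fed into the hash is the point.]

```python
def esfq_bucket(src_ip, dst_ip, sport, dport, proto,
                mode, perturb=0, divisor=1024):
    # Illustrative only: field selection per hash type.  The real kernel
    # hash function and field layout differ; divisor is the bucket count
    # (stock SFQ uses 1024 buckets).
    if mode == "classic":            # original SFQ: the whole conversation
        key = src_ip ^ dst_ip ^ proto ^ (sport << 16) ^ dport
    elif mode == "src":              # per-source-IP fairness
        key = src_ip
    else:                            # "dst": per-destination-IP fairness
        key = dst_ip
    return (key ^ perturb) % divisor

# Two sessions to the same destination share one bucket under "dst",
# but normally land in different buckets under "classic".
a = esfq_bucket(0x0A000001, 0x0A000002, 1000, 80, 6, "dst")
b = esfq_bucket(0x0A000003, 0x0A000002, 2000, 80, 6, "dst")
```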
On Wed, 5 Jun 2002, Don Cohen wrote:

> ==== extensions
> > And all the discussions tend to lead to the conclusion that there should
> > be an sfq option (when the queue is created) for:
> > a) how big the hash is
> > b) whether to take into account source ports or not
> > c) whether to take into account destination ports or not
> > d) etc. :)
> >
> > Maybe someone who's written a qdisc would feel up to this?
>
> I've been hoping to get to it, since I have other stuff I'd like to
> incorporate into a new sfq version.

At first look - I think I'll have to incorporate my changes into your
work. I've not done much, just added hashes and unlocked what Alexey
Kuznetsov did.

> > Playing with it gives interesting results:
> > higher depth -> makes flows equal slower
> > small depth -> makes flows equal faster
> > limit kills big delays when set at about 75-85% of depth.
>
> I don't understand what these last three lines mean. Could you
> explain?

Depth is how many packets can be queued on a row of the hash table.
If you have large queues (higher depth), sfq reacts slower when a new
flow appears (it has to do more work to make queue lengths equal).
When you have short queues it reacts faster, so adjusting depth to
your bandwidth and traffic type can make it do better work.

I set a bounded cbq class to 320kbit and esfq with dst hash:
Start an upload - it gets 40KB.
Start a second one - it should get 20KB asap to be fair.
With depth 128 it would take, let's say, 6 sec. to make both 20KB; with
depth 64, about 3 sec - packets are dropped earlier with a shorter
queue. (I have to make some exact measurements, since this is just an
example and may not be correct.)

Limit sets a threshold on queued packets - if a packet exceeds it, it's
dropped, so delay is smaller; but when sfq tries to make flows equal it
counts depth, not limit. With the above example, depth 128 and limit
100: when the first upload enqueues 100 packets, sfq starts to drop,
but the goal to make flows equal is 64 packets in the queue. The flow
doesn't get the 28 packets which are to be enqueued; they are delayed
for a long time and probably dropped when received. It's similar to
RED - you have a queue which you try to keep at about X percent full
but which can go over it; limit keeps it at most X percent full. What
about making it like RED - the more packets go over limit, the more are
dropped early -> limit goes down; fewer are dropped -> limit goes up?

> > Needs testing and measurements - that's why I made it a
> > separate qdisc and not a patch over sfq; I wanted to compare both.
> >
> > Any feedback, good or bad, is welcome.
>
> I'll send you my current module, also a variant of SFQ. It contains
> doc that I think is worth including, also changes some of the code to
> be more understandable, separates the number of packets allowed in the
> queue from the number of buckets, supports the time limit (discussed
> in earlier messages), controls these things via /proc, maybe a few
> other things I'm forgetting.

I've taken a quick look over it and I like the ideas there; give me
some time to get the details. I hope we can make something good of
both.

-- 
have fun,
alex
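[Editor's aside: the RED-like adaptation of `limit` suggested above could look something like this. A sketch of the idea only - the function name, constants, and floor/ceiling choices are mine; nothing like this exists in sfq or the esfq patch.]

```python
def adapt_limit(limit, overlimit_drops, depth):
    # Sketch of the suggestion above: many early drops -> lower the limit;
    # a quiet interval -> raise it back toward ~85% of depth (the range
    # the mail reports as working well).
    floor, ceiling = depth // 2, depth * 85 // 100
    if overlimit_drops > 0:
        return max(floor, limit - overlimit_drops)
    return min(ceiling, limit + 1)
```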
Alexander Atanasov writes:

> At first look - I think I'll have to incorporate my changes into
> your work. I've not done much, just added hashes and unlocked what
> Alexey Kuznetsov did.

Not quite that simple. You have to throw out about half of my file,
mostly the last third or so, which replaces the hash function, plus
most of the /proc stuff, and probably a lot of other little pieces
scattered here and there. I now recall a few other things - I support
configuration of service weights for different subqueues, which makes
no sense for sfq; I also record the amount of service (bytes and
packets) per subqueue and report these to the tc -s -d stuff, which
also makes no sense for sfq. After removing all that stuff you then
have to restore the hash.

> > > Playing with it gives interesting results:
> > > higher depth -> makes flows equal slower
> > > small depth -> makes flows equal faster
> > > limit kills big delays when set at about 75-85% of depth.
> >
> > I don't understand what these last three lines mean. Could you
> > explain?
>
> Depth is how many packets can be queued on a row of the hash table.
> If you have large queues (higher depth), sfq reacts slower when a new
> flow appears (it has to do more work to make queue lengths equal).
> When you have short queues it reacts faster, so adjusting depth to
> your bandwidth and traffic type can make it do better work.
> I set a bounded cbq class to 320kbit and esfq with dst hash:
> Start an upload - it gets 40KB.
> Start a second one - it should get 20KB asap to be fair.
> With depth 128 it would take, let's say, 6 sec. to make both 20KB;
> with depth 64, about 3 sec - drop packets early with a shorter queue.
> (I have to make some exact measurements, since this is just an
> example and may not be correct.)

I don't see why that should be the case. And I don't recall ever
observing it. This adaptation time should be practically zero.
There's no work in making the queues equal.
(Let's use the word "queue" to mean the whole SFQ and "subqueue" for
the part sharing a hash index.)
If you have, say, 100 packets in one subqueue and 10 in another,
they're already sharing the bandwidth 50-50.

> Limit sets a threshold on queued packets - if a packet exceeds it,
> it's dropped, so delay is smaller; but when sfq tries to make flows
> equal it counts depth, not limit. With the above example, depth 128
> and limit 100: when the first upload enqueues 100 packets, sfq starts
> to drop, but the goal to make flows equal is 64 packets in the queue.
> The flow doesn't get the 28 packets which are to be enqueued; they
> are delayed for a long time and probably dropped when received.

I disagree that the goal is to make the subqueues the same length.
The goal is to serve them with the same bandwidth (as long as they
don't become empty). Queue length depends on how many packets each
download is willing to send without an ack. If one is willing to send
100 and the other is willing to send 10, then the subqueues will
likely be length 100 and 10, but each will still get the same
bandwidth. Without window scaling the max window size is 64K, which
is only about 45 packets, so it's not really normal to have 100
packets in a subqueue.
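[Editor's aside: Don's point is easy to check with a toy model - my own sketch, not the kernel code. Round-robin dequeue gives equal service for as long as both subqueues stay backlogged, regardless of how unequal their lengths are.]

```python
from collections import deque

q1 = deque([1000] * 100)   # one flow has 100 queued packets of 1000 bytes
q2 = deque([1000] * 10)    # the other has only 10
sent = [0, 0]
rounds = 0
while q1 and q2:           # round robin: one packet from each per round
    sent[0] += q1.popleft()
    sent[1] += q2.popleft()
    rounds += 1
# While both subqueues are backlogged, service is exactly 50-50 even
# though the queue lengths never come close to equal.
```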
Hi all,

Is there any implemented qdisc that combines the benefits of an SFQ
with those of a RED queue? Something like FRED, CBT or DWRED?

thx,
Jan
Add RED on IMQ (see http://luxik.cdi.cz/~patrick/imq/) and SFQ on the
network device, or vice versa. Send all incoming packets to imq:

modprobe imq numdevs=1
iptables -t mangle -A PREROUTING -i $IN_DEV -j IMQ --todev 0
ip link set imq0 up

06.06.2002 14:21:55, "Jan Coppens" <Jan.Coppens@intec.rug.ac.be> wrote:

>Hi all,
>
>Is there any implemented qdisc that combines the benefits of a SFQ with
>those of a RED queue. Something like a FRED, CBT or DWRED?
>
>thx,
>Jan

-----------------------------------
mailto:alexey_talikov@texlab.com.uz
BR
Alexey Talikov
FORTEK
-----------------------------------
> I disagree that the goal is to make the subqueues the same length.
> The goal is to serve them with the same bandwidth (as long as they
> don't become empty.)

Not that you need backing up, but I agree with you. SFQ is there to
provide near-fair queuing on a per-session basis. As modified, it
could also be used to provide near-fair queuing on a per-IP basis
instead. Having sub-queues of different depths simply indicates why
fairness is needed (one sub-queue would otherwise have dominated the
available bandwidth). A very long sub-queue, however, indicates that
perhaps fairness is not being achieved (although that's why it's
referred to as stochastic). This is why I suggested at least being
able to tune the size of the hash / number of hash buckets, so as to
redistribute the streams through more sub-queues.

Dealing with bursts properly at Linux timer resolutions is another
issue :-).

-- 
Michael T. Babcock
CTO, FibreSpeed Ltd.
On Wed, 5 Jun 2002, Don Cohen wrote:

> Alexander Atanasov writes:
> > Depth is how many packets can be queued on a row of the hash table.
> > If you have large queues (higher depth), sfq reacts slower when a
> > new flow appears (it has to do more work to make queue lengths
> > equal). When you have short queues it reacts faster, so adjusting
> > depth to your bandwidth and traffic type can make it do better work.
> > [...]
>
> I don't see why that should be the case. And I don't recall ever
> observing it. This adaptation time should be practically zero.
> There's no work in making the queues equal.

When queues get full, sfq has to drop packets, and it must send less
(delay) or drop on the faster flows - which get bigger queues - to
slow them down. That's the work I mean. And it really depends on the
subflow depth.

> (Let's use the word queue to mean the whole SFQ and subqueue for the
> part sharing a hash index.)
> If you have, say, 100 packets in one subqueue and 10 in another,
> they're already sharing the bandwidth 50-50.

No, that means that the flow with 100 packets is using about 10 times
more bandwidth. To be fair, both must have length around 50.

> > Limit sets a threshold on queued packets - if a packet exceeds it,
> > it's dropped, so delay is smaller; but when sfq tries to make flows
> > equal it counts depth, not limit. With the above example, depth 128
> > and limit 100: when the first upload enqueues 100 packets, sfq
> > starts to drop, but the goal to make flows equal is 64 packets in
> > the queue. The flow doesn't get the 28 packets which are to be
> > enqueued; they are delayed for a long time and probably dropped
> > when received.
>
> I disagree that the goal is to make the subqueues the same length.
> The goal is to serve them with the same bandwidth (as long as they
> don't become empty.)

I agree that the goal is to serve the same bandwidth, but bandwidth is
served with packets which carry data of X bytes up to the MTU, so I
consider packets the base unit. (Look at the example as both ftp flows
sending the same-sized packets with the same data at the same rate.)
You have:

enqueue -> put in the subflow - len++
dequeue -> get from the subflow - len--

So when dequeues occur slower than enqueues, the queues fill up -
packets are delayed and at some point get dropped. So the problem is,
when your queues get full with packets, how to dequeue so as to be
fair to the flows which are identified by the hash. SFQ tries to make
the flows' queues equal to do this.

> Queue length depends on how many packets each download is willing to
> send without an ack. If one is willing to send 100 and the other is
> willing to send 10, then the subqueues will likely be length 100 and
> 10, but each will still get the same bandwidth.

That's true if you can send 110 packets and not exceed your bandwidth;
trouble comes when you can send 100 - how do you deal with the 10 that
exceed it?

> Without window scaling the max window size is 64K, which is only
> about 45 packets, so it's not really normal to have 100 packets in a
> subqueue.

It's normal if you have congested links; it's not good in any way :(
But that's why qos is there - to avoid and handle this. You have 45
packets per flow, but with 3 hash collisions you get above the 128
packet limit and drop. esfq solves this: you can tune the number of
subflows and their lengths.

-- 
have fun,
alex
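[Editor's aside: a quick birthday-problem estimate - my own illustration, not from the thread - shows why hash collisions are a real concern even with SFQ's 1024 buckets, and hence why perturbation exists.]

```python
def p_any_collision(flows, buckets=1024):
    # Probability that at least two of `flows` concurrent flows share a
    # bucket, assuming a uniform hash (classic birthday calculation).
    p_distinct = 1.0
    for i in range(flows):
        p_distinct *= (buckets - i) / buckets
    return 1.0 - p_distinct

# With 1024 buckets, roughly 38 concurrent flows already give about
# even odds that some pair of flows collides.
```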
Alexander Atanasov writes:

> > I don't see why that should be the case. And I don't recall ever
> > observing it. This adaptation time should be practically zero.
> > There's no work in making the queues equal.
>
> When queues get full, sfq has to drop packets, and it must send less
> (delay) or drop on the faster flows - which get bigger queues - to
> slow them down. That's the work I mean. And it really depends on
> the subflow depth.

If the queue is "full" (currently measured in number of packets
allowed) then a packet is dropped from the longest subqueue. That
does not necessarily equalize the subqueue lengths. You can still
have queues of wildly differing lengths.

> > (Let's use the word queue to mean the whole SFQ and subqueue for
> > the part sharing a hash index.)
> > If you have, say, 100 packets in one subqueue and 10 in another,
> > they're already sharing the bandwidth 50-50.
>
> No, that means that the flow with 100 packets is using about 10
> times more bandwidth. To be fair, both must have length around 50.

Again, no. It's perfectly possible for one queue to always have
length in the range 99-101 and another to always have length 9-11 and
still be serving them both at the same rate. Just imagine that one
client sends 100 packets at once and the other sends 10 at once, and
then they each send one per second and we forward one per second.
Same rate (suppose all packets are the same length), different queue
lengths. In fact SFQ serves them at the same rate no matter how long
they are. Note - serving in this case means sending, not receiving.

> I agree that the goal is to serve the same bandwidth, but

Not only is this the goal, it's what SFQ does.

> bandwidth is served with packets which carry data of X bytes up to
> the MTU, so I consider packets the base unit.

Actually, SFQ serves the same rate in bytes/sec, not in packets/sec.
If one flow uses only 1000-byte packets and another only 100-byte
packets, the second will get 10 times as many packets/sec as the
first. It's only for measuring the length of the queue and subqueues
that SFQ counts packets. It would be more accurate in some sense to
measure this in bytes as well. But if you're using a lot of packets
then you really ought to use large ones.

> (Look at the example as both ftp flows sending the same-sized
> packets with the same data at the same rate.)
> You have:
> enqueue -> put in the subflow - len++
> dequeue -> get from the subflow - len--
>
> So when dequeues occur slower than enqueues, the queues fill up -
> packets are delayed and at some point get dropped. So the problem
> is, when your queues get full with packets, how to dequeue so as to
> be fair to the flows which are identified by the hash. SFQ tries to
> make the flows' queues equal to do this.

No, it doesn't. It does something like round robin on the subqueues,
which serves them all at the same rate. That's independent of how
long the subqueues get. In fact, if you had infinite memory (and
could tolerate infinite delay) you could skip the drop from the tail
of the longest queue. (You'd have to change the current algorithm a
little bit. The stuff that supports the tail drop wouldn't work.)

> > Queue length depends on how many packets each download is willing
> > to send without an ack. If one is willing to send 100 and the
> > other is willing to send 10, then the subqueues will likely be
> > length 100 and 10, but each will still get the same bandwidth.
>
> That's true if you can send 110 packets and not exceed your
> bandwidth; trouble comes when you can send 100 - how do you deal
> with the 10 that exceed it?

SFQ is not trying to control incoming bandwidth, just outgoing. The
one sending 100 packets above is some other machine, X, sending to
the machine with SFQ, Y. For Y these packets are incoming. We don't
control that. We only control the rate at which Y forwards them.
So X sends 100 packets. Y gets 100 packets. If they all go into our
SFQ and that is limited to 10 packets/sec, then those packets are
going to be in the queue for a while. Meanwhile, Z sends 10 packets
to Y. Now we have two subqueues. In the next second, each will get
to send 5 packets. So now we have queues of length 95 and 5.

> > Without window scaling the max window size is 64K, which is only
> > about 45 packets, so it's not really normal to have 100 packets
> > in a subqueue.
>
> It's normal if you have congested links; it's not good

I don't think this is really related to congested links.
Well-behaved (as in TCP) traffic should not be creating big backlogs
no matter how fast or slow or congested the links are.

> in any way :( But that's why qos is there - to avoid and handle
> this. You have 45 packets per flow, but with 3 hash collisions you
> get above the 128 packet limit and drop. esfq solves this: you can
> tune the number of subflows and their lengths.

I still don't see what esfq is doing that is supposed to help. More
subqueues does reduce the probability of collisions, but this seems
unlikely to be a problem. Limiting the size of a subqueue below the
size of the entire queue doesn't do any good as far as I can see.
On Thu, 6 Jun 2002, Don Cohen wrote:

> If the queue is "full" (currently measured in number of packets
> allowed) then a packet is dropped from the longest subqueue. That
> does not necessarily equalize the subqueue lengths. You can still
> have queues of wildly differing lengths.

In the long term, always dropping from the largest subqueue gives you
equal subqueues.

> Again, no. It's perfectly possible for one queue to always have
> length in the range 99-101 and another to always have length 9-11
> and still be serving them both at the same rate. [...] In fact SFQ
> serves them at the same rate no matter how long they are. Note -
> serving in this case means sending, not receiving.

Okay, we said that sfq drops from the longest subqueue.
We start with:
q1 - 100
q2 - 10
We enqueue (receive) 2 packets/sec and dequeue (send) 1 packet/sec
(round robin from q1 and q2). q1 will grow to the limit first, so sfq
will start to drop there; it will be the longest queue until q2 gets
equal to q1. In time things should go like this: after 56 seconds (if
the limit is 128 packets) q1 hits the limit and sfq starts to drop.
At that point q2 has just 38 packets, and it grows until it gets equal
to q1; then sfq starts to drop round robin. I agree that the rate is
the same, 0.5 packets/sec. But the initial 100 packets are a burst
which sfq delays (buffers), since it fits fine within the queue limit.
If q1 receives another 100 packets, most of them will get dropped.

> Actually, SFQ serves the same rate in bytes/sec, not in packets/sec.
> If one flow uses only 1000-byte packets and another only 100-byte
> packets, the second will get 10 times as many packets/sec as the
> first. It's only for measuring the length of the queue and
> subqueues that SFQ counts packets. [...]

I've not said that it serves packets/sec. I said that the packets I
speak of are equal in size, so rate = packets/sec = bytes/sec. It
powers a queue to send an allotment of bytes, not packets. But when
it has allotment = 1 and a packet with len = 800, the packet is sent
and the queue is powered again - so it sends more than it should.
Measurement in bytes can make this more incorrect, since you have
enqueued packets of different lengths which you cannot split apart and
dequeue in bytes (sigh, wouldn't it be nice?); counting queues in
packets (exactly what they have) is better. Maybe doing it like cbq,
with an average packet size, can give us better results.

[cut]

> SFQ is not trying to control incoming bandwidth, just outgoing.
> [...] In the next second, each will get to send 5 packets. So now
> we have queues of length 95 and 5.

Sending from the queue, not controlling incoming bandwidth - yes,
look above.

> I don't think this is really related to congested links.
> Well-behaved (as in TCP) traffic should not be creating big backlogs
> no matter how fast or slow or congested the links are.

All this started with the so-called Download Accelerators, which do
parallel gets and make TCP behave badly. In the normal case TCP
adjusts fast and does not create backlogs. But when you have to
change its bandwidth, you have to create a backlog to slow it down.
It then clears fast.

> I still don't see what esfq is doing that is supposed to help. More
> subqueues does reduce the probability of collisions, but this seems
> unlikely to be a problem. Limiting the size of a subqueue below the
> size of the entire queue doesn't do any good as far as I can see.

Collisions are a problem, believe me; that's why there is perturbation
to reduce them. You can arrange link sharing more easily when you
control the number of subqueues. Limiting may become good if made
dynamic. People wanted options to tune hash size/limits/depths, and
the most wanted (needed) was the option to select the hash type.
SRC/DST port hash is in TODO now. Searching gives answers like "Patch
SFQ", but that just didn't work for me - I need to have different sfqs
on different classes/interfaces. So that's why there is an esfq. And
I hope it really helps.

-- 
have fun,
alex
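[Editor's aside: Alexander's arithmetic above can be replayed in a few lines. This is an illustrative model of a per-subqueue depth with tail-drop, not the actual qdisc.]

```python
DEPTH = 128                 # per-subqueue depth, as in esfq
q1, q2, t = 100, 10, 0      # starting backlogs, time in seconds
while q1 < DEPTH:
    # Per 2-second tick: each flow enqueues 2 packets (1 pkt/s), and the
    # round-robin dequeue sends 1 packet from each subqueue, so each
    # backlog grows by 1 packet per tick.
    q1 += 2 - 1
    q2 += 2 - 1
    t += 2
# q1 hits the 128-packet depth after 56 seconds, at which point q2
# holds 38 packets - the figures given in the mail above.
```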
Alexander Atanasov writes:

> > Again, no. It's perfectly possible for one queue to always have
> > length in the range 99-101 and another to always have length 9-11
> > and still be serving them both at the same rate. [...]
>
> Okay, we said that sfq drops from the longest subqueue.
> We start with:
> q1 - 100
> q2 - 10
>
> We enqueue (receive) 2 packets/sec

I meant one from each source.

> and dequeue (send) 1 packet/sec (round robin from q1 and q2).

I meant one from each subqueue. So we're in a steady state. After
one second we have sent one from each subqueue and each has been
replenished.

> q1 will grow to the limit first, so sfq will start to drop there,

The total queue size is not growing, so nothing is being dropped.
Maybe you think it's just incredibly lucky that these sources are
both sending at just the same rate that they're being served, but
that's what tcp is supposed to do.

> Measurement in bytes can make this more incorrect, since you have
> enqueued packets of different lengths which you cannot split apart
> and dequeue in bytes (sigh, wouldn't it be nice?); counting queues
> in packets (exactly what they have) is better. Maybe doing it like
> cbq, with an average packet size, can give us better results.

No - that way the errors accumulate. The current scheme is always
within one packet of the ideal in terms of what's sent. The advantage
of measuring queue length in bytes would be more fairness in dropping,
i.e., you should drop from a queue with 10 packets each of length 1000
bytes instead of a queue with 20 packets each of length 100 bytes.

> People wanted options to tune hash size/limits/depths, and the most
> wanted (needed) was the option to select the hash type. SRC/DST
> port hash is in TODO now. Searching gives answers like "Patch SFQ",
> but that just didn't work for me - I need to have different sfqs on
> different classes/interfaces. So that's why there is an esfq. And
> I hope it really helps.

I think the version I sent does give control over queue size and
number of buckets. It also gives you something called expire, which
I think might do the same sort of thing that you wanted from your
limits. We've discussed it on this list before, so I won't go into
it again.
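[Editor's aside: the byte-fairness point about dropping can be sketched like so - illustration only, names are mine.]

```python
def drop_target(subqueues):
    # Choose the subqueue to tail-drop from by byte backlog rather than
    # packet count; subqueues maps a name to its list of packet sizes.
    return max(subqueues, key=lambda name: sum(subqueues[name]))

qs = {"bulk":   [1000] * 10,   # 10,000 bytes in 10 packets
      "chatty": [100] * 20}    #  2,000 bytes in 20 packets
# Packet count alone would pick "chatty"; byte accounting picks "bulk",
# which is the fairer choice Don describes.
```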
On Thu, 6 Jun 2002, Don Cohen wrote:> Maybe you think it''s just incredibly lucky that these sources are > both sending at just the same rate that they''re being served, but > that''s what tcp is supposed to do.No, i do not think like this. If we think just for TCP connections we may end up with ideas like changig window size. TCP tunes to send this way but it also tries to send more each 5 seconds, if it have data. SFQ classify connections with ports, esfq classify them and just by IP so we can have flows: SRC IP+proto+DST IP+PORT - one flow just DST IP - one flow another we can think of - one flow So i think just about packets of size bytes but without protos which they carry and TCP with its features to tune becomes a kind of exception.> > > Measurement in bytes can make this more incorrect since you have enqueued > > packets with different lenghts which you can not split a part and dequeue > > in bytes (sigh wouldn''t it be nice? ), counting queues in packets (exactly > > what they have) is better. May be doing it like the cbq with average > > packet size can gave us better results. > No, the errors are accumulated. It''s always within one packet of the > ideal in terms of what''s sent. The advantage of measuring queue > length in bytes would be more fairness in dropping, i.e., you should > drop from a queue with 10 packets each of length 1000 bytes instead of > a queue with 20 packets each of length 100 bytes.I didn''t get it ? What do you mean - have different queues agains packets sizes ?> I think the version I sent does give control over queue size and > number of buckets. It also gives you something called expire, which > I think might do the same sort of thing that you wanted from your > limits. We''ve discussed it on this list before so I won''t go into it > again.We need a way to control what happens in a subqueue there are these for now: - Time limits you''ve implemented - Someone asked about (G)RED - devik asked about wrr for connections - others ? 
--
have fun,
alex
> In the long term always dropping from the largest subqueue
> gives you equal subqueues.

And, of course, one could have it drop them using a RED-like algorithm
to make the sessions stabilize themselves better.

> what they have) is better. Maybe doing it like the cbq with average
> packet size can give us better results.

Calculating average packet size on the fly isn't that hard either.

> All started with the so-called Download Accelerators,
> which do parallel gets and make TCP behave badly.
> In the normal case TCP adjusts fast and does not create backlogs.
> But when you have to change its bandwidth, you have to create
> a backlog to slow it down. It then clears fast.

I actually usually download pieces of the same file from multiple mirrors
if it's truly large and I want it fast :-). Ranged downloads make that
quite possible ... The only way to equalize bandwidth "fairly" in these
scenarios still seems to be to implement the hierarchical approach of
hashing against destination IP (the user receiving the packets), then
having sub-queues to those queues based on ports, and dropping packets
(based on RED?) in each of these sub-queues (because they're closer to
the actual TCP sessions involved).

> People wanted options to tune hashsize/limits/depths and
> the most wanted (needed) was the option to select hash type.
> SRC/DST Port hash is in TODO now.

And I'm glad it exists for testing at least.
--
Michael T. Babcock
CTO, FibreSpeed Ltd.
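"Average packet size on the fly" could be as simple as an exponentially weighted moving average, updated once per enqueue. A minimal sketch; the weight and initial value here are arbitrary assumptions, not CBQ's actual constants:

```python
def ewma_update(avg, sample, weight=0.125):
    """One EWMA step: move the average a fraction `weight`
    of the way toward the latest sample."""
    return (1 - weight) * avg + weight * sample

class AvgPktSize:
    """Running average of packet sizes, updated per enqueue.
    The initial guess only matters until a few packets arrive."""
    def __init__(self, initial=1000):
        self.avg = initial

    def observe(self, pkt_len):
        self.avg = ewma_update(self.avg, pkt_len)
        return self.avg
```

This costs one multiply-add per packet, so tracking it per qdisc (or even per subqueue) would be cheap.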
----- Original Message -----
From: "Michael T. Babcock" <mbabcock@fibrespeed.net>
To: <lartc@mailman.ds9a.nl>
Sent: Friday, June 07, 2002 4:18 PM
Subject: RE: [LARTC] SFQ buckets/extensions

> > In the long term always dropping from the largest subqueue
> > gives you equal subqueues.
>
> And, of course, one could have it drop them using a RED-like algorithm
> to make the sessions stabilize themselves better.

If you are dealing with a responsive session (like TCP), that is...

> > what they have) is better. Maybe doing it like the cbq with average
> > packet size can give us better results.
>
> Calculating average packet size on the fly isn't that hard either.
>
> > All started with the so-called Download Accelerators,
> > which do parallel gets and make TCP behave badly.
> > In the normal case TCP adjusts fast and does not create backlogs.
> > But when you have to change its bandwidth, you have to create
> > a backlog to slow it down. It then clears fast.
>
> I actually usually download pieces of the same file from multiple
> mirrors if it's truly large and I want it fast :-). Ranged downloads
> make that quite possible ... The only way to equalize bandwidth
> "fairly" in these scenarios still seems to be to implement the
> hierarchical approach of hashing against destination IP (the user
> receiving the packets) and then having sub-queues to those queues
> based on ports and to drop packets (based on RED?) in each of these
> sub-queues (because they're closer to the actual TCP sessions
> involved).

Why not implement sfb (stochastic fair Blue)? From the papers I read,
the algorithm with the double buffered moving hash extension seemed like
a better alternative to FRED, stochastic fair RED and SFQ.

> > People wanted options to tune hashsize/limits/depths and
> > the most wanted (needed) was the option to select hash type.
> > SRC/DST Port hash is in TODO now.
>
> And I'm glad it exists for testing at least.
> --
> Michael T. Babcock
> CTO, FibreSpeed Ltd.
> _______________________________________________
> LARTC mailing list / LARTC@mailman.ds9a.nl
> http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://lartc.org/
Alexander Atanasov writes:

> SFQ classifies connections by ports; esfq can also classify them by
> just IP, so we can have flows:
>  SRC IP+proto+DST IP+PORT - one flow
>  just DST IP - one flow
>  another we can think of - one flow
> So I think just about packets of some size in bytes, without the
> protocols they carry, and TCP with its features to tune becomes a kind
> of exception.

I don't think this is true. First, of course, almost all packets are
TCP. Second, I'd expect that multiple TCP streams along the same path
tend to equalize, assuming of course that they are not differentiated
along the way. E.g., with just fifo queuing I'd expect that two scp's
along the same path would tend toward equal bandwidth. If this is true
then multiple TCP streams along the same path would tend to act like
one. Perhaps someone else out there can either confirm or debunk that
expectation?

Of course, things get more complicated when the streams have the same
source but different destinations. In that case I'd expect them to again
adjust to the differences in bandwidth to the different destinations. So
maybe sub-subqueues below the source IP subqueues are not so important.

> > No, the errors are accumulated. It's always within one packet of the
> > ideal in terms of what's sent. The advantage of measuring queue
> > length in bytes would be more fairness in dropping, i.e., you should
> > drop from a queue with 10 packets each of length 1000 bytes instead of
> > a queue with 20 packets each of length 100 bytes.
>
> I didn't get it. What do you mean - have different queues against
> packet sizes?

I was suggesting that when you add a packet to a subqueue you not just
record the fact that there are now 16 packets in that subqueue, but 1600
bytes. Then the limit on total queue size is measured in bytes. When you
enqueue, you check to see whether this packet would cause the limit to
be exceeded, and if so you drop from the subqueue with the most bytes.
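The byte-counting suggestion might be modeled like this. A toy sketch only, not SFQ code; the class name and parameters are made up, and note that the `max()` scan for the fattest subqueue is linear in the number of active flows, which is precisely where the constant-time concern comes in:

```python
from collections import deque

class ByteFairQueue:
    """Global queue limit measured in bytes; on overflow, drop from
    the tail of the subqueue currently holding the most bytes."""
    def __init__(self, limit_bytes=16000):
        self.limit = limit_bytes
        self.total = 0
        self.queues = {}   # flow id -> deque of packet lengths
        self.bytes = {}    # flow id -> bytes queued for that flow

    def enqueue(self, flow, pkt_len):
        # Make room: evict from the byte-fattest subqueue until we fit.
        while self.total + pkt_len > self.limit:
            if not self.bytes:
                return False                  # packet bigger than limit
            fat = max(self.bytes, key=self.bytes.get)   # O(flows) scan
            dropped = self.queues[fat].pop()            # tail drop
            self.bytes[fat] -= dropped
            self.total -= dropped
            if self.bytes[fat] == 0:
                del self.bytes[fat], self.queues[fat]
        self.queues.setdefault(flow, deque()).append(pkt_len)
        self.bytes[flow] = self.bytes.get(flow, 0) + pkt_len
        self.total += pkt_len
        return True
```

The fairness gain is visible here: a flow of few large packets can out-weigh a flow of many small ones and be the drop victim, which packet counting would get backwards.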
That's closer to the spirit of SFQ, but it's probably more expensive at
run time. The current implementation has a very fast way to determine
which subqueue has the most packets. The object is to make
enqueue/dequeue small constant-time operations. I don't see (so far) how
to do that with queue lengths measured in bytes.
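The "very fast way" works because packet counts change by exactly one at a time: SFQ keeps subqueues on lists indexed by their depth, plus a running maximum, so the longest subqueue is found and re-filed in O(1). A simplified model (sets standing in for the kernel's doubly linked lists; names are illustrative):

```python
class DepthIndex:
    """Track which subqueue is longest in O(1) amortized time.
    Works because each enqueue/dequeue changes a depth by +/-1."""
    def __init__(self):
        self.by_depth = {}   # depth -> set of flow ids at that depth
        self.depth = {}      # flow id -> current depth
        self.max_depth = 0

    def adjust(self, flow, delta):
        """Move a flow between depth lists after a +/-1 change."""
        old = self.depth.get(flow, 0)
        new = old + delta
        self.by_depth.get(old, set()).discard(flow)
        self.depth[flow] = new
        self.by_depth.setdefault(new, set()).add(flow)
        if new > self.max_depth:
            self.max_depth = new
        # max_depth only ever decays by 1 per dequeue: amortized O(1).
        while self.max_depth > 0 and not self.by_depth.get(self.max_depth):
            self.max_depth -= 1

    def longest(self):
        """Any flow at the current maximum depth."""
        if self.max_depth == 0:
            return None
        return next(iter(self.by_depth[self.max_depth]))
```

The trick relies on depths moving in unit steps; byte counts jump by a whole packet length, which is exactly why the same bookkeeping doesn't carry over directly.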
> No, i do not think like this. If we think just for TCP connections
> we may end up with ideas like changing window size. TCP tunes to send
> this way but it also tries to send more every 5 seconds, if it has data.

Please, not this. :-) Packetizer sw goes this way and it is
 - not allowed by the RFC
 - problematic with some TCP stacks

devik
> that quite possible ... The only way to equalize bandwidth "fairly" in
> these scenarios still seems to be to implement the hierarchical approach
> of hashing against destination IP (the user receiving the packets) and

Exactly. The discussion should be about how to implement it efficiently.
What about having N IP buckets and N IP/PORT buckets? When the IP/PORT
hash resolves to a bucket, we could (re)link the IP/PORT bucket into the
IP bucket's DRR list. On a collision (two different IPs but the same
IP/PORT hash) the whole bucket would be stolen from the original IP and
moved to the new one. It is bad, but it is why we call it Stochastic ;)
It would be fast with bounded memory usage ... Just my opinion ..

devik

> > People wanted options to tune hashsize/limits/depths and
> > the most wanted (needed) was the option to select hash type.
> > SRC/DST Port hash is in TODO now.
>
> And I'm glad it exists for testing at least.
> --
> Michael T. Babcock
> CTO, FibreSpeed Ltd.
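The bucket-stealing scheme above could be sketched like this. Everything here is an illustrative assumption (names, sizes, dropping the stolen bucket's backlog); the point is only to show the relink-on-collision mechanics:

```python
N = 256  # assumed table size for both levels

class FlowBucket:
    """A per-flow (IP/PORT) bucket, linked under one IP bucket."""
    def __init__(self):
        self.owner_ip = None   # which IP bucket currently owns us
        self.pkts = []

ip_lists = {}                  # ip -> list of FlowBucket (its DRR list)
flow_buckets = [FlowBucket() for _ in range(N)]

def enqueue(dst_ip, flow_hash, pkt):
    b = flow_buckets[flow_hash % N]
    if b.owner_ip != dst_ip:
        # Collision: a different IP hashed to this flow bucket.
        # Steal the whole bucket for the new IP (the stochastic part).
        if b.owner_ip is not None:
            ip_lists[b.owner_ip].remove(b)
            b.pkts.clear()     # old flow's backlog is sacrificed here
        b.owner_ip = dst_ip
        ip_lists.setdefault(dst_ip, []).append(b)
    b.pkts.append(pkt)
```

Memory stays bounded at 2N buckets, and both the hash lookup and the relink are O(1) per packet (the `remove` here is O(list) in Python, but the kernel's doubly linked DRR list makes unlinking O(1)).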
> exactly. The discussion should be about how to implement it
> efficiently. What about to have N IP buckets and N IP/PORT
> buckets. When the IP/PORT hash resolves to a bucket then we could
> (re)link the

Consider a new classful queueing discipline "SFC" that behaves exactly
as SFQ does and can have only one sub-queue attached to it.

    ... SFC (8 bit hash against dest ip)
         |
         \-- SFC (11 bit hash against src ip, port, dest ip, port)
              |
              \-- dequeue ...

Is this even possible with how the classes are currently built in the
kernel?
--
Michael T. Babcock
CTO, FibreSpeed Ltd.
It can't be done. The outer SFC could do nothing with the packet. It
could only give it to the inner one....

On Fri, 7 Jun 2002, Michael T. Babcock wrote:

> > exactly. The discussion should be about how to implement it
> > efficiently. What about to have N IP buckets and N IP/PORT
> > buckets. When the IP/PORT hash resolves to a bucket then we could
> > (re)link the
>
> Consider a new classful queueing discipline "SFC" that behaves exactly
> as SFQ does and can have only one sub-queue attached to it.
>
>     ... SFC (8 bit hash against dest ip)
>          |
>          \-- SFC (11 bit hash against src ip, port, dest ip, port)
>               |
>               \-- dequeue ...
>
> Is this even possible with how the classes are currently built in the
> kernel?
> --
> Michael T. Babcock
> CTO, FibreSpeed Ltd.
For the sake of a play-by-play (and why it wouldn't quite work right
initially):

 1) we need to dequeue a packet
 2) we ask the 11-bit lower SFC for a packet
 3) it asks the upper SFC
 4) the upper SFC takes the next bucket, based on IP, and gives us a
    packet
 5) the lower SFC takes that packet and has nothing else to work with,
    so it passes it on

... So you're right :-) The two have to intercommunicate more ... so
they have to be one.

> Hmm, I can't imagine that :) Probably it should be tested
>
> On Fri, 7 Jun 2002, Michael T. Babcock wrote:
> > > It can't be done. The outer SFC could do nothing with the packet.
> > > It could only give it to the inner one....
> >
> > ... Ah, but it can choose which packet to give the inner one. It
> > would reorder the packets into a semi-fair order based on IP.
> >
> > > > ... SFC (8 bit hash against dest ip)
> > > >      |
> > > >      \-- SFC (11 bit hash against src ip, port, dest ip, port)
> > > >           |
> > > >           \-- dequeue ...
On Fri, Jun 07, 2002 at 04:37:07PM +0200, Jan Coppens wrote:

> Why not implement sfb (stochastic fair Blue)? From the papers I read,
> the algorithm with the double buffered moving hash extension seemed
> like a better alternative to FRED, stochastic fair RED and SFQ.

I'd like to toss in a request that anyone who implements this please
make the bloom filter size adjustable and provide a mask to remove
information from the hash (i.e. sport/dport) to allow for per-host
fairness rather than per-flow.
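The masking request amounts to zeroing selected fields of the flow tuple before hashing, so all of a host's flows collapse into the same buckets. A minimal sketch; the mask flags, field names, and SHA-1 hash are illustrative assumptions, not any real SFB implementation:

```python
import hashlib

# Hypothetical mask bits: which fields to erase before hashing.
MASK_SPORT = 1
MASK_DPORT = 2

def masked_flow_hash(src, dst, sport, dport, mask=0, bins=1 << 13):
    """Hash a flow tuple into `bins` buckets, with masked fields
    zeroed so masked-out differences can't separate flows."""
    if mask & MASK_SPORT:
        sport = 0
    if mask & MASK_DPORT:
        dport = 0
    key = f"{src}|{dst}|{sport}|{dport}".encode()
    return int.from_bytes(hashlib.sha1(key).digest()[:4], "big") % bins
```

With both port bits set, every flow between the same host pair hashes identically, giving per-host rather than per-flow accounting in the filter.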