> ... What if SFQ were to start with a minimal number of buckets, and
> track how "deep" each bucket was, then go to a larger number of bits
> (2/4 at a time?) if the buckets hit a certain depth? Theoretically,
> this would mean that "fairness" would be achieved more often in current
> collision situations but that a smaller number of buckets would be
> necessary to achieve fairness in currently low-collision situations.
>
> I haven't looked at the SFQ code in a while, so I don't know how much
> benefit this would be in terms of processing time, or even how expensive
> it would be to change hash sizes on the fly, but at a certain level of
> resolution (+/- 2-4 bits), the changes wouldn't be terribly frequent
> anyway.

A few reactions:

- The only runtime cost of lots of buckets is a small amount of storage
  for each bucket. Allocating buckets at runtime also introduces the
  problem that you could run out of space.

- There's no advantage to having many more buckets than the number of
  packets you're willing to queue, which is typically only on the order
  of a few hundred.

==== extensions

> And all the discussions tend to lead to the conclusion that there should
> be an sfq option (when the queue is created) for:
> a) how big the hash is
> b) whether to take into account source ports or not
> c) whether to take into account destination ports or not
> d) etc. :)
>
> Maybe someone who's written a qdisc would feel up to this?

I've been hoping to get to it, since I have other stuff I'd like to
incorporate into a new sfq version.

> From: Alexander Atanasov <alex@ssi.bg>
> I've done some work in this direction; it probably needs more work, and
> it's poorly tested - expect b00ms ;)
>
> This adds a new qdisc for now - esfq, which is a 100% clone of the
> original sfq:
> - You can set all sfq parameters: hash table size, queue depths,
>   queue limits.
> - You can choose from 3 hash types: original (classic), dst ip, src ip.
> Things to consider: perturbation with dst and src hashes is not
> good IMHO; you can try with perturb 0 if it causes trouble.
>
> Please see the attached files.
>
> Playing with it gives interesting results:
> higher depth -> makes flows equal slower
> small depth -> makes flows equal faster
> limit kills big delays when set at about 75-85% of depth.

I don't understand what these last three lines mean. Could you explain?

> Needs testing and measurements - that's why I made it a
> separate qdisc and not a patch over sfq; I wanted to compare both.
>
> Any feedback, good or bad, is welcome.

I'll send you my current module, also a variant of SFQ. It contains
doc that I think is worth including, also changes some of the code to
be more understandable, separates the number of packets allowed in the
queue from the number of buckets, supports the time limit (discussed
in earlier messages), controls these things via /proc, and maybe a few
other things I'm forgetting. This version does not support hashing on
different properties of the packet, because it uses a totally different
criterion for identifying "subclasses" of traffic. You can discard
that and restore the sfq hash with your modifications. I think (hope)
these changes are pretty much independent.
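[Editor's aside: a rough sketch of how the three esfq hash types differ. This is my own illustration with a toy hash, not code from the esfq patch or the kernel - only the choice of fields fed into the hash is the point.]

```python
def esfq_bucket(src_ip, dst_ip, sport, dport, proto,
                mode, perturb=0, divisor=1024):
    # Illustrative only: field selection per hash type.  The real kernel
    # hash function and field layout differ; divisor is the bucket count
    # (stock SFQ uses 1024 buckets).
    if mode == "classic":            # original SFQ: the whole conversation
        key = src_ip ^ dst_ip ^ proto ^ (sport << 16) ^ dport
    elif mode == "src":              # per-source-IP fairness
        key = src_ip
    else:                            # "dst": per-destination-IP fairness
        key = dst_ip
    return (key ^ perturb) % divisor

# Two sessions to the same destination share one bucket under "dst",
# but normally land in different buckets under "classic".
a = esfq_bucket(0x0A000001, 0x0A000002, 1000, 80, 6, "dst")
b = esfq_bucket(0x0A000003, 0x0A000002, 2000, 80, 6, "dst")
```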
On Wed, 5 Jun 2002, Don Cohen wrote:

> ==== extensions
> > And all the discussions tend to lead to the conclusion that there should
> > be an sfq option (when the queue is created) for:
> > a) how big the hash is
> > b) whether to take into account source ports or not
> > c) whether to take into account destination ports or not
> > d) etc. :)
> >
> > Maybe someone who's written a qdisc would feel up to this?
>
> I've been hoping to get to it, since I have other stuff I'd like to
> incorporate into a new sfq version.

At first look - I think I'll have to incorporate my changes into your
work. I've not done much, just added hashes and unlocked what Alexey
Kuznetsov did.

> > Playing with it gives interesting results:
> > higher depth -> makes flows equal slower
> > small depth -> makes flows equal faster
> > limit kills big delays when set at about 75-85% of depth.
>
> I don't understand what these last three lines mean. Could you
> explain?

Depth is how many packets can be queued on a row of the hash table.
If you have large queues (higher depth), sfq reacts slower when a new
flow appears (it has to do more work to make queue lengths equal).
When you have short queues it reacts faster, so adjusting depth to
your bandwidth and traffic type can make it do better work.

I set a bounded cbq class to 320kbit and esfq with dst hash:
Start an upload - it gets 40KB.
Start a second one - it should get 20KB asap to be fair.
With depth 128 it would take, let's say, 6 sec. to make both 20KB; with
depth 64, about 3 sec - packets are dropped earlier with a shorter
queue. (I have to make some exact measurements, since this is just an
example and may not be correct.)

Limit sets a threshold on queued packets - if a packet exceeds it, it's
dropped, so delay is smaller; but when sfq tries to make flows equal it
counts depth, not limit. With the above example, depth 128 and limit
100: when the first upload enqueues 100 packets, sfq starts to drop,
but the goal to make flows equal is 64 packets in the queue. The flow
doesn't get the 28 packets which are to be enqueued; they are delayed
for a long time and probably dropped when received. It's similar to
RED - you have a queue which you try to keep at about X percent full
but which can go over it; limit keeps it at most X percent full. What
about making it like RED - the more packets go over limit, the more are
dropped early -> limit goes down; fewer are dropped -> limit goes up?

> > Needs testing and measurements - that's why I made it a
> > separate qdisc and not a patch over sfq; I wanted to compare both.
> >
> > Any feedback, good or bad, is welcome.
>
> I'll send you my current module, also a variant of SFQ. It contains
> doc that I think is worth including, also changes some of the code to
> be more understandable, separates the number of packets allowed in the
> queue from the number of buckets, supports the time limit (discussed
> in earlier messages), controls these things via /proc, maybe a few
> other things I'm forgetting.

I've taken a quick look over it and I like the ideas there; give me
some time to get the details. I hope we can make something good of
both.

-- 
have fun,
alex
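[Editor's aside: the RED-like adaptation of `limit` suggested above could look something like this. A sketch of the idea only - the function name, constants, and floor/ceiling choices are mine; nothing like this exists in sfq or the esfq patch.]

```python
def adapt_limit(limit, overlimit_drops, depth):
    # Sketch of the suggestion above: many early drops -> lower the limit;
    # a quiet interval -> raise it back toward ~85% of depth (the range
    # the mail reports as working well).
    floor, ceiling = depth // 2, depth * 85 // 100
    if overlimit_drops > 0:
        return max(floor, limit - overlimit_drops)
    return min(ceiling, limit + 1)
```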
Alexander Atanasov writes:

> At first look - I think I'll have to incorporate my changes into
> your work. I've not done much, just added hashes and unlocked what
> Alexey Kuznetsov did.

Not quite that simple. You have to throw out about half of my file,
mostly the last third or so, which replaces the hash function, plus
most of the /proc stuff, and probably a lot of other little pieces
scattered here and there. I now recall a few other things - I support
configuration of service weights for different subqueues, which makes
no sense for sfq; I also record the amount of service (bytes and
packets) per subqueue and report these to the tc -s -d stuff, which
also makes no sense for sfq. After removing all that stuff you then
have to restore the hash.

> > > Playing with it gives interesting results:
> > > higher depth -> makes flows equal slower
> > > small depth -> makes flows equal faster
> > > limit kills big delays when set at about 75-85% of depth.
> >
> > I don't understand what these last three lines mean. Could you
> > explain?
>
> Depth is how many packets can be queued on a row of the hash table.
> If you have large queues (higher depth), sfq reacts slower when a new
> flow appears (it has to do more work to make queue lengths equal).
> When you have short queues it reacts faster, so adjusting depth to
> your bandwidth and traffic type can make it do better work.
> I set a bounded cbq class to 320kbit and esfq with dst hash:
> Start an upload - it gets 40KB.
> Start a second one - it should get 20KB asap to be fair.
> With depth 128 it would take, let's say, 6 sec. to make both 20KB;
> with depth 64, about 3 sec - drop packets early with a shorter queue.
> (I have to make some exact measurements, since this is just an
> example and may not be correct.)

I don't see why that should be the case. And I don't recall ever
observing it. This adaptation time should be practically zero.
There's no work in making the queues equal.
(Let's use the word "queue" to mean the whole SFQ and "subqueue" for
the part sharing a hash index.)
If you have, say, 100 packets in one subqueue and 10 in another,
they're already sharing the bandwidth 50-50.

> Limit sets a threshold on queued packets - if a packet exceeds it,
> it's dropped, so delay is smaller; but when sfq tries to make flows
> equal it counts depth, not limit. With the above example, depth 128
> and limit 100: when the first upload enqueues 100 packets, sfq starts
> to drop, but the goal to make flows equal is 64 packets in the queue.
> The flow doesn't get the 28 packets which are to be enqueued; they
> are delayed for a long time and probably dropped when received.

I disagree that the goal is to make the subqueues the same length.
The goal is to serve them with the same bandwidth (as long as they
don't become empty). Queue length depends on how many packets each
download is willing to send without an ack. If one is willing to send
100 and the other is willing to send 10, then the subqueues will
likely be length 100 and 10, but each will still get the same
bandwidth. Without window scaling the max window size is 64K, which
is only about 45 packets, so it's not really normal to have 100
packets in a subqueue.
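[Editor's aside: Don's point is easy to check with a toy model - my own sketch, not the kernel code. Round-robin dequeue gives equal service for as long as both subqueues stay backlogged, regardless of how unequal their lengths are.]

```python
from collections import deque

q1 = deque([1000] * 100)   # one flow has 100 queued packets of 1000 bytes
q2 = deque([1000] * 10)    # the other has only 10
sent = [0, 0]
rounds = 0
while q1 and q2:           # round robin: one packet from each per round
    sent[0] += q1.popleft()
    sent[1] += q2.popleft()
    rounds += 1
# While both subqueues are backlogged, service is exactly 50-50 even
# though the queue lengths never come close to equal.
```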
Hi all,

Is there any implemented qdisc that combines the benefits of an SFQ
with those of a RED queue? Something like FRED, CBT or DWRED?

thx,
Jan
Add RED on IMQ (see http://luxik.cdi.cz/~patrick/imq/) and SFQ on the
network device, or vice versa. Send all incoming packets to imq:

modprobe imq numdevs=1
iptables -t mangle -A PREROUTING -i $IN_DEV -j IMQ --todev 0
ip link set imq0 up

06.06.2002 14:21:55, "Jan Coppens" <Jan.Coppens@intec.rug.ac.be> wrote:

>Hi all,
>
>Is there any implemented qdisc that combines the benefits of a SFQ with
>those of a RED queue. Something like a FRED, CBT or DWRED?
>
>thx,
>Jan

-----------------------------------
mailto:alexey_talikov@texlab.com.uz
BR
Alexey Talikov
FORTEK
-----------------------------------
> I disagree that the goal is to make the subqueues the same length.
> The goal is to serve them with the same bandwidth (as long as they
> don't become empty.)

Not that you need backing up, but I agree with you. SFQ is there to
provide near-fair queuing on a per-session basis. As modified, it
could also be used to provide near-fair queuing on a per-IP basis
instead. Having sub-queues of different depths simply indicates why
fairness is needed (one sub-queue would otherwise have dominated the
available bandwidth). A very long sub-queue, however, indicates that
perhaps fairness is not being achieved (although that's why it's
referred to as stochastic). This is why I suggested at least being
able to tune the size of the hash / number of hash buckets, so as to
redistribute the streams through more sub-queues.

Dealing with bursts properly at Linux timer resolutions is another
issue :-).

-- 
Michael T. Babcock
CTO, FibreSpeed Ltd.
On Wed, 5 Jun 2002, Don Cohen wrote:

> Alexander Atanasov writes:
> > Depth is how many packets can be queued on a row of the hash table.
> > If you have large queues (higher depth), sfq reacts slower when a
> > new flow appears (it has to do more work to make queue lengths
> > equal). When you have short queues it reacts faster, so adjusting
> > depth to your bandwidth and traffic type can make it do better work.
> > [...]
>
> I don't see why that should be the case. And I don't recall ever
> observing it. This adaptation time should be practically zero.
> There's no work in making the queues equal.

When queues get full, sfq has to drop packets, and it must send less
(delay) or drop on the faster flows - which get bigger queues - to
slow them down. That's the work I mean. And it really depends on the
subflow depth.

> (Let's use the word queue to mean the whole SFQ and subqueue for the
> part sharing a hash index.)
> If you have, say, 100 packets in one subqueue and 10 in another,
> they're already sharing the bandwidth 50-50.

No, that means that the flow with 100 packets is using about 10 times
more bandwidth. To be fair, both must have length around 50.

> > Limit sets a threshold on queued packets - if a packet exceeds it,
> > it's dropped, so delay is smaller; but when sfq tries to make flows
> > equal it counts depth, not limit. With the above example, depth 128
> > and limit 100: when the first upload enqueues 100 packets, sfq
> > starts to drop, but the goal to make flows equal is 64 packets in
> > the queue. The flow doesn't get the 28 packets which are to be
> > enqueued; they are delayed for a long time and probably dropped
> > when received.
>
> I disagree that the goal is to make the subqueues the same length.
> The goal is to serve them with the same bandwidth (as long as they
> don't become empty.)

I agree that the goal is to serve the same bandwidth, but bandwidth is
served with packets which carry data of X bytes up to the MTU, so I
consider packets the base unit. (Look at the example as both ftp flows
sending the same-sized packets with the same data at the same rate.)
You have:

enqueue -> put in the subflow - len++
dequeue -> get from the subflow - len--

So when dequeues occur slower than enqueues, the queues fill up -
packets are delayed and at some point get dropped. So the problem is,
when your queues get full with packets, how to dequeue so as to be
fair to the flows which are identified by the hash. SFQ tries to make
the flows' queues equal to do this.

> Queue length depends on how many packets each download is willing to
> send without an ack. If one is willing to send 100 and the other is
> willing to send 10, then the subqueues will likely be length 100 and
> 10, but each will still get the same bandwidth.

That's true if you can send 110 packets and not exceed your bandwidth;
trouble comes when you can send 100 - how do you deal with the 10 that
exceed it?

> Without window scaling the max window size is 64K, which is only
> about 45 packets, so it's not really normal to have 100 packets in a
> subqueue.

It's normal if you have congested links; it's not good in any way :(
But that's why qos is there - to avoid and handle this. You have 45
packets per flow, but with 3 hash collisions you get above the 128
packet limit and drop. esfq solves this: you can tune the number of
subflows and their lengths.

-- 
have fun,
alex
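[Editor's aside: a quick birthday-problem estimate - my own illustration, not from the thread - shows why hash collisions are a real concern even with SFQ's 1024 buckets, and hence why perturbation exists.]

```python
def p_any_collision(flows, buckets=1024):
    # Probability that at least two of `flows` concurrent flows share a
    # bucket, assuming a uniform hash (classic birthday calculation).
    p_distinct = 1.0
    for i in range(flows):
        p_distinct *= (buckets - i) / buckets
    return 1.0 - p_distinct

# With 1024 buckets, roughly 38 concurrent flows already give about
# even odds that some pair of flows collides.
```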
Alexander Atanasov writes:

> > I don't see why that should be the case. And I don't recall ever
> > observing it. This adaptation time should be practically zero.
> > There's no work in making the queues equal.
>
> When queues get full, sfq has to drop packets, and it must send less
> (delay) or drop on the faster flows - which get bigger queues - to
> slow them down. That's the work I mean. And it really depends on
> the subflow depth.

If the queue is "full" (currently measured in number of packets
allowed) then a packet is dropped from the longest subqueue. That
does not necessarily equalize the subqueue lengths. You can still
have queues of wildly differing lengths.

> > (Let's use the word queue to mean the whole SFQ and subqueue for
> > the part sharing a hash index.)
> > If you have, say, 100 packets in one subqueue and 10 in another,
> > they're already sharing the bandwidth 50-50.
>
> No, that means that the flow with 100 packets is using about 10
> times more bandwidth. To be fair, both must have length around 50.

Again, no. It's perfectly possible for one queue to always have
length in the range 99-101 and another to always have length 9-11 and
still be serving them both at the same rate. Just imagine that one
client sends 100 packets at once and the other sends 10 at once, and
then they each send one per second and we forward one per second.
Same rate (suppose all packets are the same length), different queue
lengths. In fact SFQ serves them at the same rate no matter how long
they are. Note - serving in this case means sending, not receiving.

> I agree that the goal is to serve the same bandwidth, but

Not only is this the goal, it's what SFQ does.

> bandwidth is served with packets which carry data of X bytes up to
> the MTU, so I consider packets the base unit.

Actually, SFQ serves the same rate in bytes/sec, not in packets/sec.
If one flow uses only 1000-byte packets and another only 100-byte
packets, the second will get 10 times as many packets/sec as the
first. It's only for measuring the length of the queue and subqueues
that SFQ counts packets. It would be more accurate in some sense to
measure this in bytes as well. But if you're using a lot of packets
then you really ought to use large ones.

> (Look at the example as both ftp flows sending the same-sized
> packets with the same data at the same rate.)
> You have:
> enqueue -> put in the subflow - len++
> dequeue -> get from the subflow - len--
>
> So when dequeues occur slower than enqueues, the queues fill up -
> packets are delayed and at some point get dropped. So the problem
> is, when your queues get full with packets, how to dequeue so as to
> be fair to the flows which are identified by the hash. SFQ tries to
> make the flows' queues equal to do this.

No, it doesn't. It does something like round robin on the subqueues,
which serves them all at the same rate. That's independent of how
long the subqueues get. In fact, if you had infinite memory (and
could tolerate infinite delay) you could skip the drop from the tail
of the longest queue. (You'd have to change the current algorithm a
little bit. The stuff that supports the tail drop wouldn't work.)

> > Queue length depends on how many packets each download is willing
> > to send without an ack. If one is willing to send 100 and the
> > other is willing to send 10, then the subqueues will likely be
> > length 100 and 10, but each will still get the same bandwidth.
>
> That's true if you can send 110 packets and not exceed your
> bandwidth; trouble comes when you can send 100 - how do you deal
> with the 10 that exceed it?

SFQ is not trying to control incoming bandwidth, just outgoing. The
one sending 100 packets above is some other machine, X, sending to
the machine with SFQ, Y. For Y these packets are incoming. We don't
control that. We only control the rate at which Y forwards them.
So X sends 100 packets. Y gets 100 packets. If they all go into our
SFQ and that is limited to 10 packets/sec, then those packets are
going to be in the queue for a while. Meanwhile, Z sends 10 packets
to Y. Now we have two subqueues. In the next second, each will get
to send 5 packets. So now we have queues of length 95 and 5.

> > Without window scaling the max window size is 64K, which is only
> > about 45 packets, so it's not really normal to have 100 packets
> > in a subqueue.
>
> It's normal if you have congested links; it's not good

I don't think this is really related to congested links.
Well-behaved (as in TCP) traffic should not be creating big backlogs
no matter how fast or slow or congested the links are.

> in any way :( But that's why qos is there - to avoid and handle
> this. You have 45 packets per flow, but with 3 hash collisions you
> get above the 128 packet limit and drop. esfq solves this: you can
> tune the number of subflows and their lengths.

I still don't see what esfq is doing that is supposed to help. More
subqueues does reduce the probability of collisions, but this seems
unlikely to be a problem. Limiting the size of a subqueue below the
size of the entire queue doesn't do any good as far as I can see.
On Thu, 6 Jun 2002, Don Cohen wrote:

> If the queue is "full" (currently measured in number of packets
> allowed) then a packet is dropped from the longest subqueue. That
> does not necessarily equalize the subqueue lengths. You can still
> have queues of wildly differing lengths.

In the long term, always dropping from the largest subqueue gives you
equal subqueues.

> Again, no. It's perfectly possible for one queue to always have
> length in the range 99-101 and another to always have length 9-11
> and still be serving them both at the same rate. [...] In fact SFQ
> serves them at the same rate no matter how long they are. Note -
> serving in this case means sending, not receiving.

Okay, we said that sfq drops from the longest subqueue.
We start with:
q1 - 100
q2 - 10
We enqueue (receive) 2 packets/sec and dequeue (send) 1 packet/sec
(round robin from q1 and q2). q1 will grow to the limit first, so sfq
will start to drop there; it will be the longest queue until q2 gets
equal to q1. In time things should go like this: after 56 seconds (if
the limit is 128 packets) q1 hits the limit and sfq starts to drop.
At that point q2 has just 38 packets, and it grows until it gets equal
to q1; then sfq starts to drop round robin. I agree that the rate is
the same, 0.5 packets/sec. But the initial 100 packets are a burst
which sfq delays (buffers), since it fits fine within the queue limit.
If q1 receives another 100 packets, most of them will get dropped.

> Actually, SFQ serves the same rate in bytes/sec, not in packets/sec.
> If one flow uses only 1000-byte packets and another only 100-byte
> packets, the second will get 10 times as many packets/sec as the
> first. It's only for measuring the length of the queue and
> subqueues that SFQ counts packets. [...]

I've not said that it serves packets/sec. I said that the packets I
speak of are equal in size, so rate = packets/sec = bytes/sec. It
powers a queue to send an allotment of bytes, not packets. But when
it has allotment = 1 and a packet with len = 800, the packet is sent
and the queue is powered again - so it sends more than it should.
Measurement in bytes can make this more incorrect, since you have
enqueued packets of different lengths which you cannot split apart and
dequeue in bytes (sigh, wouldn't it be nice?); counting queues in
packets (exactly what they have) is better. Maybe doing it like cbq,
with an average packet size, can give us better results.

[cut]

> SFQ is not trying to control incoming bandwidth, just outgoing.
> [...] In the next second, each will get to send 5 packets. So now
> we have queues of length 95 and 5.

Sending from the queue, not controlling incoming bandwidth - yes,
look above.

> I don't think this is really related to congested links.
> Well-behaved (as in TCP) traffic should not be creating big backlogs
> no matter how fast or slow or congested the links are.

All this started with the so-called Download Accelerators, which do
parallel gets and make TCP behave badly. In the normal case TCP
adjusts fast and does not create backlogs. But when you have to
change its bandwidth, you have to create a backlog to slow it down.
It then clears fast.

> I still don't see what esfq is doing that is supposed to help. More
> subqueues does reduce the probability of collisions, but this seems
> unlikely to be a problem. Limiting the size of a subqueue below the
> size of the entire queue doesn't do any good as far as I can see.

Collisions are a problem, believe me; that's why there is perturbation
to reduce them. You can arrange link sharing more easily when you
control the number of subqueues. Limiting may become good if made
dynamic. People wanted options to tune hash size/limits/depths, and
the most wanted (needed) was the option to select the hash type.
SRC/DST port hash is in TODO now. Searching gives answers like "Patch
SFQ", but that just didn't work for me - I need to have different sfqs
on different classes/interfaces. So that's why there is an esfq. And
I hope it really helps.

-- 
have fun,
alex
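[Editor's aside: Alexander's arithmetic above can be replayed in a few lines. This is an illustrative model of a per-subqueue depth with tail-drop, not the actual qdisc.]

```python
DEPTH = 128                 # per-subqueue depth, as in esfq
q1, q2, t = 100, 10, 0      # starting backlogs, time in seconds
while q1 < DEPTH:
    # Per 2-second tick: each flow enqueues 2 packets (1 pkt/s), and the
    # round-robin dequeue sends 1 packet from each subqueue, so each
    # backlog grows by 1 packet per tick.
    q1 += 2 - 1
    q2 += 2 - 1
    t += 2
# q1 hits the 128-packet depth after 56 seconds, at which point q2
# holds 38 packets - the figures given in the mail above.
```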
Alexander Atanasov writes:

> > Again, no. It's perfectly possible for one queue to always have
> > length in the range 99-101 and another to always have length 9-11
> > and still be serving them both at the same rate. [...]
>
> Okay, we said that sfq drops from the longest subqueue.
> We start with:
> q1 - 100
> q2 - 10
>
> We enqueue (receive) 2 packets/sec

I meant one from each source.

> and dequeue (send) 1 packet/sec (round robin from q1 and q2).

I meant one from each subqueue. So we're in a steady state. After
one second we have sent one from each subqueue and each has been
replenished.

> q1 will grow to the limit first, so sfq will start to drop there,

The total queue size is not growing, so nothing is being dropped.
Maybe you think it's just incredibly lucky that these sources are
both sending at just the same rate that they're being served, but
that's what tcp is supposed to do.

> Measurement in bytes can make this more incorrect, since you have
> enqueued packets of different lengths which you cannot split apart
> and dequeue in bytes (sigh, wouldn't it be nice?); counting queues
> in packets (exactly what they have) is better. Maybe doing it like
> cbq, with an average packet size, can give us better results.

No - that way the errors accumulate. The current scheme is always
within one packet of the ideal in terms of what's sent. The advantage
of measuring queue length in bytes would be more fairness in dropping,
i.e., you should drop from a queue with 10 packets each of length 1000
bytes instead of a queue with 20 packets each of length 100 bytes.

> People wanted options to tune hash size/limits/depths, and the most
> wanted (needed) was the option to select the hash type. SRC/DST
> port hash is in TODO now. Searching gives answers like "Patch SFQ",
> but that just didn't work for me - I need to have different sfqs on
> different classes/interfaces. So that's why there is an esfq. And
> I hope it really helps.

I think the version I sent does give control over queue size and
number of buckets. It also gives you something called expire, which
I think might do the same sort of thing that you wanted from your
limits. We've discussed it on this list before, so I won't go into
it again.
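[Editor's aside: the byte-fairness point about dropping can be sketched like so - illustration only, names are mine.]

```python
def drop_target(subqueues):
    # Choose the subqueue to tail-drop from by byte backlog rather than
    # packet count; subqueues maps a name to its list of packet sizes.
    return max(subqueues, key=lambda name: sum(subqueues[name]))

qs = {"bulk":   [1000] * 10,   # 10,000 bytes in 10 packets
      "chatty": [100] * 20}    #  2,000 bytes in 20 packets
# Packet count alone would pick "chatty"; byte accounting picks "bulk",
# which is the fairer choice Don describes.
```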
On Thu, 6 Jun 2002, Don Cohen wrote:> Maybe you think it''s just incredibly lucky that these sources are > both sending at just the same rate that they''re being served, but > that''s what tcp is supposed to do.No, i do not think like this. If we think just for TCP connections we may end up with ideas like changig window size. TCP tunes to send this way but it also tries to send more each 5 seconds, if it have data. SFQ classify connections with ports, esfq classify them and just by IP so we can have flows: SRC IP+proto+DST IP+PORT - one flow just DST IP - one flow another we can think of - one flow So i think just about packets of size bytes but without protos which they carry and TCP with its features to tune becomes a kind of exception.> > > Measurement in bytes can make this more incorrect since you have enqueued > > packets with different lenghts which you can not split a part and dequeue > > in bytes (sigh wouldn''t it be nice? ), counting queues in packets (exactly > > what they have) is better. May be doing it like the cbq with average > > packet size can gave us better results. > No, the errors are accumulated. It''s always within one packet of the > ideal in terms of what''s sent. The advantage of measuring queue > length in bytes would be more fairness in dropping, i.e., you should > drop from a queue with 10 packets each of length 1000 bytes instead of > a queue with 20 packets each of length 100 bytes.I didn''t get it ? What do you mean - have different queues agains packets sizes ?> I think the version I sent does give control over queue size and > number of buckets. It also gives you something called expire, which > I think might do the same sort of thing that you wanted from your > limits. We''ve discussed it on this list before so I won''t go into it > again.We need a way to control what happens in a subqueue there are these for now: - Time limits you''ve implemented - Someone asked about (G)RED - devik asked about wrr for connections - others ? 
--
have fun,
alex
> In the long term always dropping from the largest subqueue
> gives you equal subqueues.

And, of course, one could have it drop them using a RED-like algorithm
to make the sessions stabilize themselves better.

> what they have) is better. Maybe doing it like the cbq with average
> packet size can give us better results.

Calculating average packet size on the fly isn't that hard either.

> All started with the so-called Download Accelerators,
> which do parallel gets and make TCP behave badly.
> In the normal case TCP adjusts fast and does not create backlogs.
> But when you have to change its bandwidth, you have to create
> a backlog to slow it down. It then clears fast.

I actually usually download pieces of the same file from multiple mirrors
if it's truly large and I want it fast :-). Ranged downloads make that
quite possible ... The only way to equalize bandwidth "fairly" in these
scenarios still seems to be to implement the hierarchical approach of
hashing against destination IP (the user receiving the packets), then
having sub-queues to those queues based on ports, and dropping packets
(based on RED?) in each of these sub-queues (because they're closer to
the actual TCP sessions involved).

> People wanted options to tune hashsize/limits/depths and
> the most wanted (needed) was the option to select hash type.
> SRC/DST Port hash is in TODO now.

And I'm glad it exists for testing at least.
--
Michael T. Babcock
CTO, FibreSpeed Ltd.
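"Average packet size on the fly" could be as simple as an exponentially weighted moving average, updated once per enqueue. A minimal sketch; the weight and initial value here are arbitrary assumptions, not CBQ's actual constants:

```python
def ewma_update(avg, sample, weight=0.125):
    """One EWMA step: move the average a fraction `weight`
    of the way toward the latest sample."""
    return (1 - weight) * avg + weight * sample

class AvgPktSize:
    """Running average of packet sizes, updated per enqueue.
    The initial guess only matters until a few packets arrive."""
    def __init__(self, initial=1000):
        self.avg = initial

    def observe(self, pkt_len):
        self.avg = ewma_update(self.avg, pkt_len)
        return self.avg
```

This costs one multiply-add per packet, so tracking it per qdisc (or even per subqueue) would be cheap.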
----- Original Message -----
From: "Michael T. Babcock" <mbabcock@fibrespeed.net>
To: <lartc@mailman.ds9a.nl>
Sent: Friday, June 07, 2002 4:18 PM
Subject: RE: [LARTC] SFQ buckets/extensions

> > In the long term always dropping from the largest subqueue
> > gives you equal subqueues.
>
> And, of course, one could have it drop them using a RED-like algorithm
> to make the sessions stabilize themselves better.

If you are dealing with a responsive session (like TCP), that is...

> > what they have) is better. Maybe doing it like the cbq with average
> > packet size can give us better results.
>
> Calculating average packet size on the fly isn't that hard either.
>
> > All started with the so-called Download Accelerators,
> > which do parallel gets and make TCP behave badly.
> > In the normal case TCP adjusts fast and does not create backlogs.
> > But when you have to change its bandwidth, you have to create
> > a backlog to slow it down. It then clears fast.
>
> I actually usually download pieces of the same file from multiple
> mirrors if it's truly large and I want it fast :-). Ranged downloads
> make that quite possible ... The only way to equalize bandwidth
> "fairly" in these scenarios still seems to be to implement the
> hierarchical approach of hashing against destination IP (the user
> receiving the packets) and then having sub-queues to those queues
> based on ports and to drop packets (based on RED?) in each of these
> sub-queues (because they're closer to the actual TCP sessions
> involved).

Why not implement sfb (stochastic fair Blue)? From the papers I read,
the algorithm with the double buffered moving hash extension seemed like
a better alternative to FRED, stochastic fair RED and SFQ.

> > People wanted options to tune hashsize/limits/depths and
> > the most wanted (needed) was the option to select hash type.
> > SRC/DST Port hash is in TODO now.
>
> And I'm glad it exists for testing at least.
> --
> Michael T. Babcock
> CTO, FibreSpeed Ltd.
> _______________________________________________
> LARTC mailing list / LARTC@mailman.ds9a.nl
> http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://lartc.org/
Alexander Atanasov writes:

> SFQ classifies connections by ports; esfq can also classify them by
> just IP, so we can have flows:
>  SRC IP+proto+DST IP+PORT - one flow
>  just DST IP - one flow
>  another we can think of - one flow
> So I think just about packets of some size in bytes, without the
> protocols they carry, and TCP with its features to tune becomes a kind
> of exception.

I don't think this is true. First, of course, almost all packets are
TCP. Second, I'd expect that multiple TCP streams along the same path
tend to equalize, assuming of course that they are not differentiated
along the way. E.g., with just fifo queuing I'd expect that two scp's
along the same path would tend toward equal bandwidth. If this is true
then multiple TCP streams along the same path would tend to act like
one. Perhaps someone else out there can either confirm or debunk that
expectation?

Of course, things get more complicated when the streams have the same
source but different destinations. In that case I'd expect them to again
adjust to the differences in bandwidth to the different destinations. So
maybe sub-subqueues below the source IP subqueues are not so important.

> > No, the errors are accumulated. It's always within one packet of the
> > ideal in terms of what's sent. The advantage of measuring queue
> > length in bytes would be more fairness in dropping, i.e., you should
> > drop from a queue with 10 packets each of length 1000 bytes instead of
> > a queue with 20 packets each of length 100 bytes.
>
> I didn't get it. What do you mean - have different queues against
> packet sizes?

I was suggesting that when you add a packet to a subqueue you not just
record the fact that there are now 16 packets in that subqueue, but 1600
bytes. Then the limit on total queue size is measured in bytes. When you
enqueue, you check to see whether this packet would cause the limit to
be exceeded, and if so you drop from the subqueue with the most bytes.
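The byte-counting suggestion might be modeled like this. A toy sketch only, not SFQ code; the class name and parameters are made up, and note that the `max()` scan for the fattest subqueue is linear in the number of active flows, which is precisely where the constant-time concern comes in:

```python
from collections import deque

class ByteFairQueue:
    """Global queue limit measured in bytes; on overflow, drop from
    the tail of the subqueue currently holding the most bytes."""
    def __init__(self, limit_bytes=16000):
        self.limit = limit_bytes
        self.total = 0
        self.queues = {}   # flow id -> deque of packet lengths
        self.bytes = {}    # flow id -> bytes queued for that flow

    def enqueue(self, flow, pkt_len):
        # Make room: evict from the byte-fattest subqueue until we fit.
        while self.total + pkt_len > self.limit:
            if not self.bytes:
                return False                  # packet bigger than limit
            fat = max(self.bytes, key=self.bytes.get)   # O(flows) scan
            dropped = self.queues[fat].pop()            # tail drop
            self.bytes[fat] -= dropped
            self.total -= dropped
            if self.bytes[fat] == 0:
                del self.bytes[fat], self.queues[fat]
        self.queues.setdefault(flow, deque()).append(pkt_len)
        self.bytes[flow] = self.bytes.get(flow, 0) + pkt_len
        self.total += pkt_len
        return True
```

The fairness gain is visible here: a flow of few large packets can out-weigh a flow of many small ones and be the drop victim, which packet counting would get backwards.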
That's closer to the spirit of SFQ, but it's probably more expensive at
run time. The current implementation has a very fast way to determine
which subqueue has the most packets. The object is to make
enqueue/dequeue small constant-time operations. I don't see (so far) how
to do that with queue lengths measured in bytes.
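The "very fast way" works because packet counts change by exactly one at a time: SFQ keeps subqueues on lists indexed by their depth, plus a running maximum, so the longest subqueue is found and re-filed in O(1). A simplified model (sets standing in for the kernel's doubly linked lists; names are illustrative):

```python
class DepthIndex:
    """Track which subqueue is longest in O(1) amortized time.
    Works because each enqueue/dequeue changes a depth by +/-1."""
    def __init__(self):
        self.by_depth = {}   # depth -> set of flow ids at that depth
        self.depth = {}      # flow id -> current depth
        self.max_depth = 0

    def adjust(self, flow, delta):
        """Move a flow between depth lists after a +/-1 change."""
        old = self.depth.get(flow, 0)
        new = old + delta
        self.by_depth.get(old, set()).discard(flow)
        self.depth[flow] = new
        self.by_depth.setdefault(new, set()).add(flow)
        if new > self.max_depth:
            self.max_depth = new
        # max_depth only ever decays by 1 per dequeue: amortized O(1).
        while self.max_depth > 0 and not self.by_depth.get(self.max_depth):
            self.max_depth -= 1

    def longest(self):
        """Any flow at the current maximum depth."""
        if self.max_depth == 0:
            return None
        return next(iter(self.by_depth[self.max_depth]))
```

The trick relies on depths moving in unit steps; byte counts jump by a whole packet length, which is exactly why the same bookkeeping doesn't carry over directly.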
> No, i do not think like this. If we think just for TCP connections
> we may end up with ideas like changing window size. TCP tunes to send
> this way but it also tries to send more every 5 seconds, if it has data.

Please, not this. :-) Packetizer sw goes this way and it is
 - not allowed by the RFC
 - problematic with some TCP stacks

devik
> that quite possible ... The only way to equalize bandwidth "fairly" in
> these scenarios still seems to be to implement the hierarchical approach
> of hashing against destination IP (the user receiving the packets) and

Exactly. The discussion should be about how to implement it efficiently.
What about having N IP buckets and N IP/PORT buckets? When the IP/PORT
hash resolves to a bucket, we could (re)link the IP/PORT bucket into the
IP bucket's DRR list. On a collision (two different IPs but the same
IP/PORT hash) the whole bucket would be stolen from the original IP and
moved to the new one. It is bad, but it is why we call it Stochastic ;)
It would be fast with bounded memory usage ... Just my opinion ..

devik

> > People wanted options to tune hashsize/limits/depths and
> > the most wanted (needed) was the option to select hash type.
> > SRC/DST Port hash is in TODO now.
>
> And I'm glad it exists for testing at least.
> --
> Michael T. Babcock
> CTO, FibreSpeed Ltd.
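The bucket-stealing scheme above could be sketched like this. Everything here is an illustrative assumption (names, sizes, dropping the stolen bucket's backlog); the point is only to show the relink-on-collision mechanics:

```python
N = 256  # assumed table size for both levels

class FlowBucket:
    """A per-flow (IP/PORT) bucket, linked under one IP bucket."""
    def __init__(self):
        self.owner_ip = None   # which IP bucket currently owns us
        self.pkts = []

ip_lists = {}                  # ip -> list of FlowBucket (its DRR list)
flow_buckets = [FlowBucket() for _ in range(N)]

def enqueue(dst_ip, flow_hash, pkt):
    b = flow_buckets[flow_hash % N]
    if b.owner_ip != dst_ip:
        # Collision: a different IP hashed to this flow bucket.
        # Steal the whole bucket for the new IP (the stochastic part).
        if b.owner_ip is not None:
            ip_lists[b.owner_ip].remove(b)
            b.pkts.clear()     # old flow's backlog is sacrificed here
        b.owner_ip = dst_ip
        ip_lists.setdefault(dst_ip, []).append(b)
    b.pkts.append(pkt)
```

Memory stays bounded at 2N buckets, and both the hash lookup and the relink are O(1) per packet (the `remove` here is O(list) in Python, but the kernel's doubly linked DRR list makes unlinking O(1)).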
> exactly. The discussion should be about how to implement it
> efficiently. What about to have N IP buckets and N IP/PORT
> buckets. When the IP/PORT hash resolves to a bucket then we could
> (re)link the

Consider a new classful queueing discipline "SFC" that behaves exactly
as SFQ does and can have only one sub-queue attached to it.

    ... SFC (8 bit hash against dest ip)
         |
         \-- SFC (11 bit hash against src ip, port, dest ip, port)
              |
              \-- dequeue ...

Is this even possible with how the classes are currently built in the
kernel?
--
Michael T. Babcock
CTO, FibreSpeed Ltd.
It can't be done. The outer SFC could do nothing with the packet. It
could only give it to the inner one....

On Fri, 7 Jun 2002, Michael T. Babcock wrote:

> > exactly. The discussion should be about how to implement it
> > efficiently. What about to have N IP buckets and N IP/PORT
> > buckets. When the IP/PORT hash resolves to a bucket then we could
> > (re)link the
>
> Consider a new classful queueing discipline "SFC" that behaves exactly
> as SFQ does and can have only one sub-queue attached to it.
>
>     ... SFC (8 bit hash against dest ip)
>          |
>          \-- SFC (11 bit hash against src ip, port, dest ip, port)
>               |
>               \-- dequeue ...
>
> Is this even possible with how the classes are currently built in the
> kernel?
> --
> Michael T. Babcock
> CTO, FibreSpeed Ltd.
For the sake of a play-by-play (and why it wouldn't quite work right
initially):

 1) we need to dequeue a packet
 2) we ask the 11-bit lower SFC for a packet
 3) it asks the upper SFC
 4) the upper SFC takes the next bucket, based on IP, and gives us a
    packet
 5) the lower SFC takes that packet and has nothing else to work with,
    so it passes it on

... So you're right :-) The two have to intercommunicate more ... so
they have to be one.

> Hmm, I can't imagine that :) Probably it should be tested
>
> On Fri, 7 Jun 2002, Michael T. Babcock wrote:
> > > It can't be done. The outer SFC could do nothing with the packet.
> > > It could only give it to the inner one....
> >
> > ... Ah, but it can choose which packet to give the inner one. It
> > would reorder the packets into a semi-fair order based on IP.
> >
> > > > ... SFC (8 bit hash against dest ip)
> > > >      |
> > > >      \-- SFC (11 bit hash against src ip, port, dest ip, port)
> > > >           |
> > > >           \-- dequeue ...
On Fri, Jun 07, 2002 at 04:37:07PM +0200, Jan Coppens wrote:

> Why not implement sfb (stochastic fair Blue)? From the papers I read,
> the algorithm with the double buffered moving hash extension seemed
> like a better alternative to FRED, stochastic fair RED and SFQ.

I'd like to toss in a request that anyone who implements this please
make the bloom filter size adjustable and provide a mask to remove
information from the hash (i.e. sport/dport) to allow for per-host
fairness rather than per-flow.
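The masking request amounts to zeroing selected fields of the flow tuple before hashing, so all of a host's flows collapse into the same buckets. A minimal sketch; the mask flags, field names, and SHA-1 hash are illustrative assumptions, not any real SFB implementation:

```python
import hashlib

# Hypothetical mask bits: which fields to erase before hashing.
MASK_SPORT = 1
MASK_DPORT = 2

def masked_flow_hash(src, dst, sport, dport, mask=0, bins=1 << 13):
    """Hash a flow tuple into `bins` buckets, with masked fields
    zeroed so masked-out differences can't separate flows."""
    if mask & MASK_SPORT:
        sport = 0
    if mask & MASK_DPORT:
        dport = 0
    key = f"{src}|{dst}|{sport}|{dport}".encode()
    return int.from_bytes(hashlib.sha1(key).digest()[:4], "big") % bins
```

With both port bits set, every flow between the same host pair hashes identically, giving per-host rather than per-flow accounting in the filter.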