I notice one other small problem with my modified version of SFQ. The fact that packets can be dropped at dequeue time is incompatible with the way HTB (and probably CBQ and others modeled on it) keep statistics. When I fill a low-rate queue, causing packets to expire and be dropped at dequeue, I get interesting statistics like this:

This is my variant of SFQ:
  qdisc plfq 8016: dev eth1
  ...
  Sent 17468 bytes 166 pkts (dropped 833, overlimits 0)
  ...
  qdisc htb 1: dev eth1 r2q 10 default 2 dcache 0
   deq_util 1/1000000 deq_rate 0 trials_per_deq 0
   dcache_hits 0 direct_packets 0
  Sent 32626 bytes 309 pkts (dropped 690, overlimits 1126)
  backlog 143p

Note that plfq, which is inside of htb, thinks it has dropped more packets than htb has. The reason is that htb considers a packet to be "sent" when it's enqueued. I notice in retrospect that even with current SFQ, the statistics are not correct. The reason is that it's possible to successfully enqueue packet #1 (at which point HTB adds its length to the number of bytes "sent"), then fail to enqueue packet #2. But within SFQ this "failure" might really mean that packet #2 was accepted to be sent later and it displaced packet #1. It happens that this "replacement" keeps the packet count correct, but not the byte count. One can imagine another version of SFQ that, instead of limiting the number of packets in the queue, limits the number of bytes. In that case the number of packets would not be correct either.

I think that HTB (and other classful qdiscs) should count the number of bytes and packets "sent" when they're dequeued, not when they're enqueued. It's an interesting point that, at the moment, I don't see any way for the classful qdisc to know how many packets and bytes were dropped. However, I think it would be useful to know instead how many were passed to enqueue. Then you'd get a report like

  Enqueued 32626 bytes 309 pkts, Dequeued 17468 bytes 166 pkts

where the difference is the sum of what's been dropped and what's still in the queue.
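[A minimal sketch of the enqueued/dequeued accounting described above, in 2.4-era qdisc style. The struct and function names are hypothetical illustrations, not the actual HTB or plfq code.]

#include <linux/skbuff.h>

/* Hypothetical counters kept in a classful qdisc's private data. */
struct xstats_example {
    unsigned long enq_bytes, enq_packets;   /* handed to enqueue */
    unsigned long deq_bytes, deq_packets;   /* actually sent by dequeue */
};

/* Count at enqueue: records only what was offered to the qdisc,
 * regardless of whether a child later drops or displaces the packet. */
static inline void count_enqueued(struct xstats_example *st, struct sk_buff *skb)
{
    st->enq_bytes += skb->len;
    st->enq_packets++;
}

/* Count at dequeue: only packets really handed up count as "sent",
 * so (enqueued - dequeued) = dropped + still queued. */
static inline void count_dequeued(struct xstats_example *st, struct sk_buff *skb)
{
    st->deq_bytes += skb->len;
    st->deq_packets++;
}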
Hi, only a few notes on the theme.

You are right about the displacement and bad enqueue byte counters. Maybe it would be better to count packets at dequeue time only in the classful qdisc. It also makes better sense because a qdisc can also instruct SFQ to drop a packet - this drop is then not decremented from HTB/CBQ's stats. HTB uses enqueue-time counting because CBQ does it too, and at the very beginning I based my work on CBQ. Seems it is time to change things.

Another important piece of info: you really CAN'T drop packets in the dequeue routine if you don't want to fool the classful parent. Much of the logic in the CBQ/HTB/ATM/PRIO qdiscs is based on the existence of backlog. When you drop in dequeue, the parent will think that it still has some packets somewhere (in children) and will constantly attempt to find them. And will be confused by the fact that it can't. The enqueue routine can give you confirmation of packet enqueue, the drop routine the same. Only dequeue can't say "hey, here is your skb, and by the way two others were dropped". What a pity.

SOLUTIONS: One way is to monitor (store) the q.qlen variable of all children of the classful qdisc. When I call enqueue/drop/dequeue on a child I'll see its discrepancy against the last stored value and will update my counter accordingly. In the same way we could add q.bytelen and then we would be able to do the same for bytes - but this is probably not necessary, as we need to know only bytes dequeued ...

There is another way - add a parameter to the dequeue routine which tells us how many packets we should decrease our qlen by. But I think that the former approach is simpler and prone to "silent" drops (for example by some timer).

Don, do you plan to implement these corrections for CBQ/PRIO? I'll do the change for HTB.

devik
Eh, more info if you are interested: I already do some backlog fixes in HTB2/3. In CBQ it is possible to drop a class which already has backlog, and this backlog is not subtracted from CBQ's own. I feel this is an error, however it needs to be tested. The same holds for attaching an inner qdisc - if the previous one had backlog then CBQ's backlog will probably be messed up.

devik
Martin Devera writes:

> Hi, only a few notes on the theme.
> You are right about the displacement and bad enqueue byte counters.
> Maybe it would be better to count packets at dequeue time only in
> the classful qdisc. It also makes better sense because a qdisc can
> also instruct SFQ to drop a packet - this drop

I don't understand. What other qdisc instructs SFQ to drop?

> is then not decremented from HTB/CBQ's stats. HTB uses enqueue-time
> counting because CBQ does it too, and at the very

I figured that's why it worked that way.

> beginning I based my work on CBQ. Seems it is time to change things.
>
> Another important piece of info: you really CAN'T drop packets in the
> dequeue routine if you don't want to fool the classful parent. Much of
> the logic in the CBQ/HTB/ATM/PRIO qdiscs is based on the existence of
> backlog.

I gather you only care whether this qdisc is waiting to send, not how much is in the queue.

> When you drop in dequeue, the parent will think that it still has some
> packets somewhere (in children) and will constantly attempt to find
> them. And will be confused by the fact that it can't. The enqueue
> routine can give you confirmation of packet enqueue, the drop routine
> the same. Only dequeue can't say "hey, here is your skb, and by the
> way two others were dropped". What a pity.
>
> SOLUTIONS: One way is to monitor (store) the q.qlen variable of all
> children of the classful qdisc. When I call enqueue/drop/dequeue on it
> I'll see its discrepancy against the last stored value and will update
> my counter

This seems like the right solution, except that you shouldn't even need to store a counter. Just use the one in the child qdisc. BTW, I think you are justified in assuming that this counter can only change in two ways: decrease when dequeue is called (but possibly by more than one packet) and increase by 1 when enqueue is called. And I guess also requeue. Which raises another problem, since that's not called by the parent, right?

> accordingly. In the same way we could add q.bytelen and then we would
> be able to do the same for bytes - but this is probably not necessary,
> as we need to know only bytes dequeued ...

My impression is that every qdisc is currently supposed to keep track of the number of packets in the queue, but not necessarily the number of bytes. Best not to change that, especially since we don't even have any currently proposed use for it.

> There is another way - add a parameter to the dequeue routine which
> tells us how many packets we should decrease our qlen by. But I think
> that the former approach is simpler and prone to "silent" drops (for
> example by some timer).

I think you're saying the former approach is better, but "prone" suggests otherwise. I prefer the former approach. It's better for all the other qdiscs out there to keep working without change. There are only a few classful qdiscs that would have to change, and they could still work unchanged with qdiscs that don't do silent drops.

> Don, do you plan to implement these corrections for CBQ/PRIO?

Not any time soon. The last time I tried to read CBQ it was in order to figure out what all those parameters mean, and I didn't get very far. Now that I no longer use it I have less incentive.

> I'll do the change for HTB.

Thanks. But which ones? It sounds like you plan to
- count as sent only what's returned from dequeue
- ignore the return value of enqueue and check the queue length to see whether it's empty
I like those, but also suggested that HTB would be a good place to add code for dropping stale packets.
If you're interested in doing that I should mention: Not all packets come with timestamps (e.g., locally generated ones). Solution: at enqueue, if the timestamp is 0, set it to the current time.
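[A minimal sketch of that fix at the top of an enqueue routine, assuming the 2.4-era skb->stamp field (a struct timeval); the helper name is hypothetical.]

#include <linux/skbuff.h>
#include <linux/time.h>

/* Locally generated packets may reach the qdisc with a zero timestamp;
 * stamp them at enqueue so an age check later has something to compare
 * against. Received packets keep the timestamp set on arrival. */
static inline void stamp_if_missing(struct sk_buff *skb)
{
    if (skb->stamp.tv_sec == 0 && skb->stamp.tv_usec == 0)
        do_gettimeofday(&skb->stamp);
}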
> packets at dequeue time only in the classful qdisc. It also makes better
> sense because a qdisc can also instruct SFQ to drop a packet - this drop
>
> I don't understand. What other qdisc instructs SFQ to drop?

CBQ, only at the moment. It can call q->ops->drop on an inner qdisc as a response to an overlimit event. It forces the system to converge faster to an underlimit state.

> Another important piece of info: you really CAN'T drop packets in the
> dequeue routine if you don't want to fool the classful parent. Much of
> the logic in the CBQ/HTB/ATM/PRIO qdiscs is based on the existence of
> backlog.
>
> I gather you only care whether this qdisc is waiting to send, not how
> much is in the queue.

yes ..

> SOLUTIONS: One way is to monitor (store) the q.qlen variable of all
> children of the classful qdisc. When I call enqueue/drop/dequeue on it
> I'll see its discrepancy against the last stored value and will update
> my counter
>
> This seems like the right solution, except that you shouldn't even
> need to store a counter. Just use the one in the child qdisc.
> BTW, I think you are justified in assuming that this counter can only
> change in two ways: decrease when dequeue is called (but possibly by
> more than one packet) and increase by 1 when enqueue is called.
> And I guess also requeue. Which raises another problem, since that's
> not called by the parent, right?

No - you can't use the q.qlen of the inner qdisc. It is under the inner qdisc's control and you can't change it. And the assumption about how it can change should not be used unless you have some rules for all qdiscs about how to change it. Somewhere in the sources you can read that one of a qdisc's duties is to ensure that its q.qlen value is correct. It doesn't restrict how it can change and when.

I'd suggest maintaining for each leaf a last_qlen variable which would start at 0 (q.qlen will also be 0). Then from time to time (in enqueue, dequeue, drop ....) you can "repair" your own q.qlen by doing:

  q.qlen += cl->q.qlen - cl->last_qlen; cl->last_qlen = cl->q.qlen;

for each leaf cl. Hence cl->last_qlen tells you what part of the total q.qlen is "owned" by cl. Simple and robust, isn't it?

> I like those, but also suggested that HTB would be a good place to add
> code for dropping stale packets. If you're interested in doing that I
> should mention:
> Not all packets come with timestamps (e.g., locally generated ones).
> Solution: at enqueue, if the timestamp is 0, set it to the current
> time.

It would be better done in fifo/sfq. It is because HTB can do it only when dequeuing, and at that time some packets were already dropped by SFQ because it has had queues full of "old" packets. SFQ should maintain a FIFO queue of all packets and look at the head of the list sometimes to keep its queues free of old packets and thus have room for new ones.

devik
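[A minimal C sketch of the per-leaf repair devik describes above, assuming 2.4-era headers and a hypothetical per-class structure holding the child qdisc pointer and the last seen qlen; this is not actual HTB or CBQ code.]

#include <net/pkt_sched.h>

/* Hypothetical per-class (per-leaf) bookkeeping inside a classful qdisc. */
struct leaf_example {
    struct Qdisc *child;        /* inner qdisc (e.g. SFQ) */
    unsigned int last_qlen;     /* child's qlen as last seen by the parent */
};

/* Call this around every enqueue/dequeue/drop on the child: whatever
 * the child did to its own q.qlen in the meantime (including "silent"
 * drops) is folded into the parent's qlen. */
static void repair_parent_qlen(struct Qdisc *parent, struct leaf_example *cl)
{
    int delta = (int)cl->child->q.qlen - (int)cl->last_qlen;

    parent->q.qlen += delta;
    cl->last_qlen = cl->child->q.qlen;
}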
Martin Devera writes:

> > SOLUTIONS: One way is to monitor (store) the q.qlen variable of all
> > children of the classful qdisc. When I call enqueue/drop/dequeue on it
> > I'll see its discrepancy against the last stored value and will update
> > my counter
> >
> > This seems like the right solution, except that you shouldn't even
> > need to store a counter. Just use the one in the child qdisc.
> > BTW, I think you are justified in assuming that this counter can only
> > change in two ways: decrease when dequeue is called (but possibly by
> > more than one packet) and increase by 1 when enqueue is called.
> > And I guess also requeue. Which raises another problem, since that's
> > not called by the parent, right?
>
> No - you can't use the q.qlen of the inner qdisc. It is under the
> inner qdisc's control and you can't change it. And the assumption
> about how it can change should not be used unless you have some rules
> for all qdiscs about how to change it.

The only way I can make sense of this is that we must be miscommunicating. I do not propose that HTB alter an SFQ q.qlen. I propose that HTB read it. I think you were suggesting that HTB make its own copy of it, and I was just suggesting that there's no need for that copy.

> Somewhere in the sources you can read that one of a qdisc's duties is
> to ensure that its q.qlen value is correct. It doesn't restrict how it
> can change and when.

I think it's pretty much necessary from the implementation viewpoint to make some assumptions. In particular, you don't want to look at all inactive leaves every time you do a dequeue, just to make sure that they haven't somehow manufactured their own packets. Once the queue is empty you expect it to stay empty until you call enqueue, right?

> I'd suggest maintaining for each leaf a last_qlen variable which would
> start at 0 (q.qlen will also be 0). Then from time to time (in enqueue,
> dequeue, drop ....) you can "repair" your own q.qlen by doing:
>   q.qlen += cl->q.qlen - cl->last_qlen; cl->last_qlen = cl->q.qlen;
> for each leaf cl. Hence cl->last_qlen tells you what part of the total
> q.qlen is "owned" by cl. Simple and robust, isn't it?

I don't get it. A qdisc is already supposed to maintain its qlen. You don't think that's being done correctly now? Why do you want this other data? You're proposing something that will require changing every existing qdisc? Or is this the class (in htb code) trying to keep a copy of the data in its qdisc? If the latter, then I guess I don't object (since it's only in your code and transparent to me), but I still don't see why you need it.

> > I like those, but also suggested that HTB would be a good place to add
> > code for dropping stale packets. If you're interested in doing that I
> > should mention:
> > Not all packets come with timestamps (e.g., locally generated ones).
> > Solution: at enqueue, if the timestamp is 0, set it to the current
> > time.
>
> It would be better done in fifo/sfq. It is because HTB can do it only
> when dequeuing, and at that time some packets were already dropped by
> SFQ because it has had queues full of "old" packets.

I see, you suggest that I could do better if enqueue started with something like

  while (oldest packet expired) drop oldest packet

Of course this is still only an approximation. There will still be times when you have to drop a packet but it turns out that, before the next one is dequeued, another packet will have expired (so if you only knew that, you could have saved the packet you dropped before).
> SFQ should maintain a FIFO queue of all packets and look at the head of
> the list sometimes to keep its queues free of old packets and thus
> to have room for new ones.

I suppose SFQ could still do something like the above even if HTB were dropping expired packets at dequeue. And this would save all qdiscs the work of dropping expired packets at dequeue, which is still needed even if you also check at enqueue. Also, only dropping at dequeue is a good approximation to what you want in the common case where the queue size is too large for the low rate assigned to the class. I argue it's much better to do this in a few classful qdiscs than to ask for it to be done by all the others.
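[A minimal sketch of the "drop expired packets first" idea for a simple FIFO-style queue, where the head is always the oldest packet. The expiry constant and function names are hypothetical, it assumes the 2.4-era skb->stamp field, and a qdisc doing this would also have to tell its parent about the drops (the qlen problem discussed above).]

#include <linux/skbuff.h>
#include <linux/time.h>

#define EXPIRE_USECS (5 * 1000000)  /* hypothetical 5 second age limit */

static int stamp_expired(struct sk_buff *skb, struct timeval *now)
{
    long age = (now->tv_sec - skb->stamp.tv_sec) * 1000000
             + (now->tv_usec - skb->stamp.tv_usec);
    return age > EXPIRE_USECS;
}

/* At enqueue (it could equally be called from dequeue): throw away
 * everything at the head of the list that has already expired, so the
 * queue keeps room for fresh packets. Returns the number dropped. */
static unsigned int drop_expired(struct sk_buff_head *list)
{
    struct sk_buff *skb;
    struct timeval now;
    unsigned int dropped = 0;

    do_gettimeofday(&now);
    while ((skb = skb_peek(list)) != NULL && stamp_expired(skb, &now)) {
        __skb_dequeue(list);
        kfree_skb(skb);
        dropped++;
    }
    return dropped;
}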
Don Cohen wrote:

> > > I like those, but also suggested that HTB would be a good place to add
> > > code for dropping stale packets. If you're interested in doing that I
> > > should mention:
> > > Not all packets come with timestamps (e.g., locally generated ones).
> > > Solution: at enqueue, if the timestamp is 0, set it to the current
> > > time.
> >
> > It would be better done in fifo/sfq. It is because HTB can do it only
> > when dequeuing, and at that time some packets were already dropped by
> > SFQ because it has had queues full of "old" packets.
>
> I see, you suggest that I could do better if enqueue started with
> something like
>   while (oldest packet expired) drop oldest packet
> Of course this is still only an approximation. There will still be
> times when you have to drop a packet but it turns out that, before the
> next one is dequeued, another packet will have expired (so if you only
> knew that, you could have saved the packet you dropped before).
>
> > SFQ should maintain a FIFO queue of all packets and look at the head of
> > the list sometimes to keep its queues free of old packets and thus
> > to have room for new ones.
>
> I suppose SFQ could still do something like the above even if HTB were
> dropping expired packets at dequeue. And this would save all qdiscs
> the work of dropping expired packets at dequeue, which is still needed
> even if you also check at enqueue. Also, only dropping at dequeue is
> a good approximation to what you want in the common case where the
> queue size is too large for the low rate assigned to the class. I
> argue it's much better to do this in a few classful qdiscs than to ask
> for it to be done by all the others.

I hope I got you right, here are my thoughts on this:

I don't think dropping at dequeue is necessary. The goal is not to provide a maximum in-queue time; if the qdisc is able to send there is no reason to drop. The problem is that if the qdisc is not able to send, many packets can get queued until drops occur. This means it takes a long time until the sender receives an indication of congestion. For TCP, congestion is indicated by either consecutive ACKs with the same ACK number or SACKs. ACKs are only generated by the receiver if something was actually received, so by dropping packets after some timeout the time until a duplicate ACK is generated becomes smaller. If you assume the expiration time to be smaller than the time required to fill the sender's congestion window, it doesn't make sense anymore to drop packets during dequeue, as this could possibly prevent a duplicate ACK from being generated, i.e. delay the relay of the congestion indication. It sounds better to me to check for expired packets during enqueue (timers would probably be too expensive I guess) and drop them before enqueueing the new packet.

With SFQ, what about not keeping a per-packet timeout but a per-flow congestion-indication timeout which would drop the first packet of a flow after max(age of any packet from this flow) >= timeout? This would assure congestion indication to be delivered to the sender as soon as the flow is chosen to send again after the timeout occurred (as long as qlen(flow) > 1, which is probably the case).

Bye, Patrick
Patrick McHardy writes:

> I don't think dropping at dequeue is necessary.

Here's an example showing it is. I have an SFQ with max queue size 128 and a very low rate, say about 1 packet/sec, that I use to limit the rate of SYNs. Now as part of a test I send a SYN flood, say 200 packets in one second. In that first second SFQ drops 200-128, but the time limit won't drop any more on enqueue. Now we're sending one packet/sec and I try to open a tcp connection. Suppose the age limit is 5 sec and we drop on enqueue only. If I try to open the tcp connection 3 sec after the flood, none of the 124 or so packets in the queue has expired and it's going to take 2 min for my SYN to get through. Whereas, if dequeue drops expired packets, then I can get through in 2 sec.

> The goal is not to provide a maximum in-queue time, if the qdisc is
> able to send there is no reason to drop.

The reason to drop is that it's a waste of time/bandwidth to send expired packets, and even worse to make something unexpired wait for them (and even worse yet if it has to wait long enough to expire itself).

> The problem is if the qdisc is not able to send many packets
> can get queued until drops occur. This means it takes a long time until
> the sender receives indication of congestion. For TCP, congestion is
> indicated by either consecutive ACKs with the same ACK number or SACKs.

In this case tcp should stop because it doesn't get any acks, because its packets are not getting forwarded. Or if the problem is the other direction, it should stop because it's not getting the acks.

> ACKs are only generated by the receiver if something was actually
> received, so by dropping packets after some timeout the time until a
> duplicate ACK is generated becomes smaller.

I think now we're getting into the subject of why it's good to drop packets that have been waiting for a long time. If your objective is to generate a duplicate ack then I'm not sure dropping packets is the right way. After all, you might drop a packet that would have generated one sooner. For that matter, you might have dropped one that would have generated a non-duplicate ack, which would be even better for tcp. Nevertheless, I do think it's good to drop packets that are not forwarded in a timely fashion. It's just not a simple argument.

> If you assume the expiration time to be smaller than the time required
> to fill the sender's congestion window, it doesn't make sense anymore to
> drop packets during dequeue as this could possibly prevent a duplicate
> ACK from being generated -> relay indication of congestion.
> It sounds better to me to check for expired packets during enqueue
> (timers would probably be too expensive I guess) and drop them before
> enqueueing the new packet.

I'm not sure what you have in mind for the check here. I was expecting that no packets would actually take a long time between being received/generated and being enqueued. Rather, when you enqueue one, you can look in the queue to find any others that have been in the queue too long. But this does not catch as many cases as checking at dequeue. I was not proposing to set any timers. Just record the time the packet arrives and when you dequeue, see how long ago that was.

> With SFQ, what about not keeping a per-packet timeout but a per-flow
> congestion-indication timeout which would drop the first packet of a
> flow after max(age of any packet from this flow) >= timeout? This would
> assure congestion indication to be delivered to the sender as soon as
> the flow is chosen to send again after the timeout occurred (as long as
> qlen(flow) > 1 which is probably the case).

I don't think I understand what you're proposing. If we enqueue 50 packets from one flow all at once, and after sending 10 they all expire (since they're all about the same age), then you want to drop one? Why is it worth sending the rest?
> > > This seems like the right solution, except that you shouldn't even
> > > need to store a counter. Just use the one in the child qdisc.
> > > BTW, I think you are justified in assuming that this counter can only
> > > change in two ways: decrease when dequeue is called (but possibly by
> > > more than one packet) and increase by 1 when enqueue is called.
> > > And I guess also requeue. Which raises another problem, since that's
> > > not called by the parent, right?
> >
> > No - you can't use the q.qlen of the inner qdisc. It is under the
> > inner qdisc's control and you can't change it. And the assumption
> > about how it can change should not be used unless you have some rules
> > for all qdiscs about how to change it.
>
> The only way I can make sense of this is that we must be
> miscommunicating. I do not propose that HTB alter an SFQ q.qlen.
> I propose that HTB read it. I think you were suggesting that HTB
> make its own copy of it, and I was just suggesting that there's no
> need for that copy.

I know. And yes, I propose a secondary copy for classful ones, and yes, there is a problem with the inner's (SFQ's, for example) q.qlen. What I do just now - I read SFQ's qlen. It is OK for detecting backlog because SFQ MUST ensure that it is correct. So, to be clear - I have NO problem reading backlog info. However, I (and CBQ too) have a problem maintaining our own qlen!

> > Somewhere in the sources you can read that one of a qdisc's duties is
> > to ensure that its q.qlen value is correct. It doesn't restrict how it
> > can change and when.
>
> I think it's pretty much necessary from the implementation viewpoint
> to make some assumptions. In particular, you don't want to look at
> all inactive leaves every time you do a dequeue, just to make sure
> that they haven't somehow manufactured their own packets. Once the
> queue is empty you expect it to stay empty until you call enqueue,
> right?

right. What you know:
- each qdisc's (own) qlen is valid at all (for us important) times
- a qdisc's qlen can increase only after a successful (!) enqueue

These are the only guaranteed rules which I've seen described in the sources. But a decrease can come from dequeue, drop (that is reality now), and nothing prevents a qdisc from dropping some other packets (if it updates qlen accordingly) - that's what your "old dropper" wants to do. Consequences below ..

> > I'd suggest maintaining for each leaf a last_qlen variable which would
> > start at 0 (q.qlen will also be 0). Then from time to time (in enqueue,
> > dequeue, drop ....) you can "repair" your own q.qlen by doing:
> >   q.qlen += cl->q.qlen - cl->last_qlen; cl->last_qlen = cl->q.qlen;
> > for each leaf cl. Hence cl->last_qlen tells you what part of the total
> > q.qlen is "owned" by cl. Simple and robust, isn't it?
>
> I don't get it. A qdisc is already supposed to maintain its qlen.
> You don't think that's being done correctly now? Why do you want this
> other data? You're proposing something that will require changing
> every existing qdisc? Or is this the class (in htb code) trying to
> keep a copy of the data in its qdisc? If the latter, then I guess I
> don't object (since it's only in your code and transparent to me), but
> I still don't see why you need it.

A qdisc Q's qlen is:

  Q.qlen = Q.extra_qlen + sum over all children ch (ch.qlen)

Q.extra_qlen are the skb's in Q's own queues (HTB's direct queue, SFQ's hash, PFIFO's main queue, TBF's queue; CBQ has no such one). Because Q.enqueue enqueues the packet either into its own queue or into a child qdisc (and the two rules above hold), you can simply increase Q.qlen in case of Q.enqueue. But take this scenario:

  HTB.enqueue, skb goes to SFQ's queue, HTB.qlen = SFQ.qlen = 1   [1]
  HTB.enqueue, skb goes to SFQ's queue, HTB.qlen = SFQ.qlen = 2   [2]
  time passes
  HTB.enqueue, skb goes to SFQ's queue, SFQ decides to drop two
  too-old skb's, HTB.qlen = 3, SFQ.qlen = 1                       [3]

Now HTB.qlen is bad! Here, if I maintained a copy of SFQ's qlen [in brackets], I'd catch that SFQ.qlen decreased by two under my hands, because 1 - [3] = -2, and I can correct HTB.qlen accordingly to 1. It needs to be done only in enqueue, dequeue and drop. If some inner qdisc changes in the time between, the parent qdisc will ignore this change in backlog size until the next dequeue/enqueue of that class.

> > > I like those, but also suggested that HTB would be a good place to add
> > > code for dropping stale packets. If you're interested in doing that I
> > > should mention:
> > > Not all packets come with timestamps (e.g., locally generated ones).
> > > Solution: at enqueue, if the timestamp is 0, set it to the current
> > > time.
> >
> > It would be better done in fifo/sfq. It is because HTB can do it only
> > when dequeuing, and at that time some packets were already dropped by
> > SFQ because it has had queues full of "old" packets.
>
> I see, you suggest that I could do better if enqueue started with
> something like
>   while (oldest packet expired) drop oldest packet
> Of course this is still only an approximation. There will still be
> times when you have to drop a packet but it turns out that, before the
> next one is dequeued, another packet will have expired (so if you only
> knew that, you could have saved the packet you dropped before).

yes, something like that :) If we maintain these copies above then you could drop any number of packets. It would be helpful to drop all of these in both dequeue & enqueue, to assure that you will not dequeue an old packet if too much time passes between the last enqueue and dequeue.

> > SFQ should maintain a FIFO queue of all packets and look at the head of
> > the list sometimes to keep its queues free of old packets and thus
> > to have room for new ones.
>
> I suppose SFQ could still do something like the above even if HTB were
> dropping expired packets at dequeue. And this would save all qdiscs
> the work of dropping expired packets at dequeue, which is still needed
> even if you also check at enqueue. Also, only dropping at dequeue is

as I write above, SFQ should do it at both sides ..

> a good approximation to what you want in the common case where the
> queue size is too large for the low rate assigned to the class. I
> argue it's much better to do this in a few classful qdiscs than to ask
> for it to be done by all the others.

I agree. But as I said before - it will lead to dropping new packets due to a queue full of old packets. I'm not sure whether it has any benefit then. Simply hack this into HTB, measure, and if it makes something better then it is worth implementing. I'm not convinced about it. With a fixed queue size and a minimal assured rate you always know how old a packet in the queue can be ...

devik
> > I suppose SFQ could still do something like the above even if HTB were
> > dropping expired packets at dequeue. And this would save all qdiscs
> > the work of dropping expired packets at dequeue, which is still needed
> > even if you also check at enqueue. Also, only dropping at dequeue is
> > a good approximation to what you want in the common case where the
> > queue size is too large for the low rate assigned to the class. I
> > argue it's much better to do this in a few classful qdiscs than to ask
> > for it to be done by all the others.
>
> I hope I got you right, here are my thoughts on this:
>
> I don't think dropping at dequeue is necessary. The goal is not to provide
> a maximum in-queue time, if the qdisc is able to send there is no reason
> to drop. The problem is if the qdisc is not able to send many packets
> can get queued until drops occur. This means it takes a long time until
> the sender receives indication of congestion. For TCP, congestion is
> indicated by either consecutive ACKs with the same ACK number or SACKs.
> ACKs are only generated by the receiver if something was actually
> received, so by dropping packets after some timeout the time until a
> duplicate ACK is generated becomes smaller.

IIRC congestion is also indicated by not receiving an ACK for some period (2*RTT is the initial value, IIRC). So when a packet is delayed too much, TCP treats it as a drop and sends a duplicate ACK requesting the whole previous segment ...

> If you assume the expiration time to be smaller than the time required
> to fill the sender's congestion window, it doesn't make sense anymore to
> drop packets during dequeue as this could possibly prevent a duplicate
> ACK from being generated -> relay indication of congestion.

right.

> It sounds better to me to check for expired packets during enqueue
> (timers would probably be too expensive I guess) and drop them before
> enqueueing the new packet.

exactly :)

> With SFQ, what about not keeping a per-packet timeout but a per-flow
> congestion-indication timeout which would drop the first packet of a
> flow after max(age of any packet from this flow) >= timeout? This would
> assure congestion indication to be delivered to the sender as soon as
> the flow is chosen to send again after the timeout occurred (as long as
> qlen(flow) > 1 which is probably the case).

Maybe I don't remember some RFC correctly, but ACKs with info about congestion are sent in the reverse direction .. (?) But I still think that if we ever do this machinery, it should be done at both enqueue and dequeue time. Don Cohen is right in that in dequeue it'd be simpler to implement (at least for testing).

devik
> I have an SFQ with max queue size 128 and a very low rate, say about
> 1 packet/sec, that I use to limit the rate of SYNs.
> Now as part of a test I send a SYN flood, say 200 packets in one
> second. In that first second SFQ drops 200-128, but the time limit
> won't drop any more on enqueue. Now we're sending one packet/sec
> and I try to open a tcp connection.
> Suppose the age limit is 5 sec and we drop on enqueue only.
> If I try to open the tcp connection 3 sec after the flood, none of the
> 124 or so packets in the queue has expired and it's going to take 2
> min for my SYN to get through. Whereas, if dequeue drops expired
> packets, then I can get through in 2 sec.

Nice example :) And it also shows that you want to drop at both enqueue & dequeue. At enqueue because there is a much higher rate of flood enqueues, which will keep your SFQ full until the next dequeue (which drains all the oldies). OTOH, only after some time will a packet be old, so checking too often will not allow you to drop anything ... Probably testing is the only way to go ..

devik
Martin Devera writes:

> So, to be clear - I have NO problem reading backlog info.
> However, I (and CBQ too) have a problem maintaining our own qlen!

Oh, of course. Now everything makes sense. When I asked whether all you care about in the inner qlen is whether it's 0, you misled me by confirming my incorrect guess. Or perhaps it used to be right but now is wrong. You now need it (not just whether it's 0) in order to figure out and maintain your own qlen.

> right. What you know:
> - each qdisc's (own) qlen is valid at all (for us important) times
> - a qdisc's qlen can increase only after a successful (!) enqueue

But I think there's another case - requeue. Maybe you consider that the same as enqueue. (I've never seen evidence that it was executed, but it's called in sch_generic.c.)

> > I see, you suggest that I could do better if enqueue started with
> > something like
> >   while (oldest packet expired) drop oldest packet
> > Of course this is still only an approximation. There will still be
> > times when you have to drop a packet but it turns out that, before the
> > next one is dequeued, another packet will have expired (so if you only
> > knew that, you could have saved the packet you dropped before).
>
> yes, something like that :) If we maintain these copies above then
> you could drop any number of packets.
> It would be helpful to drop all of these in both dequeue & enqueue,
> to assure that you will not dequeue an old packet if too much time
> passes between the last enqueue and dequeue.

Unfortunately, for SFQ it's not so easy to find all the expired packets in the queue. We'd have to add a fifo data structure to do that in the desired time (proportional to the number of expired packets). This is only important when the queue is full, of course.

I still think it's reasonable to check only at dequeue. Basically, if you know the max time between dequeues is t, then a queue filled by a flood can only remain full for time t + expire_time after the flood ends. After that there will be a dequeue that throws out all the flood packets. Then your next packet will be accepted and delayed at most expire_time. This is only slightly worse than the guarantee you get by expiring at both enqueue and dequeue. I see that the same could be said for only expiring at enqueue, but that's much harder. In particular, the dequeue-only solution can be done by the classful qdiscs on their own, while the enqueue-only (or both) cannot.
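[A minimal sketch of the dequeue-only check argued for above, on a plain FIFO list: skip over and free anything older than the age limit, then hand out the first fresh packet. The expiry constant and function name are hypothetical, it assumes the 2.4-era skb->stamp field, and a real qdisc would still have to reconcile these drops with the parent's qlen as discussed earlier in the thread.]

#include <linux/skbuff.h>
#include <linux/time.h>

#define EXPIRE_USECS (5 * 1000000)  /* hypothetical 5 second age limit */

/* Dequeue for a simple FIFO-style queue: after a flood, the very next
 * dequeue flushes the stale backlog and returns a fresh packet. */
static struct sk_buff *dequeue_fresh(struct sk_buff_head *list)
{
    struct sk_buff *skb;
    struct timeval now;

    do_gettimeofday(&now);
    while ((skb = __skb_dequeue(list)) != NULL) {
        long age = (now.tv_sec - skb->stamp.tv_sec) * 1000000
                 + (now.tv_usec - skb->stamp.tv_usec);
        if (age <= EXPIRE_USECS)
            return skb;     /* fresh enough: send it */
        kfree_skb(skb);     /* expired: drop and keep looking */
    }
    return NULL;
}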
Hi!

Don Cohen wrote:

> Patrick McHardy writes:
>
> > I don't think dropping at dequeue is necessary.
>
> Here's an example showing it is.
> I have an SFQ with max queue size 128 and a very low rate, say about
> 1 packet/sec, that I use to limit the rate of SYNs.
> Now as part of a test I send a SYN flood, say 200 packets in one
> second. In that first second SFQ drops 200-128, but the time limit
> won't drop any more on enqueue. Now we're sending one packet/sec
> and I try to open a tcp connection.
> Suppose the age limit is 5 sec and we drop on enqueue only.
> If I try to open the tcp connection 3 sec after the flood, none of the
> 124 or so packets in the queue has expired and it's going to take 2
> min for my SYN to get through. Whereas, if dequeue drops expired
> packets, then I can get through in 2 sec.

I don't think this is a good example. Why use SFQ for only SYNs? For each connection (flow) only one SYN is going to get enqueued. SFQ distributes available bandwidth equally between flows if the queue becomes full. Now, if each flow just consists of one packet, it's basically the same as fifo with higher overhead. What if the SYN flood lasts longer / your connection request happens during the SYN flood? It will expire like the others. Syncookies provide a much better protection against SYN floods; you won't have to wait at all.

> > The goal is not to provide a maximum in-queue time, if the qdisc is
> > able to send there is no reason to drop.
>
> The reason to drop is that it's a waste of time/bandwidth to send
> expired packets, and even worse to make something unexpired wait for
> them (and even worse yet if it has to wait long enough to expire
> itself).
>
> > The problem is if the qdisc is not able to send many packets
> > can get queued until drops occur. This means it takes a long time until
> > the sender receives indication of congestion. For TCP, congestion is
> > indicated by either consecutive ACKs with the same ACK number or SACKs.
>
> In this case tcp should stop because it doesn't get any acks, because
> its packets are not getting forwarded. Or if the problem is the other
> direction, it should stop because it's not getting the acks.
>
> > ACKs are only generated by the receiver if something was actually
> > received, so by dropping packets after some timeout the time until a
> > duplicate ACK is generated becomes smaller.
>
> I think now we're getting into the subject of why it's good to
> drop packets that have been waiting for a long time.
> If your objective is to generate a duplicate ack then I'm not sure
> dropping packets is the right way. After all, you might drop a packet
> that would have generated one sooner. For that matter, you might have
> dropped one that would have generated a non-duplicate ack, which would
> be even better for tcp. Nevertheless, I do think it's good to drop
> packets that are not forwarded in a timely fashion. It's just not a
> simple argument.

I think maybe we misunderstood. What I want to achieve is that flows respond more quickly to changes in available bandwidth. I assume that I am the bottleneck, otherwise it wouldn't make much sense at all. You write "After all, you might drop a packet that would have generated one sooner. For that matter, you might have dropped one that would have generated a non-duplicate ack, which would be even better for tcp." If we leave packets dropped on the path to us for reasons besides congestion out of sight, it's unlikely that we drop a packet that would have generated a duplicate ack earlier. After all, we are the bottleneck, so we control packet drops. Also, I don't want to generate a non-duplicate ACK, I want to tell the remote that LESS bandwidth is available.

> > If you assume the expiration time to be smaller than the time required
> > to fill the sender's congestion window, it doesn't make sense anymore to
> > drop packets during dequeue as this could possibly prevent a duplicate
> > ACK from being generated -> relay indication of congestion.
> > It sounds better to me to check for expired packets during enqueue
> > (timers would probably be too expensive I guess) and drop them before
> > enqueueing the new packet.
>
> I'm not sure what you have in mind for the check here.
> I was expecting that no packets would actually take a long time
> between being received/generated and being enqueued. Rather, when you
> enqueue one, you can look in the queue to find any others that have
> been in the queue too long. But this does not catch as many cases as
> checking at dequeue.
> I was not proposing to set any timers. Just record the time the packet
> arrives and when you dequeue, see how long ago that was.
>
> > With SFQ, what about not keeping a per-packet timeout but a per-flow
> > congestion-indication timeout which would drop the first packet of a
> > flow after max(age of any packet from this flow) >= timeout? This would
> > assure congestion indication to be delivered to the sender as soon as
> > the flow is chosen to send again after the timeout occurred (as long as
> > qlen(flow) > 1 which is probably the case).
>
> I don't think I understand what you're proposing. If we enqueue 50
> packets from one flow all at once, and after sending 10 they all
> expire (since they're all about the same age), then you want to drop
> one? Why is it worth sending the rest?

I don't want to drop unnecessarily much. If I drop the first one, the sender will receive notification of congestion as soon as possible and slow down. But I haven't put very much thought into this, so it will probably not work like this.

Bye, Patrick
Patrick McHardy writes:

> I don't think this is a good example.
> Why use SFQ for only SYNs?

Ok, use pfifo instead.

> What if the SYN flood lasts longer / your connection request happens
> during the SYN flood? It will expire like the others.

I wasn't trying to solve the flooding problem in this example. (As it turns out, I am trying to solve that problem, and this is related to that subject, but I won't go into details here. Anyone who wants to know more should see http://www.cs3-inc.com/ddos.html)

> Syncookies provide a much better protection against SYN floods; you
> won't have to wait at all.

Ok, change SYN to ping.

> I think maybe we misunderstood.
> What I want to achieve is that flows respond more quickly to changes in
> available bandwidth.

In the case of tcp this adjustment happens automatically because the acks slow down (or speed up). It's much better for tcp to adjust the rate without having to drop.

> > I don't think I understand what you're proposing. If we enqueue 50
> > packets from one flow all at once, and after sending 10 they all
> > expire (since they're all about the same age), then you want to drop
> > one? Why is it worth sending the rest?
>
> I don't want to drop unnecessarily much. If I drop the first one, the
> sender will receive notification of congestion as soon as possible and
> slow down. But I haven't put very much thought into this, so it will
> probably not work like this.

Dropping has a pretty drastic effect on tcp. If the communicating tcp's support various features, dropping one packet might not cause any significant slowdown, whereas in other implementations dropping one could cause a major slowdown with lots of stuff having to be retransmitted. It's not the case that you can expect a small number of drops to cause a small amount of slowdown.
> > right. What you know:
> > - each qdisc's (own) qlen is valid at all (for us important) times
> > - a qdisc's qlen can increase only after a successful (!) enqueue
>
> But I think there's another case - requeue. Maybe you consider that
> the same as enqueue.
> (I've never seen evidence that it was executed, but it's called in
> sch_generic.c.)

It should be handled in the same way as requeue, only should not increase stats.

> I still think it's reasonable to check only at dequeue. Basically, if
> you know the max time between dequeues is t, then a queue filled by a
> flood can only remain full for time t + expire_time after the flood
> ends. After that there will be a dequeue that throws out all the
> flood packets. Then your next packet will be accepted and delayed at
> most expire_time.

It might be. Why don't you try it? I still feel the same about it: I think that experiment is the only way to test the idea.

devik
> It should be handled in the same way as requeue, only should not increase
> stats.

I guess you mean requeue should be the same as enqueue.

> > I still think it's reasonable to check only at dequeue. Basically, if
> > you know the max time between dequeues is t, then a queue filled by a
> > flood can only remain full for time t + expire_time after the flood
> > ends. After that there will be a dequeue that throws out all the
> > flood packets. Then your next packet will be accepted and delayed at
> > most expire_time.
>
> It might be. Why don't you try it? I still feel the same about it: I
> think that experiment is the only way to test the idea.

What experiments would you like to see?
On Thu, 9 May 2002, Don Cohen wrote:

> > It should be handled in the same way as requeue, only should not increase
> > stats.
>
> I guess you mean requeue should be the same as enqueue.

err, of course, yes

> > > I still think it's reasonable to check only at dequeue. Basically, if
> > > you know the max time between dequeues is t, then a queue filled by a
> > > flood can only remain full for time t + expire_time after the flood
> > > ends. After that there will be a dequeue that throws out all the
> > > flood packets. Then your next packet will be accepted and delayed at
> > > most expire_time.
> >
> > It might be. Why don't you try it? I still feel the same about it: I
> > think that experiment is the only way to test the idea.
>
> What experiments would you like to see?

Implement your idea and measure the difference in delay, jitter, throughput of TCP, and reaction to flooding. In other words, show that your idea can do better than the current approach. When I was speaking about an idea like HTB nobody listened; I had to implement it ..

devik