Does anyone have any pointers on how other people have implemented TCP
window adjustment to do bandwidth shaping?

Granted the basic idea is to set the window size to be RTT * bandwidth,
but a quick squiz at Google turns up mostly papers on how to implement
this at the sender end with a view to some new magic TCP implementation.
I'm really interested in notes on how to implement it at the router side,
perhaps with a view to writing a new QOS module.

The biggest issue I can see right now is an architectural one, i.e.
monitoring the incoming packet rate and then applying that to the
outgoing ACK packets. Linux QOS separates the in and out traffic modules.
Wondering how one best communicates this info...

Thanks for any thoughts

Ed W
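To put rough numbers on the RTT * bandwidth idea, here is a minimal Python
sketch of the bandwidth-delay product calculation (the link speed and RTT
below are made-up examples, not figures from this thread):

    # Bandwidth-delay product: the receive window needed to keep a link full.
    # The numbers used here are purely illustrative.

    def target_window(bandwidth_bps, rtt_ms):
        """Return the window (in bytes) that fills bandwidth_bps at rtt_ms."""
        return int(bandwidth_bps / 8 * rtt_ms / 1000.0)

    # e.g. a 512 kbit/s downlink with 100 ms RTT only needs ~6.4 KB of window,
    # so a 64 KB default window leaves a lot of room for ISP-side queueing.
    print(target_window(512000, 100))   # -> 6400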
> Does anyone have any pointers on how other people have implemented tcp
> window adjustment to do bandwidth shaping?

Hmm.... I _heard_ that Packeteer had patents on this and so nobody else
was attempting to do it.

Possibly an incorrect rumor, but it made sense to me.
David Boreham wrote:

>> Does anyone have any pointers on how other people have implemented tcp
>> window adjustment to do bandwidth shaping?
>
> Hmm.... I _heard_ that Packeteer had patents on this and
> so nobody else was attempting to do it.
>
> Possibly an incorrect rumor, but it made sense to me.

They say on their site that their algorithm is patented, which could be a
pain if it's the obvious solution.

The more I think about it (probably not enough yet), the more I think that
just keeping state and de-piggybacking ACKs could achieve much the same
thing if your shaper is clever enough. The worst scenario for me is
BitTorrent, and if I could de-piggyback the ACKs I don't see that playing
with window size on top of that would be any better than keeping state so
that I had an idea of how much was unstoppably on the way.

Closing the window down isn't going to stop what's already left the sender
and is sitting in a big modem buffer any quicker than me stopping sending
ACKs. Just knowing how lagged out each connection is would be enough to
allow me to change bandwidth more elegantly without too much buffer
filling. Not with anything that exists in Linux now - but even just
hacking HTB/HFSC so that a class could behave as full as soon as it sees
traffic would be a start.

I can already sort of break slow start by treating new connections harshly
(short queue), though it would be nice in the case of BitTorrent to be
able to detect connections that go back into slow start as well - SFQ sort
of singles them out, but it's a bit late by the time it gets them. I guess
there are other things you could do as well, like trying to account for
different RTTs with the intention of avoiding bursts.

Andy.
On Tue, Feb 08, 2005 at 11:49:39PM +0000, Ed Wildgoose wrote:

> Does anyone have any pointers on how other people have implemented tcp
> window adjustment to do bandwidth shaping?
>
> Granted the basic idea is to set the window size to be RTT * bandwidth,
> but a quick squiz at google turns up mostly papers on how to implement
> this at the sender end with a view to some new magic TCP
> implementation. I'm really interested in notes on how to implement at
> the router side, perhaps with a view to writing a new QOS module.
>
> Biggest issue I can see right now is an architecture one, ie monitoring
> the incoming packet rate and then applying that to the outgoing ACK
> packets.

Instead of shaping the incoming traffic and estimating the rate from the
outgoing traffic, you can 'delay' the outgoing ACKs and estimate the rate
from the rise of the sequence number.

That way you just shape on the outgoing queue, without taking care of the
incoming traffic.

Note that everything here is patented, so you must have lawyers if you
want to develop.. :-/

http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=%2Fnetahtml%2Fsearch-bool.html&r=0&f=S&l=50&TERM1=packeteer&FIELD1=&co1=AND&TERM2=&FIELD2=&d=ptxt

--
BOFH excuse #33: pizeo-electric interference
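As an illustration of "estimate the rate from the rise of the sequence
number", the bookkeeping amounts to something like the following sketch
(this is not marco's code, just a picture of the idea):

    import time

    class AckRateEstimator:
        """Track how fast the remote sender's data is being acknowledged.

        Each outgoing ACK carries an acknowledgement number; the amount it
        advances since the previous ACK is the number of downlink bytes
        the local host has received for that flow.
        """

        def __init__(self):
            self.last_ack = None
            self.last_time = None

        def update(self, ack_number):
            now = time.time()
            rate = None
            if self.last_ack is not None:
                delta_bytes = (ack_number - self.last_ack) & 0xFFFFFFFF  # seq wrap
                delta_t = now - self.last_time
                if delta_t > 0:
                    rate = delta_bytes / delta_t   # bytes/sec of downlink data
            self.last_ack = ack_number
            self.last_time = now
            return rate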
marco ghidinelli wrote:

> On Tue, Feb 08, 2005 at 11:49:39PM +0000, Ed Wildgoose wrote:
>
>> Does anyone have any pointers on how other people have implemented tcp
>> window adjustment to do bandwidth shaping?
>>
>> Granted the basic idea is to set the window size to be RTT * bandwidth,
>> but a quick squiz at google turns up mostly papers on how to implement
>> this at the sender end with a view to some new magic TCP
>> implementation. I'm really interested in notes on how to implement at
>> the router side, perhaps with a view to writing a new QOS module.
>>
>> Biggest issue I can see right now is an architecture one, ie monitoring
>> the incoming packet rate and then applying that to the outgoing ACK
>> packets.
>
> instead of shaping the incoming traffic and estimate rate from the
> outgoing traffic, you can 'delay' the outgoing ACK, and estimate the rate
> from the raise of the sequence number.
>
> so you just shape on the outgoing queue, without take care of the
> incoming traffic.
>
> note that everything here is patented, so you must have lawyers if
> you want to develop..
>
> :-/
>
> http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=%2Fnetahtml%2Fsearch-bool.html&r=0&f=S&l=50&TERM1=packeteer&FIELD1=&co1=AND&TERM2=&FIELD2=&d=ptxt

Ewwww, that's not nice - not that I know what those would stop you being
able to do. I presume they just apply in the USA and wouldn't be
enforceable in Europe, as software patents are not allowed (yet)?

Andy.
Sorry for the belated reply:

> instead of shaping the incoming traffic and estimate rate from the
> outgoing traffic, you can 'delay' the outgoing ACK, and estimate the rate
> from the raise of the sequence number.

I like the sound of this idea, but I don't follow the details?

Certainly it seems to me that you can do most of the work by only looking
at outgoing ACK packets. For example, with certain assumptions we can
simply measure the outgoing ACK rate, assume this is dependent on the
amount of data being controlled by our bandwidth throttling, and therefore
we get a really good estimate of the effect of our current incoming rate.
However, this breaks down if, for example, the sender was not sending data
as fast as possible.

Also, simply delaying ACKs doesn't seem to be the whole answer, because
the sender should simply see this as a longer RTT and increase the window
size to keep more data in transit. Seems that we need to do a little of
both, e.g. examine outgoing ACK speed, reduce the window to the approx
correct size, and then our RED/tail drop takes care of the fine tuning.

As others have said, it's stuff like BitTorrent which really shows the
weaknesses in the current system. I find that even throttling BitTorrent
to, say, half the incoming bandwidth still shows regular increases in
latency, no doubt due to the effects of the sudden rush of incoming
connections (or slow start effects, basically).

In the BWMGR product (or whatever it is called), I get the impression they
do more work on controlling initial windows to try and throttle slow start
back some? Seems to me that one could do more work around the time of the
initial ACK to get a window size more in keeping with the flow for that
TCP class? If we only allocate 10Kb of our connection to that class, and
we are connected via some broadband device, then nowhere in the world is
more than 350ms away, and hence a window size of 65535 is clearly way too
large - let's fix this early?

How do we fit this thing into the Linux QOS architecture anyway?

Ed W
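For a rough feel of the numbers in that last paragraph (reading the 10Kb
class as roughly 10 kbyte/s and taking 350ms as the worst-case RTT - both
just assumptions for the example):

    # Window needed so a single flow cannot exceed its class rate,
    # using the same RTT * bandwidth idea as above.
    class_rate = 10 * 1024      # bytes/sec allocated to the class (assumed)
    worst_rtt  = 0.350          # seconds - "nowhere is more than 350ms away"

    clamp = int(class_rate * worst_rtt)
    print(clamp)                # ~3584 bytes, versus the 65535 default,
                                # i.e. the default window is ~18x too big here.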
On Fri, Feb 11, 2005 at 02:57:59AM +0000, Ed Wildgoose wrote:

> I like the sound of this idea, but I don't follow the details?
>
> Certainly it seems to me that you can do most of the work by only
> looking at outgoing ACK packets. For example with certain assumptions
> we can simply measure the outgoing ACK rate, assume this is dependent on
> the amount of data being controlled by our bandwidth throttling, and
> therefore we get a really good estimate of the effect of our current
> incoming rate. However, this breaks down if for example the sender was
> not sending data as fast as possible.

We have to estimate the acknowledged data rate, and if the sender is over
limit we can begin to slow down his traffic. Or something similar...

> Also simply delaying ACKs doesn't seem to be the whole answer because
> the sender should simply see this as a longer RTT and increase the
> window size to keep more data in transit. Seems that we need to do a
> little of both, eg examine outgoing ACK speed, reduce the window to the
> approx correct size and then our RED/tail drop takes care of the fine
> tuning

Hehe, right. And if we are lucky, there will be a router in the internet
that will 'taildrop' that traffic for us, so we are not wasting our
bandwidth. (I haven't tested this algorithm, but in a network simulator
(ns) it works.)

> As others have said, it's stuff like Bittorrent which really shows the
> weaknesses in the current system. I find that even throttling
> bittorrent to say, half the incoming bandwidth still shows regular
> increases in latency, no doubt to the effects of the sudden rush of
> incoming connections. (or slow start effects basically).

I'll do a lot of testing on this.. BitTorrent will be a pain, I guess.

> In the BWMGR product (or whatever it is called), I get the impression
> they do more work on controlling initial windows to try and throttle
> slow start back some? Seems to me that one could do more work around
> the time of the initial ACK to get a window size more in keeping with
> the flow for that tcp class? If we only allocate 10Kb of our connection
> to that class, and we are connected via some broadband device, then
> nowhere in the world is more than 350ms away, and hence a window size of
> 65535 is clearly way too large - let's fix this early?

I'm just trying to slow the traffic (it's for my master|degree|pedigree
thesis, so I don't want to waste all my life on this) without changing the
window size.

> How do we fit this thing into the linux QOS architecture anyway?

I'm writing a scheduler that just delays the ACK rate (it's in a very
preliminary state, so nearly nothing has been done yet).

Now I'm looking for a place to put the flow information (in a conntrack
module, maybe?)

Bye!

P.S. sorry for my bad english...

--
BOFH excuse #266: All of the packets are empty.
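A bare-bones sketch of that kind of ACK pacing, separate from marco's
actual scheduler and with an invented target rate, might look like this:

    import time

    class AckPacer:
        """Release ACKs no faster than the downlink rate we want to allow.

        Each ACK that acknowledges new_bytes of data 'spends' that much
        budget; the next ACK may not leave before the budget has drained
        at target_rate bytes/sec.
        """

        def __init__(self, target_rate):
            self.target_rate = target_rate   # bytes/sec the sender should see
            self.next_release = 0.0

        def release_time(self, new_bytes):
            now = time.time()
            earliest = max(now, self.next_release)
            self.next_release = earliest + new_bytes / self.target_rate
            return earliest                  # when to dequeue this ACK

    # e.g. cap a flow to ~32 KB/s of downstream data (illustrative number)
    pacer = AckPacer(32 * 1024)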
> I'm just trying to slow the traffic (it's for my master|degree|pedigree
> thesis, so I don't want to waste all my life on this) without changing
> the window size.
>
>> How do we fit this thing into the linux QOS architecture anyway?
>
> I'm writing a scheduler that just delays the ACK rate (it's in a very
> preliminary state, so nearly nothing has been done yet).
>
> Now I'm looking for a place to put the flow information (in a conntrack
> module, maybe?)

Have a look at the BWMGR QOS product. They have some interesting thoughts.

Basically their idea seems to be that you only need to get the window
shaping (or ACK shaping) roughly right. The fine tuning happens just as
now, with the queue simply filling up a little. Seems to me that this is
right: if you just get the window even +/- 50% of the target bandwidth,
then you can do fine tuning by delaying ACKs and buffering data. The trick
is basically to avoid the huge splurge of data during slow start which can
cause queuing on the ISP end.

Otherwise I am broadly speaking very happy with the default QOS. It's just
this queueing which occurs when a bunch of connections all start together
which is the problem. This isn't really just a BitTorrent issue though,
because a busy webserver would likely see the same conditions?

Are we all on the same page as to what the problem is? Any more thoughts
on how to tackle it? I'm still not convinced that delaying ACKs is really
any better than the current option to buffer incoming data. I guess the
receiving machine's TCP stack gets it earlier so the app looks more
responsive, but other than the lower lag I don't see much difference
really?

Curious to hear how your project gets on though! Please keep us informed!

Ed W
I did notice that there was a netfilter module which did window tracking.
It's in POM under the fairly experimental section, I think. Haven't
checked it out, but there could be some interesting code in there.

Ed W
Ed Wildgoose wrote:

>> i'm just trying to slow the traffic (it's for my master|degree|pedigree
>> thesis, so i don't want to waste all my life on this) without changing
>> the window size.
>>
>>> How do we fit this thing into the linux QOS architecture anyway?
>>
>> i'm writing a scheduler that just delay the ack rate (it's in a very
>> preliminar state, so nearly nothing was done).
>>
>> now i'm looking for a place where to put the flow information (in a
>> conntrack module, maybe?)
>
> Have a look at the BWMGR qos product. They have some interesting thoughts.

They seem to have some strange ones too - from
http://www.etinc.com/bwcompare.htm

"Suppose you have a T1 line and a single server. Suppose that 2 remote
clients request the same page simultaneously, and that page has 15,000
bytes of information. With a typical TCP window of 16K, the entire page
will be sent without requiring an ACK from the client."

Conveniently forgets that slow start exists - stopping the above is what
it's for, isn't it?

Then later -

"What is important to understand is that bandwidth management devices
which do not utilize window manipulation at all cannot reduce your network
latency on an overall basis. They will simply shift the latency from one
type of traffic to another."

They are still talking about egress shaping here - I say, so what if
latency gets shifted, as long as you arrange for it to be "shifted" to
bulk then it doesn't matter. Egress shaping is sorted without window
manipulation - as long as whatever you define as interactive traffic is <
your bandwidth, you can arrange for it never to be delayed by bulk traffic
any more than the bitrate x the length of one bulk packet.

> Basically their idea seems to be that you only need to get the window
> shaping (or ACK shaping) roughly right. The fine tuning happens just as
> now with the queue simply filling up a little. Seems to me that this is
> right, if you just get the window even +/- 50% of the target bandwidth
> then you can do fine tuning by delaying ACK and buffering data. The
> trick is basically to avoid the huge splurge of data during slow start
> which can cause queuing on the ISP end.

It would be nice to slow slow start like this; to some extent Linux TCP
already does this (well, it does at LAN speed: it advertises a smaller
window then grows it). Not much help for WAN rates or forwarded Windows
traffic though.

> Otherwise I am broadly speaking very happy with the default QOS. It's
> just this queueing which occurs when a bunch of connections all start
> together which is the problem. This isn't really just a bittorrent
> issue though because a busy webserver would likely see the same
> conditions?

I don't get the webserver bit - "heavy" browsing on a 512kbit link hurts -
typically 4 simultaneous connections over and over. Sortable by sending
new connections to a short, lowish rate SFQ (tweaked for ingress). But
only if there is no other bulk as well. If there is, then I think you need
some sort of prediction/intelligence, or a dumb queue will react too late
and over-aggressively - resync bursts as well then.

> Are we all on the same page as to what the problem is? Any more
> thoughts on how to tackle it? I'm still not convinced that delaying
> ACKs is really any better than the current option to buffer incoming
> data.

I suppose the gain for marco is that he will not just be delaying ACKs -
OK, he could do the same for data, and I agree that there isn't much
difference, and it could be better to work with data in some ways, e.g.
you can really drop, whereas with ACKs it would be harder to simulate a
drop.

I don't think he will be just buffering though - that's the advantage for
him - to build in intelligence to handle the problem of shaping from the
wrong end of a bottleneck, without the handicap of using dumb queues that
are seeing traffic that's already shaped by a FIFO and whose fill rate is
determined by what percentage of the FIFO rate you set their rate.

I still think that dumb queues can be improved for this situation, though:

- Making an HTB/HFSC class that behaves as full as soon as it sees
  traffic, so other classes that are sharing get throttled before it's
  too late.

- Making an ESFQ - preferably without the S - perturb is horrible really;
  the packet reordering alone causes resync bursts in tests I've done. It
  is also pretty pointless dropping a whole bunch of packets from the same
  window; one per slot with some timed immunity would be nice - they
  already used the link anyway. That is more like SFRED or FRED I suppose.

- Being able to detect existing connections that go back into slow start -
  don't know how to do that as such, but I suspect BitTorrent causes this
  as it cycles through existing connections trying to find better ones
  for you.

- De-piggybacking ACKs from full duplex bulk traffic - again BitTorrent,
  though you can make the effects of BT using full duplex for bulk a bit
  better by using shorter egress queues for it.

> I guess the receiving machine TCP stack gets it earlier so the
> app looks more responsive, but other than the lower lag I don't see much
> difference really?
>
> Curious to hear how your project gets on though! Please keep us informed!

Me too - Thomas Graf is also playing with delaying ACKs - see the "dummy
replacing imq" thread this & last month in a netdev archive.

I also remember seeing another shaper that uses delay pools for ACKs -
bandwidth arbiter or arbitrator, IIRC. If you can't find it, say - I
probably have it squirreled away somewhere on my old PC.

Andy.
Ed Wildgoose wrote:

> Sorry for the belated reply:
>
>> instead of shaping the incoming traffic and estimate rate from the
>> outgoing traffic, you can 'delay' the outgoing ACK, and estimate the
>> rate from the raise of the sequence number.
>
> I like the sound of this idea, but I don't follow the details?
>
> Certainly it seems to me that you can do most of the work by only
> looking at outgoing ACK packets. For example with certain assumptions
> we can simply measure the outgoing ACK rate, assume this is dependent on
> the amount of data being controlled by our bandwidth throttling, and
> therefore we get a really good estimate of the effect of our current
> incoming rate. However, this breaks down if for example the sender was
> not sending data as fast as possible.

Also, delayed ACKs complicate things a bit - if you are not going to
deconstruct and spawn new ACKs then you will always be working with pairs.
If I had a meg I guess I wouldn't care (and for those with highly
asymmetric DSL links they are almost a must), but for slow links it's nice
to be able to get 1 ACK = 1 packet.

> Also simply delaying ACKs doesn't seem to be the whole answer because
> the sender should simply see this as a longer RTT and increase the
> window size to keep more data in transit. Seems that we need to do a
> little of both, eg examine outgoing ACK speed, reduce the window to the
> approx correct size and then our RED/tail drop takes care of the fine
> tuning
>
> As others have said, it's stuff like Bittorrent which really shows the
> weaknesses in the current system. I find that even throttling
> bittorrent to say, half the incoming bandwidth still shows regular
> increases in latency, no doubt to the effects of the sudden rush of
> incoming connections. (or slow start effects basically).

IIRC you have 1 meg - I always assumed that this would be nicer than my
1/2 meg. I'll have to do some tests when I get time and make some graphs
to see how my setup behaves; it's a bit tricky with BT though, as things
can change a lot over time on the same torrent. If you are throttling to
50% and not using RED, then it's going to be more aggressive WRT
over-dropping, and maybe this causes problems by itself.

> In the BWMGR product (or whatever it is called), I get the impression
> they do more work on controlling initial windows to try and throttle
> slow start back some? Seems to me that one could do more work around
> the time of the initial ACK to get a window size more in keeping with
> the flow for that tcp class? If we only allocate 10Kb of our connection
> to that class, and we are connected via some broadband device, then
> nowhere in the world is more than 350ms away, and hence a window size of
> 65535 is clearly way too large - let's fix this early?

I'm not sure about this one, but if you are p2p with someone whose buffer
is flooded and you are competing for their bandwidth with others, you may
actually over-throttle yourself by reducing the window. The way broadband
in the UK is going - higher bandwidth and usage caps - I think we may also
see more contention, and not knowing what your bandwidth is, is going to
be tricky :-) (policers to classify?)

Andy.
> "What is important to understand is that bandwidth management devices > which do not utilize window manipulation at all cannot reduce your > network latency on an overall basis. They will simply shift the > latency from one type of traffic to another." > > They are still talking about egress shaping here - I say - so what if > latency gets shifted, as long as you arrange for it to be "shifted" to > bulk then it doesn''t matter. > > Egress shaping is sorted without window manipulation - As long as > whatever you define as interactive traffic is < your bandwidth then > you can arrange for it never to be delayed by bulk traffic any more > than the bitrate X length of bulk packet.I do agree with you, but I think they may have just badly phrased their point. I *think* what they are trying to say is that with really big devices, ie thousands of connections, then you can''t just wait for the traffic to queue up and let it back up like that. You need to be a bit proactive and kill the traffic right from the start. I *assume* they mean it''s just not feasible to do the equiv of SFQ or some other kind of classifier which can really finely grain the priorities for lots of connections. Hence their point that you are just shifting latency around. Bad point really, but I can kind of see what they might be saying.> It would be nice to slow slowstart like this, to some extent linux TCP > already does this (well it does at LAN speed, it advertises a smaller > window then grows it). Not much help for WAN rates or forwarded > Windows traffic though.Well, perhaps this is really what we want to do? I noticed some interesting looking code on the POM part of netfilter to do some window tracking...> I don''t get the webserver bit - "heavy" browsing on 512kbit link hurts > - typically 4 simultaneous connections over and over. Sortable by > sending new connections to a short, lowish rate sfq (tweaked for > ingress). But only if there is no other bulk aswell. If there is then > I think you need some sort of prediction/intelligence or a dumb > queue will react too late and over aggressivly - resync bursts aswell > then.I do agree. My point was perhaps more that lots of little connections hurt. Long lasting ones seem well controlled by current QOS. Also, I think SFQ is completely buggering up my IP telephone? I currently have some SFQ classes on the relevant queues and I wonder if the rehasing is re-ordering the packets?> I suppose the gain for marco is that he will not just be delaying acks > - OK he could do the same for data and I agree that there isn''t much > difference and it could be better to work with date in some ways eg. > you can really drop whereas with acks it would be harder to simulate a > drop.I''m just not sure that he is actually addressing the "lots of small connections" or "overload during slow start" problems that I think are still the main issues to be sorted?> Me too - Thomas Graf is also playing with delaying acks - see the > dummy replacing imq thread this & last month on a netdev archive. > > I also remember seeing another shaper that uses delay pools for acks > bandwidth arbiter or arbitrator IIRC. If you can''t find it say, I > probably have it squirreled away somewhere on my old PC.Cool, I saw a note about dummy on this list, but didn''t read the thread. Yes please, notes on the other idea would be interesting. I still think that the only sensible answer to this is to literally throttle the initial window advertisement, and perhaps even aggresively manipulate it thereafter. 
I''m thinking that you could make some reasonable assumptions perhaps for users on "normal" wan links where latency ranges from 10-15ms up to 330ms. Satellite links are a whole new ball game of course (and I have some satellite users...) I''m thinking that if you could throttle the window back to even the size of the class that the connection "belongs to", then you could control the rest of the link using the normal linux QOS queues. I currently think the killer is just the spikes in latency which occur as TCP tries to test the window size, or when you have a bunch of connections all doing slow start at the same time (same kind of thing really). Window control really seems to be the tool to get the link roughly under control and then the queuing should be for the fine grained stuff. What do you think? ...Now how do I code it...? Ed W _______________________________________________ LARTC mailing list / LARTC@mailman.ds9a.nl http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://lartc.org/
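As for "how do I code it" - one ingredient would be rewriting the
advertised window in forwarded ACKs and patching the TCP checksum
incrementally (RFC 1624). A user-space sketch of just the header surgery
follows; the clamp value and where to hook it (netfilter target, qdisc,
etc.) are left open, and window scaling is ignored:

    import struct

    def clamp_window(tcp_header: bytearray, max_window: int) -> bytearray:
        """Clamp the advertised receive window in a raw TCP header.

        The window field sits at offset 14 and the checksum at offset 16.
        The checksum is updated incrementally per RFC 1624:
            HC' = ~(~HC + ~m + m')
        where m is the old 16-bit word and m' the new one.
        Note: if window scaling was negotiated, the field is in shifted
        units and the clamp would need to account for that.
        """
        old_win = struct.unpack_from("!H", tcp_header, 14)[0]
        if old_win <= max_window:
            return tcp_header                      # nothing to do

        new_win = max_window
        struct.pack_into("!H", tcp_header, 14, new_win)

        old_sum = struct.unpack_from("!H", tcp_header, 16)[0]
        csum = (~old_sum & 0xFFFF) + (~old_win & 0xFFFF) + new_win
        csum = (csum & 0xFFFF) + (csum >> 16)      # fold the carries
        csum = (csum & 0xFFFF) + (csum >> 16)
        struct.pack_into("!H", tcp_header, 16, ~csum & 0xFFFF)
        return tcp_header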
On Fri, Feb 11, 2005 at 06:18:01PM +0000, Ed Wildgoose wrote:

> Are we all on the same page as to what the problem is? Any more
> thoughts on how to tackle it? I'm still not convinced that delaying
> ACKs is really any better than the current option to buffer incoming
> data.

Basically, we have a smaller queue because we are queueing the small ACKs,
not the big data transfer on the downlink.

And delaying ACKs is the first step towards the delay + advertised window
resize that maybe someone (me?) will do in the future.

> I guess the receiving machine TCP stack gets it earlier so the
> app looks more responsive, but other than the lower lag I don't see much
> difference really?
>
> Curious to hear how your project gets on though! Please keep us informed!

Of course! :)

--
BOFH excuse #208: Your mail is being routed through Germany ... and
they're censoring us.
On Sat, Feb 12, 2005 at 02:16:09AM +0000, Andy Furniss wrote:

>> Are we all on the same page as to what the problem is? Any more
>> thoughts on how to tackle it? I'm still not convinced that delaying
>> ACKs is really any better than the current option to buffer incoming
>> data.
>
> I suppose the gain for marco is that he will not just be delaying acks -
> OK he could do the same for data and I agree that there isn't much
> difference and it could be better to work with data in some ways eg. you
> can really drop whereas with acks it would be harder to simulate a drop.

Note that a drop means 'losing correct data already received', so when we
drop, we need a retransmission, so we are wasting bandwidth.

BTW, simulating a drop is easy: just send two (or three - it depends on
the implementation of the TCP stack) duplicate ACKs, which means "we have
received an out-of-order packet, so please retransmit".

> I don't think he will be just buffering though - thats the advantage for
> him - to build in intelligence to handling the problem of shaping from
> the wrong end of a bottleneck without the handicap of using dumb queues
> that are seeing traffic that's already shaped by a fifo and whose fill
> rate is determined by what percentage of the fifo rate you set their
> rate.

Yes, that's the idea.

>> I guess the receiving machine TCP stack gets it earlier so the
>> app looks more responsive, but other than the lower lag I don't see
>> much difference really?
>>
>> Curious to hear how your project gets on though! Please keep us
>> informed!
>
> Me too - Thomas Graf is also playing with delaying acks - see the dummy
> replacing imq thread this & last month on a netdev archive.
>
> I also remember seeing another shaper that uses delay pools for acks
> bandwidth arbiter or arbitrator IIRC. If you can't find it say, I
> probably have it squirreled away somewhere on my old PC.

Thank you for those references.

Ciao ciao.

--
BOFH excuse #270: Someone has messed up the kernel pointers
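For anyone wanting to see what "send duplicate ACKs" looks like on the
wire, here is a quick scapy sketch - every address, port and sequence
number in it is invented for the example, and a real shaper would do this
in kernel space rather than from a script:

    # Forge duplicate ACKs to nudge the sender into fast retransmit.
    from scapy.all import IP, TCP, send

    dup_ack = IP(src="192.0.2.10", dst="198.51.100.20") / TCP(
        sport=45000, dport=80,
        flags="A",
        seq=1000,          # our current send sequence number
        ack=555000,        # re-acknowledge the last in-order byte
        window=8192,
    )

    for _ in range(3):     # three duplicates trigger fast retransmit
        send(dup_ack, verbose=False)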
Ed Wildgoose wrote:

> I do agree with you, but I think they may have just badly phrased their
> point. I *think* what they are trying to say is that with really big
> devices, ie thousands of connections, then you can't just wait for the
> traffic to queue up and let it back up like that. You need to be a bit
> proactive and kill the traffic right from the start. I *assume* they
> mean it's just not feasible to do the equiv of SFQ or some other kind of
> classifier which can really finely grain the priorities for lots of
> connections. Hence their point that you are just shifting latency
> around.
>
> Bad point really, but I can kind of see what they might be saying.

Yeah, maybe I am being a bit harsh, and I suppose throttling the window
will save some drops (which would throttle cwnd anyway), so it is a bit
more elegant.

>> It would be nice to slow slowstart like this, to some extent linux TCP
>> already does this (well it does at LAN speed, it advertises a smaller
>> window then grows it). Not much help for WAN rates or forwarded
>> Windows traffic though.
>
> Well, perhaps this is really what we want to do? I noticed some
> interesting looking code on the POM part of netfilter to do some window
> tracking...

Yeah, netfilter conntrack does seem a logical place to do this - don't
know how, though.

>> I don't get the webserver bit - "heavy" browsing on 512kbit link hurts
>> - typically 4 simultaneous connections over and over. Sortable by
>> sending new connections to a short, lowish rate sfq (tweaked for
>> ingress). But only if there is no other bulk aswell. If there is then
>> I think you need some sort of prediction/intelligence or a dumb
>> queue will react too late and over aggressivly - resync bursts aswell
>> then.
>
> I do agree. My point was perhaps more that lots of little connections
> hurt. Long lasting ones seem well controlled by current QOS.

Yes, I agree.

> Also, I think SFQ is completely buggering up my IP telephone? I
> currently have some SFQ classes on the relevant queues and I wonder if
> the rehashing is re-ordering the packets?

Maybe - but then if it's interactive traffic it should not really be
queued enough for it to matter. I hang ESFQ on my interactive class, but
it doesn't get used - it's only there because I made it increment an
unused counter every time a packet arrives and the queue is not empty - it
always says 0, which means the way I use HTB and my marking are working as
expected. But my home setup is not very scalable, and my marking relies on
me having control over the apps. If I were netadmin for many users I would
have got round to using HFSC by now.

>> I suppose the gain for marco is that he will not just be delaying acks
>> - OK he could do the same for data and I agree that there isn't much
>> difference and it could be better to work with data in some ways eg.
>> you can really drop whereas with acks it would be harder to simulate a
>> drop.
>
> I'm just not sure that he is actually addressing the "lots of small
> connections" or "overload during slow start" problems that I think are
> still the main issues to be sorted?
>
>> Me too - Thomas Graf is also playing with delaying acks - see the
>> dummy replacing imq thread this & last month on a netdev archive.
>>
>> I also remember seeing another shaper that uses delay pools for acks
>> bandwidth arbiter or arbitrator IIRC. If you can't find it say, I
>> probably have it squirreled away somewhere on my old PC.
>
> Cool, I saw a note about dummy on this list, but didn't read the
> thread. Yes please, notes on the other idea would be interesting.

Well, I had a look, and it seems what I was thinking of has since turned
into NetLimiter. Maybe I was thinking of something else - maybe not, but I
thought it was to do with a uni. Here's what I have -
http://www.jessingale.dsl.pipex.com/arb.tar.gz

> I still think that the only sensible answer to this is to literally
> throttle the initial window advertisement, and perhaps even aggresively
> manipulate it thereafter. I'm thinking that you could make some
> reasonable assumptions perhaps for users on "normal" wan links where
> latency ranges from 10-15ms up to 330ms.

Yes - but for browsing you'll need to account for how many concurrent
connections there are as well, which is going to vary.

> Satellite links are a whole new ball game of course (and I have some
> satellite users...)

I've seen posts from people wanting to remove slow start totally for
satellite use.

> I'm thinking that if you could throttle the window back to even the size
> of the class that the connection "belongs to", then you could control
> the rest of the link using the normal linux QOS queues. I currently
> think the killer is just the spikes in latency which occur as TCP tries
> to test the window size, or when you have a bunch of connections all
> doing slow start at the same time (same kind of thing really). Window
> control really seems to be the tool to get the link roughly under
> control and then the queuing should be for the fine grained stuff.
>
> What do you think? ...Now how do I code it...?

It's bloody tricky and I haven't got a clue :-) I think the number of
connections within a class is going to matter, and how to elegantly
(predictively and without resync bursts) change the bandwidth of existing
classes/connections also needs sorting. Anything is going to be better
than doing nothing, though.

Andy.
marco ghidinelli wrote:

> note that drop means 'losing correct data already received'. so when we
> drop, we need a retransmission, so we are wasting bandwidth.

Yeah - it's a pain, but then you can't rely on anywhere else to do it
apart from your ISP/telco - who will fill a big buffer first. You still
lose the bandwidth of the retransmit - I lose the drop as well, but then I
have to back off X% anyway and the drop sort of comes out of that.

> btw, simulate a drop is easy just send two (or three - it depends on the
> implementation of the tcp stack) duplicate ack that means: we have
> received out of order packet so please retransmit.

Do you have to do it differently for servers that do SACK, or do they
still react to the DUPs?

Andy.
> Do you have to do it differently for servers that do SACK, or do they
> still react to the DUPs?

Can someone provide a link to a table which lists TCP "implementations" by
OS?

At the very least it would be interesting to know what features and TCP
implementations are used by WinXP, Win2K, Linux 2.4 and Linux 2.6, and if
possible also what Win98, ME and the various BSD/Solaris implementations
use, although for my purposes I care less about these.

It's really hard to find out whether, for example, SACK is implemented in
WinXP's TCP...? I'm not even sure what changed between Linux 2.4 and 2.6?

Thanks for any pointers

Ed W
Andy Furniss wrote:

> marco ghidinelli wrote:
>
>> note that drop means 'losing correct data already received'. so when we
>> drop, we need a retransmission, so we are wasting bandwidth.
>
> Yea - it's a pain, but then you can't rely on anywhere else to do it
> apart from your ISP/Teleco - who will fill a big buffer first.

Andy,

In the absence of an ACK, which cannot occur until the intended victim
receives and says it got the packet, what causes the sender to keep
sending such that the ISP's "big buffer" fills?

I suppose that a better way to ask this question is to ask what RFC(s) I
should be looking up.

--
gypsy
> In the absence of an ACK, which cannot occur until the intended victim
> receives and says it got the packet, what causes the sender to keep
> sending such that the ISP's "big buffer" fills?
>
> I suppose that a better way to ask this question is to ask what RFC(s) I
> should be looking up.

Basically this is TCP's congestion control (slow start and congestion
avoidance - RFC 2581 is the one to look up). In very simple terms it
discovers the max speed of the link by sending packets faster and faster
until you start to get packet dropping. Then TCP backs off a bit and
slowly probes the limits continuously as time goes by.
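A toy model of that probing behaviour (all numbers invented) shows the
cycle that fills the ISP's buffer - the congestion window grows every RTT
until a drop, then collapses and starts climbing again:

    # Toy additive-increase / multiplicative-decrease model of a TCP sender.
    # link_capacity and the starting values are made-up illustrative numbers.

    link_capacity = 40            # packets "in flight" the path can hold
    cwnd = 2                      # congestion window, in packets
    ssthresh = 64                 # slow-start threshold

    for rtt in range(20):
        lost = cwnd > link_capacity          # a queue overflowed -> drop
        if lost:
            ssthresh = max(cwnd // 2, 2)     # back off
            cwnd = ssthresh
        elif cwnd < ssthresh:
            cwnd *= 2                        # slow start: exponential growth
        else:
            cwnd += 1                        # congestion avoidance: linear
        print("rtt %2d: cwnd=%3d packets%s" % (rtt, cwnd, "  (loss)" if lost else ""))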
On Mon, 2005-02-14 at 01:27 +0100, marco ghidinelli wrote:

> On Fri, Feb 11, 2005 at 06:18:01PM +0000, Ed Wildgoose wrote:
>
>> Are we all on the same page as to what the problem is? Any more
>> thoughts on how to tackle it? I'm still not convinced that delaying
>> ACKs is really any better than the current option to buffer incoming
>> data.
>
> basically, we have smaller queue because we are queueing the small ack,
> not the big data transfer on the downlink.
>
> and delaying ack is the first step to do the delay+advertised window
> resize that maybe someone (me?) will do in the future.

Pardon me for jumping in a month after the last post, but what is the
objective of this anyway?

My understanding is that we need to prioritise ACKs (irrespective of which
ACK it is, e.g. BitTorrent ACKs / ssh ACKs / web ACKs etc). Isn't the
point here to make sure that what gets sent out is sent out faster (in the
case of ssh ACKs, where we should get 'minimum delay')?

In my current implementation at home, I use tc/HTB rules to prioritise
outgoing ACKs. This definitely helps, especially when doing huge Linux ISO
downloads through BitTorrent.

Why delay ACKs? Just because you want to delay/make/simulate the normal
TCP behaviour of making the window smaller? (Unless of course you can make
"specific" windows smaller - I would think this would be useful only if we
can tag ACKs for bulk traffic like BT.)

--
Ow Mun Heng
Gentoo/Linux on DELL D600 1.4Ghz
98% Microsoft(tm) Free!!
Neuromancer 18:05:11 up 21:33, 3 users, load average: 0.22, 0.30, 0.56
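For reference, the usual way to prioritise outgoing ACKs is to match small
TCP packets that carry only the ACK flag. A Python sketch of that test is
below; a tc u32 filter does the same kind of match on raw headers, and the
offsets here assume IPv4 with no IP options:

    import struct

    def is_bare_ack(ip_packet: bytes) -> bool:
        """True if this IPv4/TCP packet is a pure ACK with no payload.

        This mirrors the sort of match an ACK-prioritising filter makes:
        protocol TCP, only the ACK flag set, and no data beyond the
        headers. Assumes IHL == 5 (no IP options) for simplicity.
        """
        if len(ip_packet) < 40:
            return False
        ihl = (ip_packet[0] & 0x0F) * 4
        total_len = struct.unpack_from("!H", ip_packet, 2)[0]
        proto = ip_packet[9]
        if proto != 6:                       # not TCP
            return False
        data_offset = (ip_packet[ihl + 12] >> 4) * 4
        flags = ip_packet[ihl + 13]
        payload = total_len - ihl - data_offset
        only_ack = (flags & 0x3F) == 0x10    # ACK set, no SYN/FIN/RST/PSH/URG
        return only_ack and payload == 0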
> In my current implementation at home, I use tc/HTB rules to prioritise
> outgoing ACKs. This definitely helps, especially when doing huge Linux
> ISO downloads through BitTorrent.
>
> Why delay ACKs? Just because you want to delay/make/simulate the normal
> TCP behaviour of making the window smaller?

...because "we" don't want BitTorrent to go faster....

The goal is to slow things down, not speed them up. Granted though that
prioritising ACKs is useful up until the client is using the desired
amount of bandwidth; after that we want to slow them down again.

Ed W
On Tue, 2005-03-15 at 15:24 +0000, Ed Wildgoose wrote:

>> In my current implementation at home, I use tc/HTB rules to prioritise
>> outgoing ACKs. This definitely helps, especially when doing huge Linux
>> ISO downloads through BitTorrent.
>>
>> Why delay ACKs? Just because you want to delay/make/simulate the normal
>> TCP behaviour of making the window smaller?
>
> ...because "we" don't want bittorrent to go faster....

Hmm.. I see your point: by ACKing the BT packets faster, they will come in
faster.

> The goal is to slow things down, not speed them up. Granted though that
> prioritising acks is useful up until the client is using the desired
> amount of bandwidth, after that we want to slow them down again

Until a better method of prioritising/delaying the ACKs comes along, I
guess I will have to be content with prioritising ACKs so that web-request
ACKs still get through faster.

Hmm.. didn't Bram Cohen come out with BT 4.0, which is supposed to mark BT
packets as "BULK"?

--
Ow Mun Heng
Gentoo/Linux on DELL D600 1.4Ghz
98% Microsoft(tm) Free!!
Neuromancer 15:43:17 up 6:14, 6 users, load average: 0.30, 0.41, 0.27