While I'm thinking about that review of howto changes, here are a few
other responses about things I don't believe. I'll be interested in
more info if anyone has any.

===
[from new doc]
Besides being classful, CBQ is also a shaper and it is in that aspect
that it really doesn't work very well. It should work like this.

I've not noticed that it doesn't work well. Perhaps because I'm
accidentally using the right parameters?

If you try to shape a 10mbit/s connection to 1mbit/s, the link should
be idle 90% of the time. If it isn't, we need to throttle so that it
IS idle 90% of the time.

Which is the way it does work, as far as I can tell.

This is pretty hard to measure, so CBQ also needs to know how big an
average packet is going to be, and instead derives the idle time from
the number of microseconds that elapse between requests from the
hardware layer for more data. Combined, this can be used to
approximate how full or empty the link is.

I can't believe this dependence on packet size, since I've always had
good results using the same default packet size even though different
tests use very different packet sizes.

tc class add dev eth1 parent 10:100 classid 10:2 cbq \
  bandwidth 30Kbit rate 30Kbit allot 1514 weight 30Kbit prio 5 \
  maxburst 10 avpkt 1000 bounded

I send ping packets with the default data size (56 bytes), which is
98 bytes per packet including mac, ip and icmp headers.

[[new data to avoid problems with that in original reply]]
In 10 seconds I get 413 replies, which I assume means 413 got sent
(I enqueued 1000). That's (* 413 98 8 .1) = 32.3 kbps, pretty close.

Now I try 1000 bytes of data and get 40 replies over 10 seconds
(again enqueuing 1000 packets):
(* 1042 8 40 .1) = 33.3 kbps, again pretty close.

Finally, data size 1 gives me 981 replies over 10 seconds (this time I
have to increase the rate in order to saturate the limit):
(* 43 8 981 .1) = 33.7 kbps

It's clearly not counting every packet as the same size!!

===
bandwidth
The physical bandwidth of your device, also needed for idle time
calculations.

Notice above I supplied bandwidth 30Kbit, which is far from the actual
physical bandwidth (100Mbit). Maybe this is why I get good results.
Maybe this is what you're SUPPOSED to do!

Recall that in the experiment I reported to lartc 10/10, the correct
bandwidth ended up giving me about twice the rate. I don't see from
the above explanation why that should be, but again it suggests that
this parameter really ought to be set to the desired rate.
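For reference, the arithmetic above can be restated as a small Python
sketch. It only re-derives the rates from the numbers already given in
the mail (nothing re-measured), with on-wire sizes of 14 bytes MAC +
20 bytes IP + 8 bytes ICMP plus the data size:

    # (replies seen in 10 s, on-wire bytes per packet) for each ping size
    tests = [
        (413, 98),     # 56 bytes of data
        (40, 1042),    # 1000 bytes of data
        (981, 43),     # 1 byte of data
    ]

    for replies, wire_bytes in tests:
        kbps = replies * wire_bytes * 8 / 10 / 1000   # bits over 10 seconds
        print(f"{wire_bytes:5d}-byte packets: {kbps:5.1f} kbit/s")

    # All three land within a few kbit/s of the 30Kbit rate configured
    # above, which is why a strong dependence on avpkt / packet size looks
    # implausible in this test.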
Well, ahu is currently a bit biased toward HTB, and I'm pretty
comfortable with his position ;-)) Maybe the sentences in the doc are a
bit of personal feeling about CBQ. CBQ can behave VERY well (as you
know, HTB implements the whole CBQ theory, so HTB is in fact a kind of
CBQ scheduler). It is only Linux's CBQ scheduler that makes some
assumptions about hardware speed and latency. These simply don't hold
for some other devices, and thus ANK's CBQ doesn't work there. Also, as
you know, there are too many users for whom CBQ doesn't work and nobody
can explain why.

I'm able to dig into the CBQ sources and tell you exactly what happens
with bandwidth and avpkt and how to use them to get things wrong. But I
will not spend my time trying to understand all aspects of an
implementation which has some fundamental weak places. So I decided to
reimplement the CBQ algorithm in another way, starting with the theory
from Floyd (ANK started with the ns-2.0 sources). My implementation
uses a different algorithm but has the same goals.

I'd compare CBQ and HTB like sendmail vs. exim or postfix. CBQ has been
here for a long time and everyone uses it. For those who want to
understand what it does and want to do it correctly (like me), I
created HTB. It is much simpler for me to repair ALL bugs and
imprecisions in clear code which depends ONLY on time differences of
FINITE granularity and a timer of finite granularity. So I don't need
to manage artificial timers based on the hardware rate, and I don't
need to fiddle with the top-level algorithm, which IS an approximation
- ANK even made some changes against Floyd's paper because they "seem
more reasonable" (ANK's comment). HTB is based on Floyd's paper and
makes NO approximations. So if your rates are different from what you
expect, it is a bug and it CAN (and will) be fixed. In the CBQ qdisc
you can't say whether the error is due to a bug or by design (the
top-level approximation error).

That's my $0.02 for LARTC readers. ;)
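To make the "depends only on time differences" point concrete, here is
a hedged sketch of a shaper built in that style: a plain token bucket
whose decision uses nothing but timestamps and the configured rate. It
illustrates the design difference devik describes; it is not HTB's
actual code, and the class name and parameters are invented for the
example:

    import time

    class TokenBucket:
        """Shaping decision from the configured rate and wall-clock time
        differences only - no assumed hardware bandwidth, no avpkt.
        Illustrative sketch, not HTB's real implementation."""

        def __init__(self, rate_bps, burst_bytes):
            self.rate = rate_bps / 8.0   # refill speed in bytes per second
            self.burst = burst_bytes     # bucket depth in bytes
            self.tokens = burst_bytes
            self.last = time.monotonic()

        def may_send(self, pkt_len):
            now = time.monotonic()
            # Refill purely from the elapsed time difference.
            self.tokens = min(self.burst,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= pkt_len:
                self.tokens -= pkt_len
                return True              # packet conforms, send it
            return False                 # over limit: queue or delay it

CBQ, by contrast, infers how idle the link is from the timing of
dequeue events combined with the configured hardware bandwidth, which
is exactly where the assumptions devik mentions come in.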
On Fri, Dec 07, 2001 at 08:15:34AM -0800, Don Cohen wrote:

> [from new doc]
> Besides being classful, CBQ is also a shaper and it is in that aspect
> that it really doesn't work very well. It should work like this.
> I've not noticed that it doesn't work well. Perhaps because I'm
> accidentally using the right parameters?

Perhaps because you happen to use it in the circumstances it works well
for. As Devik said, he knows many cases in which CBQ missed the mark -
he can always explain them, and often suggest fixes, but it is not a
'surefire' algorithm.

> If you try to shape a 10mbit/s connection to 1mbit/s, the link should
> be idle 90% of the time. If it isn't, we need to throttle so that it
> IS idle 90% of the time.
> Which is the way it does work, as far as I can tell.

Yes, this is how it should work. Now imagine a busy ethernet segment
with a lot of collisions. CBQ is configured with 'bandwidth 100mbit/s'.
Because the ethernet is busy, the network card has a hard time getting
5mbit/s out of the door.

We are trying to shape traffic to 2mbit/s, so the link should be idle
98% of the time if we are sending at the configured rate. Now, because
the effective bandwidth has decreased to 5mbit/s, at 2mbit/s the link
appears *far* too busy - it is idle only 60% of the time! In fact, in
this situation, CBQ will shape our traffic to 100kbit/s, which isn't
even *close* to the configured 2mbit/s.

Now, I admit that not many segments will be as busy as this. But do you
see my point? The effectively available bandwidth determines how we
shape.

In many cases, you don't notice this. If you are talking to a switch,
for example, your ethernet will be pretty empty except for you. You
*will* get dequeued at a rate of 100mbit/s - *if* your network adaptor
is up to it. If your computer is busy with other things, or has a
crappy bus or network adaptor, CBQ may not be dequeued at 100mbit/s
either, in which case it will also do crazy things.

> I can't believe this dependence on packet size since I've always had
> good results using the same default packet size even though different
> tests use very different packet sizes.

Ok. Let's see what the kernel does with avpkt. It doesn't do anything
with it. This is truly marvelous :-) It takes great care to copy avpkt
to the kernel, but as far as I can tell, avpkt is only used to
calculate maxidle. So I retract the packet size dependency statement.
It was wrong.

> (* 1042 8 40 .1) = 33.3 kbps, again pretty close

(tests)

> Finally, data size 1 gives me 981 replies over 10 seconds (this time I
> have to increase the rate in order to saturate the limit)
> (* 43 8 981 .1) = 33.7 kbps
>
> It's clearly not counting every packet as the same size!!

Thanks for testing this!

> bandwidth
> The physical bandwidth of your device, also needed for idle time
> calculations.
>
> Notice above I supplied bandwidth 30Kbit, which is far from the actual
> physical bandwidth (100Mbit). Maybe this is why I get good results.
> Maybe this is what you're SUPPOSED to do!

Not that I'm aware of.

> Recall that in the experiment I reported to lartc 10/10, the correct
> bandwidth ended up giving me about twice the rate. I don't see from
> the above explanation why that should be, but again it suggests that
> this parameter really ought to be set to the desired rate.

I've succeeded in bullying some of the kernel people (Hi Jamal :-))
into reading over the HOWTO and manpages - we will soon know.
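The arithmetic in that busy-ethernet example can be spelled out
directly; this is just a restatement of bert's numbers in Python, not a
model of the real qdisc:

    configured_bw = 100e6   # what CBQ is told the link can do (bit/s)
    target_rate   = 2e6     # what we want to shape to (bit/s)
    effective_bw  = 5e6     # what the congested segment really delivers (bit/s)

    expected_idle = 1 - target_rate / configured_bw   # 0.98: what CBQ aims for
    observed_idle = 1 - target_rate / effective_bw    # 0.60: what it would see

    # CBQ keeps throttling until the observed idle fraction matches the
    # expected one, so the rate it converges on is roughly:
    converged = (1 - expected_idle) * effective_bw
    print(expected_idle, observed_idle, converged)    # -> roughly 100 kbit/s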
Regards,

bert

--
http://www.PowerDNS.com          Versatile DNS Software & Services
Trilab                           The Technology People
Netherlabs BV / Rent-a-Nerd.nl   - Nerd Available -
'SYN! .. SYN|ACK! .. ACK!' - the mating call of the internet
On Sat, Dec 08, 2001 at 09:10:50PM +0100, bert hubert wrote:

> > Notice above I supplied bandwidth 30Kbit, which is far from the actual
> > physical bandwidth (100Mbit). Maybe this is why I get good results.
> > Maybe this is what you're SUPPOSED to do!
>
> Not that I'm aware of.

To agree with you, AFAICS, the correct way to deal with this is to
specify the root bandwidth as the maximum physical bandwidth on the
interface, then split it down using classes that have rates set to the
expected rates. On a 100Mbit card connected to a 256kbit line, I used
something like:

tc qdisc add dev eth0 root handle 1: cbq \
    bandwidth 100Mbit avpkt 1000
tc class add dev eth0 parent 1:0 classid 1:1 cbq \
    bandwidth 100Mbit rate 256kbit [...]
tc qdisc add dev eth0 parent 1:1 handle 10: cbq \
    bandwidth 256kbit allot 1514 avpkt 1000

(PS, highly inspired by Stef and others' scripts of course)

All my other classes then hang off 10: instead of 1: and work quite
well.

What I've considered doing a few times is adding an option to dump out
the values CBQ is looking at for idleness at each level, as well as
dynamic avpkt values (based on reality). HTB may do this, of course.

--
Michael T. Babcock
CTO, FibreSpeed Ltd. (Hosting, Security, Consultation, Database, etc)
http://www.fibrespeed.net/~mbabcock/
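To illustrate the "dynamic avpkt values (based on reality)" idea, a
running estimate of real packet sizes could be kept with a simple
exponentially weighted moving average, in the spirit of the ewma
setting CBQ already exposes. This is a hypothetical sketch of the idea
only; nothing like it exists in tc or the kernel today, and the helper
name and decay constant are made up:

    def update_avpkt(avpkt, pkt_len, ewma_log=5):
        """Running average of observed packet sizes: each new packet
        contributes 1/2^ewma_log of its weight.  Hypothetical helper,
        not part of tc or the kernel."""
        return avpkt + (pkt_len - avpkt) / (1 << ewma_log)

    # Example: start from a configured avpkt of 1000 bytes and feed in a
    # stream of 98-byte pings; the estimate drifts toward the real size.
    avpkt = 1000.0
    for _ in range(500):
        avpkt = update_avpkt(avpkt, 98)
    print(round(avpkt))   # approaches 98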