Hi all,

Having built a fairly complex QoS system I can now see that there are a number of places where I feel that QoS in Linux could and should be improved. The MD of my company, NetServers.co.uk, has generously agreed in principle to sponsor this work if there is community interest and agreement.

The first thing which should be changed is the whole "filter" system, which classifies traffic. It does basically the same job as Netfilter, but in a completely incompatible, much less powerful way.

For example, there are no user-defined chains to jump between, rules are just read in linear order or you can jump down the classification tree, but I suspect this is not the right structure for really powerful classification.

All the filters are per-device, so if you want to apply the same filters to every device then you have to repeat them that number of times, which is inefficient and slow to load.

I also don't think we should have two competing, incompatible systems for packet matching in the kernel, and iptables is clearly superior.

My proposed solution is a new Netfilter table which packets pass through on their way out to a device. There would be just a single terminal target, CLASSIFY, which would enqueue the packet in the specified classifier. Unclassified packets which drop off the end of the entry chain would pass on to the old-style tc filtering system, for backwards compatibility.

With iptables' powerful packet matchers and the ability to define custom chains and jump between them, this would be significantly more powerful than tc. It would also be easier to use (for anyone who already knows iptables), and eventually the old code could be removed from the QoS filters, simplifying them.

Netfilter MARK can of course be used to achieve some of this, but there is much contention for the MARK field due to its many uses, and it also requires a tc filter for each MARK value to file packets with that mark in the appropriate class. The double filtering is bad for performance, and would be unnecessary if iptables could feed packets directly into the appropriate class.

The classifier system can also be improved, with less drastic changes. It suffers from some limitations:

- Classes cannot be named, only assigned a number from a 16-bit range.
  This makes classification rulesets hard to read and follow.

- Classes must always be attached to a specific qdisc or parent class, on
  a specific device. This makes it impossible to put a global limit on
  traffic "from" a zone, since at least each device the packets go "to"
  will have its own classifier with its own leaky bucket.

- Classes can only be attached to one parent, so the potential for using
  QoS for multilink, load balancing, etc. is not realised at present.
  Although you could create a bonded device and attach classes to it,
  you could not set different rates on the two devices unless you can
  use filters to determine in advance which packets will go through which
  device, which defeats the point of link load balancing.

These limitations can all be fixed with changes to the QoS kernel structure and tc tools.

Finally, the tc tool itself is undocumented, and its syntax is somewhat arcane and just a little bizarre in places.

I believe that implementing these suggestions would make Linux QoS more powerful, more accessible, and simpler. Does anyone here agree or disagree? Would anyone like to help me with this project? Where else should I ask about this, apart from the obvious Netfilter mailing list?

Comments, suggestions and flames welcome.

Cheers, Chris.
--
Chris Wilson -- UNIX Firewall Lead Developer
NetServers.co.uk http://www.netservers.co.uk
21 Signet Court, Cambridge, UK. 01223 576516

_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://lartc.org/
Chris Wilson wrote:

> Hi all,
>
> Having built a fairly complex QoS system I can now see that there are a number of places where I feel that QoS in Linux could and should be improved. The MD of my company, NetServers.co.uk, has generously agreed in principle to sponsor this work if there is community interest and agreement.
>
> The first thing which should be changed is the whole "filter" system, which classifies traffic. It does basically the same job as Netfilter, but in a completely incompatible, much less powerful way.
>
> For example, there are no user-defined chains to jump between, rules are just read in linear order or you can jump down the classification tree, but I suspect this is not the right structure for really powerful classification.
>
> All the filters are per-device, so if you want to apply the same filters to every device then you have to repeat them that number of times, which is inefficient and slow to load.

that arises from the fact that filters are bound to qdiscs and qdiscs are bound to devices. handles are local to each device, so it doesn't make much sense to have global filters.

> I also don't think we should have two competing, incompatible systems for packet matching in the kernel, and iptables is clearly superior.

while not being as fast as tc filters.

> My proposed solution is a new Netfilter table which packets pass through on their way out to a device. There would be just a single terminal target, CLASSIFY, which would enqueue the packet in the specified classifier. Unclassified packets which drop off the end of the entry chain would pass on to the old-style tc filtering system, for backwards compatibility.

i think with netfilter table you mean iptables table. iptables can only see ip, so you still have the need for tc filters as they exist. The classify target can easily be made, just set skb->priority to the class id and the qdisc will take care of the rest. if your company wishes to sponsor me for writing one for you, go ahead ;)

> With iptables' powerful packet matchers and the ability to define custom chains and jump between them, this would be significantly more powerful than tc. It would also be easier to use (for anyone who already knows iptables), and eventually the old code could be removed from the QoS filters, simplifying them.

read my comments above .. iptables -> ip, tc filter -> anything

> Netfilter MARK can of course be used to achieve some of this, but there is much contention for the MARK field due to its many uses, and it also requires a tc filter for each MARK value to file packets with that mark in the appropriate class. The double filtering is bad for performance, and would be unnecessary if iptables could feed packets directly into the appropriate class.

use skb->priority.

> The classifier system can also be improved, with less drastic changes. It suffers from some limitations:
>
> - Classes cannot be named, only assigned a number from a 16-bit range.
>   This makes classification rulesets hard to read and follow.

nice idea .. i think i'm gonna check if this is possible without much effort ..

> - Classes must always be attached to a specific qdisc or parent class, on
>   a specific device. This makes it impossible to put a global limit on
>   traffic "from" a zone, since at least each device the packets go "to"
>   will have its own classifier with its own leaky bucket.

you may use imq for this (http://trash.net/~kaber/imq)

> - Classes can only be attached to one parent, so the potential for using
>   QoS for multilink, load balancing, etc. is not realised at present.
>   Although you could create a bonded device and attach classes to it,
>   you could not set different rates on the two devices unless you can
>   use filters to determine in advance which packets will go through which
>   device, which defeats the point of link load balancing.
>
> These limitations can all be fixed with changes to the QoS kernel structure and tc tools.
>
> Finally, the tc tool itself is undocumented, and its syntax is somewhat arcane and just a little bizarre in places.

also some other quirks, e.g. 1 tc mbit != 1 iptraf mbit .. there is a tc-ng from jamal, but I've never tried it.

> I believe that implementing these suggestions would make Linux QoS more powerful, more accessible, and simpler. Does anyone here agree or disagree? Would anyone like to help me with this project? Where else should I ask about this, apart from the obvious Netfilter mailing list?
>
> Comments, suggestions and flames welcome.
>
> Cheers, Chris.

Bye
Patrick
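For illustration, the "set skb->priority to the class id and the qdisc will take care of the rest" idea can be sketched in user space. This is only a rough model of what a classful qdisc might do before it consults its tc filters: if the priority carries the qdisc's own major number and names an existing class, that class is used directly. The names `toy_class`, `toy_qdisc`, `toy_classify` and `TOY_H_MAJ` are hypothetical and are not taken from the kernel or from any patch in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Upper 16 bits of a handle hold the major number (assumed layout). */
#define TOY_H_MAJ(h) ((h) & 0xFFFF0000U)

/* Hypothetical stand-in for a traffic class. */
struct toy_class {
    uint32_t classid;               /* (major << 16) | minor */
};

/* Hypothetical stand-in for a classful qdisc on one device. */
struct toy_qdisc {
    uint32_t handle;                /* qdisc handle, MAJOR:0 */
    const struct toy_class *classes;
    size_t nclasses;
};

/*
 * Return the class named by `priority`, or NULL when the priority does
 * not name one of this qdisc's classes. On NULL, a real qdisc would
 * fall back to its ordinary tc filters.
 */
const struct toy_class *toy_classify(const struct toy_qdisc *q,
                                     uint32_t priority)
{
    assert(q != NULL);

    /* The major number must match the qdisc's own handle. */
    if (TOY_H_MAJ(priority ^ q->handle) != 0)
        return NULL;

    /* Linear search is enough for a sketch. */
    for (size_t i = 0; i < q->nclasses; i++) {
        if (q->classes[i].classid == priority)
            return &q->classes[i];
    }
    return NULL;
}
```

With a qdisc whose handle is 1:0 and classes 1:10 and 1:20, a priority of 0x00010020 selects class 1:20, while 0x00020020 (major 2) returns NULL and leaves the packet to the filters.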
Patrick McHardy
2003-Jan-25 01:51 UTC
Netfilter target: CLASSIFY (was Re: QoS in Linux: Project Suggestion)
Patrick McHardy wrote:

> Chris Wilson wrote:
>
>> My proposed solution is a new Netfilter table which packets pass through on their way out to a device. There would be just a single terminal target, CLASSIFY, which would enqueue the packet in the specified classifier. Unclassified packets which drop off the end of the entry chain would pass on to the old-style tc filtering system, for backwards compatibility.
>
> i think with netfilter table you mean iptables table. iptables can only see ip, so you still have the need for tc filters as they exist. The classify target can easily be made, just set skb->priority to the class id and the qdisc will take care of the rest. if your company wishes to sponsor me for writing one for you, go ahead ;)

sorry if my last mail sounded arrogant, i think you deserve honour for being willing to sponsor linux QoS development. As a sign of my apologies, take this CLASSIFY target patch against latest netfilter CVS ;) It is untested, but despite possible easy-to-fix typos i hope it works. chmod +x userspace/extensions/.CLASSIFY-test after applying the patch.

Usage is:
-j CLASSIFY --set-class MAJOR:MINOR

Bye,
Patrick
Patrick McHardy
2003-Jan-25 02:17 UTC
Re: Netfilter target: CLASSIFY (was Re: QoS in Linux: Project Suggestion)
Patrick McHardy wrote:

> sorry if my last mail sounded arrogant, i think you deserve honour for being willing to sponsor linux QoS development. As a sign of my apologies, take this CLASSIFY target patch against latest netfilter CVS ;) It is untested, but despite possible easy-to-fix typos i hope it works. chmod +x userspace/extensions/.CLASSIFY-test after applying the patch.
>
> Usage is:
> -j CLASSIFY --set-class MAJOR:MINOR
>
> Bye,
> Patrick

sorry, the last one had an obvious class value parsing error (treated as integer instead of hex), i hope this one is fine.

patrick
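To make the hex-versus-integer point concrete, here is a minimal sketch of reading a `MAJOR:MINOR` argument with both fields taken as hexadecimal, which is how tc writes class ids. It is not the code from the patch (the attachment is not reproduced in this thread); `parse_class_fields` is a hypothetical helper, and reading "integer" as "decimal" is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: split "MAJOR:MINOR" into two 16-bit fields,
 * reading both as hexadecimal. Returns 0 on success, -1 on bad input.
 */
int parse_class_fields(const char *s, uint16_t *major, uint16_t *minor)
{
    char *end;
    unsigned long maj, min;

    assert(s != NULL && major != NULL && minor != NULL);

    /* Base 16: "10" means 0x10 (16), not decimal 10. */
    maj = strtoul(s, &end, 16);
    if (end == s || *end != ':' || maj > 0xFFFF)
        return -1;

    s = end + 1;
    min = strtoul(s, &end, 16);
    if (end == s || *end != '\0' || min > 0xFFFF)
        return -1;

    *major = (uint16_t)maj;
    *minor = (uint16_t)min;
    return 0;
}
```

Under this reading, `1:10` gives major 1 and minor 0x10. A base-10 parse would give minor 10 (0xa) instead, so the packet would be steered to a different class from the one created with `tc class add ... classid 1:10`.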
Patrick McHardy
2003-Jan-25 11:34 UTC
Re: Netfilter target: CLASSIFY (was Re: QoS in Linux: Project Suggestion)
ok this one actually compiles ;)

patrick
Abraham van der Merwe
2003-Jan-29 21:48 UTC
Re: Netfilter target: CLASSIFY (was Re: QoS in Linux: Project Suggestion)
Hi Patrick!

> ok this one actually compiles ;)

just a small bug in your code:

> + *p = TC_H_MAKE(i, j);

that should be changed to TC_H_MAKE(i<<16, j)

--
Regards
 Abraham

Here we are in America ... when do we collect unemployment?
___________________________________________________
 Abraham vd Merwe [ZR1BBQ] - Frogfoot Networks
 P.O. Box 3472, Matieland, Stellenbosch, 7602
 Cell: +27 82 565 4451 Http: http://www.frogfoot.net/
 Email: abz@frogfoot.net
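The reason the shift matters can be shown with a small standalone sketch. `TC_H_MAKE` masks its first argument with the upper 16 bits, so a major number passed unshifted is masked away to zero. The macro definitions below repeat those in `<linux/pkt_sched.h>` so the example builds without kernel headers; the two wrapper functions are hypothetical and exist only for the comparison.

```c
#include <assert.h>
#include <stdint.h>

/* Repeated from <linux/pkt_sched.h> so the sketch is standalone. */
#define TC_H_MAJ_MASK 0xFFFF0000U
#define TC_H_MIN_MASK 0x0000FFFFU
#define TC_H_MAKE(maj, min) \
    (((maj) & TC_H_MAJ_MASK) | ((min) & TC_H_MIN_MASK))

/* Corrected form: move the major number into the upper 16 bits first. */
uint32_t make_classid(uint32_t major, uint32_t minor)
{
    assert(major <= 0xFFFF && minor <= 0xFFFF);
    return TC_H_MAKE(major << 16, minor);
}

/* Form from the quoted patch line: the unshifted major is masked away. */
uint32_t make_classid_unshifted(uint32_t major, uint32_t minor)
{
    return TC_H_MAKE(major, minor);
}
```

For class 1:10, `make_classid(1, 0x10)` yields 0x00010010, whereas `make_classid_unshifted(1, 0x10)` yields 0x00000010, which is class 0:10 and would not match a qdisc with handle 1:.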
Patrick McHardy
2003-Jan-29 23:20 UTC
Re: Netfilter target: CLASSIFY (was Re: QoS in Linux: Project Suggestion)
Abraham van der Merwe wrote:

> Hi Patrick!
>
>> ok this one actually compiles ;)
>
> just a small bug in your code:
>
>> + *p = TC_H_MAKE(i, j);
>
> that should be changed to TC_H_MAKE(i<<16, j)

thanks, i've attached a changed diff. have you actually tested it? does it work? ;)

bye,
patrick
Abraham van der Merwe
2003-Jan-30 07:49 UTC
Re: Netfilter target: CLASSIFY (was Re: QoS in Linux: Project Suggestion)
Hi Patrick!

> thanks, i've attached a changed diff. have you actually tested it?
> does it work? ;)

Yes, I tested it last night - works great (:

--
Regards
 Abraham

The devil finds work for idle glands.