Brian S Julin
2007-Nov-28 17:27 UTC
dynamic PBR, actions, docs and getting it all straight
Hi,

Fair warning: this may be a bit rambling, and it is definitely a bit long.

I am trying to prototype a system for doing dynamic policy-based routing (source-address dependent, based on reverse routes from BGP or other dynamic routing protocols). We need to do this due to a cacophony of factors I won't get into. The general plan is to store the dynamic routes in their own table, classify based on source realm, and use the tc "mirred" action to redirect packets whose source addresses are routed back to by that table onto a different egress interface.

It seems obvious that this can be done, that the old "iptables -j ROUTE" method is falling into disfavor and lack of maintenance, and that the tc "mirred" action is stepping up to take its place. However, this has raised numerous questions, most of them just because this is my first wade into the LARTC pool. I am also having trouble finding any docs that factor in actions, since they are relatively new -- though not so new that this should really be the case. (And speaking of docs, one wonders whether the "Traffic Control HOWTO" posted at linux-ip.net, bearing version 1.0.2, is intended to split/supersede the LARTC HOWTO or is completely rogue. It appears to be a very well done doc, but it also does not factor in actions.)

Anyway, the questions:

1) When a packet is "mirred egress redirect"ed, how does the system determine the destination MAC address to use on the outgoing interface, assuming it is Ethernet? If I have things straight, the packet will never see the routing stack again, so a gateway cannot be designated? (The older iptables -j ROUTE allowed designation of a gateway.) If this:

http://www.shorewall.net/NetfilterOverview.html

...is right, there is no swat at mangling/rewriting post-qdisc? I'm guessing "that's a job for IMQ"?

2) If I have things straight again, it is not necessary to involve iptables to do this. The method cited in the few examples on the net about doing this uses fwmark.
However, with the tc "route" filter it should not be necessary to do that anymore. Am I right there?

3) Per 2), which is the better method to use?

4) Is there an authoritative list of which actions are supported at which points in the syntax tree? The "route" filter seems to support only classifying and gact, for example, and if I am interpreting the not-so-lucid error messages from yesterday's wrestle with tc correctly, the inability to execute certain actions extends to any policer appended to the filter. What's supported where, and what will eventually be supported where?

5) Is there any way to turn on more error messages from the kernel so I can tell what the heck tc doesn't like about commands? Even if I have to read them from syslog and the userspace handles aren't meaningful, it still might be nice to have.

6) If I have this right, it's possible to define a class using the "route" filter, then a subclass using a do-nothing filter (u32 match u32 0 0) which in turn invokes the "mirred" action. I am not quite clear, however, on precisely when a packet is counted against a qdisc and precisely when actions "happen." I am worried about the activation of the "route" rule counting as link use even though the packet is redirected (stolen), mainly because in order to use a filter just to execute an action, it's mandatory to have a class to attach it to, and then a second class for packets that did not match (the normal traffic) -- each class having bandwidth limits or whatnot depending on the qdisc. If I have it right, though, a stolen or dropped packet will not show up, because it won't actually be there in the qdisc when the kernel comes collecting (?).

7) Will classless qdiscs eventually regain support for attaching filters, given that filters do not necessarily have to assign a class and can instead execute an action, or a police with nothing but actions? Or will it always be necessary to create classes to contain the filters, and thus use a classful qdisc?
I say "regain" because I seem to recall seeing a doc that showed attaching a filter to a classless qdisc, though I can't find it now and perhaps that was an error.

8) As a curiosity, why "handle XX fw" rather than "fw handle XX"?

9) Is there any motion to bring the distributed manpage back in sync?

10) I haven't even looked into it yet -- how does one (or does one?) integrate L2+L3 criteria/actions with qdiscs... any docs on other-than "protocol ip"? I am assuming it is not possible to trick things into performing a direct route table comparison against a packet that is not routed but bridged, other than by building a netfilter ipset from the route table with bubblegum and spit and just using ebtables on it. But I'd be bummed if I assumed so wrongly and passed up an elegant solution.

Thanks for any help wrapping my head around this.
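
For concreteness, the general plan described above (routes in their own table tagged with a realm, a "route" filter classifying on source realm, then a do-nothing u32 filter invoking "mirred") might translate into something like the following dry-run script. It only prints the commands; it is untested against a real kernel and may well be wrong in exactly the places the questions above ask about. The interface names, table number, realm number, prefix and gateway are all made-up placeholders, and nesting a second prio qdisc under the class is just one guess at where the u32 filter could be attached.

```shell
#!/bin/sh
# Dry-run sketch: prints ip/tc commands, does NOT execute them.
# Every value below is a hypothetical placeholder.
IN_DEV=eth0            # interface the traffic would normally leave by
OUT_DEV=eth1           # alternate egress interface for redirected traffic
TABLE=100              # table holding the dynamically learned routes
REALM=10               # realm tagged onto those routes
PREFIX=192.0.2.0/24    # documentation prefix standing in for a BGP route
GATEWAY=198.51.100.1   # placeholder next hop

pbr_commands() {
    # 1. Stand-in for a route the routing daemon would install, tagged with a realm.
    echo "ip route add $PREFIX via $GATEWAY dev $OUT_DEV table $TABLE realm $REALM"
    # 2. A classful root qdisc so that filters have somewhere to attach.
    echo "tc qdisc add dev $IN_DEV root handle 1: prio"
    # 3. Classify on source realm with the route filter.
    echo "tc filter add dev $IN_DEV parent 1: protocol ip prio 10 route from $REALM classid 1:1"
    # 4. Child qdisc under the chosen class (assumption: a place for the u32 filter).
    echo "tc qdisc add dev $IN_DEV parent 1:1 handle 10: prio"
    # 5. Match-everything u32 filter whose only job is to run the mirred action.
    echo "tc filter add dev $IN_DEV parent 10: protocol ip prio 1 u32 match u32 0 0 flowid 10:1 action mirred egress redirect dev $OUT_DEV"
}

pbr_commands
```

Running the script just lists the five commands in order, so they can be reviewed (or pasted one at a time as root) rather than applied blindly.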