from http://luxik.cdi.cz/~patrick/imq/index.html
  By default you have two imq devices (imq0 and imq1).

That means if you don't specify numdevs below the default is 2, right?

from http://luxik.cdi.cz/~patrick/imq/index.html
  modprobe imq numdevs=1
  tc qdisc add dev imq0 handle 1: root htb default 1
  tc class add dev imq0 parent 1: classid 1:1 htb rate 1mbit
  tc qdisc add dev imq0 parent 1:1 handle 10: htb default 5

We're adding an htb as the qdisc for a child class of htb?  Why?
Isn't that just wasting time?  Can't all the 10: stuff be done with 1:
instead?

So above, the total rate for all imq traffic is limited to 1Mbit.  As
far as I see, it all goes out 1:1, and below we shape that.

  tc class add dev imq0 parent 10: handle 10:1 htb rate 256kbit burst 30k prio 1
  tc class add dev imq0 parent 10: handle 10:2 htb rate 256kbit burst 30k prio 2
  tc class add dev imq0 parent 10: handle 10:5 htb rate 1mbit prio 3

  tc qdisc add dev imq0 parent 10:1 handle 21:0 pfifo
  tc qdisc add dev imq0 parent 10:2 handle 22:0 sfq
  tc qdisc add dev imq0 parent 10:3 handle 23:0 sfq

Should this 10:3 be 10:5?

  tc filter add dev imq0 protocol ip pref 1 parent 10: handle 1 fw classid 10:1
  tc filter add dev imq0 protocol ip pref 2 parent 10: handle 2 fw classid 10:2

  iptables -t mangle -A PREROUTING -i ppp0 -j IMQ

This is a little confusing.  I gather -j IMQ means to mark the packet
so that AFTER the mangle table is done, the IMQ device will steal the
packet.  Is that right?  If there were more than one imq device, how
would we know which one gets the packet?

  iptables -t mangle -A PREROUTING -i ppp0 -p tcp -m tos --tos minimize-delay -m state --state ESTABLISHED -j MARK --set-mark 1
  iptables -t mangle -A PREROUTING -i ppp0 -p tcp --sport 80 --dport 1024: -m state --state ESTABLISHED -j MARK --set-mark 2

  ip link set imq0 up

So here's what I imagine happens.  Please confirm or correct.
The packet goes through the mangle table, maybe gets marked for later
classification, and maybe gets marked for imq.  Then the imq hook
steals away those that were marked for imq.  It enqueues them and
dequeues them according to its classes and the marks.  At this
point is the skb dev imq0, or still ppp0?

When the packet is eventually dequeued (if not dropped), where does it
go?  I'm hoping it goes to the beginning of pre-routing so we can
apply conntrack/nat/mangle rules to it with -i imq0.

I suspect this is not the case, since I see in the patch code

  nf_reinject(skb, info, NF_ACCEPT)

I'm not even sure netfilter supports what I want.  I see in
http://netfilter.samba.org/documentation/HOWTO//netfilter-hacking-HOWTO-3.html

  5. NF_REPEAT: call this hook again.

but what's "this hook"?  Is it the imq hook or pre_routing?

My goal here is to protect conntrack from attack by rate limiting the
packets not from known connections.  To do that I need to send them to
imq before conntrack sees them.  Unfortunately, conntrack does all the
work that I want to avoid for those packets in prerouting, and
conntrack sees them before mangle.  But maybe I could restrict all
operations that involve conntrack to packets with dev imq (or the imq
mark), which I hope would result in conntrack NOT seeing the packets
with other devices.  I could instead send all those packets from other
devices (or without the mark) to imq, where I would look them up in
the conntrack table (but not add them!) to see whether they belong in
the rate limited class or not.  Then when (if) they're released they
should go through conntrack, nat, mangle, etc.

Perhaps even better than changing the dev to imq0 would be a way for
netfilter rules to match on the imq mark.  Then I wouldn't have to
worry about whether rp_filter would still work.

from http://luxik.cdi.cz/~patrick/imq/faq.html

  4. When do packets reach the device (qdisc)?

  The imq device registers NF_IP_PRE_ROUTING (for ingress) and
  NF_IP_POST_ROUTING (egress) netfilter hooks.  These hooks are also
  registered by iptables.  Hooks can be registered with different
  priorities which determine the order in which the registered
  functions will be called.  Packet delivery to the imq device in
  NF_IP_PRE_ROUTING happens directly after the mangle table has been
  passed (not in the table itself!).  In NF_IP_POST_ROUTING packets
  reach the device after ALL tables have been passed.  This means you
  will be able to use netfilter marks for classifying incoming and
  outgoing packets.  Packets seen in NF_IP_PRE_ROUTING include the
  ones that will be dropped by packet filtering later (since they
  already occupied bandwidth); in NF_IP_POST_ROUTING only packets
  which already passed packet filtering are seen.

from include/linux/netfilter_ipv4.h

  enum nf_ip_hook_priorities {
          NF_IP_PRI_FIRST = INT_MIN,
          NF_IP_PRI_CONNTRACK = -200,
          NF_IP_PRI_MANGLE = -150,
          NF_IP_PRI_NAT_DST = -100,
          NF_IP_PRI_FILTER = 0,
          NF_IP_PRI_NAT_SRC = 100,
          NF_IP_PRI_LAST = INT_MAX,
  };

So "after mangle" means first conntrack, then mangle, then IMQ, then ...
It might be worth mentioning this somewhere in the doc.

One other thing I worry about.  net/ipv4/netfilter/ip_queue.c contains:

   * Packets arrive here from netfilter for queuing to userspace.
   * All of them must be fed back via nf_reinject() or Alexey will kill Rusty.
   */
  static int netfilter_receive(struct sk_buff *skb,

I notice the patch returning NF_QUEUE.
Will Rusty survive if IMQ ends up dropping packets?
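[Editor's note: assuming the 10:3 in the example above is indeed a typo
for 10:5 (so that the default class actually gets a leaf qdisc), the
leaf-qdisc lines would read as below.  This is a sketch of the suspected
intent, not taken verbatim from the IMQ page.]

```shell
# Leaf qdiscs with the suspected 10:3 -> 10:5 typo fixed, so that the
# default class 10:5 gets its sfq instead of the nonexistent 10:3.
tc qdisc add dev imq0 parent 10:1 handle 21:0 pfifo
tc qdisc add dev imq0 parent 10:2 handle 22:0 sfq
tc qdisc add dev imq0 parent 10:5 handle 23:0 sfq
```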
Hi.  A lot of questions; I hope I can answer them to your satisfaction.

Don Cohen wrote:

>  from http://luxik.cdi.cz/~patrick/imq/index.html
>  By default you have two imq devices (imq0 and imq1)
>
>That means if you don't specify numdevs below the default is 2, right?

right.

>  from http://luxik.cdi.cz/~patrick/imq/index.html
>  modprobe imq numdevs=1
>  tc qdisc add dev imq0 handle 1: root htb default 1
>  tc class add dev imq0 parent 1: classid 1:1 htb rate 1mbit
>  tc qdisc add dev imq0 parent 1:1 handle 10: htb default 5
>
>We're adding an htb as the qdisc for a child class of htb?  Why?
>Isn't that just wasting time?  Can't all the 10: stuff be done with 1:
>instead?

The root qdisc is used for delay simulation, 10:0 is the "real" qdisc
( http://luxik.cdi.cz/~devik/qos/htb/manual/userg.htm#prio )

>  tc qdisc add dev imq0 parent 10:1 handle 21:0 pfifo
>  tc qdisc add dev imq0 parent 10:2 handle 22:0 sfq
>  tc qdisc add dev imq0 parent 10:3 handle 23:0 sfq
>
>Should this 10:3 be 10:5?

Yes, you're right.  Someone else already told me but I forgot to
correct it.

>  iptables -t mangle -A PREROUTING -i ppp0 -j IMQ
>
>This is a little confusing.  I gather -j IMQ means to mark the packet
>so that AFTER the mangle table is done, the IMQ device will steal the
>packet.  Is that right?  If there were more than one imq device, how
>would we know which one gets the packet?

I put it in this order on purpose because I thought it would show
people the imq device does not get to see the packet during mangle
table traversal but afterwards.  It probably IS confusing, so I'm
going to change it in the next days.  If more than one imq device is
used, you specify the one which should get the packet with the
--todev argument to the IMQ target.

>So here's what I imagine happens.  Please confirm or correct.
>[...]  At this point is the skb dev imq0, or still ppp0?

skb->dev doesn't get changed, if that's what you mean ..

>When the packet is eventually dequeued (if not dropped), where does
>it go?  I'm hoping it goes to the beginning of pre-routing so we can
>apply conntrack/nat/mangle rules to it with -i imq0.

No it doesn't.  I think it doesn't make any sense to use any kind of
iptables rules on packets passing imq, because all of them come
from/go to real devices which you can use in your rules.

>I see in
>  http://netfilter.samba.org/documentation/HOWTO//netfilter-hacking-HOWTO-3.html
>  5. NF_REPEAT: call this hook again.
>but what's "this hook"?  Is it the imq hook or pre_routing?

It's the imq hook.  From net/core/netfilter.c:

  nf_reinject(...)
  {
          ...
          if (verdict == NF_REPEAT) {
                  elem = elem->prev;
                  verdict = NF_ACCEPT;
          }
          ...
  }

>My goal here is to protect conntrack from attack by rate limiting the
>packets not from known connections.  To do that I need to send them to
>imq before conntrack sees them.  Unfortunately, conntrack does all the
>work that I want to avoid for those packets in prerouting, and
>conntrack sees them before mangle.

You can easily change this order; I guess you already noticed that if
you looked at the imq source.  But are you sure this is necessary?  I
guess your connection must be extremely fast if someone wants to DoS
you through a connection-tracking table fillup attack ...

>Perhaps even better than changing the dev to imq0 would be a way for
>netfilter rules to match on the imq mark.  Then I wouldn't have to
>worry about whether rp_filter would still work.

Changing skb->dev to imq0 would result in something like this:

  ... -> NF_HOOK(..) -> imq -> qdisc -> reinject -> continue NF_HOOK ->
  ... -> dev_queue_xmit -> qdisc -> imq -> reinject (CRASH!)

>  from http://luxik.cdi.cz/~patrick/imq/faq.html
>  4. When do packets reach the device (qdisc)?
>  [FAQ text and nf_ip_hook_priorities enum quoted above]
>
>So "after mangle" means first conntrack, then mangle, then IMQ, then ...
>It might be worth mentioning this somewhere in the doc.

Hmm, I guess my explanation is the worst way to describe this simple
fact :)  (It's supposed to mean the same thing.)

>One other thing I worry about.
>net/ipv4/netfilter/ip_queue.c contains:
>  * Packets arrive here from netfilter for queuing to userspace.
>  * All of them must be fed back via nf_reinject() or Alexey will kill Rusty.
>
>I notice the patch returning NF_QUEUE.
>Will Rusty survive if IMQ ends up dropping packets?

If you look at the imq source you find an imq_skb_destructor; I
thought about adding a comment that it's meant to save Rusty's life.
If skbs are freed inside qdiscs, kfree_skb will call the destructor,
which will do the necessary things to protect Rusty :)

Bye,
Patrick
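[Editor's note: Patrick's --todev answer, as a concrete sketch.  The
interface names and the 0/1 device numbering here are illustrative
assumptions, not taken from the IMQ page; the IMQ iptables target
patch must be installed for -j IMQ to exist.]

```shell
# With two imq devices loaded, --todev selects which device steals
# the packet once the mangle table has been traversed.
modprobe imq numdevs=2
iptables -t mangle -A PREROUTING  -i ppp0 -j IMQ --todev 0   # ingress -> imq0
iptables -t mangle -A POSTROUTING -o eth0 -j IMQ --todev 1   # egress  -> imq1
ip link set imq0 up
ip link set imq1 up
```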
Patrick McHardy writes:

> >We're adding an htb as the qdisc for a child class of htb?  Why?
> >Isn't that just wasting time?  Can't all the 10: stuff be done with
> >1: instead?
>
> The root qdisc is used for delay simulation, 10:0 is the "real" qdisc
> ( http://luxik.cdi.cz/~devik/qos/htb/manual/userg.htm#prio )

So I was right, just to waste time.  That was not part of the spec, as
I recall.  So I suggest it be removed from the example.

> If more than one imq device is used you specify the one which should
> get the packet with --todev argument to IMQ target.

Be sure to put that in the doc.  I didn't see it there.
I suppose the default is imq0?

> skb->dev doesn't get changed if that's what you mean ..

Ok, important to know that.  I gather there is currently no way to
read the imq mark from netfilter.

> >When the packet is eventually dequeued (if not dropped), where does
> >it go?  I'm hoping it goes to the beginning of pre-routing so we
> >can apply conntrack/nat/mangle rules to it with -i imq0.
>
> No it doesn't.  I think it doesn't make any sense to use any kind of
> iptables rules on packets passing imq because all of them come
> from/go to real devices which you can use in your rules.

But if you could read the imq mark then it would make a lot of sense.
These two things in combination would allow me to do what I want
without changing the code.  As it is, it looks like I need a local
variant of IMQ that runs before conntrack.  (On the other hand, this
is probably the more efficient solution anyhow.)

> >I suspect this is not the case, since I see in the patch code
> >  nf_reinject(skb, info, NF_ACCEPT)
> >[...]
> >but what's "this hook"?  Is it the imq hook or pre_routing?
>
> It's the imq hook.  From net/core/netfilter.c:
>   nf_reinject(...)

As I thought, there's no convenient way for you to do what I want.

> You can easily change this order; I guess you already noticed that
> if you looked at the imq source.

Right.  But this is not a change that everyone would want.

> But are you sure this is necessary?  I guess your connection must be
> extremely fast if someone wants to DoS you through a
> connection-tracking table fillup attack ...

My idea of extremely fast has changed recently.  Maybe it's a bit
ahead of yours.  First, I'm interested in protecting against attacks
from inside the firewall, and these are typically connected at
100Mbit.  Is that fast enough?  Next, I've been playing with gigabit
cards.  Finally, I visited Sprint a few weeks ago and they're not
interested in anything as slow as one gigabit.  Although, for a
firewall, I admit that seems fast enough for the time being.

> Changing skb->dev to imq0 would result in something like this:
> ... -> NF_HOOK(..) -> imq -> qdisc -> reinject -> continue NF_HOOK ->
> ... -> dev_queue_xmit -> qdisc -> imq -> reinject (CRASH!)

If you mean it could result in infinite loops, yes, but this is not
the first invention of infinite loops.  If your rules do the right
things then the loops can also be avoided.  Besides, that requires my
other request, that the reinject go back to the beginning of the
prerouting hook.  Without that, it was completely plausible that the
skb dev could have been changed.  But I'm not complaining.  I just
wanted to know.

> If you look at the imq source you find an imq_skb_destructor; I
> thought about adding a comment that it's meant to save Rusty's life.
> If skbs are freed inside qdiscs, kfree_skb will call the destructor,
> which will do the necessary things to protect Rusty :)

Ok, I wouldn't want to contribute to his early demise.  This tends to
confirm my first guess, which was that the important thing here was to
free skbs when they are no longer in use.  I guess user mode can't
free them, but perhaps the better solution would have been to free
them before a copy is sent to user space and then recreate them if the
copy ever came back.  But I digress...

Thanks for all the answers.
Don Cohen wrote:

>Patrick McHardy writes:
> > The root qdisc is used for delay simulation, 10:0 is the "real" qdisc
> > ( http://luxik.cdi.cz/~devik/qos/htb/manual/userg.htm#prio )
>
>So I was right, just to waste time.  That was not part of the spec, as
>I recall.  So I suggest it be removed from the example.

I don't know if it's really necessary, but as imq is a software device
it probably is .. maybe devik can answer this ...

> > If more than one imq device is used you specify the one which should
> > get the packet with --todev argument to IMQ target.
>
>Be sure to put that in the doc.  I didn't see it there.
>I suppose the default is imq0?

yes ..

> > skb->dev doesn't get changed if that's what you mean ..
>
>Ok, important to know that.  I gather there is currently no way to
>read the imq mark from netfilter.

Currently not, but if you need something like this, just change the
mark match to match skb->imq_flags instead of skb->nfmark ... you
should then look at include/linux/imq.h to see the meaning of the
different bits.

> > No it doesn't.  I think it doesn't make any sense to use any kind of
> > iptables rules on packets passing imq because all of them come
> > from/go to real devices which you can use in your rules.
>
>But if you could read the imq mark then it would make a lot of sense.
>These two things in combination would allow me to do what I want
>without changing the code.  As it is, it looks like I need a local
>variant of IMQ that runs before conntrack.  (On the other hand, this
>is probably the more efficient solution anyhow.)

More efficient maybe, but you will lose the possibility to only donate
bandwidth to established and assured connections by using the state
match ..

>[...]
>
>Ok, I wouldn't want to contribute to his early demise.  This tends to
>confirm my first guess, which was that the important thing here was to
>free skbs when they are no longer in use.  I guess user mode can't
>free them, but perhaps the better solution would have been to free
>them before a copy is sent to user space and then recreate them if
>the copy ever came back.  But I digress...

Userspace??  imq never sends anything to userspace, but if it really
did then you're right, userspace can't free skbs.  Also, the
destructor doesn't free them; it releases references held on the real
devices, which are taken beforehand by netfilter so that the real
device doesn't vanish while the packet is out of netfilter's control ..

>Thanks for all the answers.

You're welcome,
Patrick