thr3ads.net - LARTC - tc: u32 match in nexthdr not working? [Dec 2001]

Lutz Pressler

2001-Dec-13 19:46 UTC

tc: u32 match in nexthdr not working?

Hello,

It seems that filtering on nexthdr (TCP/UDP) content, especially
src or dst port, is not working.

The following has no effect on 2.4.16 or older (even 2.2) kernels:

# tc filter add dev eth0 parent ffff: protocol ip prio 50 u32 \
	match tcp dst 3128 0xffff \
	police rate 40kbit burst 10k drop flowid :1

Even if
# tc filter ls dev eth0 parent ffff:
filter protocol ip pref 50 u32
filter protocol ip pref 50 u32 fh 800: ht divisor 1
filter protocol ip pref 50 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid :1
police 4 action drop rate 40Kbit burst 10Kb mtu 2Kb
  match 00000c38/0000ffff at nexthdr+0

looks reasonable, TCP connections to port 3128 are not policed.
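
The key in that listing can be decoded by hand to confirm that it really does encode port 3128 in the destination-port position. A minimal Python sketch, assuming the value/mask pair is printed as a 32-bit word in network byte order; the helper name decode_u32_key is made up for illustration and is not part of tc:

```python
import struct

def decode_u32_key(value_hex: str, mask_hex: str) -> dict:
    """Split a u32 value/mask pair into the two 16-bit halves of the
    32-bit word it is compared against (network byte order)."""
    value = int(value_hex, 16)
    mask = int(mask_hex, 16)
    # High half = first two bytes on the wire, low half = next two bytes.
    hi_val, lo_val = struct.unpack("!HH", struct.pack("!I", value))
    hi_mask, lo_mask = struct.unpack("!HH", struct.pack("!I", mask))
    return {
        "bytes_0_1": (hi_val, hi_mask),  # TCP/UDP source port position
        "bytes_2_3": (lo_val, lo_mask),  # TCP/UDP destination port position
    }

key = decode_u32_key("00000c38", "0000ffff")
print(key)  # bytes 2-3 carry (3128, 65535): 0x0c38 is 3128
```

So the key itself is fine; what matters is where "nexthdr+0" ends up pointing.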

If I use "match ip dst <ip-address>" instead, the policing
works.

Port based matching isn't working for outgoing shapers either, as
can be seen with the statistics functions.

Any idea? Anybody with port based (etc.) filtering actually working?

Regards,
  Lutz

-- 
  _              |  Lutz Pressler          |  Tel: ++49-551-3700002
 |_     |\ |     |  Service Network GmbH   |  FAX: ++49-551-3700009
 ._|ER  | \|ET   |  Bahnhofsallee 1b       |   mailto:lp@SerNet.DE
Service Network  |  D-37081 Goettingen     |  http://www.SerNet.DE/

bert hubert

2001-Dec-14 00:13 UTC

Re: tc: u32 match in nexthdr not working?

On Thu, Dec 13, 2001 at 08:46:57PM +0100, Lutz Pressler wrote:
> The following has no effect on 2.4.16 or older (even 2.2) kernels:
> 
> # tc filter add dev eth0 parent ffff: protocol ip prio 50 u32 match tcp
> dst 3128 0xffff police rate 40kbit burst 10k drop flowid :1
Double check what this means! This limits speed of data *coming in to* your
proxy from a client (browser). That is not a lot - most data will flow the
other way, and will indeed not be matched.

Data being received BY your proxy from the internet is not matched by this
filter.
> Even if
> # tc filter ls dev eth0 parent ffff:
> filter protocol ip pref 50 u32
> filter protocol ip pref 50 u32 fh 800: ht divisor 1
> filter protocol ip pref 50 u32 fh 800::800 order 2048 key ht 800 bkt 0
> flowid :1 police 4 action drop rate 40Kbit burst 10Kb mtu 2Kb
>   match 00000c38/0000ffff at nexthdr+0
You supply a lot of redundant information. I'm not sure what the '4' means
in this rule.
> looks reasonable, TCP connections to port 3128 are not policed.
> 
> If I use "match ip dst <ip-address>" instead, the policing
> works.
Your proxy does not necessarily download FROM port 3128!

Regards,

bert

-- 
http://www.PowerDNS.com          Versatile DNS Software & Services
Trilab                                 The Technology People
Netherlabs BV / Rent-a-Nerd.nl           - Nerd Available -
'SYN! .. SYN|ACK! .. ACK!' - the mating call of the internet

Lutz Pressler

2001-Dec-14 07:36 UTC

Re: tc: u32 match in nexthdr not working?

On Fri, 14 Dec 2001, bert hubert wrote:
> On Thu, Dec 13, 2001 at 08:46:57PM +0100, Lutz Pressler wrote:
>
> > The following has no effect on 2.4.16 or older (even 2.2) kernels:
> >
> > # tc filter add dev eth0 parent ffff: protocol ip prio 50 u32 match tcp
> > dst 3128 0xffff police rate 40kbit burst 10k drop flowid :1
>
> Double check what this means! This limits speed of data *coming in to* your
> proxy from a client (browser). That is not a lot - most data will flow the
> other way, and will indeed not be matched.
>
Sorry, that was a typo (I forgot that I had tried the other way too, to be
complete, before doing the cut&paste). Of course "src 3128"!
> Data being received BY your proxy from the internet is not matched by this
> filter.
>
> > Even if
> > # tc filter ls dev eth0 parent ffff:
> > filter protocol ip pref 50 u32
> > filter protocol ip pref 50 u32 fh 800: ht divisor 1
> > filter protocol ip pref 50 u32 fh 800::800 order 2048 key ht 800 bkt 0
> > flowid :1 police 4 action drop rate 40Kbit burst 10Kb mtu 2Kb
> >   match 00000c38/0000ffff at nexthdr+0
and "match 0c380000/ffff0000" here.
>
> You supply a lot of redundant information. I'm not sure what the '4' means
> in this rule.
Neither do I, I haven't set it explicitly. It seems to increase with every
change in policing rules.
>
> > looks reasonable, TCP connections to port 3128 are not policed.
> >
> > If I use "match ip dst <ip-address>" instead, the policing works.
>
> Your proxy does not necessarily download FROM port 3128!
I did that - as a test, the real situation is not about 3128 - on the client,
not the proxy.

Lutz

Lutz Pressler

2001-Dec-14 12:10 UTC

Re: tc: u32 match in nexthdr not working?

Hi again,

ok, did some tests:

match ip sport 3128  does work (as does the more correct
match ip sport 3128 0xffff match ip protocol 6 0xff  to only consider
TCP) - match tcp src 3128 does not.

The difference as shown by tc filter show dev eth0 parent ffff:
is that ip sport -> "match 0c380000/ffff0000 at 20"
        tcp src ->  "match 0c380000/ffff0000 at nexthdr+0".

This confirms my assumption that nexthdr is broken.
at nexthdr+0 _should_ work with IP options present, "at 20" not,
correct?

Lutz

Julian Anastasov

2001-Dec-14 12:56 UTC

Re: tc: u32 match in nexthdr not working?

Hello,

On Fri, 14 Dec 2001, Lutz Pressler wrote:
> Hi again,
>
> ok, did some tests:
>
> match ip sport 3128  does work (as does the more correct
> match ip sport 3128 0xffff match ip protocol 6 0xff  to only consider
> TCP) - match tcp src 3128 does not.
>
> The difference as shown by tc filter show dev eth0 parent ffff:
> is that ip sport -> "match 0c380000/ffff0000 at 20"
>         tcp src ->  "match 0c380000/ffff0000 at nexthdr+0".
>
> This confirms my assumption that nexthdr is broken.
	It confirms only that nexthdr does not work with your
settings. Nothing more. Read iproute2/README.iproute2+tc carefully,
particularly the last filter in that file. I agree, it is not
documented very well. To use nexthdr you must use "offset" with a
hash table. U32 is universal (read line #2 in cls_u32.c), it does
not know that you are using IPv4, so the value 20 can not be
guessed. For this, "offset" is used to extract the iphdr->ihl
value and to use it as a base for all nexthdr+ relative offsets.
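
The arithmetic behind such an "offset" rule can be checked in a few lines. A minimal Python sketch, assuming the mask 0x0f00 and shift 6 that appear later in this thread, applied to the 16-bit word at byte 0 of the IPv4 header (version/IHL byte followed by TOS); the function name nexthdr_base is hypothetical:

```python
def nexthdr_base(ip_header: bytes, mask: int = 0x0F00, shift: int = 6) -> int:
    """Read the big-endian 16-bit word at byte 0, keep the IHL nibble,
    and shift right so the result is the header length in bytes."""
    word = int.from_bytes(ip_header[0:2], "big")
    # IHL sits in bits 8..11 of this word; >> 8 would give IHL in 32-bit
    # words, so >> 6 gives IHL * 4, i.e. bytes.
    return (word & mask) >> shift

plain = bytes([0x45, 0x00]) + bytes(18)      # version 4, IHL 5, no options
with_opts = bytes([0x46, 0x00]) + bytes(22)  # version 4, IHL 6, 4 option bytes

print(nexthdr_base(plain))      # 20
print(nexthdr_base(with_opts))  # 24
```

Without such a rule nothing tells u32 how long the IP header is, which fits the behaviour reported at the start of the thread.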
> at nexthdr+0 _should_ work with IP options present, "at 20" not,
> correct?
>
> Lutz
Regards

--
Julian Anastasov <ja@ssi.bg>

bert hubert

2001-Dec-14 12:58 UTC

Re: tc: u32 match in nexthdr not working?

On Fri, Dec 14, 2001 at 02:56:57PM +0200, Julian Anastasov wrote:
> > The difference as shown by tc filter show dev eth0 parent ffff:
> > is that ip sport -> "match 0c380000/ffff0000 at 20"
> >         tcp src ->  "match 0c380000/ffff0000 at nexthdr+0".
> not know that you are using IPv4, so the value 20 can not be
> guessed. For this, "offset" is used to extract the iphdr->ihl
> value and to use it as a base for all nexthdr+ relative offsets.
Damn, that's broken. Or at least, extremely non-obvious and hard to get
working. Overly universal comes to mind. So 'ip sport' would stop matching
packets with ip options?

Thanks for enlightening us - will update the HOWTO to this effect.

Regards,

bert

Julian Anastasov

2001-Dec-14 13:15 UTC

Re: tc: u32 match in nexthdr not working?

Hello,

On Fri, 14 Dec 2001, bert hubert wrote:
>
> > not know that you are using IPv4, so the value 20 can not be
> > guessed. For this, "offset" is used to extract the iphdr->ihl
> > value and to use it as a base for all nexthdr+ relative offsets.
>
> Damn, that's broken. Or at least, extremely non-obvious and hard to get
> working. Overly universal comes to mind. So 'ip sport' would stop matching
> packets with ip options?
	No, ihl includes the options. Everything works perfectly.
It is a bug to use sport and dport if ip options are present. There
are tcp dst and tcp src for example. Same for udp. For icmp there
are icmp type and icmp code. They all use the same base pointer.
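
The difference between the two kinds of match can be illustrated on a synthetic packet. A rough Python sketch, assuming (as the filter listings earlier in the thread show) that "ip sport" compares at absolute offset 20 while "tcp src" compares relative to a base derived from ihl; the packet builder and both reader functions are hypothetical helpers, not tc code:

```python
import struct

def build_packet(ihl_words: int, sport: int, dport: int) -> bytes:
    """Zero-filled IPv4 header of ihl_words * 4 bytes (only version/IHL
    and protocol set), followed by the TCP source and destination ports."""
    header = bytearray(ihl_words * 4)
    header[0] = (4 << 4) | ihl_words  # version 4, IHL in 32-bit words
    header[9] = 6                     # protocol = TCP
    if ihl_words > 5:
        header[20:] = b"\x01" * (len(header) - 20)  # NOP options as filler
    return bytes(header) + struct.pack("!HH", sport, dport)

def sport_at_fixed_20(pkt: bytes) -> int:
    # Fixed-offset view: always look at byte 20.
    return struct.unpack_from("!H", pkt, 20)[0]

def sport_at_nexthdr(pkt: bytes) -> int:
    # IHL-relative view: the base is the real header length.
    base = (pkt[0] & 0x0F) * 4
    return struct.unpack_from("!H", pkt, base)[0]

no_opts = build_packet(5, 3128, 80)
opts = build_packet(6, 3128, 80)
print(sport_at_fixed_20(no_opts), sport_at_nexthdr(no_opts))  # 3128 3128
print(sport_at_fixed_20(opts), sport_at_nexthdr(opts))        # 257 3128
```

With options present, the fixed offset lands on option bytes (here two NOPs, 0x0101), so a port comparison there is meaningless.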
> Regards,
>
> bert
Regards

--
Julian Anastasov <ja@ssi.bg>

bert hubert

2001-Dec-14 13:32 UTC

Re: tc: u32 match in nexthdr not working?

On Fri, Dec 14, 2001 at 03:15:43PM +0200, Julian Anastasov wrote:

> > Damn, that's broken. Or at least, extremely non-obvious and hard to get
> > working. Overly universal comes to mind. So 'ip sport' would stop matching
> > packets with ip options?
> 
> 	No, ihl includes the options. Everything works perfectly.
> It is a bug to use sport and dport if ip options are present. There
Geh. Or an 'undocumented feature'. Because you don't know what kind of
packets you will send or forward, using 'ip sport' is always a bug.
> are tcp dst and tcp src for example. Same for udp. For icmp there
> are icmp type and icmp code. They all use the same base pointer.
But tcp src only works when operating in a hashed filter? Which is
not often the case. 

I tried this:
tc filter add dev eth0 parent 1:0 prio 5 u32  \
	match ip nofrag \
	offset mask 0x0F00 shift 6 \
	match tcp src 22 0xffff classid 1:2

But it doesn't work, gives:
RTNETLINK answers: Invalid argument

Regards,

bert

Julian Anastasov

2001-Dec-14 13:54 UTC

Re: tc: u32 match in nexthdr not working?

Hello,

On Fri, 14 Dec 2001, bert hubert wrote:
> > 	No, ihl includes the options. Everything works perfectly.
> > It is a bug to use sport and dport if ip options are present. There
>
> Geh. Or an 'undocumented feature'. Because you don't know what kind of
> packets you will send or forward, using 'ip sport' is always a bug.
	Yes
> > are tcp dst and tcp src for example. Same for udp. For icmp there
> > are icmp type and icmp code. They all use the same base pointer.
>
> But tcp src only works when operating in a hashed filter? Which is
> not often the case.
	Right. But only then we can match packets with options.
> I tried this:
> tc filter add dev eth0 parent 1:0 prio 5 u32  \
> 	match ip nofrag \
> 	offset mask 0x0F00 shift 6 \
> 	match tcp src 22 0xffff classid 1:2
>
> But it doesn't work, gives:
	Of course
> RTNETLINK answers: Invalid argument
I didn't try it, but something like this:

F="tc filter add dev eth0 parent 1:0 protocol ip prio 5"
$F handle 1: u32 divisor 1
$F u32 ht 1: match tcp src 22 0xFFFF match ip protocol 6 0xFF match ip firstfrag \
	flowid 1:2
$F u32 ht 800:: match u8 0 0 offset at 0 mask 0x0f00 shift 6 link 1:

Using ip nofrag is another bug :) Small? You miss traffic.
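
One reading of that remark is that "nofrag" also rejects the first fragment of a fragmented packet, even though that fragment still carries the TCP header. A small Python sketch of the distinction; the masks 0x3FFF (nofrag) and 0x1FFF (firstfrag) on the 16-bit word at byte 6 are an assumption recalled from iproute2's u32 parser and are not stated anywhere in this thread:

```python
def frag_word(more_fragments: bool, frag_offset: int,
              dont_fragment: bool = False) -> int:
    """16-bit flags/fragment-offset word at byte 6 of the IPv4 header."""
    return (dont_fragment << 14) | (more_fragments << 13) | (frag_offset & 0x1FFF)

# Assumed masks, see the note above.
def matches_nofrag(word: int) -> bool:
    return (word & 0x3FFF) == 0   # MF clear and fragment offset 0

def matches_firstfrag(word: int) -> bool:
    return (word & 0x1FFF) == 0   # fragment offset 0, MF ignored

unfragmented = frag_word(False, 0, dont_fragment=True)
first_fragment = frag_word(True, 0)
later_fragment = frag_word(False, 185)

for name, w in [("unfragmented", unfragmented),
                ("first fragment", first_fragment),
                ("later fragment", later_fragment)]:
    print(name, matches_nofrag(w), matches_firstfrag(w))
```

Under these assumptions only "firstfrag" accepts both the unfragmented packet and the first fragment, which are the two cases where a TCP header is present at nexthdr+0.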
> Regards,
>
> bert
Regards

--
Julian Anastasov <ja@ssi.bg>

Henrik Nordstrom

2001-Dec-14 15:16 UTC

Re: tc: u32 match in nexthdr not working?

On Friday 14 December 2001 14.15, Julian Anastasov wrote:
>         No, ihl includes the options. Everything works perfectly.
> It is a bug to use sport and dport if ip options are present. There
> are tcp dst and tcp src for example. Same for udp. For icmp there
> are icmp type and icmp code. They all use the same base pointer.
Which only works if you have chained the filter rules using a hash table,
where the hash table has an IP offset rule.

Regards
Henrik

Michael T. Babcock

2001-Dec-14 19:59 UTC

Re: tc: u32 match in nexthdr not working?

On Fri, Dec 14, 2001 at 03:54:43PM +0200, Julian Anastasov wrote:
> I didn't try it, but something like this:
> 
> F="tc filter add dev eth0 parent 1:0 protocol ip prio 5"
> $F handle 1: u32 divisor 1
> $F u32 ht 1: match tcp src 22 0xFFFF match ip protocol 6 0xFF match ip
> firstfrag flowid 1:2
> $F u32 ht 800:: match u8 0 0 offset at 0 mask 0x0f00 shift 6 link 1:
Thanks for that example; a few more U32 filter examples in the HOWTO
would be welcome, I'm sure ... ;-~
-- 
Michael T. Babcock
CTO, FibreSpeed Ltd.     (Hosting, Security, Consultation, Database, etc)
http://www.fibrespeed.net/~mbabcock/

bert hubert

2001-Dec-14 23:00 UTC

Re: tc: u32 match in nexthdr not working?

On Fri, Dec 14, 2001 at 02:59:15PM -0500, Michael T. Babcock wrote:
> On Fri, Dec 14, 2001 at 03:54:43PM +0200, Julian Anastasov wrote:
> > I didn't try it, but something like this:
> > 
> > F="tc filter add dev eth0 parent 1:0 protocol ip prio 5"
> > $F handle 1: u32 divisor 1
> > $F u32 ht 1: match tcp src 22 0xFFFF match ip protocol 6 0xFF match ip
> > firstfrag flowid 1:2
> > $F u32 ht 800:: match u8 0 0 offset at 0 mask 0x0f00 shift 6 link 1:
> 
> Thanks for that example; a few more U32 filter examples in the HOWTO
> would be welcome, I'm sure ... ;-~
I'm always happy to receive tested examples. That is what takes the most
time - I actually try to test everything these days or I need to be *sure*
that somebody tested it.

In the past a lot of crap was merged which later turned out not to work :-(
> Michael T. Babcock
> CTO, FibreSpeed Ltd.     (Hosting, Security, Consultation, Database, etc)
> http://www.fibrespeed.net/~mbabcock/
Oh, I've been exploring how the 'virtual clock' works in the Linux CBQ
implementation, it turns out that you can misconfigure it quite badly and
still get *statistically* accurate shaping. I'm still figuring out the
effects at short timescales of misconfiguring bandwidth.

Regards,

bert hubert

Michael T. Babcock

2001-Dec-14 23:07 UTC

Re: CBQ virtual clock

On Sat, Dec 15, 2001 at 12:00:18AM +0100, bert hubert wrote:
> Oh, I've been exploring how the 'virtual clock' works in the Linux CBQ
> implementation, it turns out that you can misconfigure it quite badly and
> still get *statistically* accurate shaping. I'm still figuring out the
> effects at short timescales of misconfiguring bandwidth.
Please post your observations as you come across them so we can also test
them and see what's going on faster together.

bert hubert

2001-Dec-14 23:49 UTC

Re: CBQ virtual clock

On Fri, Dec 14, 2001 at 06:07:28PM -0500, Michael T. Babcock wrote:
> On Sat, Dec 15, 2001 at 12:00:18AM +0100, bert hubert wrote:
> > Oh, I've been exploring how the 'virtual clock' works in the Linux CBQ
> > implementation, it turns out that you can misconfigure it quite badly and
> > still get *statistically* accurate shaping. I'm still figuring out the
> > effects at short timescales of misconfiguring bandwidth.
> 
> Please post your observations as you come across them so we can also test
> them and see what's going on faster together.
The theory is like this. CBQ wants to know the idle time of the interface,
which ideally would work like this:

Enqueue a packet
Enqueue a packet
Enqueue a packet
	Packet is dequeued to the interface
	interface busy sending packet
	interface notifies us that the packet was sent
	CBQ notes how much time has passed, and uses this for avgidle
		calculations
	Packet is dequeued to the interface
	process repeats.

Ok - now, this is not how it works in Unix, or at least, in Linux. In fact
it goes like this:

Enqueue a packet to the queue
	Dequeue all packets from the queue, give to the network interface
Enqueue a packet to the queue
	Dequeue all packets from the queue, give to the network interface
Enqueue a packet to the queue
	Dequeue all packets from the queue, give to the network interface
...
Enqueue a packet to the queue
	Network interface is fed up, will notify us when there is room again
	......
Enqueue a packet to the queue
Enqueue a packet to the queue
Enqueue a packet to the queue
Enqueue a packet to the queue
	Network interface tells us that there is room
	dequeue
	dequeue
	dequeue
	dequeue

Now - we don't really *know* when the network interface was done
sending. So, what 'Alexey CBQ' does is to guesstimate how long the interface
would be busy, and move the clock ahead to the point where the interface
would be idle after sending a packet.

So it works like this:

Enqueue a packet
	Dequeue a packet, store how big it was
Enqueue a packet
	Calculate how long the previous packet will take to transmit.
	Calculate how much time has actually passed
	Move 'now' ahead by the maximum of the above two.
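
The update rule in the steps above can be written down as a toy model. This is a Python sketch of the description in this message only, not of the kernel code; the class name, its fields, and the 1 Mbit/s link speed are all made up for illustration:

```python
class VirtualClock:
    """Toy model: on each dequeue, advance 'now' by the larger of
    (a) the estimated transmit time of the previous packet and
    (b) the real time that has passed since the previous dequeue."""

    def __init__(self, link_bps: float):
        self.link_bps = link_bps   # configured link bandwidth, bits/s
        self.now = 0.0             # virtual time, seconds
        self.last_real = 0.0       # real time of the previous dequeue
        self.last_size_bits = 0    # size of the previously dequeued packet

    def on_dequeue(self, real_time: float, size_bits: int) -> float:
        tx_time = self.last_size_bits / self.link_bps
        elapsed = real_time - self.last_real
        self.now += max(tx_time, elapsed)
        self.last_real = real_time
        self.last_size_bits = size_bits
        return self.now

clock = VirtualClock(link_bps=1_000_000)  # hypothetical 1 Mbit/s link
t0 = clock.on_dequeue(0.0, 10_000)    # first packet, nothing to account for
t1 = clock.on_dequeue(0.001, 10_000)  # asked again after 1 ms: tx estimate wins
t2 = clock.on_dequeue(0.5, 10_000)    # long pause: real elapsed time wins
print(t0, t1, t2)
```

In the second call only 1 ms of real time has passed, but the previous packet needs 10 ms on the wire, so the virtual clock jumps ahead by the estimate instead.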

OK, what does this mean?

Sending packets would look like this:

|
|
|-----+            +-+     +--- etc
|     |            | |     |
|     |            | |     |
|     |------------| |-----|
+-------------------------------------------------------
1     2            3 4     5

At moment 1, a packet starts to be sent. At point 2, the packet is done
sending. CBQ knows that the packet was 10000 bits long. Say we want to shape
to 100kbit/s, then the time that should elapse between point 3, where we
start sending the next packet, and point 1, where we started sending the
first one, is 0.1 second.
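
The 0.1 second figure is simply packet size divided by shaping rate. As a quick check in Python (the function name is made up for illustration):

```python
def min_spacing_seconds(packet_bits: int, shape_rate_bps: float) -> float:
    """Time between the start of one packet and the start of the next
    when shaping to shape_rate_bps."""
    return packet_bits / shape_rate_bps

spacing = min_spacing_seconds(10_000, 100_000)  # 10000 bits at 100 kbit/s
print(spacing)  # 0.1
```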

In an ideal world, the next dequeue request would come right when the network
device is done sending, at moment 2. But it doesn't; it comes immediately.

However, the kernel can *calculate* when 2 should occur by dividing the size
of the packet by the actual bandwidth of the device. CBQ then shifts the
virtual time to point 2, and bases all calculations on that.

This is why CBQ needs to know the bandwidth of your link. Now, if
you set the bandwidth of your link higher than it is in reality, CBQ will
mess up its avgidle calculations.
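
A rough illustration of how the estimate drifts, in Python. The link speeds below are hypothetical (a real 1 Mbit/s link configured as 10 Mbit/s); only the 10000-bit packet and the 0.1 s spacing come from the example above, and the function is a simplification, not CBQ's actual avgidle formula:

```python
def believed_idle(packet_bits: int, configured_link_bps: float,
                  interval_s: float) -> float:
    """Idle time per interval as computed from the configured bandwidth:
    the interval minus the estimated time on the wire."""
    busy = packet_bits / configured_link_bps
    return interval_s - busy

packet_bits = 10_000
interval_s = 0.1                    # spacing for 100 kbit/s shaping

real = believed_idle(packet_bits, 1_000_000, interval_s)    # true link speed
wrong = believed_idle(packet_bits, 10_000_000, interval_s)  # overstated speed
print(round(real, 6), round(wrong, 6))
```

With the overstated bandwidth, point 2 is placed too early, so the link is assumed to be idle for longer than it really is.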

It appears that 'overlimit' is then still shaped at the proper rate, but
link sharing may be done wrong. This is what I'm investigating now.

The above may not make much sense, but perhaps you can make something of it
:-)

Regards,

bert

Michael T. Babcock

2001-Dec-15 03:55 UTC

Re: CBQ virtual clock

On Sat, Dec 15, 2001 at 12:49:42AM +0100, bert hubert wrote:
> Enqueue a packet to the queue
> Enqueue a packet to the queue
> Enqueue a packet to the queue
> 	Network interface tells us that there is room
> 	dequeue
> 	dequeue
> [...] 
> The above may not make much sense, but perhaps you can make something of it
What's interesting is that on a couple of occasions I have seen a situation
where a "ping -n {someone over eth1}", where eth1 is a CBQ'd interface,
will cause something like:

64 bytes from 216.168.105.33: icmp_seq=0 ttl=255 time=3.0 s
64 bytes from 216.168.105.33: icmp_seq=1 ttl=255 time=2.0 s
64 bytes from 216.168.105.33: icmp_seq=2 ttl=255 time=1.0 s
64 bytes from 216.168.105.33: icmp_seq=3 ttl=255 time=0.2 ms

... after a 3 second delay.  I wonder if I could reproduce this and
see if it's related to some setting in CBQ.
