Stephen Hemminger
2017-May-11 16:01 UTC
[Bridge] [PATCH v2 1/2] net: Added mtu parameter to dev_forward_skb calls
On Thu, 11 May 2017 15:46:27 +0200
Fredrik Markstrom <fredrik.markstrom at gmail.com> wrote:

> From: Fredrik Markström <fredrik.markstrom at gmail.com>
>
> is_skb_forwardable() currently checks if the packet size is <= mtu of
> the receiving interface. This is not consistent with most of the hardware
> ethernet drivers that happily receive packets larger than MTU.

Wrong.

Hardware interfaces are free to drop any packet greater than MTU (actually MTU + VLAN).
The actual limit is a function of the hardware. Some hardware can only limit by
power of 2; some can only limit frames larger than 1500; some have no limiting at all.

Any application should:
  * not expect packets larger than MTU to be received
  * not send packets larger than MTU
  * check actual receive size. IP protocols will do truncation of padded packets
Fredrik Markström
2017-May-11 19:10 UTC
[Bridge] [PATCH v2 1/2] net: Added mtu parameter to dev_forward_skb calls
On Thu, May 11, 2017 at 6:01 PM, Stephen Hemminger
<stephen at networkplumber.org> wrote:
> On Thu, 11 May 2017 15:46:27 +0200
> Fredrik Markstrom <fredrik.markstrom at gmail.com> wrote:
>
>> From: Fredrik Markström <fredrik.markstrom at gmail.com>
>>
>> is_skb_forwardable() currently checks if the packet size is <= mtu of
>> the receiving interface. This is not consistent with most of the hardware
>> ethernet drivers that happily receive packets larger than MTU.
>
> Wrong.

What is "Wrong"?

I was initially skeptical about implementing this patch, since it feels odd
to have different MTUs set on the two sides of a link. After consulting
some IP people and the RFCs I kind of changed my mind and thought I'd give
it a shot. In the RFCs I couldn't find anything that defines when a received
packet should or should not be dropped.

>
> Hardware interfaces are free to drop any packet greater than MTU (actually MTU + VLAN).
> The actual limit is a function of the hardware. Some hardware can only limit by
> power of 2; some can only limit frames larger than 1500; some have no limiting at all.

Agreed. The purpose of these patches is to be able to configure a veth
interface to mimic these different behaviors. None of the Ethernet
interfaces I have access to drop packets for being larger than the
configured MTU the way veth does.

Being able to mimic real Ethernet hardware is useful when consolidating
hardware using containers/namespaces.

In a reply to a comment from David Miller on my previous version of the
patch I attached the example below to demonstrate the case in detail.

This works with all Ethernet hardware setups I have access to:

---- 8< ------
# Host A eth2 and Host B eth0 are on the same network.

# On HOST A
% ip address add 1.2.3.4/24 dev eth2
% ip link set eth2 mtu 300 up

% # HOST B
% ip address add 1.2.3.5/24 dev eth0
% ip link set eth0 mtu 1000 up
% ping -c 1 -W 1 -s 400 1.2.3.4
PING 1.2.3.4 (1.2.3.4) 400(428) bytes of data.
408 bytes from 1.2.3.4: icmp_seq=1 ttl=64 time=1.57 ms

--- 1.2.3.4 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.573/1.573/1.573/0.000 ms
---- 8< ------

But it doesn't work with veth:

---- 8< ------
# veth0 and veth1 are a veth pair and veth1 has been moved to a separate
# network namespace.

% # NS A
% ip address add 1.2.3.4/24 dev veth0
% ip link set veth0 mtu 300 up

% # NS B
% ip address add 1.2.3.5/24 dev veth1
% ip link set veth1 mtu 1000 up
% ping -c 1 -W 1 -s 400 1.2.3.4
PING 1.2.3.4 (1.2.3.4) 400(428) bytes of data.

--- 1.2.3.4 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
---- 8< ------

--
/Fredrik