Willem de Bruijn
2018-Dec-20 14:04 UTC
4.20-rc6: WARNING: CPU: 30 PID: 197360 at net/core/flow_dissector.c:764 __skb_flow_dissect
On Thu, Dec 20, 2018 at 6:15 AM Ido Schimmel <idosch at idosch.org> wrote:
>
> +Willem
>
> On Thu, Dec 20, 2018 at 08:45:40AM +0100, Christian Borntraeger wrote:
> > Folks,
> >
> > I got this warning today. I cant tell when and why this happened, so I do not know yet how to reproduce.
> > Maybe someone has a quick idea.
> >
> > [85109.572032] WARNING: CPU: 30 PID: 197360 at net/core/flow_dissector.c:764 __skb_flow_dissect+0x1f0/0x1318
>
> I managed to trigger this warning as well the other day, but from a
> different call path:
>
> [280155.348610] fib_multipath_hash+0x28c/0x2d0
> [280155.348613] ? fib_multipath_hash+0x28c/0x2d0
> [280155.348619] fib_select_path+0x241/0x32f
> [280155.348622] ? __fib_lookup+0x6a/0xb0
> [280155.348626] ip_route_output_key_hash_rcu+0x650/0xa30
> [280155.348631] ? __alloc_skb+0x9b/0x1d0
> [280155.348634] inet_rtm_getroute+0x3f7/0xb80

inet_rtm_getroute builds a new packet with inet_rtm_getroute_build_skb
here without dev or sk.

> Problem is the synthesized skb for output route resolution does not have
> skb->dev or skb->sk set. When a multipath route is hit and
> net.ipv4.fib_multipath_hash_policy is set the flow dissector is called
> with this skb and the warning is triggered.
>
> I plan to fix it by setting skb->dev to net->loopback_dev.

The device can be chosen based on iif in inet_rtm_getroute? A first
thought, I don't know this code very well. Let me know if you want me
to take a stab at that patch. IPv6 probably will need the same.

> I assume we
> want to keep this warning to prevent call paths which will otherwise
> silently fallback to standard flow dissector instead of the BPF one.

Indeed, the warning is there to sniff out paths that do not follow
what I thought was an invariant. If there are too many exceptions, I
may have to revisit that assumption. But for now, let's see if we can
address these edge cases.

> I'm not familiar with tap code, so someone else will need to patch this
> case, but it looks like:
>
> tap_sendmsg()
> tap_get_user()
> skb_probe_transport_header()
> skb_flow_dissect_flow_keys_basic()
> __skb_flow_dissect()
>
> skb->dev is only set later in the code.

tap_get_user uses sock_alloc_send_pskb (through tap_alloc_skb) to
allocate the skb. So skb->sk should be set at the time of
skb_probe_transport_header. I'm not sure how this path triggers the
warning.
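
To make the invariant discussed above concrete, here is a small userspace
model of the check. It is only a sketch, not the kernel code: every name in
it (model_net, model_dev, model_sock, model_skb, model_resolve_net,
model_set_loopback) is made up for illustration. It assumes the dissector
finds the network namespace through skb->dev first, then skb->sk, and warns
once when neither is set, in which case a caller would fall back to the
standard dissector.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's net, net_device, sock and sk_buff. */
struct model_net  { const char *name; };
struct model_dev  { struct model_net *net; };
struct model_sock { struct model_net *net; };

struct model_skb {
	struct model_dev  *dev;	/* may be NULL, e.g. a synthesized skb */
	struct model_sock *sk;	/* may be NULL */
};

/* Emulates "warn once" semantics: set the first time the warning fires. */
static int model_warned;

/*
 * Return the namespace the skb belongs to, or NULL when it cannot be
 * determined. NULL stands for "no BPF program can be looked up".
 */
static struct model_net *model_resolve_net(const struct model_skb *skb)
{
	if (skb->dev)
		return skb->dev->net;
	if (skb->sk)
		return skb->sk->net;

	if (!model_warned) {
		model_warned = 1;
		fprintf(stderr, "WARNING: skb has neither dev nor sk\n");
	}
	return NULL;
}

/* Models the proposed fix: point the synthesized skb at a loopback device. */
static void model_set_loopback(struct model_skb *skb, struct model_dev *loopback)
{
	skb->dev = loopback;
}
```

In this model, an skb with both pointers NULL trips the warning, and the same
skb resolves to a namespace once model_set_loopback() has been called on it.
An skb with only sk set, which is what the tap path is expected to have,
resolves without a warning.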