Jan Beulich
2010-Dec-03 10:52 UTC
[Xen-devel] skb_checksum_setup() placement in pv-ops vs. legacy kernel
Ian, Jeremy,

knowing pretty little about networking, it nevertheless seems to me
that the different placement of skb_checksum_setup() (in the receive
paths of pv-ops vs in various transmit paths in legacy) poses a
compatibility problem (nothing done on either side if sending from
pv-ops to legacy, and done on both ends when sending from legacy
to pv-ops). Am I overlooking something here?

Thanks, Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Ian Campbell
2010-Dec-03 11:12 UTC
[Xen-devel] Re: skb_checksum_setup() placement in pv-ops vs. legacy kernel
On Fri, 2010-12-03 at 10:52 +0000, Jan Beulich wrote:
> Ian, Jeremy,
>
> knowing pretty little about networking, it nevertheless seems to me
> that the different placement of skb_checksum_setup() (in the receive
> paths of pv-ops vs in various transmit paths in legacy) poses a
> compatibility problem (nothing done on either side if sending from
> pv-ops to legacy, and done on both ends when sending from legacy
> to pv-ops). Am I overlooking something here?

Possibly confusion due to the backwards naming convention in netback?

The pvops dom0 side calls skb_checksum_setup in net_tx_submit which
(counter-intuitively) is the function which receives the skb from the
guest and passes it up to the dom0 network stack (i.e. it handles guest
tx).

Since we call skb_checksum_setup on the ingress path all skbs in the
domain 0 network stack always have their checksum fields correctly
initialised and there is never anything to be done when transmitting
out the other side, either to another domU or to a physical
device, and therefore it doesn't matter which kernel the domU is
running.

On legacy dom0 skb_checksum_setup is called on the generic transmit
path, so skbs in the domain 0 network stack can have uninitialised
checksum fields but this is always fixed up before passing back down to
either netback (called the rx path in netback parlance) or a physical
device. This can (and has) caused trouble in the past where networking
subsystems are interested in the checksum fields before egress, e.g. we
needed to do fixup in various netfilter code paths etc.

Ian.
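The two placements described in this message can be contrasted with a toy model. The sketch below is Python rather than kernel C, and every name in it (Skb, checksum_setup, traverse_dom0) is hypothetical. It only tracks whether the checksum fields of a single skb have been set up, to show what code in the middle of the dom0 stack (netfilter, say) would observe under each placement.

```python
from dataclasses import dataclass


@dataclass
class Skb:
    """Toy stand-in for a kernel sk_buff (hypothetical, not the real struct)."""
    csum_fields_valid: bool = False


def checksum_setup(skb: Skb) -> None:
    """Stand-in for skb_checksum_setup(): mark the checksum fields as set up."""
    skb.csum_fields_valid = True


def traverse_dom0(skb: Skb, setup_on_ingress: bool) -> tuple:
    """Pass one skb from a guest through dom0 and out the other side.

    setup_on_ingress=True models the pvops placement (setup when netback
    receives the skb from the guest); False models the legacy placement
    (setup on the generic transmit path).

    Returns (valid as seen mid-stack, valid at egress).
    """
    if setup_on_ingress:
        checksum_setup(skb)            # pvops: done on the way in

    seen_mid_stack = skb.csum_fields_valid   # what e.g. netfilter would see

    if not setup_on_ingress:
        checksum_setup(skb)            # legacy: fixed up just before egress

    return seen_mid_stack, skb.csum_fields_valid
```

With setup_on_ingress=True the fields are valid both mid-stack and at egress. With setup_on_ingress=False they are valid only at egress, which is the situation that required the netfilter fixups mentioned above.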
Jan Beulich
2010-Dec-03 11:51 UTC
[Xen-devel] Re: skb_checksum_setup() placement in pv-ops vs. legacy kernel
>>> On 03.12.10 at 12:12, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> On Fri, 2010-12-03 at 10:52 +0000, Jan Beulich wrote:
>> Ian, Jeremy,
>>
>> knowing pretty little about networking, it nevertheless seems to me
>> that the different placement of skb_checksum_setup() (in the receive
>> paths of pv-ops vs in various transmit paths in legacy) poses a
>> compatibility problem (nothing done on either side if sending from
>> pv-ops to legacy, and done on both ends when sending from legacy
>> to pv-ops). Am I overlooking something here?
>
> Possibly confusion due to the backwards naming convention in netback?

No - note that I wrote it specifically this way in the original mail.

> The pvops dom0 side calls skb_checksum_setup in net_tx_submit which
> (counter-intuitively) is the function which receives the skb from the
> guest and passes it up to the dom0 network stack (i.e. it handles guest
> tx).
>
> Since we call skb_checksum_setup on the ingress path all skbs in the
> domain 0 network stack always have their checksum fields correctly
> initialised and there is never anything to be done when transmitting
> out the other side, either to another domU or to a physical
> device, and therefore it doesn't matter which kernel the domU is
> running.
>
> On legacy dom0 skb_checksum_setup is called on the generic transmit
> path, so skbs in the domain 0 network stack can have uninitialised
> checksum fields but this is always fixed up before passing back down to
> either netback (called the rx path in netback parlance) or a physical
> device. This can (and has) caused trouble in the past where networking
> subsystems are interested in the checksum fields before egress, e.g. we
> needed to do fixup in various netfilter code paths etc.

Yes, I can see the benefit of doing it the pv-ops way. The question is
what happens for a transmission from pv-ops (frontend or backend -
nothing done in the transmit path) to legacy (again frontend or
backend - nothing done in the receive path).

Secondary question was whether the duplicated effort on transmission the
other way around may be a (performance) issue.

Jan
Ian Campbell
2010-Dec-03 12:06 UTC
[Xen-devel] Re: skb_checksum_setup() placement in pv-ops vs. legacy kernel
On Fri, 2010-12-03 at 11:51 +0000, Jan Beulich wrote:
> Yes, I can see the benefit of doing it the pv-ops way. The question is
> what happens for a transmission from pv-ops (frontend or backend -
> nothing done in the transmit path) to legacy (again frontend or
> backend - nothing done in the receive path).

You mean a packet flowing pvops-domU -> pvops-dom0 -> legacy?

In this case the dom0 kernel does the necessary setup at (*) in the
pvops-domU -> (*) pvops-dom0 hop so there is nothing to do on the
pvops-dom0 (*) ->legacy hop.

If the legacy kernel forwards the packet further it will have to do the
setup on its egress path, this is the same if dom0 is pvops or legacy.

> Secondary question was whether the duplicated effort on transmission the
> other way around may be a (performance) issue.

You mean the legacy (*) -> (*) pvops-dom0 -> pvops-domU case?

In that case the setup is done at the two (*)'s but it is not really
"duplicated" as such since it is in the context of two separate skbs. If
the dom0 was legacy then the second one would still happen but on the
egress path.

I have a feeling I'm not understanding what your concern is correctly.
If the above isn't what you mean can you give an example of the path of
the packet and when the setup is (not) occurring.

Ian.
Jan Beulich
2010-Dec-03 12:24 UTC
[Xen-devel] Re: skb_checksum_setup() placement in pv-ops vs. legacy kernel
>>> On 03.12.10 at 13:06, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> You mean a packet flowing pvops-domU -> pvops-dom0 -> legacy?

I was actually just thinking in terms of a simple pair (see below).

> In this case the dom0 kernel does the necessary setup at (*) in the
> pvops-domU -> (*) pvops-dom0 hop so there is nothing to do on the
> pvops-dom0 (*) ->legacy hop.
>
> If the legacy kernel forwards the packet further it will have to do the
> setup on its egress path, this is the same if dom0 is pvops or legacy.
>
>> Secondary question was whether the duplicated effort on transmission the
>> other way around may be a (performance) issue.
>
> You mean the legacy (*) -> (*) pvops-dom0 -> pvops-domU case?
>
> In that case the setup is done at the two (*)'s but it is not really
> "duplicated" as such since it is in the context of two separate skbs. if
> the dom0 was legacy then the second one would still happen but on the
> egress path.
>
> I have a feeling I'm not understanding what your concern is correctly.
> If the above isn't what you mean can you give an example of the path of
> the packet and when the setup is (not) occurring.

pv-ops-{front,back}end -> legacy-{back,front} (for example a
pv-ops DomU sending a packet to (not through) a legacy Dom0,
or pv-ops Dom0 sending to legacy DomU). Of course, if the
packet fully passes the backend domain's stack, it will have
undergone the setup at least once (either on its way into or
out of that stack).

Jan
Ian Campbell
2010-Dec-03 13:18 UTC
[Xen-devel] Re: skb_checksum_setup() placement in pv-ops vs. legacy kernel
On Fri, 2010-12-03 at 12:24 +0000, Jan Beulich wrote:
> pv-ops-{front,back}end -> legacy-{back,front} (for example a
> pv-ops DomU sending a packet to (not through) a legacy Dom0,

The setup which is done in skb_checksum_setup is internal to the guest's
skb data structure and doesn't cross the pv interface boundary. The
fields which it sets up are just offsets to the checksum field in the
packet, it doesn't actually manipulate the content of the packet or
impact what goes into the ring until/unless the guest does TSO or
something similar in which case the kernel needs to make sure the fields
are set up first.

So in the pvops-front->legacy-back case the legacy dom0 is already happy
with having skbs with invalid checksum fields floating around in its
stack since it sees the exact same thing in the
legacy-front->legacy-back case. If it gets to a point where it needs the
fields to be valid (either to forward on or if in some case it matters
for local delivery) then it still has to do the necessary setup at that
point.

> or pv-ops Dom0 sending to legacy DomU).

Same here, the legacy kernel knows it needs to set up the skb checksum
fields before it uses them.

> Of course, if the
> packet fully passes the backend domain's stack, it will have
> undergone the setup at least once (either on its way into or
> out of that stack).

Ian.
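To make "just offsets" concrete: for an IPv4 packet the setup amounts to working out where the transport header starts and where the checksum field sits inside it. The following is a rough Python sketch under stated assumptions, not the kernel function. checksum_offsets is a hypothetical helper; offsets are counted from the start of the IP header, whereas the kernel's csum_start uses a different base; and only TCP and UDP over IPv4 are handled. The values 16 and 6 are the byte positions of the checksum field within the TCP and UDP headers.

```python
IPPROTO_TCP = 6
IPPROTO_UDP = 17

# Byte offset of the checksum field inside each transport header.
CSUM_FIELD_OFFSET = {IPPROTO_TCP: 16, IPPROTO_UDP: 6}


def checksum_offsets(ip_packet: bytes) -> tuple:
    """Return (csum_start, csum_offset) for an IPv4 TCP or UDP packet.

    csum_start  : where the transport header begins, i.e. where a later
                  checksum computation would start summing.
    csum_offset : where the checksum field lives, relative to csum_start.

    The packet is only read, never modified (hypothetical helper).
    """
    if len(ip_packet) < 20:
        raise ValueError("too short for an IPv4 header")
    if ip_packet[0] >> 4 != 4:
        raise ValueError("not an IPv4 packet")

    header_len = (ip_packet[0] & 0x0F) * 4   # IHL is in 32-bit words
    protocol = ip_packet[9]                  # IPv4 protocol field

    if protocol not in CSUM_FIELD_OFFSET:
        raise ValueError("unsupported transport protocol")

    return header_len, CSUM_FIELD_OFFSET[protocol]
```

Because the function takes immutable bytes and returns two integers, it mirrors the point made above: nothing in the packet, and hence nothing that goes onto the ring, is changed by the setup step itself.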
Jan Beulich
2010-Dec-07 12:24 UTC
[Xen-devel] Re: skb_checksum_setup() placement in pv-ops vs. legacy kernel
>>> On 03.12.10 at 14:18, Ian Campbell <Ian.Campbell@eu.citrix.com> wrote:
> The setup which is done in skb_checksum_setup is internal to the guest's
> skb data structure and doesn't cross the pv interface boundary. The
> fields which it sets up are just offsets to the checksum field in the
> packet, it doesn't actually manipulate the content of the packet or
> impact what goes into the ring until/unless the guest does TSO or
> something similar in which case the kernel needs to make sure the fields
> are set up first.

Okay, that makes it much easier to change the behavior then.

What I'm then not understanding is who the consumer of this
data is, and why it wasn't done the receive path way from the
beginning. Were there issues with the no longer used loopback
driver? Or did kernel networking infrastructure change (if so,
it'd be nice to know when and what)?

Thanks for bearing with me,
Jan
Ian Campbell
2010-Dec-07 13:29 UTC
[Xen-devel] Re: skb_checksum_setup() placement in pv-ops vs. legacy kernel
On Tue, 2010-12-07 at 12:24 +0000, Jan Beulich wrote:
> Okay, that makes it much easier to change the behavior then.
>
> What I'm then not understanding is who the consumer of this
> data is,

The physical NIC driver can use it as part of setting up its descriptors
for transmit with TSO. I think the software TSO/GSO egress paths use it
too in skb_checksum_help().

> and why it wasn't done the receive path way from the
> beginning. Were there issues with the no longer used loopback
> driver? Or did kernel networking infrastructure change (if so,
> it'd be nice to know when and what)?

I don't really know the answer to this, it's a little before my time.

Perhaps it was simply a desire to defer work as long as possible in the
hopes that it won't be necessary for some reason? e.g. the skb gets
dropped and not delivered. Doesn't seem terribly compelling to me --
perhaps someone else remembers that far back.

A bunch of stuff relating to the CHECKSUM_* changed at some point after
2.6.18 but I don't know if that had any impact on this aspect of things.

Ian.
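As a rough illustration of how a software consumer might use the two offsets, the sketch below sums the packet from csum_start to the end and stores the complemented result at csum_start + csum_offset. It is a userspace model in Python, not the kernel's skb_checksum_help(); checksum_help and ones_complement_sum are hypothetical names. It assumes the checksum field has been pre-seeded with the (uncomplemented) pseudo-header sum, so that summing over the transport segment alone yields the final value.

```python
def ones_complement_sum(data: bytes) -> int:
    """16-bit ones' complement sum of big-endian words (Internet checksum style)."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum_help(packet: bytearray, csum_start: int, csum_offset: int) -> None:
    """Finish a deferred checksum in place (hypothetical model).

    Sums everything from csum_start to the end of the packet, including
    the pre-seeded checksum field, and writes the complement of that sum
    into the field at csum_start + csum_offset.
    """
    csum = ~ones_complement_sum(bytes(packet[csum_start:])) & 0xFFFF
    pos = csum_start + csum_offset
    packet[pos:pos + 2] = csum.to_bytes(2, "big")
```

A receiver can check the result by summing the pseudo-header together with the finished transport segment; a correct checksum makes that sum come out as 0xFFFF. Without valid offsets this step has no way to know where to start summing or where to store the result, which is why the fields have to be set up before any such consumer runs.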