Hi, this is with reference to <https://bugzilla.redhat.com/show_bug.cgi?id=865736> -- RHEL-5.9 Beta host & guest. Nonetheless I think my question applies to current upstream Linux -- if not, I''d greatly appreciate commit hashes. Consider several igbvf''s that belong to the same PF (port): they are functions that share a (bus, device) pair (aka "slot") in dom0. The VPCI implementation of pciback_add_pci_dev() [drivers/xen/pciback/vpci.c] will assign these sibling functions to the same virtual slot. In other words, VFs that are siblings in dom0 end up as siblings in the PV domU. (Upstream path and function: "drivers/xen/xen-pciback/vpci.c", __xen_pcibk_add_pci_dev().) This logic appears to date back to <http://xenbits.xensource.com/xen-unstable.hg/rev/5b433b4fca34>. The RHEL-5.9 Beta PV domU does something like this: pci_scan_slot /* for each function: */ pci_scan_single_device pci_scan_device pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, &l) pci_setup_device pci_read_config_byte(dev, PCI_HEADER_TYPE, &hdr_type) dev->multifunction = !!(hdr_type & 0x80); if (dev_scanned && !dev->multifunc && func == 0): break Current upstream Linux has gone through several changes here, but the gist is the same: if function 0 is successfully scanned and it explicitly reports itself non-multi (--> no_next_fn()), then the rest of the functions on the slot is skipped. Problem is, function 0 of the igbvf I''m looking at reports itself as non-multifunction, and thus the domU doesn''t find the rest of the passed-through functions. pciback seems to have no overlay for PCI_HEADER_TYPE in array "header_common" [upstream: drivers/xen/xen-pciback/conf_space_header.c], thus pciback_config_read() / xen_pcibk_config_read() pass through the header type transparently when the domU reads it in pci_setup_device(). I wonder if - a new dom0 overlay should be introduced for PCI_HEADER_TYPE, to the tune of upstream Linux commit fd5b221b (ie. vendor-id/device-id), that would perhaps check the # of sibling functions in the given vpci slot, and fake the MSB in "hdr_type", - or the domU''s slot scanning logic should be changed, - or the igbvf I''m looking at reports an incorrect "hdr_type" (in which case we''d still have to fake something). The legacy "/etc/xen/xend-pci-quirks.sxp" interface is only suitable for giving extra write access to the domU, thus not good enough here. Thanks a lot! Laszlo
On 16/10/12 15:21, Laszlo Ersek wrote:> Hi, > > this is with reference to > <https://bugzilla.redhat.com/show_bug.cgi?id=865736> -- RHEL-5.9 Beta > host & guest. Nonetheless I think my question applies to current > upstream Linux -- if not, I''d greatly appreciate commit hashes. > > Consider several igbvf''s that belong to the same PF (port): they are > functions that share a (bus, device) pair (aka "slot") in dom0. The VPCI > implementation of pciback_add_pci_dev() [drivers/xen/pciback/vpci.c] > will assign these sibling functions to the same virtual slot. In other > words, VFs that are siblings in dom0 end up as siblings in the PV domU. > > (Upstream path and function: "drivers/xen/xen-pciback/vpci.c", > __xen_pcibk_add_pci_dev().) > > This logic appears to date back to > <http://xenbits.xensource.com/xen-unstable.hg/rev/5b433b4fca34>. > > > The RHEL-5.9 Beta PV domU does something like this: > > pci_scan_slot > /* for each function: */ > pci_scan_single_device > pci_scan_device > pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, &l) > pci_setup_device > pci_read_config_byte(dev, PCI_HEADER_TYPE, &hdr_type) > dev->multifunction = !!(hdr_type & 0x80); > if (dev_scanned && !dev->multifunc && func == 0): break > > Current upstream Linux has gone through several changes here, but the > gist is the same: if function 0 is successfully scanned and it > explicitly reports itself non-multi (--> no_next_fn()), then the rest of > the functions on the slot is skipped. > > Problem is, function 0 of the igbvf I''m looking at reports itself as > non-multifunction, and thus the domU doesn''t find the rest of the > passed-through functions.Which is correct, because the virtual function itself is only a single function. I would hazard a guess that the real bug is trying to fake up 8 individual virtual functions as an 8-fuction device, which seems like a toolstack bug to me. While it would certainly be possible to trap and emulate reads like this, I would think it would be decidedly hacky, thus preferably avoided. ~Andrew> > > pciback seems to have no overlay for PCI_HEADER_TYPE in array > "header_common" [upstream: drivers/xen/xen-pciback/conf_space_header.c], > thus pciback_config_read() / xen_pcibk_config_read() pass through the > header type transparently when the domU reads it in pci_setup_device(). > > > I wonder if > > - a new dom0 overlay should be introduced for PCI_HEADER_TYPE, to the > tune of upstream Linux commit fd5b221b (ie. vendor-id/device-id), that > would perhaps check the # of sibling functions in the given vpci slot, > and fake the MSB in "hdr_type", > > - or the domU''s slot scanning logic should be changed, > > - or the igbvf I''m looking at reports an incorrect "hdr_type" (in which > case we''d still have to fake something). The legacy > "/etc/xen/xend-pci-quirks.sxp" interface is only suitable for giving > extra write access to the domU, thus not good enough here. > > Thanks a lot! > Laszlo > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel-- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com
Hi, On 10/16/12 16:42, Andrew Cooper wrote:> I would hazard a guess that the real bug is trying to fake up 8 > individual virtual functions as an 8-fuction device, which seems like a > toolstack bug to me.I believe that comes from:>> The VPCI >> implementation of pciback_add_pci_dev() [drivers/xen/pciback/vpci.c] >> will assign these sibling functions to the same virtual slot. In other >> words, VFs that are siblings in dom0 end up as siblings in the PV domU.This grouping-together of virtual functions is the same in current upstream Linux:>> (Upstream path and function: "drivers/xen/xen-pciback/vpci.c", >> __xen_pcibk_add_pci_dev().)(+ match_slot())>> >> This logic appears to date back to >> <http://xenbits.xensource.com/xen-unstable.hg/rev/5b433b4fca34>.A closer pointer into the changeset: http://xenbits.xensource.com/xen-unstable.hg/rev/5b433b4fca34#l16.85 The patch doesn''t seem to justify the grouping specifically, thus I did not even try to refute it. Now that you point it out, match_slot() is probably insufficient grounds to group functions together. Maybe we should check *additionally* if the device being passed through is multi-function. I''ll try it. Thanks! Laszlo
On 16/10/12 16:04, Laszlo Ersek wrote:> A closer pointer into the changeset: > > http://xenbits.xensource.com/xen-unstable.hg/rev/5b433b4fca34#l16.85 > > The patch doesn''t seem to justify the grouping specifically, thus I did > not even try to refute it. > > Now that you point it out, match_slot() is probably insufficient grounds > to group functions together. Maybe we should check *additionally* if the > device being passed through is multi-function. I''ll try it.While that would hopefully solve the bug you have discovered, it does raise some more queries. What should we do when passing through only the first function of a multifunction device, specifically in reference to domain disagregation? I would like to hope things would just work, but I am somewhat doubtful. For SRIOV hardware, passing the physical function through should be fine (even though it is a multifunction device), as it is specifically designed to work in this way. For a CNA however, the chances of getting it working at all with different functions in different domains is unlikely at best. I just wanted to put these thoughts out in case anyone has any bright ideas about how to solve them. ~Andrew> > Thanks! > Laszlo-- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com
Laszlo Ersek
2012-Oct-16 17:36 UTC
[PATCH] xen PV passthru: assign SR-IOV virtual functions to separate virtual slots
VFs are reported as single-function devices in PCI_HEADER_TYPE, which causes pci_scan_slot() in the PV domU to skip all VFs beyond #0 in the pciback-provided slot. Avoid this by assigning each VF to a separate virtual slot. Signed-off-by: Laszlo Ersek <lersek@redhat.com> --- drivers/xen/xen-pciback/vpci.c | 35 ++++++++++++++++++++--------------- 1 files changed, 20 insertions(+), 15 deletions(-) diff --git a/drivers/xen/xen-pciback/vpci.c b/drivers/xen/xen-pciback/vpci.c index 46d140b..489404a 100644 --- a/drivers/xen/xen-pciback/vpci.c +++ b/drivers/xen/xen-pciback/vpci.c @@ -89,21 +89,26 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev, mutex_lock(&vpci_dev->lock); - /* Keep multi-function devices together on the virtual PCI bus */ - for (slot = 0; slot < PCI_SLOT_MAX; slot++) { - if (!list_empty(&vpci_dev->dev_list[slot])) { - t = list_entry(list_first(&vpci_dev->dev_list[slot]), - struct pci_dev_entry, list); - - if (match_slot(dev, t->dev)) { - pr_info(DRV_NAME ": vpci: %s: " - "assign to virtual slot %d func %d\n", - pci_name(dev), slot, - PCI_FUNC(dev->devfn)); - list_add_tail(&dev_entry->list, - &vpci_dev->dev_list[slot]); - func = PCI_FUNC(dev->devfn); - goto unlock; + /* + * Keep multi-function devices together on the virtual PCI bus, except + * virtual functions. + */ + if (!dev->is_virtfn) { + for (slot = 0; slot < PCI_SLOT_MAX; slot++) { + if (!list_empty(&vpci_dev->dev_list[slot])) { + t = list_entry(list_first(&vpci_dev->dev_list[slot]), + struct pci_dev_entry, list); + + if (match_slot(dev, t->dev)) { + pr_info(DRV_NAME ": vpci: %s: " + "assign to virtual slot %d func %d\n", + pci_name(dev), slot, + PCI_FUNC(dev->devfn)); + list_add_tail(&dev_entry->list, + &vpci_dev->dev_list[slot]); + func = PCI_FUNC(dev->devfn); + goto unlock; + } } } } -- 1.7.1
Jan Beulich
2012-Oct-17 07:13 UTC
Re: [PATCH] xen PV passthru: assign SR-IOV virtual functions to separate virtual slots
>>> On 16.10.12 at 19:36, Laszlo Ersek <lersek@redhat.com> wrote: > VFs are reported as single-function devices in PCI_HEADER_TYPE, which > causes pci_scan_slot() in the PV domU to skip all VFs beyond #0 in the > pciback-provided slot. Avoid this by assigning each VF to a separate > virtual slot. > > Signed-off-by: Laszlo Ersek <lersek@redhat.com> > --- > drivers/xen/xen-pciback/vpci.c | 35 ++++++++++++++++++++--------------- > 1 files changed, 20 insertions(+), 15 deletions(-) > > diff --git a/drivers/xen/xen-pciback/vpci.c b/drivers/xen/xen-pciback/vpci.c > index 46d140b..489404a 100644 > --- a/drivers/xen/xen-pciback/vpci.c > +++ b/drivers/xen/xen-pciback/vpci.c > @@ -89,21 +89,26 @@ static int __xen_pcibk_add_pci_dev(struct > xen_pcibk_device *pdev, > > mutex_lock(&vpci_dev->lock); > > - /* Keep multi-function devices together on the virtual PCI bus */ > - for (slot = 0; slot < PCI_SLOT_MAX; slot++) { > - if (!list_empty(&vpci_dev->dev_list[slot])) { > - t = list_entry(list_first(&vpci_dev->dev_list[slot]), > - struct pci_dev_entry, list); > - > - if (match_slot(dev, t->dev)) { > - pr_info(DRV_NAME ": vpci: %s: " > - "assign to virtual slot %d func %d\n", > - pci_name(dev), slot, > - PCI_FUNC(dev->devfn)); > - list_add_tail(&dev_entry->list, > - &vpci_dev->dev_list[slot]); > - func = PCI_FUNC(dev->devfn); > - goto unlock; > + /* > + * Keep multi-function devices together on the virtual PCI bus, except > + * virtual functions. > + */ > + if (!dev->is_virtfn) { > + for (slot = 0; slot < PCI_SLOT_MAX; slot++) { > + if (!list_empty(&vpci_dev->dev_list[slot])) {To keep indentation reasonable (and shrink the patch size at once), could you use if (list_empty(&vpci_dev->dev_list[slot])) continue; here instead? Or alternatively tweak the for() above to embed the if()''s condition there? Jan> + t = list_entry(list_first(&vpci_dev->dev_list[slot]), > + struct pci_dev_entry, list); > + > + if (match_slot(dev, t->dev)) { > + pr_info(DRV_NAME ": vpci: %s: " > + "assign to virtual slot %d func %d\n", > + pci_name(dev), slot, > + PCI_FUNC(dev->devfn)); > + list_add_tail(&dev_entry->list, > + &vpci_dev->dev_list[slot]); > + func = PCI_FUNC(dev->devfn); > + goto unlock; > + } > } > } > } > -- > 1.7.1 > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
Laszlo Ersek
2012-Oct-17 09:55 UTC
[PATCH v2] xen PV passthru: assign SR-IOV virtual functions to separate virtual slots
VFs are reported as single-function devices in PCI_HEADER_TYPE, which causes pci_scan_slot() in the PV domU to skip all VFs beyond #0 in the pciback-provided slot. Avoid this by assigning each VF to a separate virtual slot. Signed-off-by: Laszlo Ersek <lersek@redhat.com> --- v1->v2: * avoid extra indentation * make sure virtual functions are assigned with func=0 drivers/xen/xen-pciback/vpci.c | 14 ++++++++++---- 1 files changed, 10 insertions(+), 4 deletions(-) diff --git a/drivers/xen/xen-pciback/vpci.c b/drivers/xen/xen-pciback/vpci.c index 46d140b..0f478ac 100644 --- a/drivers/xen/xen-pciback/vpci.c +++ b/drivers/xen/xen-pciback/vpci.c @@ -89,9 +89,15 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev, mutex_lock(&vpci_dev->lock); - /* Keep multi-function devices together on the virtual PCI bus */ - for (slot = 0; slot < PCI_SLOT_MAX; slot++) { - if (!list_empty(&vpci_dev->dev_list[slot])) { + /* + * Keep multi-function devices together on the virtual PCI bus, except + * virtual functions. + */ + if (!dev->is_virtfn) { + for (slot = 0; slot < PCI_SLOT_MAX; slot++) { + if (list_empty(&vpci_dev->dev_list[slot])) + continue; + t = list_entry(list_first(&vpci_dev->dev_list[slot]), struct pci_dev_entry, list); @@ -116,7 +122,7 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev, pci_name(dev), slot); list_add_tail(&dev_entry->list, &vpci_dev->dev_list[slot]); - func = PCI_FUNC(dev->devfn); + func = dev->is_virtfn ? 0 : PCI_FUNC(dev->devfn); goto unlock; } } -- 1.7.1
Jan Beulich
2012-Oct-17 10:01 UTC
Re: [PATCH v2] xen PV passthru: assign SR-IOV virtual functions to separate virtual slots
>>> On 17.10.12 at 11:55, Laszlo Ersek <lersek@redhat.com> wrote: > VFs are reported as single-function devices in PCI_HEADER_TYPE, which > causes pci_scan_slot() in the PV domU to skip all VFs beyond #0 in the > pciback-provided slot. Avoid this by assigning each VF to a separate > virtual slot. > > Signed-off-by: Laszlo Ersek <lersek@redhat.com>Acked-by: Jan Beulich <jbeulich@suse.com>> --- > v1->v2: > * avoid extra indentation > * make sure virtual functions are assigned with func=0 > > drivers/xen/xen-pciback/vpci.c | 14 ++++++++++---- > 1 files changed, 10 insertions(+), 4 deletions(-) > > diff --git a/drivers/xen/xen-pciback/vpci.c b/drivers/xen/xen-pciback/vpci.c > index 46d140b..0f478ac 100644 > --- a/drivers/xen/xen-pciback/vpci.c > +++ b/drivers/xen/xen-pciback/vpci.c > @@ -89,9 +89,15 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev, > > mutex_lock(&vpci_dev->lock); > > - /* Keep multi-function devices together on the virtual PCI bus */ > - for (slot = 0; slot < PCI_SLOT_MAX; slot++) { > - if (!list_empty(&vpci_dev->dev_list[slot])) { > + /* > + * Keep multi-function devices together on the virtual PCI bus, except > + * virtual functions. > + */ > + if (!dev->is_virtfn) { > + for (slot = 0; slot < PCI_SLOT_MAX; slot++) { > + if (list_empty(&vpci_dev->dev_list[slot])) > + continue; > + > t = list_entry(list_first(&vpci_dev->dev_list[slot]), > struct pci_dev_entry, list); > > @@ -116,7 +122,7 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev, > pci_name(dev), slot); > list_add_tail(&dev_entry->list, > &vpci_dev->dev_list[slot]); > - func = PCI_FUNC(dev->devfn); > + func = dev->is_virtfn ? 0 : PCI_FUNC(dev->devfn); > goto unlock; > } > } > -- > 1.7.1
Konrad Rzeszutek Wilk
2012-Oct-17 14:51 UTC
Re: [PATCH v2] xen PV passthru: assign SR-IOV virtual functions to separate virtual slots
On Wed, Oct 17, 2012 at 11:01:17AM +0100, Jan Beulich wrote:> >>> On 17.10.12 at 11:55, Laszlo Ersek <lersek@redhat.com> wrote: > > VFs are reported as single-function devices in PCI_HEADER_TYPE, which > > causes pci_scan_slot() in the PV domU to skip all VFs beyond #0 in the > > pciback-provided slot. Avoid this by assigning each VF to a separate > > virtual slot. > > > > Signed-off-by: Laszlo Ersek <lersek@redhat.com> > > Acked-by: Jan Beulich <jbeulich@suse.com>Applied.