Jan Beulich
2011-Oct-07 07:38 UTC
[Xen-devel] RE: Resend: RE: enable_ats_device() call site
>>> On 07.10.11 at 04:19, "Kay, Allen M" <allen.m.kay@intel.com> wrote:
> Currently I see three places where enable_ats_device() can eventually be
> called, because it is in domain_context_mapping():
>
> 1) setup_dom0_devices()
> 2) intel_iommu_add_device()
> 3) reassign_device_ownership()
>
> Calling it in the first two is probably all we need to cover all the
> devices. I don't think we need to call it again in
> reassign_device_ownership().

That would mean that a device that was found during Xen's initial scan (i.e. pdev->domain set due to it having gone through setup_dom0_pci_devices()), but for which enable_ats_device() was unsuccessful due to mmcfg access still being impossible at that point, would never get ATS enabled. But that's the whole point of the thread here.

My question isn't whether to *re*move call sites, but whether it would be possible to move them elsewhere. For which I'd like to understand why this is being done in the places it is now (not the least why this is done in VT-d specific code in the first place).

Just like in the suggested change to how pci_enable_acs() gets called, this should really happen from pci_add_device() *without* regard to whether pdev->domain was already set.

Also, earlier you suggested removing the call to pci_enable_acs() from setup_dom0_device() - I'm not convinced anymore that this is correct, since old Dom0 kernels (forward ports from the 2.6.18 tree up to pretty recently) can't be relied upon to report all PCI devices to Xen. Which also suggests that we shouldn't really remove scan_pci_devices() (although it may be possible to adjust it back to scan only segment 0).

OTOH, when mmcfg isn't available early, on such Dom0 kernels ATS wouldn't get enabled today either, even with the adjusted call site of pci_enable_acs(). Consequently, another alternative would be to retry ATS and ACS enabling when mmcfg becomes available for a certain bus range, i.e. out of pci_mmcfg_reserved().

Jan

> By the way, due to the lack of production ATS devices, we have not tried ATS
> for quite a while. I'm not sure whether we should make the change now or
> just make a note of it in reassign_device_ownership() for now.
>
> Allen
>
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@novell.com]
> Sent: Friday, August 19, 2011 2:26 AM
> To: Kay, Allen M
> Subject: Resend: RE: enable_ats_device() call site
>
> (for some reason the first send to you bounced - please re-add xen-devel as cc
> if you reply to this one)
>
>>>> On 18.08.11 at 01:27, "Kay, Allen M" <allen.m.kay@intel.com> wrote:
>>> what is the reason for calling this from VT-d's domain_context_mapping()?
>>> I neither understand why this is VT-d specific, nor why it needs to be
>>> re-done with each device re-assignment.
>>
>> The reason is FLR clears the ATS enabled bit so we need to re-enable it for
>> every re-assignment. The reason we don't need to do this for ACS might be that
>> ACS resides on the bridge, not in the PCI endpoint. ATS, on the other hand,
>> resides in PCI endpoints.
>
> And why is it VT-d specific then? The problem to solve is that enabling
> may not happen when it is first attempted, in the case where Xen on its
> own can't be certain that using MMCFG is safe. Hence when the device
> gets reported by Dom0 (or when MMCFG gets enabled for the respective
> bus), another attempt needs to be made at enabling it. De-assigning and
> then re-assigning the device to Dom0 seems to be overkill to me.
>
>>> Alternatively - why do we need scan_pci_devices() at all? We're
>>> supposed to be getting the devices reported from Dom0 anyway
>>
>> Looks like it is used for building bus2bridge[], which is used for figuring
>> out upstream bridges, which are needed when assigning non-PCIe devices.
>
> Oh, right, I keep forgetting that, especially as that puts under question
> why we have Dom0 report non-extfn, non-virtfn devices at all. And
> perhaps we should issue a warning if Dom0 reports such a device that
> we didn't know about already?
>
> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Kay, Allen M
2011-Oct-08 02:09 UTC
[Xen-devel] RE: Resend: RE: enable_ats_device() call site
> For which I'd like to understand why this is being done in the places it is now
> (not the least why this is done in VT-d specific code in the first place).

The reason it is called by reassign_device_ownership() is that FLR clears the ATS enabling bit on the device. I forgot about that when I wrote the last email, so we still need to re-enable ATS on the device for each device assignment. To summarize:

1) Reason for the difference in ATS and ACS handling
   a. ATS capability is in the PCIe endpoint - the enabling bit is cleared by device FLR on the passthrough device.
   b. ACS capability is in the PCIe switch - not affected by FLR on the passthrough device.

2) ATS enabling requirements
   a. The VT-d engine serving the device has to be ATS capable.
   b. The device has to be ATS capable.

Allen
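The FLR behaviour summarized in this message can be sketched as a small C model. This is only an illustration, not Xen code: the struct, the function names and the bit position are all hypothetical, and the two requirements (an ATS capable IOMMU engine and an ATS capable device) are reduced to two booleans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical enable bit in a modeled ATS control register; the
 * position is illustrative and not taken from the PCIe specification. */
#define MODEL_ATS_ENABLE 0x8000u

struct model_endpoint {
    bool     ats_capable;  /* endpoint advertises ATS */
    uint16_t ats_ctrl;     /* modeled ATS control register */
};

/* Function-level reset: the endpoint loses its ATS enable bit. */
static void model_flr(struct model_endpoint *ep)
{
    ep->ats_ctrl &= (uint16_t)~MODEL_ATS_ENABLE;
}

/* Idempotent enable. Returns 1 if the bit was newly set, 0 if it was
 * already set, and -1 if either side lacks ATS support. */
static int model_enable_ats(struct model_endpoint *ep, bool iommu_ats_capable)
{
    assert(ep != NULL);

    if (!ep->ats_capable || !iommu_ats_capable)
        return -1;
    if (ep->ats_ctrl & MODEL_ATS_ENABLE)
        return 0;              /* nothing to do, repeat call is harmless */

    ep->ats_ctrl |= MODEL_ATS_ENABLE;
    return 1;
}
```

In this model, calling model_enable_ats() again after model_flr() sets the bit again, which is the reason given above for keeping the call on the re-assignment path.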
Jan Beulich
2011-Oct-11 12:54 UTC
[Xen-devel] RE: Resend: RE: enable_ats_device() call site
>>> On 08.10.11 at 04:09, "Kay, Allen M" <allen.m.kay@intel.com> wrote:
>> For which I'd like to understand why this is being done in the places it is now
>> (not the least why this is done in VT-d specific code in the first place).
>
> The reason it is call by reassign_device_ownership() is because FLR clears
> ATS enabling bit on the device - I forgot about it when I wrote the last email
> so we still need to re-enable ATS on the device for each device assignment.
> To summarize:
>
> 1) Reason for difference in ATS and ACS handling
>    a. ATS capability is in the PCIe endpoint - enabling bit is cleared by
>       device FLR on the passthrough device.
>    b. ACS capability is in the PCIe switch - not affected by FLR on the
>       passthrough device.
>
> 2) ATS enabling requirement
>    a. VT-d engine serving the device has to be ATS capable.
>    b. device has to be ATS capable

Okay, so how about the below then (with an attached prerequisite cleanup patch)?

Jan

--- 2011-09-20.orig/xen/drivers/passthrough/iommu.c
+++ 2011-09-20/xen/drivers/passthrough/iommu.c
@@ -150,6 +150,23 @@ int iommu_add_device(struct pci_dev *pde
     return hd->platform_ops->add_device(pdev);
 }
 
+int iommu_enable_device(struct pci_dev *pdev)
+{
+    struct hvm_iommu *hd;
+
+    if ( !pdev->domain )
+        return -EINVAL;
+
+    ASSERT(spin_is_locked(&pcidevs_lock));
+
+    hd = domain_hvm_iommu(pdev->domain);
+    if ( !iommu_enabled || !hd->platform_ops ||
+         !hd->platform_ops->enable_device )
+        return 0;
+
+    return hd->platform_ops->enable_device(pdev);
+}
+
 int iommu_remove_device(struct pci_dev *pdev)
 {
     struct hvm_iommu *hd;
--- 2011-09-20.orig/xen/drivers/passthrough/pci.c
+++ 2011-09-20/xen/drivers/passthrough/pci.c
@@ -258,7 +258,7 @@ struct pci_dev *pci_get_pdev_by_domain(
  * pci_enable_acs - enable ACS if hardware support it
  * @dev: the PCI device
  */
-void pci_enable_acs(struct pci_dev *pdev)
+static void pci_enable_acs(struct pci_dev *pdev)
 {
     int pos;
     u16 cap, ctrl, seg = pdev->seg;
@@ -409,8 +409,11 @@ int pci_add_device(u16 seg, u8 bus, u8 d
         }
 
         list_add(&pdev->domain_list, &dom0->arch.pdev_list);
-        pci_enable_acs(pdev);
     }
+    else
+        iommu_enable_device(pdev);
+
+    pci_enable_acs(pdev);
 
 out:
     spin_unlock(&pcidevs_lock);
--- 2011-09-20.orig/xen/drivers/passthrough/vtd/iommu.c
+++ 2011-09-20/xen/drivers/passthrough/vtd/iommu.c
@@ -1901,6 +1901,19 @@ static int intel_iommu_add_device(struct
     return ret;
 }
 
+static int intel_iommu_enable_device(struct pci_dev *pdev)
+{
+    struct acpi_drhd_unit *drhd = acpi_find_matched_drhd_unit(pdev);
+    int ret = drhd ? ats_device(pdev, drhd) : -ENODEV;
+
+    if ( ret <= 0 )
+        return ret;
+
+    ret = enable_ats_device(pdev->seg, pdev->bus, pdev->devfn);
+
+    return ret >= 0 ? 0 : ret;
+}
+
 static int intel_iommu_remove_device(struct pci_dev *pdev)
 {
     struct acpi_rmrr_unit *rmrr;
@@ -1931,7 +1944,6 @@ static int intel_iommu_remove_device(str
 static void __init setup_dom0_device(struct pci_dev *pdev)
 {
     domain_context_mapping(pdev->domain, pdev->seg, pdev->bus, pdev->devfn);
-    pci_enable_acs(pdev);
     pci_vtd_quirk(pdev);
 }
 
@@ -2302,6 +2314,7 @@ const struct iommu_ops intel_iommu_ops
     .init = intel_iommu_domain_init,
     .dom0_init = intel_iommu_dom0_init,
     .add_device = intel_iommu_add_device,
+    .enable_device = intel_iommu_enable_device,
     .remove_device = intel_iommu_remove_device,
     .assign_device = intel_iommu_assign_device,
     .teardown = iommu_domain_teardown,
--- 2011-09-20.orig/xen/include/xen/iommu.h
+++ 2011-09-20/xen/include/xen/iommu.h
@@ -70,6 +70,7 @@ int iommu_enable_x2apic_IR(void);
 void iommu_disable_x2apic_IR(void);
 
 int iommu_add_device(struct pci_dev *pdev);
+int iommu_enable_device(struct pci_dev *pdev);
 int iommu_remove_device(struct pci_dev *pdev);
 int iommu_domain_init(struct domain *d);
 void iommu_dom0_init(struct domain *d);
@@ -120,6 +121,7 @@ struct iommu_ops {
     int (*init)(struct domain *d);
     void (*dom0_init)(struct domain *d);
     int (*add_device)(struct pci_dev *pdev);
+    int (*enable_device)(struct pci_dev *pdev);
     int (*remove_device)(struct pci_dev *pdev);
     int (*assign_device)(struct domain *d, u16 seg, u8 bus, u8 devfn);
     void (*teardown)(struct domain *d);
--- 2011-09-20.orig/xen/include/xen/pci.h
+++ 2011-09-20/xen/include/xen/pci.h
@@ -134,6 +134,5 @@ struct pirq;
 int msixtbl_pt_register(struct domain *, struct pirq *, uint64_t gtable);
 void msixtbl_pt_unregister(struct domain *, struct pirq *);
 void msixtbl_pt_cleanup(struct domain *d);
-void pci_enable_acs(struct pci_dev *pdev);
 
 #endif /* __XEN_PCI_H__ */
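The dispatch pattern used by iommu_enable_device() in the patch (an optional per-vendor hook, where a missing hook counts as success) can be tried outside the hypervisor with a stand-alone sketch. Everything here is a simplified stand-in: the model_* types and functions are hypothetical, a plain has_owner flag replaces pdev->domain, and the locking assertion from the patch is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for a PCI device record. */
struct model_dev {
    int has_owner;  /* plays the role of pdev->domain being set */
    int enabled;    /* set by the vendor hook */
};

/* Operations table with an optional enable_device hook. */
struct model_iommu_ops {
    int (*add_device)(struct model_dev *dev);
    int (*enable_device)(struct model_dev *dev);  /* may be NULL */
};

/* Generic entry point: rejects ownerless devices, treats a missing
 * hook as "nothing to do", and otherwise forwards to the vendor code. */
static int model_iommu_enable_device(const struct model_iommu_ops *ops,
                                     struct model_dev *dev)
{
    assert(dev != NULL);

    if (!dev->has_owner)
        return -EINVAL;
    if (ops == NULL || ops->enable_device == NULL)
        return 0;

    return ops->enable_device(dev);
}

/* Example vendor hook that always succeeds. */
static int model_vendor_enable(struct model_dev *dev)
{
    dev->enabled = 1;
    return 0;
}
```

Returning 0 for a missing hook means a vendor implementation that has nothing to enable does not need to provide a stub, which appears to be why only the VT-d ops table gains an .enable_device entry in the patch.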
Kay, Allen M
2011-Oct-18 22:46 UTC
[Xen-devel] RE: Resend: RE: enable_ats_device() call site
Hi Jan,

Sorry for the late reply, I was trying to close something on another project. I have the following questions after reviewing the patches:

1) In acpi_find_matched_atsr_unit(), you added the following code. The original code only tries to match the bus number. What is the purpose of this additional code? Does it fix a problem on one of your systems?

+    for ( i = 0; i < atsr->scope.devices_cnt; ++i )
+        if ( atsr->scope.devices[i] == bdf )
+            return atsr;

2) In the pci_add_device() function, the original code calls pci_enable_acs() only if pdev->domain is not set. The new code calls pci_enable_acs() unconditionally, potentially more than once? What is the reason for the change?

3) In the same pci_add_device() function, the new code now also calls iommu_enable_device(), which currently calls enable_ats_device(). This means the new code will enable ATS on a device as it is being discovered by the platform. However, I did not see any code that removes the enable_ats_device() call in domain_context_mapping(). Is this the intention? If so, what is the reason?

I see the reason the original code is still needed, but I don't see why we need to call enable_ats_device() during platform device discovery since the enabling bit will get cleared by FLR.

Allen
Jan Beulich
2011-Oct-19 06:47 UTC
[Xen-devel] RE: Resend: RE: enable_ats_device() call site
>>> On 19.10.11 at 00:46, "Kay, Allen M" <allen.m.kay@intel.com> wrote:
> Sorry for the late reply, I was trying to close something on another
> project. I have the following questions after reviewing the patches:
>
> 1) In acpi_find_matched_atsr_unit(), you added the following code. The
> original code only tries to match the bus number. What is the purpose of
> this additional code? Does it fix a problem on one of your systems?
>
> +    for ( i = 0; i < atsr->scope.devices_cnt; ++i )
> +        if ( atsr->scope.devices[i] == bdf )
> +            return atsr;

I reckon that the availability of device specifications in the ATSR data structure must be there for a purpose. If that's not correct, then I'll certainly remove that code again, but I'd like to understand what that data is meant to be for in that case.

> 2) In the pci_add_device() function, the original code calls pci_enable_acs()
> only if pdev->domain is not set. The new code calls pci_enable_acs()
> unconditionally, potentially more than once? What is the reason for the
> change?

That's the whole purpose of the change, so just to repeat: MMCFG accesses may not be possible at scan_pci_devices() time for some or all segments/busses. Hence enabling ATS may simply be impossible at that point, and must be attempted a second time after Dom0 reported whether using MMCFG is safe.

Since enabling ATS on an already enabled device doesn't do any harm according to how enable_ats_device() is implemented, I can't see anything bad in doing so. If there is, then we're back to square one, where I was asking you how to properly do ATS enabling given the described MMCFG restriction.

> 3) In the same pci_add_device() function, the new code now also calls
> iommu_enable_device(), which currently calls enable_ats_device(). This means
> the new code will enable ATS on a device as it is being discovered by the
> platform. However, I did not see any code that removes the
> enable_ats_device() call in domain_context_mapping(). Is this the
> intention? If so, what is the reason?

You were telling me that this needs to be re-done after FLR, and hence has to remain there.

> I see the reason the original code is still needed, but I don't see why we
> need to call enable_ats_device() during platform device discovery since the
> enabling bit will get cleared by FLR.

Either we don't need to call it at all during discovery (which I doubt, since when the device is in use by Dom0, I suppose having ATS enabled is still desirable or even required), or we have to potentially do it twice (remember that older Dom0 kernels may fail to report all PCI devices to the hypervisor).

Jan
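The "attempt again once MMCFG is known to be usable" idea discussed in this thread, including the earlier suggestion to retry out of pci_mmcfg_reserved(), can be illustrated with a toy model. It is a sketch under stated assumptions, not the hypervisor's logic: all model_* names are hypothetical, a per-bus boolean stands in for MMCFG availability, and a failed config access is reduced to an early return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_BUSES 256u

struct model_dev {
    unsigned int bus;
    bool ats_enabled;
};

/* Per-bus flag: can extended config space be reached yet? */
static bool model_mmcfg_ok[MODEL_NR_BUSES];

/* Returns true when ATS is enabled after the call. Calling it on an
 * already enabled device changes nothing, so repeats are harmless. */
static bool model_try_enable_ats(struct model_dev *dev)
{
    assert(dev != NULL && dev->bus < MODEL_NR_BUSES);

    if (!model_mmcfg_ok[dev->bus])
        return false;  /* config space not reachable yet, try later */

    dev->ats_enabled = true;
    return true;
}

/* Marks buses [start_bus, end_bus] as reachable and retries every known
 * device in that range. Returns how many devices were newly enabled. */
static unsigned int model_mmcfg_reserved(unsigned int start_bus,
                                         unsigned int end_bus,
                                         struct model_dev *devs, size_t n)
{
    unsigned int newly = 0;

    for (unsigned int b = start_bus; b <= end_bus && b < MODEL_NR_BUSES; b++)
        model_mmcfg_ok[b] = true;

    for (size_t i = 0; i < n; i++) {
        bool was_enabled;

        if (devs[i].bus < start_bus || devs[i].bus > end_bus)
            continue;

        was_enabled = devs[i].ats_enabled;
        if (model_try_enable_ats(&devs[i]) && !was_enabled)
            newly++;
    }

    return newly;
}
```

In the model, a device whose first attempt failed is picked up when its bus range is reported as usable, and devices that were already enabled are simply visited again without effect.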
Kay, Allen M
2011-Oct-19 22:20 UTC
[Xen-devel] RE: Resend: RE: enable_ats_device() call site
> I reckon that the availability of device specifications in the ATSR data
> structure must be there for a purpose. If that's not correct, then I'll
> certainly remove that code again, but I'd like to understand what that
> data is meant to be for in that case.

The atsr leverages the same PCI device scope that is used for drhd and rmrr, so device and function come along with the bus number. As far as I can tell, we only need to check the bus number for atsr.

> Since enabling ATS on an already enabled device doesn't do any harm
> according to how enable_ats_device() is implemented I can't see anything
> bad in doing so. If there is, then we're back to square one where I was
> asking you how to properly do ATS enabling given the described MMCFG
> restriction.

Yes, there should be no harm in enabling ACS or ATS multiple times. It would be good to test this to make sure, though.

> Either we don't need to call it at all during discovery (which I doubt,
> since when the device is in use by Dom0, I suppose having ATS enabled is
> still desirable or even required), or we have to potentially do it twice
> (remember that older Dom0 kernels may fail to report all PCI devices to
> the hypervisor).

I see, calling enable_ats_device() in pci_add_device() will also solve the case where MMCFG might not work until after dom0 is initialized.

As I mentioned before, our QA team doesn't test ATS and ACS regularly. It would be good if you could coordinate with our QA team to test out these changes to make sure they don't break any ACS and ATS functionality.

Allen

-----Original Message-----
From: Jan Beulich [mailto:JBeulich@suse.com]
Sent: Tuesday, October 18, 2011 11:47 PM
To: Kay, Allen M
Cc: Dugger, Donald D; Shan, Haitao; Tian, Kevin; xen-devel@lists.xensource.com
Subject: RE: Resend: RE: enable_ats_device() call site

>>> On 19.10.11 at 00:46, "Kay, Allen M" <allen.m.kay@intel.com> wrote:
> Sorry for the late reply, I was trying to close something on another
> project. I have the following questions after reviewing the patches:
>
> 1) In acpi_find_matched_atsr_unit(), you added the following code. The
> original code only tries to match the bus number. What is the purpose
> of this new additional code? Does it fix a problem on one of your systems?
>
> +    for ( i = 0; i < atsr->scope.devices_cnt; ++i )
> +        if ( atsr->scope.devices[i] == bdf )
> +            return atsr;

I reckon that the availability of device specifications in the ATSR data structure must be there for a purpose. If that's not correct, then I'll certainly remove that code again, but I'd like to understand what that data is meant to be for in that case.

> 2) In the pci_add_device() function, the original code calls
> pci_enable_acs() only if pdev->domain is not set. The new code calls
> pci_enable_acs() unconditionally, potentially more than once? What is
> the reason for the change?

That's the whole purpose of the change, so just to repeat: MMCFG accesses may not be possible at scan_pci_devices() time for some or all segments/busses. Hence enabling ATS may simply be impossible at that point, and must be attempted a second time after Dom0 reported whether using MMCFG is safe. Since enabling ATS on an already enabled device doesn't do any harm according to how enable_ats_device() is implemented I can't see anything bad in doing so. If there is, then we're back to square one where I was asking you how to properly do ATS enabling given the described MMCFG restriction.

> 3) In the same pci_add_device() function, the new code now also calls
> iommu_enable_device() which currently calls enable_ats_device(). This
> means the new code will enable ATS as it is being discovered by the platform.
> However, I did not see any code removing the enable_ats_device() call
> in domain_context_mapping(). Is this the intention? If so, what is the reason?

You were telling me that this needs to be re-done after FLR, and hence has to remain there.

> I see the reason the original code is still needed but I don't see
> why we need to call enable_ats_device() during platform device
> discovery since the enabling bit will get cleared by FLR.

Either we don't need to call it at all during discovery (which I doubt, since when the device is in use by Dom0, I suppose having ATS enabled is still desirable or even required), or we have to potentially do it twice (remember that older Dom0 kernels may fail to report all PCI devices to the hypervisor).

Jan

> Allen
>
>
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Tuesday, October 11, 2011 5:54 AM
> To: Kay, Allen M
> Cc: xen-devel@lists.xensource.com
> Subject: RE: Resend: RE: enable_ats_device() call site
>
>>>> On 08.10.11 at 04:09, "Kay, Allen M" <allen.m.kay@intel.com> wrote:
>>> For which I'd like to understand why this is being done in the
>>> places it is now (not the least why this is done in VT-d specific
>>> code in the first place).
>>
>> The reason it is called by reassign_device_ownership() is that FLR
>> clears the ATS enabling bit on the device - I forgot about it when I
>> wrote the last email, so we still need to re-enable ATS on the device
>> for each device assignment.
>> To summarize:
>>
>> 1) Reason for difference in ATS and ACS handling
>>    a. ATS capability is in the PCIe endpoint - enabling bit is
>>       cleared by device FLR on the passthrough device.
>>    b. ACS capability is in the PCIe switch - not affected by FLR on
>>       the passthrough device.
>>
>> 2) ATS enabling requirement
>>    a. VT-d engine serving the device has to be ATS capable.
>>    b. device has to be ATS capable
>
> Okay, so how about the below then (with an attached prerequisite
> cleanup patch)?
>
> Jan
>
> --- 2011-09-20.orig/xen/drivers/passthrough/iommu.c
> +++ 2011-09-20/xen/drivers/passthrough/iommu.c
> @@ -150,6 +150,23 @@ int iommu_add_device(struct pci_dev *pde
>      return hd->platform_ops->add_device(pdev);
>  }
>
> +int iommu_enable_device(struct pci_dev *pdev)
> +{
> +    struct hvm_iommu *hd;
> +
> +    if ( !pdev->domain )
> +        return -EINVAL;
> +
> +    ASSERT(spin_is_locked(&pcidevs_lock));
> +
> +    hd = domain_hvm_iommu(pdev->domain);
> +    if ( !iommu_enabled || !hd->platform_ops ||
> +         !hd->platform_ops->enable_device )
> +        return 0;
> +
> +    return hd->platform_ops->enable_device(pdev);
> +}
> +
>  int iommu_remove_device(struct pci_dev *pdev)
>  {
>      struct hvm_iommu *hd;
> --- 2011-09-20.orig/xen/drivers/passthrough/pci.c
> +++ 2011-09-20/xen/drivers/passthrough/pci.c
> @@ -258,7 +258,7 @@ struct pci_dev *pci_get_pdev_by_domain(
>   * pci_enable_acs - enable ACS if hardware support it
>   * @dev: the PCI device
>   */
> -void pci_enable_acs(struct pci_dev *pdev)
> +static void pci_enable_acs(struct pci_dev *pdev)
>  {
>      int pos;
>      u16 cap, ctrl, seg = pdev->seg;
> @@ -409,8 +409,11 @@ int pci_add_device(u16 seg, u8 bus, u8 d
>          }
>
>          list_add(&pdev->domain_list, &dom0->arch.pdev_list);
> -        pci_enable_acs(pdev);
>      }
> +    else
> +        iommu_enable_device(pdev);
> +
> +    pci_enable_acs(pdev);
>
>  out:
>      spin_unlock(&pcidevs_lock);
> --- 2011-09-20.orig/xen/drivers/passthrough/vtd/iommu.c
> +++ 2011-09-20/xen/drivers/passthrough/vtd/iommu.c
> @@ -1901,6 +1901,19 @@ static int intel_iommu_add_device(struct
>      return ret;
>  }
>
> +static int intel_iommu_enable_device(struct pci_dev *pdev)
> +{
> +    struct acpi_drhd_unit *drhd = acpi_find_matched_drhd_unit(pdev);
> +    int ret = drhd ? ats_device(pdev, drhd) : -ENODEV;
> +
> +    if ( ret <= 0 )
> +        return ret;
> +
> +    ret = enable_ats_device(pdev->seg, pdev->bus, pdev->devfn);
> +
> +    return ret >= 0 ? 0 : ret;
> +}
> +
>  static int intel_iommu_remove_device(struct pci_dev *pdev)
>  {
>      struct acpi_rmrr_unit *rmrr;
> @@ -1931,7 +1944,6 @@ static int intel_iommu_remove_device(str
>  static void __init setup_dom0_device(struct pci_dev *pdev)
>  {
>      domain_context_mapping(pdev->domain, pdev->seg, pdev->bus, pdev->devfn);
> -    pci_enable_acs(pdev);
>      pci_vtd_quirk(pdev);
>  }
>
> @@ -2302,6 +2314,7 @@ const struct iommu_ops intel_iommu_ops
>      .init = intel_iommu_domain_init,
>      .dom0_init = intel_iommu_dom0_init,
>      .add_device = intel_iommu_add_device,
> +    .enable_device = intel_iommu_enable_device,
>      .remove_device = intel_iommu_remove_device,
>      .assign_device = intel_iommu_assign_device,
>      .teardown = iommu_domain_teardown,
> --- 2011-09-20.orig/xen/include/xen/iommu.h
> +++ 2011-09-20/xen/include/xen/iommu.h
> @@ -70,6 +70,7 @@ int iommu_enable_x2apic_IR(void);
>  void iommu_disable_x2apic_IR(void);
>
>  int iommu_add_device(struct pci_dev *pdev);
> +int iommu_enable_device(struct pci_dev *pdev);
>  int iommu_remove_device(struct pci_dev *pdev);
>  int iommu_domain_init(struct domain *d);
>  void iommu_dom0_init(struct domain *d);
> @@ -120,6 +121,7 @@ struct iommu_ops {
>      int (*init)(struct domain *d);
>      void (*dom0_init)(struct domain *d);
>      int (*add_device)(struct pci_dev *pdev);
> +    int (*enable_device)(struct pci_dev *pdev);
>      int (*remove_device)(struct pci_dev *pdev);
>      int (*assign_device)(struct domain *d, u16 seg, u8 bus, u8 devfn);
>      void (*teardown)(struct domain *d);
> --- 2011-09-20.orig/xen/include/xen/pci.h
> +++ 2011-09-20/xen/include/xen/pci.h
> @@ -134,6 +134,5 @@ struct pirq;
>  int msixtbl_pt_register(struct domain *, struct pirq *, uint64_t gtable);
>  void msixtbl_pt_unregister(struct domain *, struct pirq *);
>  void msixtbl_pt_cleanup(struct domain *d);
> -void pci_enable_acs(struct pci_dev *pdev);
>
>  #endif /* __XEN_PCI_H__ */

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
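The two properties this exchange relies on (a repeated enable call is harmless, and the enable has to be redone once config space becomes reachable or after an FLR) can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `struct fake_dev`, `fake_enable_ats()`, `fake_flr()` and the `FAKE_ATS_ENABLE` bit value are hypothetical names invented for illustration, and the code does not reproduce Xen's enable_ats_device().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a PCIe endpoint; not a Xen structure. */
struct fake_dev {
    bool cfg_accessible;  /* models "MMCFG access to this device works" */
    uint16_t ats_ctrl;    /* models the device's ATS control register */
};

/* Assumed position of the enable bit in the modelled register. */
#define FAKE_ATS_ENABLE 0x8000u

/*
 * Returns -1 if the device cannot be reached yet (caller has to retry
 * later), 0 if ATS was already enabled, 1 if this call enabled it.
 */
int fake_enable_ats(struct fake_dev *dev)
{
    if (!dev->cfg_accessible)
        return -1;
    if (dev->ats_ctrl & FAKE_ATS_ENABLE)
        return 0;  /* repeated call: nothing to do, nothing harmed */
    dev->ats_ctrl |= FAKE_ATS_ENABLE;
    return 1;
}

/* Models a function level reset, which clears the enable bit. */
void fake_flr(struct fake_dev *dev)
{
    dev->ats_ctrl &= (uint16_t)~FAKE_ATS_ENABLE;
}
```

In this model, a call made during the early scan fails, a call made after Dom0 has reported MMCFG as usable succeeds, a further call is a no-op, and a call after `fake_flr()` enables the bit again. That mirrors why the thread keeps both the discovery-time call and the one in domain_context_mapping().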
Jan Beulich
2011-Oct-20 07:24 UTC
[Xen-devel] RE: Resend: RE: enable_ats_device() call site
>>> On 20.10.11 at 00:20, "Kay, Allen M" <allen.m.kay@intel.com> wrote:
>> I reckon that the availability of device specifications in the ATSR data
>> structure must be there for a purpose. If that's not correct, then I'll
>> certainly remove that code again, but I'd like to understand what that
>> data is meant to be for in that case.
>
> The atsr leverages the same PCI device scope that is used for drhd and
> rmrr, so device and function come along with the bus number. As far as I
> can tell, we only need to check the bus number for atsr.

So why does the capability to list individual devices then exist? And why does it matter for DRHDs, but not for ATSRs?

>> Either we don't need to call it at all during discovery (which I doubt,
>> since when the device is in use by Dom0, I suppose having ATS enabled is
>> still desirable or even required), or we have to potentially do it twice
>> (remember that older Dom0 kernels may fail to report all PCI devices to
>> the hypervisor).
>
> I see, calling enable_ats_device() in pci_add_device() will also solve the
> case where MMCFG might not work until after dom0 is initialized.
>
> As I mentioned before, our QA team doesn't test ATS and ACS regularly. It
> would be good if you could coordinate with our QA team to test out these
> changes to make sure they don't break any ACS and ATS functionality.

How would I do that other than by getting the stuff committed and waiting for their bi-weekly(?) testing round?

Jan
Kay, Allen M
2011-Oct-21 01:59 UTC
[Xen-devel] RE: Resend: RE: enable_ats_device() call site
> So why does the capability to list individual devices then exist? And why
> does it matter for DRHDs, but not for ATSRs?

The difference is that ATSR is only meant to communicate PCIe root ports' ATS capability. If the root port is capable, then downstream endpoints can enable the ATS device translation cache.

acpi_find_matched_drhd_unit() is used to find out which VT-d hardware is servicing the endpoint device. Given that drhd lists either the actual PCI endpoints or include_all, we have to match the actual BDF of the device passed in with the devices we recorded for that VT-d HW.

acpi_find_matched_atsr_unit() is used to find out whether the endpoint device is under an ATS capable PCIe root port or not. ATSR information is remembered as secondary and subordinate bus ranges. All we have to do is match the device's bus number with a root port's bus range. Matching the actual device in this case will only match the root port device itself, since that is what we recorded in acpi_parse_dev_scope(). That should not happen since we don't assign a root port to a guest. Even if we did, checking for ATS capability would be meaningless since a root port will not have a device translation cache.

Allen
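The distinction Allen draws can be sketched in a few lines of C: a DRHD lookup compares the full BDF against a recorded device list (or accepts anything under include_all), while an ATSR lookup only asks whether the device's bus number falls inside a root port's bus range. The types and function names below (`struct fake_drhd`, `struct fake_atsr`, `fake_drhd_matches()`, `fake_atsr_matches()`) are hypothetical simplifications for illustration, not Xen's acpi_drhd_unit or acpi_atsr_unit, and the sketch ignores details such as include_all acting only as a fallback.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified records; not the structures used by Xen. */
struct fake_drhd {
    bool include_all;         /* unit covers every device not listed elsewhere */
    const uint16_t *devices;  /* recorded BDFs, encoded as (bus << 8) | devfn */
    size_t devices_cnt;
};

struct fake_atsr {
    uint8_t secondary_bus;    /* first bus number below the root port */
    uint8_t subordinate_bus;  /* last bus number below the root port */
};

/* DRHD style: the exact device has to be listed (or include_all set). */
bool fake_drhd_matches(const struct fake_drhd *drhd, uint16_t bdf)
{
    if (drhd->include_all)
        return true;
    for (size_t i = 0; i < drhd->devices_cnt; ++i)
        if (drhd->devices[i] == bdf)
            return true;
    return false;
}

/* ATSR style: only the bus number matters, device/function are ignored. */
bool fake_atsr_matches(const struct fake_atsr *atsr, uint16_t bdf)
{
    uint8_t bus = (uint8_t)(bdf >> 8);

    return bus >= atsr->secondary_bus && bus <= atsr->subordinate_bus;
}
```

With a root port whose range is buses 3 to 5, every endpoint on those buses matches the ATSR check regardless of its device and function numbers, which is why the additional per-device comparison in the patch would only ever hit the root port itself.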
Jan Beulich
2011-Oct-21 07:52 UTC
[Xen-devel] RE: Resend: RE: enable_ats_device() call site
>>> On 21.10.11 at 03:59, "Kay, Allen M" <allen.m.kay@intel.com> wrote:
>> So why does the capability to list individual devices then exist? And why
>> does it matter for DRHDs, but not for ATSRs?
>
> The difference is that ATSR is only meant to communicate PCIe root ports'
> ATS capability. If the root port is capable, then downstream endpoints can
> enable the ATS device translation cache.
>
> acpi_find_matched_drhd_unit() is used to find out which VT-d hardware is
> servicing the endpoint device. Given that drhd lists either the actual PCI
> endpoints or include_all, we have to match the actual BDF of the device
> passed in with the devices we recorded for that VT-d HW.
>
> acpi_find_matched_atsr_unit() is used to find out whether the endpoint
> device is under an ATS capable PCIe root port or not. ATSR information is
> remembered as secondary and subordinate bus ranges. All we have to do is
> match the device's bus number with a root port's bus range. Matching the
> actual device in this case will only match the root port device itself,
> since that is what we recorded in acpi_parse_dev_scope(). That should not
> happen since we don't assign a root port to a guest. Even if we did,
> checking for ATS capability would be meaningless since a root port will
> not have a device translation cache.

Okay, so I'll remove that part then and re-submit both patches.

Jan