Michael S. Tsirkin
2021-Oct-09 09:53 UTC
[PATCH v5 12/16] PCI: Add pci_iomap_host_shared(), pci_iomap_host_shared_range()
On Fri, Oct 08, 2021 at 05:37:07PM -0700, Kuppuswamy Sathyanarayanan wrote:
> From: Andi Kleen <ak at linux.intel.com>
>
> For Confidential VM guests like TDX, the host is untrusted and hence
> the devices emulated by the host or any data coming from the host
> cannot be trusted. So the drivers that interact with the outside world
> have to be hardened by sharing memory with host on need basis
> with proper hardening fixes.
>
> For the PCI driver case, to share the memory with the host add
> pci_iomap_host_shared() and pci_iomap_host_shared_range() APIs.
>
> Signed-off-by: Andi Kleen <ak at linux.intel.com>
> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy at linux.intel.com>

So I proposed to make all pci mappings shared, eliminating the need
to patch drivers.

To which Andi replied
    One problem with removing the ioremap opt-in is that
    it's still possible for drivers to get at devices without going through probe.

To which Greg replied:
https://lore.kernel.org/all/YVXBNJ431YIWwZdQ at kroah.com/
    If there are in-kernel PCI drivers that do not do this, they need to be
    fixed today.

Can you guys resolve the differences here?

And once they are resolved, mention this in the commit log so
I don't get to re-read the series just to find out nothing
changed in this respect?

I frankly do not believe we are anywhere near being able to harden
an arbitrary kernel config against attack.
How about creating a defconfig that makes sense for TDX then?
Anyone deviating from that better know what they are doing,
this API tweaking is just putting policy into the kernel ...

> ---
> Changes since v4:
>  * Replaced "_shared" with "_host_shared" in pci_iomap* APIs
>  * Fixed commit log as per review comments.
>
>  include/asm-generic/pci_iomap.h |  6 +++++
>  lib/pci_iomap.c                 | 47 +++++++++++++++++++++++++++++++++
>  2 files changed, 53 insertions(+)
>
> diff --git a/include/asm-generic/pci_iomap.h b/include/asm-generic/pci_iomap.h
> index df636c6d8e6c..a4a83c8ab3cf 100644
> --- a/include/asm-generic/pci_iomap.h
> +++ b/include/asm-generic/pci_iomap.h
> @@ -18,6 +18,12 @@ extern void __iomem *pci_iomap_range(struct pci_dev *dev, int bar,
>  extern void __iomem *pci_iomap_wc_range(struct pci_dev *dev, int bar,
>                                          unsigned long offset,
>                                          unsigned long maxlen);
> +extern void __iomem *pci_iomap_host_shared(struct pci_dev *dev, int bar,
> +                                           unsigned long max);
> +extern void __iomem *pci_iomap_host_shared_range(struct pci_dev *dev, int bar,
> +                                                 unsigned long offset,
> +                                                 unsigned long maxlen);
> +
>  /* Create a virtual mapping cookie for a port on a given PCI device.
>   * Do not call this directly, it exists to make it easier for architectures
>   * to override */
> diff --git a/lib/pci_iomap.c b/lib/pci_iomap.c
> index 57bd92f599ee..2816dc8715da 100644
> --- a/lib/pci_iomap.c
> +++ b/lib/pci_iomap.c
> @@ -25,6 +25,11 @@ static void __iomem *map_ioremap_wc(phys_addr_t addr, size_t size)
>          return ioremap_wc(addr, size);
>  }
>
> +static void __iomem *map_ioremap_host_shared(phys_addr_t addr, size_t size)
> +{
> +        return ioremap_host_shared(addr, size);
> +}
> +
>  static void __iomem *pci_iomap_range_map(struct pci_dev *dev,
>                                           int bar,
>                                           unsigned long offset,
> @@ -106,6 +111,48 @@ void __iomem *pci_iomap_wc_range(struct pci_dev *dev,
>  }
>  EXPORT_SYMBOL_GPL(pci_iomap_wc_range);
>
> +/**
> + * pci_iomap_host_shared_range - create a virtual shared mapping cookie
> + *                               for a PCI BAR
> + * @dev: PCI device that owns the BAR
> + * @bar: BAR number
> + * @offset: map memory at the given offset in BAR
> + * @maxlen: max length of the memory to map
> + *
> + * Remap a pci device's resources shared in a confidential guest.
> + * For more details see pci_iomap_range's documentation.

So how does a driver author know when to use this function, and when to
use the regular pci_iomap_range? Drivers have no idea whether they are
used in a confidential guest, and which ranges are shared, it's a TDX
thing ...

This documentation should really address it.

> + *
> + * @maxlen specifies the maximum length to map. To get access to
> + * the complete BAR from offset to the end, pass %0 here.
> + */
> +void __iomem *pci_iomap_host_shared_range(struct pci_dev *dev, int bar,
> +                                          unsigned long offset,
> +                                          unsigned long maxlen)
> +{
> +        return pci_iomap_range_map(dev, bar, offset, maxlen,
> +                                   map_ioremap_host_shared, true);
> +}
> +EXPORT_SYMBOL_GPL(pci_iomap_host_shared_range);
> +
> +/**
> + * pci_iomap_host_shared - create a virtual shared mapping cookie for a PCI BAR
> + * @dev: PCI device that owns the BAR
> + * @bar: BAR number
> + * @maxlen: length of the memory to map
> + *
> + * See pci_iomap for details. This function creates a shared mapping
> + * with the host for confidential hosts.
> + *
> + * @maxlen specifies the maximum length to map. To get access to the
> + * complete BAR without checking for its length first, pass %0 here.
> + */
> +void __iomem *pci_iomap_host_shared(struct pci_dev *dev, int bar,
> +                                    unsigned long maxlen)
> +{
> +        return pci_iomap_host_shared_range(dev, bar, 0, maxlen);
> +}
> +EXPORT_SYMBOL_GPL(pci_iomap_host_shared);
> +
>  /**
>   * pci_iomap - create a virtual mapping cookie for a PCI BAR
>   * @dev: PCI device that owns the BAR
> --
> 2.25.1
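For readers following the patch, here is a minimal userspace sketch of the pattern it uses: one common range helper that clamps the request to the BAR and then calls a mapping callback, with thin wrappers that differ only in which callback they pass. This is not kernel code. Every mock_* name and struct below is a hypothetical stand-in invented for illustration, and the clamping logic is only loosely modeled on the "pass 0 to map to the end of the BAR" behaviour described in the kernel-doc comments above.

```c
/* Userspace model only; all mock_* names are hypothetical, not kernel APIs. */
#include <stddef.h>
#include <stdint.h>

typedef uint64_t mock_phys_addr_t;

/* Stand-in for a PCI BAR: physical start address and length in bytes. */
struct mock_bar {
    mock_phys_addr_t start;
    size_t len;
};

/* Stand-in for the returned mapping cookie. */
struct mock_mapping {
    mock_phys_addr_t addr; /* first physical address covered */
    size_t size;           /* number of bytes covered */
    int host_shared;       /* 1 if mapped as shared with the host */
    int valid;             /* 0 models a NULL return */
};

typedef struct mock_mapping (*mock_map_fn)(mock_phys_addr_t addr, size_t size);

/* Models the regular (private) mapping callback. */
static struct mock_mapping mock_map_private(mock_phys_addr_t addr, size_t size)
{
    struct mock_mapping m = { addr, size, 0, 1 };
    return m;
}

/* Models the host-shared mapping callback added by the patch. */
static struct mock_mapping mock_map_host_shared(mock_phys_addr_t addr, size_t size)
{
    struct mock_mapping m = { addr, size, 1, 1 };
    return m;
}

/*
 * Common helper: reject offsets at or beyond the end of the BAR, clamp the
 * length to maxlen (0 means "to the end of the BAR"), then delegate to the
 * callback that decides the mapping type.
 */
static struct mock_mapping mock_iomap_range_map(const struct mock_bar *bar,
                                                size_t offset, size_t maxlen,
                                                mock_map_fn mapper)
{
    struct mock_mapping none = { 0, 0, 0, 0 };
    size_t len;

    if (bar->len <= offset)
        return none;
    len = bar->len - offset;
    if (maxlen && len > maxlen)
        len = maxlen;
    return mapper(bar->start + offset, len);
}

/* Wrapper modeling the regular range mapping. */
struct mock_mapping mock_iomap_range(const struct mock_bar *bar,
                                     size_t offset, size_t maxlen)
{
    return mock_iomap_range_map(bar, offset, maxlen, mock_map_private);
}

/* Wrapper modeling the opt-in host-shared range mapping. */
struct mock_mapping mock_iomap_host_shared_range(const struct mock_bar *bar,
                                                 size_t offset, size_t maxlen)
{
    return mock_iomap_range_map(bar, offset, maxlen, mock_map_host_shared);
}
```

The point of the sketch is that the two wrappers are identical from the caller's side except for the name, which is exactly why the question above matters: nothing in the signature tells a driver author which one to pick.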
Dan Williams
2021-Oct-09 20:39 UTC
[PATCH v5 12/16] PCI: Add pci_iomap_host_shared(), pci_iomap_host_shared_range()
On Sat, Oct 9, 2021 at 2:53 AM Michael S. Tsirkin <mst at redhat.com> wrote:
>
> On Fri, Oct 08, 2021 at 05:37:07PM -0700, Kuppuswamy Sathyanarayanan wrote:
> > From: Andi Kleen <ak at linux.intel.com>
> >
> > For Confidential VM guests like TDX, the host is untrusted and hence
> > the devices emulated by the host or any data coming from the host
> > cannot be trusted. So the drivers that interact with the outside world
> > have to be hardened by sharing memory with host on need basis
> > with proper hardening fixes.
> >
> > For the PCI driver case, to share the memory with the host add
> > pci_iomap_host_shared() and pci_iomap_host_shared_range() APIs.
> >
> > Signed-off-by: Andi Kleen <ak at linux.intel.com>
> > Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy at linux.intel.com>
>
> So I proposed to make all pci mappings shared, eliminating the need
> to patch drivers.
>
> To which Andi replied
>     One problem with removing the ioremap opt-in is that
>     it's still possible for drivers to get at devices without going through probe.
>
> To which Greg replied:
> https://lore.kernel.org/all/YVXBNJ431YIWwZdQ at kroah.com/
>     If there are in-kernel PCI drivers that do not do this, they need to be
>     fixed today.
>
> Can you guys resolve the differences here?

I agree with you and Greg here. If a driver is accessing hardware
resources outside of the bind lifetime of one of the devices it
supports, and in a way that neither modprobe-policy nor
device-authorization-policy infrastructure can block, that sounds like
a bug report. Fix those drivers instead of sprinkling ioremap_shared in
select places and with unclear rules about when a driver is allowed to
do "shared" mappings. Let the new device-authorization mechanism (with
policy in userspace) be the central place where all of these driver
"trust" issues are managed.

> And once they are resolved, mention this in the commit log so
> I don't get to re-read the series just to find out nothing
> changed in this respect?
>
> I frankly do not believe we are anywhere near being able to harden
> an arbitrary kernel config against attack.
> How about creating a defconfig that makes sense for TDX then?
> Anyone deviating from that better know what they are doing,
> this API tweaking is just putting policy into the kernel ...

Right, userspace authorization policy and select driver fixups seem
to be the answer to the raised concerns.
Andi Kleen
2021-Oct-10 22:22 UTC
[PATCH v5 12/16] PCI: Add pci_iomap_host_shared(), pci_iomap_host_shared_range()
> To which Andi replied
>     One problem with removing the ioremap opt-in is that
>     it's still possible for drivers to get at devices without going through probe.
>
> To which Greg replied:
> https://lore.kernel.org/all/YVXBNJ431YIWwZdQ at kroah.com/
>     If there are in-kernel PCI drivers that do not do this, they need to be
>     fixed today.
>
> Can you guys resolve the differences here?

I addressed this in my other mail, but we may need more discussion.

>
> And once they are resolved, mention this in the commit log so
> I don't get to re-read the series just to find out nothing
> changed in this respect?
>
> I frankly do not believe we are anywhere near being able to harden
> an arbitrary kernel config against attack.

Why not? Device filter and the opt-ins together are a fairly strong
mechanism. And it's not that they're a lot of code or super complicated
either. You're essentially objecting to a single line change in your
subsystem here.

> How about creating a defconfig that makes sense for TDX then?

TDX can be used in many different ways, I don't think a defconfig is
practical.

In theory you could do some Kconfig dependency (at the pain point of
having separate kernel binaries), but why not just do it at run time
then if you maintain the list anyways. That's much easier and saner for
everyone. In the past we usually ended up with runtime mechanisms for
similar things anyways.

Also it turns out that the filter mechanisms are needed for some arch
drivers which are not even configurable, so a defconfig alone is
probably not enough.

> Anyone deviating from that better know what they are doing,
> this API tweaking is just putting policy into the kernel ...

Hardening drivers is kernel policy. It cannot be done anywhere else.

-Andi
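To make the "device filter plus opt-in" argument concrete, here is a minimal userspace sketch of how the two checks could compose at run time: a device is mapped shared with the host only if it is on an allow-list and its driver has explicitly asked for a shared mapping. This is an assumption-laden illustration, not the mechanism from the series. All mock_* names are hypothetical, and the allow-list entries are illustrative virtio-style PCI IDs, not a recommended policy.

```c
/* Userspace model only; all mock_* names are hypothetical, not kernel APIs. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a PCI vendor/device ID pair. */
struct mock_dev_id {
    uint16_t vendor;
    uint16_t device;
};

/* Illustrative run-time allow-list; a real list would come from policy. */
static const struct mock_dev_id mock_allow_list[] = {
    { 0x1af4, 0x1041 },
    { 0x1af4, 0x1042 },
};

/* Device filter: true only if the ID pair appears on the allow-list. */
bool mock_device_allowed(uint16_t vendor, uint16_t device)
{
    size_t i;
    size_t n = sizeof(mock_allow_list) / sizeof(mock_allow_list[0]);

    for (i = 0; i < n; i++) {
        if (mock_allow_list[i].vendor == vendor &&
            mock_allow_list[i].device == device)
            return true;
    }
    return false;
}

/*
 * Combined check: the driver must have opted in (modeling a call to a
 * host-shared mapping API) and the device must pass the filter.
 */
bool mock_may_share_with_host(uint16_t vendor, uint16_t device,
                              bool driver_opted_in)
{
    return driver_opted_in && mock_device_allowed(vendor, device);
}
```

In this model, making all PCI mappings shared corresponds to passing driver_opted_in as true for every driver, which leaves the allow-list as the only gate. That is roughly the position argued for earlier in the thread.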