Jason Gunthorpe
2023-Sep-22 16:27 UTC
[PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
On Fri, Sep 22, 2023 at 02:13:18PM +0100, Robin Murphy wrote:
> On 22/09/2023 1:41 pm, Jason Gunthorpe wrote:
> > On Fri, Sep 22, 2023 at 08:57:19AM +0100, Jean-Philippe Brucker wrote:
> > > > > They're not strictly equivalent: this check works around a temporary issue
> > > > > with the IOMMU core, which calls map/unmap before the domain is
> > > > > finalized.
> > > >
> > > > Where? The above points to iommu_create_device_direct_mappings() but
> > > > it doesn't because the pgsize_bitmap == 0:
> > >
> > > __iommu_domain_alloc() sets pgsize_bitmap in this case:
> > >
> > >         /*
> > >          * If not already set, assume all sizes by default; the driver
> > >          * may override this later
> > >          */
> > >         if (!domain->pgsize_bitmap)
> > >                 domain->pgsize_bitmap = bus->iommu_ops->pgsize_bitmap;
> >
> > Drivers shouldn't do that.
> >
> > The core code was fixed to try again with mapping reserved regions to
> > support these kinds of drivers.
>
> This is still the "normal" code path, really; I think it's only AMD that
> started initialising the domain bitmap "early" and warranted making it
> conditional.

My main point was that iommu_create_device_direct_mappings() should
fail for unfinalized domains, setting pgsize_bitmap to allow it to
succeed is not a nice hack, and not necessary now.

What do you think about something like this to replace
iommu_create_device_direct_mappings(), that does enforce things
properly?

static int resv_cmp(void *priv, const struct list_head *llhs,
		    const struct list_head *lrhs)
{
	struct iommu_resv_region *lhs =
		list_entry(llhs, struct iommu_resv_region, list);
	struct iommu_resv_region *rhs =
		list_entry(lrhs, struct iommu_resv_region, list);

	if (lhs->start == rhs->start)
		return 0;
	if (lhs->start < rhs->start)
		return -1;
	return 1;
}

static int iommu_create_device_direct_mappings(struct iommu_domain *domain,
						struct device *dev)
{
	struct iommu_resv_region *entry;
	struct iommu_resv_region *tmp;
	struct list_head mappings;
	struct list_head direct;
	phys_addr_t cur = 0;
	int ret = 0;

	INIT_LIST_HEAD(&mappings);
	INIT_LIST_HEAD(&direct);

	iommu_get_resv_regions(dev, &mappings);

	list_for_each_entry_safe(entry, tmp, &mappings, list) {
		if (entry->type == IOMMU_RESV_DIRECT)
			dev->iommu->require_direct = 1;

		if ((domain->type & __IOMMU_DOMAIN_PAGING) &&
		    (entry->type == IOMMU_RESV_DIRECT ||
		     entry->type == IOMMU_RESV_DIRECT_RELAXABLE)) {
			if (domain->geometry.aperture_start > entry->start ||
			    domain->geometry.aperture_end == 0 ||
			    (domain->geometry.aperture_end - 1) <
				    (entry->start + entry->length - 1)) {
				ret = -EINVAL;
				goto out;
			}
			list_move(&entry->list, &direct);
		}
	}

	if (list_empty(&direct))
		goto out;

	/*
	 * FW can have overlapping ranges, sort the list by start address
	 * and map any duplicated IOVA only once.
	 */
	list_sort(NULL, &direct, resv_cmp);
	list_for_each_entry(entry, &direct, list) {
		phys_addr_t start_pfn = entry->start / PAGE_SIZE;
		phys_addr_t last_pfn =
			(entry->length - 1 + entry->start) / PAGE_SIZE;

		if (start_pfn < cur)
			start_pfn = cur;

		if (start_pfn <= last_pfn) {
			ret = iommu_map(domain, start_pfn * PAGE_SIZE,
					start_pfn * PAGE_SIZE,
					(last_pfn - start_pfn + 1) * PAGE_SIZE,
					entry->prot, GFP_KERNEL);
			if (ret)
				goto out;
			cur = last_pfn + 1;
		}
	}

out:
	list_splice(&direct, &mappings);
	iommu_put_resv_regions(dev, &mappings);
	return ret;
}
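
The sort-then-clamp step in the second loop above can be tried outside the kernel. The following is only a minimal userspace sketch of that idea, not the proposed kernel code: struct toy_region, toy_region_cmp(), toy_count_mapped_pages() and the 4 KiB TOY_PAGE_SIZE are all hypothetical names and assumptions made for illustration. Instead of calling iommu_map(), it counts how many pages would be mapped, so overlapping firmware ranges can be checked for double mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_PAGE_SIZE 4096ULL /* assumption: 4 KiB pages */

/* Hypothetical stand-in for a reserved region: [start, start + length) */
struct toy_region {
	uint64_t start;
	uint64_t length;
};

/* Order regions by start address, in the spirit of resv_cmp(). */
static int toy_region_cmp(const void *a, const void *b)
{
	const struct toy_region *lhs = a;
	const struct toy_region *rhs = b;

	if (lhs->start == rhs->start)
		return 0;
	return lhs->start < rhs->start ? -1 : 1;
}

/*
 * Sort the regions in place, then walk them while tracking the first
 * page frame that has not been covered yet. Returns the number of pages
 * that would be mapped, with every page counted exactly once.
 */
static uint64_t toy_count_mapped_pages(struct toy_region *regions, size_t n)
{
	uint64_t cur = 0; /* first PFN not yet mapped */
	uint64_t total = 0;
	size_t i;

	qsort(regions, n, sizeof(*regions), toy_region_cmp);

	for (i = 0; i < n; i++) {
		uint64_t start_pfn, last_pfn;

		assert(regions[i].length > 0);
		start_pfn = regions[i].start / TOY_PAGE_SIZE;
		last_pfn = (regions[i].start + regions[i].length - 1) /
			   TOY_PAGE_SIZE;

		/* Skip the part an earlier region already covered. */
		if (start_pfn < cur)
			start_pfn = cur;

		if (start_pfn <= last_pfn) {
			total += last_pfn - start_pfn + 1;
			cur = last_pfn + 1;
		}
	}
	return total;
}
```

Because the list is sorted by start address, a single "cur" watermark is enough: any region that begins below it has its head clamped, and a region that ends below it is skipped entirely.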
Robin Murphy
2023-Sep-22 18:07 UTC
[PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
On 22/09/2023 5:27 pm, Jason Gunthorpe wrote:
> On Fri, Sep 22, 2023 at 02:13:18PM +0100, Robin Murphy wrote:
>> On 22/09/2023 1:41 pm, Jason Gunthorpe wrote:
>>> On Fri, Sep 22, 2023 at 08:57:19AM +0100, Jean-Philippe Brucker wrote:
>>>>>> They're not strictly equivalent: this check works around a temporary issue
>>>>>> with the IOMMU core, which calls map/unmap before the domain is
>>>>>> finalized.
>>>>>
>>>>> Where? The above points to iommu_create_device_direct_mappings() but
>>>>> it doesn't because the pgsize_bitmap == 0:
>>>>
>>>> __iommu_domain_alloc() sets pgsize_bitmap in this case:
>>>>
>>>>         /*
>>>>          * If not already set, assume all sizes by default; the driver
>>>>          * may override this later
>>>>          */
>>>>         if (!domain->pgsize_bitmap)
>>>>                 domain->pgsize_bitmap = bus->iommu_ops->pgsize_bitmap;
>>>
>>> Drivers shouldn't do that.
>>>
>>> The core code was fixed to try again with mapping reserved regions to
>>> support these kinds of drivers.
>>
>> This is still the "normal" code path, really; I think it's only AMD that
>> started initialising the domain bitmap "early" and warranted making it
>> conditional.
>
> My main point was that iommu_create_device_direct_mappings() should
> fail for unfinalized domains, setting pgsize_bitmap to allow it to
> succeed is not a nice hack, and not necessary now.

Sure, but it's the whole "unfinalised domains" and rewriting
domain->pgsize_bitmap after attach thing that is itself the massive
hack. AMD doesn't do that, and doesn't need to; it knows the appropriate
format at allocation time and can quite happily return a fully working
domain which allows map before attach, but the old ops->pgsize_bitmap
mechanism fundamentally doesn't work for multiple formats with different
page sizes. The only thing I'd accuse it of doing wrong is the weird
half-and-half thing of having one format as a default via one mechanism,
and the other as an override through the other, rather than setting both
explicitly.

virtio isn't setting ops->pgsize_bitmap for the sake of direct mappings
either; it sets it once it's discovered any instance, since apparently
it's assuming that all instances must support identical page sizes, and
thus once it's seen one it can work "normally" per the core code's
assumptions. It's also I think the only driver which has a "finalise"
bodge but *can* still properly support map-before-attach, by virtue of
having to replay mappings to every new endpoint anyway.

> What do you think about something like this to replace
> iommu_create_device_direct_mappings(), that does enforce things
> properly?

I fail to see how that would make any practical difference. Either the
mappings can be correctly set up in a pagetable *before* the relevant
device is attached to that pagetable, or they can't (if the driver
doesn't have enough information to be able to do so) and we just have to
really hope nothing blows up in the race window between attaching the
device to an empty pagetable and having a second try at
iommu_create_device_direct_mappings(). That's a driver-level issue and
has nothing to do with pgsize_bitmap either way.

Thanks,
Robin.
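
For readers following the pgsize_bitmap part of this exchange, here is a minimal userspace sketch of why an empty bitmap stops a map request, which is what "it doesn't because the pgsize_bitmap == 0" earlier in the thread points at. It is an illustration under stated assumptions, not the kernel's code: struct toy_domain and toy_map_check() are hypothetical names, and the specific error codes are assumed for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a paging domain. */
struct toy_domain {
	uint64_t pgsize_bitmap; /* bit N set means page size 2^N is supported */
};

/*
 * Decide whether a map request could proceed at all.
 * Returns 0 if it could, or a negative errno value (assumed codes):
 *   -ENODEV  the domain advertises no page sizes yet ("unfinalized")
 *   -EINVAL  iova, paddr or size is not aligned to the smallest page size
 */
static int toy_map_check(const struct toy_domain *domain, uint64_t iova,
			 uint64_t paddr, uint64_t size)
{
	uint64_t min_pagesz;

	assert(domain);

	if (domain->pgsize_bitmap == 0)
		return -ENODEV;

	/* Isolate the lowest set bit: the smallest supported page size. */
	min_pagesz = domain->pgsize_bitmap & (~domain->pgsize_bitmap + 1);

	if ((iova | paddr | size) & (min_pagesz - 1))
		return -EINVAL;

	return 0;
}
```

In this model, copying a default bitmap into the domain at allocation time is exactly what lets a map request on a not-yet-attached domain pass the first check; whether the resulting page table is actually usable remains the driver-level question raised above.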