Danilo Krummrich
2023-Sep-12 10:06 UTC
[Nouveau] [PATCH drm-misc-next v3 5/7] drm/gpuvm: add an abstraction for a VM / BO combination
On Tue, Sep 12, 2023 at 09:42:44AM +0200, Thomas Hellström wrote:
> Hi, Danilo
>
> On 9/11/23 19:49, Danilo Krummrich wrote:
> > Hi Thomas,
> >
> > On 9/11/23 19:19, Thomas Hellström wrote:
> > > Hi, Danilo
> > >
> > > On 9/9/23 17:31, Danilo Krummrich wrote:
> > > > This patch adds an abstraction layer between the drm_gpuva mappings of
> > > > a particular drm_gem_object and this GEM object itself. The abstraction
> > > > represents a combination of a drm_gem_object and drm_gpuvm. The
> > > > drm_gem_object holds a list of drm_gpuvm_bo structures (the structure
> > > > representing this abstraction), while each drm_gpuvm_bo contains list of
> > > > mappings of this GEM object.
> > > >
> > > > This has multiple advantages:
> > > >
> > > > 1) We can use the drm_gpuvm_bo structure to attach it to various lists
> > > >    of the drm_gpuvm. This is useful for tracking external and evicted
> > > >    objects per VM, which is introduced in subsequent patches.
> > > >
> > > > 2) Finding mappings of a certain drm_gem_object mapped in a certain
> > > >    drm_gpuvm becomes much cheaper.
> > > >
> > > > 3) Drivers can derive and extend the structure to easily represent
> > > >    driver specific states of a BO for a certain GPUVM.
> > > >
> > > > The idea of this abstraction was taken from amdgpu, hence the credit for
> > > > this idea goes to the developers of amdgpu.
> > > >
> > > > Cc: Christian König <christian.koenig at amd.com>
> > > > Signed-off-by: Danilo Krummrich <dakr at redhat.com>
> > >
> > > Did you consider having the drivers embed the struct drm_gpuvm_bo in
> > > their own bo definition? I figure that would mean using the gem bo's
> > > refcounting and providing a helper to call from the driver's bo
> > > release. Looks like that could potentially save a lot of code? Or is
> > > there something that won't work with that approach?
> >
> > There are drm_gpuvm_ops::vm_bo_alloc and drm_gpuvm_ops::vm_bo_free
> > callback for drivers to register for that purpose.
> >
> > - Danilo
>
> Now after looking a bit deeper, I think actually the question could be
> rephrased as, why don't we just use the
> struct drm_gem_object::gpuva struct as the drm_gpuvm_bo in the spirit of
> keeping things simple? Drivers would then just embed it in their bo subclass
> and we'd avoid unnecessary fields in the struct drm_gem_object for drivers
> that don't do VM_BIND yet.

struct drm_gem_object::gpuva is just a container holding a list, which is
(currently) used to attach drm_gpuva structs to it and, with this patch, to
attach drm_gpuvm_bo structs (combination of BO + VM) to it. Doing the above
basically means "leave everything as it is, but move the list_head of
drm_gpuvas per GEM to the driver specific BO structure". Having a common
connection between GEM objects and drm_gpuva structs was one of the goals of
the initial GPUVA manager patch series, however.

> Sure, this won't be per bo and per vm, but it'd really only make a slight
> difference where we have multiple VMAs per bo, where per-vm per-bo state
> either needs to be duplicated or attached to a single vma (as in the case of
> the external bo list).

Correct, one implication is that we don't get a per VM and BO abstraction, and
hence are left with a list of all drm_gpuva structs having the same backing BO,
regardless of the VM.

For amdgpu this was always a concern. Now that we want to keep track of external
and evicted objects, it's going to be a concern for most drivers, I guess.
Without a VM_BO abstraction, the only structure we are left with for tracking
external and evicted objects is struct drm_gpuva. But this structure isn't
unique per BO and VM, and we need to consider cases where userspace just
allocates rather huge BOs and creates tons of mappings from them.
Running the full list of drm_gpuva structs (with even the ones from other VMs
included) for adding an external or evicted object isn't very efficient. Not to
mention the maintenance needed when the mapping we've (randomly) picked as the
entry for the external/evicted object list is unmapped while other mappings
with the same backing BO are still left in the VM.

Now, a way to get rid of the VM_BO abstraction would be to use maple trees
instead, since then we can store drm_gem_object structs directly for each VM.
However, Xe had concerns about using maple trees and preferred lists. Besides,
maple trees wouldn't address amdgpu's concerns about not having a VM_BO
abstraction in cases with tons of VMs and tons of mappings per BO. Hence,
having a VM_BO abstraction enabling us to track external/evicted objects with
lists seems to satisfy everyone's needs.

- Danilo

> To me that looks like a substantial amount of less code / complexity?
>
> /Thomas
>
> > > Thanks,
> > >
> > > Thomas
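
For readers following the thread, the relationship described above can be
sketched as a small userspace model. This is only an illustration under stated
assumptions: every model_* name is hypothetical, plain singly linked pointers
stand in for the kernel's list_head, and refcounting, locking and freeing are
left out. It is not the drm_gpuvm API from the patch. It shows the GEM object
holding one vm_bo per VM, each vm_bo holding that VM's mappings, and the VM
tracking external objects by vm_bo rather than by mapping.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; not the kernel's drm_gpuvm API. */
struct model_vm;
struct model_gem_obj;

struct model_va {                     /* stands in for struct drm_gpuva */
    unsigned long addr;
    struct model_va *next;            /* next mapping in the same vm_bo */
};

struct model_vm_bo {                  /* stands in for struct drm_gpuvm_bo */
    struct model_vm *vm;
    struct model_gem_obj *obj;
    struct model_va *vas;             /* mappings of obj within vm only */
    size_t va_count;
    struct model_vm_bo *next_in_obj;  /* link in obj->vm_bos */
    struct model_vm_bo *next_extobj;  /* link in vm->extobjs */
    int on_extobj_list;
};

struct model_gem_obj {
    struct model_vm_bo *vm_bos;       /* one entry per VM mapping this object */
};

struct model_vm {
    struct model_vm_bo *extobjs;      /* external objects tracked per VM */
    size_t extobj_count;
};

/* Look up the (vm, obj) pair: walks one entry per VM, not one per mapping. */
static struct model_vm_bo *model_vm_bo_find(struct model_vm *vm,
                                            struct model_gem_obj *obj)
{
    for (struct model_vm_bo *b = obj->vm_bos; b; b = b->next_in_obj)
        if (b->vm == vm)
            return b;
    return NULL;
}

/* Return the existing vm_bo for (vm, obj) or allocate and link a new one. */
static struct model_vm_bo *model_vm_bo_obtain(struct model_vm *vm,
                                              struct model_gem_obj *obj)
{
    struct model_vm_bo *b = model_vm_bo_find(vm, obj);

    if (b)
        return b;
    b = calloc(1, sizeof(*b));
    if (!b)
        return NULL;
    b->vm = vm;
    b->obj = obj;
    b->next_in_obj = obj->vm_bos;
    obj->vm_bos = b;
    return b;
}

/* Add a mapping of obj at addr in vm; returns 0 on success, -1 on failure. */
static int model_map(struct model_vm *vm, struct model_gem_obj *obj,
                     unsigned long addr)
{
    struct model_vm_bo *b = model_vm_bo_obtain(vm, obj);
    struct model_va *va;

    if (!b)
        return -1;
    va = calloc(1, sizeof(*va));
    if (!va)
        return -1;   /* sketch only: the empty vm_bo is left in place */
    va->addr = addr;
    va->next = b->vas;
    b->vas = va;
    b->va_count++;
    return 0;
}

/* Track the object as external for its VM; a repeated call is a no-op. */
static void model_extobj_add(struct model_vm_bo *b)
{
    if (b->on_extobj_list)
        return;
    b->next_extobj = b->vm->extobjs;
    b->vm->extobjs = b;
    b->vm->extobj_count++;
    b->on_extobj_list = 1;
}
```

In this model, adding an object to a VM's external list never touches the
individual mappings, and the list entry stays valid until the last mapping of
that object in that VM goes away.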
Thomas Hellström
2023-Sep-12 10:33 UTC
[Nouveau] [PATCH drm-misc-next v3 5/7] drm/gpuvm: add an abstraction for a VM / BO combination
On 9/12/23 12:06, Danilo Krummrich wrote:
> On Tue, Sep 12, 2023 at 09:42:44AM +0200, Thomas Hellström wrote:
>> Hi, Danilo
>>
>> On 9/11/23 19:49, Danilo Krummrich wrote:
>>> Hi Thomas,
>>>
>>> On 9/11/23 19:19, Thomas Hellström wrote:
>>>> Hi, Danilo
>>>>
>>>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>>>> This patch adds an abstraction layer between the drm_gpuva mappings of
>>>>> a particular drm_gem_object and this GEM object itself. The abstraction
>>>>> represents a combination of a drm_gem_object and drm_gpuvm. The
>>>>> drm_gem_object holds a list of drm_gpuvm_bo structures (the structure
>>>>> representing this abstraction), while each drm_gpuvm_bo contains list of
>>>>> mappings of this GEM object.
>>>>>
>>>>> This has multiple advantages:
>>>>>
>>>>> 1) We can use the drm_gpuvm_bo structure to attach it to various lists
>>>>>    of the drm_gpuvm. This is useful for tracking external and evicted
>>>>>    objects per VM, which is introduced in subsequent patches.
>>>>>
>>>>> 2) Finding mappings of a certain drm_gem_object mapped in a certain
>>>>>    drm_gpuvm becomes much cheaper.
>>>>>
>>>>> 3) Drivers can derive and extend the structure to easily represent
>>>>>    driver specific states of a BO for a certain GPUVM.
>>>>>
>>>>> The idea of this abstraction was taken from amdgpu, hence the credit for
>>>>> this idea goes to the developers of amdgpu.
>>>>>
>>>>> Cc: Christian König <christian.koenig at amd.com>
>>>>> Signed-off-by: Danilo Krummrich <dakr at redhat.com>
>>>> Did you consider having the drivers embed the struct drm_gpuvm_bo in
>>>> their own bo definition? I figure that would mean using the gem bo's
>>>> refcounting and providing a helper to call from the driver's bo
>>>> release. Looks like that could potentially save a lot of code? Or is
>>>> there something that won't work with that approach?
>>> There are drm_gpuvm_ops::vm_bo_alloc and drm_gpuvm_ops::vm_bo_free
>>> callback for drivers to register for that purpose.
>>>
>>> - Danilo
>> Now after looking a bit deeper, I think actually the question could be
>> rephrased as, why don't we just use the
>> struct drm_gem_object::gpuva struct as the drm_gpuvm_bo in the spirit of
>> keeping things simple? Drivers would then just embed it in their bo subclass
>> and we'd avoid unnecessary fields in the struct drm_gem_object for drivers
>> that don't do VM_BIND yet.
> struct drm_gem_object::gpuva is just a container holding a list, which is
> (currently) used to attach drm_gpuva structs to it and, with this patch, to
> attach drm_gpuvm_bo structs (combination of BO + VM) to it. Doing the above
> basically means "leave everything as it is, but move the list_head of
> drm_gpuvas per GEM to the driver specific BO structure". Having a common
> connection between GEM objects and drm_gpuva structs was one of the goals of
> the initial GPUVA manager patch series, however.
>
>> Sure, this won't be per bo and per vm, but it'd really only make a slight
>> difference where we have multiple VMAs per bo, where per-vm per-bo state
>> either needs to be duplicated or attached to a single vma (as in the case of
>> the external bo list).
>
> Correct, one implication is that we don't get a per VM and BO abstraction, and
> hence are left with a list of all drm_gpuva structs having the same backing BO,
> regardless of the VM.
>
> For amdgpu this was always a concern. Now that we want to keep track of external
> and evicted objects, it's going to be a concern for most drivers, I guess.
> Without a VM_BO abstraction, the only structure we are left with for tracking
> external and evicted objects is struct drm_gpuva. But this structure isn't
> unique per BO and VM, and we need to consider cases where userspace just
> allocates rather huge BOs and creates tons of mappings from them. Running the
> full list of drm_gpuva structs (with even the ones from other VMs included) for
> adding an external or evicted object isn't very efficient.
> Not to mention the maintenance needed when the mapping we've (randomly) picked
> as the entry for the external/evicted object list is unmapped while other
> mappings with the same backing BO are still left in the VM.

For the evicted objects it's not much of an issue; we maintain a list of vmas
needing rebinding for each VM rather than a list of evicted objects, so there is
little or no additional overhead there. The extobj list is indeed a problem if
many VMAs are bound to the same bo. Not that the code snippets are complicated,
but the list traversals would be excessive.

> Now, a way to get rid of the VM_BO abstraction would be to use maple trees
> instead, since then we can store drm_gem_object structs directly for each VM.
> However, Xe had concerns about using maple trees and preferred lists, plus
> having maple trees wouldn't get rid of the concerns of amdgpu not having a VM_BO
> abstraction for cases with tons of VMs and tons of mappings per BO. Hence,
> having a VM_BO abstraction enabling us to track external/evicted objects with
> lists seems to satisfy everyone's needs.

Indeed, this is a tradeoff between a simple implementation that is OK for
situations with few VMs and few VMAs per bo, and a more complex implementation
that optimizes for the opposite case. So if the latter is a case we need to
optimize for at this point, then I guess it's the way to go.

(I'm in the process of adapting the xe driver to this, so I just wanted to
bring up areas where the implementations differ quite a lot and make sure the
options are discussed.)

Thanks,

Thomas

> - Danilo
>
>> To me that looks like a substantial amount of less code / complexity?
>>
>> /Thomas
>>
>>>> Thanks,
>>>>
>>>> Thomas
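
The traversal cost mentioned above can be illustrated with a minimal userspace
sketch of the simpler layout, assuming the BO keeps a single flat list of all
its VMAs across every VM. The flat_* names and the integer vm_id are
hypothetical and exist only for this illustration; this is not code from xe or
from the patch. The point is that answering "is this bo still mapped in this
VM?" on unmap may require walking mappings that belong to other VMs.

```c
#include <stddef.h>

/* Hypothetical model of a BO with one flat VMA list shared by all VMs. */
struct flat_vma {
    int vm_id;                /* which VM owns this mapping */
    struct flat_vma *next;    /* next mapping of the same BO, in any VM */
};

struct flat_bo {
    struct flat_vma *vmas;
};

/* Link a caller-provided VMA at the head of the BO's flat list. */
static void flat_bo_add_vma(struct flat_bo *bo, struct flat_vma *vma, int vm_id)
{
    vma->vm_id = vm_id;
    vma->next = bo->vmas;
    bo->vmas = vma;
}

/*
 * Return nonzero if the BO has a mapping in vm_id other than `skip` (the VMA
 * being unmapped). *visited reports how many list entries were examined.
 */
static int flat_bo_mapped_in_vm(const struct flat_bo *bo, int vm_id,
                                const struct flat_vma *skip, size_t *visited)
{
    *visited = 0;
    for (const struct flat_vma *v = bo->vmas; v; v = v->next) {
        (*visited)++;
        if (v != skip && v->vm_id == vm_id)
            return 1;
    }
    return 0;
}
```

When the VMA being unmapped is the last one of its VM, the walk covers the
whole list, including every mapping owned by other VMs. That is cheap with few
VMAs per bo and costly with many, which is the tradeoff discussed in this
thread.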