thr3ads.net - Nouveau - [Nouveau] [PATCH v3 2/2] drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping [May 2018]

If this information is useful, please help other people find it:
Share via:

Robin Murphy

2018-May-30 13:30 UTC

[Nouveau] [PATCH v3 2/2] drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping

On 30/05/18 14:00, Thierry Reding wrote:> On Wed, May 30, 2018 at 11:30:25AM +0100, Robin Murphy wrote:
>> On 30/05/18 09:03, Thierry Reding wrote:
>>> From: Thierry Reding <treding at nvidia.com>
>>>
>>> Depending on the kernel configuration, early ARM architecture setup
code
>>> may have attached the GPU to a DMA/IOMMU mapping that transparently
uses
>>> the IOMMU to back the DMA API. Tegra requires special handling for
IOMMU
>>> backed buffers (a special bit in the GPU's MMU page tables
indicates the
>>> memory path to take: via the SMMU or directly to the memory
controller).
>>> Transparently backing DMA memory with an IOMMU prevents Nouveau
from
>>> properly handling such memory accesses and causes memory access
faults.
>>>
>>> As a side-note: buffers other than those allocated in instance
memory
>>> don't need to be physically contiguous from the GPU's
perspective since
>>> the GPU can map them into contiguous buffers using its own MMU.
Mapping
>>> these buffers through the IOMMU is unnecessary and will even lead
to
>>> performance degradation because of the additional translation. One
>>> exception to this are compressible buffers which need large pages.
In
>>> order to enable these large pages, multiple small pages will have
to be
>>> combined into one large (I/O virtually contiguous) mapping via the
>>> IOMMU. However, that is a topic outside the scope of this fix and
isn't
>>> currently supported. An implementation will want to explicitly
create
>>> these large pages in the Nouveau driver, so detaching from a
DMA/IOMMU
>>> mapping would still be required.
>>
>> I wonder if it might make sense to have a hook in iommu_attach_device()
to
>> notify the arch DMA API code when moving devices between unmanaged and
DMA
>> ops domains? That seems like it might be the most low-impact way to
address
>> the overall problem long-term.
>>
>>> Signed-off-by: Thierry Reding <treding at nvidia.com>
>>> ---
>>> Changes in v3:
>>> - clarify the use of IOMMU mapping for compressible buffers
>>> - squash multiple patches into this
>>>
>>>    drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c | 5 +++++
>>>    1 file changed, 5 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
>>> index 78597da6313a..d0538af1b967 100644
>>> --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
>>> +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
>>> @@ -105,6 +105,11 @@ nvkm_device_tegra_probe_iommu(struct
nvkm_device_tegra *tdev)
>>>    	unsigned long pgsize_bitmap;
>>>    	int ret;
>>> +#if IS_ENABLED(CONFIG_ARM)
>>
>> Wouldn't CONFIG_ARM_DMA_USE_IOMMU be even more appropriate?
> 
> Not necessarily. arm_dma_iommu_detach_device() is always defined on ARM,
> only with CONFIG_ARM_DMA_USE_IOMMU=n it will be empty. So this check is
> a guard to make sure we don't call the function when it isn't
available,
> but it may still not do anything.
Calling a function under condition A, which only does anything under 
condition B, when B depends on A, is identical in behaviour to only 
calling the function under condition B, except needlessly harder to follow.
>>> +	/* make sure we can use the IOMMU exclusively */
>>> +	arm_dma_iommu_detach_device(dev);
>>
>> As before, I would just use the existing infrastructure the same way
the
>> Exynos DRM driver currently does in __exynos_iommu_attach() (albeit
without
>> then reattaching to another DMA ops mapping).
> 
> That's pretty much what I initially did and which was shot down by
> Christoph. As I said earlier, at this point I don't really care what
> color the shed will be. Can you and Christoph come to an agreement
> on what it should be?
What I was getting at is that arm_iommu_detach_device() already *is* the 
exact function Christoph was asking for, it just needs a minor fix 
instead of adding explicit set_dma_ops() fiddling at its callsites which 
only obfuscates the fact that it's supposed to be responsible for 
resetting the device's DMA ops already.

Robin.

Thierry Reding

2018-May-30 13:41 UTC

head link

[Nouveau] [PATCH v3 2/2] drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping

On Wed, May 30, 2018 at 02:30:51PM +0100, Robin Murphy
wrote:> On 30/05/18 14:00, Thierry Reding wrote:
> > On Wed, May 30, 2018 at 11:30:25AM +0100, Robin Murphy wrote:
> > > On 30/05/18 09:03, Thierry Reding wrote:
> > > > From: Thierry Reding <treding at nvidia.com>
> > > > 
> > > > Depending on the kernel configuration, early ARM
architecture setup code
> > > > may have attached the GPU to a DMA/IOMMU mapping that
transparently uses
> > > > the IOMMU to back the DMA API. Tegra requires special
handling for IOMMU
> > > > backed buffers (a special bit in the GPU's MMU page
tables indicates the
> > > > memory path to take: via the SMMU or directly to the memory
controller).
> > > > Transparently backing DMA memory with an IOMMU prevents
Nouveau from
> > > > properly handling such memory accesses and causes memory
access faults.
> > > > 
> > > > As a side-note: buffers other than those allocated in
instance memory
> > > > don't need to be physically contiguous from the
GPU's perspective since
> > > > the GPU can map them into contiguous buffers using its own
MMU. Mapping
> > > > these buffers through the IOMMU is unnecessary and will even
lead to
> > > > performance degradation because of the additional
translation. One
> > > > exception to this are compressible buffers which need large
pages. In
> > > > order to enable these large pages, multiple small pages will
have to be
> > > > combined into one large (I/O virtually contiguous) mapping
via the
> > > > IOMMU. However, that is a topic outside the scope of this
fix and isn't
> > > > currently supported. An implementation will want to
explicitly create
> > > > these large pages in the Nouveau driver, so detaching from a
DMA/IOMMU
> > > > mapping would still be required.
> > > 
> > > I wonder if it might make sense to have a hook in
iommu_attach_device() to
> > > notify the arch DMA API code when moving devices between
unmanaged and DMA
> > > ops domains? That seems like it might be the most low-impact way
to address
> > > the overall problem long-term.
> > > 
> > > > Signed-off-by: Thierry Reding <treding at nvidia.com>
> > > > ---
> > > > Changes in v3:
> > > > - clarify the use of IOMMU mapping for compressible buffers
> > > > - squash multiple patches into this
> > > > 
> > > >    drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c | 5
+++++
> > > >    1 file changed, 5 insertions(+)
> > > > 
> > > > diff --git
a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
> > > > index 78597da6313a..d0538af1b967 100644
> > > > --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
> > > > +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
> > > > @@ -105,6 +105,11 @@ nvkm_device_tegra_probe_iommu(struct
nvkm_device_tegra *tdev)
> > > >    	unsigned long pgsize_bitmap;
> > > >    	int ret;
> > > > +#if IS_ENABLED(CONFIG_ARM)
> > > 
> > > Wouldn't CONFIG_ARM_DMA_USE_IOMMU be even more appropriate?
> > 
> > Not necessarily. arm_dma_iommu_detach_device() is always defined on
ARM,
> > only with CONFIG_ARM_DMA_USE_IOMMU=n it will be empty. So this check
is
> > a guard to make sure we don't call the function when it isn't
available,
> > but it may still not do anything.
> 
> Calling a function under condition A, which only does anything under
> condition B, when B depends on A, is identical in behaviour to only calling
> the function under condition B, except needlessly harder to follow.
> 
> > > > +	/* make sure we can use the IOMMU exclusively */
> > > > +	arm_dma_iommu_detach_device(dev);
> > > 
> > > As before, I would just use the existing infrastructure the same
way the
> > > Exynos DRM driver currently does in __exynos_iommu_attach()
(albeit without
> > > then reattaching to another DMA ops mapping).
> > 
> > That's pretty much what I initially did and which was shot down by
> > Christoph. As I said earlier, at this point I don't really care
what
> > color the shed will be. Can you and Christoph come to an agreement
> > on what it should be?
> 
> What I was getting at is that arm_iommu_detach_device() already *is* the
> exact function Christoph was asking for, it just needs a minor fix instead
> of adding explicit set_dma_ops() fiddling at its callsites which only
> obfuscates the fact that it's supposed to be responsible for resetting
the
> device's DMA ops already.
It still has the downside of callers having to explicitly check for the
existence of a mapping, otherwise they'll cause a warning to be printed
to the kernel log.

That's not all that bad, though. I'll prepare version 4 with those
changes.

Thierry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL:
<https://lists.freedesktop.org/archives/nouveau/attachments/20180530/02b467fa/attachment.sig>

Robin Murphy

2018-May-30 13:46 UTC

head link

[Nouveau] [PATCH v3 2/2] drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping

On 30/05/18 14:41, Thierry Reding wrote:> On Wed, May 30, 2018 at 02:30:51PM +0100, Robin Murphy wrote:
>> On 30/05/18 14:00, Thierry Reding wrote:
>>> On Wed, May 30, 2018 at 11:30:25AM +0100, Robin Murphy wrote:
>>>> On 30/05/18 09:03, Thierry Reding wrote:
>>>>> From: Thierry Reding <treding at nvidia.com>
>>>>>
>>>>> Depending on the kernel configuration, early ARM
architecture setup code
>>>>> may have attached the GPU to a DMA/IOMMU mapping that
transparently uses
>>>>> the IOMMU to back the DMA API. Tegra requires special
handling for IOMMU
>>>>> backed buffers (a special bit in the GPU's MMU page
tables indicates the
>>>>> memory path to take: via the SMMU or directly to the memory
controller).
>>>>> Transparently backing DMA memory with an IOMMU prevents
Nouveau from
>>>>> properly handling such memory accesses and causes memory
access faults.
>>>>>
>>>>> As a side-note: buffers other than those allocated in
instance memory
>>>>> don't need to be physically contiguous from the
GPU's perspective since
>>>>> the GPU can map them into contiguous buffers using its own
MMU. Mapping
>>>>> these buffers through the IOMMU is unnecessary and will
even lead to
>>>>> performance degradation because of the additional
translation. One
>>>>> exception to this are compressible buffers which need large
pages. In
>>>>> order to enable these large pages, multiple small pages
will have to be
>>>>> combined into one large (I/O virtually contiguous) mapping
via the
>>>>> IOMMU. However, that is a topic outside the scope of this
fix and isn't
>>>>> currently supported. An implementation will want to
explicitly create
>>>>> these large pages in the Nouveau driver, so detaching from
a DMA/IOMMU
>>>>> mapping would still be required.
>>>>
>>>> I wonder if it might make sense to have a hook in
iommu_attach_device() to
>>>> notify the arch DMA API code when moving devices between
unmanaged and DMA
>>>> ops domains? That seems like it might be the most low-impact
way to address
>>>> the overall problem long-term.
>>>>
>>>>> Signed-off-by: Thierry Reding <treding at nvidia.com>
>>>>> ---
>>>>> Changes in v3:
>>>>> - clarify the use of IOMMU mapping for compressible buffers
>>>>> - squash multiple patches into this
>>>>>
>>>>>     drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c | 5
+++++
>>>>>     1 file changed, 5 insertions(+)
>>>>>
>>>>> diff --git
a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
>>>>> index 78597da6313a..d0538af1b967 100644
>>>>> --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
>>>>> +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
>>>>> @@ -105,6 +105,11 @@ nvkm_device_tegra_probe_iommu(struct
nvkm_device_tegra *tdev)
>>>>>     	unsigned long pgsize_bitmap;
>>>>>     	int ret;
>>>>> +#if IS_ENABLED(CONFIG_ARM)
>>>>
>>>> Wouldn't CONFIG_ARM_DMA_USE_IOMMU be even more appropriate?
>>>
>>> Not necessarily. arm_dma_iommu_detach_device() is always defined on
ARM,
>>> only with CONFIG_ARM_DMA_USE_IOMMU=n it will be empty. So this
check is
>>> a guard to make sure we don't call the function when it
isn't available,
>>> but it may still not do anything.
>>
>> Calling a function under condition A, which only does anything under
>> condition B, when B depends on A, is identical in behaviour to only
calling
>> the function under condition B, except needlessly harder to follow.
>>
>>>>> +	/* make sure we can use the IOMMU exclusively */
>>>>> +	arm_dma_iommu_detach_device(dev);
>>>>
>>>> As before, I would just use the existing infrastructure the
same way the
>>>> Exynos DRM driver currently does in __exynos_iommu_attach()
(albeit without
>>>> then reattaching to another DMA ops mapping).
>>>
>>> That's pretty much what I initially did and which was shot down
by
>>> Christoph. As I said earlier, at this point I don't really care
what
>>> color the shed will be. Can you and Christoph come to an agreement
>>> on what it should be?
>>
>> What I was getting at is that arm_iommu_detach_device() already *is*
the
>> exact function Christoph was asking for, it just needs a minor fix
instead
>> of adding explicit set_dma_ops() fiddling at its callsites which only
>> obfuscates the fact that it's supposed to be responsible for
resetting the
>> device's DMA ops already.
> 
> It still has the downside of callers having to explicitly check for the
> existence of a mapping, otherwise they'll cause a warning to be printed
> to the kernel log.
Or we could look at the way it's actually used, and reconsider whether 
the warning is really appropriate. That's always an option ;)

Robin.
> That's not all that bad, though. I'll prepare version 4 with those
> changes.
> 
> Thierry
>

Apparently Analagous Threads

Search for more maybe matching threads

Nouveau - May 2018 - [PATCH v3 2/2] drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping

[Nouveau] [PATCH v3 2/2] drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping

[Nouveau] [PATCH v3 2/2] drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping

[Nouveau] [PATCH v3 2/2] drm/nouveau: tegra: Detach from ARM DMA/IOMMU mapping

Apparently Analagous Threads