Alexandre Courbot
2015-Apr-17 06:26 UTC
[Nouveau] [PATCH 4/6] drm: enable big page mapping for small pages when IOMMU is available
On Thu, Apr 16, 2015 at 8:06 PM, Vince Hsu <vinceh at nvidia.com> wrote:
> Some platforms have IOMMU to map non-contiguous physical memory into
> contiguous GPU virtual address. We can use this feature to enable big pages
> mapping on scattered small pages. To achieve that, we also need changes in
> subdev/mmu as well.
>
> Signed-off-by: Vince Hsu <vinceh at nvidia.com>
> ---
>  drm/nouveau/nouveau_bo.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/drm/nouveau/nouveau_bo.c b/drm/nouveau/nouveau_bo.c
> index 77326e344dad..da76ee1121e4 100644
> --- a/drm/nouveau/nouveau_bo.c
> +++ b/drm/nouveau/nouveau_bo.c
> @@ -221,6 +221,11 @@ nouveau_bo_new(struct drm_device *dev, int size, int align,
>         if (drm->client.vm) {
>                 if (!(flags & TTM_PL_FLAG_TT) && size > 256 * 1024)
>                         nvbo->page_shift = drm->client.vm->mmu->lpg_shift;
> +
> +               if ((flags & TTM_PL_FLAG_TT) &&
> +                   drm->client.vm->mmu->iommu_capable &&
> +                   (size % (1 << drm->client.vm->mmu->lpg_shift)) == 0)
> +                       nvbo->page_shift = drm->client.vm->mmu->lpg_shift;

I wonder if we should not just use the same size heuristics as for VRAM above?

Here, unless your buffer size is an exact multiple of the big page
size (128K for GK20A), you will not use big pages at all. In effect,
many buffers will be rejected for this reason. A behavior like "if the
buffer size is more than 256KB, increase the size of the buffer to the
next multiple of 128K and use big pages" would probably yield better
results.

> }
>
>         nouveau_bo_fixup_align(nvbo, flags, &align, &size);
> @@ -1641,6 +1646,10 @@ nouveau_bo_vma_add(struct nouveau_bo *nvbo, struct nvkm_vm *vm,
>             (nvbo->bo.mem.mem_type == TTM_PL_VRAM ||
>              nvbo->page_shift != vma->vm->mmu->lpg_shift))
>                 nvkm_vm_map(vma, nvbo->bo.mem.mm_node);
> +       else if (nvbo->bo.mem.mem_type == TTM_PL_TT &&
> +                vma->vm->mmu->iommu_capable &&
> +                nvbo->page_shift == vma->vm->mmu->lpg_shift)
> +               nvkm_vm_map(vma, nvbo->bo.mem.mm_node);

Sorry, I don't understand why this is needed, could you explain?
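As a minimal sketch of the suggested heuristic (the helper name is
hypothetical and not part of the patch; ALIGN() is the usual kernel
round-up macro):

/*
 * Hypothetical helper sketching the heuristic suggested above: for
 * system-memory (TT) buffers above the same 256K threshold used for
 * VRAM, pad the size up to the next big-page multiple so big pages
 * can always be used.
 */
static u64 round_size_for_big_pages(u64 size, u32 lpg_shift)
{
	u64 big_page = 1ULL << lpg_shift;	/* 128K on GK20A */

	if (size > 256 * 1024)
		size = ALIGN(size, big_page);	/* e.g. 300K -> 384K */
	return size;
}

Buffers of 256K or less would keep small pages, matching the VRAM
heuristic above.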
Vince Hsu
2015-Apr-17 07:28 UTC
[Nouveau] [PATCH 4/6] drm: enable big page mapping for small pages when IOMMU is available
On 04/17/2015 02:26 PM, Alexandre Courbot wrote:
> On Thu, Apr 16, 2015 at 8:06 PM, Vince Hsu <vinceh at nvidia.com> wrote:
>> Some platforms have IOMMU to map non-contiguous physical memory into
>> contiguous GPU virtual address. We can use this feature to enable big pages
>> mapping on scattered small pages. To achieve that, we also need changes in
>> subdev/mmu as well.
>>
>> Signed-off-by: Vince Hsu <vinceh at nvidia.com>
>> ---
>>  drm/nouveau/nouveau_bo.c | 9 +++++++++
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/drm/nouveau/nouveau_bo.c b/drm/nouveau/nouveau_bo.c
>> index 77326e344dad..da76ee1121e4 100644
>> --- a/drm/nouveau/nouveau_bo.c
>> +++ b/drm/nouveau/nouveau_bo.c
>> @@ -221,6 +221,11 @@ nouveau_bo_new(struct drm_device *dev, int size, int align,
>>         if (drm->client.vm) {
>>                 if (!(flags & TTM_PL_FLAG_TT) && size > 256 * 1024)
>>                         nvbo->page_shift = drm->client.vm->mmu->lpg_shift;
>> +
>> +               if ((flags & TTM_PL_FLAG_TT) &&
>> +                   drm->client.vm->mmu->iommu_capable &&
>> +                   (size % (1 << drm->client.vm->mmu->lpg_shift)) == 0)
>> +                       nvbo->page_shift = drm->client.vm->mmu->lpg_shift;
> I wonder if we should not just use the same size heuristics as for VRAM above?
>
> Here, unless your buffer size is an exact multiple of the big page
> size (128K for GK20A), you will not use big pages at all. In effect,
> many buffers will be rejected for this reason. A behavior like "if the
> buffer size is more than 256KB, increase the size of the buffer to the
> next multiple of 128K and use big pages" would probably yield better
> results.
I'm told that the user space would align the buffer to the big page size,
so I made this patch based on that impression. I'm not very familiar with
the user space behavior, so sorry if I say something wrong here.

And are we allowed to extend the buffer size even if the exporter doesn't
know about it?

>> }
>>
>>         nouveau_bo_fixup_align(nvbo, flags, &align, &size);
>> @@ -1641,6 +1646,10 @@ nouveau_bo_vma_add(struct nouveau_bo *nvbo, struct nvkm_vm *vm,
>>             (nvbo->bo.mem.mem_type == TTM_PL_VRAM ||
>>              nvbo->page_shift != vma->vm->mmu->lpg_shift))
>>                 nvkm_vm_map(vma, nvbo->bo.mem.mm_node);
>> +       else if (nvbo->bo.mem.mem_type == TTM_PL_TT &&
>> +                vma->vm->mmu->iommu_capable &&
>> +                nvbo->page_shift == vma->vm->mmu->lpg_shift)
>> +               nvkm_vm_map(vma, nvbo->bo.mem.mm_node);
> Sorry, I don't understand why this is needed, could you explain?
Originally the buffer could only be mapped immediately when the memory
type is VRAM or the pages are small, so I added another condition to map
big pages for TT.

Thanks,
Vince
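Read as one predicate, the decision Vince describes looks roughly like
the sketch below, using the names from the diff; the real condition in
nouveau_bo_vma_add() has additional clauses not quoted above:

/*
 * Sketch of the immediate-mapping decision described above; not the
 * complete upstream condition.
 */
static bool should_map_now(struct nouveau_bo *nvbo, struct nvkm_vma *vma)
{
	/* Pre-existing cases: VRAM, or small (4K) pages. */
	if (nvbo->bo.mem.mem_type == TTM_PL_VRAM ||
	    nvbo->page_shift != vma->vm->mmu->lpg_shift)
		return true;

	/* New case: big pages backed by system memory through the IOMMU.
	 * The page_shift test mirrors the else-if in the diff, though it
	 * is already implied by the early return above. */
	return nvbo->bo.mem.mem_type == TTM_PL_TT &&
	       vma->vm->mmu->iommu_capable &&
	       nvbo->page_shift == vma->vm->mmu->lpg_shift;
}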
Alexandre Courbot
2015-Apr-17 07:33 UTC
[Nouveau] [PATCH 4/6] drm: enable big page mapping for small pages when IOMMU is available
On Fri, Apr 17, 2015 at 4:28 PM, Vince Hsu <vinceh at nvidia.com> wrote:
>
> On 04/17/2015 02:26 PM, Alexandre Courbot wrote:
>>
>> On Thu, Apr 16, 2015 at 8:06 PM, Vince Hsu <vinceh at nvidia.com> wrote:
>>>
>>> Some platforms have IOMMU to map non-contiguous physical memory into
>>> contiguous GPU virtual address. We can use this feature to enable big
>>> pages mapping on scattered small pages. To achieve that, we also need
>>> changes in subdev/mmu as well.
>>>
>>> Signed-off-by: Vince Hsu <vinceh at nvidia.com>
>>> ---
>>>  drm/nouveau/nouveau_bo.c | 9 +++++++++
>>>  1 file changed, 9 insertions(+)
>>>
>>> diff --git a/drm/nouveau/nouveau_bo.c b/drm/nouveau/nouveau_bo.c
>>> index 77326e344dad..da76ee1121e4 100644
>>> --- a/drm/nouveau/nouveau_bo.c
>>> +++ b/drm/nouveau/nouveau_bo.c
>>> @@ -221,6 +221,11 @@ nouveau_bo_new(struct drm_device *dev, int size, int align,
>>>         if (drm->client.vm) {
>>>                 if (!(flags & TTM_PL_FLAG_TT) && size > 256 * 1024)
>>>                         nvbo->page_shift = drm->client.vm->mmu->lpg_shift;
>>> +
>>> +               if ((flags & TTM_PL_FLAG_TT) &&
>>> +                   drm->client.vm->mmu->iommu_capable &&
>>> +                   (size % (1 << drm->client.vm->mmu->lpg_shift)) == 0)
>>> +                       nvbo->page_shift = drm->client.vm->mmu->lpg_shift;
>>
>> I wonder if we should not just use the same size heuristics as for VRAM
>> above?
>>
>> Here, unless your buffer size is an exact multiple of the big page
>> size (128K for GK20A), you will not use big pages at all. In effect,
>> many buffers will be rejected for this reason. A behavior like "if the
>> buffer size is more than 256KB, increase the size of the buffer to the
>> next multiple of 128K and use big pages" would probably yield better
>> results.
>
> I'm told that the user space would align the buffer to the big page size,
> so I made this patch based on that impression. I'm not very familiar with
> the user space behavior, so sorry if I say something wrong here.

We would have to add this logic into Mesa, and only for GPUs that feature
no VRAM. Not impossible, but let's gather more informed opinions about this.
Terje Bergstrom
2015-Apr-17 15:24 UTC
[Nouveau] [PATCH 4/6] drm: enable big page mapping for small pages when IOMMU is available
On 04/17/2015 12:28 AM, Vince Hsu wrote:
>
> On 04/17/2015 02:26 PM, Alexandre Courbot wrote:
>> I wonder if we should not just use the same size heuristics as for
>> VRAM above?
>>
>> Here, unless your buffer size is an exact multiple of the big page
>> size (128K for GK20A), you will not use big pages at all. In effect,
>> many buffers will be rejected for this reason. A behavior like "if the
>> buffer size is more than 256KB, increase the size of the buffer to the
>> next multiple of 128K and use big pages" would probably yield better
>> results.
>
> I'm told that the user space would align the buffer to the big page
> size, so I made this patch based on that impression. I'm not very
> familiar with the user space behavior, so sorry if I say something
> wrong here.
>
> And are we allowed to extend the buffer size even if the exporter
> doesn't know about it?
The alignment of the IOMMU VA base address and size has to be correct. As
a consequence, you could think that this means the allocated size of the
buffer has to be big-page aligned. You could also just allocate the
buffer's actual size with whatever alignment, and take care of alignment
at buffer mapping time.

If the buffer is 120K, we could allocate 128K of IOMMU VA at a
128K-aligned base address, leave the remaining 8K of IOMMU VA unmapped,
and map it as a single large page to the GPU MMU. But that would allow
triggering IOMMU faults from GPU command buffers, and IOMMU faults are
much more difficult to debug than GPU MMU faults, so I don't recommend
this.

So in summary, I prefer taking care of buffer size alignment at
allocation time, and IOMMU VA base alignment at mapping time.
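A sketch of the split Terje recommends, with hypothetical helper names
(not from nouveau): pad the buffer to a whole number of big pages at
allocation time, and pick a big-page-aligned IOMMU VA base at mapping
time, so no IOMMU VA behind a GPU big page is left unmapped.

/*
 * Hypothetical helpers illustrating the recommendation above; ALIGN()
 * rounds up to a power-of-two multiple.
 */

/* Allocation time: pad the backing store to a big-page multiple. */
static u64 big_page_alloc_size(u64 size, u32 lpg_shift)
{
	return ALIGN(size, 1ULL << lpg_shift);	/* 120K -> 128K on GK20A */
}

/* Mapping time: the IOMMU VA base must share the big-page alignment. */
static u64 big_page_iova_base(u64 base_hint, u32 lpg_shift)
{
	return ALIGN(base_hint, 1ULL << lpg_shift);	/* 128K-aligned base */
}

With the size padded at allocation, every big page the GPU MMU maps is
fully backed by IOMMU mappings, avoiding the hard-to-debug IOMMU faults
described above.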