Maarten Lankhorst
2013-Jul-15 08:39 UTC
[Nouveau] [PATCH] drm/nouveau: do not move buffers when not needed
Op 15-07-13 08:05, Ben Skeggs schreef:> On Fri, Jul 12, 2013 at 10:45 PM, Maarten Lankhorst > <maarten.lankhorst at canonical.com> wrote: >> I have no idea what this bogus restriction on placement is, but it breaks decoding 1080p >> VDPAU at boot speed. With this patch applied I only need to bump the vdec clock to >> get real-time 1080p decoding. It prevents a lot of VRAM <-> VRAM buffer moves. > It's not bogus, and is required for pre-GF8 boards with VRAM > BAR size. > > What configuration does the buffer that's getting moved here have > exactly? The placement restriction isn't necessary on GF8, the rest > of the restrictions may currently be required still however. > >= vdpau on NVC0 with tiling. I upload the raw bitstream to a tiling bo. This is ok becausethe vm hides all the tiling translations, and the engines will read the raw bitstream correctly. 8<--- This prevents buffer moves from being done on NV50+, where remapping is not needed because the bar has its own VM, instead of only having the first BAR1-size chunk of VRAM accessible. nouveau_bo_tile_layout is always 0 on < NV_50. Signed-off-by: Maarten Lankhorst <maarten.lankhorst at canonical.com> --- diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c index d506da5..762bfcd 100644 --- a/drivers/gpu/drm/nouveau/nouveau_bo.c +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c @@ -1346,14 +1361,13 @@ nouveau_ttm_fault_reserve_notify(struct ttm_buffer_object *bo) struct nouveau_device *device = nv_device(drm->device); u32 mappable = pci_resource_len(device->pdev, 1) >> PAGE_SHIFT; - /* as long as the bo isn't in vram, and isn't tiled, we've got - * nothing to do here. + /* + * if the bo is not in vram, or remapping can be done (nv50+) + * do not worry about placement, any location is valid */ - if (bo->mem.mem_type != TTM_PL_VRAM) { - if (nv_device(drm->device)->card_type < NV_50 || - !nouveau_bo_tile_layout(nvbo)) - return 0; - } + if (nv_device(drm->device)->card_type >= NV_50 || + bo->mem.mem_type != TTM_PL_VRAM) + return 0; /* make sure bo is in mappable vram */ if (bo->mem.start + bo->mem.num_pages < mappable)
Martin Peres
2013-Aug-24 06:26 UTC
[Nouveau] [PATCH] drm/nouveau: do not move buffers when not needed
On 15/07/2013 10:39, Maarten Lankhorst wrote:> Op 15-07-13 08:05, Ben Skeggs schreef: >> On Fri, Jul 12, 2013 at 10:45 PM, Maarten Lankhorst >> <maarten.lankhorst at canonical.com> wrote: >>> I have no idea what this bogus restriction on placement is, but it breaks decoding 1080p >>> VDPAU at boot speed. With this patch applied I only need to bump the vdec clock to >>> get real-time 1080p decoding. It prevents a lot of VRAM <-> VRAM buffer moves. >> It's not bogus, and is required for pre-GF8 boards with VRAM > BAR size. >> >> What configuration does the buffer that's getting moved here have >> exactly? The placement restriction isn't necessary on GF8, the rest >> of the restrictions may currently be required still however. >> >> = vdpau on NVC0 with tiling. I upload the raw bitstream to a tiling bo. This is ok because > the vm hides all the tiling translations, and the engines will read the raw bitstream correctly. > 8<--- > This prevents buffer moves from being done on NV50+, where remapping is not needed because > the bar has its own VM, instead of only having the first BAR1-size chunk of VRAM accessible. > nouveau_bo_tile_layout is always 0 on < NV_50. > > Signed-off-by: Maarten Lankhorst <maarten.lankhorst at canonical.com>There are still some rendering issues on my nvc4, but the framerate is much smoother than it was before this patch. Tested-by: Martin Peres <martin.peres at labri.fr>> --- > diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c > index d506da5..762bfcd 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_bo.c > +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c > @@ -1346,14 +1361,13 @@ nouveau_ttm_fault_reserve_notify(struct ttm_buffer_object *bo) > struct nouveau_device *device = nv_device(drm->device); > u32 mappable = pci_resource_len(device->pdev, 1) >> PAGE_SHIFT; > > - /* as long as the bo isn't in vram, and isn't tiled, we've got > - * nothing to do here. > + /* > + * if the bo is not in vram, or remapping can be done (nv50+) > + * do not worry about placement, any location is valid > */ > - if (bo->mem.mem_type != TTM_PL_VRAM) { > - if (nv_device(drm->device)->card_type < NV_50 || > - !nouveau_bo_tile_layout(nvbo)) > - return 0; > - } > + if (nv_device(drm->device)->card_type >= NV_50 || > + bo->mem.mem_type != TTM_PL_VRAM) > + return 0; > > /* make sure bo is in mappable vram */ > if (bo->mem.start + bo->mem.num_pages < mappable) > > _______________________________________________ > Nouveau mailing list > Nouveau at lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/nouveau
Ben Skeggs
2013-Sep-04 01:24 UTC
[Nouveau] [PATCH] drm/nouveau: do not move buffers when not needed
On Mon, Jul 15, 2013 at 6:39 PM, Maarten Lankhorst <maarten.lankhorst at canonical.com> wrote:> Op 15-07-13 08:05, Ben Skeggs schreef: >> On Fri, Jul 12, 2013 at 10:45 PM, Maarten Lankhorst >> <maarten.lankhorst at canonical.com> wrote: >>> I have no idea what this bogus restriction on placement is, but it breaks decoding 1080p >>> VDPAU at boot speed. With this patch applied I only need to bump the vdec clock to >>> get real-time 1080p decoding. It prevents a lot of VRAM <-> VRAM buffer moves. >> It's not bogus, and is required for pre-GF8 boards with VRAM > BAR size. >> >> What configuration does the buffer that's getting moved here have >> exactly? The placement restriction isn't necessary on GF8, the rest >> of the restrictions may currently be required still however. >> >>= vdpau on NVC0 with tiling. I upload the raw bitstream to a tiling bo. This is ok because > the vm hides all the tiling translations, and the engines will read the raw bitstream correctly.Why would you be doing such a thing in the first place? It seems pointless, and quite possibly counter-productive to use a tiled layout for a linear data structure...> 8<--- > This prevents buffer moves from being done on NV50+, where remapping is not needed because > the bar has its own VM, instead of only having the first BAR1-size chunk of VRAM accessible. > nouveau_bo_tile_layout is always 0 on < NV_50. > > Signed-off-by: Maarten Lankhorst <maarten.lankhorst at canonical.com> > --- > diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c > index d506da5..762bfcd 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_bo.c > +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c > @@ -1346,14 +1361,13 @@ nouveau_ttm_fault_reserve_notify(struct ttm_buffer_object *bo) > struct nouveau_device *device = nv_device(drm->device); > u32 mappable = pci_resource_len(device->pdev, 1) >> PAGE_SHIFT; > > - /* as long as the bo isn't in vram, and isn't tiled, we've got > - * nothing to do here. > + /* > + * if the bo is not in vram, or remapping can be done (nv50+) > + * do not worry about placement, any location is valid > */ > - if (bo->mem.mem_type != TTM_PL_VRAM) { > - if (nv_device(drm->device)->card_type < NV_50 || > - !nouveau_bo_tile_layout(nvbo)) > - return 0; > - } > + if (nv_device(drm->device)->card_type >= NV_50 || > + bo->mem.mem_type != TTM_PL_VRAM) > + return 0;I get what you're trying to do here, and we should definitely avoid the "mappable vram" check on GF8, but I suspect this condition is too broad. I'll think about it more after I finish reviewing the rest of the patches on the list.. Thanks, Ben.> > /* make sure bo is in mappable vram */ > if (bo->mem.start + bo->mem.num_pages < mappable) >
Maarten Lankhorst
2013-Sep-04 13:25 UTC
[Nouveau] [PATCH] drm/nouveau: do not move buffers when not needed
Op 04-09-13 03:24, Ben Skeggs schreef:> On Mon, Jul 15, 2013 at 6:39 PM, Maarten Lankhorst > <maarten.lankhorst at canonical.com> wrote: >> Op 15-07-13 08:05, Ben Skeggs schreef: >>> On Fri, Jul 12, 2013 at 10:45 PM, Maarten Lankhorst >>> <maarten.lankhorst at canonical.com> wrote: >>>> I have no idea what this bogus restriction on placement is, but it breaks decoding 1080p >>>> VDPAU at boot speed. With this patch applied I only need to bump the vdec clock to >>>> get real-time 1080p decoding. It prevents a lot of VRAM <-> VRAM buffer moves. >>> It's not bogus, and is required for pre-GF8 boards with VRAM > BAR size. >>> >>> What configuration does the buffer that's getting moved here have >>> exactly? The placement restriction isn't necessary on GF8, the rest >>> of the restrictions may currently be required still however. >>> >>> = vdpau on NVC0 with tiling. I upload the raw bitstream to a tiling bo. This is ok because >> the vm hides all the tiling translations, and the engines will read the raw bitstream correctly. > Why would you be doing such a thing in the first place? It seems > pointless, and quite possibly counter-productive to use a tiled layout > for a linear data structure...Initially I just allocated everything I didn't need to access directly tiled, and it seems I did the same for the bitstream bo. I only found out later about the bug with excessive moves causing a major slowdown.>> 8<--- >> This prevents buffer moves from being done on NV50+, where remapping is not needed because >> the bar has its own VM, instead of only having the first BAR1-size chunk of VRAM accessible. >> nouveau_bo_tile_layout is always 0 on < NV_50. >> >> Signed-off-by: Maarten Lankhorst <maarten.lankhorst at canonical.com> >> --- >> diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c >> index d506da5..762bfcd 100644 >> --- a/drivers/gpu/drm/nouveau/nouveau_bo.c >> +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c >> @@ -1346,14 +1361,13 @@ nouveau_ttm_fault_reserve_notify(struct ttm_buffer_object *bo) >> struct nouveau_device *device = nv_device(drm->device); >> u32 mappable = pci_resource_len(device->pdev, 1) >> PAGE_SHIFT; >> >> - /* as long as the bo isn't in vram, and isn't tiled, we've got >> - * nothing to do here. >> + /* >> + * if the bo is not in vram, or remapping can be done (nv50+) >> + * do not worry about placement, any location is valid >> */ >> - if (bo->mem.mem_type != TTM_PL_VRAM) { >> - if (nv_device(drm->device)->card_type < NV_50 || >> - !nouveau_bo_tile_layout(nvbo)) >> - return 0; >> - } >> + if (nv_device(drm->device)->card_type >= NV_50 || >> + bo->mem.mem_type != TTM_PL_VRAM) >> + return 0; > I get what you're trying to do here, and we should definitely avoid > the "mappable vram" check on GF8, but I suspect this condition is too > broad. I'll think about it more after I finish reviewing the rest of > the patches on the list.. >I think this relaxed check is fine. If it's !VRAM, the host can always access it because it has direct access to the pages without needing anything from the gpu. On >= NV50 the move can always be skipped too because the memory is mapped to the vm, and always accessible. ~Maarten
Apparently Analagous Threads
- [PATCH] drm/nouveau: do not move buffers when not needed
- [PATCH] drm/nouveau: kill nouveau_ttm_fault_reserve_notify handler to prevent useless buffer moves
- [PATCH] drm/nouveau: do not move buffers when not needed
- [PATCH] drm/nouveau: kill nouveau_ttm_fault_reserve_notify handler to prevent useless buffer moves
- [PATCH 1/7] drm/nouveau: fix m2mf copy to tiled gart