Ilia Mirkin
2013-Aug-11 07:19 UTC
[Nouveau] [PATCH 00/10] Add support for MPEG2 and VC-1 on VP3/VP4 for NV98-NVAF
As it turns out, with the proprietary firmware, the VP3 and VP4
interfaces are identical. Furthermore, this is all already implemented
for nvc0. So these patches (a) move the easily sharable bits of the nvc0
implementation into the nouveau directory, and then (b) implement the
other parts in nv50. The non-shared parts are still largely copies, but
there are some differences, not the least of which is the BEGIN_NV04 vs
BEGIN_NVC0 stuff. Probably more refactoring is possible, but right now
there are ~1k lines of code in nouveau, and ~1k lines of code in each of
nvc0/nvc0_video* and nv50/nv98_video*.

For whatever reason, h264 and mpeg4 don't work "out of the box". With
h264, the decoder hangs after decoding a few frames, and I think reports
are that mpeg4 just doesn't work at all. (I also seem to remember hearing
that mpeg4 didn't work on nvc0+ either, so that's likely related.) I
played with it a bunch, but I couldn't figure out what was wrong. Getting
h264 to work will be a larger effort, and will be able to build on this
patchset. For now h264 and mpeg4 are not reported in the capabilities of
pre-nvc0 cards.

This patchset has received limited testing on VP3, pre-nvc0 VP4,
post-nvc0 VP4, and VP5.

It's worth noting that the mpeg1 decoding looks all weird and blocky, but
it's the exact same issue as when using the blob. And the same bug exists
on VP2 (although not with mesa/xvmc). The parsers are probably missing
some detail about the difference between mpeg2 and mpeg1 (oddification
maybe? who knows... same bug exists with mesa/vdpau on top of VP2, so
it's something subtle that mplayer/ffmpeg gets right but mesa's and
nvidia's mpeg parsers don't).

For people who aren't already intimately familiar with the video decoding
acceleration situation on nvidia cards, take a look at
http://nouveau.freedesktop.org/wiki/VideoAcceleration/ for information on
how to obtain firmware, what kernels to use, what cards support what, etc.
You can also find these patches at
https://github.com/imirkin/mesa.git vp4-2

For reviewing purposes, you might prefer to look at the patches with
diff -M -C although I didn't think that would be appropriate for sending
to the ML.

Ilia Mirkin (10):
  nvc0: refactor video buffer management logic into nouveau_vp3
  nvc0: standardize on using #if for NVC0_DEBUG_FENCE
  nvc0: move nvc0_decoder into nouveau, rename to nouveau_vp3_decoder
  nvc0: move bsp param-filling logic into nouveau
  nvc0: move vp param filling logic into nouveau
  nvc0: move some of the simpler decoder functions into nouveau
  nvc0: move firmware loading functions to nouveau
  nvc0: move video param and format support functions to nouveau
  nv50: separate video logic from noalloc
  nv50: add vp3/vp4 support for mpeg2/vc1

 src/gallium/drivers/nouveau/Makefile.sources       |   5 +-
 src/gallium/drivers/nouveau/nouveau_vp3_video.c    | 399 ++++++++++++++++
 src/gallium/drivers/nouveau/nouveau_vp3_video.h    | 228 ++++++++++
 .../drivers/nouveau/nouveau_vp3_video_bsp.c        | 310 +++++++++++++
 src/gallium/drivers/nouveau/nouveau_vp3_video_vp.c | 485 ++++++++++++++++++++
 src/gallium/drivers/nv50/Makefile.sources          |   6 +-
 src/gallium/drivers/nv50/nv50_context.c            |   5 +-
 src/gallium/drivers/nv50/nv50_context.h            |  14 +
 src/gallium/drivers/nv50/nv50_miptree.c            |   6 +-
 src/gallium/drivers/nv50/nv50_resource.h           |   1 +
 src/gallium/drivers/nv50/nv50_screen.c             |   7 +-
 src/gallium/drivers/nv50/nv50_winsys.h             |   4 -
 src/gallium/drivers/nv50/nv84_video.c              |   2 +-
 src/gallium/drivers/nv50/nv84_video.h              |   4 +
 src/gallium/drivers/nv50/nv98_video.c              | 308 +++++++++++++
 src/gallium/drivers/nv50/nv98_video.h              |  48 ++
 src/gallium/drivers/nv50/nv98_video_bsp.c          | 159 +++++++
 src/gallium/drivers/nv50/nv98_video_ppp.c          | 143 ++++++
 src/gallium/drivers/nv50/nv98_video_vp.c           | 202 +++++++++
 src/gallium/drivers/nvc0/nvc0_context.h            |   5 -
 src/gallium/drivers/nvc0/nvc0_screen.c             |  18 +-
 src/gallium/drivers/nvc0/nvc0_video.c              | 332 +-------------
 src/gallium/drivers/nvc0/nvc0_video.h              | 191 +-------
 src/gallium/drivers/nvc0/nvc0_video_bsp.c          | 301 +-----------
 src/gallium/drivers/nvc0/nvc0_video_ppp.c          |  16 +-
 src/gallium/drivers/nvc0/nvc0_video_vp.c           | 502 +--------------------
 26 files changed, 2396 insertions(+), 1305 deletions(-)
 create mode 100644 src/gallium/drivers/nouveau/nouveau_vp3_video.c
 create mode 100644 src/gallium/drivers/nouveau/nouveau_vp3_video.h
 create mode 100644 src/gallium/drivers/nouveau/nouveau_vp3_video_bsp.c
 create mode 100644 src/gallium/drivers/nouveau/nouveau_vp3_video_vp.c
 create mode 100644 src/gallium/drivers/nv50/nv98_video.c
 create mode 100644 src/gallium/drivers/nv50/nv98_video.h
 create mode 100644 src/gallium/drivers/nv50/nv98_video_bsp.c
 create mode 100644 src/gallium/drivers/nv50/nv98_video_ppp.c
 create mode 100644 src/gallium/drivers/nv50/nv98_video_vp.c

--
1.8.1.5
Ilia Mirkin
2013-Aug-11 07:19 UTC
[Nouveau] [PATCH 01/10] nvc0: refactor video buffer management logic into nouveau_vp3
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
 src/gallium/drivers/nouveau/Makefile.sources    |   3 +-
 src/gallium/drivers/nouveau/nouveau_vp3_video.c | 165 ++++++++++++++++++++++++
 src/gallium/drivers/nouveau/nouveau_vp3_video.h |  38 ++++++
 src/gallium/drivers/nvc0/nvc0_video.c           | 135 +------------------
 src/gallium/drivers/nvc0/nvc0_video.h           |  29 ++---
 src/gallium/drivers/nvc0/nvc0_video_bsp.c       |   4 +-
 src/gallium/drivers/nvc0/nvc0_video_ppp.c       |   6 +-
 src/gallium/drivers/nvc0/nvc0_video_vp.c        |  38 +++---
 8 files changed, 243 insertions(+), 175 deletions(-)
 create mode 100644 src/gallium/drivers/nouveau/nouveau_vp3_video.c
 create mode 100644 src/gallium/drivers/nouveau/nouveau_vp3_video.h

diff --git a/src/gallium/drivers/nouveau/Makefile.sources b/src/gallium/drivers/nouveau/Makefile.sources
index cc9e68f..f7c9249 100644
--- a/src/gallium/drivers/nouveau/Makefile.sources
+++ b/src/gallium/drivers/nouveau/Makefile.sources
@@ -4,4 +4,5 @@ C_SOURCES := \
 	nouveau_mm.c \
 	nouveau_buffer.c \
 	nouveau_heap.c \
-	nouveau_video.c
+	nouveau_video.c \
+	nouveau_vp3_video.c
diff --git a/src/gallium/drivers/nouveau/nouveau_vp3_video.c b/src/gallium/drivers/nouveau/nouveau_vp3_video.c
new file mode 100644
index 0000000..a55c2e8
--- /dev/null
+++ b/src/gallium/drivers/nouveau/nouveau_vp3_video.c
@@ -0,0 +1,165 @@
+/*
+ * Copyright 2011-2013 Maarten Lankhorst
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#include "nouveau_screen.h"
+#include "nouveau_context.h"
+#include "nouveau_vp3_video.h"
+
+#include "util/u_video.h"
+#include "util/u_format.h"
+#include "util/u_sampler.h"
+
+static struct pipe_sampler_view **
+nouveau_vp3_video_buffer_sampler_view_planes(struct pipe_video_buffer *buffer)
+{
+   struct nouveau_vp3_video_buffer *buf = (struct nouveau_vp3_video_buffer *)buffer;
+   return buf->sampler_view_planes;
+}
+
+static struct pipe_sampler_view **
+nouveau_vp3_video_buffer_sampler_view_components(struct pipe_video_buffer *buffer)
+{
+   struct nouveau_vp3_video_buffer *buf = (struct nouveau_vp3_video_buffer *)buffer;
+   return buf->sampler_view_components;
+}
+
+static struct pipe_surface **
+nouveau_vp3_video_buffer_surfaces(struct pipe_video_buffer *buffer)
+{
+   struct nouveau_vp3_video_buffer *buf = (struct nouveau_vp3_video_buffer *)buffer;
+   return buf->surfaces;
+}
+
+static void
+nouveau_vp3_video_buffer_destroy(struct pipe_video_buffer *buffer)
+{
+   struct nouveau_vp3_video_buffer *buf = (struct nouveau_vp3_video_buffer *)buffer;
+   unsigned i;
+
+   assert(buf);
+
+   for (i = 0; i < VL_NUM_COMPONENTS; ++i) {
+      pipe_resource_reference(&buf->resources[i], NULL);
+      pipe_sampler_view_reference(&buf->sampler_view_planes[i], NULL);
+      pipe_sampler_view_reference(&buf->sampler_view_components[i], NULL);
+      pipe_surface_reference(&buf->surfaces[i * 2], NULL);
+      pipe_surface_reference(&buf->surfaces[i * 2 + 1], NULL);
+   }
+   FREE(buffer);
+}
+
+struct pipe_video_buffer *
+nouveau_vp3_video_buffer_create(struct pipe_context *pipe,
+                                const struct pipe_video_buffer *templat,
+                                int flags)
+{
+   struct nouveau_vp3_video_buffer *buffer;
+   struct pipe_resource templ;
+   unsigned i, j, component;
+   struct pipe_sampler_view sv_templ;
+   struct pipe_surface surf_templ;
+
+   assert(templat->interlaced);
+   if (getenv("XVMC_VL") || templat->buffer_format != PIPE_FORMAT_NV12)
+      return vl_video_buffer_create(pipe, templat);
+
+   assert(templat->chroma_format == PIPE_VIDEO_CHROMA_FORMAT_420);
+
+   buffer = CALLOC_STRUCT(nouveau_vp3_video_buffer);
+   if (!buffer)
+      return NULL;
+
+   buffer->base.buffer_format = templat->buffer_format;
+   buffer->base.context = pipe;
+   buffer->base.destroy = nouveau_vp3_video_buffer_destroy;
+   buffer->base.chroma_format = templat->chroma_format;
+   buffer->base.width = templat->width;
+   buffer->base.height = templat->height;
+   buffer->base.get_sampler_view_planes = nouveau_vp3_video_buffer_sampler_view_planes;
+   buffer->base.get_sampler_view_components = nouveau_vp3_video_buffer_sampler_view_components;
+   buffer->base.get_surfaces = nouveau_vp3_video_buffer_surfaces;
+   buffer->base.interlaced = true;
+
+   memset(&templ, 0, sizeof(templ));
+   templ.target = PIPE_TEXTURE_2D_ARRAY;
+   templ.depth0 = 1;
+   templ.bind = PIPE_BIND_SAMPLER_VIEW | PIPE_BIND_RENDER_TARGET;
+   templ.format = PIPE_FORMAT_R8_UNORM;
+   templ.width0 = buffer->base.width;
+   templ.height0 = (buffer->base.height + 1)/2;
+   templ.flags = flags;
+   templ.array_size = 2;
+
+   buffer->resources[0] = pipe->screen->resource_create(pipe->screen, &templ);
+   if (!buffer->resources[0])
+      goto error;
+
+   templ.format = PIPE_FORMAT_R8G8_UNORM;
+   buffer->num_planes = 2;
+   templ.width0 = (templ.width0 + 1) / 2;
+   templ.height0 = (templ.height0 + 1) / 2;
+   for (i = 1; i < buffer->num_planes; ++i) {
+      buffer->resources[i] = pipe->screen->resource_create(pipe->screen, &templ);
+      if (!buffer->resources[i])
+         goto error;
+   }
+
+   memset(&sv_templ, 0, sizeof(sv_templ));
+   for (component = 0, i = 0; i < buffer->num_planes; ++i ) {
+      struct pipe_resource *res = buffer->resources[i];
+      unsigned nr_components = util_format_get_nr_components(res->format);
+
+      u_sampler_view_default_template(&sv_templ, res, res->format);
+      buffer->sampler_view_planes[i] = pipe->create_sampler_view(pipe, res, &sv_templ);
+      if (!buffer->sampler_view_planes[i])
+         goto error;
+
+      for (j = 0; j < nr_components; ++j, ++component) {
+         sv_templ.swizzle_r = sv_templ.swizzle_g = sv_templ.swizzle_b = PIPE_SWIZZLE_RED + j;
+         sv_templ.swizzle_a = PIPE_SWIZZLE_ONE;
+
+         buffer->sampler_view_components[component] = pipe->create_sampler_view(pipe, res, &sv_templ);
+         if (!buffer->sampler_view_components[component])
+            goto error;
+      }
+   }
+
+   memset(&surf_templ, 0, sizeof(surf_templ));
+   for (j = 0; j < buffer->num_planes; ++j) {
+      surf_templ.format = buffer->resources[j]->format;
+      surf_templ.u.tex.first_layer = surf_templ.u.tex.last_layer = 0;
+      buffer->surfaces[j * 2] = pipe->create_surface(pipe, buffer->resources[j], &surf_templ);
+      if (!buffer->surfaces[j * 2])
+         goto error;
+
+      surf_templ.u.tex.first_layer = surf_templ.u.tex.last_layer = 1;
+      buffer->surfaces[j * 2 + 1] = pipe->create_surface(pipe, buffer->resources[j], &surf_templ);
+      if (!buffer->surfaces[j * 2 + 1])
+         goto error;
+   }
+
+   return &buffer->base;
+
+error:
+   nouveau_vp3_video_buffer_destroy(&buffer->base);
+   return NULL;
+}
diff --git a/src/gallium/drivers/nouveau/nouveau_vp3_video.h b/src/gallium/drivers/nouveau/nouveau_vp3_video.h
new file mode 100644
index 0000000..bff5d76
--- /dev/null
+++ b/src/gallium/drivers/nouveau/nouveau_vp3_video.h
@@ -0,0 +1,38 @@
+/*
+ * Copyright 2011-2013 Maarten Lankhorst
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#include "pipe/p_defines.h"
+#include "vl/vl_video_buffer.h"
+
+struct nouveau_vp3_video_buffer {
+   struct pipe_video_buffer base;
+   unsigned num_planes, valid_ref;
+   struct pipe_resource *resources[VL_NUM_COMPONENTS];
+   struct pipe_sampler_view *sampler_view_planes[VL_NUM_COMPONENTS];
+   struct pipe_sampler_view *sampler_view_components[VL_NUM_COMPONENTS];
+   struct pipe_surface *surfaces[VL_NUM_COMPONENTS * 2];
+};
+
+struct pipe_video_buffer *
+nouveau_vp3_video_buffer_create(struct pipe_context *pipe,
+                                const struct pipe_video_buffer *templat,
+                                int flags);
diff --git a/src/gallium/drivers/nvc0/nvc0_video.c b/src/gallium/drivers/nvc0/nvc0_video.c
index 7cc086a..626d0d4 100644
--- a/src/gallium/drivers/nvc0/nvc0_video.c
+++ b/src/gallium/drivers/nvc0/nvc0_video.c
@@ -63,12 +63,12 @@ nvc0_decoder_decode_bitstream(struct pipe_video_decoder *decoder,
                               const unsigned *num_bytes)
 {
    struct nvc0_decoder *dec = (struct nvc0_decoder *)decoder;
-   struct nvc0_video_buffer *target = (struct nvc0_video_buffer *)video_target;
+   struct nouveau_vp3_video_buffer *target = (struct nouveau_vp3_video_buffer *)video_target;
    uint32_t comm_seq = ++dec->fence_seq;
    union pipe_desc desc;

    unsigned vp_caps, is_ref, ret;
-   struct nvc0_video_buffer *refs[16] = {};
+   struct nouveau_vp3_video_buffer *refs[16] = {};

    desc.base = picture;

@@ -506,137 +506,10 @@ fail:
    return NULL;
 }

-static struct pipe_sampler_view **
-nvc0_video_buffer_sampler_view_planes(struct pipe_video_buffer *buffer)
-{
-   struct nvc0_video_buffer *buf = (struct nvc0_video_buffer *)buffer;
-   return buf->sampler_view_planes;
-}
-
-static struct pipe_sampler_view **
-nvc0_video_buffer_sampler_view_components(struct pipe_video_buffer *buffer)
-{
-   struct nvc0_video_buffer *buf = (struct nvc0_video_buffer *)buffer;
-   return buf->sampler_view_components;
-}
-
-static struct pipe_surface **
-nvc0_video_buffer_surfaces(struct pipe_video_buffer *buffer)
-{
-   struct nvc0_video_buffer *buf = (struct nvc0_video_buffer *)buffer;
-   return buf->surfaces;
-}
-
-static void
-nvc0_video_buffer_destroy(struct pipe_video_buffer *buffer)
-{
-   struct nvc0_video_buffer *buf = (struct nvc0_video_buffer *)buffer;
-   unsigned i;
-
-   assert(buf);
-
-   for (i = 0; i < VL_NUM_COMPONENTS; ++i) {
-      pipe_resource_reference(&buf->resources[i], NULL);
-      pipe_sampler_view_reference(&buf->sampler_view_planes[i], NULL);
-      pipe_sampler_view_reference(&buf->sampler_view_components[i], NULL);
-      pipe_surface_reference(&buf->surfaces[i * 2], NULL);
-      pipe_surface_reference(&buf->surfaces[i * 2 + 1], NULL);
-   }
-   FREE(buffer);
-}
-
 struct pipe_video_buffer *
 nvc0_video_buffer_create(struct pipe_context *pipe,
                          const struct pipe_video_buffer *templat)
 {
-   struct nvc0_video_buffer *buffer;
-   struct pipe_resource templ;
-   unsigned i, j, component;
-   struct pipe_sampler_view sv_templ;
-   struct pipe_surface surf_templ;
-
-   assert(templat->interlaced);
-   if (getenv("XVMC_VL") || templat->buffer_format != PIPE_FORMAT_NV12)
-      return vl_video_buffer_create(pipe, templat);
-
-   assert(templat->chroma_format == PIPE_VIDEO_CHROMA_FORMAT_420);
-
-   buffer = CALLOC_STRUCT(nvc0_video_buffer);
-   if (!buffer)
-      return NULL;
-
-   buffer->base.buffer_format = templat->buffer_format;
-   buffer->base.context = pipe;
-   buffer->base.destroy = nvc0_video_buffer_destroy;
-   buffer->base.chroma_format = templat->chroma_format;
-   buffer->base.width = templat->width;
-   buffer->base.height = templat->height;
-   buffer->base.get_sampler_view_planes = nvc0_video_buffer_sampler_view_planes;
-   buffer->base.get_sampler_view_components = nvc0_video_buffer_sampler_view_components;
-   buffer->base.get_surfaces = nvc0_video_buffer_surfaces;
-   buffer->base.interlaced = true;
-
-   memset(&templ, 0, sizeof(templ));
-   templ.target = PIPE_TEXTURE_2D_ARRAY;
-   templ.depth0 = 1;
-   templ.bind = PIPE_BIND_SAMPLER_VIEW | PIPE_BIND_RENDER_TARGET;
-   templ.format = PIPE_FORMAT_R8_UNORM;
-   templ.width0 = buffer->base.width;
-   templ.height0 = (buffer->base.height + 1)/2;
-   templ.flags = NVC0_RESOURCE_FLAG_VIDEO;
-   templ.array_size = 2;
-
-   buffer->resources[0] = pipe->screen->resource_create(pipe->screen, &templ);
-   if (!buffer->resources[0])
-      goto error;
-
-   templ.format = PIPE_FORMAT_R8G8_UNORM;
-   buffer->num_planes = 2;
-   templ.width0 = (templ.width0 + 1) / 2;
-   templ.height0 = (templ.height0 + 1) / 2;
-   for (i = 1; i < buffer->num_planes; ++i) {
-      buffer->resources[i] = pipe->screen->resource_create(pipe->screen, &templ);
-      if (!buffer->resources[i])
-         goto error;
-   }
-
-   memset(&sv_templ, 0, sizeof(sv_templ));
-   for (component = 0, i = 0; i < buffer->num_planes; ++i ) {
-      struct pipe_resource *res = buffer->resources[i];
-      unsigned nr_components = util_format_get_nr_components(res->format);
-
-      u_sampler_view_default_template(&sv_templ, res, res->format);
-      buffer->sampler_view_planes[i] = pipe->create_sampler_view(pipe, res, &sv_templ);
-      if (!buffer->sampler_view_planes[i])
-         goto error;
-
-      for (j = 0; j < nr_components; ++j, ++component) {
-         sv_templ.swizzle_r = sv_templ.swizzle_g = sv_templ.swizzle_b = PIPE_SWIZZLE_RED + j;
-         sv_templ.swizzle_a = PIPE_SWIZZLE_ONE;
-
-         buffer->sampler_view_components[component] = pipe->create_sampler_view(pipe, res, &sv_templ);
-         if (!buffer->sampler_view_components[component])
-            goto error;
-      }
-   }
-
-   memset(&surf_templ, 0, sizeof(surf_templ));
-   for (j = 0; j < buffer->num_planes; ++j) {
-      surf_templ.format = buffer->resources[j]->format;
-      surf_templ.u.tex.first_layer = surf_templ.u.tex.last_layer = 0;
-      buffer->surfaces[j * 2] = pipe->create_surface(pipe, buffer->resources[j], &surf_templ);
-      if (!buffer->surfaces[j * 2])
-         goto error;
-
-      surf_templ.u.tex.first_layer = surf_templ.u.tex.last_layer = 1;
-      buffer->surfaces[j * 2 + 1] = pipe->create_surface(pipe, buffer->resources[j], &surf_templ);
-      if (!buffer->surfaces[j * 2 + 1])
-         goto error;
-   }
-
-   return &buffer->base;
-
-error:
-   nvc0_video_buffer_destroy(&buffer->base);
-   return NULL;
+   return nouveau_vp3_video_buffer_create(
+      pipe, templat, NVC0_RESOURCE_FLAG_VIDEO);
 }
diff --git a/src/gallium/drivers/nvc0/nvc0_video.h b/src/gallium/drivers/nvc0/nvc0_video.h
index aed1424..271ed5c 100644
--- a/src/gallium/drivers/nvc0/nvc0_video.h
+++ b/src/gallium/drivers/nvc0/nvc0_video.h
@@ -22,9 +22,9 @@

 #include "nvc0_context.h"
 #include "nvc0_screen.h"
+#include "nouveau/nouveau_vp3_video.h"

 #include "vl/vl_decoder.h"
-#include "vl/vl_video_buffer.h"
 #include "vl/vl_types.h"

 #include "util/u_video.h"
@@ -53,15 +53,6 @@ union pipe_desc {
    struct pipe_h264_picture_desc *h264;
 };

-struct nvc0_video_buffer {
-   struct pipe_video_buffer base;
-   unsigned num_planes, valid_ref;
-   struct pipe_resource *resources[VL_NUM_COMPONENTS];
-   struct pipe_sampler_view *sampler_view_planes[VL_NUM_COMPONENTS];
-   struct pipe_sampler_view *sampler_view_components[VL_NUM_COMPONENTS];
-   struct pipe_surface *surfaces[VL_NUM_COMPONENTS * 2];
-};
-
 struct nvc0_decoder {
    struct pipe_video_decoder base;
    struct nouveau_client *client;
@@ -105,7 +96,7 @@ struct nvc0_decoder {
    // and give shaders a chance to run as well.

    struct {
-      struct nvc0_video_buffer *vidbuf;
+      struct nouveau_vp3_video_buffer *vidbuf;
       unsigned last_used;
       unsigned field_pic_flag : 1;
       unsigned decoded_top : 1;
@@ -151,7 +142,7 @@ static INLINE uint32_t mb_half(uint32_t coord)
 }

 static INLINE uint64_t
-nvc0_video_addr(struct nvc0_decoder *dec, struct nvc0_video_buffer *target)
+nvc0_video_addr(struct nvc0_decoder *dec, struct nouveau_vp3_video_buffer *target)
 {
    uint64_t ret;
    if (target)
@@ -197,25 +188,25 @@ nvc0_decoder_inter_sizes(struct nvc0_decoder *dec, uint32_t slice_count,

 extern unsigned
 nvc0_decoder_bsp(struct nvc0_decoder *dec, union pipe_desc desc,
-                 struct nvc0_video_buffer *target,
+                 struct nouveau_vp3_video_buffer *target,
                  unsigned comm_seq, unsigned num_buffers,
                  const void *const *data, const unsigned *num_bytes,
                  unsigned *vp_caps, unsigned *is_ref,
-                 struct nvc0_video_buffer *refs[16]);
+                 struct nouveau_vp3_video_buffer *refs[16]);

 extern void nvc0_decoder_vp_caps(struct nvc0_decoder *dec,
                                  union pipe_desc desc,
-                                 struct nvc0_video_buffer *target,
+                                 struct nouveau_vp3_video_buffer *target,
                                  unsigned comm_seq,
                                  unsigned *caps, unsigned *is_ref,
-                                 struct nvc0_video_buffer *refs[16]);
+                                 struct nouveau_vp3_video_buffer *refs[16]);

 extern void
 nvc0_decoder_vp(struct nvc0_decoder *dec, union pipe_desc desc,
-                struct nvc0_video_buffer *target, unsigned comm_seq,
+                struct nouveau_vp3_video_buffer *target, unsigned comm_seq,
                 unsigned caps, unsigned is_ref,
-                struct nvc0_video_buffer *refs[16]);
+                struct nouveau_vp3_video_buffer *refs[16]);

 extern void
 nvc0_decoder_ppp(struct nvc0_decoder *dec, union pipe_desc desc,
-                 struct nvc0_video_buffer *target, unsigned comm_seq);
+                 struct nouveau_vp3_video_buffer *target, unsigned comm_seq);
diff --git a/src/gallium/drivers/nvc0/nvc0_video_bsp.c b/src/gallium/drivers/nvc0/nvc0_video_bsp.c
index 450dc2b..8f93861 100644
--- a/src/gallium/drivers/nvc0/nvc0_video_bsp.c
+++ b/src/gallium/drivers/nvc0/nvc0_video_bsp.c
@@ -241,11 +241,11 @@ static void dump_comm_bsp(struct comm *comm)

 unsigned
 nvc0_decoder_bsp(struct nvc0_decoder *dec, union pipe_desc desc,
-                 struct nvc0_video_buffer *target,
+                 struct nouveau_vp3_video_buffer *target,
                  unsigned comm_seq, unsigned num_buffers,
                  const void *const *data, const unsigned *num_bytes,
                  unsigned *vp_caps, unsigned *is_ref,
-                 struct nvc0_video_buffer *refs[16])
+                 struct nouveau_vp3_video_buffer *refs[16])
 {
    struct nouveau_pushbuf *push = dec->pushbuf[0];
    enum pipe_video_codec codec = u_reduce_video_profile(dec->base.profile);
diff --git a/src/gallium/drivers/nvc0/nvc0_video_ppp.c b/src/gallium/drivers/nvc0/nvc0_video_ppp.c
index efa2527..823e360 100644
--- a/src/gallium/drivers/nvc0/nvc0_video_ppp.c
+++ b/src/gallium/drivers/nvc0/nvc0_video_ppp.c
@@ -23,7 +23,7 @@
 #include "nvc0_video.h"

 static void
-nvc0_decoder_setup_ppp(struct nvc0_decoder *dec, struct nvc0_video_buffer *target, uint32_t low700) {
+nvc0_decoder_setup_ppp(struct nvc0_decoder *dec, struct nouveau_vp3_video_buffer *target, uint32_t low700) {
    struct nouveau_pushbuf *push = dec->pushbuf[2];

    uint32_t stride_in = mb(dec->base.width);
@@ -73,7 +73,7 @@ nvc0_decoder_setup_ppp(struct nvc0_decoder *dec, struct nvc0_video_buffer *targe
 }

 static uint32_t
-nvc0_decoder_vc1_ppp(struct nvc0_decoder *dec, struct pipe_vc1_picture_desc *desc, struct nvc0_video_buffer *target) {
+nvc0_decoder_vc1_ppp(struct nvc0_decoder *dec, struct pipe_vc1_picture_desc *desc, struct nouveau_vp3_video_buffer *target) {
    struct nouveau_pushbuf *push = dec->pushbuf[2];

    nvc0_decoder_setup_ppp(dec, target, 0x1412);
@@ -89,7 +89,7 @@ nvc0_decoder_vc1_ppp(struct nvc0_decoder *dec, struct pipe_vc1_picture_desc *des
 }

 void
-nvc0_decoder_ppp(struct nvc0_decoder *dec, union pipe_desc desc, struct nvc0_video_buffer *target, unsigned comm_seq) {
+nvc0_decoder_ppp(struct nvc0_decoder *dec, union pipe_desc desc, struct nouveau_vp3_video_buffer *target, unsigned comm_seq) {
    enum pipe_video_codec codec = u_reduce_video_profile(dec->base.profile);
    struct nouveau_pushbuf *push = dec->pushbuf[2];
    unsigned ppp_caps = 0x10;
diff --git a/src/gallium/drivers/nvc0/nvc0_video_vp.c b/src/gallium/drivers/nvc0/nvc0_video_vp.c
index c5d4f94..7c1691c 100644
--- a/src/gallium/drivers/nvc0/nvc0_video_vp.c
+++ b/src/gallium/drivers/nvc0/nvc0_video_vp.c
@@ -170,7 +170,7 @@ struct h264_picparm_vp { // 700..a00
 };

 static void
-nvc0_decoder_handle_references(struct nvc0_decoder *dec, struct nvc0_video_buffer *refs[16], unsigned seq, struct nvc0_video_buffer *target)
+nvc0_decoder_handle_references(struct nvc0_decoder *dec, struct nouveau_vp3_video_buffer *refs[16], unsigned seq, struct nouveau_vp3_video_buffer *target)
 {
    unsigned h264 = u_reduce_video_profile(dec->base.profile) == PIPE_VIDEO_CODEC_MPEG4_AVC;
    unsigned i, idx, empty_spot = dec->base.max_references + 1;
@@ -221,7 +221,7 @@ nvc0_decoder_handle_references(struct nvc0_decoder *dec, struct nvc0_video_buffe
 }

 static void
-nvc0_decoder_kick_ref(struct nvc0_decoder *dec, struct nvc0_video_buffer *target)
+nvc0_decoder_kick_ref(struct nvc0_decoder *dec, struct nouveau_vp3_video_buffer *target)
 {
    dec->refs[target->valid_ref].vidbuf = NULL;
    dec->refs[target->valid_ref].last_used = 0;
@@ -231,7 +231,7 @@ nvc0_decoder_kick_ref(struct nvc0_decoder *dec, struct nvc0_video_buffer *target
 static uint32_t
 nvc0_decoder_fill_picparm_mpeg12_vp(struct nvc0_decoder *dec,
                                     struct pipe_mpeg12_picture_desc *desc,
-                                    struct nvc0_video_buffer *refs[16],
+                                    struct nouveau_vp3_video_buffer *refs[16],
                                     unsigned *is_ref,
                                     char *map)
 {
@@ -272,15 +272,15 @@ nvc0_decoder_fill_picparm_mpeg12_vp(struct nvc0_decoder *dec,
    memcpy(pic_vp->intra_quantizer_matrix, desc->intra_matrix, 0x40);
    memcpy(pic_vp->non_intra_quantizer_matrix, desc->non_intra_matrix, 0x40);
    memcpy(map, pic_vp, sizeof(*pic_vp));
-   refs[0] = (struct nvc0_video_buffer *)desc->ref[0];
-   refs[!!refs[0]] = (struct nvc0_video_buffer *)desc->ref[1];
+   refs[0] = (struct nouveau_vp3_video_buffer *)desc->ref[0];
+   refs[!!refs[0]] = (struct nouveau_vp3_video_buffer *)desc->ref[1];
    return ret | (dec->base.profile != PIPE_VIDEO_PROFILE_MPEG1);
 }

 static uint32_t
 nvc0_decoder_fill_picparm_mpeg4_vp(struct nvc0_decoder *dec,
                                    struct pipe_mpeg4_picture_desc *desc,
-                                   struct nvc0_video_buffer *refs[16],
+                                   struct nouveau_vp3_video_buffer *refs[16],
                                    unsigned *is_ref,
                                    char *map)
 {
@@ -320,15 +320,15 @@ nvc0_decoder_fill_picparm_mpeg4_vp(struct nvc0_decoder *dec,
    memcpy(pic_vp->intra, desc->intra_matrix, 0x40);
    memcpy(pic_vp->non_intra, desc->non_intra_matrix, 0x40);
    memcpy(map, pic_vp, sizeof(*pic_vp));
-   refs[0] = (struct nvc0_video_buffer *)desc->ref[0];
-   refs[!!refs[0]] = (struct nvc0_video_buffer *)desc->ref[1];
+   refs[0] = (struct nouveau_vp3_video_buffer *)desc->ref[0];
+   refs[!!refs[0]] = (struct nouveau_vp3_video_buffer *)desc->ref[1];
    return ret;
 }

 static uint32_t
 nvc0_decoder_fill_picparm_h264_vp(struct nvc0_decoder *dec,
                                   const struct pipe_h264_picture_desc *d,
-                                  struct nvc0_video_buffer *refs[16],
+                                  struct nouveau_vp3_video_buffer *refs[16],
                                   unsigned *is_ref,
                                   char *map)
 {
@@ -377,7 +377,7 @@ nvc0_decoder_fill_picparm_h264_vp(struct nvc0_decoder *dec,
    for (i = 0; i < d->num_ref_frames; ++i) {
       if (!d->ref[i])
          break;
-      refs[j] = (struct nvc0_video_buffer *)d->ref[i];
+      refs[j] = (struct nouveau_vp3_video_buffer *)d->ref[i];
       h->refs[j].fifo_idx = j + 1;
       h->refs[j].tmp_idx = refs[j]->valid_ref;
       h->refs[j].field_order_cnt[0] = d->field_order_cnt_list[i][0];
@@ -412,8 +412,8 @@ nvc0_decoder_fill_picparm_h264_vp(struct nvc0_decoder *dec,
 static void
 nvc0_decoder_fill_picparm_h264_vp_refs(struct nvc0_decoder *dec,
                                        struct pipe_h264_picture_desc *d,
-                                       struct nvc0_video_buffer *refs[16],
-                                       struct nvc0_video_buffer *target,
+                                       struct nouveau_vp3_video_buffer *refs[16],
+                                       struct nouveau_vp3_video_buffer *target,
                                        char *map)
 {
    struct h264_picparm_vp *h = (struct h264_picparm_vp *)map;
@@ -431,7 +431,7 @@ nvc0_decoder_fill_picparm_h264_vp_refs(struct nvc0_decoder *dec,
 static uint32_t
 nvc0_decoder_fill_picparm_vc1_vp(struct nvc0_decoder *dec,
                                  struct pipe_vc1_picture_desc *d,
-                                 struct nvc0_video_buffer *refs[16],
+                                 struct nouveau_vp3_video_buffer *refs[16],
                                  unsigned *is_ref,
                                  char *map)
 {
@@ -455,8 +455,8 @@ nvc0_decoder_fill_picparm_vc1_vp(struct nvc0_decoder *dec,
    vc->overlap = d->overlap;
    vc->quantizer = d->quantizer;
    vc->u36 = 0; // ? No idea what this one is..
-   refs[0] = (struct nvc0_video_buffer *)d->ref[0];
-   refs[!!refs[0]] = (struct nvc0_video_buffer *)d->ref[1];
+   refs[0] = (struct nouveau_vp3_video_buffer *)d->ref[0];
+   refs[!!refs[0]] = (struct nouveau_vp3_video_buffer *)d->ref[1];
    return 0x12;
 }

@@ -494,9 +494,9 @@ static void dump_comm_vp(struct nvc0_decoder *dec, struct comm *comm, u32 comm_s
 #endif

 void nvc0_decoder_vp_caps(struct nvc0_decoder *dec, union pipe_desc desc,
-                          struct nvc0_video_buffer *target, unsigned comm_seq,
+                          struct nouveau_vp3_video_buffer *target, unsigned comm_seq,
                           unsigned *caps, unsigned *is_ref,
-                          struct nvc0_video_buffer *refs[16])
+                          struct nouveau_vp3_video_buffer *refs[16])
 {
    struct nouveau_bo *bsp_bo = dec->bsp_bo[comm_seq % NVC0_VIDEO_QDEPTH];
    enum pipe_video_codec codec = u_reduce_video_profile(dec->base.profile);
@@ -528,9 +528,9 @@ void nvc0_decoder_vp_caps(struct nvc0_decoder *dec, union pipe_desc desc,

 void
 nvc0_decoder_vp(struct nvc0_decoder *dec, union pipe_desc desc,
-                struct nvc0_video_buffer *target, unsigned comm_seq,
+                struct nouveau_vp3_video_buffer *target, unsigned comm_seq,
                 unsigned caps, unsigned is_ref,
-                struct nvc0_video_buffer *refs[16])
+                struct nouveau_vp3_video_buffer *refs[16])
 {
    struct nouveau_pushbuf *push = dec->pushbuf[1];
    uint32_t bsp_addr, comm_addr, inter_addr, ucode_addr, pic_addr[17], last_addr, null_addr;
--
1.8.1.5
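The plane sizes that nouveau_vp3_video_buffer_create() requests on the NV12 path are easy to miss in the diff, so here is a minimal standalone sketch of the same arithmetic. The names struct plane_dims and nv12_field_planes() are made up for illustration and are not part of the patch; only the rounding and the two-layer (one layer per field) layout are taken from the code above.

```c
/* Hypothetical helper type for illustration; not part of the patch. */
struct plane_dims {
   unsigned width, height, layers;
};

/*
 * Mirrors the size arithmetic of nouveau_vp3_video_buffer_create():
 *  - every plane is a 2-layer 2D array, one layer per field
 *    (templ.array_size = 2);
 *  - the luma plane (R8) keeps the full width and half the frame
 *    height, rounded up;
 *  - the chroma plane (R8G8) is half the luma plane in each
 *    dimension, again rounded up.
 */
void
nv12_field_planes(unsigned width, unsigned height,
                  struct plane_dims *luma, struct plane_dims *chroma)
{
   luma->width = width;
   luma->height = (height + 1) / 2;
   luma->layers = 2;

   chroma->width = (luma->width + 1) / 2;
   chroma->height = (luma->height + 1) / 2;
   chroma->layers = 2;
}
```

For a 1920x1080 frame this gives a 1920x540 luma plane and a 960x270 chroma plane, each with two layers. The flags argument added by this patch does not affect these sizes; it only lets each driver pass its own resource flag (NVC0_RESOURCE_FLAG_VIDEO on nvc0).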
Ilia Mirkin
2013-Aug-11 07:19 UTC
[Nouveau] [PATCH 02/10] nvc0: standardize on using #if for NVC0_DEBUG_FENCE
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
 src/gallium/drivers/nvc0/nvc0_video.c     | 2 +-
 src/gallium/drivers/nvc0/nvc0_video.h     | 6 +++---
 src/gallium/drivers/nvc0/nvc0_video_bsp.c | 4 ++--
 src/gallium/drivers/nvc0/nvc0_video_ppp.c | 2 +-
 src/gallium/drivers/nvc0/nvc0_video_vp.c  | 2 +-
 5 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/nvc0/nvc0_video.c b/src/gallium/drivers/nvc0/nvc0_video.c
index 626d0d4..18de2ed 100644
--- a/src/gallium/drivers/nvc0/nvc0_video.c
+++ b/src/gallium/drivers/nvc0/nvc0_video.c
@@ -116,7 +116,7 @@ nvc0_decoder_destroy(struct pipe_video_decoder *decoder)
    nouveau_bo_ref(NULL, &dec->bitplane_bo);
    nouveau_bo_ref(NULL, &dec->inter_bo[0]);
    nouveau_bo_ref(NULL, &dec->inter_bo[1]);
-#ifdef NVC0_DEBUG_FENCE
+#if NVC0_DEBUG_FENCE
    nouveau_bo_ref(NULL, &dec->fence_bo);
 #endif
    nouveau_bo_ref(NULL, &dec->fw_bo);
diff --git a/src/gallium/drivers/nvc0/nvc0_video.h b/src/gallium/drivers/nvc0/nvc0_video.h
index 271ed5c..67eca7c 100644
--- a/src/gallium/drivers/nvc0/nvc0_video.h
+++ b/src/gallium/drivers/nvc0/nvc0_video.h
@@ -33,9 +33,9 @@
 #define VP_OFFSET 0x200
 #define COMM_OFFSET 0x500
 
-//#define NVC0_DEBUG_FENCE 1
+#define NVC0_DEBUG_FENCE 0
 
-#ifdef NVC0_DEBUG_FENCE
+#if NVC0_DEBUG_FENCE
 # define NVC0_VIDEO_QDEPTH 1
 #else
 # define NVC0_VIDEO_QDEPTH 2
@@ -59,7 +59,7 @@ struct nvc0_decoder {
    struct nouveau_object *channel[3], *bsp, *vp, *ppp;
    struct nouveau_pushbuf *pushbuf[3];
 
-#ifdef NVC0_DEBUG_FENCE
+#if NVC0_DEBUG_FENCE
    /* dump fence and comm, as needed.. */
    unsigned *fence_map;
    struct comm *comm;
diff --git a/src/gallium/drivers/nvc0/nvc0_video_bsp.c b/src/gallium/drivers/nvc0/nvc0_video_bsp.c
index 8f93861..bdb9c64 100644
--- a/src/gallium/drivers/nvc0/nvc0_video_bsp.c
+++ b/src/gallium/drivers/nvc0/nvc0_video_bsp.c
@@ -261,7 +261,7 @@ nvc0_decoder_bsp(struct nvc0_decoder *dec, union pipe_desc desc,
    struct nouveau_pushbuf_refn bo_refs[] = {
       { bsp_bo, NOUVEAU_BO_RD | NOUVEAU_BO_VRAM },
       { inter_bo, NOUVEAU_BO_WR | NOUVEAU_BO_VRAM },
-#ifdef NVC0_DEBUG_FENCE
+#if NVC0_DEBUG_FENCE
       { dec->fence_bo, NOUVEAU_BO_WR | NOUVEAU_BO_GART },
 #endif
       { dec->bitplane_bo, NOUVEAU_BO_RDWR | NOUVEAU_BO_VRAM },
@@ -271,7 +271,7 @@ nvc0_decoder_bsp(struct nvc0_decoder *dec, union pipe_desc desc,
    if (!dec->bitplane_bo)
       num_refs--;
 
-#ifdef NVC0_DEBUG_FENCE
+#if NVC0_DEBUG_FENCE
    fence_extra = 4;
 #endif
 
diff --git a/src/gallium/drivers/nvc0/nvc0_video_ppp.c b/src/gallium/drivers/nvc0/nvc0_video_ppp.c
index 823e360..836add3 100644
--- a/src/gallium/drivers/nvc0/nvc0_video_ppp.c
+++ b/src/gallium/drivers/nvc0/nvc0_video_ppp.c
@@ -36,7 +36,7 @@ nvc0_decoder_setup_ppp(struct nvc0_decoder *dec, struct nouveau_vp3_video_buffer
       { NULL, NOUVEAU_BO_WR | NOUVEAU_BO_VRAM },
       { NULL, NOUVEAU_BO_WR | NOUVEAU_BO_VRAM },
       { dec->ref_bo, NOUVEAU_BO_RD | NOUVEAU_BO_VRAM },
-#ifdef NVC0_DEBUG_FENCE
+#if NVC0_DEBUG_FENCE
       { dec->fence_bo, NOUVEAU_BO_WR | NOUVEAU_BO_GART },
 #endif
    };
diff --git a/src/gallium/drivers/nvc0/nvc0_video_vp.c b/src/gallium/drivers/nvc0/nvc0_video_vp.c
index 7c1691c..74e3915 100644
--- a/src/gallium/drivers/nvc0/nvc0_video_vp.c
+++ b/src/gallium/drivers/nvc0/nvc0_video_vp.c
@@ -543,7 +543,7 @@ nvc0_decoder_vp(struct nvc0_decoder *dec, union pipe_desc desc,
       { inter_bo, NOUVEAU_BO_WR | NOUVEAU_BO_VRAM },
       { dec->ref_bo, NOUVEAU_BO_WR | NOUVEAU_BO_VRAM },
       { bsp_bo, NOUVEAU_BO_RD | NOUVEAU_BO_VRAM },
-#ifdef NVC0_DEBUG_FENCE
+#if NVC0_DEBUG_FENCE
       { dec->fence_bo, NOUVEAU_BO_WR | NOUVEAU_BO_GART },
 #endif
       { dec->fw_bo, NOUVEAU_BO_RD | NOUVEAU_BO_VRAM },
--
1.8.1.5
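The patch above carries no description, but the motivation is presumably that the macro can now stay defined as 0 instead of being commented out. A minimal sketch of why the guard form matters: `#ifdef` only tests whether a macro is defined, so a macro defined as 0 would still enable the debug paths, while `#if` tests its value. `DEMO_DEBUG_FENCE` and the helper names are hypothetical stand-ins, not identifiers from the driver.

```c
#include <assert.h>

/* Hypothetical stand-in for NVC0_DEBUG_FENCE: defined, but set to 0. */
#define DEMO_DEBUG_FENCE 0

/* Returns 1 if an #ifdef guard would compile the debug code in. */
static inline int demo_enabled_with_ifdef(void)
{
#ifdef DEMO_DEBUG_FENCE
   return 1; /* taken: the macro is defined, its value is ignored */
#else
   return 0;
#endif
}

/* Returns 1 if an #if guard would compile the debug code in. */
static inline int demo_enabled_with_if(void)
{
#if DEMO_DEBUG_FENCE
   return 1;
#else
   return 0; /* taken: the macro evaluates to 0 */
#endif
}

/* Mirrors the queue-depth selection in the header: 1 when debugging, else 2. */
static inline int demo_queue_depth(void)
{
#if DEMO_DEBUG_FENCE
   return 1;
#else
   return 2;
#endif
}
```

With the macro defined as 0, only the `#if` form keeps the fence debugging code out of the build, which is why every guard has to be converted together with the `#define`.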
Ilia Mirkin
2013-Aug-11 07:19 UTC
[Nouveau] [PATCH 03/10] nvc0: move nvc0_decoder into nouveau, rename to nouveau_vp3_decoder
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- src/gallium/drivers/nouveau/nouveau_vp3_video.h | 160 +++++++++++++++++++++++ src/gallium/drivers/nvc0/nvc0_video.c | 22 ++-- src/gallium/drivers/nvc0/nvc0_video.h | 165 +----------------------- src/gallium/drivers/nvc0/nvc0_video_bsp.c | 28 ++-- src/gallium/drivers/nvc0/nvc0_video_ppp.c | 16 +-- src/gallium/drivers/nvc0/nvc0_video_vp.c | 60 ++++----- 6 files changed, 227 insertions(+), 224 deletions(-) diff --git a/src/gallium/drivers/nouveau/nouveau_vp3_video.h b/src/gallium/drivers/nouveau/nouveau_vp3_video.h index bff5d76..7322138 100644 --- a/src/gallium/drivers/nouveau/nouveau_vp3_video.h +++ b/src/gallium/drivers/nouveau/nouveau_vp3_video.h @@ -21,8 +21,11 @@ */ #include "pipe/p_defines.h" + #include "vl/vl_video_buffer.h" +#include "util/u_video.h" + struct nouveau_vp3_video_buffer { struct pipe_video_buffer base; unsigned num_planes, valid_ref; @@ -32,6 +35,163 @@ struct nouveau_vp3_video_buffer { struct pipe_surface *surfaces[VL_NUM_COMPONENTS * 2]; }; +#define SLICE_SIZE 0x200 +#define VP_OFFSET 0x200 +#define COMM_OFFSET 0x500 + +#define NOUVEAU_VP3_DEBUG_FENCE 0 + +#if NOUVEAU_VP3_DEBUG_FENCE +# define NOUVEAU_VP3_VIDEO_QDEPTH 1 +#else +# define NOUVEAU_VP3_VIDEO_QDEPTH 2 +#endif + +#define SUBC_BSP(m) dec->bsp_idx, (m) +#define SUBC_VP(m) dec->vp_idx, (m) +#define SUBC_PPP(m) dec->ppp_idx, (m) + +union pipe_desc { + struct pipe_picture_desc *base; + struct pipe_mpeg12_picture_desc *mpeg12; + struct pipe_mpeg4_picture_desc *mpeg4; + struct pipe_vc1_picture_desc *vc1; + struct pipe_h264_picture_desc *h264; +}; + +struct nouveau_vp3_decoder { + struct pipe_video_decoder base; + struct nouveau_client *client; + struct nouveau_object *channel[3], *bsp, *vp, *ppp; + struct nouveau_pushbuf *pushbuf[3]; + +#if NOUVEAU_VP3_DEBUG_FENCE + /* dump fence and comm, as needed.. 
*/ + unsigned *fence_map; + struct comm *comm; + + struct nouveau_bo *fence_bo; +#endif + + struct nouveau_bo *fw_bo, *bitplane_bo; + + // array size max_references + 2, contains unpostprocessed images + // added at the end of ref_bo is a tmp array + // tmp is an array for h264, with each member being used for a ref frame or current + // target.. size = (((mb(w)*((mb(h)+1)&~1))+3)>>2)<<8 * (max_references+1) + // for other codecs, it simply seems that size = w*h is enough + // unsure what it's supposed to contain.. + struct nouveau_bo *ref_bo; + + struct nouveau_bo *inter_bo[2]; + + struct nouveau_bo *bsp_bo[NOUVEAU_VP3_VIDEO_QDEPTH]; + + // bo's used by each cycle: + + // bsp_bo: contains raw bitstream data and parameters for BSP and VP. + // inter_bo: contains data shared between BSP and VP + // ref_bo: reference image data, used by PPP and VP + // bitplane_bo: contain bitplane data (similar to ref_bo), used by BSP only + // fw_bo: used by VP only. + + // Needed amount of copies in optimal case: + // 2 copies of inter_bo, VP would process the last inter_bo, while BSP is + // writing out a new set. + // NOUVEAU_VP3_VIDEO_QDEPTH copies of bsp_bo. We don't want to block the + // pipeline ever, and give shaders a chance to run as well. 
+ + struct { + struct nouveau_vp3_video_buffer *vidbuf; + unsigned last_used; + unsigned field_pic_flag : 1; + unsigned decoded_top : 1; + unsigned decoded_bottom : 1; + } refs[17]; + unsigned fence_seq, fw_sizes, last_frame_num, tmp_stride, ref_stride; + + unsigned bsp_idx, vp_idx, ppp_idx; +}; + +struct comm { + uint32_t bsp_cur_index; // 000 + uint32_t byte_ofs; // 004 + uint32_t status[0x10]; // 008 + uint32_t pos[0x10]; // 048 + uint8_t pad[0x100 - 0x88]; // 0a0 bool comm_encrypted + + uint32_t pvp_cur_index; // 100 + uint32_t acked_byte_ofs; // 104 + uint32_t status_vp[0x10]; // 108 + uint16_t mb_y[0x10]; //148 + uint32_t pvp_stage; // 168 0xeeXX + uint16_t parse_endpos_index; // 16c + uint16_t irq_index; // 16e + uint8_t irq_470[0x10]; // 170 + uint32_t irq_pos[0x10]; // 180 + uint32_t parse_endpos[0x10]; // 1c0 +}; + +static INLINE uint32_t nouveau_vp3_video_align(uint32_t h) +{ + return ((h+0x3f)&~0x3f); +}; + +static INLINE uint32_t mb(uint32_t coord) +{ + return (coord + 0xf)>>4; +} + +static INLINE uint32_t mb_half(uint32_t coord) +{ + return (coord + 0x1f)>>5; +} + +static INLINE uint64_t +nouveau_vp3_video_addr(struct nouveau_vp3_decoder *dec, struct nouveau_vp3_video_buffer *target) +{ + uint64_t ret; + if (target) + ret = dec->ref_stride * target->valid_ref; + else + ret = dec->ref_stride * (dec->base.max_references+1); + return dec->ref_bo->offset + ret; +} + +static INLINE void +nouveau_vp3_ycbcr_offsets(struct nouveau_vp3_decoder *dec, uint32_t *y2, + uint32_t *cbcr, uint32_t *cbcr2) +{ + uint32_t w = mb(dec->base.width), size; + *y2 = mb_half(dec->base.height)*w; + *cbcr = *y2 * 2; + *cbcr2 = *cbcr + w * (nouveau_vp3_video_align(dec->base.height)>>6); + + /* The check here should never fail because it means a bug + * in the code rather than a bug in hardware.. 
+ */ + size = (2 * (*cbcr2 - *cbcr) + *cbcr) << 8; + if (size > dec->ref_stride) { + debug_printf("Overshot ref_stride (%u) with size %u and ofs (%u,%u,%u)\n", + dec->ref_stride, size, *y2<<8, *cbcr<<8, *cbcr2<<8); + *y2 = *cbcr = *cbcr2 = 0; + assert(size <= dec->ref_stride); + } +} + +static INLINE void +nouveau_vp3_inter_sizes(struct nouveau_vp3_decoder *dec, uint32_t slice_count, + uint32_t *slice_size, uint32_t *bucket_size, + uint32_t *ring_size) +{ + *slice_size = (SLICE_SIZE * slice_count)>>8; + if (u_reduce_video_profile(dec->base.profile) == PIPE_VIDEO_CODEC_MPEG12) + *bucket_size = 0; + else + *bucket_size = mb(dec->base.width) * 3; + *ring_size = (dec->inter_bo[0]->size >> 8) - *bucket_size - *slice_size; +} + struct pipe_video_buffer * nouveau_vp3_video_buffer_create(struct pipe_context *pipe, const struct pipe_video_buffer *templat, diff --git a/src/gallium/drivers/nvc0/nvc0_video.c b/src/gallium/drivers/nvc0/nvc0_video.c index 18de2ed..73963c2 100644 --- a/src/gallium/drivers/nvc0/nvc0_video.c +++ b/src/gallium/drivers/nvc0/nvc0_video.c @@ -62,7 +62,7 @@ nvc0_decoder_decode_bitstream(struct pipe_video_decoder *decoder, const void *const *data, const unsigned *num_bytes) { - struct nvc0_decoder *dec = (struct nvc0_decoder *)decoder; + struct nouveau_vp3_decoder *dec = (struct nouveau_vp3_decoder *)decoder; struct nouveau_vp3_video_buffer *target = (struct nouveau_vp3_video_buffer *)video_target; uint32_t comm_seq = ++dec->fence_seq; union pipe_desc desc; @@ -88,7 +88,7 @@ nvc0_decoder_decode_bitstream(struct pipe_video_decoder *decoder, static void nvc0_decoder_flush(struct pipe_video_decoder *decoder) { - struct nvc0_decoder *dec = (struct nvc0_decoder *)decoder; + struct nouveau_vp3_decoder *dec = (struct nouveau_vp3_decoder *)decoder; (void)dec; } @@ -109,19 +109,19 @@ nvc0_decoder_end_frame(struct pipe_video_decoder *decoder, static void nvc0_decoder_destroy(struct pipe_video_decoder *decoder) { - struct nvc0_decoder *dec = (struct nvc0_decoder 
*)decoder; + struct nouveau_vp3_decoder *dec = (struct nouveau_vp3_decoder *)decoder; int i; nouveau_bo_ref(NULL, &dec->ref_bo); nouveau_bo_ref(NULL, &dec->bitplane_bo); nouveau_bo_ref(NULL, &dec->inter_bo[0]); nouveau_bo_ref(NULL, &dec->inter_bo[1]); -#if NVC0_DEBUG_FENCE +#if NOUVEAU_VP3_DEBUG_FENCE nouveau_bo_ref(NULL, &dec->fence_bo); #endif nouveau_bo_ref(NULL, &dec->fw_bo); - for (i = 0; i < NVC0_VIDEO_QDEPTH; ++i) + for (i = 0; i < NOUVEAU_VP3_VIDEO_QDEPTH; ++i) nouveau_bo_ref(NULL, &dec->bsp_bo[i]); nouveau_object_del(&dec->bsp); @@ -173,7 +173,7 @@ nvc0_create_decoder(struct pipe_context *context, bool chunked_decode) { struct nouveau_screen *screen = &((struct nvc0_context *)context)->screen->base; - struct nvc0_decoder *dec; + struct nouveau_vp3_decoder *dec; struct nouveau_pushbuf **push; union nouveau_bo_config cfg; bool kepler = screen->device->chipset >= 0xe0; @@ -196,7 +196,7 @@ nvc0_create_decoder(struct pipe_context *context, return NULL; } - dec = CALLOC_STRUCT(nvc0_decoder); + dec = CALLOC_STRUCT(nouveau_vp3_decoder); if (!dec) return NULL; dec->client = screen->client; @@ -288,7 +288,7 @@ nvc0_create_decoder(struct pipe_context *context, dec->base.begin_frame = nvc0_decoder_begin_frame; dec->base.end_frame = nvc0_decoder_end_frame; - for (i = 0; i < NVC0_VIDEO_QDEPTH && !ret; ++i) + for (i = 0; i < NOUVEAU_VP3_VIDEO_QDEPTH && !ret; ++i) ret = nouveau_bo_new(screen->device, NOUVEAU_BO_VRAM, 0, 1 << 20, &cfg, &dec->bsp_bo[i]); if (!ret) @@ -325,7 +325,7 @@ nvc0_create_decoder(struct pipe_context *context, } case PIPE_VIDEO_CODEC_MPEG4_AVC: { codec = 3; - dec->tmp_stride = 16 * mb_half(width) * nvc0_video_align(height) * 3 / 2; + dec->tmp_stride = 16 * mb_half(width) * nouveau_vp3_video_align(height) * 3 / 2; tmp_size = dec->tmp_stride * (max_references + 1); assert(max_references <= 16); break; @@ -415,7 +415,7 @@ nvc0_create_decoder(struct pipe_context *context, goto fail; } - dec->ref_stride = mb(width)*16 * (mb_half(height)*32 + 
nvc0_video_align(height)/2); + dec->ref_stride = mb(width)*16 * (mb_half(height)*32 + nouveau_vp3_video_align(height)/2); ret = nouveau_bo_new(screen->device, NOUVEAU_BO_VRAM, 0, dec->ref_stride * (max_references+2) + tmp_size, &cfg, &dec->ref_bo); @@ -438,7 +438,7 @@ nvc0_create_decoder(struct pipe_context *context, ++dec->fence_seq; -#if NVC0_DEBUG_FENCE +#if NOUVEAU_VP3_DEBUG_FENCE ret = nouveau_bo_new(screen->device, NOUVEAU_BO_GART|NOUVEAU_BO_MAP, 0, 0x1000, NULL, &dec->fence_bo); if (ret) diff --git a/src/gallium/drivers/nvc0/nvc0_video.h b/src/gallium/drivers/nvc0/nvc0_video.h index 67eca7c..1ceb6eb 100644 --- a/src/gallium/drivers/nvc0/nvc0_video.h +++ b/src/gallium/drivers/nvc0/nvc0_video.h @@ -29,172 +29,15 @@ #include "util/u_video.h" -#define SLICE_SIZE 0x200 -#define VP_OFFSET 0x200 -#define COMM_OFFSET 0x500 - -#define NVC0_DEBUG_FENCE 0 - -#if NVC0_DEBUG_FENCE -# define NVC0_VIDEO_QDEPTH 1 -#else -# define NVC0_VIDEO_QDEPTH 2 -#endif - -#define SUBC_BSP(m) dec->bsp_idx, (m) -#define SUBC_VP(m) dec->vp_idx, (m) -#define SUBC_PPP(m) dec->ppp_idx, (m) - -union pipe_desc { - struct pipe_picture_desc *base; - struct pipe_mpeg12_picture_desc *mpeg12; - struct pipe_mpeg4_picture_desc *mpeg4; - struct pipe_vc1_picture_desc *vc1; - struct pipe_h264_picture_desc *h264; -}; - -struct nvc0_decoder { - struct pipe_video_decoder base; - struct nouveau_client *client; - struct nouveau_object *channel[3], *bsp, *vp, *ppp; - struct nouveau_pushbuf *pushbuf[3]; - -#if NVC0_DEBUG_FENCE - /* dump fence and comm, as needed.. */ - unsigned *fence_map; - struct comm *comm; - - struct nouveau_bo *fence_bo; -#endif - - struct nouveau_bo *fw_bo, *bitplane_bo; - - // array size max_references + 2, contains unpostprocessed images - // added at the end of ref_bo is a tmp array - // tmp is an array for h264, with each member being used for a ref frame or current - // target.. 
size = (((mb(w)*((mb(h)+1)&~1))+3)>>2)<<8 * (max_references+1) - // for other codecs, it simply seems that size = w*h is enough - // unsure what it's supposed to contain.. - struct nouveau_bo *ref_bo; - - struct nouveau_bo *inter_bo[2]; - - struct nouveau_bo *bsp_bo[NVC0_VIDEO_QDEPTH]; - - // bo's used by each cycle: - - // bsp_bo: contains raw bitstream data and parameters for BSP and VP. - // inter_bo: contains data shared between BSP and VP - // ref_bo: reference image data, used by PPP and VP - // bitplane_bo: contain bitplane data (similar to ref_bo), used by BSP only - // fw_bo: used by VP only. - - // Needed amount of copies in optimal case: - // 2 copies of inter_bo, VP would process the last inter_bo, while BSP is - // writing out a new set. - // NVC0_VIDEO_QDEPTH copies of bsp_bo. We don't want to block the pipeline ever, - // and give shaders a chance to run as well. - - struct { - struct nouveau_vp3_video_buffer *vidbuf; - unsigned last_used; - unsigned field_pic_flag : 1; - unsigned decoded_top : 1; - unsigned decoded_bottom : 1; - } refs[17]; - unsigned fence_seq, fw_sizes, last_frame_num, tmp_stride, ref_stride; - - unsigned bsp_idx, vp_idx, ppp_idx; -}; - -struct comm { - uint32_t bsp_cur_index; // 000 - uint32_t byte_ofs; // 004 - uint32_t status[0x10]; // 008 - uint32_t pos[0x10]; // 048 - uint8_t pad[0x100 - 0x88]; // 0a0 bool comm_encrypted - - uint32_t pvp_cur_index; // 100 - uint32_t acked_byte_ofs; // 104 - uint32_t status_vp[0x10]; // 108 - uint16_t mb_y[0x10]; //148 - uint32_t pvp_stage; // 168 0xeeXX - uint16_t parse_endpos_index; // 16c - uint16_t irq_index; // 16e - uint8_t irq_470[0x10]; // 170 - uint32_t irq_pos[0x10]; // 180 - uint32_t parse_endpos[0x10]; // 1c0 -}; - -static INLINE uint32_t nvc0_video_align(uint32_t h) -{ - return ((h+0x3f)&~0x3f); -}; - -static INLINE uint32_t mb(uint32_t coord) -{ - return (coord + 0xf)>>4; -} - -static INLINE uint32_t mb_half(uint32_t coord) -{ - return (coord + 0x1f)>>5; -} - -static INLINE 
uint64_t -nvc0_video_addr(struct nvc0_decoder *dec, struct nouveau_vp3_video_buffer *target) -{ - uint64_t ret; - if (target) - ret = dec->ref_stride * target->valid_ref; - else - ret = dec->ref_stride * (dec->base.max_references+1); - return dec->ref_bo->offset + ret; -} - -static INLINE void -nvc0_decoder_ycbcr_offsets(struct nvc0_decoder *dec, uint32_t *y2, - uint32_t *cbcr, uint32_t *cbcr2) -{ - uint32_t w = mb(dec->base.width), size; - *y2 = mb_half(dec->base.height)*w; - *cbcr = *y2 * 2; - *cbcr2 = *cbcr + w * (nvc0_video_align(dec->base.height)>>6); - - /* The check here should never fail because it means a bug - * in the code rather than a bug in hardware.. - */ - size = (2 * (*cbcr2 - *cbcr) + *cbcr) << 8; - if (size > dec->ref_stride) { - debug_printf("Overshot ref_stride (%u) with size %u and ofs (%u,%u,%u)\n", - dec->ref_stride, size, *y2<<8, *cbcr<<8, *cbcr2<<8); - *y2 = *cbcr = *cbcr2 = 0; - assert(size <= dec->ref_stride); - } -} - -static INLINE void -nvc0_decoder_inter_sizes(struct nvc0_decoder *dec, uint32_t slice_count, - uint32_t *slice_size, uint32_t *bucket_size, - uint32_t *ring_size) -{ - *slice_size = (SLICE_SIZE * slice_count)>>8; - if (u_reduce_video_profile(dec->base.profile) == PIPE_VIDEO_CODEC_MPEG12) - *bucket_size = 0; - else - *bucket_size = mb(dec->base.width) * 3; - *ring_size = (dec->inter_bo[0]->size >> 8) - *bucket_size - *slice_size; -} - extern unsigned -nvc0_decoder_bsp(struct nvc0_decoder *dec, union pipe_desc desc, +nvc0_decoder_bsp(struct nouveau_vp3_decoder *dec, union pipe_desc desc, struct nouveau_vp3_video_buffer *target, unsigned comm_seq, unsigned num_buffers, const void *const *data, const unsigned *num_bytes, unsigned *vp_caps, unsigned *is_ref, struct nouveau_vp3_video_buffer *refs[16]); -extern void nvc0_decoder_vp_caps(struct nvc0_decoder *dec, +extern void nvc0_decoder_vp_caps(struct nouveau_vp3_decoder *dec, union pipe_desc desc, struct nouveau_vp3_video_buffer *target, unsigned comm_seq, @@ -202,11 +45,11 @@ 
extern void nvc0_decoder_vp_caps(struct nvc0_decoder *dec, struct nouveau_vp3_video_buffer *refs[16]); extern void -nvc0_decoder_vp(struct nvc0_decoder *dec, union pipe_desc desc, +nvc0_decoder_vp(struct nouveau_vp3_decoder *dec, union pipe_desc desc, struct nouveau_vp3_video_buffer *target, unsigned comm_seq, unsigned caps, unsigned is_ref, struct nouveau_vp3_video_buffer *refs[16]); extern void -nvc0_decoder_ppp(struct nvc0_decoder *dec, union pipe_desc desc, +nvc0_decoder_ppp(struct nouveau_vp3_decoder *dec, union pipe_desc desc, struct nouveau_vp3_video_buffer *target, unsigned comm_seq); diff --git a/src/gallium/drivers/nvc0/nvc0_video_bsp.c b/src/gallium/drivers/nvc0/nvc0_video_bsp.c index bdb9c64..c632a5e 100644 --- a/src/gallium/drivers/nvc0/nvc0_video_bsp.c +++ b/src/gallium/drivers/nvc0/nvc0_video_bsp.c @@ -110,7 +110,7 @@ struct h264_picparm_bsp { }; static uint32_t -nvc0_decoder_fill_picparm_mpeg12_bsp(struct nvc0_decoder *dec, +nvc0_decoder_fill_picparm_mpeg12_bsp(struct nouveau_vp3_decoder *dec, struct pipe_mpeg12_picture_desc *desc, char *map) { @@ -132,7 +132,7 @@ nvc0_decoder_fill_picparm_mpeg12_bsp(struct nvc0_decoder *dec, } static uint32_t -nvc0_decoder_fill_picparm_mpeg4_bsp(struct nvc0_decoder *dec, +nvc0_decoder_fill_picparm_mpeg4_bsp(struct nouveau_vp3_decoder *dec, struct pipe_mpeg4_picture_desc *desc, char *map) { @@ -157,7 +157,7 @@ nvc0_decoder_fill_picparm_mpeg4_bsp(struct nvc0_decoder *dec, } static uint32_t -nvc0_decoder_fill_picparm_vc1_bsp(struct nvc0_decoder *dec, +nvc0_decoder_fill_picparm_vc1_bsp(struct nouveau_vp3_decoder *dec, struct pipe_vc1_picture_desc *d, char *map) { @@ -189,7 +189,7 @@ nvc0_decoder_fill_picparm_vc1_bsp(struct nvc0_decoder *dec, } static uint32_t -nvc0_decoder_fill_picparm_h264_bsp(struct nvc0_decoder *dec, +nvc0_decoder_fill_picparm_h264_bsp(struct nouveau_vp3_decoder *dec, struct pipe_h264_picture_desc *d, char *map) { @@ -230,7 +230,7 @@ nvc0_decoder_fill_picparm_h264_bsp(struct nvc0_decoder *dec, 
return caps | 3; } -#if NVC0_DEBUG_FENCE +#if NOUVEAU_VP3_DEBUG_FENCE static void dump_comm_bsp(struct comm *comm) { unsigned idx = comm->bsp_cur_index & 0xf; @@ -240,7 +240,7 @@ static void dump_comm_bsp(struct comm *comm) #endif unsigned -nvc0_decoder_bsp(struct nvc0_decoder *dec, union pipe_desc desc, +nvc0_decoder_bsp(struct nouveau_vp3_decoder *dec, union pipe_desc desc, struct nouveau_vp3_video_buffer *target, unsigned comm_seq, unsigned num_buffers, const void *const *data, const unsigned *num_bytes, @@ -255,13 +255,13 @@ nvc0_decoder_bsp(struct nvc0_decoder *dec, union pipe_desc desc, uint32_t endmarker, caps; struct strparm_bsp *str_bsp; int ret, i; - struct nouveau_bo *bsp_bo = dec->bsp_bo[comm_seq % NVC0_VIDEO_QDEPTH]; + struct nouveau_bo *bsp_bo = dec->bsp_bo[comm_seq % NOUVEAU_VP3_VIDEO_QDEPTH]; struct nouveau_bo *inter_bo = dec->inter_bo[comm_seq & 1]; unsigned fence_extra = 0; struct nouveau_pushbuf_refn bo_refs[] = { { bsp_bo, NOUVEAU_BO_RD | NOUVEAU_BO_VRAM }, { inter_bo, NOUVEAU_BO_WR | NOUVEAU_BO_VRAM }, -#if NVC0_DEBUG_FENCE +#if NOUVEAU_VP3_DEBUG_FENCE { dec->fence_bo, NOUVEAU_BO_WR | NOUVEAU_BO_GART }, #endif { dec->bitplane_bo, NOUVEAU_BO_RDWR | NOUVEAU_BO_VRAM }, @@ -271,7 +271,7 @@ nvc0_decoder_bsp(struct nvc0_decoder *dec, union pipe_desc desc, if (!dec->bitplane_bo) num_refs--; -#if NVC0_DEBUG_FENCE +#if NOUVEAU_VP3_DEBUG_FENCE fence_extra = 4; #endif @@ -329,7 +329,7 @@ nvc0_decoder_bsp(struct nvc0_decoder *dec, union pipe_desc desc, /* Reserved for picparm_vp */ bsp += 0x300; /* Reserved for comm */ -#if !NVC0_DEBUG_FENCE +#if !NOUVEAU_VP3_DEBUG_FENCE memset(bsp, 0, 0x200); #endif bsp += 0x200; @@ -351,7 +351,7 @@ nvc0_decoder_bsp(struct nvc0_decoder *dec, union pipe_desc desc, bsp_addr = bsp_bo->offset >> 8; inter_addr = inter_bo->offset >> 8; -#if NVC0_DEBUG_FENCE +#if NOUVEAU_VP3_DEBUG_FENCE memset(dec->comm, 0, 0x200); comm_addr = (dec->fence_bo->offset + COMM_OFFSET) >> 8; #else @@ -370,7 +370,7 @@ nvc0_decoder_bsp(struct 
nvc0_decoder *dec, union pipe_desc desc, bitplane_addr = dec->bitplane_bo->offset >> 8; - nvc0_decoder_inter_sizes(dec, 1, &slice_size, &bucket_size, &ring_size); + nouveau_vp3_inter_sizes(dec, 1, &slice_size, &bucket_size, &ring_size); BEGIN_NVC0(push, SUBC_BSP(0x400), 6); PUSH_DATA (push, bsp_addr); // 400 picparm addr PUSH_DATA (push, inter_addr); // 404 interparm addr @@ -379,7 +379,7 @@ nvc0_decoder_bsp(struct nvc0_decoder *dec, union pipe_desc desc, PUSH_DATA (push, bitplane_addr); // 410 BITPLANE_DATA PUSH_DATA (push, 0x400); // 414 BITPLANE_DATA_SIZE } else { - nvc0_decoder_inter_sizes(dec, desc.h264->slice_count, &slice_size, &bucket_size, &ring_size); + nouveau_vp3_inter_sizes(dec, desc.h264->slice_count, &slice_size, &bucket_size, &ring_size); BEGIN_NVC0(push, SUBC_BSP(0x400), 8); PUSH_DATA (push, bsp_addr); // 400 picparm addr PUSH_DATA (push, inter_addr); // 404 interparm addr @@ -392,7 +392,7 @@ nvc0_decoder_bsp(struct nvc0_decoder *dec, union pipe_desc desc, // TODO: Double check 414 / 418 with nvidia trace } -#if NVC0_DEBUG_FENCE +#if NOUVEAU_VP3_DEBUG_FENCE BEGIN_NVC0(push, SUBC_BSP(0x240), 3); PUSH_DATAh(push, dec->fence_bo->offset); PUSH_DATA (push, dec->fence_bo->offset); diff --git a/src/gallium/drivers/nvc0/nvc0_video_ppp.c b/src/gallium/drivers/nvc0/nvc0_video_ppp.c index 836add3..496db80 100644 --- a/src/gallium/drivers/nvc0/nvc0_video_ppp.c +++ b/src/gallium/drivers/nvc0/nvc0_video_ppp.c @@ -23,7 +23,7 @@ #include "nvc0_video.h" static void -nvc0_decoder_setup_ppp(struct nvc0_decoder *dec, struct nouveau_vp3_video_buffer *target, uint32_t low700) { +nvc0_decoder_setup_ppp(struct nouveau_vp3_decoder *dec, struct nouveau_vp3_video_buffer *target, uint32_t low700) { struct nouveau_pushbuf *push = dec->pushbuf[2]; uint32_t stride_in = mb(dec->base.width); @@ -36,7 +36,7 @@ nvc0_decoder_setup_ppp(struct nvc0_decoder *dec, struct nouveau_vp3_video_buffer { NULL, NOUVEAU_BO_WR | NOUVEAU_BO_VRAM }, { NULL, NOUVEAU_BO_WR | NOUVEAU_BO_VRAM }, { 
dec->ref_bo, NOUVEAU_BO_RD | NOUVEAU_BO_VRAM }, -#if NVC0_DEBUG_FENCE +#if NOUVEAU_VP3_DEBUG_FENCE { dec->fence_bo, NOUVEAU_BO_WR | NOUVEAU_BO_GART }, #endif }; @@ -48,10 +48,10 @@ nvc0_decoder_setup_ppp(struct nvc0_decoder *dec, struct nouveau_vp3_video_buffer } nouveau_pushbuf_refn(push, bo_refs, num_refs); - nvc0_decoder_ycbcr_offsets(dec, &y2, &cbcr, &cbcr2); + nouveau_vp3_ycbcr_offsets(dec, &y2, &cbcr, &cbcr2); BEGIN_NVC0(push, SUBC_PPP(0x700), 10); - in_addr = nvc0_video_addr(dec, target) >> 8; + in_addr = nouveau_vp3_video_addr(dec, target) >> 8; PUSH_DATA (push, (stride_out << 24) | (stride_out << 16) | low700); // 700 PUSH_DATA (push, (stride_in << 24) | (stride_in << 16) | (dec_h << 8) | dec_w); // 704 @@ -73,7 +73,7 @@ nvc0_decoder_setup_ppp(struct nvc0_decoder *dec, struct nouveau_vp3_video_buffer } static uint32_t -nvc0_decoder_vc1_ppp(struct nvc0_decoder *dec, struct pipe_vc1_picture_desc *desc, struct nouveau_vp3_video_buffer *target) { +nvc0_decoder_vc1_ppp(struct nouveau_vp3_decoder *dec, struct pipe_vc1_picture_desc *desc, struct nouveau_vp3_video_buffer *target) { struct nouveau_pushbuf *push = dec->pushbuf[2]; nvc0_decoder_setup_ppp(dec, target, 0x1412); @@ -89,13 +89,13 @@ nvc0_decoder_vc1_ppp(struct nvc0_decoder *dec, struct pipe_vc1_picture_desc *des } void -nvc0_decoder_ppp(struct nvc0_decoder *dec, union pipe_desc desc, struct nouveau_vp3_video_buffer *target, unsigned comm_seq) { +nvc0_decoder_ppp(struct nouveau_vp3_decoder *dec, union pipe_desc desc, struct nouveau_vp3_video_buffer *target, unsigned comm_seq) { enum pipe_video_codec codec = u_reduce_video_profile(dec->base.profile); struct nouveau_pushbuf *push = dec->pushbuf[2]; unsigned ppp_caps = 0x10; unsigned fence_extra = 0; -#if NVC0_DEBUG_FENCE +#if NOUVEAU_VP3_DEBUG_FENCE fence_extra = 4; #endif @@ -116,7 +116,7 @@ nvc0_decoder_ppp(struct nvc0_decoder *dec, union pipe_desc desc, struct nouveau_ PUSH_DATA (push, comm_seq); PUSH_DATA (push, ppp_caps); -#if NVC0_DEBUG_FENCE +#if 
NOUVEAU_VP3_DEBUG_FENCE BEGIN_NVC0(push, SUBC_PPP(0x240), 3); PUSH_DATAh(push, (dec->fence_bo->offset + 0x20)); PUSH_DATA (push, (dec->fence_bo->offset + 0x20)); diff --git a/src/gallium/drivers/nvc0/nvc0_video_vp.c b/src/gallium/drivers/nvc0/nvc0_video_vp.c index 74e3915..e3c00b9 100644 --- a/src/gallium/drivers/nvc0/nvc0_video_vp.c +++ b/src/gallium/drivers/nvc0/nvc0_video_vp.c @@ -170,7 +170,7 @@ struct h264_picparm_vp { // 700..a00 }; static void -nvc0_decoder_handle_references(struct nvc0_decoder *dec, struct nouveau_vp3_video_buffer *refs[16], unsigned seq, struct nouveau_vp3_video_buffer *target) +nvc0_decoder_handle_references(struct nouveau_vp3_decoder *dec, struct nouveau_vp3_video_buffer *refs[16], unsigned seq, struct nouveau_vp3_video_buffer *target) { unsigned h264 = u_reduce_video_profile(dec->base.profile) == PIPE_VIDEO_CODEC_MPEG4_AVC; unsigned i, idx, empty_spot = dec->base.max_references + 1; @@ -221,7 +221,7 @@ nvc0_decoder_handle_references(struct nvc0_decoder *dec, struct nouveau_vp3_vide } static void -nvc0_decoder_kick_ref(struct nvc0_decoder *dec, struct nouveau_vp3_video_buffer *target) +nvc0_decoder_kick_ref(struct nouveau_vp3_decoder *dec, struct nouveau_vp3_video_buffer *target) { dec->refs[target->valid_ref].vidbuf = NULL; dec->refs[target->valid_ref].last_used = 0; @@ -229,7 +229,7 @@ nvc0_decoder_kick_ref(struct nvc0_decoder *dec, struct nouveau_vp3_video_buffer } static uint32_t -nvc0_decoder_fill_picparm_mpeg12_vp(struct nvc0_decoder *dec, +nvc0_decoder_fill_picparm_mpeg12_vp(struct nouveau_vp3_decoder *dec, struct pipe_mpeg12_picture_desc *desc, struct nouveau_vp3_video_buffer *refs[16], unsigned *is_ref, @@ -252,10 +252,10 @@ nvc0_decoder_fill_picparm_mpeg12_vp(struct nvc0_decoder *dec, pic_vp->height = mb(dec->base.height); pic_vp->unk08 = pic_vp->unk04 = (dec->base.width+0xf)&~0xf; // Stride - nvc0_decoder_ycbcr_offsets(dec, &pic_vp->ofs[1], &pic_vp->ofs[3], &pic_vp->ofs[4]); + nouveau_vp3_ycbcr_offsets(dec, &pic_vp->ofs[1], 
&pic_vp->ofs[3], &pic_vp->ofs[4]); pic_vp->ofs[5] = pic_vp->ofs[3]; pic_vp->ofs[0] = pic_vp->ofs[2] = 0; - nvc0_decoder_inter_sizes(dec, 1, &ring, &pic_vp->bucket_size, &pic_vp->inter_ring_data_size); + nouveau_vp3_inter_sizes(dec, 1, &ring, &pic_vp->bucket_size, &pic_vp->inter_ring_data_size); pic_vp->alternate_scan = desc->alternate_scan; pic_vp->pad2[0] = pic_vp->pad2[1] = pic_vp->pad2[2] = 0; @@ -278,7 +278,7 @@ nvc0_decoder_fill_picparm_mpeg12_vp(struct nvc0_decoder *dec, } static uint32_t -nvc0_decoder_fill_picparm_mpeg4_vp(struct nvc0_decoder *dec, +nvc0_decoder_fill_picparm_mpeg4_vp(struct nouveau_vp3_decoder *dec, struct pipe_mpeg4_picture_desc *desc, struct nouveau_vp3_video_buffer *refs[16], unsigned *is_ref, @@ -293,11 +293,11 @@ nvc0_decoder_fill_picparm_mpeg4_vp(struct nvc0_decoder *dec, pic_vp->height = mb(dec->base.height)<<4; pic_vp->unk0c = pic_vp->unk08 = mb(dec->base.width)<<4; // Stride - nvc0_decoder_ycbcr_offsets(dec, &pic_vp->ofs[1], &pic_vp->ofs[3], &pic_vp->ofs[4]); + nouveau_vp3_ycbcr_offsets(dec, &pic_vp->ofs[1], &pic_vp->ofs[3], &pic_vp->ofs[4]); pic_vp->ofs[5] = pic_vp->ofs[3]; pic_vp->ofs[0] = pic_vp->ofs[2] = 0; pic_vp->pad1 = pic_vp->pad2 = 0; - nvc0_decoder_inter_sizes(dec, 1, &ring, &pic_vp->bucket_size, &pic_vp->inter_ring_data_size); + nouveau_vp3_inter_sizes(dec, 1, &ring, &pic_vp->bucket_size, &pic_vp->inter_ring_data_size); pic_vp->trd[0] = desc->trd[0]; pic_vp->trd[1] = desc->trd[1]; @@ -326,7 +326,7 @@ nvc0_decoder_fill_picparm_mpeg4_vp(struct nvc0_decoder *dec, } static uint32_t -nvc0_decoder_fill_picparm_h264_vp(struct nvc0_decoder *dec, +nvc0_decoder_fill_picparm_h264_vp(struct nouveau_vp3_decoder *dec, const struct pipe_h264_picture_desc *d, struct nouveau_vp3_video_buffer *refs[16], unsigned *is_ref, @@ -341,12 +341,12 @@ nvc0_decoder_fill_picparm_h264_vp(struct nvc0_decoder *dec, h->width = mb(dec->base.width); h->height = mb(dec->base.height); h->stride1 = h->stride2 = mb(dec->base.width)*16; - 
nvc0_decoder_ycbcr_offsets(dec, &h->ofs[1], &h->ofs[3], &h->ofs[4]); + nouveau_vp3_ycbcr_offsets(dec, &h->ofs[1], &h->ofs[3], &h->ofs[4]); h->ofs[5] = h->ofs[3]; h->ofs[0] = h->ofs[2] = 0; h->u24 = dec->tmp_stride >> 8; assert(h->u24); - nvc0_decoder_inter_sizes(dec, 1, &ring, &h->bucket_size, &h->inter_ring_data_size); + nouveau_vp3_inter_sizes(dec, 1, &ring, &h->bucket_size, &h->inter_ring_data_size); h->u220 = 0; h->f0 = d->mb_adaptive_frame_field_flag; @@ -410,7 +410,7 @@ nvc0_decoder_fill_picparm_h264_vp(struct nvc0_decoder *dec, } static void -nvc0_decoder_fill_picparm_h264_vp_refs(struct nvc0_decoder *dec, +nvc0_decoder_fill_picparm_h264_vp_refs(struct nouveau_vp3_decoder *dec, struct pipe_h264_picture_desc *d, struct nouveau_vp3_video_buffer *refs[16], struct nouveau_vp3_video_buffer *target, @@ -429,7 +429,7 @@ nvc0_decoder_fill_picparm_h264_vp_refs(struct nvc0_decoder *dec, } static uint32_t -nvc0_decoder_fill_picparm_vc1_vp(struct nvc0_decoder *dec, +nvc0_decoder_fill_picparm_vc1_vp(struct nouveau_vp3_decoder *dec, struct pipe_vc1_picture_desc *d, struct nouveau_vp3_video_buffer *refs[16], unsigned *is_ref, @@ -440,14 +440,14 @@ nvc0_decoder_fill_picparm_vc1_vp(struct nvc0_decoder *dec, assert(dec->base.profile != PIPE_VIDEO_PROFILE_VC1_SIMPLE); *is_ref = d->picture_type <= 1; - nvc0_decoder_ycbcr_offsets(dec, &vc->ofs[1], &vc->ofs[3], &vc->ofs[4]); + nouveau_vp3_ycbcr_offsets(dec, &vc->ofs[1], &vc->ofs[3], &vc->ofs[4]); vc->ofs[5] = vc->ofs[3]; vc->ofs[0] = vc->ofs[2] = 0; vc->width = dec->base.width; vc->height = mb(dec->base.height)<<4; vc->unk0c = vc->unk10 = mb(dec->base.width)<<4; // Stride vc->pad = vc->pad2 = 0; - nvc0_decoder_inter_sizes(dec, 1, &ring, &vc->bucket_size, &vc->inter_ring_data_size); + nouveau_vp3_inter_sizes(dec, 1, &ring, &vc->bucket_size, &vc->inter_ring_data_size); vc->profile = dec->base.profile - PIPE_VIDEO_PROFILE_VC1_SIMPLE; vc->loopfilter = d->loopfilter; vc->fastuvmc = d->fastuvmc; @@ -460,8 +460,8 @@ 
nvc0_decoder_fill_picparm_vc1_vp(struct nvc0_decoder *dec, return 0x12; } -#if NVC0_DEBUG_FENCE -static void dump_comm_vp(struct nvc0_decoder *dec, struct comm *comm, u32 comm_seq, +#if NOUVEAU_VP3_DEBUG_FENCE +static void dump_comm_vp(struct nouveau_vp3_decoder *dec, struct comm *comm, u32 comm_seq, struct nouveau_bo *inter_bo, unsigned slice_size) { unsigned i, idx = comm->pvp_cur_index & 0xf; @@ -493,12 +493,12 @@ static void dump_comm_vp(struct nvc0_decoder *dec, struct comm *comm, u32 comm_s } #endif -void nvc0_decoder_vp_caps(struct nvc0_decoder *dec, union pipe_desc desc, +void nvc0_decoder_vp_caps(struct nouveau_vp3_decoder *dec, union pipe_desc desc, struct nouveau_vp3_video_buffer *target, unsigned comm_seq, unsigned *caps, unsigned *is_ref, struct nouveau_vp3_video_buffer *refs[16]) { - struct nouveau_bo *bsp_bo = dec->bsp_bo[comm_seq % NVC0_VIDEO_QDEPTH]; + struct nouveau_bo *bsp_bo = dec->bsp_bo[comm_seq % NOUVEAU_VP3_VIDEO_QDEPTH]; enum pipe_video_codec codec = u_reduce_video_profile(dec->base.profile); char *vp = bsp_bo->map + VP_OFFSET; @@ -527,7 +527,7 @@ void nvc0_decoder_vp_caps(struct nvc0_decoder *dec, union pipe_desc desc, } void -nvc0_decoder_vp(struct nvc0_decoder *dec, union pipe_desc desc, +nvc0_decoder_vp(struct nouveau_vp3_decoder *dec, union pipe_desc desc, struct nouveau_vp3_video_buffer *target, unsigned comm_seq, unsigned caps, unsigned is_ref, struct nouveau_vp3_video_buffer *refs[16]) @@ -536,41 +536,41 @@ nvc0_decoder_vp(struct nvc0_decoder *dec, union pipe_desc desc, uint32_t bsp_addr, comm_addr, inter_addr, ucode_addr, pic_addr[17], last_addr, null_addr; uint32_t slice_size, bucket_size, ring_size, i; enum pipe_video_codec codec = u_reduce_video_profile(dec->base.profile); - struct nouveau_bo *bsp_bo = dec->bsp_bo[comm_seq % NVC0_VIDEO_QDEPTH]; + struct nouveau_bo *bsp_bo = dec->bsp_bo[comm_seq % NOUVEAU_VP3_VIDEO_QDEPTH]; struct nouveau_bo *inter_bo = dec->inter_bo[comm_seq & 1]; u32 fence_extra = 0, codec_extra = 0; struct 
nouveau_pushbuf_refn bo_refs[] = { { inter_bo, NOUVEAU_BO_WR | NOUVEAU_BO_VRAM }, { dec->ref_bo, NOUVEAU_BO_WR | NOUVEAU_BO_VRAM }, { bsp_bo, NOUVEAU_BO_RD | NOUVEAU_BO_VRAM }, -#if NVC0_DEBUG_FENCE +#if NOUVEAU_VP3_DEBUG_FENCE { dec->fence_bo, NOUVEAU_BO_WR | NOUVEAU_BO_GART }, #endif { dec->fw_bo, NOUVEAU_BO_RD | NOUVEAU_BO_VRAM }, }; int num_refs = sizeof(bo_refs)/sizeof(*bo_refs) - !dec->fw_bo; -#if NVC0_DEBUG_FENCE +#if NOUVEAU_VP3_DEBUG_FENCE fence_extra = 4; #endif if (codec == PIPE_VIDEO_CODEC_MPEG4_AVC) { - nvc0_decoder_inter_sizes(dec, desc.h264->slice_count, &slice_size, &bucket_size, &ring_size); + nouveau_vp3_inter_sizes(dec, desc.h264->slice_count, &slice_size, &bucket_size, &ring_size); codec_extra += 2; } else - nvc0_decoder_inter_sizes(dec, 1, &slice_size, &bucket_size, &ring_size); + nouveau_vp3_inter_sizes(dec, 1, &slice_size, &bucket_size, &ring_size); if (dec->base.max_references > 2) codec_extra += 1 + (dec->base.max_references - 2); - pic_addr[16] = nvc0_video_addr(dec, target) >> 8; - last_addr = null_addr = nvc0_video_addr(dec, NULL) >> 8; + pic_addr[16] = nouveau_vp3_video_addr(dec, target) >> 8; + last_addr = null_addr = nouveau_vp3_video_addr(dec, NULL) >> 8; for (i = 0; i < dec->base.max_references; ++i) { if (!refs[i]) pic_addr[i] = last_addr; else if (dec->refs[refs[i]->valid_ref].vidbuf == refs[i]) - last_addr = pic_addr[i] = nvc0_video_addr(dec, refs[i]) >> 8; + last_addr = pic_addr[i] = nouveau_vp3_video_addr(dec, refs[i]) >> 8; else pic_addr[i] = null_addr; } @@ -583,7 +583,7 @@ nvc0_decoder_vp(struct nvc0_decoder *dec, union pipe_desc desc, nouveau_pushbuf_refn(push, bo_refs, num_refs); bsp_addr = bsp_bo->offset >> 8; -#if NVC0_DEBUG_FENCE +#if NOUVEAU_VP3_DEBUG_FENCE comm_addr = (dec->fence_bo->offset + COMM_OFFSET)>>8; #else comm_addr = bsp_addr + (COMM_OFFSET>>8); @@ -635,7 +635,7 @@ nvc0_decoder_vp(struct nvc0_decoder *dec, union pipe_desc desc, //debug_printf("Decoding %08lx with %08lx and %08lx\n", pic_addr[16], 
pic_addr[0], pic_addr[1]); -#if NVC0_DEBUG_FENCE +#if NOUVEAU_VP3_DEBUG_FENCE BEGIN_NVC0(push, SUBC_VP(0x240), 3); PUSH_DATAh(push, (dec->fence_bo->offset + 0x10)); PUSH_DATA (push, (dec->fence_bo->offset + 0x10)); -- 1.8.1.5
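For anyone reading the renamed code without the tree at hand, the reference-address loop in nvc0_decoder_vp() can be restated on its own. This is only a rough sketch: struct sketch_buf, its valid flag and sketch_fill_ref_addrs() are made-up names for illustration. In the patch the "is this a tracked reference" test is dec->refs[refs[i]->valid_ref].vidbuf == refs[i], and the addresses come from nouveau_vp3_video_addr() >> 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a decoded surface; not the driver's struct. */
struct sketch_buf {
   uint32_t addr;   /* pretend VRAM address, already shifted right by 8 */
   int valid;       /* nonzero if the decoder still tracks this surface */
};

/*
 * Fill pic_addr[0..max_refs-1] following the loop in the patch:
 *  - a missing reference (NULL) reuses the last valid address,
 *  - a tracked reference uses its own address and becomes the new "last",
 *  - an untracked reference falls back to the null address.
 */
static void
sketch_fill_ref_addrs(uint32_t *pic_addr, struct sketch_buf *const *refs,
                      unsigned max_refs, uint32_t null_addr)
{
   uint32_t last_addr = null_addr;
   unsigned i;

   assert(pic_addr && refs);
   for (i = 0; i < max_refs; ++i) {
      if (!refs[i])
         pic_addr[i] = last_addr;
      else if (refs[i]->valid)
         last_addr = pic_addr[i] = refs[i]->addr;
      else
         pic_addr[i] = null_addr;
   }
}
```

The point of carrying last_addr forward appears to be that an empty slot still gets a usable surface address instead of a hole, but that is a reading of the loop, not something the patch states.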
Ilia Mirkin
2013-Aug-11 07:19 UTC
[Nouveau] [PATCH 04/10] nvc0: move bsp param-filling logic into nouveau
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
 src/gallium/drivers/nouveau/Makefile.sources       |   3 +-
 src/gallium/drivers/nouveau/nouveau_vp3_video.h    |  10 +-
 .../drivers/nouveau/nouveau_vp3_video_bsp.c        | 310 +++++++++++++++++++++
 src/gallium/drivers/nvc0/nvc0_video_bsp.c          | 277 +-----------------
 4 files changed, 324 insertions(+), 276 deletions(-)
 create mode 100644 src/gallium/drivers/nouveau/nouveau_vp3_video_bsp.c

diff --git a/src/gallium/drivers/nouveau/Makefile.sources b/src/gallium/drivers/nouveau/Makefile.sources
index f7c9249..ca33207 100644
--- a/src/gallium/drivers/nouveau/Makefile.sources
+++ b/src/gallium/drivers/nouveau/Makefile.sources
@@ -5,4 +5,5 @@ C_SOURCES := \
 	nouveau_buffer.c \
 	nouveau_heap.c \
 	nouveau_video.c \
-	nouveau_vp3_video.c
+	nouveau_vp3_video.c \
+	nouveau_vp3_video_bsp.c
diff --git a/src/gallium/drivers/nouveau/nouveau_vp3_video.h b/src/gallium/drivers/nouveau/nouveau_vp3_video.h
index 7322138..2558d57 100644
--- a/src/gallium/drivers/nouveau/nouveau_vp3_video.h
+++ b/src/gallium/drivers/nouveau/nouveau_vp3_video.h
@@ -20,10 +20,10 @@
  * OTHER DEALINGS IN THE SOFTWARE.
*/ -#include "pipe/p_defines.h" +#include <libdrm/nouveau.h> +#include "pipe/p_defines.h" #include "vl/vl_video_buffer.h" - #include "util/u_video.h" struct nouveau_vp3_video_buffer { @@ -196,3 +196,9 @@ struct pipe_video_buffer * nouveau_vp3_video_buffer_create(struct pipe_context *pipe, const struct pipe_video_buffer *templat, int flags); + +uint32_t +nouveau_vp3_bsp(struct nouveau_vp3_decoder *dec, union pipe_desc desc, + struct nouveau_vp3_video_buffer *target, + unsigned comm_seq, unsigned num_buffers, + const void *const *data, const unsigned *num_bytes); diff --git a/src/gallium/drivers/nouveau/nouveau_vp3_video_bsp.c b/src/gallium/drivers/nouveau/nouveau_vp3_video_bsp.c new file mode 100644 index 0000000..0311854 --- /dev/null +++ b/src/gallium/drivers/nouveau/nouveau_vp3_video_bsp.c @@ -0,0 +1,310 @@ +/* + * Copyright 2011-2013 Maarten Lankhorst + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ */ + +#include "nouveau_vp3_video.h" + +struct strparm_bsp { + uint32_t w0[4]; // bits 0-23 length, bits 24-31 addr_hi + uint32_t w1[4]; // bit 8-24 addr_lo + uint32_t unk20; // should be idx * 0x8000000, bitstream offset + uint32_t do_crypto_crap; // set to 0 +}; + +struct mpeg12_picparm_bsp { + uint16_t width; + uint16_t height; + uint8_t picture_structure; + uint8_t picture_coding_type; + uint8_t intra_dc_precision; + uint8_t frame_pred_frame_dct; + uint8_t concealment_motion_vectors; + uint8_t intra_vlc_format; + uint16_t pad; + uint8_t f_code[2][2]; +}; + +struct mpeg4_picparm_bsp { + uint16_t width; + uint16_t height; + uint8_t vop_time_increment_size; + uint8_t interlaced; + uint8_t resync_marker_disable; +}; + +struct vc1_picparm_bsp { + uint16_t width; + uint16_t height; + uint8_t profile; // 04 0 simple, 1 main, 2 advanced + uint8_t postprocflag; // 05 + uint8_t pulldown; // 06 + uint8_t interlaced; // 07 + uint8_t tfcntrflag; // 08 + uint8_t finterpflag; // 09 + uint8_t psf; // 0a + uint8_t pad; // 0b + uint8_t multires; // 0c + uint8_t syncmarker; // 0d + uint8_t rangered; // 0e + uint8_t maxbframes; // 0f + uint8_t dquant; // 10 + uint8_t panscan_flag; // 11 + uint8_t refdist_flag; // 12 + uint8_t quantizer; // 13 + uint8_t extended_mv; // 14 + uint8_t extended_dmv; // 15 + uint8_t overlap; // 16 + uint8_t vstransform; // 17 +}; + +struct h264_picparm_bsp { + // 00 + uint32_t unk00; + // 04 + uint32_t log2_max_frame_num_minus4; // 04 checked + uint32_t pic_order_cnt_type; // 08 checked + uint32_t log2_max_pic_order_cnt_lsb_minus4; // 0c checked + uint32_t delta_pic_order_always_zero_flag; // 10, or unknown + + uint32_t frame_mbs_only_flag; // 14, always 1? + uint32_t direct_8x8_inference_flag; // 18, always 1? + uint32_t width_mb; // 1c checked + uint32_t height_mb; // 20 checked + // 24 + //struct picparm2 + uint32_t entropy_coding_mode_flag; // 00, checked + uint32_t pic_order_present_flag; // 04 checked + uint32_t unk; // 08 seems to be 0? 
+ uint32_t pad1; // 0c seems to be 0? + uint32_t pad2; // 10 always 0 ? + uint32_t num_ref_idx_l0_active_minus1; // 14 always 0? + uint32_t num_ref_idx_l1_active_minus1; // 18 always 0? + uint32_t weighted_pred_flag; // 1c checked + uint32_t weighted_bipred_idc; // 20 checked + uint32_t pic_init_qp_minus26; // 24 checked + uint32_t deblocking_filter_control_present_flag; // 28 always 1? + uint32_t redundant_pic_cnt_present_flag; // 2c always 0? + uint32_t transform_8x8_mode_flag; // 30 checked + uint32_t mb_adaptive_frame_field_flag; // 34 checked-ish + uint8_t field_pic_flag; // 38 checked + uint8_t bottom_field_flag; // 39 checked + uint8_t real_pad[0x1b]; // XX why? +}; + +static uint32_t +nouveau_vp3_fill_picparm_mpeg12_bsp(struct nouveau_vp3_decoder *dec, + struct pipe_mpeg12_picture_desc *desc, + char *map) +{ + struct mpeg12_picparm_bsp *pic_bsp = (struct mpeg12_picparm_bsp *)map; + int i; + pic_bsp->width = dec->base.width; + pic_bsp->height = dec->base.height; + pic_bsp->picture_structure = desc->picture_structure; + pic_bsp->picture_coding_type = desc->picture_coding_type; + pic_bsp->intra_dc_precision = desc->intra_dc_precision; + pic_bsp->frame_pred_frame_dct = desc->frame_pred_frame_dct; + pic_bsp->concealment_motion_vectors = desc->concealment_motion_vectors; + pic_bsp->intra_vlc_format = desc->intra_vlc_format; + pic_bsp->pad = 0; + for (i = 0; i < 4; ++i) + pic_bsp->f_code[i/2][i%2] = desc->f_code[i/2][i%2] + 1; // FU + + return (desc->num_slices << 4) | (dec->base.profile != PIPE_VIDEO_PROFILE_MPEG1); +} + +static uint32_t +nouveau_vp3_fill_picparm_mpeg4_bsp(struct nouveau_vp3_decoder *dec, + struct pipe_mpeg4_picture_desc *desc, + char *map) +{ + struct mpeg4_picparm_bsp *pic_bsp = (struct mpeg4_picparm_bsp *)map; + uint32_t t, bits = 0; + pic_bsp->width = dec->base.width; + pic_bsp->height = dec->base.height; + assert(desc->vop_time_increment_resolution > 0); + + t = desc->vop_time_increment_resolution - 1; + while (t) { + bits++; + t /= 2; + } + 
if (!bits) + bits = 1; + t = desc->vop_time_increment_resolution - 1; + pic_bsp->vop_time_increment_size = bits; + pic_bsp->interlaced = desc->interlaced; + pic_bsp->resync_marker_disable = desc->resync_marker_disable; + return 4; +} + +static uint32_t +nouveau_vp3_fill_picparm_vc1_bsp(struct nouveau_vp3_decoder *dec, + struct pipe_vc1_picture_desc *d, + char *map) +{ + struct vc1_picparm_bsp *vc = (struct vc1_picparm_bsp *)map; + uint32_t caps = (d->slice_count << 4)&0xfff0; + vc->width = dec->base.width; + vc->height = dec->base.height; + vc->profile = dec->base.profile - PIPE_VIDEO_PROFILE_VC1_SIMPLE; // 04 + vc->postprocflag = d->postprocflag; + vc->pulldown = d->pulldown; + vc->interlaced = d->interlace; + vc->tfcntrflag = d->tfcntrflag; // 08 + vc->finterpflag = d->finterpflag; + vc->psf = d->psf; + vc->pad = 0; + vc->multires = d->multires; // 0c + vc->syncmarker = d->syncmarker; + vc->rangered = d->rangered; + vc->maxbframes = d->maxbframes; + vc->dquant = d->dquant; // 10 + vc->panscan_flag = d->panscan_flag; + vc->refdist_flag = d->refdist_flag; + vc->quantizer = d->quantizer; + vc->extended_mv = d->extended_mv; // 14 + vc->extended_dmv = d->extended_dmv; + vc->overlap = d->overlap; + vc->vstransform = d->vstransform; + return caps | 2; +} + +static uint32_t +nouveau_vp3_fill_picparm_h264_bsp(struct nouveau_vp3_decoder *dec, + struct pipe_h264_picture_desc *d, + char *map) +{ + struct h264_picparm_bsp stub_h = {}, *h = &stub_h; + uint32_t caps = (d->slice_count << 4)&0xfff0; + + assert(!(d->slice_count & ~0xfff)); + if (d->slice_count & 0x1000) + caps |= 1 << 20; + + assert(offsetof(struct h264_picparm_bsp, bottom_field_flag) == (0x39 + 0x24)); + h->unk00 = 1; + h->pad1 = h->pad2 = 0; + h->unk = 0; + h->log2_max_frame_num_minus4 = d->log2_max_frame_num_minus4; + h->frame_mbs_only_flag = d->frame_mbs_only_flag; + h->direct_8x8_inference_flag = d->direct_8x8_inference_flag; + h->width_mb = mb(dec->base.width); + h->height_mb = mb(dec->base.height); + 
h->entropy_coding_mode_flag = d->entropy_coding_mode_flag; + h->pic_order_present_flag = d->pic_order_present_flag; + h->pic_order_cnt_type = d->pic_order_cnt_type; + h->log2_max_pic_order_cnt_lsb_minus4 = d->log2_max_pic_order_cnt_lsb_minus4; + h->delta_pic_order_always_zero_flag = d->delta_pic_order_always_zero_flag; + h->num_ref_idx_l0_active_minus1 = d->num_ref_idx_l0_active_minus1; + h->num_ref_idx_l1_active_minus1 = d->num_ref_idx_l1_active_minus1; + h->weighted_pred_flag = d->weighted_pred_flag; + h->weighted_bipred_idc = d->weighted_bipred_idc; + h->pic_init_qp_minus26 = d->pic_init_qp_minus26; + h->deblocking_filter_control_present_flag = d->deblocking_filter_control_present_flag; + h->redundant_pic_cnt_present_flag = d->redundant_pic_cnt_present_flag; + h->transform_8x8_mode_flag = d->transform_8x8_mode_flag; + h->mb_adaptive_frame_field_flag = d->mb_adaptive_frame_field_flag; + h->field_pic_flag = d->field_pic_flag; + h->bottom_field_flag = d->bottom_field_flag; + memset(h->real_pad, 0, sizeof(h->real_pad)); + *(struct h264_picparm_bsp *)map = *h; + return caps | 3; +} + +uint32_t +nouveau_vp3_bsp(struct nouveau_vp3_decoder *dec, union pipe_desc desc, + struct nouveau_vp3_video_buffer *target, + unsigned comm_seq, unsigned num_buffers, + const void *const *data, const unsigned *num_bytes) +{ + enum pipe_video_codec codec = u_reduce_video_profile(dec->base.profile); + struct nouveau_bo *bsp_bo = dec->bsp_bo[comm_seq % NOUVEAU_VP3_VIDEO_QDEPTH]; + char *bsp; + uint32_t endmarker, caps; + struct strparm_bsp *str_bsp; + int i; + + bsp = bsp_bo->map; + /* + * 0x000..0x100: picparm_bsp + * 0x200..0x500: picparm_vp + * 0x500..0x700: comm + * 0x700..onward: raw bitstream + */ + + switch (codec){ + case PIPE_VIDEO_CODEC_MPEG12: + endmarker = 0xb7010000; + caps = nouveau_vp3_fill_picparm_mpeg12_bsp(dec, desc.mpeg12, bsp); + break; + case PIPE_VIDEO_CODEC_MPEG4: + endmarker = 0xb1010000; + caps = nouveau_vp3_fill_picparm_mpeg4_bsp(dec, desc.mpeg4, bsp); + break; + 
case PIPE_VIDEO_CODEC_VC1: { + endmarker = 0x0a010000; + caps = nouveau_vp3_fill_picparm_vc1_bsp(dec, desc.vc1, bsp); + break; + } + case PIPE_VIDEO_CODEC_MPEG4_AVC: { + endmarker = 0x0b010000; + caps = nouveau_vp3_fill_picparm_h264_bsp(dec, desc.h264, bsp); + break; + } + default: assert(0); return -1; + } + + caps |= 0 << 16; // reset struct comm if flag is set + caps |= 1 << 17; // enable watchdog + caps |= 0 << 18; // do not report error to VP, so it can continue decoding what we have + caps |= 0 << 19; // if enabled, use crypto crap? + bsp += 0x100; + + str_bsp = (struct strparm_bsp *)bsp; + memset(str_bsp, 0, 0x80); + str_bsp->w0[0] = 16; + str_bsp->w1[0] = 0x1; + bsp += 0x100; + /* Reserved for picparm_vp */ + bsp += 0x300; + /* Reserved for comm */ +#if !NOUVEAU_VP3_DEBUG_FENCE + memset(bsp, 0, 0x200); +#endif + bsp += 0x200; + for (i = 0; i < num_buffers; ++i) { + memcpy(bsp, data[i], num_bytes[i]); + bsp += num_bytes[i]; + str_bsp->w0[0] += num_bytes[i]; + } + + /* Append end sequence */ + *(uint32_t *)bsp = endmarker; + bsp += 4; + *(uint32_t *)bsp = 0x00000000; + bsp += 4; + *(uint32_t *)bsp = endmarker; + bsp += 4; + *(uint32_t *)bsp = 0x00000000; + + return caps; +} diff --git a/src/gallium/drivers/nvc0/nvc0_video_bsp.c b/src/gallium/drivers/nvc0/nvc0_video_bsp.c index c632a5e..06c85e6 100644 --- a/src/gallium/drivers/nvc0/nvc0_video_bsp.c +++ b/src/gallium/drivers/nvc0/nvc0_video_bsp.c @@ -22,214 +22,6 @@ #include "nvc0_video.h" -struct strparm_bsp { - uint32_t w0[4]; // bits 0-23 length, bits 24-31 addr_hi - uint32_t w1[4]; // bit 8-24 addr_lo - uint32_t unk20; // should be idx * 0x8000000, bitstream offset - uint32_t do_crypto_crap; // set to 0 -}; - -struct mpeg12_picparm_bsp { - uint16_t width; - uint16_t height; - uint8_t picture_structure; - uint8_t picture_coding_type; - uint8_t intra_dc_precision; - uint8_t frame_pred_frame_dct; - uint8_t concealment_motion_vectors; - uint8_t intra_vlc_format; - uint16_t pad; - uint8_t f_code[2][2]; -}; - 
-struct mpeg4_picparm_bsp { - uint16_t width; - uint16_t height; - uint8_t vop_time_increment_size; - uint8_t interlaced; - uint8_t resync_marker_disable; -}; - -struct vc1_picparm_bsp { - uint16_t width; - uint16_t height; - uint8_t profile; // 04 0 simple, 1 main, 2 advanced - uint8_t postprocflag; // 05 - uint8_t pulldown; // 06 - uint8_t interlaced; // 07 - uint8_t tfcntrflag; // 08 - uint8_t finterpflag; // 09 - uint8_t psf; // 0a - uint8_t pad; // 0b - uint8_t multires; // 0c - uint8_t syncmarker; // 0d - uint8_t rangered; // 0e - uint8_t maxbframes; // 0f - uint8_t dquant; // 10 - uint8_t panscan_flag; // 11 - uint8_t refdist_flag; // 12 - uint8_t quantizer; // 13 - uint8_t extended_mv; // 14 - uint8_t extended_dmv; // 15 - uint8_t overlap; // 16 - uint8_t vstransform; // 17 -}; - -struct h264_picparm_bsp { - // 00 - uint32_t unk00; - // 04 - uint32_t log2_max_frame_num_minus4; // 04 checked - uint32_t pic_order_cnt_type; // 08 checked - uint32_t log2_max_pic_order_cnt_lsb_minus4; // 0c checked - uint32_t delta_pic_order_always_zero_flag; // 10, or unknown - - uint32_t frame_mbs_only_flag; // 14, always 1? - uint32_t direct_8x8_inference_flag; // 18, always 1? - uint32_t width_mb; // 1c checked - uint32_t height_mb; // 20 checked - // 24 - //struct picparm2 - uint32_t entropy_coding_mode_flag; // 00, checked - uint32_t pic_order_present_flag; // 04 checked - uint32_t unk; // 08 seems to be 0? - uint32_t pad1; // 0c seems to be 0? - uint32_t pad2; // 10 always 0 ? - uint32_t num_ref_idx_l0_active_minus1; // 14 always 0? - uint32_t num_ref_idx_l1_active_minus1; // 18 always 0? - uint32_t weighted_pred_flag; // 1c checked - uint32_t weighted_bipred_idc; // 20 checked - uint32_t pic_init_qp_minus26; // 24 checked - uint32_t deblocking_filter_control_present_flag; // 28 always 1? - uint32_t redundant_pic_cnt_present_flag; // 2c always 0? 
- uint32_t transform_8x8_mode_flag; // 30 checked - uint32_t mb_adaptive_frame_field_flag; // 34 checked-ish - uint8_t field_pic_flag; // 38 checked - uint8_t bottom_field_flag; // 39 checked - uint8_t real_pad[0x1b]; // XX why? -}; - -static uint32_t -nvc0_decoder_fill_picparm_mpeg12_bsp(struct nouveau_vp3_decoder *dec, - struct pipe_mpeg12_picture_desc *desc, - char *map) -{ - struct mpeg12_picparm_bsp *pic_bsp = (struct mpeg12_picparm_bsp *)map; - int i; - pic_bsp->width = dec->base.width; - pic_bsp->height = dec->base.height; - pic_bsp->picture_structure = desc->picture_structure; - pic_bsp->picture_coding_type = desc->picture_coding_type; - pic_bsp->intra_dc_precision = desc->intra_dc_precision; - pic_bsp->frame_pred_frame_dct = desc->frame_pred_frame_dct; - pic_bsp->concealment_motion_vectors = desc->concealment_motion_vectors; - pic_bsp->intra_vlc_format = desc->intra_vlc_format; - pic_bsp->pad = 0; - for (i = 0; i < 4; ++i) - pic_bsp->f_code[i/2][i%2] = desc->f_code[i/2][i%2] + 1; // FU - - return (desc->num_slices << 4) | (dec->base.profile != PIPE_VIDEO_PROFILE_MPEG1); -} - -static uint32_t -nvc0_decoder_fill_picparm_mpeg4_bsp(struct nouveau_vp3_decoder *dec, - struct pipe_mpeg4_picture_desc *desc, - char *map) -{ - struct mpeg4_picparm_bsp *pic_bsp = (struct mpeg4_picparm_bsp *)map; - uint32_t t, bits = 0; - pic_bsp->width = dec->base.width; - pic_bsp->height = dec->base.height; - assert(desc->vop_time_increment_resolution > 0); - - t = desc->vop_time_increment_resolution - 1; - while (t) { - bits++; - t /= 2; - } - if (!bits) - bits = 1; - t = desc->vop_time_increment_resolution - 1; - pic_bsp->vop_time_increment_size = bits; - pic_bsp->interlaced = desc->interlaced; - pic_bsp->resync_marker_disable = desc->resync_marker_disable; - return 4; -} - -static uint32_t -nvc0_decoder_fill_picparm_vc1_bsp(struct nouveau_vp3_decoder *dec, - struct pipe_vc1_picture_desc *d, - char *map) -{ - struct vc1_picparm_bsp *vc = (struct vc1_picparm_bsp *)map; - uint32_t 
caps = (d->slice_count << 4)&0xfff0; - vc->width = dec->base.width; - vc->height = dec->base.height; - vc->profile = dec->base.profile - PIPE_VIDEO_PROFILE_VC1_SIMPLE; // 04 - vc->postprocflag = d->postprocflag; - vc->pulldown = d->pulldown; - vc->interlaced = d->interlace; - vc->tfcntrflag = d->tfcntrflag; // 08 - vc->finterpflag = d->finterpflag; - vc->psf = d->psf; - vc->pad = 0; - vc->multires = d->multires; // 0c - vc->syncmarker = d->syncmarker; - vc->rangered = d->rangered; - vc->maxbframes = d->maxbframes; - vc->dquant = d->dquant; // 10 - vc->panscan_flag = d->panscan_flag; - vc->refdist_flag = d->refdist_flag; - vc->quantizer = d->quantizer; - vc->extended_mv = d->extended_mv; // 14 - vc->extended_dmv = d->extended_dmv; - vc->overlap = d->overlap; - vc->vstransform = d->vstransform; - return caps | 2; -} - -static uint32_t -nvc0_decoder_fill_picparm_h264_bsp(struct nouveau_vp3_decoder *dec, - struct pipe_h264_picture_desc *d, - char *map) -{ - struct h264_picparm_bsp stub_h = {}, *h = &stub_h; - uint32_t caps = (d->slice_count << 4)&0xfff0; - - assert(!(d->slice_count & ~0xfff)); - if (d->slice_count & 0x1000) - caps |= 1 << 20; - - assert(offsetof(struct h264_picparm_bsp, bottom_field_flag) == (0x39 + 0x24)); - h->unk00 = 1; - h->pad1 = h->pad2 = 0; - h->unk = 0; - h->log2_max_frame_num_minus4 = d->log2_max_frame_num_minus4; - h->frame_mbs_only_flag = d->frame_mbs_only_flag; - h->direct_8x8_inference_flag = d->direct_8x8_inference_flag; - h->width_mb = mb(dec->base.width); - h->height_mb = mb(dec->base.height); - h->entropy_coding_mode_flag = d->entropy_coding_mode_flag; - h->pic_order_present_flag = d->pic_order_present_flag; - h->pic_order_cnt_type = d->pic_order_cnt_type; - h->log2_max_pic_order_cnt_lsb_minus4 = d->log2_max_pic_order_cnt_lsb_minus4; - h->delta_pic_order_always_zero_flag = d->delta_pic_order_always_zero_flag; - h->num_ref_idx_l0_active_minus1 = d->num_ref_idx_l0_active_minus1; - h->num_ref_idx_l1_active_minus1 = 
d->num_ref_idx_l1_active_minus1; - h->weighted_pred_flag = d->weighted_pred_flag; - h->weighted_bipred_idc = d->weighted_bipred_idc; - h->pic_init_qp_minus26 = d->pic_init_qp_minus26; - h->deblocking_filter_control_present_flag = d->deblocking_filter_control_present_flag; - h->redundant_pic_cnt_present_flag = d->redundant_pic_cnt_present_flag; - h->transform_8x8_mode_flag = d->transform_8x8_mode_flag; - h->mb_adaptive_frame_field_flag = d->mb_adaptive_frame_field_flag; - h->field_pic_flag = d->field_pic_flag; - h->bottom_field_flag = d->bottom_field_flag; - memset(h->real_pad, 0, sizeof(h->real_pad)); - *(struct h264_picparm_bsp *)map = *h; - return caps | 3; -} - #if NOUVEAU_VP3_DEBUG_FENCE static void dump_comm_bsp(struct comm *comm) { @@ -249,12 +41,10 @@ nvc0_decoder_bsp(struct nouveau_vp3_decoder *dec, union pipe_desc desc, { struct nouveau_pushbuf *push = dec->pushbuf[0]; enum pipe_video_codec codec = u_reduce_video_profile(dec->base.profile); - char *bsp; uint32_t bsp_addr, comm_addr, inter_addr; uint32_t slice_size, bucket_size, ring_size; - uint32_t endmarker, caps; - struct strparm_bsp *str_bsp; - int ret, i; + uint32_t caps; + int ret; struct nouveau_bo *bsp_bo = dec->bsp_bo[comm_seq % NOUVEAU_VP3_VIDEO_QDEPTH]; struct nouveau_bo *inter_bo = dec->inter_bo[comm_seq & 1]; unsigned fence_extra = 0; @@ -280,74 +70,15 @@ nvc0_decoder_bsp(struct nouveau_vp3_decoder *dec, union pipe_desc desc, debug_printf("map failed: %i %s\n", ret, strerror(-ret)); return -1; } - bsp = bsp_bo->map; - /* - * 0x000..0x100: picparm_bsp - * 0x200..0x500: picparm_vp - * 0x500..0x700: comm - * 0x700..onward: raw bitstream - */ - switch (codec){ - case PIPE_VIDEO_CODEC_MPEG12: - endmarker = 0xb7010000; - caps = nvc0_decoder_fill_picparm_mpeg12_bsp(dec, desc.mpeg12, bsp); - break; - case PIPE_VIDEO_CODEC_MPEG4: - endmarker = 0xb1010000; - caps = nvc0_decoder_fill_picparm_mpeg4_bsp(dec, desc.mpeg4, bsp); - break; - case PIPE_VIDEO_CODEC_VC1: { - endmarker = 0x0a010000; - caps = 
nvc0_decoder_fill_picparm_vc1_bsp(dec, desc.vc1, bsp); - break; - } - case PIPE_VIDEO_CODEC_MPEG4_AVC: { - endmarker = 0x0b010000; - caps = nvc0_decoder_fill_picparm_h264_bsp(dec, desc.h264, bsp); - break; - } - default: assert(0); return -1; - } + caps = nouveau_vp3_bsp(dec, desc, target, comm_seq, + num_buffers, data, num_bytes); nvc0_decoder_vp_caps(dec, desc, target, comm_seq, vp_caps, is_ref, refs); nouveau_pushbuf_space(push, 6 + (codec == PIPE_VIDEO_CODEC_MPEG4_AVC ? 9 : 7) + fence_extra + 2, num_refs, 0); nouveau_pushbuf_refn(push, bo_refs, num_refs); - caps |= 0 << 16; // reset struct comm if flag is set - caps |= 1 << 17; // enable watchdog - caps |= 0 << 18; // do not report error to VP, so it can continue decoding what we have - caps |= 0 << 19; // if enabled, use crypto crap? - bsp += 0x100; - - str_bsp = (struct strparm_bsp *)bsp; - memset(str_bsp, 0, 0x80); - str_bsp->w0[0] = 16; - str_bsp->w1[0] = 0x1; - bsp += 0x100; - /* Reserved for picparm_vp */ - bsp += 0x300; - /* Reserved for comm */ -#if !NOUVEAU_VP3_DEBUG_FENCE - memset(bsp, 0, 0x200); -#endif - bsp += 0x200; - for (i = 0; i < num_buffers; ++i) { - memcpy(bsp, data[i], num_bytes[i]); - bsp += num_bytes[i]; - str_bsp->w0[0] += num_bytes[i]; - } - - /* Append end sequence */ - *(uint32_t *)bsp = endmarker; - bsp += 4; - *(uint32_t *)bsp = 0x00000000; - bsp += 4; - *(uint32_t *)bsp = endmarker; - bsp += 4; - *(uint32_t *)bsp = 0x00000000; - bsp_addr = bsp_bo->offset >> 8; inter_addr = inter_bo->offset >> 8; -- 1.8.1.5
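The buffer layout that nouveau_vp3_bsp() writes is easy to lose in the diff, so here is a small standalone sketch of just the bitstream packing and end sequence. The offsets come from the layout comment in the patch plus the pointer arithmetic (the strparm block sits at 0x100). The names sketch_pack_bitstream() and the SKETCH_* constants are invented for this example, and it uses memcpy for the 32-bit stores instead of the pointer casts in the patch, so treat it as an illustration under those assumptions rather than the driver code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Offsets taken from the layout comment in nouveau_vp3_bsp(). */
enum {
   SKETCH_PICPARM_BSP = 0x000,
   SKETCH_STRPARM_BSP = 0x100,
   SKETCH_PICPARM_VP  = 0x200,
   SKETCH_COMM        = 0x500,
   SKETCH_BITSTREAM   = 0x700,
};

/*
 * Copy the slice buffers to the bitstream area and append the end sequence
 * (marker, zero, marker, zero). Returns the length that the patch
 * accumulates in str_bsp->w0[0]: 16 bytes of end sequence plus the payload.
 * "map" is a hypothetical CPU mapping of at least 0x700 + payload + 16 bytes.
 */
static uint32_t
sketch_pack_bitstream(uint8_t *map, uint32_t endmarker, unsigned num_buffers,
                      const void *const *data, const unsigned *num_bytes)
{
   uint8_t *p = map + SKETCH_BITSTREAM;
   const uint32_t zero = 0;
   uint32_t len = 16;
   unsigned i;

   assert(map);
   for (i = 0; i < num_buffers; ++i) {
      memcpy(p, data[i], num_bytes[i]);
      p += num_bytes[i];
      len += num_bytes[i];
   }
   /* memcpy sidesteps unaligned 32-bit stores when the payload size is odd. */
   memcpy(p, &endmarker, 4); p += 4;
   memcpy(p, &zero, 4);      p += 4;
   memcpy(p, &endmarker, 4); p += 4;
   memcpy(p, &zero, 4);
   return len;
}
```

With the MPEG1/2 marker 0xb7010000 and two slices of 3 and 2 bytes, the returned length is 16 + 5, which matches how w0[0] starts at 16 and grows by num_bytes[i] in the patch.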
Ilia Mirkin
2013-Aug-11 07:19 UTC
[Nouveau] [PATCH 05/10] nvc0: move vp param filling logic into nouveau
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
 src/gallium/drivers/nouveau/Makefile.sources       |   3 +-
 src/gallium/drivers/nouveau/nouveau_vp3_video.h    |   6 +
 src/gallium/drivers/nouveau/nouveau_vp3_video_vp.c | 485 +++++++++++++++++++++
 src/gallium/drivers/nvc0/nvc0_video.h              |   7 -
 src/gallium/drivers/nvc0/nvc0_video_bsp.c          |   2 +-
 src/gallium/drivers/nvc0/nvc0_video_vp.c           | 472 +-------------------
 6 files changed, 499 insertions(+), 476 deletions(-)
 create mode 100644 src/gallium/drivers/nouveau/nouveau_vp3_video_vp.c

diff --git a/src/gallium/drivers/nouveau/Makefile.sources b/src/gallium/drivers/nouveau/Makefile.sources
index ca33207..7912f67 100644
--- a/src/gallium/drivers/nouveau/Makefile.sources
+++ b/src/gallium/drivers/nouveau/Makefile.sources
@@ -6,4 +6,5 @@ C_SOURCES := \
 	nouveau_heap.c \
 	nouveau_video.c \
 	nouveau_vp3_video.c \
-	nouveau_vp3_video_bsp.c
+	nouveau_vp3_video_bsp.c \
+	nouveau_vp3_video_vp.c
diff --git a/src/gallium/drivers/nouveau/nouveau_vp3_video.h b/src/gallium/drivers/nouveau/nouveau_vp3_video.h
index 2558d57..8d3548a 100644
--- a/src/gallium/drivers/nouveau/nouveau_vp3_video.h
+++ b/src/gallium/drivers/nouveau/nouveau_vp3_video.h
@@ -202,3 +202,9 @@ nouveau_vp3_bsp(struct nouveau_vp3_decoder *dec, union pipe_desc desc,
                 struct nouveau_vp3_video_buffer *target,
                 unsigned comm_seq, unsigned num_buffers,
                 const void *const *data, const unsigned *num_bytes);
+
+void
+nouveau_vp3_vp_caps(struct nouveau_vp3_decoder *dec, union pipe_desc desc,
+                    struct nouveau_vp3_video_buffer *target, unsigned comm_seq,
+                    unsigned *caps, unsigned *is_ref,
+                    struct nouveau_vp3_video_buffer *refs[16]);
diff --git a/src/gallium/drivers/nouveau/nouveau_vp3_video_vp.c b/src/gallium/drivers/nouveau/nouveau_vp3_video_vp.c
new file mode 100644
index 0000000..c9b1b99
--- /dev/null
+++ b/src/gallium/drivers/nouveau/nouveau_vp3_video_vp.c
@@ -0,0 +1,485 @@
+/*
+ * Copyright 2011-2013 Maarten Lankhorst
+ *
+ * Permission is hereby granted, free of charge, to any person
obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +#include "nouveau_vp3_video.h" + +struct mpeg12_picparm_vp { + uint16_t width; // 00 in mb units + uint16_t height; // 02 in mb units + + uint32_t unk04; // 04 stride for Y? + uint32_t unk08; // 08 stride for CbCr? + + uint32_t ofs[6]; // 1c..20 ofs + uint32_t bucket_size; // 24 + uint32_t inter_ring_data_size; // 28 + uint16_t unk2c; // 2c + uint16_t alternate_scan; // 2e + uint16_t unk30; // 30 not seen set yet + uint16_t picture_structure; // 32 + uint16_t pad2[3]; + uint16_t unk3a; // 3a set on I frame? 
+ + uint32_t f_code[4]; // 3c + uint32_t picture_coding_type; // 4c + uint32_t intra_dc_precision; // 50 + uint32_t q_scale_type; // 54 + uint32_t top_field_first; // 58 + uint32_t full_pel_forward_vector; // 5c + uint32_t full_pel_backward_vector; // 60 + uint8_t intra_quantizer_matrix[0x40]; // 64 + uint8_t non_intra_quantizer_matrix[0x40]; // a4 +}; + +struct mpeg4_picparm_vp { + uint32_t width; // 00 in normal units + uint32_t height; // 04 in normal units + uint32_t unk08; // stride 1 + uint32_t unk0c; // stride 2 + uint32_t ofs[6]; // 10..24 ofs + uint32_t bucket_size; // 28 + uint32_t pad1; // 2c, pad + uint32_t pad2; // 30 + uint32_t inter_ring_data_size; // 34 + + uint32_t trd[2]; // 38, 3c + uint32_t trb[2]; // 40, 44 + uint32_t u48; // XXX codec selection? Should test with different values of VdpDecoderProfile + uint16_t f_code_fw; // 4c + uint16_t f_code_bw; // 4e + uint8_t interlaced; // 50 + + uint8_t quant_type; // bool, written to 528 + uint8_t quarter_sample; // bool, written to 548 + uint8_t short_video_header; // bool, negated written to 528 shifted by 1 + uint8_t u54; // bool, written to 0x740 + uint8_t vop_coding_type; // 55 + uint8_t rounding_control; // 56 + uint8_t alternate_vertical_scan_flag; // 57 bool + uint8_t top_field_first; // bool, written to vuc + + uint8_t pad4[3]; // 59, 5a, 5b, contains garbage on blob + uint32_t pad5[0x10]; // 5c...9c non-inclusive, but WHY? + + uint32_t intra[0x10]; // 9c + uint32_t non_intra[0x10]; // bc + // udc..uff pad? 
+}; + +// Full version, with data pumped from BSP +struct vc1_picparm_vp { + uint32_t bucket_size; // 00 + uint32_t pad; // 04 + + uint32_t inter_ring_data_size; // 08 + uint32_t unk0c; // stride 1 + uint32_t unk10; // stride 2 + uint32_t ofs[6]; // 14..28 ofs + + uint16_t width; // 2c + uint16_t height; // 2e + + uint8_t profile; // 30 0 = simple, 1 = main, 2 = advanced + uint8_t loopfilter; // 31 written into vuc + uint8_t fastuvmc; // 32, written into vuc + uint8_t dquant; // 33 + + uint8_t overlap; // 34 + uint8_t quantizer; // 35 + uint8_t u36; // 36, bool + uint8_t pad2; // 37, to align to 0x38 +}; + +struct h264_picparm_vp { // 700..a00 + uint16_t width, height; + uint32_t stride1, stride2; // 04 08 + uint32_t ofs[6]; // 0c..24 in-image offset + + uint32_t u24; // nfi ac8 ? + uint32_t bucket_size; // 28 bucket size + uint32_t inter_ring_data_size; // 2c + + unsigned f0 : 1; // 0 0x01: into 640 shifted by 3, 540 shifted by 5, half size something? + unsigned f1 : 1; // 1 0x02: into vuc ofs 56 + unsigned weighted_pred_flag : 1; // 2 0x04 + unsigned f3 : 1; // 3 0x08: into vuc ofs 68 + unsigned is_reference : 1; // 4 + unsigned interlace : 1; // 5 field_pic_flag + unsigned bottom_field_flag : 1; // 6 + unsigned f7 : 1; // 7 0x80: nfi yet + + signed log2_max_frame_num_minus4 : 4; // 31 0..3 + unsigned u31_45 : 2; // 31 4..5 + unsigned pic_order_cnt_type : 2; // 31 6..7 + signed pic_init_qp_minus26 : 6; // 32 0..5 + signed chroma_qp_index_offset : 5; // 32 6..10 + signed second_chroma_qp_index_offset : 5; // 32 11..15 + + unsigned weighted_bipred_idc : 2; // 34 0..1 + unsigned fifo_dec_index : 7; // 34 2..8 + unsigned tmp_idx : 5; // 34 9..13 + unsigned frame_number : 16; // 34 14..29 + unsigned u34_3030 : 1; // 34 30..30 pp.u34[30:30] + unsigned u34_3131 : 1; // 34 31..31 pad? + + uint32_t field_order_cnt[2]; // 38, 3c + + struct { // 40 + // 0x00223102 + // nfi (needs: top_is_reference, bottom_is_reference, is_long_term, maybe some other state that was saved.. 
+ unsigned fifo_idx : 7; // 00 0..6 + unsigned tmp_idx : 5; // 00 7..11 + unsigned unk12 : 1; // 00 12 not seen yet, but set, maybe top_is_reference + unsigned unk13 : 1; // 00 13 not seen yet, but set, maybe bottom_is_reference? + unsigned unk14 : 1; // 00 14 skipped? + unsigned notseenyet : 1; // 00 15 pad? + unsigned unk16 : 1; // 00 16 + unsigned unk17 : 4; // 00 17..20 + unsigned unk21 : 4; // 00 21..24 + unsigned pad : 7; // 00 d25..31 + + uint32_t field_order_cnt[2]; // 04,08 + uint32_t frame_idx; // 0c + } refs[0x10]; + + uint8_t m4x4[6][16]; // 140 + uint8_t m8x8[2][64]; // 1a0 + uint32_t u220; // 220 number of extra reorder_list to append? + uint8_t u224[0x20]; // 224..244 reorder_list append ? + uint8_t nfi244[0xb0]; // add some pad to make sure nulls are read +}; + +static void +nouveau_vp3_handle_references(struct nouveau_vp3_decoder *dec, struct nouveau_vp3_video_buffer *refs[16], unsigned seq, struct nouveau_vp3_video_buffer *target) +{ + unsigned h264 = u_reduce_video_profile(dec->base.profile) == PIPE_VIDEO_CODEC_MPEG4_AVC; + unsigned i, idx, empty_spot = dec->base.max_references + 1; + for (i = 0; i < dec->base.max_references; ++i) { + if (!refs[i]) + continue; + + idx = refs[i]->valid_ref; + //debug_printf("ref[%i] %p in slot %i\n", i, refs[i], idx); + assert(target != refs[i] || + (h264 && empty_spot && + (!dec->refs[idx].decoded_bottom || !dec->refs[idx].decoded_top))); + if (target == refs[i]) + empty_spot = 0; + + if (dec->refs[idx].vidbuf != refs[i]) { + debug_printf("%p is not a real ref\n", refs[i]); + // FIXME: Maybe do m2mf copy here if a application really depends on it? + continue; + } + + assert(dec->refs[idx].vidbuf == refs[i]); + dec->refs[idx].last_used = seq; + } + if (!empty_spot) + return; + + /* Try to find a real empty spot first, there should be one.. 
+ */ + for (i = 0; i < dec->base.max_references + 1; ++i) { + if (dec->refs[i].last_used < seq) { + if (!dec->refs[i].vidbuf) { + empty_spot = i; + break; + } + if (empty_spot < dec->base.max_references+1 && + dec->refs[empty_spot].last_used < dec->refs[i].last_used) + continue; + empty_spot = i; + } + } + assert(empty_spot < dec->base.max_references+1); + dec->refs[empty_spot].last_used = seq; +// debug_printf("Kicked %p to add %p to slot %i\n", dec->refs[empty_spot].vidbuf, target, i); + dec->refs[empty_spot].vidbuf = target; + dec->refs[empty_spot].decoded_bottom = dec->refs[empty_spot].decoded_top = 0; + target->valid_ref = empty_spot; +} + +static uint32_t +nouveau_vp3_fill_picparm_mpeg12_vp(struct nouveau_vp3_decoder *dec, + struct pipe_mpeg12_picture_desc *desc, + struct nouveau_vp3_video_buffer *refs[16], + unsigned *is_ref, + char *map) +{ + struct mpeg12_picparm_vp pic_vp_stub = {}, *pic_vp = &pic_vp_stub; + uint32_t i, ret = 0x01010, ring; // !async_shutdown << 16 | watchdog << 12 | irq_record << 4 | unk; + assert(!(dec->base.width & 0xf)); + *is_ref = desc->picture_coding_type <= 2; + + if (dec->base.profile == PIPE_VIDEO_PROFILE_MPEG1) + pic_vp->picture_structure = 3; + else + pic_vp->picture_structure = desc->picture_structure; + + assert(desc->picture_structure != 4); + if (desc->picture_structure == 4) // Untested, but should work + ret |= 0x100; + pic_vp->width = mb(dec->base.width); + pic_vp->height = mb(dec->base.height); + pic_vp->unk08 = pic_vp->unk04 = (dec->base.width+0xf)&~0xf; // Stride + + nouveau_vp3_ycbcr_offsets(dec, &pic_vp->ofs[1], &pic_vp->ofs[3], &pic_vp->ofs[4]); + pic_vp->ofs[5] = pic_vp->ofs[3]; + pic_vp->ofs[0] = pic_vp->ofs[2] = 0; + nouveau_vp3_inter_sizes(dec, 1, &ring, &pic_vp->bucket_size, &pic_vp->inter_ring_data_size); + + pic_vp->alternate_scan = desc->alternate_scan; + pic_vp->pad2[0] = pic_vp->pad2[1] = pic_vp->pad2[2] = 0; + pic_vp->unk30 = desc->picture_structure < 3 && (desc->picture_structure == 2 - 
desc->top_field_first); + pic_vp->unk3a = (desc->picture_coding_type == 1); + for (i = 0; i < 4; ++i) + pic_vp->f_code[i] = desc->f_code[i/2][i%2] + 1; // FU + pic_vp->picture_coding_type = desc->picture_coding_type; + pic_vp->intra_dc_precision = desc->intra_dc_precision; + pic_vp->q_scale_type = desc->q_scale_type; + pic_vp->top_field_first = desc->top_field_first; + pic_vp->full_pel_forward_vector = desc->full_pel_forward_vector; + pic_vp->full_pel_backward_vector = desc->full_pel_backward_vector; + memcpy(pic_vp->intra_quantizer_matrix, desc->intra_matrix, 0x40); + memcpy(pic_vp->non_intra_quantizer_matrix, desc->non_intra_matrix, 0x40); + memcpy(map, pic_vp, sizeof(*pic_vp)); + refs[0] = (struct nouveau_vp3_video_buffer *)desc->ref[0]; + refs[!!refs[0]] = (struct nouveau_vp3_video_buffer *)desc->ref[1]; + return ret | (dec->base.profile != PIPE_VIDEO_PROFILE_MPEG1); +} + +static uint32_t +nouveau_vp3_fill_picparm_mpeg4_vp(struct nouveau_vp3_decoder *dec, + struct pipe_mpeg4_picture_desc *desc, + struct nouveau_vp3_video_buffer *refs[16], + unsigned *is_ref, + char *map) +{ + struct mpeg4_picparm_vp pic_vp_stub = {}, *pic_vp = &pic_vp_stub; + uint32_t ring, ret = 0x01014; // !async_shutdown << 16 | watchdog << 12 | irq_record << 4 | unk; + assert(!(dec->base.width & 0xf)); + *is_ref = desc->vop_coding_type <= 1; + + pic_vp->width = dec->base.width; + pic_vp->height = mb(dec->base.height)<<4; + pic_vp->unk0c = pic_vp->unk08 = mb(dec->base.width)<<4; // Stride + + nouveau_vp3_ycbcr_offsets(dec, &pic_vp->ofs[1], &pic_vp->ofs[3], &pic_vp->ofs[4]); + pic_vp->ofs[5] = pic_vp->ofs[3]; + pic_vp->ofs[0] = pic_vp->ofs[2] = 0; + pic_vp->pad1 = pic_vp->pad2 = 0; + nouveau_vp3_inter_sizes(dec, 1, &ring, &pic_vp->bucket_size, &pic_vp->inter_ring_data_size); + + pic_vp->trd[0] = desc->trd[0]; + pic_vp->trd[1] = desc->trd[1]; + pic_vp->trb[0] = desc->trb[0]; + pic_vp->trb[1] = desc->trb[1]; + pic_vp->u48 = 0; // Codec? 
+ pic_vp->pad1 = pic_vp->pad2 = 0; + pic_vp->f_code_fw = desc->vop_fcode_forward; + pic_vp->f_code_bw = desc->vop_fcode_backward; + pic_vp->interlaced = desc->interlaced; + pic_vp->quant_type = desc->quant_type; + pic_vp->quarter_sample = desc->quarter_sample; + pic_vp->short_video_header = desc->short_video_header; + pic_vp->u54 = 0; + pic_vp->vop_coding_type = desc->vop_coding_type; + pic_vp->rounding_control = desc->rounding_control; + pic_vp->alternate_vertical_scan_flag = desc->alternate_vertical_scan_flag; + pic_vp->top_field_first = desc->top_field_first; + + memcpy(pic_vp->intra, desc->intra_matrix, 0x40); + memcpy(pic_vp->non_intra, desc->non_intra_matrix, 0x40); + memcpy(map, pic_vp, sizeof(*pic_vp)); + refs[0] = (struct nouveau_vp3_video_buffer *)desc->ref[0]; + refs[!!refs[0]] = (struct nouveau_vp3_video_buffer *)desc->ref[1]; + return ret; +} + +static uint32_t +nouveau_vp3_fill_picparm_h264_vp(struct nouveau_vp3_decoder *dec, + const struct pipe_h264_picture_desc *d, + struct nouveau_vp3_video_buffer *refs[16], + unsigned *is_ref, + char *map) +{ + struct h264_picparm_vp stub_h = {}, *h = &stub_h; + unsigned ring, i, j = 0; + assert(offsetof(struct h264_picparm_vp, u224) == 0x224); + *is_ref = d->is_reference; + dec->last_frame_num = d->frame_num; + + h->width = mb(dec->base.width); + h->height = mb(dec->base.height); + h->stride1 = h->stride2 = mb(dec->base.width)*16; + nouveau_vp3_ycbcr_offsets(dec, &h->ofs[1], &h->ofs[3], &h->ofs[4]); + h->ofs[5] = h->ofs[3]; + h->ofs[0] = h->ofs[2] = 0; + h->u24 = dec->tmp_stride >> 8; + assert(h->u24); + nouveau_vp3_inter_sizes(dec, 1, &ring, &h->bucket_size, &h->inter_ring_data_size); + + h->u220 = 0; + h->f0 = d->mb_adaptive_frame_field_flag; + h->f1 = d->direct_8x8_inference_flag; + h->weighted_pred_flag = d->weighted_pred_flag; + h->f3 = d->constrained_intra_pred_flag; + h->is_reference = d->is_reference; + h->interlace = d->field_pic_flag; + h->bottom_field_flag = d->bottom_field_flag; + h->f7 = 0; // TODO: 
figure out when set.. + h->log2_max_frame_num_minus4 = d->log2_max_frame_num_minus4; + h->u31_45 = 1; + + h->pic_order_cnt_type = d->pic_order_cnt_type; + h->pic_init_qp_minus26 = d->pic_init_qp_minus26; + h->chroma_qp_index_offset = d->chroma_qp_index_offset; + h->second_chroma_qp_index_offset = d->second_chroma_qp_index_offset; + h->weighted_bipred_idc = d->weighted_bipred_idc; + h->tmp_idx = 0; // set in h264_vp_refs below + h->fifo_dec_index = 0; // always set to 0 to be fifo compatible with other codecs + h->frame_number = d->frame_num; + h->u34_3030 = h->u34_3131 = 0; + h->field_order_cnt[0] = d->field_order_cnt[0]; + h->field_order_cnt[1] = d->field_order_cnt[1]; + memset(h->refs, 0, sizeof(h->refs)); + memcpy(h->m4x4, d->scaling_lists_4x4, sizeof(h->m4x4) + sizeof(h->m8x8)); + h->u220 = 0; + for (i = 0; i < d->num_ref_frames; ++i) { + if (!d->ref[i]) + break; + refs[j] = (struct nouveau_vp3_video_buffer *)d->ref[i]; + h->refs[j].fifo_idx = j + 1; + h->refs[j].tmp_idx = refs[j]->valid_ref; + h->refs[j].field_order_cnt[0] = d->field_order_cnt_list[i][0]; + h->refs[j].field_order_cnt[1] = d->field_order_cnt_list[i][1]; + h->refs[j].frame_idx = d->frame_num_list[i]; + if (!dec->refs[refs[j]->valid_ref].field_pic_flag) { + h->refs[j].unk12 = d->top_is_reference[i]; + h->refs[j].unk13 = d->bottom_is_reference[i]; + } + h->refs[j].unk14 = 0; + h->refs[j].notseenyet = 0; + h->refs[j].unk16 = dec->refs[refs[j]->valid_ref].field_pic_flag; + h->refs[j].unk17 = dec->refs[refs[j]->valid_ref].decoded_top && + d->top_is_reference[i]; + h->refs[j].unk21 = dec->refs[refs[j]->valid_ref].decoded_bottom && + d->bottom_is_reference[i]; + h->refs[j].pad = 0; + assert(!d->is_long_term[i]); + j++; + } + for (; i < 16; ++i) + assert(!d->ref[i]); + assert(d->num_ref_frames <= dec->base.max_references); + + for (; i < d->num_ref_frames; ++i) + h->refs[j].unk16 = d->field_pic_flag; + *(struct h264_picparm_vp *)map = *h; + + return 0x1113; +} + +static void 
+nouveau_vp3_fill_picparm_h264_vp_refs(struct nouveau_vp3_decoder *dec, + struct pipe_h264_picture_desc *d, + struct nouveau_vp3_video_buffer *refs[16], + struct nouveau_vp3_video_buffer *target, + char *map) +{ + struct h264_picparm_vp *h = (struct h264_picparm_vp *)map; + assert(dec->refs[target->valid_ref].vidbuf == target); +// debug_printf("Target: %p\n", target); + + h->tmp_idx = target->valid_ref; + dec->refs[target->valid_ref].field_pic_flag = d->field_pic_flag; + if (!d->field_pic_flag || d->bottom_field_flag) + dec->refs[target->valid_ref].decoded_bottom = 1; + if (!d->field_pic_flag || !d->bottom_field_flag) + dec->refs[target->valid_ref].decoded_top = 1; +} + +static uint32_t +nouveau_vp3_fill_picparm_vc1_vp(struct nouveau_vp3_decoder *dec, + struct pipe_vc1_picture_desc *d, + struct nouveau_vp3_video_buffer *refs[16], + unsigned *is_ref, + char *map) +{ + struct vc1_picparm_vp *vc = (struct vc1_picparm_vp *)map; + unsigned ring; + assert(dec->base.profile != PIPE_VIDEO_PROFILE_VC1_SIMPLE); + *is_ref = d->picture_type <= 1; + + nouveau_vp3_ycbcr_offsets(dec, &vc->ofs[1], &vc->ofs[3], &vc->ofs[4]); + vc->ofs[5] = vc->ofs[3]; + vc->ofs[0] = vc->ofs[2] = 0; + vc->width = dec->base.width; + vc->height = mb(dec->base.height)<<4; + vc->unk0c = vc->unk10 = mb(dec->base.width)<<4; // Stride + vc->pad = vc->pad2 = 0; + nouveau_vp3_inter_sizes(dec, 1, &ring, &vc->bucket_size, &vc->inter_ring_data_size); + vc->profile = dec->base.profile - PIPE_VIDEO_PROFILE_VC1_SIMPLE; + vc->loopfilter = d->loopfilter; + vc->fastuvmc = d->fastuvmc; + vc->dquant = d->dquant; + vc->overlap = d->overlap; + vc->quantizer = d->quantizer; + vc->u36 = 0; // ? No idea what this one is.. 
+ refs[0] = (struct nouveau_vp3_video_buffer *)d->ref[0]; + refs[!!refs[0]] = (struct nouveau_vp3_video_buffer *)d->ref[1]; + return 0x12; +} + +void nouveau_vp3_vp_caps(struct nouveau_vp3_decoder *dec, union pipe_desc desc, + struct nouveau_vp3_video_buffer *target, unsigned comm_seq, + unsigned *caps, unsigned *is_ref, + struct nouveau_vp3_video_buffer *refs[16]) +{ + struct nouveau_bo *bsp_bo = dec->bsp_bo[comm_seq % NOUVEAU_VP3_VIDEO_QDEPTH]; + enum pipe_video_codec codec = u_reduce_video_profile(dec->base.profile); + char *vp = bsp_bo->map + VP_OFFSET; + + switch (codec){ + case PIPE_VIDEO_CODEC_MPEG12: + *caps = nouveau_vp3_fill_picparm_mpeg12_vp(dec, desc.mpeg12, refs, is_ref, vp); + nouveau_vp3_handle_references(dec, refs, dec->fence_seq, target); + return; + case PIPE_VIDEO_CODEC_MPEG4: + *caps = nouveau_vp3_fill_picparm_mpeg4_vp(dec, desc.mpeg4, refs, is_ref, vp); + nouveau_vp3_handle_references(dec, refs, dec->fence_seq, target); + return; + case PIPE_VIDEO_CODEC_VC1: { + *caps = nouveau_vp3_fill_picparm_vc1_vp(dec, desc.vc1, refs, is_ref, vp); + nouveau_vp3_handle_references(dec, refs, dec->fence_seq, target); + return; + } + case PIPE_VIDEO_CODEC_MPEG4_AVC: { + *caps = nouveau_vp3_fill_picparm_h264_vp(dec, desc.h264, refs, is_ref, vp); + nouveau_vp3_handle_references(dec, refs, dec->fence_seq, target); + nouveau_vp3_fill_picparm_h264_vp_refs(dec, desc.h264, refs, target, vp); + return; + } + default: assert(0); return; + } +} diff --git a/src/gallium/drivers/nvc0/nvc0_video.h b/src/gallium/drivers/nvc0/nvc0_video.h index 1ceb6eb..5cd79f6 100644 --- a/src/gallium/drivers/nvc0/nvc0_video.h +++ b/src/gallium/drivers/nvc0/nvc0_video.h @@ -37,13 +37,6 @@ nvc0_decoder_bsp(struct nouveau_vp3_decoder *dec, union pipe_desc desc, unsigned *vp_caps, unsigned *is_ref, struct nouveau_vp3_video_buffer *refs[16]); -extern void nvc0_decoder_vp_caps(struct nouveau_vp3_decoder *dec, - union pipe_desc desc, - struct nouveau_vp3_video_buffer *target, - unsigned comm_seq, 
- unsigned *caps, unsigned *is_ref, - struct nouveau_vp3_video_buffer *refs[16]); - extern void nvc0_decoder_vp(struct nouveau_vp3_decoder *dec, union pipe_desc desc, struct nouveau_vp3_video_buffer *target, unsigned comm_seq, diff --git a/src/gallium/drivers/nvc0/nvc0_video_bsp.c b/src/gallium/drivers/nvc0/nvc0_video_bsp.c index 06c85e6..2ce5519 100644 --- a/src/gallium/drivers/nvc0/nvc0_video_bsp.c +++ b/src/gallium/drivers/nvc0/nvc0_video_bsp.c @@ -74,7 +74,7 @@ nvc0_decoder_bsp(struct nouveau_vp3_decoder *dec, union pipe_desc desc, caps = nouveau_vp3_bsp(dec, desc, target, comm_seq, num_buffers, data, num_bytes); - nvc0_decoder_vp_caps(dec, desc, target, comm_seq, vp_caps, is_ref, refs); + nouveau_vp3_vp_caps(dec, desc, target, comm_seq, vp_caps, is_ref, refs); nouveau_pushbuf_space(push, 6 + (codec == PIPE_VIDEO_CODEC_MPEG4_AVC ? 9 : 7) + fence_extra + 2, num_refs, 0); nouveau_pushbuf_refn(push, bo_refs, num_refs); diff --git a/src/gallium/drivers/nvc0/nvc0_video_vp.c b/src/gallium/drivers/nvc0/nvc0_video_vp.c index e3c00b9..eb4f1a9 100644 --- a/src/gallium/drivers/nvc0/nvc0_video_vp.c +++ b/src/gallium/drivers/nvc0/nvc0_video_vp.c @@ -23,443 +23,6 @@ #include "nvc0_video.h" #include <sys/mman.h> -struct mpeg12_picparm_vp { - uint16_t width; // 00 in mb units - uint16_t height; // 02 in mb units - - uint32_t unk04; // 04 stride for Y? - uint32_t unk08; // 08 stride for CbCr? - - uint32_t ofs[6]; // 1c..20 ofs - uint32_t bucket_size; // 24 - uint32_t inter_ring_data_size; // 28 - uint16_t unk2c; // 2c - uint16_t alternate_scan; // 2e - uint16_t unk30; // 30 not seen set yet - uint16_t picture_structure; // 32 - uint16_t pad2[3]; - uint16_t unk3a; // 3a set on I frame? 
- - uint32_t f_code[4]; // 3c - uint32_t picture_coding_type; // 4c - uint32_t intra_dc_precision; // 50 - uint32_t q_scale_type; // 54 - uint32_t top_field_first; // 58 - uint32_t full_pel_forward_vector; // 5c - uint32_t full_pel_backward_vector; // 60 - uint8_t intra_quantizer_matrix[0x40]; // 64 - uint8_t non_intra_quantizer_matrix[0x40]; // a4 -}; - -struct mpeg4_picparm_vp { - uint32_t width; // 00 in normal units - uint32_t height; // 04 in normal units - uint32_t unk08; // stride 1 - uint32_t unk0c; // stride 2 - uint32_t ofs[6]; // 10..24 ofs - uint32_t bucket_size; // 28 - uint32_t pad1; // 2c, pad - uint32_t pad2; // 30 - uint32_t inter_ring_data_size; // 34 - - uint32_t trd[2]; // 38, 3c - uint32_t trb[2]; // 40, 44 - uint32_t u48; // XXX codec selection? Should test with different values of VdpDecoderProfile - uint16_t f_code_fw; // 4c - uint16_t f_code_bw; // 4e - uint8_t interlaced; // 50 - - uint8_t quant_type; // bool, written to 528 - uint8_t quarter_sample; // bool, written to 548 - uint8_t short_video_header; // bool, negated written to 528 shifted by 1 - uint8_t u54; // bool, written to 0x740 - uint8_t vop_coding_type; // 55 - uint8_t rounding_control; // 56 - uint8_t alternate_vertical_scan_flag; // 57 bool - uint8_t top_field_first; // bool, written to vuc - - uint8_t pad4[3]; // 59, 5a, 5b, contains garbage on blob - uint32_t pad5[0x10]; // 5c...9c non-inclusive, but WHY? - - uint32_t intra[0x10]; // 9c - uint32_t non_intra[0x10]; // bc - // udc..uff pad? 
-}; - -// Full version, with data pumped from BSP -struct vc1_picparm_vp { - uint32_t bucket_size; // 00 - uint32_t pad; // 04 - - uint32_t inter_ring_data_size; // 08 - uint32_t unk0c; // stride 1 - uint32_t unk10; // stride 2 - uint32_t ofs[6]; // 14..28 ofs - - uint16_t width; // 2c - uint16_t height; // 2e - - uint8_t profile; // 30 0 = simple, 1 = main, 2 = advanced - uint8_t loopfilter; // 31 written into vuc - uint8_t fastuvmc; // 32, written into vuc - uint8_t dquant; // 33 - - uint8_t overlap; // 34 - uint8_t quantizer; // 35 - uint8_t u36; // 36, bool - uint8_t pad2; // 37, to align to 0x38 -}; - -struct h264_picparm_vp { // 700..a00 - uint16_t width, height; - uint32_t stride1, stride2; // 04 08 - uint32_t ofs[6]; // 0c..24 in-image offset - - uint32_t u24; // nfi ac8 ? - uint32_t bucket_size; // 28 bucket size - uint32_t inter_ring_data_size; // 2c - - unsigned f0 : 1; // 0 0x01: into 640 shifted by 3, 540 shifted by 5, half size something? - unsigned f1 : 1; // 1 0x02: into vuc ofs 56 - unsigned weighted_pred_flag : 1; // 2 0x04 - unsigned f3 : 1; // 3 0x08: into vuc ofs 68 - unsigned is_reference : 1; // 4 - unsigned interlace : 1; // 5 field_pic_flag - unsigned bottom_field_flag : 1; // 6 - unsigned f7 : 1; // 7 0x80: nfi yet - - signed log2_max_frame_num_minus4 : 4; // 31 0..3 - unsigned u31_45 : 2; // 31 4..5 - unsigned pic_order_cnt_type : 2; // 31 6..7 - signed pic_init_qp_minus26 : 6; // 32 0..5 - signed chroma_qp_index_offset : 5; // 32 6..10 - signed second_chroma_qp_index_offset : 5; // 32 11..15 - - unsigned weighted_bipred_idc : 2; // 34 0..1 - unsigned fifo_dec_index : 7; // 34 2..8 - unsigned tmp_idx : 5; // 34 9..13 - unsigned frame_number : 16; // 34 14..29 - unsigned u34_3030 : 1; // 34 30..30 pp.u34[30:30] - unsigned u34_3131 : 1; // 34 31..31 pad? - - uint32_t field_order_cnt[2]; // 38, 3c - - struct { // 40 - // 0x00223102 - // nfi (needs: top_is_reference, bottom_is_reference, is_long_term, maybe some other state that was saved.. 
- unsigned fifo_idx : 7; // 00 0..6 - unsigned tmp_idx : 5; // 00 7..11 - unsigned unk12 : 1; // 00 12 not seen yet, but set, maybe top_is_reference - unsigned unk13 : 1; // 00 13 not seen yet, but set, maybe bottom_is_reference? - unsigned unk14 : 1; // 00 14 skipped? - unsigned notseenyet : 1; // 00 15 pad? - unsigned unk16 : 1; // 00 16 - unsigned unk17 : 4; // 00 17..20 - unsigned unk21 : 4; // 00 21..24 - unsigned pad : 7; // 00 d25..31 - - uint32_t field_order_cnt[2]; // 04,08 - uint32_t frame_idx; // 0c - } refs[0x10]; - - uint8_t m4x4[6][16]; // 140 - uint8_t m8x8[2][64]; // 1a0 - uint32_t u220; // 220 number of extra reorder_list to append? - uint8_t u224[0x20]; // 224..244 reorder_list append ? - uint8_t nfi244[0xb0]; // add some pad to make sure nulls are read -}; - -static void -nvc0_decoder_handle_references(struct nouveau_vp3_decoder *dec, struct nouveau_vp3_video_buffer *refs[16], unsigned seq, struct nouveau_vp3_video_buffer *target) -{ - unsigned h264 = u_reduce_video_profile(dec->base.profile) == PIPE_VIDEO_CODEC_MPEG4_AVC; - unsigned i, idx, empty_spot = dec->base.max_references + 1; - for (i = 0; i < dec->base.max_references; ++i) { - if (!refs[i]) - continue; - - idx = refs[i]->valid_ref; - //debug_printf("ref[%i] %p in slot %i\n", i, refs[i], idx); - assert(target != refs[i] || - (h264 && empty_spot && - (!dec->refs[idx].decoded_bottom || !dec->refs[idx].decoded_top))); - if (target == refs[i]) - empty_spot = 0; - - if (dec->refs[idx].vidbuf != refs[i]) { - debug_printf("%p is not a real ref\n", refs[i]); - // FIXME: Maybe do m2mf copy here if a application really depends on it? - continue; - } - - assert(dec->refs[idx].vidbuf == refs[i]); - dec->refs[idx].last_used = seq; - } - if (!empty_spot) - return; - - /* Try to find a real empty spot first, there should be one.. 
- */ - for (i = 0; i < dec->base.max_references + 1; ++i) { - if (dec->refs[i].last_used < seq) { - if (!dec->refs[i].vidbuf) { - empty_spot = i; - break; - } - if (empty_spot < dec->base.max_references+1 && - dec->refs[empty_spot].last_used < dec->refs[i].last_used) - continue; - empty_spot = i; - } - } - assert(empty_spot < dec->base.max_references+1); - dec->refs[empty_spot].last_used = seq; -// debug_printf("Kicked %p to add %p to slot %i\n", dec->refs[empty_spot].vidbuf, target, i); - dec->refs[empty_spot].vidbuf = target; - dec->refs[empty_spot].decoded_bottom = dec->refs[empty_spot].decoded_top = 0; - target->valid_ref = empty_spot; -} - -static void -nvc0_decoder_kick_ref(struct nouveau_vp3_decoder *dec, struct nouveau_vp3_video_buffer *target) -{ - dec->refs[target->valid_ref].vidbuf = NULL; - dec->refs[target->valid_ref].last_used = 0; -// debug_printf("Unreffed %p\n", target); -} - -static uint32_t -nvc0_decoder_fill_picparm_mpeg12_vp(struct nouveau_vp3_decoder *dec, - struct pipe_mpeg12_picture_desc *desc, - struct nouveau_vp3_video_buffer *refs[16], - unsigned *is_ref, - char *map) -{ - struct mpeg12_picparm_vp pic_vp_stub = {}, *pic_vp = &pic_vp_stub; - uint32_t i, ret = 0x01010, ring; // !async_shutdown << 16 | watchdog << 12 | irq_record << 4 | unk; - assert(!(dec->base.width & 0xf)); - *is_ref = desc->picture_coding_type <= 2; - - if (dec->base.profile == PIPE_VIDEO_PROFILE_MPEG1) - pic_vp->picture_structure = 3; - else - pic_vp->picture_structure = desc->picture_structure; - - assert(desc->picture_structure != 4); - if (desc->picture_structure == 4) // Untested, but should work - ret |= 0x100; - pic_vp->width = mb(dec->base.width); - pic_vp->height = mb(dec->base.height); - pic_vp->unk08 = pic_vp->unk04 = (dec->base.width+0xf)&~0xf; // Stride - - nouveau_vp3_ycbcr_offsets(dec, &pic_vp->ofs[1], &pic_vp->ofs[3], &pic_vp->ofs[4]); - pic_vp->ofs[5] = pic_vp->ofs[3]; - pic_vp->ofs[0] = pic_vp->ofs[2] = 0; - nouveau_vp3_inter_sizes(dec, 1, &ring, 
&pic_vp->bucket_size, &pic_vp->inter_ring_data_size); - - pic_vp->alternate_scan = desc->alternate_scan; - pic_vp->pad2[0] = pic_vp->pad2[1] = pic_vp->pad2[2] = 0; - pic_vp->unk30 = desc->picture_structure < 3 && (desc->picture_structure == 2 - desc->top_field_first); - pic_vp->unk3a = (desc->picture_coding_type == 1); - for (i = 0; i < 4; ++i) - pic_vp->f_code[i] = desc->f_code[i/2][i%2] + 1; // FU - pic_vp->picture_coding_type = desc->picture_coding_type; - pic_vp->intra_dc_precision = desc->intra_dc_precision; - pic_vp->q_scale_type = desc->q_scale_type; - pic_vp->top_field_first = desc->top_field_first; - pic_vp->full_pel_forward_vector = desc->full_pel_forward_vector; - pic_vp->full_pel_backward_vector = desc->full_pel_backward_vector; - memcpy(pic_vp->intra_quantizer_matrix, desc->intra_matrix, 0x40); - memcpy(pic_vp->non_intra_quantizer_matrix, desc->non_intra_matrix, 0x40); - memcpy(map, pic_vp, sizeof(*pic_vp)); - refs[0] = (struct nouveau_vp3_video_buffer *)desc->ref[0]; - refs[!!refs[0]] = (struct nouveau_vp3_video_buffer *)desc->ref[1]; - return ret | (dec->base.profile != PIPE_VIDEO_PROFILE_MPEG1); -} - -static uint32_t -nvc0_decoder_fill_picparm_mpeg4_vp(struct nouveau_vp3_decoder *dec, - struct pipe_mpeg4_picture_desc *desc, - struct nouveau_vp3_video_buffer *refs[16], - unsigned *is_ref, - char *map) -{ - struct mpeg4_picparm_vp pic_vp_stub = {}, *pic_vp = &pic_vp_stub; - uint32_t ring, ret = 0x01014; // !async_shutdown << 16 | watchdog << 12 | irq_record << 4 | unk; - assert(!(dec->base.width & 0xf)); - *is_ref = desc->vop_coding_type <= 1; - - pic_vp->width = dec->base.width; - pic_vp->height = mb(dec->base.height)<<4; - pic_vp->unk0c = pic_vp->unk08 = mb(dec->base.width)<<4; // Stride - - nouveau_vp3_ycbcr_offsets(dec, &pic_vp->ofs[1], &pic_vp->ofs[3], &pic_vp->ofs[4]); - pic_vp->ofs[5] = pic_vp->ofs[3]; - pic_vp->ofs[0] = pic_vp->ofs[2] = 0; - pic_vp->pad1 = pic_vp->pad2 = 0; - nouveau_vp3_inter_sizes(dec, 1, &ring, &pic_vp->bucket_size, 
&pic_vp->inter_ring_data_size); - - pic_vp->trd[0] = desc->trd[0]; - pic_vp->trd[1] = desc->trd[1]; - pic_vp->trb[0] = desc->trb[0]; - pic_vp->trb[1] = desc->trb[1]; - pic_vp->u48 = 0; // Codec? - pic_vp->pad1 = pic_vp->pad2 = 0; - pic_vp->f_code_fw = desc->vop_fcode_forward; - pic_vp->f_code_bw = desc->vop_fcode_backward; - pic_vp->interlaced = desc->interlaced; - pic_vp->quant_type = desc->quant_type; - pic_vp->quarter_sample = desc->quarter_sample; - pic_vp->short_video_header = desc->short_video_header; - pic_vp->u54 = 0; - pic_vp->vop_coding_type = desc->vop_coding_type; - pic_vp->rounding_control = desc->rounding_control; - pic_vp->alternate_vertical_scan_flag = desc->alternate_vertical_scan_flag; - pic_vp->top_field_first = desc->top_field_first; - - memcpy(pic_vp->intra, desc->intra_matrix, 0x40); - memcpy(pic_vp->non_intra, desc->non_intra_matrix, 0x40); - memcpy(map, pic_vp, sizeof(*pic_vp)); - refs[0] = (struct nouveau_vp3_video_buffer *)desc->ref[0]; - refs[!!refs[0]] = (struct nouveau_vp3_video_buffer *)desc->ref[1]; - return ret; -} - -static uint32_t -nvc0_decoder_fill_picparm_h264_vp(struct nouveau_vp3_decoder *dec, - const struct pipe_h264_picture_desc *d, - struct nouveau_vp3_video_buffer *refs[16], - unsigned *is_ref, - char *map) -{ - struct h264_picparm_vp stub_h = {}, *h = &stub_h; - unsigned ring, i, j = 0; - assert(offsetof(struct h264_picparm_vp, u224) == 0x224); - *is_ref = d->is_reference; - dec->last_frame_num = d->frame_num; - - h->width = mb(dec->base.width); - h->height = mb(dec->base.height); - h->stride1 = h->stride2 = mb(dec->base.width)*16; - nouveau_vp3_ycbcr_offsets(dec, &h->ofs[1], &h->ofs[3], &h->ofs[4]); - h->ofs[5] = h->ofs[3]; - h->ofs[0] = h->ofs[2] = 0; - h->u24 = dec->tmp_stride >> 8; - assert(h->u24); - nouveau_vp3_inter_sizes(dec, 1, &ring, &h->bucket_size, &h->inter_ring_data_size); - - h->u220 = 0; - h->f0 = d->mb_adaptive_frame_field_flag; - h->f1 = d->direct_8x8_inference_flag; - h->weighted_pred_flag = 
d->weighted_pred_flag; - h->f3 = d->constrained_intra_pred_flag; - h->is_reference = d->is_reference; - h->interlace = d->field_pic_flag; - h->bottom_field_flag = d->bottom_field_flag; - h->f7 = 0; // TODO: figure out when set.. - h->log2_max_frame_num_minus4 = d->log2_max_frame_num_minus4; - h->u31_45 = 1; - - h->pic_order_cnt_type = d->pic_order_cnt_type; - h->pic_init_qp_minus26 = d->pic_init_qp_minus26; - h->chroma_qp_index_offset = d->chroma_qp_index_offset; - h->second_chroma_qp_index_offset = d->second_chroma_qp_index_offset; - h->weighted_bipred_idc = d->weighted_bipred_idc; - h->tmp_idx = 0; // set in h264_vp_refs below - h->fifo_dec_index = 0; // always set to 0 to be fifo compatible with other codecs - h->frame_number = d->frame_num; - h->u34_3030 = h->u34_3131 = 0; - h->field_order_cnt[0] = d->field_order_cnt[0]; - h->field_order_cnt[1] = d->field_order_cnt[1]; - memset(h->refs, 0, sizeof(h->refs)); - memcpy(h->m4x4, d->scaling_lists_4x4, sizeof(h->m4x4) + sizeof(h->m8x8)); - h->u220 = 0; - for (i = 0; i < d->num_ref_frames; ++i) { - if (!d->ref[i]) - break; - refs[j] = (struct nouveau_vp3_video_buffer *)d->ref[i]; - h->refs[j].fifo_idx = j + 1; - h->refs[j].tmp_idx = refs[j]->valid_ref; - h->refs[j].field_order_cnt[0] = d->field_order_cnt_list[i][0]; - h->refs[j].field_order_cnt[1] = d->field_order_cnt_list[i][1]; - h->refs[j].frame_idx = d->frame_num_list[i]; - if (!dec->refs[refs[j]->valid_ref].field_pic_flag) { - h->refs[j].unk12 = d->top_is_reference[i]; - h->refs[j].unk13 = d->bottom_is_reference[i]; - } - h->refs[j].unk14 = 0; - h->refs[j].notseenyet = 0; - h->refs[j].unk16 = dec->refs[refs[j]->valid_ref].field_pic_flag; - h->refs[j].unk17 = dec->refs[refs[j]->valid_ref].decoded_top && - d->top_is_reference[i]; - h->refs[j].unk21 = dec->refs[refs[j]->valid_ref].decoded_bottom && - d->bottom_is_reference[i]; - h->refs[j].pad = 0; - assert(!d->is_long_term[i]); - j++; - } - for (; i < 16; ++i) - assert(!d->ref[i]); - assert(d->num_ref_frames <= 
dec->base.max_references); - - for (; i < d->num_ref_frames; ++i) - h->refs[j].unk16 = d->field_pic_flag; - *(struct h264_picparm_vp *)map = *h; - - return 0x1113; -} - -static void -nvc0_decoder_fill_picparm_h264_vp_refs(struct nouveau_vp3_decoder *dec, - struct pipe_h264_picture_desc *d, - struct nouveau_vp3_video_buffer *refs[16], - struct nouveau_vp3_video_buffer *target, - char *map) -{ - struct h264_picparm_vp *h = (struct h264_picparm_vp *)map; - assert(dec->refs[target->valid_ref].vidbuf == target); -// debug_printf("Target: %p\n", target); - - h->tmp_idx = target->valid_ref; - dec->refs[target->valid_ref].field_pic_flag = d->field_pic_flag; - if (!d->field_pic_flag || d->bottom_field_flag) - dec->refs[target->valid_ref].decoded_bottom = 1; - if (!d->field_pic_flag || !d->bottom_field_flag) - dec->refs[target->valid_ref].decoded_top = 1; -} - -static uint32_t -nvc0_decoder_fill_picparm_vc1_vp(struct nouveau_vp3_decoder *dec, - struct pipe_vc1_picture_desc *d, - struct nouveau_vp3_video_buffer *refs[16], - unsigned *is_ref, - char *map) -{ - struct vc1_picparm_vp *vc = (struct vc1_picparm_vp *)map; - unsigned ring; - assert(dec->base.profile != PIPE_VIDEO_PROFILE_VC1_SIMPLE); - *is_ref = d->picture_type <= 1; - - nouveau_vp3_ycbcr_offsets(dec, &vc->ofs[1], &vc->ofs[3], &vc->ofs[4]); - vc->ofs[5] = vc->ofs[3]; - vc->ofs[0] = vc->ofs[2] = 0; - vc->width = dec->base.width; - vc->height = mb(dec->base.height)<<4; - vc->unk0c = vc->unk10 = mb(dec->base.width)<<4; // Stride - vc->pad = vc->pad2 = 0; - nouveau_vp3_inter_sizes(dec, 1, &ring, &vc->bucket_size, &vc->inter_ring_data_size); - vc->profile = dec->base.profile - PIPE_VIDEO_PROFILE_VC1_SIMPLE; - vc->loopfilter = d->loopfilter; - vc->fastuvmc = d->fastuvmc; - vc->dquant = d->dquant; - vc->overlap = d->overlap; - vc->quantizer = d->quantizer; - vc->u36 = 0; // ? No idea what this one is.. 
- refs[0] = (struct nouveau_vp3_video_buffer *)d->ref[0]; - refs[!!refs[0]] = (struct nouveau_vp3_video_buffer *)d->ref[1]; - return 0x12; -} - #if NOUVEAU_VP3_DEBUG_FENCE static void dump_comm_vp(struct nouveau_vp3_decoder *dec, struct comm *comm, u32 comm_seq, struct nouveau_bo *inter_bo, unsigned slice_size) @@ -493,37 +56,12 @@ static void dump_comm_vp(struct nouveau_vp3_decoder *dec, struct comm *comm, u32 } #endif -void nvc0_decoder_vp_caps(struct nouveau_vp3_decoder *dec, union pipe_desc desc, - struct nouveau_vp3_video_buffer *target, unsigned comm_seq, - unsigned *caps, unsigned *is_ref, - struct nouveau_vp3_video_buffer *refs[16]) +static void +nvc0_decoder_kick_ref(struct nouveau_vp3_decoder *dec, struct nouveau_vp3_video_buffer *target) { - struct nouveau_bo *bsp_bo = dec->bsp_bo[comm_seq % NOUVEAU_VP3_VIDEO_QDEPTH]; - enum pipe_video_codec codec = u_reduce_video_profile(dec->base.profile); - char *vp = bsp_bo->map + VP_OFFSET; - - switch (codec){ - case PIPE_VIDEO_CODEC_MPEG12: - *caps = nvc0_decoder_fill_picparm_mpeg12_vp(dec, desc.mpeg12, refs, is_ref, vp); - nvc0_decoder_handle_references(dec, refs, dec->fence_seq, target); - return; - case PIPE_VIDEO_CODEC_MPEG4: - *caps = nvc0_decoder_fill_picparm_mpeg4_vp(dec, desc.mpeg4, refs, is_ref, vp); - nvc0_decoder_handle_references(dec, refs, dec->fence_seq, target); - return; - case PIPE_VIDEO_CODEC_VC1: { - *caps = nvc0_decoder_fill_picparm_vc1_vp(dec, desc.vc1, refs, is_ref, vp); - nvc0_decoder_handle_references(dec, refs, dec->fence_seq, target); - return; - } - case PIPE_VIDEO_CODEC_MPEG4_AVC: { - *caps = nvc0_decoder_fill_picparm_h264_vp(dec, desc.h264, refs, is_ref, vp); - nvc0_decoder_handle_references(dec, refs, dec->fence_seq, target); - nvc0_decoder_fill_picparm_h264_vp_refs(dec, desc.h264, refs, target, vp); - return; - } - default: assert(0); return; - } + dec->refs[target->valid_ref].vidbuf = NULL; + dec->refs[target->valid_ref].last_used = 0; +// debug_printf("Unreffed %p\n", target); } 
void -- 1.8.1.5
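For readers following the slot selection in nouveau_vp3_handle_references above, here is a minimal standalone sketch of the same idea: prefer a free slot that the current frame has not touched, otherwise evict the least recently used one. This is an illustration only, not code from the patch. `struct slot`, `SLOTS` and `pick_slot` are hypothetical names, and `SLOTS` stands in for `max_references + 1`.

```c
#include <assert.h>
#include <stddef.h>

#define SLOTS 4  /* hypothetical; stands in for max_references + 1 */

struct slot {
    const void *vidbuf;  /* NULL when the slot holds no buffer */
    unsigned last_used;  /* sequence number of the last frame that used it */
};

/*
 * Pick the slot a new target should occupy.
 * Returns SLOTS if every slot was already used during sequence `seq`.
 */
static unsigned pick_slot(const struct slot *s, unsigned seq)
{
    unsigned i, best = SLOTS;

    assert(s != NULL);
    for (i = 0; i < SLOTS; ++i) {
        if (s[i].last_used >= seq)
            continue;        /* referenced by the current frame, keep it */
        if (!s[i].vidbuf)
            return i;        /* a genuinely empty slot wins immediately */
        /* otherwise remember the oldest candidate; ties go to the later slot */
        if (best == SLOTS || s[i].last_used <= s[best].last_used)
            best = i;
    }
    return best;
}
```

The caller would then stamp the chosen slot with the current sequence number and store the target in it, which is what the patch does with `last_used`, `vidbuf` and `valid_ref`.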
Ilia Mirkin
2013-Aug-11 07:19 UTC
[Nouveau] [PATCH 06/10] nvc0: move some of the simpler decoder functions into nouveau
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
 src/gallium/drivers/nouveau/nouveau_vp3_video.c | 63 ++++++++++++++++++++++++
 src/gallium/drivers/nouveau/nouveau_vp3_video.h | 3 ++
 src/gallium/drivers/nvc0/nvc0_video.c | 65 ++-----------------------
 3 files changed, 69 insertions(+), 62 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nouveau_vp3_video.c b/src/gallium/drivers/nouveau/nouveau_vp3_video.c
index a55c2e8..6abc581 100644
--- a/src/gallium/drivers/nouveau/nouveau_vp3_video.c
+++ b/src/gallium/drivers/nouveau/nouveau_vp3_video.c
@@ -163,3 +163,66 @@ error:
    nouveau_vp3_video_buffer_destroy(&buffer->base);
    return NULL;
 }
+
+static void
+nouveau_vp3_decoder_flush(struct pipe_video_decoder *decoder)
+{
+}
+
+static void
+nouveau_vp3_decoder_begin_frame(struct pipe_video_decoder *decoder,
+                                struct pipe_video_buffer *target,
+                                struct pipe_picture_desc *picture)
+{
+}
+
+static void
+nouveau_vp3_decoder_end_frame(struct pipe_video_decoder *decoder,
+                              struct pipe_video_buffer *target,
+                              struct pipe_picture_desc *picture)
+{
+}
+
+static void
+nouveau_vp3_decoder_destroy(struct pipe_video_decoder *decoder)
+{
+   struct nouveau_vp3_decoder *dec = (struct nouveau_vp3_decoder *)decoder;
+   int i;
+
+   nouveau_bo_ref(NULL, &dec->ref_bo);
+   nouveau_bo_ref(NULL, &dec->bitplane_bo);
+   nouveau_bo_ref(NULL, &dec->inter_bo[0]);
+   nouveau_bo_ref(NULL, &dec->inter_bo[1]);
+#if NOUVEAU_VP3_DEBUG_FENCE
+   nouveau_bo_ref(NULL, &dec->fence_bo);
+#endif
+   nouveau_bo_ref(NULL, &dec->fw_bo);
+
+   for (i = 0; i < NOUVEAU_VP3_VIDEO_QDEPTH; ++i)
+      nouveau_bo_ref(NULL, &dec->bsp_bo[i]);
+
+   nouveau_object_del(&dec->bsp);
+   nouveau_object_del(&dec->vp);
+   nouveau_object_del(&dec->ppp);
+
+   if (dec->channel[0] != dec->channel[1]) {
+      for (i = 0; i < 3; ++i) {
+         nouveau_pushbuf_del(&dec->pushbuf[i]);
+         nouveau_object_del(&dec->channel[i]);
+      }
+   } else {
+      nouveau_pushbuf_del(dec->pushbuf);
+      nouveau_object_del(dec->channel);
+   }
+
+   FREE(dec);
+}
+
+void
+nouveau_vp3_decoder_init_common(struct pipe_video_decoder *dec)
+{
+   dec->destroy = nouveau_vp3_decoder_destroy;
+   dec->flush = nouveau_vp3_decoder_flush;
+   dec->begin_frame = nouveau_vp3_decoder_begin_frame;
+   dec->end_frame = nouveau_vp3_decoder_end_frame;
+}
diff --git a/src/gallium/drivers/nouveau/nouveau_vp3_video.h b/src/gallium/drivers/nouveau/nouveau_vp3_video.h
index 8d3548a..cb088fe 100644
--- a/src/gallium/drivers/nouveau/nouveau_vp3_video.h
+++ b/src/gallium/drivers/nouveau/nouveau_vp3_video.h
@@ -197,6 +197,9 @@ nouveau_vp3_video_buffer_create(struct pipe_context *pipe,
                                 const struct pipe_video_buffer *templat,
                                 int flags);
 
+void
+nouveau_vp3_decoder_init_common(struct pipe_video_decoder *decoder);
+
 uint32_t
 nouveau_vp3_bsp(struct nouveau_vp3_decoder *dec, union pipe_desc desc,
                 struct nouveau_vp3_video_buffer *target,
diff --git a/src/gallium/drivers/nvc0/nvc0_video.c b/src/gallium/drivers/nvc0/nvc0_video.c
index 73963c2..2ba17b4 100644
--- a/src/gallium/drivers/nvc0/nvc0_video.c
+++ b/src/gallium/drivers/nvc0/nvc0_video.c
@@ -85,62 +85,6 @@ nvc0_decoder_decode_bitstream(struct pipe_video_decoder *decoder,
    nvc0_decoder_ppp(dec, desc, target, comm_seq);
 }
 
-static void
-nvc0_decoder_flush(struct pipe_video_decoder *decoder)
-{
-   struct nouveau_vp3_decoder *dec = (struct nouveau_vp3_decoder *)decoder;
-   (void)dec;
-}
-
-static void
-nvc0_decoder_begin_frame(struct pipe_video_decoder *decoder,
-                         struct pipe_video_buffer *target,
-                         struct pipe_picture_desc *picture)
-{
-}
-
-static void
-nvc0_decoder_end_frame(struct pipe_video_decoder *decoder,
-                       struct pipe_video_buffer *target,
-                       struct pipe_picture_desc *picture)
-{
-}
-
-static void
-nvc0_decoder_destroy(struct pipe_video_decoder *decoder)
-{
-   struct nouveau_vp3_decoder *dec = (struct nouveau_vp3_decoder *)decoder;
-   int i;
-
-   nouveau_bo_ref(NULL, &dec->ref_bo);
-   nouveau_bo_ref(NULL, &dec->bitplane_bo);
-   nouveau_bo_ref(NULL, &dec->inter_bo[0]);
-   nouveau_bo_ref(NULL, &dec->inter_bo[1]);
-#if NOUVEAU_VP3_DEBUG_FENCE
-   nouveau_bo_ref(NULL, &dec->fence_bo);
-#endif
-   nouveau_bo_ref(NULL, &dec->fw_bo);
-
-   for (i = 0; i < NOUVEAU_VP3_VIDEO_QDEPTH; ++i)
-      nouveau_bo_ref(NULL, &dec->bsp_bo[i]);
-
-   nouveau_object_del(&dec->bsp);
-   nouveau_object_del(&dec->vp);
-   nouveau_object_del(&dec->ppp);
-
-   if (dec->channel[0] != dec->channel[1]) {
-      for (i = 0; i < 3; ++i) {
-         nouveau_pushbuf_del(&dec->pushbuf[i]);
-         nouveau_object_del(&dec->channel[i]);
-      }
-   } else {
-      nouveau_pushbuf_del(dec->pushbuf);
-      nouveau_object_del(dec->channel);
-   }
-
-   FREE(dec);
-}
-
 static void nvc0_video_getpath(enum pipe_video_profile profile, char *path)
 {
    switch (u_reduce_video_profile(profile)) {
@@ -200,6 +144,7 @@ nvc0_create_decoder(struct pipe_context *context,
    if (!dec)
       return NULL;
    dec->client = screen->client;
+   nouveau_vp3_decoder_init_common(&dec->base);
 
    if (!kepler) {
       dec->bsp_idx = 5;
@@ -282,11 +227,7 @@ nvc0_create_decoder(struct pipe_context *context,
    dec->base.width = width;
    dec->base.height = height;
    dec->base.max_references = max_references;
-   dec->base.destroy = nvc0_decoder_destroy;
-   dec->base.flush = nvc0_decoder_flush;
    dec->base.decode_bitstream = nvc0_decoder_decode_bitstream;
-   dec->base.begin_frame = nvc0_decoder_begin_frame;
-   dec->base.end_frame = nvc0_decoder_end_frame;
 
    for (i = 0; i < NOUVEAU_VP3_VIDEO_QDEPTH && !ret; ++i)
       ret = nouveau_bo_new(screen->device, NOUVEAU_BO_VRAM,
@@ -497,12 +438,12 @@ nvc0_create_decoder(struct pipe_context *context,
 
 fw_fail:
    debug_printf("Cannot create decoder without firmware..\n");
-   nvc0_decoder_destroy(&dec->base);
+   dec->base.destroy(&dec->base);
    return NULL;
 
 fail:
    debug_printf("Creation failed: %s (%i)\n", strerror(-ret), ret);
-   nvc0_decoder_destroy(&dec->base);
+   dec->base.destroy(&dec->base);
    return NULL;
 }
 
-- 
1.8.1.5
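For readers skimming the thread: the patch above installs the shared callbacks through one init function and has the error paths call dec->base.destroy() instead of a driver-local destroy function. A minimal standalone sketch of that pattern follows. All names in it (decoder_base, my_decoder, decoder_init_common, my_decoder_create) are hypothetical stand-ins for illustration, not the Mesa types or API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct pipe_video_decoder: a table of callbacks. */
struct decoder_base {
   void (*destroy)(struct decoder_base *);
   void (*flush)(struct decoder_base *);
};

/* Hypothetical driver object; the base must be the first member so that
 * a pointer to the base can be cast back to the full object. */
struct my_decoder {
   struct decoder_base base;
   int *scratch;
};

/* Counts destroy calls so the behaviour can be observed from outside. */
static int destroyed_count;

static void common_flush(struct decoder_base *b)
{
   (void)b; /* nothing to do, like the empty flush in the patch */
}

static void common_destroy(struct decoder_base *b)
{
   struct my_decoder *dec = (struct my_decoder *)b;

   free(dec->scratch); /* free(NULL) is a no-op, so partial init is fine */
   free(dec);
   destroyed_count++;
}

/* Shared code installs the callbacks that every driver has in common. */
void decoder_init_common(struct decoder_base *b)
{
   b->destroy = common_destroy;
   b->flush = common_flush;
}

struct decoder_base *my_decoder_create(int fail_midway)
{
   struct my_decoder *dec = calloc(1, sizeof(*dec));

   if (!dec)
      return NULL;
   /* Install the callbacks before anything that can fail, so that the
    * error path below can tear down through the table. */
   decoder_init_common(&dec->base);

   if (fail_midway)
      goto fail;
   dec->scratch = malloc(16 * sizeof(int));
   if (!dec->scratch)
      goto fail;
   return &dec->base;

fail:
   dec->base.destroy(&dec->base);
   return NULL;
}
```

The ordering matters in this sketch: the error path only works because the table is filled in right after allocation, which matches where the patch places the nouveau_vp3_decoder_init_common() call.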
Ilia Mirkin
2013-Aug-11 07:19 UTC
[Nouveau] [PATCH 07/10] nvc0: move firmware loading functions to nouveau
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
 src/gallium/drivers/nouveau/nouveau_vp3_video.c | 101 ++++++++++++++++++++++++
 src/gallium/drivers/nouveau/nouveau_vp3_video.h | 5 ++
 src/gallium/drivers/nvc0/nvc0_video.c | 92 +--------------------
 3 files changed, 108 insertions(+), 90 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nouveau_vp3_video.c b/src/gallium/drivers/nouveau/nouveau_vp3_video.c
index 6abc581..a3387b3 100644
--- a/src/gallium/drivers/nouveau/nouveau_vp3_video.c
+++ b/src/gallium/drivers/nouveau/nouveau_vp3_video.c
@@ -20,6 +20,10 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */
 
+#include <sys/mman.h>
+#include <stdio.h>
+#include <fcntl.h>
+
 #include "nouveau_screen.h"
 #include "nouveau_context.h"
 #include "nouveau_vp3_video.h"
@@ -226,3 +230,100 @@ nouveau_vp3_decoder_init_common(struct pipe_video_decoder *dec)
    dec->begin_frame = nouveau_vp3_decoder_begin_frame;
    dec->end_frame = nouveau_vp3_decoder_end_frame;
 }
+
+static void vp4_getpath(enum pipe_video_profile profile, char *path)
+{
+   switch (u_reduce_video_profile(profile)) {
+   case PIPE_VIDEO_CODEC_MPEG12: {
+      sprintf(path, "/lib/firmware/nouveau/vuc-mpeg12-0");
+      break;
+   }
+   case PIPE_VIDEO_CODEC_MPEG4: {
+      sprintf(path, "/lib/firmware/nouveau/vuc-mpeg4-0");
+      break;
+   }
+   case PIPE_VIDEO_CODEC_VC1: {
+      sprintf(path, "/lib/firmware/nouveau/vuc-vc1-0");
+      break;
+   }
+   case PIPE_VIDEO_CODEC_MPEG4_AVC: {
+      sprintf(path, "/lib/firmware/nouveau/vuc-h264-0");
+      break;
+   }
+   default: assert(0);
+   }
+}
+
+int
+nouveau_vp3_load_firmware(struct nouveau_vp3_decoder *dec,
+                          enum pipe_video_profile profile,
+                          unsigned chipset)
+{
+   int fd;
+   char path[PATH_MAX];
+   ssize_t r;
+   uint32_t *end, endval;
+
+   vp4_getpath(profile, path);
+
+   if (nouveau_bo_map(dec->fw_bo, NOUVEAU_BO_WR, dec->client))
+      return 1;
+
+   fd = open(path, O_RDONLY | O_CLOEXEC);
+   if (fd < 0) {
+      fprintf(stderr, "opening firmware file %s failed: %m\n", path);
+      return 1;
+   }
+   r = read(fd, dec->fw_bo->map, 0x4000);
+   close(fd);
+
+   if (r < 0) {
+      fprintf(stderr, "reading firmware file %s failed: %m\n", path);
+      return 1;
+   }
+
+   if (r == 0x4000) {
+      fprintf(stderr, "firmware file %s too large!\n", path);
+      return 1;
+   }
+
+   if (r & 0xff) {
+      fprintf(stderr, "firmware file %s wrong size!\n", path);
+      return 1;
+   }
+
+   end = dec->fw_bo->map + r - 4;
+   endval = *end;
+   while (endval == *end)
+      end--;
+
+   r = (intptr_t)end - (intptr_t)dec->fw_bo->map + 4;
+
+   switch (u_reduce_video_profile(profile)) {
+   case PIPE_VIDEO_CODEC_MPEG12: {
+      assert((r & 0xff) == 0xe0);
+      dec->fw_sizes = (0x2e0<<16) | (r - 0x2e0);
+      break;
+   }
+   case PIPE_VIDEO_CODEC_MPEG4: {
+      assert((r & 0xff) == 0xe0);
+      dec->fw_sizes = (0x2e0<<16) | (r - 0x2e0);
+      break;
+   }
+   case PIPE_VIDEO_CODEC_VC1: {
+      assert((r & 0xff) == 0xac);
+      dec->fw_sizes = (0x3ac<<16) | (r - 0x3ac);
+      break;
+   }
+   case PIPE_VIDEO_CODEC_MPEG4_AVC: {
+      assert((r & 0xff) == 0x70);
+      dec->fw_sizes = (0x370<<16) | (r - 0x370);
+      break;
+   }
+   default:
+      return 1;
+   }
+   munmap(dec->fw_bo->map, dec->fw_bo->size);
+   dec->fw_bo->map = NULL;
+   return 0;
+}
diff --git a/src/gallium/drivers/nouveau/nouveau_vp3_video.h b/src/gallium/drivers/nouveau/nouveau_vp3_video.h
index cb088fe..5e40385 100644
--- a/src/gallium/drivers/nouveau/nouveau_vp3_video.h
+++ b/src/gallium/drivers/nouveau/nouveau_vp3_video.h
@@ -200,6 +200,11 @@ nouveau_vp3_video_buffer_create(struct pipe_context *pipe,
 void
 nouveau_vp3_decoder_init_common(struct pipe_video_decoder *decoder);
 
+int
+nouveau_vp3_load_firmware(struct nouveau_vp3_decoder *dec,
+                          enum pipe_video_profile profile,
+                          unsigned chipset);
+
 uint32_t
 nouveau_vp3_bsp(struct nouveau_vp3_decoder *dec, union pipe_desc desc,
                 struct nouveau_vp3_video_buffer *target,
diff --git a/src/gallium/drivers/nvc0/nvc0_video.c b/src/gallium/drivers/nvc0/nvc0_video.c
index 2ba17b4..e0f29be 100644
--- a/src/gallium/drivers/nvc0/nvc0_video.c
+++ b/src/gallium/drivers/nvc0/nvc0_video.c
@@ -25,9 +25,6 @@
 #include "util/u_sampler.h"
 #include "util/u_format.h"
 
-#include <sys/mman.h>
-#include <fcntl.h>
-
 int
 nvc0_screen_get_video_param(struct pipe_screen *pscreen,
                             enum pipe_video_profile profile,
@@ -85,29 +82,6 @@ nvc0_decoder_decode_bitstream(struct pipe_video_decoder *decoder,
    nvc0_decoder_ppp(dec, desc, target, comm_seq);
 }
 
-static void nvc0_video_getpath(enum pipe_video_profile profile, char *path)
-{
-   switch (u_reduce_video_profile(profile)) {
-   case PIPE_VIDEO_CODEC_MPEG12: {
-      sprintf(path, "/lib/firmware/nouveau/vuc-mpeg12-0");
-      break;
-   }
-   case PIPE_VIDEO_CODEC_MPEG4: {
-      sprintf(path, "/lib/firmware/nouveau/vuc-mpeg4-0");
-      break;
-   }
-   case PIPE_VIDEO_CODEC_VC1: {
-      sprintf(path, "/lib/firmware/nouveau/vuc-vc1-0");
-      break;
-   }
-   case PIPE_VIDEO_CODEC_MPEG4_AVC: {
-      sprintf(path, "/lib/firmware/nouveau/vuc-h264-0");
-      break;
-   }
-   default: assert(0);
-   }
-}
-
 struct pipe_video_decoder *
 nvc0_create_decoder(struct pipe_context *context,
                     enum pipe_video_profile profile,
@@ -277,76 +251,14 @@ nvc0_create_decoder(struct pipe_context *context,
    }
 
    if (screen->device->chipset < 0xd0) {
-      int fd;
-      char path[PATH_MAX];
-      ssize_t r;
-      uint32_t *end, endval;
-
       ret = nouveau_bo_new(screen->device, NOUVEAU_BO_VRAM, 0,
                            0x4000, &cfg, &dec->fw_bo);
-      if (!ret)
-         ret = nouveau_bo_map(dec->fw_bo, NOUVEAU_BO_WR, dec->client);
       if (ret)
          goto fail;
 
-      nvc0_video_getpath(profile, path);
-
-      fd = open(path, O_RDONLY | O_CLOEXEC);
-      if (fd < 0) {
-         fprintf(stderr, "opening firmware file %s failed: %m\n", path);
-         goto fw_fail;
-      }
-      r = read(fd, dec->fw_bo->map, 0x4000);
-      close(fd);
-
-      if (r < 0) {
-         fprintf(stderr, "reading firmware file %s failed: %m\n", path);
-         goto fw_fail;
-      }
-
-      if (r == 0x4000) {
-         fprintf(stderr, "firmware file %s too large!\n", path);
-         goto fw_fail;
-      }
-
-      if (r & 0xff) {
-         fprintf(stderr, "firmware file %s wrong size!\n", path);
-         goto fw_fail;
-      }
-
-      end = dec->fw_bo->map + r - 4;
-      endval = *end;
-      while (endval == *end)
-         end--;
-
-      r = (intptr_t)end - (intptr_t)dec->fw_bo->map + 4;
-
-      switch (u_reduce_video_profile(profile)) {
-      case PIPE_VIDEO_CODEC_MPEG12: {
-         assert((r & 0xff) == 0xe0);
-         dec->fw_sizes = (0x2e0<<16) | (r - 0x2e0);
-         break;
-      }
-      case PIPE_VIDEO_CODEC_MPEG4: {
-         assert((r & 0xff) == 0xe0);
-         dec->fw_sizes = (0x2e0<<16) | (r - 0x2e0);
-         break;
-      }
-      case PIPE_VIDEO_CODEC_VC1: {
-         assert((r & 0xff) == 0xac);
-         dec->fw_sizes = (0x3ac<<16) | (r - 0x3ac);
-         break;
-      }
-      case PIPE_VIDEO_CODEC_MPEG4_AVC: {
-         assert((r & 0xff) == 0x70);
-         dec->fw_sizes = (0x370<<16) | (r - 0x370);
-         break;
-      }
-      default:
+      ret = nouveau_vp3_load_firmware(dec, profile, screen->device->chipset);
+      if (ret)
          goto fw_fail;
-      }
-      munmap(dec->fw_bo->map, dec->fw_bo->size);
-      dec->fw_bo->map = NULL;
    }
 
    if (codec != 3) {
-- 
1.8.1.5
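The size fix-up in nouveau_vp3_load_firmware() is easy to misread, so here is a standalone sketch of the two steps as I understand them: drop the run of identical 32-bit words at the tail of the file, then pack a fixed offset and the remainder into one word. The helper names fw_trimmed_size and fw_pack_sizes are hypothetical, and the sketch adds a lower bound to the scan that the loop in the patch does not have. The patch does not say what the two halves of fw_sizes mean, so the parameter names below are only descriptive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the byte length of buf after dropping the trailing run of words
 * that all equal the last word. Mirrors the "while (endval == *end) end--"
 * loop in the patch, but stops at the start of the buffer. */
size_t fw_trimmed_size(const uint32_t *buf, size_t size_bytes)
{
   size_t words = size_bytes / 4;
   uint32_t fill;

   if (words == 0)
      return 0;

   fill = buf[words - 1];
   while (words > 0 && buf[words - 1] == fill)
      words--;

   /* words now counts everything up to and including the last word
    * that differs from the fill value. */
   return words * 4;
}

/* Pack a fixed split offset and the remaining length the same way the
 * patch computes fw_sizes: (split << 16) | (total - split). */
uint32_t fw_pack_sizes(uint32_t split_bytes, uint32_t total_bytes)
{
   assert(total_bytes >= split_bytes);
   return (split_bytes << 16) | (total_bytes - split_bytes);
}
```

With the MPEG1/2 constant from the patch, a trimmed size of 0x12e0 would give fw_pack_sizes(0x2e0, 0x12e0) == 0x02e01000.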
Ilia Mirkin
2013-Aug-11 07:19 UTC
[Nouveau] [PATCH 08/10] nvc0: move video param and format support functions to nouveau
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
 src/gallium/drivers/nouveau/nouveau_vp3_video.c | 37 +++++++++++++++++++++++++
 src/gallium/drivers/nouveau/nouveau_vp3_video.h | 10 +++++++
 src/gallium/drivers/nvc0/nvc0_context.h | 5 ----
 src/gallium/drivers/nvc0/nvc0_screen.c | 18 +++---------
 src/gallium/drivers/nvc0/nvc0_video.c | 26 -----------------
 5 files changed, 51 insertions(+), 45 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nouveau_vp3_video.c b/src/gallium/drivers/nouveau/nouveau_vp3_video.c
index a3387b3..95ba7ec 100644
--- a/src/gallium/drivers/nouveau/nouveau_vp3_video.c
+++ b/src/gallium/drivers/nouveau/nouveau_vp3_video.c
@@ -327,3 +327,40 @@ nouveau_vp3_load_firmware(struct nouveau_vp3_decoder *dec,
    dec->fw_bo->map = NULL;
    return 0;
 }
+
+int
+nouveau_vp3_screen_get_video_param(struct pipe_screen *pscreen,
+                                   enum pipe_video_profile profile,
+                                   enum pipe_video_cap param)
+{
+   switch (param) {
+   case PIPE_VIDEO_CAP_SUPPORTED:
+      return profile >= PIPE_VIDEO_PROFILE_MPEG1;
+   case PIPE_VIDEO_CAP_NPOT_TEXTURES:
+      return 1;
+   case PIPE_VIDEO_CAP_MAX_WIDTH:
+   case PIPE_VIDEO_CAP_MAX_HEIGHT:
+      return nouveau_screen(pscreen)->device->chipset < 0xd0 ? 2048 : 4096;
+   case PIPE_VIDEO_CAP_PREFERED_FORMAT:
+      return PIPE_FORMAT_NV12;
+   case PIPE_VIDEO_CAP_SUPPORTS_INTERLACED:
+   case PIPE_VIDEO_CAP_PREFERS_INTERLACED:
+      return true;
+   case PIPE_VIDEO_CAP_SUPPORTS_PROGRESSIVE:
+      return false;
+   default:
+      debug_printf("unknown video param: %d\n", param);
+      return 0;
+   }
+}
+
+boolean
+nouveau_vp3_screen_video_supported(struct pipe_screen *screen,
+                                   enum pipe_format format,
+                                   enum pipe_video_profile profile)
+{
+   if (profile != PIPE_VIDEO_PROFILE_UNKNOWN)
+      return format == PIPE_FORMAT_NV12;
+
+   return vl_video_buffer_is_format_supported(screen, format, profile);
+}
diff --git a/src/gallium/drivers/nouveau/nouveau_vp3_video.h b/src/gallium/drivers/nouveau/nouveau_vp3_video.h
index 5e40385..f1a1054 100644
--- a/src/gallium/drivers/nouveau/nouveau_vp3_video.h
+++ b/src/gallium/drivers/nouveau/nouveau_vp3_video.h
@@ -216,3 +216,13 @@ nouveau_vp3_vp_caps(struct nouveau_vp3_decoder *dec, union pipe_desc desc,
                     struct nouveau_vp3_video_buffer *target, unsigned comm_seq,
                     unsigned *caps, unsigned *is_ref,
                     struct nouveau_vp3_video_buffer *refs[16]);
+
+int
+nouveau_vp3_screen_get_video_param(struct pipe_screen *pscreen,
+                                   enum pipe_video_profile profile,
+                                   enum pipe_video_cap param);
+
+boolean
+nouveau_vp3_screen_video_supported(struct pipe_screen *screen,
+                                   enum pipe_format format,
+                                   enum pipe_video_profile profile);
diff --git a/src/gallium/drivers/nvc0/nvc0_context.h b/src/gallium/drivers/nvc0/nvc0_context.h
index 9e58960..db6bb10 100644
--- a/src/gallium/drivers/nvc0/nvc0_context.h
+++ b/src/gallium/drivers/nvc0/nvc0_context.h
@@ -346,11 +346,6 @@ struct pipe_video_buffer *
 nvc0_video_buffer_create(struct pipe_context *pipe,
                          const struct pipe_video_buffer *templat);
 
-int
-nvc0_screen_get_video_param(struct pipe_screen *pscreen,
-                            enum pipe_video_profile profile,
-                            enum pipe_video_cap param);
-
 /* nvc0_push.c */
 void
 nvc0_push_vbo(struct nvc0_context *, const struct pipe_draw_info *);
diff --git a/src/gallium/drivers/nvc0/nvc0_screen.c b/src/gallium/drivers/nvc0/nvc0_screen.c
index bc5580b..93a2902 100644
--- a/src/gallium/drivers/nvc0/nvc0_screen.c
+++ b/src/gallium/drivers/nvc0/nvc0_screen.c
@@ -27,6 +27,8 @@
 #include "vl/vl_decoder.h"
 #include "vl/vl_video_buffer.h"
 
+#include "nouveau/nouveau_vp3_video.h"
+
 #include "nvc0_context.h"
 #include "nvc0_screen.h"
 
@@ -63,18 +65,6 @@ nvc0_screen_is_format_supported(struct pipe_screen *pscreen,
    return (nvc0_format_table[format].usage & bindings) == bindings;
 }
 
-static boolean
-nvc0_screen_video_supported(struct pipe_screen *screen,
-                            enum pipe_format format,
-                            enum pipe_video_profile profile)
-{
-   if (profile != PIPE_VIDEO_PROFILE_UNKNOWN)
-      return format == PIPE_FORMAT_NV12;
-
-   return vl_video_buffer_is_format_supported(screen, format, profile);
-}
-
-
 static int
 nvc0_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
 {
@@ -593,8 +583,8 @@ nvc0_screen_create(struct nouveau_device *dev)
 
    nvc0_screen_init_resource_functions(pscreen);
 
-   screen->base.base.get_video_param = nvc0_screen_get_video_param;
-   screen->base.base.is_video_format_supported = nvc0_screen_video_supported;
+   screen->base.base.get_video_param = nouveau_vp3_screen_get_video_param;
+   screen->base.base.is_video_format_supported = nouveau_vp3_screen_video_supported;
 
    ret = nouveau_bo_new(dev, NOUVEAU_BO_GART | NOUVEAU_BO_MAP, 0, 4096, NULL,
                         &screen->fence.bo);
diff --git a/src/gallium/drivers/nvc0/nvc0_video.c b/src/gallium/drivers/nvc0/nvc0_video.c
index e0f29be..5891f09 100644
--- a/src/gallium/drivers/nvc0/nvc0_video.c
+++ b/src/gallium/drivers/nvc0/nvc0_video.c
@@ -25,32 +25,6 @@
 #include "util/u_sampler.h"
 #include "util/u_format.h"
 
-int
-nvc0_screen_get_video_param(struct pipe_screen *pscreen,
-                            enum pipe_video_profile profile,
-                            enum pipe_video_cap param)
-{
-   switch (param) {
-   case PIPE_VIDEO_CAP_SUPPORTED:
-      return profile >= PIPE_VIDEO_PROFILE_MPEG1;
-   case PIPE_VIDEO_CAP_NPOT_TEXTURES:
-      return 1;
-   case PIPE_VIDEO_CAP_MAX_WIDTH:
-   case PIPE_VIDEO_CAP_MAX_HEIGHT:
-      return nouveau_screen(pscreen)->device->chipset < 0xd0 ? 2048 : 4096;
-   case PIPE_VIDEO_CAP_PREFERED_FORMAT:
-      return PIPE_FORMAT_NV12;
-   case PIPE_VIDEO_CAP_SUPPORTS_INTERLACED:
-   case PIPE_VIDEO_CAP_PREFERS_INTERLACED:
-      return true;
-   case PIPE_VIDEO_CAP_SUPPORTS_PROGRESSIVE:
-      return false;
-   default:
-      debug_printf("unknown video param: %d\n", param);
-      return 0;
-   }
-}
-
 static void
 nvc0_decoder_decode_bitstream(struct pipe_video_decoder *decoder,
                               struct pipe_video_buffer *video_target,
-- 
1.8.1.5
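As a reading aid, the two screen callbacks moved by this patch reduce to a small amount of logic: the maximum dimension depends on one chipset threshold, interlaced output is reported and progressive is not, and a known decode profile accepts only NV12 while an unknown profile defers to the generic check. A standalone sketch of that logic follows; the enums and the generic_format_supported stand-in are hypothetical and much smaller than the gallium ones.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-ins for enum pipe_video_cap / pipe_format. */
enum cap {
   CAP_MAX_WIDTH,
   CAP_MAX_HEIGHT,
   CAP_SUPPORTS_INTERLACED,
   CAP_SUPPORTS_PROGRESSIVE,
   CAP_OTHER
};

enum fmt { FMT_NV12, FMT_YV12 };

/* Same threshold as in the patch: chipsets below 0xd0 report 2048,
 * everything else reports 4096. */
int get_video_param(unsigned chipset, enum cap param)
{
   switch (param) {
   case CAP_MAX_WIDTH:
   case CAP_MAX_HEIGHT:
      return chipset < 0xd0 ? 2048 : 4096;
   case CAP_SUPPORTS_INTERLACED:
      return 1;
   case CAP_SUPPORTS_PROGRESSIVE:
      return 0;
   default:
      return 0; /* unknown caps report 0, as in the patch */
   }
}

/* Stand-in for the generic vl_video_buffer check; assumed here to accept
 * every format so that the fallback path is visible. */
static bool generic_format_supported(enum fmt format)
{
   (void)format;
   return true;
}

/* With a known decode profile only NV12 is accepted; without one the
 * decision is deferred to the generic check. */
bool video_format_supported(enum fmt format, bool profile_known)
{
   if (profile_known)
      return format == FMT_NV12;
   return generic_format_supported(format);
}
```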
Ilia Mirkin
2013-Aug-11 07:19 UTC
[Nouveau] [PATCH 09/10] nv50: separate video logic from noalloc
The upcoming vp3 logic will want the video layout, but with the BO
allocated by the miptree. Add a separate NOALLOC flag so that the two
behaviours can be requested independently.

Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
 src/gallium/drivers/nv50/nv50_miptree.c | 6 ++++--
 src/gallium/drivers/nv50/nv50_resource.h | 1 +
 src/gallium/drivers/nv50/nv84_video.c | 2 +-
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/nv50/nv50_miptree.c b/src/gallium/drivers/nv50/nv50_miptree.c
index 28be768..461710e 100644
--- a/src/gallium/drivers/nv50/nv50_miptree.c
+++ b/src/gallium/drivers/nv50/nv50_miptree.c
@@ -335,8 +335,10 @@ nv50_miptree_create(struct pipe_screen *pscreen,
 
    if (unlikely(pt->flags & NV50_RESOURCE_FLAG_VIDEO)) {
       nv50_miptree_init_layout_video(mt);
-      /* BO allocation done by client */
-      return pt;
+      if (pt->flags & NV50_RESOURCE_FLAG_NOALLOC) {
+         /* BO allocation done by client */
+         return pt;
+      }
    } else
    if (bo_config.nv50.memtype != 0) {
       nv50_miptree_init_layout_tiled(mt);
diff --git a/src/gallium/drivers/nv50/nv50_resource.h b/src/gallium/drivers/nv50/nv50_resource.h
index c520a72..b104404 100644
--- a/src/gallium/drivers/nv50/nv50_resource.h
+++ b/src/gallium/drivers/nv50/nv50_resource.h
@@ -17,6 +17,7 @@ void
 nv50_screen_init_resource_functions(struct pipe_screen *pscreen);
 
 #define NV50_RESOURCE_FLAG_VIDEO (NOUVEAU_RESOURCE_FLAG_DRV_PRIV << 0)
+#define NV50_RESOURCE_FLAG_NOALLOC (NOUVEAU_RESOURCE_FLAG_DRV_PRIV << 1)
 
 #define NV50_TILE_SHIFT_X(m) 6
 #define NV50_TILE_SHIFT_Y(m) ((((m) >> 4) & 0xf) + 2)
diff --git a/src/gallium/drivers/nv50/nv84_video.c b/src/gallium/drivers/nv50/nv84_video.c
index d5f6295..b24ea8f 100644
--- a/src/gallium/drivers/nv50/nv84_video.c
+++ b/src/gallium/drivers/nv50/nv84_video.c
@@ -669,7 +669,7 @@ nv84_video_buffer_create(struct pipe_context *pipe,
    templ.format = PIPE_FORMAT_R8_UNORM;
    templ.width0 = align(template->width, 2);
    templ.height0 = align(template->height, 4) / 2;
-   templ.flags = NV50_RESOURCE_FLAG_VIDEO;
+   templ.flags = NV50_RESOURCE_FLAG_VIDEO | NV50_RESOURCE_FLAG_NOALLOC;
    templ.array_size = 2;
 
    cfg.nv50.tile_mode = 0x20;
-- 
1.8.1.5
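To make the effect of the new flag concrete, here is a standalone sketch of the decision nv50_miptree_create() makes after this patch: the VIDEO bit selects the layout, and the early return (BO allocation left to the caller) only happens when NOALLOC is set as well. The flag values and the plan_resource helper are hypothetical; only the structure of the check follows the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical base bit; the real value comes from
 * NOUVEAU_RESOURCE_FLAG_DRV_PRIV in the nouveau headers. */
#define FLAG_DRV_PRIV (1u << 16)
#define FLAG_VIDEO    (FLAG_DRV_PRIV << 0)
#define FLAG_NOALLOC  (FLAG_DRV_PRIV << 1)

enum layout { LAYOUT_VIDEO, LAYOUT_OTHER };

struct plan {
   enum layout layout;
   bool miptree_allocates; /* false: the caller provides the BO */
};

/* Mirror of the branch in the patch: NOALLOC is only consulted inside
 * the VIDEO branch, so on its own it changes nothing. */
struct plan plan_resource(uint32_t flags)
{
   struct plan p = { LAYOUT_OTHER, true };

   if (flags & FLAG_VIDEO) {
      p.layout = LAYOUT_VIDEO;
      if (flags & FLAG_NOALLOC)
         p.miptree_allocates = false;
   }
   return p;
}
```

In this sketch the nv84 path corresponds to FLAG_VIDEO | FLAG_NOALLOC (unchanged behaviour), while FLAG_VIDEO alone gives the video layout with the miptree doing the allocation, which is the combination the commit message says the vp3 code will want.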
Ilia Mirkin
2013-Aug-11 07:19 UTC
[Nouveau] [PATCH 10/10] nv50: add vp3/vp4 support for mpeg2/vc1
h264/mpeg4 remain disabled for pre-nvc0; there's some minor
bug/difference which causes the decoding to hang after some frames.

Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
 src/gallium/drivers/nouveau/nouveau_vp3_video.c | 39 ++-
 src/gallium/drivers/nv50/Makefile.sources | 6 +-
 src/gallium/drivers/nv50/nv50_context.c | 5 +-
 src/gallium/drivers/nv50/nv50_context.h | 14 ++
 src/gallium/drivers/nv50/nv50_screen.c | 7 +-
 src/gallium/drivers/nv50/nv50_winsys.h | 4 -
 src/gallium/drivers/nv50/nv84_video.h | 4 +
 src/gallium/drivers/nv50/nv98_video.c | 308 ++++++++++++++++++++++++
 src/gallium/drivers/nv50/nv98_video.h | 48 ++++
 src/gallium/drivers/nv50/nv98_video_bsp.c | 159 ++++++++++++
 src/gallium/drivers/nv50/nv98_video_ppp.c | 143 +++++++++++
 src/gallium/drivers/nv50/nv98_video_vp.c | 202 ++++++++++++++++
 12 files changed, 927 insertions(+), 12 deletions(-)
 create mode 100644 src/gallium/drivers/nv50/nv98_video.c
 create mode 100644 src/gallium/drivers/nv50/nv98_video.h
 create mode 100644 src/gallium/drivers/nv50/nv98_video_bsp.c
 create mode 100644 src/gallium/drivers/nv50/nv98_video_ppp.c
 create mode 100644 src/gallium/drivers/nv50/nv98_video_vp.c

diff --git a/src/gallium/drivers/nouveau/nouveau_vp3_video.c b/src/gallium/drivers/nouveau/nouveau_vp3_video.c
index 95ba7ec..5926ff2 100644
--- a/src/gallium/drivers/nouveau/nouveau_vp3_video.c
+++ b/src/gallium/drivers/nouveau/nouveau_vp3_video.c
@@ -231,6 +231,25 @@ nouveau_vp3_decoder_init_common(struct pipe_video_decoder *dec)
    dec->end_frame = nouveau_vp3_decoder_end_frame;
 }
 
+static void vp3_getpath(enum pipe_video_profile profile, char *path)
+{
+   switch (u_reduce_video_profile(profile)) {
+   case PIPE_VIDEO_CODEC_MPEG12: {
+      sprintf(path, "/lib/firmware/nouveau/vuc-vp3-mpeg12-0");
+      break;
+   }
+   case PIPE_VIDEO_CODEC_VC1: {
+      sprintf(path, "/lib/firmware/nouveau/vuc-vp3-vc1-0");
+      break;
+   }
+   case PIPE_VIDEO_CODEC_MPEG4_AVC: {
+      sprintf(path, "/lib/firmware/nouveau/vuc-vp3-h264-0");
+      break;
+   }
+   default: assert(0);
+   }
+}
+
 static void vp4_getpath(enum pipe_video_profile profile, char *path)
 {
    switch (u_reduce_video_profile(profile)) {
@@ -264,7 +283,10 @@ nouveau_vp3_load_firmware(struct nouveau_vp3_decoder *dec,
    ssize_t r;
    uint32_t *end, endval;
 
-   vp4_getpath(profile, path);
+   if (chipset >= 0xa3 && chipset != 0xaa && chipset != 0xac)
+      vp4_getpath(profile, path);
+   else
+      vp3_getpath(profile, path);
 
    if (nouveau_bo_map(dec->fw_bo, NOUVEAU_BO_WR, dec->client))
       return 1;
@@ -333,14 +355,25 @@ nouveau_vp3_screen_get_video_param(struct pipe_screen *pscreen,
                                    enum pipe_video_profile profile,
                                    enum pipe_video_cap param)
 {
+   int chipset = nouveau_screen(pscreen)->device->chipset;
+   int vp3 = chipset < 0xa3 || chipset == 0xaa || chipset == 0xac;
+   int vp5 = chipset >= 0xd0;
+   enum pipe_video_codec codec = u_reduce_video_profile(profile);
    switch (param) {
    case PIPE_VIDEO_CAP_SUPPORTED:
-      return profile >= PIPE_VIDEO_PROFILE_MPEG1;
+      /* For now, h264 and mpeg4 don't work on pre-nvc0. */
+      if (chipset < 0xc0)
+         return codec == PIPE_VIDEO_CODEC_MPEG12 ||
+            codec == PIPE_VIDEO_CODEC_VC1;
+      /* In the general case, this should work, once the pre-nvc0 problems are
+       * resolved. */
+      return profile >= PIPE_VIDEO_PROFILE_MPEG1 && (
+            !vp3 || codec != PIPE_VIDEO_CODEC_MPEG4);
    case PIPE_VIDEO_CAP_NPOT_TEXTURES:
       return 1;
    case PIPE_VIDEO_CAP_MAX_WIDTH:
    case PIPE_VIDEO_CAP_MAX_HEIGHT:
-      return nouveau_screen(pscreen)->device->chipset < 0xd0 ? 2048 : 4096;
+      return vp5 ? 4096 : 2048;
    case PIPE_VIDEO_CAP_PREFERED_FORMAT:
       return PIPE_FORMAT_NV12;
    case PIPE_VIDEO_CAP_SUPPORTS_INTERLACED:
diff --git a/src/gallium/drivers/nv50/Makefile.sources b/src/gallium/drivers/nv50/Makefile.sources
index 0fdac51..9a2d102 100644
--- a/src/gallium/drivers/nv50/Makefile.sources
+++ b/src/gallium/drivers/nv50/Makefile.sources
@@ -16,7 +16,11 @@ C_SOURCES := \
 	nv50_query.c \
 	nv84_video.c \
 	nv84_video_bsp.c \
-	nv84_video_vp.c
+	nv84_video_vp.c \
+	nv98_video.c \
+	nv98_video_bsp.c \
+	nv98_video_vp.c \
+	nv98_video_ppp.c
 
 CODEGEN_NV50_SOURCES := \
 	codegen/nv50_ir.cpp \
diff --git a/src/gallium/drivers/nv50/nv50_context.c b/src/gallium/drivers/nv50/nv50_context.c
index 79a0473..6fdbb0b 100644
--- a/src/gallium/drivers/nv50/nv50_context.c
+++ b/src/gallium/drivers/nv50/nv50_context.c
@@ -267,8 +267,9 @@ nv50_create(struct pipe_screen *pscreen, void *priv)
       pipe->create_video_decoder = nv84_create_decoder;
       pipe->create_video_buffer = nv84_video_buffer_create;
    } else {
-      /* Unsupported, but need to init pointers. */
-      nouveau_context_init_vdec(&nv50->base);
+      /* VP3/4 */
+      pipe->create_video_decoder = nv98_create_decoder;
+      pipe->create_video_buffer = nv98_video_buffer_create;
    }
 
    flags = NOUVEAU_BO_VRAM | NOUVEAU_BO_RD;
diff --git a/src/gallium/drivers/nv50/nv50_context.h b/src/gallium/drivers/nv50/nv50_context.h
index b204cc8..52a1aa5 100644
--- a/src/gallium/drivers/nv50/nv50_context.h
+++ b/src/gallium/drivers/nv50/nv50_context.h
@@ -313,4 +313,18 @@ nv84_screen_video_supported(struct pipe_screen *screen,
                             enum pipe_format format,
                             enum pipe_video_profile profile);
 
+/* nv98_video.c */
+struct pipe_video_decoder *
+nv98_create_decoder(struct pipe_context *context,
+                    enum pipe_video_profile profile,
+                    enum pipe_video_entrypoint entrypoint,
+                    enum pipe_video_chroma_format chroma_format,
+                    unsigned width, unsigned height,
+                    unsigned max_references,
+                    bool expect_chunked_decode);
+
+struct pipe_video_buffer *
+nv98_video_buffer_create(struct pipe_context *pipe,
+                         const struct pipe_video_buffer *template);
+
 #endif
diff --git a/src/gallium/drivers/nv50/nv50_screen.c b/src/gallium/drivers/nv50/nv50_screen.c
index 2951eb4..5f57d96 100644
--- a/src/gallium/drivers/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nv50/nv50_screen.c
@@ -27,6 +27,8 @@
 #include "nv50_context.h"
 #include "nv50_screen.h"
 
+#include "nouveau/nouveau_vp3_video.h"
+
 #include "nouveau/nv_object.xml.h"
 #include <errno.h>
 
@@ -656,8 +658,9 @@ nv50_screen_create(struct nouveau_device *dev)
       screen->base.base.get_video_param = nv84_screen_get_video_param;
       screen->base.base.is_video_format_supported = nv84_screen_video_supported;
    } else {
-      /* Unsupported, but need to init pointers. */
-      nouveau_screen_init_vdec(&screen->base);
+      /* VP3/4 */
+      screen->base.base.get_video_param = nouveau_vp3_screen_get_video_param;
+      screen->base.base.is_video_format_supported = nouveau_vp3_screen_video_supported;
    }
 
    ret = nouveau_bo_new(dev, NOUVEAU_BO_GART | NOUVEAU_BO_MAP, 0, 4096,
diff --git a/src/gallium/drivers/nv50/nv50_winsys.h b/src/gallium/drivers/nv50/nv50_winsys.h
index e04247b..145ee70 100644
--- a/src/gallium/drivers/nv50/nv50_winsys.h
+++ b/src/gallium/drivers/nv50/nv50_winsys.h
@@ -60,10 +60,6 @@ PUSH_REFN(struct nouveau_pushbuf *push, struct nouveau_bo *bo, uint32_t flags)
 #define SUBC_COMPUTE(m) 6, (m)
 #define NV50_COMPUTE(n) SUBC_COMPUTE(NV50_COMPUTE_##n)
 
-/* These are expected to be on their own pushbufs */
-#define SUBC_BSP(m) 2, (m)
-#define SUBC_VP(m) 2, (m)
-
 
 static INLINE uint32_t
 NV50_FIFO_PKHDR(int subc, int mthd, unsigned size)
diff --git a/src/gallium/drivers/nv50/nv84_video.h b/src/gallium/drivers/nv50/nv84_video.h
index 4ff8cf3..4240cef 100644
--- a/src/gallium/drivers/nv50/nv84_video.h
+++ b/src/gallium/drivers/nv50/nv84_video.h
@@ -33,6 +33,10 @@
 
 #include "nv50_context.h"
 
+/* These are expected to be on their own pushbufs */
+#define SUBC_BSP(m) 2, (m)
+#define SUBC_VP(m) 2, (m)
+
 union pipe_desc {
    struct pipe_picture_desc *base;
    struct pipe_mpeg12_picture_desc *mpeg12;
diff --git a/src/gallium/drivers/nv50/nv98_video.c b/src/gallium/drivers/nv50/nv98_video.c
new file mode 100644
index 0000000..eb23804
--- /dev/null
+++ b/src/gallium/drivers/nv50/nv98_video.c
@@ -0,0 +1,308 @@
+/*
+ * Copyright 2011-2013 Maarten Lankhorst
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#include "nv98_video.h"
+
+#include "util/u_sampler.h"
+#include "util/u_format.h"
+
+static void
+nv98_decoder_decode_bitstream(struct pipe_video_decoder *decoder,
+                              struct pipe_video_buffer *video_target,
+                              struct pipe_picture_desc *picture,
+                              unsigned num_buffers,
+                              const void *const *data,
+                              const unsigned *num_bytes)
+{
+   struct nouveau_vp3_decoder *dec = (struct nouveau_vp3_decoder *)decoder;
+   struct nouveau_vp3_video_buffer *target = (struct nouveau_vp3_video_buffer *)video_target;
+   uint32_t comm_seq = ++dec->fence_seq;
+   union pipe_desc desc;
+
+   unsigned vp_caps, is_ref, ret;
+   struct nouveau_vp3_video_buffer *refs[16] = {};
+
+   desc.base = picture;
+
+   assert(target->base.buffer_format == PIPE_FORMAT_NV12);
+
+   ret = nv98_decoder_bsp(dec, desc, target, comm_seq,
+                          num_buffers, data, num_bytes,
+                          &vp_caps, &is_ref, refs);
+
+   /* did we decode bitstream correctly? */
+   assert(ret == 2);
+
+   nv98_decoder_vp(dec, desc, target, comm_seq, vp_caps, is_ref, refs);
+   nv98_decoder_ppp(dec, desc, target, comm_seq);
+}
+
+struct pipe_video_decoder *
+nv98_create_decoder(struct pipe_context *context,
+                    enum pipe_video_profile profile,
+                    enum pipe_video_entrypoint entrypoint,
+                    enum pipe_video_chroma_format chroma_format,
+                    unsigned width, unsigned height, unsigned max_references,
+                    bool chunked_decode)
+{
+   struct nouveau_screen *screen = &((struct nv50_context *)context)->screen->base;
+   struct nouveau_vp3_decoder *dec;
+   struct nouveau_pushbuf **push;
+   struct nv04_fifo nv04_data = {.vram = 0xbeef0201, .gart = 0xbeef0202};
+   union nouveau_bo_config cfg;
+
+   cfg.nv50.tile_mode = 0x20;
+   cfg.nv50.memtype = 0x70;
+
+   int ret, i;
+   uint32_t codec = 1, ppp_codec = 3;
+   uint32_t timeout;
+   u32 tmp_size = 0;
+
+   if (getenv("XVMC_VL"))
+      return vl_create_decoder(context, profile, entrypoint,
+                               chroma_format, width, height,
+                               max_references, chunked_decode);
+
+   if (entrypoint != PIPE_VIDEO_ENTRYPOINT_BITSTREAM) {
+      debug_printf("%x\n", entrypoint);
+      return NULL;
+   }
+
+   dec = CALLOC_STRUCT(nouveau_vp3_decoder);
+   if (!dec)
+      return NULL;
+   dec->client = screen->client;
+   nouveau_vp3_decoder_init_common(&dec->base);
+
+   dec->bsp_idx = 5;
+   dec->vp_idx = 6;
+   dec->ppp_idx = 7;
+
+   ret = nouveau_object_new(&screen->device->object, 0,
+                            NOUVEAU_FIFO_CHANNEL_CLASS,
+                            &nv04_data, sizeof(nv04_data), &dec->channel[0]);
+
+   if (!ret)
+      ret = nouveau_pushbuf_new(screen->client, dec->channel[0], 4,
+                                32 * 1024, true, &dec->pushbuf[0]);
+
+   for (i = 1; i < 3; ++i) {
+      dec->channel[i] = dec->channel[0];
+      dec->pushbuf[i] = dec->pushbuf[0];
+   }
+   push = dec->pushbuf;
+
+   if (!ret)
+      ret = nouveau_object_new(dec->channel[0], 0x390b1, 0x85b1, NULL, 0, &dec->bsp);
+   if (!ret)
+      ret = nouveau_object_new(dec->channel[1], 0x190b2, 0x85b2, NULL, 0, &dec->vp);
+   if (!ret)
+      ret = nouveau_object_new(dec->channel[2], 0x290b3, 0x85b3, NULL, 0, &dec->ppp);
+   if (ret)
+      goto fail;
+
+   BEGIN_NV04(push[0], SUBC_BSP(NV01_SUBCHAN_OBJECT), 1);
+   PUSH_DATA (push[0], dec->bsp->handle);
+
+   BEGIN_NV04(push[0], SUBC_BSP(0x180), 5);
+   for (i = 0; i < 5; i++)
+      PUSH_DATA (push[0], nv04_data.vram);
+
+   BEGIN_NV04(push[1], SUBC_VP(NV01_SUBCHAN_OBJECT), 1);
+   PUSH_DATA (push[1], dec->vp->handle);
+
+   BEGIN_NV04(push[1], SUBC_VP(0x180), 6);
+   for (i = 0; i < 6; i++)
+      PUSH_DATA (push[1], nv04_data.vram);
+
+   BEGIN_NV04(push[2], SUBC_PPP(NV01_SUBCHAN_OBJECT), 1);
+   PUSH_DATA (push[2], dec->ppp->handle);
+
+   BEGIN_NV04(push[2], SUBC_PPP(0x180), 5);
+   for (i = 0; i < 5; i++)
+      PUSH_DATA (push[2], nv04_data.vram);
+
+   dec->base.context = context;
+   dec->base.profile = profile;
+   dec->base.entrypoint = entrypoint;
+   dec->base.chroma_format = chroma_format;
+   dec->base.width = width;
+   dec->base.height = height;
+   dec->base.max_references = max_references;
+   dec->base.decode_bitstream = nv98_decoder_decode_bitstream;
+
+   for (i = 0; i < NOUVEAU_VP3_VIDEO_QDEPTH && !ret; ++i)
+      ret = nouveau_bo_new(screen->device, NOUVEAU_BO_VRAM,
+                           0, 1 << 20, NULL, &dec->bsp_bo[i]);
+   if (!ret)
+      ret = nouveau_bo_new(screen->device, NOUVEAU_BO_VRAM,
+                           0x100, 4 << 20, NULL, &dec->inter_bo[0]);
+   if (!ret)
+      nouveau_bo_ref(dec->inter_bo[0], &dec->inter_bo[1]);
+   if (ret)
+      goto fail;
+
+   switch (u_reduce_video_profile(profile)) {
+   case PIPE_VIDEO_CODEC_MPEG12: {
+      codec = 1;
+      assert(max_references <= 2);
+      break;
+   }
+   case PIPE_VIDEO_CODEC_MPEG4: {
+      codec = 4;
+      tmp_size = mb(height)*16 * mb(width)*16;
+      assert(max_references <= 2);
+      break;
+   }
+   case PIPE_VIDEO_CODEC_VC1: {
+      ppp_codec = codec = 2;
+      tmp_size = mb(height)*16 * mb(width)*16;
+      assert(max_references <= 2);
+      break;
+   }
+   case PIPE_VIDEO_CODEC_MPEG4_AVC: {
+      codec = 3;
+      dec->tmp_stride = 16 * mb_half(width) * nouveau_vp3_video_align(height) * 3 / 2;
+      tmp_size = dec->tmp_stride * (max_references + 1);
+      assert(max_references <= 16);
+      break;
+   }
+   default:
+      fprintf(stderr, "invalid codec\n");
+      goto fail;
+   }
+
+   ret = nouveau_bo_new(screen->device, NOUVEAU_BO_VRAM, 0,
+                        0x4000, NULL, &dec->fw_bo);
+   if (ret)
+      goto fail;
+
+   ret = nouveau_vp3_load_firmware(dec, profile, screen->device->chipset);
+   if (ret)
+      goto fw_fail;
+
+   if (codec != 3) {
+      ret = nouveau_bo_new(screen->device, NOUVEAU_BO_VRAM, 0,
+                           0x400, NULL, &dec->bitplane_bo);
+      if (ret)
+         goto fail;
+   }
+
+   dec->ref_stride = mb(width)*16 * (mb_half(height)*32 + nouveau_vp3_video_align(height)/2);
+   ret = nouveau_bo_new(screen->device, NOUVEAU_BO_VRAM, 0,
+                        dec->ref_stride * (max_references+2) + tmp_size,
+                        &cfg, &dec->ref_bo);
+   if (ret)
+      goto fail;
+
+   timeout = 0;
+
+   BEGIN_NV04(push[0], SUBC_BSP(0x200), 2);
+   PUSH_DATA (push[0], codec);
+   PUSH_DATA (push[0], timeout);
+
+   BEGIN_NV04(push[1], SUBC_VP(0x200), 2);
+   PUSH_DATA (push[1], codec);
+   PUSH_DATA (push[1], timeout);
+
+   BEGIN_NV04(push[2], SUBC_PPP(0x200), 2);
+   PUSH_DATA (push[2], ppp_codec);
+   PUSH_DATA (push[2], timeout);
+
+   ++dec->fence_seq;
+
+#if NOUVEAU_VP3_DEBUG_FENCE
+   ret = nouveau_bo_new(screen->device, NOUVEAU_BO_GART|NOUVEAU_BO_MAP,
+                        0, 0x1000, NULL, &dec->fence_bo);
+   if (ret)
+      goto fail;
+
+   nouveau_bo_map(dec->fence_bo, NOUVEAU_BO_RDWR, screen->client);
+   dec->fence_map = dec->fence_bo->map;
+   dec->fence_map[0] = dec->fence_map[4] = dec->fence_map[8] = 0;
+   dec->comm = (struct comm *)(dec->fence_map + (COMM_OFFSET/sizeof(*dec->fence_map)));
+
+   /* So lets test if the fence is working? */
+   nouveau_pushbuf_space(push[0], 6, 1, 0);
+   PUSH_REFN (push[0], dec->fence_bo, NOUVEAU_BO_GART|NOUVEAU_BO_RDWR);
+   BEGIN_NV04(push[0], SUBC_BSP(0x240), 3);
+   PUSH_DATAh(push[0], dec->fence_bo->offset);
+   PUSH_DATA (push[0], dec->fence_bo->offset);
+   PUSH_DATA (push[0], dec->fence_seq);
+
+   BEGIN_NV04(push[0], SUBC_BSP(0x304), 1);
+   PUSH_DATA (push[0], 0);
+   PUSH_KICK (push[0]);
+
+   nouveau_pushbuf_space(push[1], 6, 1, 0);
+   PUSH_REFN (push[1], dec->fence_bo, NOUVEAU_BO_GART|NOUVEAU_BO_RDWR);
+   BEGIN_NV04(push[1], SUBC_VP(0x240), 3);
+   PUSH_DATAh(push[1], (dec->fence_bo->offset + 0x10));
+   PUSH_DATA (push[1], (dec->fence_bo->offset + 0x10));
+   PUSH_DATA (push[1], dec->fence_seq);
+
+   BEGIN_NV04(push[1], SUBC_VP(0x304), 1);
+   PUSH_DATA (push[1], 0);
+   PUSH_KICK (push[1]);
+
+   nouveau_pushbuf_space(push[2], 6, 1, 0);
+   PUSH_REFN (push[2], dec->fence_bo, NOUVEAU_BO_GART|NOUVEAU_BO_RDWR);
+   BEGIN_NV04(push[2], SUBC_PPP(0x240), 3);
+   PUSH_DATAh(push[2], (dec->fence_bo->offset + 0x20));
+   PUSH_DATA (push[2], (dec->fence_bo->offset + 0x20));
+   PUSH_DATA (push[2], dec->fence_seq);
+
+   BEGIN_NV04(push[2], SUBC_PPP(0x304), 1);
+   PUSH_DATA (push[2], 0);
+   PUSH_KICK (push[2]);
+
+   usleep(100);
+   while (dec->fence_seq > dec->fence_map[0] ||
+          dec->fence_seq > dec->fence_map[4] ||
+          dec->fence_seq > dec->fence_map[8]) {
+      debug_printf("%u: %u %u %u\n", dec->fence_seq, dec->fence_map[0], dec->fence_map[4], dec->fence_map[8]);
+      usleep(100);
+   }
+   debug_printf("%u: %u %u %u\n", dec->fence_seq, dec->fence_map[0], dec->fence_map[4], dec->fence_map[8]);
+#endif
+
+   return &dec->base;
+
+fw_fail:
+   debug_printf("Cannot create decoder without firmware..\n");
+   dec->base.destroy(&dec->base);
+   return NULL;
+
+fail:
+   debug_printf("Creation failed: %s (%i)\n", strerror(-ret), ret);
+   dec->base.destroy(&dec->base);
+   return NULL;
+}
+
+struct pipe_video_buffer *
+nv98_video_buffer_create(struct pipe_context *pipe,
+                         const struct pipe_video_buffer *templat)
+{
+   return
nouveau_vp3_video_buffer_create( + pipe, templat, NV50_RESOURCE_FLAG_VIDEO); +} diff --git a/src/gallium/drivers/nv50/nv98_video.h b/src/gallium/drivers/nv50/nv98_video.h new file mode 100644 index 0000000..a157201 --- /dev/null +++ b/src/gallium/drivers/nv50/nv98_video.h @@ -0,0 +1,48 @@ +/* + * Copyright 2011-2013 Maarten Lankhorst + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ */ + +#include "nv50_context.h" +#include "nv50_screen.h" +#include "nouveau/nouveau_vp3_video.h" + +#include "vl/vl_decoder.h" +#include "vl/vl_types.h" + +#include "util/u_video.h" + +extern unsigned +nv98_decoder_bsp(struct nouveau_vp3_decoder *dec, union pipe_desc desc, + struct nouveau_vp3_video_buffer *target, + unsigned comm_seq, unsigned num_buffers, + const void *const *data, const unsigned *num_bytes, + unsigned *vp_caps, unsigned *is_ref, + struct nouveau_vp3_video_buffer *refs[16]); + +extern void +nv98_decoder_vp(struct nouveau_vp3_decoder *dec, union pipe_desc desc, + struct nouveau_vp3_video_buffer *target, unsigned comm_seq, + unsigned caps, unsigned is_ref, + struct nouveau_vp3_video_buffer *refs[16]); + +extern void +nv98_decoder_ppp(struct nouveau_vp3_decoder *dec, union pipe_desc desc, + struct nouveau_vp3_video_buffer *target, unsigned comm_seq); diff --git a/src/gallium/drivers/nv50/nv98_video_bsp.c b/src/gallium/drivers/nv50/nv98_video_bsp.c new file mode 100644 index 0000000..440066d --- /dev/null +++ b/src/gallium/drivers/nv50/nv98_video_bsp.c @@ -0,0 +1,159 @@ +/* + * Copyright 2011-2013 Maarten Lankhorst + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +#include "nv98_video.h" + +#if NOUVEAU_VP3_DEBUG_FENCE +static void dump_comm_bsp(struct comm *comm) +{ + unsigned idx = comm->bsp_cur_index & 0xf; + debug_printf("Cur seq: %x, bsp byte ofs: %x\n", comm->bsp_cur_index, comm->byte_ofs); + debug_printf("Status: %08x, pos: %08x\n", comm->status[idx], comm->pos[idx]); +} +#endif + +unsigned +nv98_decoder_bsp(struct nouveau_vp3_decoder *dec, union pipe_desc desc, + struct nouveau_vp3_video_buffer *target, + unsigned comm_seq, unsigned num_buffers, + const void *const *data, const unsigned *num_bytes, + unsigned *vp_caps, unsigned *is_ref, + struct nouveau_vp3_video_buffer *refs[16]) +{ + struct nouveau_pushbuf *push = dec->pushbuf[0]; + enum pipe_video_codec codec = u_reduce_video_profile(dec->base.profile); + uint32_t bsp_addr, comm_addr, inter_addr; + uint32_t slice_size, bucket_size, ring_size; + uint32_t caps; + int ret; + struct nouveau_bo *bsp_bo = dec->bsp_bo[comm_seq % NOUVEAU_VP3_VIDEO_QDEPTH]; + struct nouveau_bo *inter_bo = dec->inter_bo[comm_seq & 1]; + unsigned fence_extra = 0; + struct nouveau_pushbuf_refn bo_refs[] = { + { bsp_bo, NOUVEAU_BO_RD | NOUVEAU_BO_VRAM }, + { inter_bo, NOUVEAU_BO_WR | NOUVEAU_BO_VRAM }, +#if NOUVEAU_VP3_DEBUG_FENCE + { dec->fence_bo, NOUVEAU_BO_WR | NOUVEAU_BO_GART }, +#endif + { dec->bitplane_bo, NOUVEAU_BO_RDWR | NOUVEAU_BO_VRAM }, + }; + int num_refs = sizeof(bo_refs)/sizeof(*bo_refs); + + if (!dec->bitplane_bo) + num_refs--; + +#if NOUVEAU_VP3_DEBUG_FENCE + fence_extra = 4; +#endif + + ret = nouveau_bo_map(bsp_bo, NOUVEAU_BO_WR, dec->client); + if (ret) { + debug_printf("map failed: %i %s\n", ret, strerror(-ret)); + return -1; + } + + caps = nouveau_vp3_bsp(dec, desc, target, comm_seq, + 
num_buffers, data, num_bytes); + + nouveau_vp3_vp_caps(dec, desc, target, comm_seq, vp_caps, is_ref, refs); + + nouveau_pushbuf_space(push, 6 + (codec == PIPE_VIDEO_CODEC_MPEG4_AVC ? 9 : 8) + fence_extra + 2, num_refs, 0); + nouveau_pushbuf_refn(push, bo_refs, num_refs); + + bsp_addr = bsp_bo->offset >> 8; + inter_addr = inter_bo->offset >> 8; + +#if NOUVEAU_VP3_DEBUG_FENCE + memset(dec->comm, 0, 0x200); + comm_addr = (dec->fence_bo->offset + COMM_OFFSET) >> 8; +#else + comm_addr = bsp_addr + (COMM_OFFSET>>8); +#endif + + BEGIN_NV04(push, SUBC_BSP(0x700), 5); + PUSH_DATA (push, caps); // 700 cmd + PUSH_DATA (push, bsp_addr + 1); // 704 strparm_bsp + PUSH_DATA (push, bsp_addr + 7); // 708 str addr + PUSH_DATA (push, comm_addr); // 70c comm + PUSH_DATA (push, comm_seq); // 710 seq + + if (codec != PIPE_VIDEO_CODEC_MPEG4_AVC) { + u32 bitplane_addr; + int mpeg12 = (codec == PIPE_VIDEO_CODEC_MPEG12); + + bitplane_addr = dec->bitplane_bo->offset >> 8; + + nouveau_vp3_inter_sizes(dec, 1, &slice_size, &bucket_size, &ring_size); + BEGIN_NV04(push, SUBC_BSP(0x400), mpeg12 ? 5 : 7); + PUSH_DATA (push, bsp_addr); // 400 picparm addr + PUSH_DATA (push, inter_addr); // 404 interparm addr + PUSH_DATA (push, inter_addr + slice_size + bucket_size); // 408 interdata addr + PUSH_DATA (push, ring_size << 8); // 40c interdata_size + if (!mpeg12) { + PUSH_DATA (push, bitplane_addr); // 410 BITPLANE_DATA + PUSH_DATA (push, 0x400); // 414 BITPLANE_DATA_SIZE + } + PUSH_DATA (push, 0); // dma idx + } else { + nouveau_vp3_inter_sizes(dec, desc.h264->slice_count, &slice_size, &bucket_size, &ring_size); + BEGIN_NV04(push, SUBC_BSP(0x400), 8); + PUSH_DATA (push, bsp_addr); // 400 picparm addr + PUSH_DATA (push, inter_addr); // 404 interparm addr + PUSH_DATA (push, slice_size << 8); // 408 interparm size? 
+ PUSH_DATA (push, inter_addr + slice_size + bucket_size); // 40c interdata addr + PUSH_DATA (push, ring_size << 8); // 410 interdata size + PUSH_DATA (push, inter_addr + slice_size); // 414 bucket? + PUSH_DATA (push, bucket_size << 8); // 418 bucket size? unshifted.. + PUSH_DATA (push, 0); // 41c targets + // TODO: Double check 414 / 418 with nvidia trace + } + +#if NOUVEAU_VP3_DEBUG_FENCE + BEGIN_NV04(push, SUBC_BSP(0x240), 3); + PUSH_DATAh(push, dec->fence_bo->offset); + PUSH_DATA (push, dec->fence_bo->offset); + PUSH_DATA (push, dec->fence_seq); + + BEGIN_NV04(push, SUBC_BSP(0x300), 1); + PUSH_DATA (push, 1); + PUSH_KICK (push); + + { + unsigned spin = 0; + do { + usleep(100); + if ((spin++ & 0xff) == 0xff) { + debug_printf("b%u: %u\n", dec->fence_seq, dec->fence_map[0]); + dump_comm_bsp(dec->comm); + } + } while (dec->fence_seq > dec->fence_map[0]); + } + + dump_comm_bsp(dec->comm); + return dec->comm->status[comm_seq & 0xf]; +#else + BEGIN_NV04(push, SUBC_BSP(0x300), 1); + PUSH_DATA (push, 0); + PUSH_KICK (push); + return 2; +#endif +} diff --git a/src/gallium/drivers/nv50/nv98_video_ppp.c b/src/gallium/drivers/nv50/nv98_video_ppp.c new file mode 100644 index 0000000..4c4b8af --- /dev/null +++ b/src/gallium/drivers/nv50/nv98_video_ppp.c @@ -0,0 +1,143 @@ +/* + * Copyright 2011-2013 Maarten Lankhorst + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +#include "nv98_video.h" + +static void +nv98_decoder_setup_ppp(struct nouveau_vp3_decoder *dec, struct nouveau_vp3_video_buffer *target, uint32_t low700) { + struct nouveau_pushbuf *push = dec->pushbuf[2]; + + uint32_t stride_in = mb(dec->base.width); + uint32_t stride_out = mb(target->resources[0]->width0); + uint32_t dec_h = mb(dec->base.height); + uint32_t dec_w = mb(dec->base.width); + uint64_t in_addr; + uint32_t y2, cbcr, cbcr2, i; + struct nouveau_pushbuf_refn bo_refs[] = { + { NULL, NOUVEAU_BO_WR | NOUVEAU_BO_VRAM }, + { NULL, NOUVEAU_BO_WR | NOUVEAU_BO_VRAM }, + { dec->ref_bo, NOUVEAU_BO_RD | NOUVEAU_BO_VRAM }, +#if NOUVEAU_VP3_DEBUG_FENCE + { dec->fence_bo, NOUVEAU_BO_WR | NOUVEAU_BO_GART }, +#endif + }; + unsigned num_refs = sizeof(bo_refs)/sizeof(*bo_refs); + + for (i = 0; i < 2; ++i) { + struct nv50_miptree *mt = (struct nv50_miptree *)target->resources[i]; + bo_refs[i].bo = mt->base.bo; + } + + nouveau_pushbuf_refn(push, bo_refs, num_refs); + nouveau_vp3_ycbcr_offsets(dec, &y2, &cbcr, &cbcr2); + + BEGIN_NV04(push, SUBC_PPP(0x700), 10); + in_addr = nouveau_vp3_video_addr(dec, target) >> 8; + + PUSH_DATA (push, (stride_out << 24) | (stride_out << 16) | low700); // 700 + PUSH_DATA (push, (stride_in << 24) | (stride_in << 16) | (dec_h << 8) | dec_w); // 704 + assert(dec_w == stride_in); + + /* Input: */ + PUSH_DATA (push, in_addr); // 708 + PUSH_DATA (push, in_addr + y2); // 70c + PUSH_DATA (push, in_addr + cbcr); // 710 + PUSH_DATA (push, in_addr + cbcr2); // 714 
+ + for (i = 0; i < 2; ++i) { + struct nv50_miptree *mt = (struct nv50_miptree *)target->resources[i]; + + PUSH_DATA (push, mt->base.address >> 8); + PUSH_DATA (push, (mt->base.address + mt->total_size/2) >> 8); + mt->base.status |= NOUVEAU_BUFFER_STATUS_GPU_WRITING; + } +} + +static uint32_t +nv98_decoder_vc1_ppp(struct nouveau_vp3_decoder *dec, struct pipe_vc1_picture_desc *desc, struct nouveau_vp3_video_buffer *target) { + struct nouveau_pushbuf *push = dec->pushbuf[2]; + + nv98_decoder_setup_ppp(dec, target, 0x1412); + assert(!desc->deblockEnable); + assert(!(dec->base.width & 0xf)); + assert(!(dec->base.height & 0xf)); + + BEGIN_NV04(push, SUBC_PPP(0x400), 1); + PUSH_DATA (push, desc->pquant << 11); + + // 728 = wtf? + return 0x10; +} + +void +nv98_decoder_ppp(struct nouveau_vp3_decoder *dec, union pipe_desc desc, struct nouveau_vp3_video_buffer *target, unsigned comm_seq) { + enum pipe_video_codec codec = u_reduce_video_profile(dec->base.profile); + struct nouveau_pushbuf *push = dec->pushbuf[2]; + unsigned ppp_caps = 0x10; + unsigned fence_extra = 0; + +#if NOUVEAU_VP3_DEBUG_FENCE + fence_extra = 4; +#endif + + nouveau_pushbuf_space(push, 11 + (codec == PIPE_VIDEO_CODEC_VC1 ? 
2 : 0) + 3 + fence_extra + 2, 4, 0); + + switch (codec) { + case PIPE_VIDEO_CODEC_MPEG12: { + unsigned mpeg2 = dec->base.profile != PIPE_VIDEO_PROFILE_MPEG1; + nv98_decoder_setup_ppp(dec, target, 0x1410 | mpeg2); + break; + } + case PIPE_VIDEO_CODEC_MPEG4: nv98_decoder_setup_ppp(dec, target, 0x1414); break; + case PIPE_VIDEO_CODEC_VC1: ppp_caps = nv98_decoder_vc1_ppp(dec, desc.vc1, target); break; + case PIPE_VIDEO_CODEC_MPEG4_AVC: nv98_decoder_setup_ppp(dec, target, 0x1413); break; + default: assert(0); + } + BEGIN_NV04(push, SUBC_PPP(0x734), 2); + PUSH_DATA (push, comm_seq); + PUSH_DATA (push, ppp_caps); + +#if NOUVEAU_VP3_DEBUG_FENCE + BEGIN_NV04(push, SUBC_PPP(0x240), 3); + PUSH_DATAh(push, (dec->fence_bo->offset + 0x20)); + PUSH_DATA (push, (dec->fence_bo->offset + 0x20)); + PUSH_DATA (push, dec->fence_seq); + + BEGIN_NV04(push, SUBC_PPP(0x300), 1); + PUSH_DATA (push, 1); + PUSH_KICK (push); + + { + unsigned spin = 0; + + do { + usleep(100); + if ((spin++ & 0xff) == 0xff) + debug_printf("p%u: %u\n", dec->fence_seq, dec->fence_map[8]); + } while (dec->fence_seq > dec->fence_map[8]); + } +#else + BEGIN_NV04(push, SUBC_PPP(0x300), 1); + PUSH_DATA (push, 0); + PUSH_KICK (push); +#endif +} diff --git a/src/gallium/drivers/nv50/nv98_video_vp.c b/src/gallium/drivers/nv50/nv98_video_vp.c new file mode 100644 index 0000000..04994bd --- /dev/null +++ b/src/gallium/drivers/nv50/nv98_video_vp.c @@ -0,0 +1,202 @@ +/* + * Copyright 2011-2013 Maarten Lankhorst + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice 
shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +#include "nv98_video.h" +#include <sys/mman.h> + +#if NOUVEAU_VP3_DEBUG_FENCE +static void dump_comm_vp(struct nouveau_vp3_decoder *dec, struct comm *comm, u32 comm_seq, + struct nouveau_bo *inter_bo, unsigned slice_size) +{ + unsigned i, idx = comm->pvp_cur_index & 0xf; + debug_printf("Status: %08x, stage: %08x\n", comm->status_vp[idx], comm->pvp_stage); +#if 0 + debug_printf("Acked byte ofs: %x, bsp byte ofs: %x\n", comm->acked_byte_ofs, comm->byte_ofs); + debug_printf("Irq/parse indexes: %i %i\n", comm->irq_index, comm->parse_endpos_index); + + for (i = 0; i != comm->irq_index; ++i) + debug_printf("irq[%i] = { @ %08x -> %04x }\n", i, comm->irq_pos[i], comm->irq_470[i]); + for (i = 0; i != comm->parse_endpos_index; ++i) + debug_printf("parse_endpos[%i] = { @ %08x}\n", i, comm->parse_endpos[i]); +#endif + debug_printf("mb_y = %u\n", comm->mb_y[idx]); + if (comm->status_vp[idx] == 1) + return; + + if ((comm->pvp_stage & 0xff) != 0xff) { + unsigned *map; + assert(nouveau_bo_map(inter_bo, NOUVEAU_BO_RD|NOUVEAU_BO_NOBLOCK, dec->client) >= 0); + map = inter_bo->map; + for (i = 0; i < comm->byte_ofs + slice_size; i += 0x10) { + debug_printf("%05x: %08x %08x %08x %08x\n", i, map[i/4], map[i/4+1], map[i/4+2], map[i/4+3]); + } + munmap(inter_bo->map, inter_bo->size); + inter_bo->map = NULL; + } + assert((comm->pvp_stage & 0xff) == 0xff); +} +#endif + +static void +nv98_decoder_kick_ref(struct 
nouveau_vp3_decoder *dec, struct nouveau_vp3_video_buffer *target) +{ + dec->refs[target->valid_ref].vidbuf = NULL; + dec->refs[target->valid_ref].last_used = 0; +// debug_printf("Unreffed %p\n", target); +} + +void +nv98_decoder_vp(struct nouveau_vp3_decoder *dec, union pipe_desc desc, + struct nouveau_vp3_video_buffer *target, unsigned comm_seq, + unsigned caps, unsigned is_ref, + struct nouveau_vp3_video_buffer *refs[16]) +{ + struct nouveau_pushbuf *push = dec->pushbuf[1]; + uint32_t bsp_addr, comm_addr, inter_addr, ucode_addr, pic_addr[17], last_addr, null_addr; + uint32_t slice_size, bucket_size, ring_size, i; + enum pipe_video_codec codec = u_reduce_video_profile(dec->base.profile); + struct nouveau_bo *bsp_bo = dec->bsp_bo[comm_seq % NOUVEAU_VP3_VIDEO_QDEPTH]; + struct nouveau_bo *inter_bo = dec->inter_bo[comm_seq & 1]; + u32 fence_extra = 0, codec_extra = 0; + struct nouveau_pushbuf_refn bo_refs[] = { + { inter_bo, NOUVEAU_BO_WR | NOUVEAU_BO_VRAM }, + { dec->ref_bo, NOUVEAU_BO_WR | NOUVEAU_BO_VRAM }, + { bsp_bo, NOUVEAU_BO_RD | NOUVEAU_BO_VRAM }, +#if NOUVEAU_VP3_DEBUG_FENCE + { dec->fence_bo, NOUVEAU_BO_WR | NOUVEAU_BO_GART }, +#endif + { dec->fw_bo, NOUVEAU_BO_RD | NOUVEAU_BO_VRAM }, + }; + int num_refs = sizeof(bo_refs)/sizeof(*bo_refs) - !dec->fw_bo; + +#if NOUVEAU_VP3_DEBUG_FENCE + fence_extra = 4; +#endif + + if (codec == PIPE_VIDEO_CODEC_MPEG4_AVC) { + nouveau_vp3_inter_sizes(dec, desc.h264->slice_count, &slice_size, &bucket_size, &ring_size); + codec_extra += 2; + } else + nouveau_vp3_inter_sizes(dec, 1, &slice_size, &bucket_size, &ring_size); + + if (dec->base.max_references > 2) + codec_extra += 1 + (dec->base.max_references - 2); + + pic_addr[16] = nouveau_vp3_video_addr(dec, target) >> 8; + last_addr = null_addr = nouveau_vp3_video_addr(dec, NULL) >> 8; + + for (i = 0; i < dec->base.max_references; ++i) { + if (!refs[i]) + pic_addr[i] = last_addr; + else if (dec->refs[refs[i]->valid_ref].vidbuf == refs[i]) + last_addr = pic_addr[i] = 
nouveau_vp3_video_addr(dec, refs[i]) >> 8; + else + pic_addr[i] = null_addr; + } + if (!is_ref) + nv98_decoder_kick_ref(dec, target); + + nouveau_pushbuf_space(push, 8 + 3 * (codec != PIPE_VIDEO_CODEC_MPEG12) + + 6 + codec_extra + fence_extra + 2, num_refs, 0); + + nouveau_pushbuf_refn(push, bo_refs, num_refs); + + bsp_addr = bsp_bo->offset >> 8; +#if NOUVEAU_VP3_DEBUG_FENCE + comm_addr = (dec->fence_bo->offset + COMM_OFFSET)>>8; +#else + comm_addr = bsp_addr + (COMM_OFFSET>>8); +#endif + inter_addr = inter_bo->offset >> 8; + if (dec->fw_bo) + ucode_addr = dec->fw_bo->offset >> 8; + else + ucode_addr = 0; + + BEGIN_NV04(push, SUBC_VP(0x700), 7); + PUSH_DATA (push, caps); // 700 + PUSH_DATA (push, comm_seq); // 704 + PUSH_DATA (push, 0); // 708 fuc targets, ignored for nv98 + PUSH_DATA (push, dec->fw_sizes); // 70c + PUSH_DATA (push, bsp_addr+(VP_OFFSET>>8)); // 710 picparm_addr + PUSH_DATA (push, inter_addr); // 714 inter_parm + PUSH_DATA (push, inter_addr + slice_size + bucket_size); // 718 inter_data_ofs + + if (bucket_size) { + uint64_t tmpimg_addr = dec->ref_bo->offset + dec->ref_stride * (dec->base.max_references+2); + + BEGIN_NV04(push, SUBC_VP(0x71c), 2); + PUSH_DATA (push, tmpimg_addr >> 8); // 71c + PUSH_DATA (push, inter_addr + slice_size); // 720 bucket_ofs + } + + BEGIN_NV04(push, SUBC_VP(0x724), 5); + PUSH_DATA (push, comm_addr); // 724 + PUSH_DATA (push, ucode_addr); // 728 + PUSH_DATA (push, pic_addr[16]); // 734 + PUSH_DATA (push, pic_addr[0]); // 72c + PUSH_DATA (push, pic_addr[1]); // 730 + + if (dec->base.max_references > 2) { + int i; + + BEGIN_NV04(push, SUBC_VP(0x400), dec->base.max_references - 2); + for (i = 2; i < dec->base.max_references; ++i) { + assert(0x400 + (i - 2) * 4 < 0x438); + PUSH_DATA (push, pic_addr[i]); + } + } + + if (codec == PIPE_VIDEO_CODEC_MPEG4_AVC) { + BEGIN_NV04(push, SUBC_VP(0x438), 1); + PUSH_DATA (push, desc.h264->slice_count); + } + + //debug_printf("Decoding %08lx with %08lx and %08lx\n", pic_addr[16], 
pic_addr[0], pic_addr[1]); + +#if NOUVEAU_VP3_DEBUG_FENCE + BEGIN_NV04(push, SUBC_VP(0x240), 3); + PUSH_DATAh(push, (dec->fence_bo->offset + 0x10)); + PUSH_DATA (push, (dec->fence_bo->offset + 0x10)); + PUSH_DATA (push, dec->fence_seq); + + BEGIN_NV04(push, SUBC_VP(0x300), 1); + PUSH_DATA (push, 1); + PUSH_KICK(push); + + { + unsigned spin = 0; + do { + usleep(100); + if ((spin++ & 0xff) == 0xff) { + debug_printf("v%u: %u\n", dec->fence_seq, dec->fence_map[4]); + dump_comm_vp(dec, dec->comm, comm_seq, inter_bo, slice_size << 8); + } + } while (dec->fence_seq > dec->fence_map[4]); + } + dump_comm_vp(dec, dec->comm, comm_seq, inter_bo, slice_size << 8); +#else + BEGIN_NV04(push, SUBC_VP(0x300), 1); + PUSH_DATA (push, 0); + PUSH_KICK (push); +#endif +} -- 1.8.1.5
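For readers trying to follow the ref_bo sizing in nv98_create_decoder, here is a rough standalone sketch of the arithmetic. It is not part of the patch. The helper names (mb_count, mb_half_count, align64, ref_stride_bytes, ref_bo_bytes) are made up for illustration, and their bodies are assumptions about what mb(), mb_half() and nouveau_vp3_video_align() do (round up to 16-pixel units, 32-pixel units, and a multiple of 64 respectively); those definitions live in the shared nouveau header and are not shown in this patch, so check them there before relying on the numbers.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: stand-ins for mb(), mb_half() and nouveau_vp3_video_align(),
 * whose real definitions are not part of this patch. */
static inline uint32_t mb_count(uint32_t px)      { return (px + 0xf) >> 4; }
static inline uint32_t mb_half_count(uint32_t px) { return (px + 0x1f) >> 5; }
static inline uint32_t align64(uint32_t h)        { return (h + 0x3f) & ~0x3fu; }

/* Mirrors the patch's expression:
 *   mb(width)*16 * (mb_half(height)*32 + nouveau_vp3_video_align(height)/2)
 * i.e. padded width times (padded luma rows + padded chroma rows). */
static inline uint32_t ref_stride_bytes(uint32_t width, uint32_t height)
{
   return mb_count(width) * 16 *
          (mb_half_count(height) * 32 + align64(height) / 2);
}

/* Mirrors the size passed to nouveau_bo_new() for dec->ref_bo:
 *   ref_stride * (max_references + 2) + tmp_size
 * tmp_size is codec-dependent in the patch (0 for MPEG1/2). */
static inline uint64_t ref_bo_bytes(uint32_t width, uint32_t height,
                                    unsigned max_references, uint32_t tmp_size)
{
   assert(max_references <= 16); /* same upper bound the patch asserts for h264 */
   return (uint64_t)ref_stride_bytes(width, height) * (max_references + 2)
          + tmp_size;
}
```

Under those assumptions a 1920x1080 MPEG2 stream with two references needs four slots of about 3 MiB each, since the patch allocates max_references + 2 slots. VC-1 adds one padded luma-sized temporary (mb(height)*16 * mb(width)*16) on top of that.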