Hans de Goede
2016-Apr-07 13:27 UTC
[Nouveau] [PATCH] nouveau: codegen: Take src swizzle into account on loads
The llvm TGSI backend does things like:

LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0].x

Expecting the data at address TEMP[0].x to get loaded to
TEMP[0].y. Before this commit the data at TEMP[0].x + 4 would be
loaded instead. This commit fixes this.

Signed-off-by: Hans de Goede <hdegoede at redhat.com>
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index 557608e..cc51f5a 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -2279,12 +2279,16 @@ Converter::handleLOAD(Value *dst0[4])
 
          Value *off = fetchSrc(1, c);
          Symbol *sym;
+         uint32_t src0_component_offset = tgsi.getSrc(0).getSwizzle(c) * 4;
+
          if (tgsi.getSrc(1).getFile() == TGSI_FILE_IMMEDIATE) {
             off = NULL;
             sym = makeSym(tgsi.getSrc(0).getFile(), r, -1, c,
-                          tgsi.getSrc(1).getValueU32(0, info) + 4 * c);
+                          tgsi.getSrc(1).getValueU32(0, info) +
+                          src0_component_offset);
          } else {
-            sym = makeSym(tgsi.getSrc(0).getFile(), r, -1, c, 4 * c);
+            sym = makeSym(tgsi.getSrc(0).getFile(), r, -1, c,
+                          src0_component_offset);
          }
 
          Instruction *ld = mkLoad(TYPE_U32, dst0[c], sym, off);
-- 
2.7.3
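To make the offset change in the patch concrete, here is a minimal standalone sketch. It is not nouveau code: `Swizzle`, `offsetBefore` and `offsetAfter` are hypothetical names that only mirror the `4 * c` and `getSwizzle(c) * 4` expressions from the diff.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a TGSI source swizzle (not the nouveau API):
// entry c names the source component (0 = x .. 3 = w) that feeds
// destination component c.
using Swizzle = std::array<uint8_t, 4>;

// Byte offset used for destination component c before the patch:
// the destination index itself selects the dword.
inline uint32_t offsetBefore(uint32_t base, int c)
{
   assert(c >= 0 && c < 4);
   return base + 4u * static_cast<uint32_t>(c);
}

// Byte offset used for destination component c after the patch:
// the source swizzle selects the dword instead.
inline uint32_t offsetAfter(uint32_t base, const Swizzle &swz, int c)
{
   assert(c >= 0 && c < 4);
   return base + 4u * swz[c];
}
```

For a write to .y (c = 1) with swizzle .xxxx, `offsetBefore` yields base + 4 while `offsetAfter` yields base + 0. With the identity swizzle .xyzw both functions agree for every component.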
Ilia Mirkin
2016-Apr-07 13:58 UTC
[Nouveau] [PATCH] nouveau: codegen: Take src swizzle into account on loads
That's wrong.

The spec for the instruction needs to be clarified...

The current nouveau impl is correct - only the .x of the address
should be loaded, with up to 16 bytes read into the destination.

On Thu, Apr 7, 2016 at 9:27 AM, Hans de Goede <hdegoede at redhat.com> wrote:
> The llvm TGSI backend does things like:
>
> LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0].x
>
> Expecting the data at address TEMP[0].x to get loaded to
> TEMP[0].y. Before this commit the data at TEMP[0].x + 4 would be
> loaded instead. This commit fixes this.
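One possible reading of the semantics described in this reply, as a small standalone model: the byte address comes from the .x component of the address operand only, up to four consecutive dwords are read, and destination component c receives dword c when the writemask enables it. The names `Vec4` and `loadUnswizzled` are assumptions made for this sketch and do not exist in Mesa.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using Vec4 = std::array<uint32_t, 4>;

// Hypothetical model of an unswizzled LOAD (illustration only).
// writemask bit c enables destination component c (bit 0 = x .. bit 3 = w).
inline void loadUnswizzled(Vec4 &dst, unsigned writemask,
                           const std::vector<uint8_t> &memory,
                           const Vec4 &address)
{
   assert((writemask & ~0xFu) == 0);
   const std::size_t base = address[0];   // only .x of the address is used
   for (int c = 0; c < 4; ++c) {
      if (!(writemask & (1u << c)))
         continue;
      const std::size_t offset = base + 4u * static_cast<std::size_t>(c);
      if (offset + sizeof(uint32_t) > memory.size())
         continue;                         // out of bounds: leave dst[c] untouched
      uint32_t dword;
      std::memcpy(&dword, memory.data() + offset, sizeof(dword));
      dst[c] = dword;                      // component c gets the dword at base + 4 * c
   }
}
```

Under this model a load with writemask .y places the dword at address + 4 into the destination, which is the behaviour the commit message describes as the pre-patch result.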
Hans de Goede
2016-Apr-08 09:27 UTC
[Nouveau] [PATCH] nouveau: codegen: Take src swizzle into account on loads
Hi,

On 07-04-16 15:58, Ilia Mirkin wrote:
> That's wrong.

It used to work with the old RES[] code and if one cannot specify
a source swizzle, then how can I do something like:

LOAD TEMP[0].y, MEMORY[0], address

and get the data at absolute global memory address "address" into
TEMP[0].y? This is a must-have for llvm to be able to generate
working TGSI code, I do not see any way around this.

AFAIK this is exactly what src-swizzling is for.

Also note that this commit does not change anything if no
src-swizzling is specified, in that case things work exactly as before.

> The spec for the instruction needs to be clarified...
> The current nouveau impl is correct - only the .x of the address
> should be loaded, with up to 16 bytes read into the destination.

Ah, note this is not about swizzling on the address, that indeed
makes no sense given how the addressing works for BUFFERS / MEMORY.
No, this is about adding a swizzling postfix to the buffer / memory
resource specification, for example:

LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0]

See, the swizzling is done on the resource, not on the address, so
the swizzling specifies swizzling of the up to 16 bytes read from
the address; it does not influence the address handling at all.

I now see I made an error in my commit msg, it gives the following example:

LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0].x

This clearly is wrong, the last TEMP[0].x is not even valid TGSI,
the correct example would be:

LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0]

Regards,

Hans
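The resource-swizzle interpretation argued for in this reply can be sketched as a standalone model: the address still comes from .x alone, up to 16 bytes are read, and the swizzle on the resource then chooses which of those dwords lands in each enabled destination component. `Vec4`, `Swizzle`, `readVec4` and `loadSwizzled` are hypothetical names for illustration, not Mesa or nouveau functions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using Vec4 = std::array<uint32_t, 4>;
using Swizzle = std::array<uint8_t, 4>;   // entry c: source component for dst component c

// Read up to four dwords starting at byte offset base; dwords that would
// fall outside the buffer are left as zero in this model.
inline Vec4 readVec4(const std::vector<uint8_t> &memory, std::size_t base)
{
   Vec4 data = {0, 0, 0, 0};
   for (std::size_t i = 0; i < 4; ++i) {
      const std::size_t offset = base + 4 * i;
      if (offset + sizeof(uint32_t) <= memory.size())
         std::memcpy(&data[i], memory.data() + offset, sizeof(uint32_t));
   }
   return data;
}

// Hypothetical model of LOAD with a swizzle on the resource operand.
// The swizzle reorders the data that was read; it never touches the address.
inline void loadSwizzled(Vec4 &dst, unsigned writemask,
                         const std::vector<uint8_t> &memory,
                         const Vec4 &address, const Swizzle &swz)
{
   const Vec4 data = readVec4(memory, address[0]);   // address uses .x only
   for (int c = 0; c < 4; ++c) {
      if (!(writemask & (1u << c)))
         continue;
      assert(swz[c] < 4);
      dst[c] = data[swz[c]];
   }
}
```

In this model, swizzle .xxxx with writemask .y puts the dword at the given address into the .y destination component, while the identity swizzle .xyzw reproduces the unswizzled result, consistent with the statement that nothing changes when no source swizzle is specified.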