Hans de Goede
2016-Mar-14 15:28 UTC
[Nouveau] [RFC mesa] nouveau: Add support for OpenCL global memory buffers
Hi, On 14-03-16 16:05, Ilia Mirkin wrote:> There's a less hacky and more hacky way forward. The more hacky solution is > to set file index to -1 or something and then not do the lowering when you > see that. > > The less hacky solution is the one you proposed as #1 - introduce a new > file for "buffer" memory and lower it to the global file by adding a base > offset. > > Right now the meaning of global is overloaded - before lowering it > implicitly includes the buffer vase address, and after lowering, it > explicitly includes it. Splitting it out I to another file type seems like > the cleaner way forward, not sure what issue you were seeing with that > approach.Ok. > (I didn't understand your argument about potential future> issues.)There was not much to understand, it is just something I worried about, but was not sure if there actually was something to worry about :) If you feel that solution #1 (which was also my first hunch) is the right one then I will go and implement that.> What I really don't want is to somehow differentiate glsl-sourced > and opencl-sourced compute programs in the backend.Ok, understood. Regards, Hans> On Mar 14, 2016 6:22 AM, "Hans de Goede" <hdegoede at redhat.com> wrote: > >> This little "hack" fixes the use of OpenCL global memory buffers with >> nouveau, but clearly the #if 0 is not a solution as it breaks buffers >> with GLSL. >> >> The reason I'm posting this as an RFC patch is to discuss how to solve >> this properly, 2 solutions come to mind: >> >> 1) Use separate nv50_ir::FILE_MEMORY_xxx values for buffers versus >> TGSI_FILE_MEMORY with TGSI_MEMORY_TYPE_GLOBAL, looking at >> translateFile() >> we currently have: >> >> case TGSI_FILE_BUFFER: return nv50_ir::FILE_MEMORY_GLOBAL; >> case TGSI_FILE_MEMORY: return nv50_ir::FILE_MEMORY_GLOBAL; >> >> So doing a s/nv50_ir::FILE_MEMORY_GLOBAL/nv50_ir::FILE_MEMORY_BUFFER/ >> everywhere and then adding a new FILE_MEMORY_GLOBAL seems like an >> obvious fix. >> >> But I'm afraid that we will have similar issues with OpenCL using >> flat addresses where as GLSL will have some implied base-address / >> offset in other places too, which brings me to solution 2: >> >> 2) Add a flag to Program to indicate that it is an OpenCL compute kernel; >> or possible use a different Program::TYPE_* for OpenCL ? >> >> I've a feeling that this is what we want since the addressing models >> are just different and we likely will need to implement different >> behavior >> in various places based on this. >> >> This will also allow us to use INPUT and CONST in tgsi code build from >> OpenCL programs and use that flag to do the right thing, rather then >> introducing new MEMORY[x], INPUT resp. MEMORY[x], CONST declarations >> for this. >> >> I'm esp. worried that once GLSL gets global support it will want >> different behavior for TGSI_FILE_MEMORY with TGSI_MEMORY_TYPE_GLOBAL >> then OpenCL, just like things are now with buffers, rendering solution >> 1. a non solution >> >> So I'm seeking input on how to move forward with this ... ? >> >> Signed-off-by: Hans de Goede <hdegoede at redhat.com> >> --- >> src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 4 ++++ >> src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 2 ++ >> 2 files changed, 6 insertions(+) >> >> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp >> b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp >> index de0c72b..15012ac 100644 >> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp >> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp >> @@ -1525,6 +1525,10 @@ Converter::makeSym(uint tgsiFile, int fileIdx, int >> idx, int c, uint32_t address) >> >> if (tgsiFile == TGSI_FILE_MEMORY) { >> switch (code->memoryFiles[fileIdx].mem_type) { >> + case TGSI_MEMORY_TYPE_GLOBAL: >> + /* No-op this is the default for TGSI_FILE_MEMORY */ >> + sym->setFile(FILE_MEMORY_GLOBAL); >> + break; >> case TGSI_MEMORY_TYPE_SHARED: >> sym->setFile(FILE_MEMORY_SHARED); >> break; >> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp >> b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp >> index 6cb4dd4..bcc96de 100644 >> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp >> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp >> @@ -2106,6 +2106,7 @@ NVC0LoweringPass::visit(Instruction *i) >> } else if (i->src(0).getFile() == FILE_SHADER_OUTPUT) { >> assert(prog->getType() == Program::TYPE_TESSELLATION_CONTROL); >> i->op = OP_VFETCH; >> +#if 0 >> } else if (i->src(0).getFile() == FILE_MEMORY_GLOBAL) { >> Value *ind = i->getIndirect(0, 1); >> Value *ptr = loadResInfo64(ind, i->getSrc(0)->reg.fileIndex * >> 16); >> @@ -2126,6 +2127,7 @@ NVC0LoweringPass::visit(Instruction *i) >> if (i->defExists(0)) { >> bld.mkMov(i->getDef(0), bld.mkImm(0)); >> } >> +#endif >> } >> break; >> case OP_ATOM: >> -- >> 2.7.2 >> >> >
Samuel Pitoiset
2016-Mar-14 15:41 UTC
[Nouveau] [RFC mesa] nouveau: Add support for OpenCL global memory buffers
On 03/14/2016 04:28 PM, Hans de Goede wrote:> Hi, > > On 14-03-16 16:05, Ilia Mirkin wrote: >> There's a less hacky and more hacky way forward. The more hacky >> solution is >> to set file index to -1 or something and then not do the lowering when >> you >> see that. >> >> The less hacky solution is the one you proposed as #1 - introduce a new >> file for "buffer" memory and lower it to the global file by adding a base >> offset. >> >> Right now the meaning of global is overloaded - before lowering it >> implicitly includes the buffer vase address, and after lowering, it >> explicitly includes it. Splitting it out I to another file type seems >> like >> the cleaner way forward, not sure what issue you were seeing with that >> approach. > > Ok.I agree with you guys, the solution #1 is fine by me. Btw, do you need someone with commit access to push your previous series (the tgsi thing)? I can do this for you.> > > (I didn't understand your argument about potential future >> issues.) > > There was not much to understand, it is just something I worried about, > but was not sure if there actually was something to worry about :) > > If you feel that solution #1 (which was also my first hunch) is > the right one then I will go and implement that. > >> What I really don't want is to somehow differentiate glsl-sourced >> and opencl-sourced compute programs in the backend. > > Ok, understood. > > Regards, > > Hans > > >> On Mar 14, 2016 6:22 AM, "Hans de Goede" <hdegoede at redhat.com> wrote: >> >>> This little "hack" fixes the use of OpenCL global memory buffers with >>> nouveau, but clearly the #if 0 is not a solution as it breaks buffers >>> with GLSL. >>> >>> The reason I'm posting this as an RFC patch is to discuss how to solve >>> this properly, 2 solutions come to mind: >>> >>> 1) Use separate nv50_ir::FILE_MEMORY_xxx values for buffers versus >>> TGSI_FILE_MEMORY with TGSI_MEMORY_TYPE_GLOBAL, looking at >>> translateFile() >>> we currently have: >>> >>> case TGSI_FILE_BUFFER: return nv50_ir::FILE_MEMORY_GLOBAL; >>> case TGSI_FILE_MEMORY: return nv50_ir::FILE_MEMORY_GLOBAL; >>> >>> So doing a >>> s/nv50_ir::FILE_MEMORY_GLOBAL/nv50_ir::FILE_MEMORY_BUFFER/ >>> everywhere and then adding a new FILE_MEMORY_GLOBAL seems like an >>> obvious fix. >>> >>> But I'm afraid that we will have similar issues with OpenCL using >>> flat addresses where as GLSL will have some implied base-address / >>> offset in other places too, which brings me to solution 2: >>> >>> 2) Add a flag to Program to indicate that it is an OpenCL compute >>> kernel; >>> or possible use a different Program::TYPE_* for OpenCL ? >>> >>> I've a feeling that this is what we want since the addressing models >>> are just different and we likely will need to implement different >>> behavior >>> in various places based on this. >>> >>> This will also allow us to use INPUT and CONST in tgsi code build >>> from >>> OpenCL programs and use that flag to do the right thing, rather then >>> introducing new MEMORY[x], INPUT resp. MEMORY[x], CONST declarations >>> for this. >>> >>> I'm esp. worried that once GLSL gets global support it will want >>> different behavior for TGSI_FILE_MEMORY with TGSI_MEMORY_TYPE_GLOBAL >>> then OpenCL, just like things are now with buffers, rendering >>> solution >>> 1. a non solution >>> >>> So I'm seeking input on how to move forward with this ... ? >>> >>> Signed-off-by: Hans de Goede <hdegoede at redhat.com> >>> --- >>> src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 4 ++++ >>> src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 2 ++ >>> 2 files changed, 6 insertions(+) >>> >>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp >>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp >>> index de0c72b..15012ac 100644 >>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp >>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp >>> @@ -1525,6 +1525,10 @@ Converter::makeSym(uint tgsiFile, int fileIdx, >>> int >>> idx, int c, uint32_t address) >>> >>> if (tgsiFile == TGSI_FILE_MEMORY) { >>> switch (code->memoryFiles[fileIdx].mem_type) { >>> + case TGSI_MEMORY_TYPE_GLOBAL: >>> + /* No-op this is the default for TGSI_FILE_MEMORY */ >>> + sym->setFile(FILE_MEMORY_GLOBAL); >>> + break; >>> case TGSI_MEMORY_TYPE_SHARED: >>> sym->setFile(FILE_MEMORY_SHARED); >>> break; >>> diff --git >>> a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp >>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp >>> index 6cb4dd4..bcc96de 100644 >>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp >>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp >>> @@ -2106,6 +2106,7 @@ NVC0LoweringPass::visit(Instruction *i) >>> } else if (i->src(0).getFile() == FILE_SHADER_OUTPUT) { >>> assert(prog->getType() =>>> Program::TYPE_TESSELLATION_CONTROL); >>> i->op = OP_VFETCH; >>> +#if 0 >>> } else if (i->src(0).getFile() == FILE_MEMORY_GLOBAL) { >>> Value *ind = i->getIndirect(0, 1); >>> Value *ptr = loadResInfo64(ind, i->getSrc(0)->reg.fileIndex * >>> 16); >>> @@ -2126,6 +2127,7 @@ NVC0LoweringPass::visit(Instruction *i) >>> if (i->defExists(0)) { >>> bld.mkMov(i->getDef(0), bld.mkImm(0)); >>> } >>> +#endif >>> } >>> break; >>> case OP_ATOM: >>> -- >>> 2.7.2 >>> >>> >>-- -Samuel
Hans de Goede
2016-Mar-14 19:50 UTC
[Nouveau] [RFC mesa] nouveau: Add support for OpenCL global memory buffers
Hi, On 14-03-16 16:41, Samuel Pitoiset wrote:> > > On 03/14/2016 04:28 PM, Hans de Goede wrote: >> Hi, >> >> On 14-03-16 16:05, Ilia Mirkin wrote: >>> There's a less hacky and more hacky way forward. The more hacky >>> solution is >>> to set file index to -1 or something and then not do the lowering when >>> you >>> see that. >>> >>> The less hacky solution is the one you proposed as #1 - introduce a new >>> file for "buffer" memory and lower it to the global file by adding a base >>> offset. >>> >>> Right now the meaning of global is overloaded - before lowering it >>> implicitly includes the buffer vase address, and after lowering, it >>> explicitly includes it. Splitting it out I to another file type seems >>> like >>> the cleaner way forward, not sure what issue you were seeing with that >>> approach. >> >> Ok. > > I agree with you guys, the solution #1 is fine by me. > > Btw, do you need someone with commit access to push your previous series (the tgsi thing)? I can do this for you.Thanks for the offer. IIRC Ilia wanted some minor fixes there, so I'll do a v2 tomorrow. Talking about commit rights, I guess it would be convenient for all if I would get commit rights myself? I promise I won't push anythings without acks. I already have a freedesktop.org account, my username is jwrdegoede. Regards, Hans> >> >> > (I didn't understand your argument about potential future >>> issues.) >> >> There was not much to understand, it is just something I worried about, >> but was not sure if there actually was something to worry about :) >> >> If you feel that solution #1 (which was also my first hunch) is >> the right one then I will go and implement that. >> >>> What I really don't want is to somehow differentiate glsl-sourced >>> and opencl-sourced compute programs in the backend. >> >> Ok, understood. >> >> Regards, >> >> Hans >> >> >>> On Mar 14, 2016 6:22 AM, "Hans de Goede" <hdegoede at redhat.com> wrote: >>> >>>> This little "hack" fixes the use of OpenCL global memory buffers with >>>> nouveau, but clearly the #if 0 is not a solution as it breaks buffers >>>> with GLSL. >>>> >>>> The reason I'm posting this as an RFC patch is to discuss how to solve >>>> this properly, 2 solutions come to mind: >>>> >>>> 1) Use separate nv50_ir::FILE_MEMORY_xxx values for buffers versus >>>> TGSI_FILE_MEMORY with TGSI_MEMORY_TYPE_GLOBAL, looking at >>>> translateFile() >>>> we currently have: >>>> >>>> case TGSI_FILE_BUFFER: return nv50_ir::FILE_MEMORY_GLOBAL; >>>> case TGSI_FILE_MEMORY: return nv50_ir::FILE_MEMORY_GLOBAL; >>>> >>>> So doing a >>>> s/nv50_ir::FILE_MEMORY_GLOBAL/nv50_ir::FILE_MEMORY_BUFFER/ >>>> everywhere and then adding a new FILE_MEMORY_GLOBAL seems like an >>>> obvious fix. >>>> >>>> But I'm afraid that we will have similar issues with OpenCL using >>>> flat addresses where as GLSL will have some implied base-address / >>>> offset in other places too, which brings me to solution 2: >>>> >>>> 2) Add a flag to Program to indicate that it is an OpenCL compute >>>> kernel; >>>> or possible use a different Program::TYPE_* for OpenCL ? >>>> >>>> I've a feeling that this is what we want since the addressing models >>>> are just different and we likely will need to implement different >>>> behavior >>>> in various places based on this. >>>> >>>> This will also allow us to use INPUT and CONST in tgsi code build >>>> from >>>> OpenCL programs and use that flag to do the right thing, rather then >>>> introducing new MEMORY[x], INPUT resp. MEMORY[x], CONST declarations >>>> for this. >>>> >>>> I'm esp. worried that once GLSL gets global support it will want >>>> different behavior for TGSI_FILE_MEMORY with TGSI_MEMORY_TYPE_GLOBAL >>>> then OpenCL, just like things are now with buffers, rendering >>>> solution >>>> 1. a non solution >>>> >>>> So I'm seeking input on how to move forward with this ... ? >>>> >>>> Signed-off-by: Hans de Goede <hdegoede at redhat.com> >>>> --- >>>> src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 4 ++++ >>>> src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 2 ++ >>>> 2 files changed, 6 insertions(+) >>>> >>>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp >>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp >>>> index de0c72b..15012ac 100644 >>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp >>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp >>>> @@ -1525,6 +1525,10 @@ Converter::makeSym(uint tgsiFile, int fileIdx, >>>> int >>>> idx, int c, uint32_t address) >>>> >>>> if (tgsiFile == TGSI_FILE_MEMORY) { >>>> switch (code->memoryFiles[fileIdx].mem_type) { >>>> + case TGSI_MEMORY_TYPE_GLOBAL: >>>> + /* No-op this is the default for TGSI_FILE_MEMORY */ >>>> + sym->setFile(FILE_MEMORY_GLOBAL); >>>> + break; >>>> case TGSI_MEMORY_TYPE_SHARED: >>>> sym->setFile(FILE_MEMORY_SHARED); >>>> break; >>>> diff --git >>>> a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp >>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp >>>> index 6cb4dd4..bcc96de 100644 >>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp >>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp >>>> @@ -2106,6 +2106,7 @@ NVC0LoweringPass::visit(Instruction *i) >>>> } else if (i->src(0).getFile() == FILE_SHADER_OUTPUT) { >>>> assert(prog->getType() =>>>> Program::TYPE_TESSELLATION_CONTROL); >>>> i->op = OP_VFETCH; >>>> +#if 0 >>>> } else if (i->src(0).getFile() == FILE_MEMORY_GLOBAL) { >>>> Value *ind = i->getIndirect(0, 1); >>>> Value *ptr = loadResInfo64(ind, i->getSrc(0)->reg.fileIndex * >>>> 16); >>>> @@ -2126,6 +2127,7 @@ NVC0LoweringPass::visit(Instruction *i) >>>> if (i->defExists(0)) { >>>> bld.mkMov(i->getDef(0), bld.mkImm(0)); >>>> } >>>> +#endif >>>> } >>>> break; >>>> case OP_ATOM: >>>> -- >>>> 2.7.2 >>>> >>>> >>> >
Possibly Parallel Threads
- [RFC mesa] nouveau: Add support for OpenCL global memory buffers
- [RFC mesa] nouveau: Add support for OpenCL global memory buffers
- [RFC mesa] nouveau: Add support for OpenCL global memory buffers
- [RFC mesa] nouveau: Add support for OpenCL global memory buffers
- [RFC mesa] nouveau: Add support for OpenCL global memory buffers