James Molloy
2012-Sep-24 15:04 UTC
[LLVMdev] [cfe-dev] SPIR provisional specification is now available in the Khronos website
> > For the record, I just workarounded it in pocl by borrowing the > BreakConstantGEPs code from SAFECode. But for SPIR specs, IMHO, this should > be reconsidered.Yes, I agree. On 24 September 2012 15:08, Pekka Jääskeläinen <pekka.jaaskelainen at tut.fi>wrote:> Well, > > To be honest I'm not very comfortable with the whole constant GEP > idea. It's a new thing to me and I do not fully understand its > point in LLVM IR, so I probably wasn't very clear ;) > > Anyways, me bringing it up was meant as an example of what can happen > if one (mis)uses the C function static variable semantics for something > that > really is a thread local variable (in usual thread parallel > implementations). > > For the record, I just workarounded it in pocl by borrowing the > BreakConstantGEPs code from SAFECode. But for SPIR specs, IMHO, this should > be reconsidered. > > > On 09/24/2012 05:00 PM, James Molloy wrote: > >> Hi, >> >> Sorry, With a prod from Silviu (cc'd) I now understand - I was >> interpreting >> your use of "constant GEP" as "GEP with constant operand" as opposed to >> "ConstantGEP node" which of course can only take a Constant* operand, not >> a >> Value* operand. >> >> I now fully see the problem and realise that my solution is also prone to >> that problem :) >> >> Cheers, >> >> James >> >> On 24 September 2012 14:41, James Molloy <james at jamesmolloy.co.uk >> <mailto:james at jamesmolloy.co.**uk <james at jamesmolloy.co.uk>>> wrote: >> >> Hi, >> >> I don't fully understand your problem description. >> >> ...is caused by LLVM/Clang thinking >> >> they are buffers with a constant base which they eventually won't be in a >> parallel WG implementation. This triggers an issue I'm currently working >> on >> pocl: https://bugs.launchpad.net/**pocl/+bug/1032203<https://bugs.launchpad.net/pocl/+bug/1032203>because Clang generates >> constant GEPs for the local buffer accesses (even though in a parallel >> thread-safe implementation the local variables cannot be stored to >> constant >> locations). >> >> >> Surely if you're passing in pointers to the kernel function that differ >> depending on workgroup, then a GEP from those pointers of a constant >> amount >> is perfectly safe. Why would a constant GEP from a per-workgroup base be a >> problem? >> >> >> I'm sure there's something I've misunderstood about your solution... >> >> Cheers, >> >> James >> >> On 24 September 2012 12:41, Pekka Jääskeläinen <pekka.jaaskelainen at tut.fi >> <mailto:pekka.jaaskelainen@**tut.fi <pekka.jaaskelainen at tut.fi>>> wrote: >> >>> Hi all, >>> >>> Another OpenCL C implementation issue I'm currently fighting with >>> >> is how >> >>> to best implement the automatic __local variables. Seems SPIR >>> >> enforces >> >>> the current Clang implementation of them that converts the automatic >>> locals to C function static variables (thus, in practice global >>> >> variables). >> >>> >>> Clearly, this is not a thread safe "final implementation", thus >>> >> works as is >> >>> only when multiple work groups of the same kernel are not executed in >>> parallel. Therefore, some other compiler pass is assumed to >>> >> convert those >> >>> function static (module global variables) to some other storage >>> >> where the >> >>> local buffers are allocated per work group thread. >>> >>> The pocl implementation is what was suggested some time ago in >>> >> this list: >> >>> the locals are converted to local arguments to the kernel >>> >> function which >> >>> are then passed per-thread buffers when the work group is >>> >> executed. Thus, >> >>> pocl needs to convert the references to these dummy globals to local >>> buffer pointers at the end of the kernel function argument list. >>> >>> The problem from the use of the "semantically inadequate" 'function >>> static' variables for the local buffers is caused by LLVM/Clang >>> >> thinking >> >>> they are buffers with a constant base which they eventually won't >>> >> be in >> >>> a parallel WG implementation. This triggers an issue I'm >>> >> currently working >> >>> on pocl: https://bugs.launchpad.net/**pocl/+bug/1032203<https://bugs.launchpad.net/pocl/+bug/1032203>because Clang >>> generates constant GEPs for the local buffer accesses (even though in a >>> >> parallel >> >>> thread-safe implementation the local variables cannot be stored to >>> constant locations). >>> >>> So, I wonder if this piece of SPIR specs might cause other similar >>> problems (LLVM optimizing incorrectly due to the slightly wrong >>> >> semantics) >> >>> in the future and should be improved. The minimal fix would be to add >>> some kind of attribute to the function static global that >>> >> prevents >> >>> Clang/LLVM thinking the address is constant and apply >>> >> optimizations that >> >>> rely on that. Semantically the local buffer is actually a thread-local >>> >> variable. >> >>> Are thread locals somehow supported in LLVM IR? >>> >>> >>> On 09/13/2012 12:19 PM, Pekka Jääskeläinen wrote: >>> >>>> >>>> For what it's worth, this issue manifests itself in an unsolved pocl >>>> bug: https://bugs.launchpad.net/**pocl/+bug/987905<https://bugs.launchpad.net/pocl/+bug/987905> >>>> >>>> It would be simpler to implement a portable implementation for >>>> >>> calling the >> >>> kernel from the host if we could assume the kernel calling >>>> >>> convention >> >>> mapped each OpenCL setArg arg to a single kernel function arg (and >>>> >>> preferably all >> >>> arg data in memory). For the non-kernel functions it should not >>>> >>> matter and >> >>> could be target-specific. >>>> >>>> >>> >>> -- Pekka >>> >> >> >> > > -- > Pekka >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120924/5061d86a/attachment.html>
James Molloy
2012-Sep-26 08:06 UTC
[LLVMdev] [cfe-dev] SPIR provisional specification is now available in the Khronos website
Micah, Boaz, Do you guys have any ideas about how to fix this issue? Cheers, James On 24 September 2012 16:04, James Molloy <james at jamesmolloy.co.uk> wrote:> For the record, I just workarounded it in pocl by borrowing the >> BreakConstantGEPs code from SAFECode. But for SPIR specs, IMHO, this >> should >> be reconsidered. > > > Yes, I agree. > > On 24 September 2012 15:08, Pekka Jääskeläinen <pekka.jaaskelainen at tut.fi>wrote: > >> Well, >> >> To be honest I'm not very comfortable with the whole constant GEP >> idea. It's a new thing to me and I do not fully understand its >> point in LLVM IR, so I probably wasn't very clear ;) >> >> Anyways, me bringing it up was meant as an example of what can happen >> if one (mis)uses the C function static variable semantics for something >> that >> really is a thread local variable (in usual thread parallel >> implementations). >> >> For the record, I just workarounded it in pocl by borrowing the >> BreakConstantGEPs code from SAFECode. But for SPIR specs, IMHO, this >> should >> be reconsidered. >> >> >> On 09/24/2012 05:00 PM, James Molloy wrote: >> >>> Hi, >>> >>> Sorry, With a prod from Silviu (cc'd) I now understand - I was >>> interpreting >>> your use of "constant GEP" as "GEP with constant operand" as opposed to >>> "ConstantGEP node" which of course can only take a Constant* operand, >>> not a >>> Value* operand. >>> >>> I now fully see the problem and realise that my solution is also prone to >>> that problem :) >>> >>> Cheers, >>> >>> James >>> >>> On 24 September 2012 14:41, James Molloy <james at jamesmolloy.co.uk >>> <mailto:james at jamesmolloy.co.**uk <james at jamesmolloy.co.uk>>> wrote: >>> >>> Hi, >>> >>> I don't fully understand your problem description. >>> >>> ...is caused by LLVM/Clang thinking >>> >>> they are buffers with a constant base which they eventually won't be in a >>> parallel WG implementation. This triggers an issue I'm currently working >>> on >>> pocl: https://bugs.launchpad.net/**pocl/+bug/1032203<https://bugs.launchpad.net/pocl/+bug/1032203>because Clang generates >>> constant GEPs for the local buffer accesses (even though in a parallel >>> thread-safe implementation the local variables cannot be stored to >>> constant >>> locations). >>> >>> >>> Surely if you're passing in pointers to the kernel function that differ >>> depending on workgroup, then a GEP from those pointers of a constant >>> amount >>> is perfectly safe. Why would a constant GEP from a per-workgroup base be >>> a >>> problem? >>> >>> >>> I'm sure there's something I've misunderstood about your solution... >>> >>> Cheers, >>> >>> James >>> >>> On 24 September 2012 12:41, Pekka Jääskeläinen < >>> pekka.jaaskelainen at tut.fi >>> <mailto:pekka.jaaskelainen@**tut.fi <pekka.jaaskelainen at tut.fi>>> wrote: >>> >>>> Hi all, >>>> >>>> Another OpenCL C implementation issue I'm currently fighting with >>>> >>> is how >>> >>>> to best implement the automatic __local variables. Seems SPIR >>>> >>> enforces >>> >>>> the current Clang implementation of them that converts the automatic >>>> locals to C function static variables (thus, in practice global >>>> >>> variables). >>> >>>> >>>> Clearly, this is not a thread safe "final implementation", thus >>>> >>> works as is >>> >>>> only when multiple work groups of the same kernel are not executed in >>>> parallel. Therefore, some other compiler pass is assumed to >>>> >>> convert those >>> >>>> function static (module global variables) to some other storage >>>> >>> where the >>> >>>> local buffers are allocated per work group thread. >>>> >>>> The pocl implementation is what was suggested some time ago in >>>> >>> this list: >>> >>>> the locals are converted to local arguments to the kernel >>>> >>> function which >>> >>>> are then passed per-thread buffers when the work group is >>>> >>> executed. Thus, >>> >>>> pocl needs to convert the references to these dummy globals to local >>>> buffer pointers at the end of the kernel function argument list. >>>> >>>> The problem from the use of the "semantically inadequate" 'function >>>> static' variables for the local buffers is caused by LLVM/Clang >>>> >>> thinking >>> >>>> they are buffers with a constant base which they eventually won't >>>> >>> be in >>> >>>> a parallel WG implementation. This triggers an issue I'm >>>> >>> currently working >>> >>>> on pocl: https://bugs.launchpad.net/**pocl/+bug/1032203<https://bugs.launchpad.net/pocl/+bug/1032203>because Clang >>>> generates constant GEPs for the local buffer accesses (even though in a >>>> >>> parallel >>> >>>> thread-safe implementation the local variables cannot be stored to >>>> constant locations). >>>> >>>> So, I wonder if this piece of SPIR specs might cause other similar >>>> problems (LLVM optimizing incorrectly due to the slightly wrong >>>> >>> semantics) >>> >>>> in the future and should be improved. The minimal fix would be to add >>>> some kind of attribute to the function static global that >>>> >>> prevents >>> >>>> Clang/LLVM thinking the address is constant and apply >>>> >>> optimizations that >>> >>>> rely on that. Semantically the local buffer is actually a thread-local >>>> >>> variable. >>> >>>> Are thread locals somehow supported in LLVM IR? >>>> >>>> >>>> On 09/13/2012 12:19 PM, Pekka Jääskeläinen wrote: >>>> >>>>> >>>>> For what it's worth, this issue manifests itself in an unsolved pocl >>>>> bug: https://bugs.launchpad.net/**pocl/+bug/987905<https://bugs.launchpad.net/pocl/+bug/987905> >>>>> >>>>> It would be simpler to implement a portable implementation for >>>>> >>>> calling the >>> >>>> kernel from the host if we could assume the kernel calling >>>>> >>>> convention >>> >>>> mapped each OpenCL setArg arg to a single kernel function arg (and >>>>> >>>> preferably all >>> >>>> arg data in memory). For the non-kernel functions it should not >>>>> >>>> matter and >>> >>>> could be target-specific. >>>>> >>>>> >>>> >>>> -- Pekka >>>> >>> >>> >>> >> >> -- >> Pekka >> > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120926/fc47f894/attachment.html>
Villmow, Micah
2012-Sep-26 17:21 UTC
[LLVMdev] [cfe-dev] SPIR provisional specification is now available in the Khronos website
It is my view that this is an implementation detail and not an issue with the SPIR spec. As SPIR is just a representation of a program in a portable manner, it is up to the consumer of SPIR to correctly set up the kernels based on the devices calling convention/ABI when the SPIR binary is loaded for that specific device. From: mankeyrabbit at gmail.com [mailto:mankeyrabbit at gmail.com] On Behalf Of James Molloy Sent: Wednesday, September 26, 2012 1:06 AM To: Pekka Jääskeläinen Cc: Villmow, Micah; Ouriel, Boaz; cfe-dev at cs.uiuc.edu; llvmdev at cs.uiuc.edu; pocl-devel at lists.sourceforge.net Subject: Re: [cfe-dev] [LLVMdev] SPIR provisional specification is now available in the Khronos website Micah, Boaz, Do you guys have any ideas about how to fix this issue? Cheers, James On 24 September 2012 16:04, James Molloy <james at jamesmolloy.co.uk<mailto:james at jamesmolloy.co.uk>> wrote: For the record, I just workarounded it in pocl by borrowing the BreakConstantGEPs code from SAFECode. But for SPIR specs, IMHO, this should be reconsidered. Yes, I agree. On 24 September 2012 15:08, Pekka Jääskeläinen <pekka.jaaskelainen at tut.fi<mailto:pekka.jaaskelainen at tut.fi>> wrote: Well, To be honest I'm not very comfortable with the whole constant GEP idea. It's a new thing to me and I do not fully understand its point in LLVM IR, so I probably wasn't very clear ;) Anyways, me bringing it up was meant as an example of what can happen if one (mis)uses the C function static variable semantics for something that really is a thread local variable (in usual thread parallel implementations). For the record, I just workarounded it in pocl by borrowing the BreakConstantGEPs code from SAFECode. But for SPIR specs, IMHO, this should be reconsidered. On 09/24/2012 05:00 PM, James Molloy wrote: Hi, Sorry, With a prod from Silviu (cc'd) I now understand - I was interpreting your use of "constant GEP" as "GEP with constant operand" as opposed to "ConstantGEP node" which of course can only take a Constant* operand, not a Value* operand. I now fully see the problem and realise that my solution is also prone to that problem :) Cheers, James On 24 September 2012 14:41, James Molloy <james at jamesmolloy.co.uk<mailto:james at jamesmolloy.co.uk> <mailto:james at jamesmolloy.co.uk<mailto:james at jamesmolloy.co.uk>>> wrote: Hi, I don't fully understand your problem description. ...is caused by LLVM/Clang thinking they are buffers with a constant base which they eventually won't be in a parallel WG implementation. This triggers an issue I'm currently working on pocl: https://bugs.launchpad.net/pocl/+bug/1032203 because Clang generates constant GEPs for the local buffer accesses (even though in a parallel thread-safe implementation the local variables cannot be stored to constant locations). Surely if you're passing in pointers to the kernel function that differ depending on workgroup, then a GEP from those pointers of a constant amount is perfectly safe. Why would a constant GEP from a per-workgroup base be a problem? I'm sure there's something I've misunderstood about your solution... Cheers, James On 24 September 2012 12:41, Pekka Jääskeläinen <pekka.jaaskelainen at tut.fi<mailto:pekka.jaaskelainen at tut.fi> <mailto:pekka.jaaskelainen at tut.fi<mailto:pekka.jaaskelainen at tut.fi>>> wrote: Hi all, Another OpenCL C implementation issue I'm currently fighting with is how to best implement the automatic __local variables. Seems SPIR enforces the current Clang implementation of them that converts the automatic locals to C function static variables (thus, in practice global variables). Clearly, this is not a thread safe "final implementation", thus works as is only when multiple work groups of the same kernel are not executed in parallel. Therefore, some other compiler pass is assumed to convert those function static (module global variables) to some other storage where the local buffers are allocated per work group thread. The pocl implementation is what was suggested some time ago in this list: the locals are converted to local arguments to the kernel function which are then passed per-thread buffers when the work group is executed. Thus, pocl needs to convert the references to these dummy globals to local buffer pointers at the end of the kernel function argument list. The problem from the use of the "semantically inadequate" 'function static' variables for the local buffers is caused by LLVM/Clang thinking they are buffers with a constant base which they eventually won't be in a parallel WG implementation. This triggers an issue I'm currently working on pocl: https://bugs.launchpad.net/pocl/+bug/1032203 because Clang generates constant GEPs for the local buffer accesses (even though in a parallel thread-safe implementation the local variables cannot be stored to constant locations). So, I wonder if this piece of SPIR specs might cause other similar problems (LLVM optimizing incorrectly due to the slightly wrong semantics) in the future and should be improved. The minimal fix would be to add some kind of attribute to the function static global that prevents Clang/LLVM thinking the address is constant and apply optimizations that rely on that. Semantically the local buffer is actually a thread-local variable. Are thread locals somehow supported in LLVM IR? On 09/13/2012 12:19 PM, Pekka Jääskeläinen wrote: For what it's worth, this issue manifests itself in an unsolved pocl bug: https://bugs.launchpad.net/pocl/+bug/987905 It would be simpler to implement a portable implementation for calling the kernel from the host if we could assume the kernel calling convention mapped each OpenCL setArg arg to a single kernel function arg (and preferably all arg data in memory). For the non-kernel functions it should not matter and could be target-specific. -- Pekka -- Pekka -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120926/b170e8a5/attachment.html>
Apparently Analagous Threads
- [LLVMdev] [cfe-dev] SPIR provisional specification is now available in the Khronos website
- [LLVMdev] [cfe-dev] SPIR provisional specification is now available in the Khronos website
- [LLVMdev] [cfe-dev] SPIR provisional specification is now available in the Khronos website
- [LLVMdev] [cfe-dev] SPIR provisional specification is now available in the Khronos website
- [LLVMdev] [cfe-dev] SPIR provisional specification is now available in the Khronos website