James Molloy
2012-Sep-24 14:00 UTC
[LLVMdev] [cfe-dev] SPIR provisional specification is now available in the Khronos website
Hi, Sorry, With a prod from Silviu (cc'd) I now understand - I was interpreting your use of "constant GEP" as "GEP with constant operand" as opposed to "ConstantGEP node" which of course can only take a Constant* operand, not a Value* operand. I now fully see the problem and realise that my solution is also prone to that problem :) Cheers, James On 24 September 2012 14:41, James Molloy <james at jamesmolloy.co.uk> wrote:> Hi, > > I don't fully understand your problem description. > > ...is caused by LLVM/Clang thinking >> >> they are buffers with a constant base which they eventually won't be in >> a parallel WG implementation. This triggers an issue I'm currently >> working on pocl: https://bugs.launchpad.net/pocl/+bug/1032203 because >> Clang generates >> constant GEPs for the local buffer accesses (even though in a parallel >> thread-safe implementation the local variables cannot be stored to >> constant locations). >> > > Surely if you're passing in pointers to the kernel function that differ > depending on workgroup, then a GEP from those pointers of a constant amount > is perfectly safe. Why would a constant GEP from a per-workgroup base be a > problem? > > > I'm sure there's something I've misunderstood about your solution... > > Cheers, > > James > > On 24 September 2012 12:41, Pekka Jääskeläinen <pekka.jaaskelainen at tut.fi> > wrote: > > Hi all, > > > > Another OpenCL C implementation issue I'm currently fighting with is how > > to best implement the automatic __local variables. Seems SPIR enforces > > the current Clang implementation of them that converts the automatic > > locals to C function static variables (thus, in practice global > variables). > > > > Clearly, this is not a thread safe "final implementation", thus works as > is > > only when multiple work groups of the same kernel are not executed in > > parallel. Therefore, some other compiler pass is assumed to convert those > > function static (module global variables) to some other storage where the > > local buffers are allocated per work group thread. > > > > The pocl implementation is what was suggested some time ago in this list: > > the locals are converted to local arguments to the kernel function which > > are then passed per-thread buffers when the work group is executed. Thus, > > pocl needs to convert the references to these dummy globals to local > > buffer pointers at the end of the kernel function argument list. > > > > The problem from the use of the "semantically inadequate" 'function > > static' variables for the local buffers is caused by LLVM/Clang thinking > > they are buffers with a constant base which they eventually won't be in > > a parallel WG implementation. This triggers an issue I'm currently > working > > on pocl: https://bugs.launchpad.net/pocl/+bug/1032203 because Clang > > generates > > constant GEPs for the local buffer accesses (even though in a parallel > > thread-safe implementation the local variables cannot be stored to > > constant locations). > > > > So, I wonder if this piece of SPIR specs might cause other similar > > problems (LLVM optimizing incorrectly due to the slightly wrong > semantics) > > in the future and should be improved. The minimal fix would be > > to add some kind of attribute to the function static global that prevents > > Clang/LLVM thinking the address is constant and apply optimizations that > > rely > > on that. Semantically the local buffer is actually a thread-local > variable. > > Are thread locals somehow supported in LLVM IR? > > > > > > On 09/13/2012 12:19 PM, Pekka Jääskeläinen wrote: > >> > >> For what it's worth, this issue manifests itself in an unsolved pocl > >> bug: https://bugs.launchpad.net/pocl/+bug/987905 > >> > >> It would be simpler to implement a portable implementation for calling > the > >> kernel from the host if we could assume the kernel calling convention > >> mapped > >> each OpenCL setArg arg to a single kernel function arg (and preferably > all > >> arg data in memory). For the non-kernel functions it should not matter > and > >> could be target-specific. > >> > > > > > > -- > > Pekka >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120924/35c8ccd4/attachment.html>
Pekka Jääskeläinen
2012-Sep-24 14:08 UTC
[LLVMdev] [cfe-dev] SPIR provisional specification is now available in the Khronos website
Well, To be honest I'm not very comfortable with the whole constant GEP idea. It's a new thing to me and I do not fully understand its point in LLVM IR, so I probably wasn't very clear ;) Anyways, me bringing it up was meant as an example of what can happen if one (mis)uses the C function static variable semantics for something that really is a thread local variable (in usual thread parallel implementations). For the record, I just workarounded it in pocl by borrowing the BreakConstantGEPs code from SAFECode. But for SPIR specs, IMHO, this should be reconsidered. On 09/24/2012 05:00 PM, James Molloy wrote:> Hi, > > Sorry, With a prod from Silviu (cc'd) I now understand - I was interpreting > your use of "constant GEP" as "GEP with constant operand" as opposed to > "ConstantGEP node" which of course can only take a Constant* operand, not a > Value* operand. > > I now fully see the problem and realise that my solution is also prone to > that problem :) > > Cheers, > > James > > On 24 September 2012 14:41, James Molloy <james at jamesmolloy.co.uk > <mailto:james at jamesmolloy.co.uk>> wrote: > > Hi, > > I don't fully understand your problem description. > > ...is caused by LLVM/Clang thinking > > they are buffers with a constant base which they eventually won't be in a > parallel WG implementation. This triggers an issue I'm currently working on > pocl: https://bugs.launchpad.net/pocl/+bug/1032203 because Clang generates > constant GEPs for the local buffer accesses (even though in a parallel > thread-safe implementation the local variables cannot be stored to constant > locations). > > > Surely if you're passing in pointers to the kernel function that differ > depending on workgroup, then a GEP from those pointers of a constant amount > is perfectly safe. Why would a constant GEP from a per-workgroup base be a > problem? > > > I'm sure there's something I've misunderstood about your solution... > > Cheers, > > James > > On 24 September 2012 12:41, Pekka Jääskeläinen <pekka.jaaskelainen at tut.fi > <mailto:pekka.jaaskelainen at tut.fi>> wrote: >> Hi all, >> >> Another OpenCL C implementation issue I'm currently fighting with > is how >> to best implement the automatic __local variables. Seems SPIR > enforces >> the current Clang implementation of them that converts the automatic >> locals to C function static variables (thus, in practice global > variables). >> >> Clearly, this is not a thread safe "final implementation", thus > works as is >> only when multiple work groups of the same kernel are not executed in >> parallel. Therefore, some other compiler pass is assumed to > convert those >> function static (module global variables) to some other storage > where the >> local buffers are allocated per work group thread. >> >> The pocl implementation is what was suggested some time ago in > this list: >> the locals are converted to local arguments to the kernel > function which >> are then passed per-thread buffers when the work group is > executed. Thus, >> pocl needs to convert the references to these dummy globals to local >> buffer pointers at the end of the kernel function argument list. >> >> The problem from the use of the "semantically inadequate" 'function >> static' variables for the local buffers is caused by LLVM/Clang > thinking >> they are buffers with a constant base which they eventually won't > be in >> a parallel WG implementation. This triggers an issue I'm > currently working >> on pocl: https://bugs.launchpad.net/pocl/+bug/1032203 because Clang >> generates constant GEPs for the local buffer accesses (even though in a > parallel >> thread-safe implementation the local variables cannot be stored to >> constant locations). >> >> So, I wonder if this piece of SPIR specs might cause other similar >> problems (LLVM optimizing incorrectly due to the slightly wrong > semantics) >> in the future and should be improved. The minimal fix would be to add >> some kind of attribute to the function static global that > prevents >> Clang/LLVM thinking the address is constant and apply > optimizations that >> rely on that. Semantically the local buffer is actually a thread-local > variable. >> Are thread locals somehow supported in LLVM IR? >> >> >> On 09/13/2012 12:19 PM, Pekka Jääskeläinen wrote: >>> >>> For what it's worth, this issue manifests itself in an unsolved pocl >>> bug: https://bugs.launchpad.net/pocl/+bug/987905 >>> >>> It would be simpler to implement a portable implementation for > calling the >>> kernel from the host if we could assume the kernel calling > convention >>> mapped each OpenCL setArg arg to a single kernel function arg (and > preferably all >>> arg data in memory). For the non-kernel functions it should not > matter and >>> could be target-specific. >>> >> >> >> -- Pekka > >-- Pekka
James Molloy
2012-Sep-24 15:04 UTC
[LLVMdev] [cfe-dev] SPIR provisional specification is now available in the Khronos website
> > For the record, I just workarounded it in pocl by borrowing the > BreakConstantGEPs code from SAFECode. But for SPIR specs, IMHO, this should > be reconsidered.Yes, I agree. On 24 September 2012 15:08, Pekka Jääskeläinen <pekka.jaaskelainen at tut.fi>wrote:> Well, > > To be honest I'm not very comfortable with the whole constant GEP > idea. It's a new thing to me and I do not fully understand its > point in LLVM IR, so I probably wasn't very clear ;) > > Anyways, me bringing it up was meant as an example of what can happen > if one (mis)uses the C function static variable semantics for something > that > really is a thread local variable (in usual thread parallel > implementations). > > For the record, I just workarounded it in pocl by borrowing the > BreakConstantGEPs code from SAFECode. But for SPIR specs, IMHO, this should > be reconsidered. > > > On 09/24/2012 05:00 PM, James Molloy wrote: > >> Hi, >> >> Sorry, With a prod from Silviu (cc'd) I now understand - I was >> interpreting >> your use of "constant GEP" as "GEP with constant operand" as opposed to >> "ConstantGEP node" which of course can only take a Constant* operand, not >> a >> Value* operand. >> >> I now fully see the problem and realise that my solution is also prone to >> that problem :) >> >> Cheers, >> >> James >> >> On 24 September 2012 14:41, James Molloy <james at jamesmolloy.co.uk >> <mailto:james at jamesmolloy.co.**uk <james at jamesmolloy.co.uk>>> wrote: >> >> Hi, >> >> I don't fully understand your problem description. >> >> ...is caused by LLVM/Clang thinking >> >> they are buffers with a constant base which they eventually won't be in a >> parallel WG implementation. This triggers an issue I'm currently working >> on >> pocl: https://bugs.launchpad.net/**pocl/+bug/1032203<https://bugs.launchpad.net/pocl/+bug/1032203>because Clang generates >> constant GEPs for the local buffer accesses (even though in a parallel >> thread-safe implementation the local variables cannot be stored to >> constant >> locations). >> >> >> Surely if you're passing in pointers to the kernel function that differ >> depending on workgroup, then a GEP from those pointers of a constant >> amount >> is perfectly safe. Why would a constant GEP from a per-workgroup base be a >> problem? >> >> >> I'm sure there's something I've misunderstood about your solution... >> >> Cheers, >> >> James >> >> On 24 September 2012 12:41, Pekka Jääskeläinen <pekka.jaaskelainen at tut.fi >> <mailto:pekka.jaaskelainen@**tut.fi <pekka.jaaskelainen at tut.fi>>> wrote: >> >>> Hi all, >>> >>> Another OpenCL C implementation issue I'm currently fighting with >>> >> is how >> >>> to best implement the automatic __local variables. Seems SPIR >>> >> enforces >> >>> the current Clang implementation of them that converts the automatic >>> locals to C function static variables (thus, in practice global >>> >> variables). >> >>> >>> Clearly, this is not a thread safe "final implementation", thus >>> >> works as is >> >>> only when multiple work groups of the same kernel are not executed in >>> parallel. Therefore, some other compiler pass is assumed to >>> >> convert those >> >>> function static (module global variables) to some other storage >>> >> where the >> >>> local buffers are allocated per work group thread. >>> >>> The pocl implementation is what was suggested some time ago in >>> >> this list: >> >>> the locals are converted to local arguments to the kernel >>> >> function which >> >>> are then passed per-thread buffers when the work group is >>> >> executed. Thus, >> >>> pocl needs to convert the references to these dummy globals to local >>> buffer pointers at the end of the kernel function argument list. >>> >>> The problem from the use of the "semantically inadequate" 'function >>> static' variables for the local buffers is caused by LLVM/Clang >>> >> thinking >> >>> they are buffers with a constant base which they eventually won't >>> >> be in >> >>> a parallel WG implementation. This triggers an issue I'm >>> >> currently working >> >>> on pocl: https://bugs.launchpad.net/**pocl/+bug/1032203<https://bugs.launchpad.net/pocl/+bug/1032203>because Clang >>> generates constant GEPs for the local buffer accesses (even though in a >>> >> parallel >> >>> thread-safe implementation the local variables cannot be stored to >>> constant locations). >>> >>> So, I wonder if this piece of SPIR specs might cause other similar >>> problems (LLVM optimizing incorrectly due to the slightly wrong >>> >> semantics) >> >>> in the future and should be improved. The minimal fix would be to add >>> some kind of attribute to the function static global that >>> >> prevents >> >>> Clang/LLVM thinking the address is constant and apply >>> >> optimizations that >> >>> rely on that. Semantically the local buffer is actually a thread-local >>> >> variable. >> >>> Are thread locals somehow supported in LLVM IR? >>> >>> >>> On 09/13/2012 12:19 PM, Pekka Jääskeläinen wrote: >>> >>>> >>>> For what it's worth, this issue manifests itself in an unsolved pocl >>>> bug: https://bugs.launchpad.net/**pocl/+bug/987905<https://bugs.launchpad.net/pocl/+bug/987905> >>>> >>>> It would be simpler to implement a portable implementation for >>>> >>> calling the >> >>> kernel from the host if we could assume the kernel calling >>>> >>> convention >> >>> mapped each OpenCL setArg arg to a single kernel function arg (and >>>> >>> preferably all >> >>> arg data in memory). For the non-kernel functions it should not >>>> >>> matter and >> >>> could be target-specific. >>>> >>>> >>> >>> -- Pekka >>> >> >> >> > > -- > Pekka >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120924/5061d86a/attachment.html>
Reasonably Related Threads
- [LLVMdev] [cfe-dev] SPIR provisional specification is now available in the Khronos website
- [LLVMdev] [cfe-dev] SPIR provisional specification is now available in the Khronos website
- [LLVMdev] [cfe-dev] SPIR provisional specification is now available in the Khronos website
- [LLVMdev] [cfe-dev] SPIR provisional specification is now available in the Khronos website
- [LLVMdev] [cfe-dev] SPIR provisional specification is now available in the Khronos website