Dear Justin,

I am trying to add support for some OpenCL builtin functions to the PTX backend. The attached file represents the first stub of a patch for the fmax builtin function.

The test case I am trying is the following:

    define ptx_device float @f(float %x, float %y) {
    entry:
      %z = call float @fmax(float %x, float %y)
      ret float %z
    }

    declare float @fmax(float, float)

But at the moment llc crashes saying that "calls are not supported"; this does not happen with LLVM builtins like llvm.sqrt.f32.

Can you please give me a hint on what I am missing, or some general advice on how to add builtin functions?

Thank you in advance,

Alberto.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fmax_stub.patch
Type: text/x-patch
Size: 1962 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20111116/679cb98e/attachment.bin>

On Wed, Nov 16, 2011 at 8:05 AM, Alberto Magni <alberto.magni86 at gmail.com> wrote:

> Dear Justin,
>
> I am trying to add the support for some OpenCL builtin functions to
> the PTX backend.
> The attached file represent the first stub of a patch for the fmax
> builtin function.

First off, thanks for helping to improve the PTX back-end!

There are really two main issues here. First, OpenCL built-in functions do not belong in the PTX back-end. These will be implemented in the libclc library (http://www.pcc.me.uk/~peter/libclc). The back-end will only implement PTX intrinsics, which may be used by the OpenCL built-in functions in libclc. However, this particular function (max) corresponds to a PTX instruction, so it makes sense to implement it as an intrinsic in the back-end.

Second, intrinsic functions require a bit more work. You're off to a great start, but intrinsics are implemented a bit differently. It looks like LLVM does not have a max intrinsic, so we'll need to create one. Have a look at include/llvm/IntrinsicsPTX.td. This file defines the PTX-specific intrinsics. You can add an intrinsic for max here, and then implement a pattern-match in the PTXInstrInfo.td file. There is no need to create a new SDNode type for intrinsics, unless they require some special handling in the C++ code, which I do not see being the case here.

When you define a new intrinsic, use the following template as a name: int_ptx_max. This will define the LLVM intrinsic as @llvm.ptx.max(). Please follow the same convention when naming the __builtin_* function.

> The test case I am trying is the following:
>
> define ptx_device float @f(float %x, float %y) {
> entry:
>   %z = call float @fmax(float %x, float %y)
>   ret float %z
> }
>
> declare float @fmax(float, float)
>
> But at the moment llc crashes saying that "calls are not supported",
> this does not happens with llvm builtins like llvm.sqrt.f32

Which version of LLVM are you using? Calls to PTX device functions have been implemented for a little while now, so I'm surprised to see that error. Perhaps it's because the fmax function is not defined as ptx_device.

> Can you please give me a hint on what I am missing, or some general
> advice on how to add builtin functions.
>
> Thank you in advance,
>
> Alberto.

--
Thanks,
Justin Holewinski

On Wed, Nov 16, 2011 at 9:16 AM, Justin Holewinski <justin.holewinski at gmail.com> wrote:

> Second, intrinsic functions require a bit more work. You're off to a
> great start, but intrinsics are implemented a bit differently. It looks
> like LLVM does not have a max intrinsic, so we'll need to create one. Have
> a look at include/llvm/IntrinsicsPTX.td. This file defines the
> PTX-specific intrinsics. You can add an intrinsic for max here, and then
> implement a pattern-match in the PTXInstrInfo.td file. There is no need to
> create a new SDNode type for intrinsics, unless they require some special
> handling in the C++ code, which I do not see being the case here.

Sorry, there's a typo here. The intrinsic pattern matching goes in PTXIntrinsicInstrInfo.td.

--
Thanks,
Justin Holewinski

On Mon, Nov 21, 2011 at 7:01 AM, Alberto Magni <alberto.magni86 at gmail.com> wrote:

> Hi Justin,
>
> attached you find the patch for the integer max instruction.
> The multiclass PTX_INTRINSIC_INT3 in file PTXIntrinsicInstrInfo.td
> is almost an exact copy of PTX_INT3 in PTXInstrInfo.td, maybe
> a modification of this class can be defined in a separate file.

I'm copying llvmdev. We should keep discussions like this on the list for the benefit of others.

We can probably factor out a generic description, or even just use the PTX_INT3 multiclass directly. The PTXIntrinsicInstrInfo.td file is included by PTXInstrInfo.td, so anything defined in PTXInstrInfo.td is available in PTXIntrinsicInstrInfo.td.

> Do you agree with this approach ?
> Also, do you think that a class like PTX_INTRINSIC_INT3_SIGNED
> (a clone of PTX_INT3_SIGNED) is required ?

Yes, I believe we should split these into signed and unsigned variants. The results of max/min operations can definitely be different depending on whether the operands are signed or unsigned. Since this information is not encoded in LLVM types, we may want to create two versions for each integer type; something like:

    i32 @llvm.ptx.max.signed.i32(i32, i32)
    i32 @llvm.ptx.max.unsigned.i32(i32, i32)

Otherwise, the patch looks good.

> Thanks,
>
> Alberto
>
> On Wed, Nov 16, 2011 at 5:44 PM, Alberto Magni
> <alberto.magni86 at gmail.com> wrote:
> > On Wed, Nov 16, 2011 at 2:17 PM, Justin Holewinski
> > <justin.holewinski at gmail.com> wrote:
> >> Sorry, there's a typo here. The intrinsic pattern matching goes in
> >> PTXInstrinsicInstrInfo.td.
> >
> > Thank you for the pointers; I will let you know when I have the first
> > patch.
> >
> >> On Wed, Nov 16, 2011 at 9:16 AM, Justin Holewinski
> >> <justin.holewinski at gmail.com> wrote:
> >>> Which version of LLVM are you using? Calls to PTX device functions
> >>> have been implemented for a little while now, so I'm surprised to see
> >>> that error.
> >>> Perhaps it's because the fmax function is not defined as ptx_device.
> >
> > This is the test case that I am using to verify whether the max builtin
> > function I am implementing is actually recognised. I took inspiration
> > from the llvm-intrinsic.ll test case.
> > The command I am using to compile is:
> >
> >     llc -march=ptx32 -mattr=+ptx22 fmax.ll
> >
> > The option -mattr does not seem to have any effect.
> > I also tried with the ptx_device qualifier, with the same outcome.
> > I am using llvm from the svn repository.
> >
> > Bye,
> >
> > Alberto

--
Thanks,
Justin Holewinski
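
As a small illustration of the signed/unsigned point in the reply above, the sketch below compares the same two 32-bit operands under both interpretations. It is plain C++ standing in for the backend behaviour, not code from the patch; the names max_signed, max_unsigned, and check_example are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Signed interpretation: compare the operands as two's-complement integers.
std::int32_t max_signed(std::int32_t a, std::int32_t b) {
    return a > b ? a : b;
}

// Unsigned interpretation of the same operands. Converting to
// std::uint32_t is well defined (modulo 2^32), so -1 becomes 0xFFFFFFFF.
std::uint32_t max_unsigned(std::int32_t a, std::int32_t b) {
    const std::uint32_t ua = static_cast<std::uint32_t>(a);
    const std::uint32_t ub = static_cast<std::uint32_t>(b);
    return ua > ub ? ua : ub;
}

// Hypothetical self-check: the operands -1 and 1 give different maxima
// depending on how their bits are interpreted.
void check_example() {
    assert(max_signed(-1, 1) == 1);             // -1 < 1 when signed
    assert(max_unsigned(-1, 1) == 0xFFFFFFFFu); // 0xFFFFFFFF > 1 when unsigned
}
```

Because an LLVM i32 carries no signedness, a single max intrinsic could not tell which of these two results is wanted, which is the motivation for the separate signed and unsigned variants proposed above.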

On Mon, Nov 21, 2011 at 3:36 PM, Justin Holewinski <justin.holewinski at gmail.com> wrote:

> On Mon, Nov 21, 2011 at 7:01 AM, Alberto Magni <alberto.magni86 at gmail.com>
> wrote:
>>
>> Hi Justin,
>>
>> attached you find the patch for the integer max instruction.
>> The multiclass PTX_INTRINSIC_INT3 in file PTXIntrinsicInstrInfo.td
>> is almost an exact copy of PTX_INT3 in PTXInstrInfo.td, maybe
>> a modification of this class can be defined in a separate file.
>
> I'm copying llvmdev. We should keep discussions like this on the list for
> the benefit of others.

I always forget "Reply to All".

> We can probably factor out a generic description, or even just use the
> PTX_INT3 multiclass directly. The PTXIntrinsicInstrInfo.td file is included
> by PTXInstrInfo.td, so anything defined in PTXInstrInfo.td is available in
> PTXIntrinsicInstrInfo.td.

I agree with you, but my class PTX_INTRINSIC_INT3 works with an Intrinsic, not with an SDNode as PTX_INT3 does. PTX_INTRINSIC_INT3 also requires the type of the immediate to be present in the pattern, e.g. (i32 imm:$b).

>> Do you agree with this approach ?
>> Also, do you think that a class like PTX_INTRINSIC_INT3_SIGNED
>> (a clone of PTX_INT3_SIGNED) is required ?
>
> Yes, I believe we should split these into signed and unsigned variants. The
> results of max/min operations can definitely be different depending on
> whether the operands are signed or unsigned. Since this information is not
> encoded in LLVM types, we may want to create two versions for each integer
> type; something like:
>
>     i32 @llvm.ptx.max.signed.i32(i32, i32)
>     i32 @llvm.ptx.max.unsigned.i32(i32, i32)

Yes, this is the only way.

> Otherwise, the patch looks good.