On Mon, Nov 21, 2011 at 11:45 AM, Alberto Magni <alberto.magni86 at gmail.com> wrote:

> On Mon, Nov 21, 2011 at 3:36 PM, Justin Holewinski
> <justin.holewinski at gmail.com> wrote:
> > On Mon, Nov 21, 2011 at 7:01 AM, Alberto Magni
> > <alberto.magni86 at gmail.com> wrote:
> >>
> >> Hi Justin,
> >>
> >> attached you find the patch for the integer max instruction.
> >> The multiclass PTX_INTRINSIC_INT3 in file PTXIntrinsicInstrInfo.td
> >> is almost an exact copy of PTX_INT3 in PTXInstrInfo.td, maybe
> >> a modification of this class can be defined in a separate file.
> >
> > I'm copying llvmdev. We should keep discussions like this on the list
> > for the benefit of others.
>
> I always forget "Reply to All".
>
> > We can probably factor out a generic description, or even just use the
> > PTX_INT3 multiclass directly. The PTXIntrinsicInstrInfo.td file is
> > included by PTXInstrInfo.td, so anything defined in PTXInstrInfo.td is
> > available in PTXIntrinsicInstrInfo.td.
>
> I agree with you but my class PTX_INTRINSIC_INT3 works with an Intrinsic
> and not with a SDNode, like PTX_INT3.
> PTX_INTRINSIC_INT3 also requires the presence of the type of
> the immediate in the pattern, e.g. (i32 imm:$b).

Alright, I'm fine with that.

> >> Do you agree with this approach ?
> >> Also, do you think that a class like PTX_INTRINSIC_INT3_SIGNED
> >> (a clone of PTX_INT3_SIGNED) is required ?
> >
> > Yes, I believe we should split these into signed and unsigned variants.
> > The results of max/min operations can definitely be different depending
> > on whether the operands are signed or unsigned. Since this information
> > is not encoded in LLVM types, we may want to create two versions for
> > each integer type; something like:
> >
> > i32 @llvm.ptx.max.signed.i32(i32, i32)
> > i32 @llvm.ptx.max.unsigned.i32(i32, i32)
>
> Yes, this is the only way.

A couple more comments:

1. Please make sure to set TargetPrefix="ptx" for the intrinsics (probably
   best in the multiclass, see PTXReadSpecialRegisterIntrinsic_r32).
2. I'm not sure how to define a GCCBuiltin for an intrinsic that can take
   multiple types, but it's probably worth looking into so we can expose
   this intrinsic to Clang.

> > Otherwise, the patch looks good.
> >
> >> Thanks,
> >>
> >> Alberto
> >>
> >> On Wed, Nov 16, 2011 at 5:44 PM, Alberto Magni
> >> <alberto.magni86 at gmail.com> wrote:
> >> > On Wed, Nov 16, 2011 at 2:17 PM, Justin Holewinski
> >> > <justin.holewinski at gmail.com> wrote:
> >> >> On Wed, Nov 16, 2011 at 9:16 AM, Justin Holewinski
> >> >> <justin.holewinski at gmail.com> wrote:
> >> >>> On Wed, Nov 16, 2011 at 8:05 AM, Alberto Magni
> >> >>> <alberto.magni86 at gmail.com> wrote:
> >> >>>>
> >> >>>> Dear Justin,
> >> >>>>
> >> >>>> I am trying to add the support for some OpenCL builtin functions
> >> >>>> to the PTX backend.
> >> >>>> The attached file represents the first stub of a patch for the
> >> >>>> fmax builtin function.
> >> >>>
> >> >>> First off, thanks for helping to improve the PTX back-end!
> >> >>>
> >> >>> There are really two main issues here. First, OpenCL built-in
> >> >>> functions do not belong in the PTX back-end. These will be
> >> >>> implemented in the libclc library
> >> >>> (http://www.pcc.me.uk/~peter/libclc). The back-end will only
> >> >>> implement PTX intrinsics, which may be used by the OpenCL built-in
> >> >>> functions in libclc. However, this particular function (max)
> >> >>> corresponds to a PTX instruction, so it makes sense to implement
> >> >>> it as an intrinsic in the back-end.
> >> >>>
> >> >>> Second, intrinsic functions require a bit more work. You're off to
> >> >>> a great start, but intrinsics are implemented a bit differently.
> >> >>> It looks like LLVM does not have a max intrinsic, so we'll need to
> >> >>> create one. Have a look at include/llvm/IntrinsicsPTX.td. This
> >> >>> file defines the PTX-specific intrinsics. You can add an intrinsic
> >> >>> for max here, and then implement a pattern-match in the
> >> >>> PTXInstrInfo.td file. There is no need to create a new SDNode type
> >> >>> for intrinsics, unless they require some special handling in the
> >> >>> C++ code, which I do not see being the case here.
> >> >>
> >> >> Sorry, there's a typo here. The intrinsic pattern matching goes in
> >> >> PTXIntrinsicInstrInfo.td.
> >> >
> >> > Thank you for the pointers, I will let you know when I have the
> >> > first patch.
> >> >
> >> >>> When you define a new intrinsic, use the following template as a
> >> >>> name: int_ptx_max. This will define the LLVM intrinsic as
> >> >>> @llvm.ptx.max(). Please follow the same convention when naming the
> >> >>> __builtin_* function.
> >> >>>
> >> >>>> The test case I am trying is the following:
> >> >>>>
> >> >>>> define ptx_device float @f(float %x, float %y) {
> >> >>>> entry:
> >> >>>>   %z = call float @fmax(float %x, float %y)
> >> >>>>   ret float %z
> >> >>>> }
> >> >>>>
> >> >>>> declare float @fmax(float, float)
> >> >>>>
> >> >>>> But at the moment llc crashes saying that "calls are not
> >> >>>> supported"; this does not happen with llvm builtins like
> >> >>>> llvm.sqrt.f32
> >> >>>
> >> >>> Which version of LLVM are you using? Calls to PTX device functions
> >> >>> have been implemented for a little while now, so I'm surprised to
> >> >>> see that error. Perhaps it's because the fmax function is not
> >> >>> defined as ptx_device.
> >> >
> >> > This is the testcase that I am using to verify that the max builtin
> >> > function I am implementing is actually recognised. I took
> >> > inspiration from the llvm-intrinsic.ll test case.
> >> > The command I am using to compile is:
> >> >
> >> > llc -march=ptx32 -mattr=+ptx22 fmax.ll
> >> >
> >> > The option -mattr does not seem to have any effect.
> >> > I tried also with the ptx_device qualifier with the same outcome.
> >> > I am using llvm from the svn repository.
> >> >
> >> > Bye,
> >> >
> >> > Alberto
> >> >
> >> >>>> Can you please give me a hint on what I am missing, or some
> >> >>>> general advice on how to add builtin functions.
> >> >>>>
> >> >>>> Thank you in advance,
> >> >>>>
> >> >>>> Alberto.
> >> >>>>
> >> >>>> _______________________________________________
> >> >>>> LLVM Developers mailing list
> >> >>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> >> >>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >> >>>
> >> >>> --
> >> >>>
> >> >>> Thanks,
> >> >>> Justin Holewinski

--

Thanks,

Justin Holewinski
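The point above about splitting max into signed and unsigned variants can be illustrated outside of LLVM. The following is only a minimal C++ sketch of the semantics, not code from the patch or the back-end: `maxSigned`, `maxUnsigned`, and `checkSignednessExample` are hypothetical names, and the 32-bit operands are passed as raw bit patterns to mirror the fact that an LLVM `i32` carries no signedness.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: treat both 32-bit patterns as two's-complement
// signed integers, take the maximum, and return the winning bit pattern.
// (The unsigned-to-signed conversion is modular since C++20 and behaves
// the same way on mainstream compilers before that.)
constexpr std::uint32_t maxSigned(std::uint32_t a, std::uint32_t b) {
  const auto sa = static_cast<std::int32_t>(a);
  const auto sb = static_cast<std::int32_t>(b);
  return static_cast<std::uint32_t>(std::max(sa, sb));
}

// Hypothetical helper: compare the same bit patterns as unsigned integers.
constexpr std::uint32_t maxUnsigned(std::uint32_t a, std::uint32_t b) {
  return std::max(a, b);
}

// The same pair of operands yields different results depending on the
// interpretation: 0xFFFFFFFF is -1 when signed, 4294967295 when unsigned.
inline void checkSignednessExample() {
  assert(maxSigned(0xFFFFFFFFu, 1u) == 1u);
  assert(maxUnsigned(0xFFFFFFFFu, 1u) == 0xFFFFFFFFu);
}
```

Because the two interpretations disagree on identical inputs, a single type-only intrinsic name cannot select the right instruction, which is why the signedness has to appear in the intrinsic name itself.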
On Mon, Nov 21, 2011 at 5:31 PM, Justin Holewinski <justin.holewinski at gmail.com> wrote:

> A couple more comments:
>
> Please make sure to set TargetPrefix="ptx" for the intrinsics (probably
> best in the multiclass, see PTXReadSpecialRegisterIntrinsic_r32)

Ok

> I'm not sure how to define a GCCBuiltin for an intrinsic that can take
> multiple types, but it's probably worth looking into so we can expose
> this intrinsic to Clang.

This could be an issue. I looked for something similar in other backends
and I found no previous examples. It may be worth asking on the ML
explicitly about this.
The only fallback that I see is to define every intrinsic explicitly
for every data type, but this would prevent the usage of the multiclass
for the definition of the patterns.

Bye.
Alberto,

The AMDIL backend solves your problem with intrinsic overloading this way:

    def int_AMDIL_mad : GCCBuiltin<"__amdil_mad">, TernaryIntFloat;

Where TernaryIntFloat is defined as:

    class TernaryIntFloat :
        Intrinsic<[llvm_anyfloat_ty],
                  [LLVMMatchType<0>, LLVMMatchType<0>, LLVMMatchType<0>],
                  []>;

This allows us to write a multi-def for int_AMDIL_mad like so:

    defm MAD : TernaryIntrinsicFloat<IL_OP_MAD, int_AMDIL_mad>;

Where TernaryIntrinsicFloat is defined as:

    multiclass TernaryIntrinsicFloat<ILOpCode opcode, Intrinsic intr> {
      def _f32 : ThreeInOneOut<opcode, (outs GPRF32:$dst),
          (ins GPRF32:$src, GPRF32:$src2, GPRF32:$src3),
          !strconcat(opcode.Text, " $dst, $src, $src2, $src3"),
          [(set GPRF32:$dst,
              (intr GPRF32:$src, GPRF32:$src2, GPRF32:$src3))]>;
      def _v2f32 : ThreeInOneOut<opcode, (outs GPRV2F32:$dst),
          (ins GPRV2F32:$src, GPRV2F32:$src2, GPRV2F32:$src3),
          !strconcat(opcode.Text, " $dst, $src, $src2, $src3"),
          [(set GPRV2F32:$dst,
              (intr GPRV2F32:$src, GPRV2F32:$src2, GPRV2F32:$src3))]>;
      ...
    }

Now, this doesn't completely work, because LLVM does not allow overloading
of intrinsic values, so there needs to be a little coding in the
*IntrinsicInfo class. AMD always encodes builtin names as __amdil_mad_f32,
__amdil_mad_v2f32, __amdil_mad_v4f32, etc. So in the function
"*IntrinsicInfo::lookup_name", when attempting to find out what intrinsic
the function maps to, the AMDIL backend strips off the type and then looks
up just '__amdil_mad'.

This is how you can do intrinsic overloading in LLVM.

Hope this helps,
Micah

> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of Alberto Magni
> Sent: Tuesday, November 22, 2011 8:41 AM
> To: Justin Holewinski
> Cc: LLVM Developers Mailing List
> Subject: Re: [LLVMdev] PTX builtin functions.
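The name-stripping step that Micah describes for "*IntrinsicInfo::lookup_name" might look roughly like the C++ sketch below. This is an illustration under stated assumptions, not the AMDIL backend's code: `stripTypeSuffix` and `kTypeSuffixes` are hypothetical names, and the suffix list contains only the three type encodings mentioned in the message.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical suffix table: only the encodings named in the message
// (_f32, _v2f32, _v4f32). A real backend may recognise more.
constexpr std::array<std::string_view, 3> kTypeSuffixes = {
    "_v2f32", "_v4f32", "_f32"};

// Hypothetical helper: drop a trailing type suffix so that, for example,
// "__amdil_mad_v2f32" and "__amdil_mad_f32" both map to "__amdil_mad".
// Names without a known suffix are returned unchanged.
inline std::string stripTypeSuffix(std::string_view name) {
  for (std::string_view suffix : kTypeSuffixes) {
    const bool longEnough = name.size() > suffix.size();
    if (longEnough &&
        name.substr(name.size() - suffix.size()) == suffix) {
      return std::string(name.substr(0, name.size() - suffix.size()));
    }
  }
  return std::string(name);
}
```

After stripping, the base name can be matched against the single builtin name given to GCCBuiltin (here "__amdil_mad"), while the discarded suffix tells the caller which overloaded type was requested.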