thr3ads.net - llvm dev - [LLVMdev] PTX builtin functions. [Nov 2011]

Justin Holewinski

2011-Nov-21 17:31 UTC

[LLVMdev] PTX builtin functions.

On Mon, Nov 21, 2011 at 11:45 AM, Alberto Magni
<alberto.magni86 at gmail.com> wrote:
> On Mon, Nov 21, 2011 at 3:36 PM, Justin Holewinski
> <justin.holewinski at gmail.com> wrote:
> > On Mon, Nov 21, 2011 at 7:01 AM, Alberto Magni
> > <alberto.magni86 at gmail.com> wrote:
> >>
> >> Hi Justin,
> >>
> >> attached you find the patch for the integer max instruction.
> >> The multiclass PTX_INTRINSIC_INT3 in file PTXIntrinsicInstrInfo.td
> >> is almost an exact copy of  PTX_INT3 in PTXInstrInfo.td, maybe
> >> a modification of this class can be defined in a separate file.
> >
> >
> > I'm copying llvmdev.  We should keep discussions like this on the list
> > for the benefit of others.
>
> I always forget "Reply to All".
>
> > We can probably factor out a generic description, or even just use the
> > PTX_INT3 multiclass directly.  The PTXIntrinsicInstrInfo.td file is
> > included by PTXInstrInfo.td, so anything defined in PTXInstrInfo.td is
> > available in PTXIntrinsicInstrInfo.td.
>
> I agree with you, but my class PTX_INTRINSIC_INT3 works with an Intrinsic
> rather than an SDNode, as PTX_INT3 does.
> PTX_INTRINSIC_INT3 also requires the presence of the type of
> the immediate in the pattern, e.g. (i32 imm:$b).
>
Alright, I'm fine with that.

>
> >>
> >>
> >> Do you agree with this approach ?
> >> Also, do you think that a class like PTX_INTRINSIC_INT3_SIGNED
> >> (a clone of PTX_INT3_SIGNED) is required ?
> >
> >
> > Yes, I believe we should split these into signed and unsigned variants.
> > The results of max/min operations can definitely be different depending on
> > whether the operands are signed or unsigned.  Since this information is not
> > encoded in LLVM types, we may want to create two versions for each integer
> > type; something like:
> >
> > i32 @llvm.ptx.max.signed.i32(i32, i32)
> > i32 @llvm.ptx.max.unsigned.i32(i32, i32)
>
> Yes, this is the only way.
>
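To make the signed/unsigned point concrete, here is a minimal C++ sketch. It is
not part of the patch, and the names maxSigned and maxUnsigned are made up for
illustration only. The same 32-bit pattern 0xFFFFFFFF reads as -1 when signed
and as 4294967295 when unsigned, so the two flavours of max pick different
operands:

```cpp
#include <cstdint>

// Hypothetical stand-ins for the two proposed intrinsic flavours; these are
// illustrative names, not functions from LLVM or the PTX back-end.
constexpr std::int32_t maxSigned(std::int32_t a, std::int32_t b) {
  return a > b ? a : b;
}

constexpr std::uint32_t maxUnsigned(std::uint32_t a, std::uint32_t b) {
  return a > b ? a : b;
}

// Operands: bit patterns 0xFFFFFFFF and 0x00000001.
// Signed view:   max(-1, 1)          picks 1.
// Unsigned view: max(4294967295, 1)  picks 4294967295.
static_assert(maxSigned(-1, 1) == 1, "signed max picks 0x00000001");
static_assert(maxUnsigned(static_cast<std::uint32_t>(-1), 1u) == 0xFFFFFFFFu,
              "unsigned max picks 0xFFFFFFFF");
```

Since an i32 in LLVM IR carries no signedness, the choice has to live in the
intrinsic name (or in the instruction selected), which is why two variants are
proposed above.
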
A couple more comments:

   1. Please make sure to set TargetPrefix="ptx" for the intrinsics
   (probably best in the multiclass, see PTXReadSpecialRegisterIntrinsic_r32)
   2. I'm not sure how to define a GCCBuiltin for an intrinsic that can
   take multiple types, but it's probably worth looking into so we can
   expose this intrinsic to Clang.


>
> >
> > Otherwise, the patch looks good.
> >
> >>
> >>
> >> Thanks,
> >>
> >> Alberto
> >>
> >> On Wed, Nov 16, 2011 at 5:44 PM, Alberto Magni
> >> <alberto.magni86 at gmail.com> wrote:
> >> > On Wed, Nov 16, 2011 at 2:17 PM, Justin Holewinski
> >> > <justin.holewinski at gmail.com> wrote:
> >> >> On Wed, Nov 16, 2011 at 9:16 AM, Justin Holewinski
> >> >> <justin.holewinski at gmail.com> wrote:
> >> >>>
> >> >>> On Wed, Nov 16, 2011 at 8:05 AM, Alberto Magni
> >> >>> <alberto.magni86 at gmail.com>
> >> >>> wrote:
> >> >>>>
> >> >>>> Dear Justin,
> >> >>>>
> >> >>>> I am trying to add support for some OpenCL builtin functions to
> >> >>>> the PTX backend.
> >> >>>> The attached file represents the first stub of a patch for the fmax
> >> >>>> builtin function.
> >> >>>
> >> >>> First off, thanks for helping to improve the PTX back-end!
> >> >>> There are really two main issues here.  First, OpenCL built-in
> >> >>> functions do not belong in the PTX back-end.  These will be
> >> >>> implemented in the libclc library
> >> >>> (http://www.pcc.me.uk/~peter/libclc).  The back-end will only
> >> >>> implement PTX intrinsics, which may be used by the OpenCL built-in
> >> >>> functions in libclc.  However, this particular function (max)
> >> >>> corresponds to a PTX instruction, so it makes sense to implement it
> >> >>> as an intrinsic in the back-end.
> >> >>> Second, intrinsic functions require a bit more work.  You're off to a
> >> >>> great start, but intrinsics are implemented a bit differently.  It
> >> >>> looks like LLVM does not have a max intrinsic, so we'll need to
> >> >>> create one.  Have a look at include/llvm/IntrinsicsPTX.td.  This file
> >> >>> defines the PTX-specific intrinsics.  You can add an intrinsic for
> >> >>> max here, and then implement a pattern-match in the PTXInstrInfo.td
> >> >>> file.  There is no need to create a new SDNode type for intrinsics,
> >> >>> unless they require some special handling in the C++ code, which I
> >> >>> do not see being the case here.
> >> >>
> >> >> Sorry, there's a typo here.  The intrinsic pattern matching goes in
> >> >> PTXIntrinsicInstrInfo.td.
> >> >>
> >> >
> >> > Thank you for the pointers.  I will let you know when I have the first
> >> > patch.
> >> >
> >> >>>
> >> >>> When you define a new intrinsic, use the following template as a
> >> >>> name: int_ptx_max.  This will define the LLVM intrinsic as
> >> >>> @llvm.ptx.max().  Please follow the same convention when naming the
> >> >>> __builtin_* function.
> >> >>>
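The naming convention quoted above (int_ptx_max becoming @llvm.ptx.max) can be
sketched in a few lines of C++. This only illustrates the mapping as described
in this thread; defaultIntrinsicName is a hypothetical helper, not an LLVM or
TableGen API:

```cpp
#include <cassert>
#include <string>

// Illustrative helper (hypothetical name): derive the IR-level name from a
// TableGen intrinsic record name by swapping the "int_" prefix for "llvm."
// and turning the remaining underscores into dots.
std::string defaultIntrinsicName(const std::string &recordName) {
  const std::string prefix = "int_";
  if (recordName.compare(0, prefix.size(), prefix) != 0)
    return recordName; // not an intrinsic-style record name; leave untouched
  std::string name = "llvm." + recordName.substr(prefix.size());
  for (char &c : name)
    if (c == '_')
      c = '.';
  return name;
}

// Usage example for the record name mentioned in the thread.
void checkNamingExamples() {
  assert(defaultIntrinsicName("int_ptx_max") == "llvm.ptx.max");
}
```

One consequence of this mapping is that underscores in a record name always
become dots, so a name such as int_ptx_max_signed would correspond to
llvm.ptx.max.signed.
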
> >> >>>>
> >> >>>> The test case I am trying is the following:
> >> >>>>
> >> >>>> define ptx_device float @f(float %x, float %y) {
> >> >>>> entry:
> >> >>>>  %z = call float @fmax(float %x, float %y)
> >> >>>>  ret float %z
> >> >>>> }
> >> >>>>
> >> >>>> declare float @fmax(float, float)
> >> >>>>
> >> >>>> But at the moment llc crashes saying that "calls are not supported";
> >> >>>> this does not happen with llvm builtins like llvm.sqrt.f32
> >> >>>
> >> >>> Which version of LLVM are you using?  Calls to PTX device functions
> >> >>> have been implemented for a little while now, so I'm surprised to see
> >> >>> that error.  Perhaps it's because the fmax function is not defined as
> >> >>> ptx_device.
> >> >>>
> >> >
> >> > This is the testcase that I am using to verify that the max builtin
> >> > function I am implementing is actually recognised. I took inspiration
> >> > from the llvm-intrinsic.ll test case.
> >> > The command I am using to compile is:
> >> >
> >> > llc -march=ptx32 -mattr=+ptx22 fmax.ll
> >> >
> >> > The option -mattr does not seem to have any effect.
> >> > I also tried with the ptx_device qualifier, with the same outcome.
> >> > I am using llvm from the svn repository.
> >> >
> >> > Bye,
> >> >
> >> > Alberto
> >> >
> >> >>>>
> >> >>>> Can you please give me a hint on what I am missing, or some general
> >> >>>> advice on how to add builtin functions?
> >> >>>>
> >> >>>> Thank you in advance,
> >> >>>>
> >> >>>> Alberto.
> >> >>>>
> >> >>>> _______________________________________________
> >> >>>> LLVM Developers mailing list
> >> >>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> >> >>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >> >>>>
> >> >>>
> >> >>>
> >> >>>
> >> >>> --
> >> >>>
> >> >>> Thanks,
> >> >>> Justin Holewinski
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >>
> >> >> Thanks,
> >> >> Justin Holewinski
> >> >>
> >
> >
> >
> >
> > --
> >
> > Thanks,
> >
> > Justin Holewinski
> >
>


-- 

Thanks,

Justin Holewinski

Alberto Magni

2011-Nov-22 16:40 UTC

[LLVMdev] PTX builtin functions.

On Mon, Nov 21, 2011 at 5:31 PM, Justin Holewinski
<justin.holewinski at gmail.com> wrote:
> On Mon, Nov 21, 2011 at 11:45 AM, Alberto Magni
> <alberto.magni86 at gmail.com> wrote:
>>
>> On Mon, Nov 21, 2011 at 3:36 PM, Justin Holewinski
>> <justin.holewinski at gmail.com> wrote:
>> > On Mon, Nov 21, 2011 at 7:01 AM, Alberto Magni
>> > <alberto.magni86 at gmail.com>
>> > wrote:
>> >>
>> >> Hi Justin,
>> >>
>> >> attached you find the patch for the integer max instruction.
>> >> The multiclass PTX_INTRINSIC_INT3 in file PTXIntrinsicInstrInfo.td
>> >> is almost an exact copy of  PTX_INT3 in PTXInstrInfo.td, maybe
>> >> a modification of this class can be defined in a separate file.
>> >
>> >
>> > I'm copying llvmdev.  We should keep discussions like this on the list
>> > for the benefit of others.
>>
>> I always forget "Reply to All".
>>
>> > We can probably factor out a generic description, or even just use the
>> > PTX_INT3 multiclass directly.  The PTXIntrinsicInstrInfo.td file is
>> > included by PTXInstrInfo.td, so anything defined in PTXInstrInfo.td is
>> > available in PTXIntrinsicInstrInfo.td.
>>
>> I agree with you, but my class PTX_INTRINSIC_INT3 works with an Intrinsic
>> rather than an SDNode, as PTX_INT3 does.
>> PTX_INTRINSIC_INT3 also requires the presence of the type of
>> the immediate in the pattern, e.g. (i32 imm:$b).
>
>
> Alright, I'm fine with that.
>
>>
>>
>> >>
>> >>
>> >> Do you agree with this approach ?
>> >> Also, do you think that a class like PTX_INTRINSIC_INT3_SIGNED
>> >> (a clone of PTX_INT3_SIGNED) is required ?
>> >
>> >
>> > Yes, I believe we should split these into signed and unsigned variants.
>> > The results of max/min operations can definitely be different depending on
>> > whether the operands are signed or unsigned.  Since this information is not
>> > encoded in LLVM types, we may want to create two versions for each integer
>> > type; something like:
>> >
>> > i32 @llvm.ptx.max.signed.i32(i32, i32)
>> > i32 @llvm.ptx.max.unsigned.i32(i32, i32)
>>
>> Yes, this is the only way.
>
>
> A couple more comments:
>
> Please make sure to set TargetPrefix="ptx" for the intrinsics (probably best
> in the multiclass, see PTXReadSpecialRegisterIntrinsic_r32)

Ok

> I'm not sure how to define a GCCBuiltin for an intrinsic that can take
> multiple types, but it's probably worth looking into so we can expose this
> intrinsic to Clang.

This could be an issue. I looked for something similar in other backends
and found no previous examples. It may be worth asking on the ML
explicitly about this.
The only fallback that I see is to explicitly define every intrinsic
for every data type, but this would prevent the use of the multiclass
for the definition of the patterns.


Bye.

Villmow, Micah

2011-Nov-22 17:01 UTC

[LLVMdev] PTX builtin functions.

Alberto,
 The AMDIL backend solves your problem with intrinsic overloading this way:
def int_AMDIL_mad     : GCCBuiltin<"__amdil_mad">, TernaryIntFloat;

Where TernaryIntFloat is defined as:
class TernaryIntFloat :
          Intrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>,
          LLVMMatchType<0>, LLVMMatchType<0>], []>;

This allows us to write a multi-def for int_AMDIL_mad like so:
defm MAD  : TernaryIntrinsicFloat<IL_OP_MAD, int_AMDIL_mad>;

Where TernaryIntrinsicFloat is defined as:
multiclass TernaryIntrinsicFloat<ILOpCode opcode, Intrinsic intr>
{
  def _f32 : ThreeInOneOut<opcode, (outs GPRF32:$dst),
      (ins GPRF32:$src, GPRF32:$src2, GPRF32:$src3),
      !strconcat(opcode.Text, " $dst, $src, $src2, $src3"),
      [(set GPRF32:$dst,
          (intr GPRF32:$src, GPRF32:$src2, GPRF32:$src3))]>;
  def _v2f32 : ThreeInOneOut<opcode, (outs GPRV2F32:$dst),
      (ins GPRV2F32:$src, GPRV2F32:$src2, GPRV2F32:$src3),
      !strconcat(opcode.Text, " $dst, $src, $src2, $src3"),
      [(set GPRV2F32:$dst,
          (intr GPRV2F32:$src, GPRV2F32:$src2, GPRV2F32:$src3))]>;
...
}

Now, this doesn't completely work, because LLVM does not allow overloading
of intrinsic values, so a little coding is needed in the *IntrinsicInfo
class.
AMD always encodes builtin names as __amdil_mad_f32, __amdil_mad_v2f32,
__amdil_mad_v4f32, etc.
So in the function "*IntrinsicInfo::lookup_name", when attempting to
find out which intrinsic the function maps to, the AMDIL backend strips off
the type and then looks up just '__amdil_mad'.

This is how you can do intrinsic overloading in LLVM.
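
As a rough illustration of the name-stripping step described above, here is a
small self-contained C++ sketch. It is not the AMDIL code: stripTypeSuffix and
isTypeSuffix are hypothetical helpers, and the suffix grammar (an optional
"v<N>" vector prefix, then 'f' or 'i', then a bit width) is an assumption
based only on the example names in this message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical check: does `s` look like a type suffix such as "f32",
// "v2f32" or "i64"?  Assumed grammar: ["v" digits] ("f" | "i") digits.
static bool isTypeSuffix(const std::string &s) {
  std::size_t i = 0;
  if (i < s.size() && s[i] == 'v') {
    ++i;
    const std::size_t start = i;
    while (i < s.size() && s[i] >= '0' && s[i] <= '9')
      ++i;
    if (i == start)
      return false; // "v" must be followed by an element count
  }
  if (i >= s.size() || (s[i] != 'f' && s[i] != 'i'))
    return false;
  ++i;
  const std::size_t start = i;
  while (i < s.size() && s[i] >= '0' && s[i] <= '9')
    ++i;
  return i > start && i == s.size(); // needs a bit width and nothing after it
}

// Hypothetical helper: drop a trailing "_<type>" so that every typed builtin
// name maps to the single overloaded base name used for the lookup.
std::string stripTypeSuffix(const std::string &name) {
  const std::size_t pos = name.rfind('_');
  if (pos == std::string::npos || pos == 0)
    return name;
  if (!isTypeSuffix(name.substr(pos + 1)))
    return name; // last component is not a type, e.g. "mad"
  return name.substr(0, pos);
}

// Usage example with the builtin names mentioned above.
void checkSuffixExamples() {
  assert(stripTypeSuffix("__amdil_mad_f32") == "__amdil_mad");
  assert(stripTypeSuffix("__amdil_mad_v2f32") == "__amdil_mad");
  assert(stripTypeSuffix("__amdil_mad") == "__amdil_mad");
}
```

A real lookup_name implementation would then map the stripped base name to the
intrinsic ID; that part is omitted here because it depends on backend tables
not shown in this thread.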

Hope this helps,
Micah
