thr3ads.net - llvm dev - [LLVMdev] Attaching range metadata to IntrinsicInst [Jun 2014]

If this information is useful, please help other people find it:
Share via:

Nick Lewycky

2014-Jun-17 19:19 UTC

[LLVMdev] Attaching range metadata to IntrinsicInst

On 17 June 2014 06:41, Eli Bendersky <eliben at google.com> wrote:
> On Tue, Jun 17, 2014 at 1:38 AM, Nick Lewycky <nicholas at mxc.ca>
wrote:
>
>> Chandler Carruth wrote:
>>
>>> This seems fine to me, but I'd like to make sure it looks OK to
Nick as
>>> well.
>>>
>>
>> I strongly prefer baking in knowledge about the intrinsics themselves
>> into the passes if possible. Metadata will always be secondary.
>>
>
> So you're saying that in this particular case you'd prefer LLVM
passes to
> know about the range of these PTX intrinsics, rather than Clang adding them
> as metadata?
>
Yep.

ValueTracking.cpp already has some iffy target knowledge (someone sneaked
a> direct  Intrinsic::x86_sse42_crc32_64_64 check in there), but extending it
> to other intrinsics in other targets seems like too much...
>
That's not iffy. That's exactly how it should work, and we should have
more
of that. There is a major gotcha and that's dealing with the case where the
intrinsics don't exist because the backend wasn't compiled in. If
x86_sse42_crc32_64_64 is in there (and also in instcombine btw), presumably
that problem is solved somehow? Or does llvm actually not build if you
don't enable the x86 target? I feel like we would've heard about that.

Nick

So should target info be passed into it in some way? Any suggestions
where> to put it? TargetLibraryInfo? TargetTransformInfo? In any case this seems
> like the target interface will have to be augmented, and we'll have to
> carry an object around into ValueTracking's compute* functions. If this
is
> the right way, then this is the way it will be done - design ideas are
> appreciated.
>
> Eli
>
>
>
>
>>
>> Separately, should value tracking look use range metadata when it's
>> available? Absolutely.
>>
>> I think it should apply to all CallInst not just IntrinsicInst (which
is
>> derived from CallInst).
>>
>> Nick
>>
>>  On Tue, Jun 17, 2014 at 12:37 AM, Jingyue Wu <jingyue at google.com
>>> <mailto:jingyue at google.com>> wrote:
>>>
>>>     Hi,
>>>
>>>     The range metadata can only be attached to LoadInst for now. I
am
>>>     considering extending its usage to IntrinsicInst so that the
>>>     frontend can annotate the range of the return value of an
intrinsic
>>>     call. e.g.,
>>>     %a = call i32 @llvm.xxx(), !range !0
>>>     !0 = metadata !{ i32 0, i23 1024 }
>>>
>>>     The motivation behind this extension is some optimizations we
are
>>>     working on for CUDA programs. Some special registers in CUDA
(e.g.,
>>>     threadIdx.x) are bounded per CUDA programming guide, and
knowing
>>>     their ranges can improve the precision of ValueTracking and
benefit
>>>     optimizations such as InstCombine.
>>>
>>>     To implement this idea, we need ValueTracking to be aware of
the
>>>     ranges of these special variables. These special registers are
so
>>>     far read-only and accessed using intrinsics. e.g.,
>>>     %threadIdx.x = call i32 @llvm.nvvm.read.ptx.sreg.tid.x().
>>>
>>>     One possible approach is to have ValueTracking compute the
known
>>>     bits of these intrinsics as special cases. This approach is
already
>>>     taken for the x86_sse42_crc32_64_64 intrinsic. However, this
>>>     approach may not be elegant because the ranges of these CUDA
special
>>>     registers depend on the GPU compute capability specified by
>>>     -target-cpu. For instance, blockIdx.x is bounded by 65535 in
sm_20
>>>     but 2^31-1 in sm_30. Exposing -target-cpu to ValueTracking is
>>>     probably discouraged.
>>>
>>>     Therefore, the approach I am considering is to have clang
annotate
>>>     the ranges of these CUDA special registers according to the
>>>     -target-cpu flag, and have ValueTracking pick the range
metadata for
>>>     optimization. By doing so, we hide the target-specific info
from
>>>     ValueTracking.
>>>
>>>     The code change in llvm minus clang won't be large. The
core change
>>>     is only a few lines:
>>>     http://reviews.llvm.org/differential/diff/10464/. If this
extension
>>>     sounds good to you, I'll definitely add more tests and
revise the
>>>     documents on range metadata.
>>>
>>>     Best,
>>>     Jingyue
>>>
>>>     _______________________________________________
>>>     LLVM Developers mailing list
>>>     LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>
>>> http://llvm.cs.uiuc.edu
>>>
>>>     http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140617/b3eeaf74/attachment.html>

Jingyue Wu

2014-Jun-17 21:09 UTC

head link

[LLVMdev] Attaching range metadata to IntrinsicInst

Hi Nick,

That makes sense. I think a main issue here is that the ranges of these PTX
special registers (e.g., threadIdx.x) depend on -target-cpu which is only
visible to clang and llc. Would you mind we specify "target cpu" in
the IR
similar to what we did for "target triple"?

Thanks,
Jingyue


On Tue, Jun 17, 2014 at 12:19 PM, Nick Lewycky <nlewycky at google.com>
wrote:
> On 17 June 2014 06:41, Eli Bendersky <eliben at google.com> wrote:
>
>> On Tue, Jun 17, 2014 at 1:38 AM, Nick Lewycky <nicholas at
mxc.ca> wrote:
>>
>>> Chandler Carruth wrote:
>>>
>>>> This seems fine to me, but I'd like to make sure it looks
OK to Nick as
>>>> well.
>>>>
>>>
>>> I strongly prefer baking in knowledge about the intrinsics
themselves
>>> into the passes if possible. Metadata will always be secondary.
>>>
>>
>> So you're saying that in this particular case you'd prefer LLVM
passes to
>> know about the range of these PTX intrinsics, rather than Clang adding
them
>> as metadata?
>>
>
> Yep.
>
> ValueTracking.cpp already has some iffy target knowledge (someone sneaked
>> a direct  Intrinsic::x86_sse42_crc32_64_64 check in there), but
extending
>> it to other intrinsics in other targets seems like too much...
>>
>
> That's not iffy. That's exactly how it should work, and we should
have
> more of that. There is a major gotcha and that's dealing with the case
> where the intrinsics don't exist because the backend wasn't
compiled in. If
> x86_sse42_crc32_64_64 is in there (and also in instcombine btw), presumably
> that problem is solved somehow? Or does llvm actually not build if you
> don't enable the x86 target? I feel like we would've heard about
that.
>
> Nick
>
> So should target info be passed into it in some way? Any suggestions where
>> to put it? TargetLibraryInfo? TargetTransformInfo? In any case this
seems
>> like the target interface will have to be augmented, and we'll have
to
>> carry an object around into ValueTracking's compute* functions. If
this is
>> the right way, then this is the way it will be done - design ideas are
>> appreciated.
>>
>> Eli
>>
>>
>>
>>
>>>
>>> Separately, should value tracking look use range metadata when
it's
>>> available? Absolutely.
>>>
>>> I think it should apply to all CallInst not just IntrinsicInst
(which is
>>> derived from CallInst).
>>>
>>> Nick
>>>
>>>  On Tue, Jun 17, 2014 at 12:37 AM, Jingyue Wu <jingyue at
google.com
>>>> <mailto:jingyue at google.com>> wrote:
>>>>
>>>>     Hi,
>>>>
>>>>     The range metadata can only be attached to LoadInst for
now. I am
>>>>     considering extending its usage to IntrinsicInst so that
the
>>>>     frontend can annotate the range of the return value of an
intrinsic
>>>>     call. e.g.,
>>>>     %a = call i32 @llvm.xxx(), !range !0
>>>>     !0 = metadata !{ i32 0, i23 1024 }
>>>>
>>>>     The motivation behind this extension is some optimizations
we are
>>>>     working on for CUDA programs. Some special registers in
CUDA (e.g.,
>>>>     threadIdx.x) are bounded per CUDA programming guide, and
knowing
>>>>     their ranges can improve the precision of ValueTracking and
benefit
>>>>     optimizations such as InstCombine.
>>>>
>>>>     To implement this idea, we need ValueTracking to be aware
of the
>>>>     ranges of these special variables. These special registers
are so
>>>>     far read-only and accessed using intrinsics. e.g.,
>>>>     %threadIdx.x = call i32 @llvm.nvvm.read.ptx.sreg.tid.x().
>>>>
>>>>     One possible approach is to have ValueTracking compute the
known
>>>>     bits of these intrinsics as special cases. This approach is
already
>>>>     taken for the x86_sse42_crc32_64_64 intrinsic. However,
this
>>>>     approach may not be elegant because the ranges of these
CUDA special
>>>>     registers depend on the GPU compute capability specified by
>>>>     -target-cpu. For instance, blockIdx.x is bounded by 65535
in sm_20
>>>>     but 2^31-1 in sm_30. Exposing -target-cpu to ValueTracking
is
>>>>     probably discouraged.
>>>>
>>>>     Therefore, the approach I am considering is to have clang
annotate
>>>>     the ranges of these CUDA special registers according to the
>>>>     -target-cpu flag, and have ValueTracking pick the range
metadata for
>>>>     optimization. By doing so, we hide the target-specific info
from
>>>>     ValueTracking.
>>>>
>>>>     The code change in llvm minus clang won't be large. The
core change
>>>>     is only a few lines:
>>>>     http://reviews.llvm.org/differential/diff/10464/. If this
extension
>>>>     sounds good to you, I'll definitely add more tests and
revise the
>>>>     documents on range metadata.
>>>>
>>>>     Best,
>>>>     Jingyue
>>>>
>>>>     _______________________________________________
>>>>     LLVM Developers mailing list
>>>>     LLVMdev at cs.uiuc.edu <mailto:LLVMdev at
cs.uiuc.edu>
>>>> http://llvm.cs.uiuc.edu
>>>>
>>>>     http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140617/5e67ecff/attachment.html>

Eric Christopher

2014-Jun-17 21:22 UTC

head link

[LLVMdev] Attaching range metadata to IntrinsicInst

Eh? How do you envision this?

-eric

On Tue, Jun 17, 2014 at 2:09 PM, Jingyue Wu <jingyue at google.com>
wrote:> Hi Nick,
>
> That makes sense. I think a main issue here is that the ranges of these PTX
> special registers (e.g., threadIdx.x) depend on -target-cpu which is only
> visible to clang and llc. Would you mind we specify "target cpu"
in the IR
> similar to what we did for "target triple"?
>
> Thanks,
> Jingyue
>
>
> On Tue, Jun 17, 2014 at 12:19 PM, Nick Lewycky <nlewycky at
google.com> wrote:
>>
>> On 17 June 2014 06:41, Eli Bendersky <eliben at google.com>
wrote:
>>>
>>> On Tue, Jun 17, 2014 at 1:38 AM, Nick Lewycky <nicholas at
mxc.ca> wrote:
>>>>
>>>> Chandler Carruth wrote:
>>>>>
>>>>> This seems fine to me, but I'd like to make sure it
looks OK to Nick as
>>>>> well.
>>>>
>>>>
>>>> I strongly prefer baking in knowledge about the intrinsics
themselves
>>>> into the passes if possible. Metadata will always be secondary.
>>>
>>>
>>> So you're saying that in this particular case you'd prefer
LLVM passes to
>>> know about the range of these PTX intrinsics, rather than Clang
adding them
>>> as metadata?
>>
>>
>> Yep.
>>
>>> ValueTracking.cpp already has some iffy target knowledge (someone
sneaked
>>> a direct  Intrinsic::x86_sse42_crc32_64_64 check in there), but
extending it
>>> to other intrinsics in other targets seems like too much...
>>
>>
>> That's not iffy. That's exactly how it should work, and we
should have
>> more of that. There is a major gotcha and that's dealing with the
case where
>> the intrinsics don't exist because the backend wasn't compiled
in. If
>> x86_sse42_crc32_64_64 is in there (and also in instcombine btw),
presumably
>> that problem is solved somehow? Or does llvm actually not build if you
don't
>> enable the x86 target? I feel like we would've heard about that.
>>
>> Nick
>>
>>> So should target info be passed into it in some way? Any
suggestions
>>> where to put it? TargetLibraryInfo? TargetTransformInfo? In any
case this
>>> seems like the target interface will have to be augmented, and
we'll have to
>>> carry an object around into ValueTracking's compute* functions.
If this is
>>> the right way, then this is the way it will be done - design ideas
are
>>> appreciated.
>>>
>>> Eli
>>>
>>>
>>>
>>>>
>>>>
>>>> Separately, should value tracking look use range metadata when
it's
>>>> available? Absolutely.
>>>>
>>>> I think it should apply to all CallInst not just IntrinsicInst
(which is
>>>> derived from CallInst).
>>>>
>>>> Nick
>>>>
>>>>> On Tue, Jun 17, 2014 at 12:37 AM, Jingyue Wu <jingyue at
google.com
>>>>> <mailto:jingyue at google.com>> wrote:
>>>>>
>>>>>     Hi,
>>>>>
>>>>>     The range metadata can only be attached to LoadInst for
now. I am
>>>>>     considering extending its usage to IntrinsicInst so
that the
>>>>>     frontend can annotate the range of the return value of
an intrinsic
>>>>>     call. e.g.,
>>>>>     %a = call i32 @llvm.xxx(), !range !0
>>>>>     !0 = metadata !{ i32 0, i23 1024 }
>>>>>
>>>>>     The motivation behind this extension is some
optimizations we are
>>>>>     working on for CUDA programs. Some special registers in
CUDA (e.g.,
>>>>>     threadIdx.x) are bounded per CUDA programming guide,
and knowing
>>>>>     their ranges can improve the precision of ValueTracking
and benefit
>>>>>     optimizations such as InstCombine.
>>>>>
>>>>>     To implement this idea, we need ValueTracking to be
aware of the
>>>>>     ranges of these special variables. These special
registers are so
>>>>>     far read-only and accessed using intrinsics. e.g.,
>>>>>     %threadIdx.x = call i32
@llvm.nvvm.read.ptx.sreg.tid.x().
>>>>>
>>>>>     One possible approach is to have ValueTracking compute
the known
>>>>>     bits of these intrinsics as special cases. This
approach is already
>>>>>     taken for the x86_sse42_crc32_64_64 intrinsic. However,
this
>>>>>     approach may not be elegant because the ranges of these
CUDA
>>>>> special
>>>>>     registers depend on the GPU compute capability
specified by
>>>>>     -target-cpu. For instance, blockIdx.x is bounded by
65535 in sm_20
>>>>>     but 2^31-1 in sm_30. Exposing -target-cpu to
ValueTracking is
>>>>>     probably discouraged.
>>>>>
>>>>>     Therefore, the approach I am considering is to have
clang annotate
>>>>>     the ranges of these CUDA special registers according to
the
>>>>>     -target-cpu flag, and have ValueTracking pick the range
metadata
>>>>> for
>>>>>     optimization. By doing so, we hide the target-specific
info from
>>>>>     ValueTracking.
>>>>>
>>>>>     The code change in llvm minus clang won't be large.
The core change
>>>>>     is only a few lines:
>>>>>     http://reviews.llvm.org/differential/diff/10464/. If
this extension
>>>>>     sounds good to you, I'll definitely add more tests
and revise the
>>>>>     documents on range metadata.
>>>>>
>>>>>     Best,
>>>>>     Jingyue
>>>>>
>>>>>     _______________________________________________
>>>>>     LLVM Developers mailing list
>>>>>     LLVMdev at cs.uiuc.edu <mailto:LLVMdev at
cs.uiuc.edu>
>>>>> http://llvm.cs.uiuc.edu
>>>>>
>>>>>     http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

Nick Lewycky

2014-Jun-17 21:46 UTC

head link

[LLVMdev] Attaching range metadata to IntrinsicInst

On 17 June 2014 14:09, Jingyue Wu <jingyue at google.com> wrote:
> Hi Nick,
>
> That makes sense. I think a main issue here is that the ranges of these
> PTX special registers (e.g., threadIdx.x) depend on -target-cpu which is
> only visible to clang and llc. Would you mind we specify "target
cpu" in
> the IR similar to what we did for "target triple"?
>
Aha, that's the salient point. I'd like to see llvm using what it knows
about intrinsics statically. Something like "popcount" is a great
example.
Needing to know more than the data in the intrinsic, needing to know about
what subarch is being targeted is different. I didn't realize we had such
intrinsics.

If the ranges really can't be deduced from the intrinsics as written -- and
that's enough to make me wonder whether these intrinsics are properly
designed but I won't dart down that rabbit hole now ---- if the ranges
really can't be deduced from the intrinsics alone then you should fall back
to using range metadata as you initially suggested. Sorry for running in a
circle on the design.

As an alternative, I asked Eric in person and his suggestion was to query
TargetTransformInfo for information about the intrinsic. That's also
plausible, it depends on whether you feel like the authoritative
information should be coming from the frontend or from the backend. I could
see this going either way.

Nick

> On Tue, Jun 17, 2014 at 12:19 PM, Nick Lewycky <nlewycky at
google.com>
> wrote:
>
>> On 17 June 2014 06:41, Eli Bendersky <eliben at google.com>
wrote:
>>
>>> On Tue, Jun 17, 2014 at 1:38 AM, Nick Lewycky <nicholas at
mxc.ca> wrote:
>>>
>>>> Chandler Carruth wrote:
>>>>
>>>>> This seems fine to me, but I'd like to make sure it
looks OK to Nick as
>>>>> well.
>>>>>
>>>>
>>>> I strongly prefer baking in knowledge about the intrinsics
themselves
>>>> into the passes if possible. Metadata will always be secondary.
>>>>
>>>
>>> So you're saying that in this particular case you'd prefer
LLVM passes
>>> to know about the range of these PTX intrinsics, rather than Clang
adding
>>> them as metadata?
>>>
>>
>> Yep.
>>
>>  ValueTracking.cpp already has some iffy target knowledge (someone
>>> sneaked a direct  Intrinsic::x86_sse42_crc32_64_64 check in there),
but
>>> extending it to other intrinsics in other targets seems like too
much...
>>>
>>
>> That's not iffy. That's exactly how it should work, and we
should have
>> more of that. There is a major gotcha and that's dealing with the
case
>> where the intrinsics don't exist because the backend wasn't
compiled in. If
>> x86_sse42_crc32_64_64 is in there (and also in instcombine btw),
presumably
>> that problem is solved somehow? Or does llvm actually not build if you
>> don't enable the x86 target? I feel like we would've heard
about that.
>>
>> Nick
>>
>>  So should target info be passed into it in some way? Any suggestions
>>> where to put it? TargetLibraryInfo? TargetTransformInfo? In any
case this
>>> seems like the target interface will have to be augmented, and
we'll have
>>> to carry an object around into ValueTracking's compute*
functions. If this
>>> is the right way, then this is the way it will be done - design
ideas are
>>> appreciated.
>>>
>>> Eli
>>>
>>>
>>>
>>>
>>>>
>>>> Separately, should value tracking look use range metadata when
it's
>>>> available? Absolutely.
>>>>
>>>> I think it should apply to all CallInst not just IntrinsicInst
(which
>>>> is derived from CallInst).
>>>>
>>>> Nick
>>>>
>>>>  On Tue, Jun 17, 2014 at 12:37 AM, Jingyue Wu <jingyue at
google.com
>>>>> <mailto:jingyue at google.com>> wrote:
>>>>>
>>>>>     Hi,
>>>>>
>>>>>     The range metadata can only be attached to LoadInst for
now. I am
>>>>>     considering extending its usage to IntrinsicInst so
that the
>>>>>     frontend can annotate the range of the return value of
an intrinsic
>>>>>     call. e.g.,
>>>>>     %a = call i32 @llvm.xxx(), !range !0
>>>>>     !0 = metadata !{ i32 0, i23 1024 }
>>>>>
>>>>>     The motivation behind this extension is some
optimizations we are
>>>>>     working on for CUDA programs. Some special registers in
CUDA (e.g.,
>>>>>     threadIdx.x) are bounded per CUDA programming guide,
and knowing
>>>>>     their ranges can improve the precision of ValueTracking
and benefit
>>>>>     optimizations such as InstCombine.
>>>>>
>>>>>     To implement this idea, we need ValueTracking to be
aware of the
>>>>>     ranges of these special variables. These special
registers are so
>>>>>     far read-only and accessed using intrinsics. e.g.,
>>>>>     %threadIdx.x = call i32
@llvm.nvvm.read.ptx.sreg.tid.x().
>>>>>
>>>>>     One possible approach is to have ValueTracking compute
the known
>>>>>     bits of these intrinsics as special cases. This
approach is already
>>>>>     taken for the x86_sse42_crc32_64_64 intrinsic. However,
this
>>>>>     approach may not be elegant because the ranges of these
CUDA
>>>>> special
>>>>>     registers depend on the GPU compute capability
specified by
>>>>>     -target-cpu. For instance, blockIdx.x is bounded by
65535 in sm_20
>>>>>     but 2^31-1 in sm_30. Exposing -target-cpu to
ValueTracking is
>>>>>     probably discouraged.
>>>>>
>>>>>     Therefore, the approach I am considering is to have
clang annotate
>>>>>     the ranges of these CUDA special registers according to
the
>>>>>     -target-cpu flag, and have ValueTracking pick the range
metadata
>>>>> for
>>>>>     optimization. By doing so, we hide the target-specific
info from
>>>>>     ValueTracking.
>>>>>
>>>>>     The code change in llvm minus clang won't be large.
The core change
>>>>>     is only a few lines:
>>>>>     http://reviews.llvm.org/differential/diff/10464/. If
this
>>>>> extension
>>>>>     sounds good to you, I'll definitely add more tests
and revise the
>>>>>     documents on range metadata.
>>>>>
>>>>>     Best,
>>>>>     Jingyue
>>>>>
>>>>>     _______________________________________________
>>>>>     LLVM Developers mailing list
>>>>>     LLVMdev at cs.uiuc.edu <mailto:LLVMdev at
cs.uiuc.edu>
>>>>> http://llvm.cs.uiuc.edu
>>>>>
>>>>>     http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>
>>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140617/1c235751/attachment.html>

Hal Finkel

2014-Jun-17 21:46 UTC

head link

[LLVMdev] Attaching range metadata to IntrinsicInst

----- Original Message -----> From: "Nick Lewycky" <nlewycky at google.com>
> To: "Eli Bendersky" <eliben at google.com>
> Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Tuesday, June 17, 2014 2:19:57 PM
> Subject: Re: [LLVMdev] Attaching range metadata to IntrinsicInst
> 
> 
> 
> 
> 
> On 17 June 2014 06:41, Eli Bendersky < eliben at google.com > wrote:
> 
> 
> 
> 
> 
> 
> On Tue, Jun 17, 2014 at 1:38 AM, Nick Lewycky < nicholas at mxc.ca >
> wrote:
> 
> 
> 
> Chandler Carruth wrote:
> 
> 
> This seems fine to me, but I'd like to make sure it looks OK to Nick
> as
> well.
> 
> I strongly prefer baking in knowledge about the intrinsics themselves
> into the passes if possible. Metadata will always be secondary.
> 
> 
> 
> So you're saying that in this particular case you'd prefer LLVM
> passes to know about the range of these PTX intrinsics, rather than
> Clang adding them as metadata?
> 
> 
> Yep.
> 
> 
> 
> 
> 
> 
> 
> ValueTracking.cpp already has some iffy target knowledge (someone
> sneaked a direct Intrinsic::x86_sse42_crc32_64_64 check in there),
> but extending it to other intrinsics in other targets seems like too
> much...
> 
> 
> That's not iffy. That's exactly how it should work, and we should
> have more of that. There is a major gotcha and that's dealing with
> the case where the intrinsics don't exist because the backend
wasn't
> compiled in. If x86_sse42_crc32_64_64 is in there (and also in
> instcombine btw), presumably that problem is solved somehow? Or does
> llvm actually not build if you don't enable the x86 target? I feel
> like we would've heard about that.
I think that the IR level intrinsics are still defined even if the corresponding
backend is not enabled (they're in
include/llvm/IR/Intrinsics<TARGET>.td which are all included by
include/llvm/IR/Intrinsics.td).

 -Hal
> 
> 
> Nick
> 
> 
> 
> 
> 
> 
> 
> So should target info be passed into it in some way? Any suggestions
> where to put it? TargetLibraryInfo? TargetTransformInfo? In any case
> this seems like the target interface will have to be augmented, and
> we'll have to carry an object around into ValueTracking's compute*
> functions. If this is the right way, then this is the way it will be
> done - design ideas are appreciated.
> 
> 
> 
> Eli
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Separately, should value tracking look use range metadata when it's
> available? Absolutely.
> 
> I think it should apply to all CallInst not just IntrinsicInst (which
> is derived from CallInst).
> 
> Nick
> 
> 
> 
> 
> On Tue, Jun 17, 2014 at 12:37 AM, Jingyue Wu < jingyue at google.com
> 
> 
> <mailto: jingyue at google.com >> wrote:
> 
> Hi,
> 
> The range metadata can only be attached to LoadInst for now. I am
> considering extending its usage to IntrinsicInst so that the
> frontend can annotate the range of the return value of an intrinsic
> call. e.g.,
> %a = call i32 @llvm.xxx(), !range !0
> !0 = metadata !{ i32 0, i23 1024 }
> 
> The motivation behind this extension is some optimizations we are
> working on for CUDA programs. Some special registers in CUDA (e.g.,
> threadIdx.x) are bounded per CUDA programming guide, and knowing
> their ranges can improve the precision of ValueTracking and benefit
> optimizations such as InstCombine.
> 
> To implement this idea, we need ValueTracking to be aware of the
> ranges of these special variables. These special registers are so
> far read-only and accessed using intrinsics. e.g.,
> %threadIdx.x = call i32 @llvm.nvvm.read.ptx.sreg.tid. x().
> 
> One possible approach is to have ValueTracking compute the known
> bits of these intrinsics as special cases. This approach is already
> taken for the x86_sse42_crc32_64_64 intrinsic. However, this
> approach may not be elegant because the ranges of these CUDA special
> registers depend on the GPU compute capability specified by
> -target-cpu. For instance, blockIdx.x is bounded by 65535 in sm_20
> but 2^31-1 in sm_30. Exposing -target-cpu to ValueTracking is
> probably discouraged.
> 
> Therefore, the approach I am considering is to have clang annotate
> the ranges of these CUDA special registers according to the
> -target-cpu flag, and have ValueTracking pick the range metadata for
> optimization. By doing so, we hide the target-specific info from
> ValueTracking.
> 
> The code change in llvm minus clang won't be large. The core change
> is only a few lines:
> http://reviews.llvm.org/ differential/diff/10464/ . If this extension
> sounds good to you, I'll definitely add more tests and revise the
> documents on range metadata.
> 
> Best,
> Jingyue
> 
> ______________________________ _________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu <mailto: LLVMdev at cs.uiuc.edu >
> http://llvm.cs.uiuc.edu
> 
> http://lists.cs.uiuc.edu/ mailman/listinfo/llvmdev
> 
> 
> 
> 
> ______________________________ _________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/ mailman/listinfo/llvmdev
> 
> 
> 
> ______________________________ _________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/ mailman/listinfo/llvmdev
> 
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
> 
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Maybe Matching Threads

Search for more possibly parallel threads

llvm dev - Jun 2014 - [LLVMdev] Attaching range metadata to IntrinsicInst

[LLVMdev] Attaching range metadata to IntrinsicInst

[LLVMdev] Attaching range metadata to IntrinsicInst

[LLVMdev] Attaching range metadata to IntrinsicInst

[LLVMdev] Attaching range metadata to IntrinsicInst

[LLVMdev] Attaching range metadata to IntrinsicInst

Maybe Matching Threads