thr3ads.net - llvm dev - [llvm-dev] RFC: Promoting experimental reduction intrinsics to first class intrinsics [Apr 2020]

Craig Topper via llvm-dev

2020-Apr-09 17:17 UTC

[llvm-dev] RFC: Promoting experimental reduction intrinsics to first class intrinsics

That recent X86 bug isn't unique to the intrinsic. We generate the same
code from the IR below, which uses the shuffle sequence the vectorizers
generated before the reduction intrinsics existed.

declare i64 @llvm.experimental.vector.reduce.or.v2i64(<2 x i64>)
declare void @TrapFunc(i64)

define void @parseHeaders(i64 * %ptr) {
  %vptr = bitcast i64 * %ptr to <2 x i64> *
  %vload = load <2 x i64>, <2 x i64> * %vptr, align 8

  %b = shufflevector <2 x i64> %vload, <2 x i64> undef, <2 x i32> <i32 1, i32 undef>
  %c = or <2 x i64> %vload, %b
  %vreduce = extractelement <2 x i64> %c, i32 0

  %vcheck = icmp eq i64 %vreduce, 0
  br i1 %vcheck, label %ret, label %trap
trap:
  %v2 = extractelement <2 x i64> %vload, i32 1
  call void @TrapFunc(i64 %v2)
  ret void
ret:
  ret void
}
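
For comparison, here is a minimal sketch of what the intrinsic-based form of the same function would presumably look like. It replaces the shufflevector/or/extractelement sequence with a single call to the reduction intrinsic declared above. The function name @parseHeadersIntrinsic is hypothetical and does not come from the thread or the bug report; the syntax assumes the typed-pointer IR in use at the time.

```llvm
declare i64 @llvm.experimental.vector.reduce.or.v2i64(<2 x i64>)
declare void @TrapFunc(i64)

; Hypothetical counterpart of @parseHeaders that uses the intrinsic.
define void @parseHeadersIntrinsic(i64* %ptr) {
  %vptr = bitcast i64* %ptr to <2 x i64>*
  %vload = load <2 x i64>, <2 x i64>* %vptr, align 8

  ; OR both lanes together in one call instead of shuffle + or + extract.
  %vreduce = call i64 @llvm.experimental.vector.reduce.or.v2i64(<2 x i64> %vload)

  %vcheck = icmp eq i64 %vreduce, 0
  br i1 %vcheck, label %ret, label %trap
trap:
  %v2 = extractelement <2 x i64> %vload, i32 1
  call void @TrapFunc(i64 %v2)
  ret void
ret:
  ret void
}
```

Both forms compute lane 0 OR lane 1 of %vload, so the point above is that the X86 backend ends up with the same code either way.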

~Craig


On Thu, Apr 9, 2020 at 10:04 AM Philip Reames via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> My experience with them so far is that the code generation for these
> intrinsics is still missing a lot of cases.  Some of them are X86
> specific (the target I look at mostly), but many of them have generic
> forms.
>
> As one recent example, consider
> https://bugs.llvm.org/show_bug.cgi?id=45378.  (There's nothing special
> about this one other than it was recent.)
>
> I'm not necessarily arguing they can't be promoted from experimental,
> but it would be a much easier case if the code gen was routinely as good
> or better than the scalar forms.  Or to say that a bit differently, if
> we could canonicalize to them in the IR without major regression.
> Having two ways to represent something in the IR without any agreed upon
> canonical form is always sub-optimal.
>
> Philip
>
> On 4/7/20 9:59 PM, Amara Emerson via llvm-dev wrote:
> > Hi,
> >
> > It’s been a few years now since I added some intrinsics for doing vector
> > reductions. We’ve been using them exclusively on AArch64, and I’ve seen
> > some traffic a while ago on list for other targets too. Sander did some
> > work last year to refine the semantics after some discussion.
> >
> > Are we at the point where we can drop the “experimental” from the name?
> > IMO all targets should begin to transition to using these as the preferred
> > representation for reductions. But for now, I’m only proposing the naming
> > change.
> >
> > Cheers,
> > Amara
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Amara Emerson via llvm-dev

2020-Apr-09 17:21 UTC


Has x86 switched to the intrinsics now?

Craig Topper via llvm-dev

2020-Apr-09 17:28 UTC


No, we still use the shuffle expansion, which is why the issue isn't unique
to the intrinsic.

~Craig


On Thu, Apr 9, 2020 at 10:21 AM Amara Emerson <aemerson at apple.com> wrote:
> Has x86 switched to the intrinsics now?
