thr3ads.net - llvm dev - [LLVMdev] Convert fdiv

Chandler Carruth

2013-Aug-08 21:23 UTC

[LLVMdev] Convert fdiv - X/Y -> X*1/Y

On Thu, Aug 8, 2013 at 2:07 PM, Chad Rosier <chad.rosier at gmail.com> wrote:
> On Thu, Aug 8, 2013 at 1:56 PM, Mark Lacey <mark.lacey at apple.com> wrote:
>
>> On Aug 8, 2013, at 9:56 AM, Jim Grosbach <grosbach at apple.com> wrote:
>>
>>> Hi Chad,
>>>
>>> This is a great transform to do, but you’re right that it’s only safe
>>> under fast-math. This is particularly interesting when the original
>>> divisor is a constant so you can materialize the reciprocal at
>>> compile-time. You’re right that in either case, this optimization should
>>> only kick in when there is more than one divide instruction that will be
>>> changed to a mul.
>>
>> It can be worthwhile to do this even in the case where there is only a
>> single divide since 1/Y might be loop invariant, and could then be hoisted
>> out later by LICM. You just need to be able to fold it back together when
>> there is only a single use, and that use is not inside a more deeply
>> nested loop.
>
> Ben's patch does exactly this, so perhaps that is the right approach.
>
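The fast-math restriction in the quoted text comes down to rounding: x/y is rounded once, while x*(1/y) is rounded twice. A minimal Python sketch of this follows (Python floats are IEEE-754 doubles on common platforms; the helper names `mismatches` and `exact_for_divisor` are made up for illustration and are not part of LLVM). It lists small integer pairs where the two forms disagree:

```python
def mismatches(limit):
    """Return (x, y) pairs in 1..limit where x/y and x*(1/y) differ as doubles."""
    pairs = []
    for x in range(1, limit + 1):
        for y in range(1, limit + 1):
            quotient = float(x) / float(y)                # rounded once
            via_reciprocal = float(x) * (1.0 / float(y))  # rounded twice
            if quotient != via_reciprocal:
                pairs.append((x, y))
    return pairs


def exact_for_divisor(y, limit=100):
    """True if x*(1/y) equals x/y for every integer x in 1..limit."""
    return all(float(x) / y == float(x) * (1.0 / y) for x in range(1, limit + 1))


if __name__ == "__main__":
    bad = mismatches(100)
    print(len(bad), "of", 100 * 100, "pairs differ; first few:", bad[:5])
    print("divisor 8.0 is exact:", exact_for_divisor(8.0))
```

Divisors that are powers of two have exactly representable reciprocals, so in that case the rewrite changes no results (barring overflow or underflow).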
Just to be clear about what is being proposed (which I rather like):

1) Canonical form is to use the reciprocal when allowed (by the fast math
flags, whichever we decide are appropriate).
2) The backend folds a single-use reciprocal into a direct divide.

Did I get it right? If so, I think this is a really nice way to capture all
of the potential benefits of forming reciprocals without pessimizing code
where it isn't helpful.
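
The two steps listed above can be sketched on a toy straight-line SSA list. This is an illustrative Python model, not LLVM code: the tuple layout `(result, opcode, lhs, rhs)` and the function names `canonicalize` and `fold_single_use` are assumptions made for this example, and the model has no loops, so the "not inside a more deeply nested loop" condition is not represented.

```python
def canonicalize(insts):
    """Step 1: rewrite r = fdiv x, y as rcp = fdiv 1.0, y; r = fmul x, rcp.

    One reciprocal is created per distinct divisor and shared by later uses.
    """
    out, rcp_for = [], {}
    for res, op, lhs, rhs in insts:
        if op == "fdiv" and lhs != 1.0:
            if rhs not in rcp_for:
                rcp_for[rhs] = f"rcp.{rhs}"
                out.append((rcp_for[rhs], "fdiv", 1.0, rhs))
            out.append((res, "fmul", lhs, rcp_for[rhs]))
        else:
            out.append((res, op, lhs, rhs))
    return out


def fold_single_use(insts):
    """Step 2: fold a reciprocal whose only use is one fmul back into an fdiv."""
    rcp_def = {res: rhs for res, op, lhs, rhs in insts
               if op == "fdiv" and lhs == 1.0}
    users = {name: [] for name in rcp_def}
    for _, op, lhs, rhs in insts:
        for operand in (lhs, rhs):
            if operand in users:
                users[operand].append(op)
    foldable = {name for name, ops in users.items() if ops == ["fmul"]}

    out = []
    for res, op, lhs, rhs in insts:
        if res in foldable:
            continue  # the reciprocal is dead once its single use is folded
        if op == "fmul" and rhs in foldable:
            out.append((res, "fdiv", lhs, rcp_def[rhs]))
        elif op == "fmul" and lhs in foldable:
            out.append((res, "fdiv", rhs, rcp_def[lhs]))
        else:
            out.append((res, op, lhs, rhs))
    return out


if __name__ == "__main__":
    program = [
        ("t1", "fdiv", "x", "y"),    # only divide by y
        ("t2", "fdiv", "a", "b"),    # two divides by b
        ("t3", "fdiv", "c", "b"),
        ("t4", "fadd", "t2", "t3"),
    ]
    for inst in fold_single_use(canonicalize(program)):
        print(inst)
```

In this model the divide by `y` round-trips to a plain `fdiv`, while the two divides by `b` keep the shared reciprocal and become multiplies.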

Michael Ilseman

2013-Aug-08 21:35 UTC

Point #1 makes sense to me.

For point #2, wouldn't that be somewhat orthogonal to the discussion, as it
has/needs no knowledge that an IR-level transformation happened? Also,
reciprocal-multiply will be the preferred option for many (most) backends if the
IR says to do that. But, I suppose some backend might want to be allowed to do
the reverse transformation if allowed by fast-math flags in IR, or fast-math
mode in selection DAG.


Michael Ilseman

2013-Aug-08 21:36 UTC

On Aug 8, 2013, at 2:35 PM, Michael Ilseman <milseman at apple.com> wrote:
> Point #1 makes sense to me.
>
> For point #2, wouldn't that be somewhat orthogonal to the discussion, as it
> has/needs no knowledge that an IR-level transformation happened? Also,
> reciprocal-multiply will be the preferred option for many (most) backends if
> the IR says to do that. But, I suppose some backend might want to be allowed
> to do the reverse transformation if allowed by fast-math flags in IR, or
> fast-math mode in selection DAG.
>
Oh, I forgot about optimize-for-size, which is a case where the user might
want the reverse transformation.

Chad Rosier

2013-Aug-08 21:39 UTC

On Thu, Aug 8, 2013 at 5:23 PM, Chandler Carruth <chandlerc at google.com> wrote:
>
> Just to be clear about what is being proposed (which I rather like):
>
> 1) Canonical form is to use the reciprocal when allowed (by the fast math
> flags, whichever we decide are appropriate).
> 2) The backend folds a single-use reciprocal into a direct divide.
>
> Did I get it right? If so, I think this is a really nice way to capture
> all of the potential benefits of forming reciprocals without pessimizing
> code where it isn't helpful.
>
I believe you're describing Ben's patch perfectly. A few transformations
are pessimized, however.

From test/Transforms/InstCombine/fast-math.ll:

1. Previously x/y + x/z was not transformed. Now it becomes x*(1/y+1/z).

define float @fact_div1(float %x, float %y, float %z) {
  %t1 = fdiv fast float %x, %y
  %t2 = fdiv fast float %x, %z
  %t3 = fadd fast float %t1, %t2
  ret float %t3
}

combines to:

define float @fact_div1(float %x, float %y, float %z) {
  %reciprocal = fdiv fast float 1.000000e+00, %y
  %reciprocal1 = fdiv fast float 1.000000e+00, %z
  %1 = fadd fast float %reciprocal, %reciprocal1
  %2 = fmul fast float %1, %x
  ret float %2
}

I don't believe the fixup in CodeGenPrepare will undo such a transformation.

2. Similarly, x/y + z/x was not previously changed, but now we generate
x*(1/y) + z*(1/x).
I believe we can undo this transformation.

3.  Previously we would transform y/x + z/x => (y+z)/x.  Now y/x + z/x is
transformed to y*(1/x)+z*(1/x).
This might be an ordering problem, or perhaps we could just transform
y*(1/x)+z*(1/x) => (y+z)/x. The same holds true for y/x - z/x.
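
The cost difference in cases 1 and 3 above can be made concrete by counting operations. This is a rough Python sketch: the `Counted` wrapper, the `cost` helper, and the `case*` functions are hypothetical names introduced here, and every operation is counted as one unit, which ignores that a real divide is usually far more expensive than a multiply or add.

```python
from collections import Counter


class Counted:
    """Float wrapper that tallies every arithmetic operation applied to it."""

    ops = Counter()

    def __init__(self, value):
        self.value = float(value)

    @staticmethod
    def _raw(other):
        return other.value if isinstance(other, Counted) else float(other)

    def __add__(self, other):
        Counted.ops["fadd"] += 1
        return Counted(self.value + Counted._raw(other))

    def __mul__(self, other):
        Counted.ops["fmul"] += 1
        return Counted(self.value * Counted._raw(other))

    def __truediv__(self, other):
        Counted.ops["fdiv"] += 1
        return Counted(self.value / Counted._raw(other))

    def __rtruediv__(self, other):  # handles 1 / Counted
        Counted.ops["fdiv"] += 1
        return Counted(Counted._raw(other) / self.value)


def cost(fn, *args):
    """Run fn on Counted copies of args and return the operation tally."""
    Counted.ops = Counter()
    fn(*(Counted(a) for a in args))
    return dict(Counted.ops)


def case1_old(x, y, z):
    return x / y + x / z


def case1_new(x, y, z):
    return (1 / y + 1 / z) * x


def case3_old(x, y, z):
    return (y + z) / x


def case3_new(x, y, z):
    r = 1 / x  # shared reciprocal
    return y * r + z * r


if __name__ == "__main__":
    for fn in (case1_old, case1_new, case3_old, case3_new):
        print(fn.__name__, cost(fn, 3, 5, 7))
```

In both cases the new form keeps the same number of divides and adds multiplies on top, which is the pessimization described above.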

 Chad