thr3ads.net - llvm dev - [llvm-dev] beneficial optimization of undef examples needed [Jun 2017]

If this information is useful, please help other people find it:
Share via:

Peter Lawrence via llvm-dev

2017-Jun-19 14:36 UTC

[llvm-dev] beneficial optimization of undef examples needed

Sanjoy,
            You have changed the subject.  We still need real world examples
showing how taking advantage of “undef” results in beneficial optimization.

My belief is that they don’t exist, my reasoning is this: real world programmers
are likely to run UBSan before compiling (or if they don’t they should),
therefore
it is highly unlikely that any “undef” will actually exist during compilation of
their
source code.

Yes, “Undef” can be created during SSA construction, but we already discussed
That in a previous email (its a register allocation problem).


The only other way I can see of “undef” occurring in IR is if we go out of our
way to optimize for example

	if (   ((i64)X + (i64)Y)) < INT_MIN   ||    ((i64)X) + (i64)Y) > INT_MAX 
) {
		. . .
		Z = X  "+nsw”  Y;
		. . .
	}

Here “Z” could in theory be replaced with “undef”, but there is no point in
doing so.

Similarly with provably out-of-bounds GEP, similarly with provably invalid
pointers,
etc, but again there is no point in doing so.


So is there any other way of having “undef” appear in the IR ?


Peter Lawrence.


> On Jun 16, 2017, at 7:45 PM, Sanjoy Das <sanjoy at
playingwithpointers.com> wrote:
> 
> Hi Peter,
> 
> Why we need an undef value is covered here:
> http://sunfishcode.github.io/blog/2014/07/14/undef-introduction.html
> (in short, to do SSA construction well).
> 
> For a while we did not have a literal representation for poison.
> However, in practice having both undef and poison was problematic (see
> the paper), so we decided to ditch undef and keep poison.
> 
> However for the new poison to provide the same functionality as the
> old undef (which is now going away), we need a literal representation
> for the new poison.
> 
> -- Sanjoy
> 
> On Fri, Jun 16, 2017 at 3:03 PM, Peter Lawrence via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> All,
>>     These discussions seem to be based on the premise that there is a
>> need for the compiler to exploit undefined behavior for performance
>> optimization reasons.
>> 
>> So far the only beneficial optimization I am aware of that relies on
some
>> form of “undefined” is Dan Gohman’s original project for LP64 targets
of
>> promoting i32 induction variables to i64 and hoisting sign-extension
out
>> of the loop.
>> 
>> But “undef” / “poison” never appears in either the original or the
transformed
>> IR for these types of loops, instead properties of “+nsw” are used to
>> justify the transformation.  The transformation does not just fall out
because
>> we’ve done a good job at defining “undef” / “poison” IR nodes.
>> 
>> So I’d like to see some concrete examples of where the compiler can
>> do useful optimization based on “undef” / “poison” appearing explicitly
>> In the IR,  finding some would surely advance this discussion.
>> 
>> 
>> 
>> Peter Lawrence.
>> 
>> 
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

David Blaikie via llvm-dev

2017-Jun-19 15:25 UTC

head link

[llvm-dev] beneficial optimization of undef examples needed

On Mon, Jun 19, 2017 at 7:36 AM Peter Lawrence via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Sanjoy,
>             You have changed the subject.  We still need real world
> examples
> showing how taking advantage of “undef” results in beneficial optimization.
>
> My belief is that they don’t exist, my reasoning is this: real world
> programmers
> are likely to run UBSan before compiling (or if they don’t they should),
> therefore
> it is highly unlikely that any “undef” will actually exist during
> compilation of their
> source code.
>
Wait - that ^ doesn't seem to follow. The point of undefined behavior
optimizations isn't to only break programs that would trigger UBSan. The
point is to optimize on the assumption that they won't break UBSan/exhibit
undefined behavior.

> Yes, “Undef” can be created during SSA construction, but we already
> discussed
> That in a previous email (its a register allocation problem).
>
>
> The only other way I can see of “undef” occurring in IR is if we go out of
> our
> way to optimize for example
>
>         if (   ((i64)X + (i64)Y)) < INT_MIN   ||    ((i64)X) + (i64)Y)
>
> INT_MAX   ) {
>                 . . .
>                 Z = X  "+nsw”  Y;
>                 . . .
>         }
>
> Here “Z” could in theory be replaced with “undef”, but there is no point
> in doing so.
>
> Similarly with provably out-of-bounds GEP, similarly with provably invalid
> pointers,
> etc, but again there is no point in doing so.
>
Why isn't there any point in doing so? One common situation where
exploiting UB (aka: "assuming UB can't happen") may be beneficial
is when
the code in isolation is fine, but when inlined it's possible to prove
certain paths create contradictions. That way even if the conditions for
those paths can't be analyzed to prove they are unreachable (or values are
not needed, etc), the compiler can skip all that and just collapse the code
(remove the unreachable code, delete the value computation, etc) anyway.

>
>
> So is there any other way of having “undef” appear in the IR ?
>
>
> Peter Lawrence.
>
>
>
> > On Jun 16, 2017, at 7:45 PM, Sanjoy Das <sanjoy at
playingwithpointers.com>
> wrote:
> >
> > Hi Peter,
> >
> > Why we need an undef value is covered here:
> > http://sunfishcode.github.io/blog/2014/07/14/undef-introduction.html
> > (in short, to do SSA construction well).
> >
> > For a while we did not have a literal representation for poison.
> > However, in practice having both undef and poison was problematic (see
> > the paper), so we decided to ditch undef and keep poison.
> >
> > However for the new poison to provide the same functionality as the
> > old undef (which is now going away), we need a literal representation
> > for the new poison.
> >
> > -- Sanjoy
> >
> > On Fri, Jun 16, 2017 at 3:03 PM, Peter Lawrence via llvm-dev
> > <llvm-dev at lists.llvm.org> wrote:
> >> All,
> >>     These discussions seem to be based on the premise that there
is a
> >> need for the compiler to exploit undefined behavior for
performance
> >> optimization reasons.
> >>
> >> So far the only beneficial optimization I am aware of that relies
on
> some
> >> form of “undefined” is Dan Gohman’s original project for LP64
targets of
> >> promoting i32 induction variables to i64 and hoisting
sign-extension out
> >> of the loop.
> >>
> >> But “undef” / “poison” never appears in either the original or the
> transformed
> >> IR for these types of loops, instead properties of “+nsw” are used
to
> >> justify the transformation.  The transformation does not just fall
out
> because
> >> we’ve done a good job at defining “undef” / “poison” IR nodes.
> >>
> >> So I’d like to see some concrete examples of where the compiler
can
> >> do useful optimization based on “undef” / “poison” appearing
explicitly
> >> In the IR,  finding some would surely advance this discussion.
> >>
> >>
> >>
> >> Peter Lawrence.
> >>
> >>
> >> _______________________________________________
> >> LLVM Developers mailing list
> >> llvm-dev at lists.llvm.org
> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170619/34772196/attachment.html>

Peter Lawrence via llvm-dev

2017-Jun-19 15:56 UTC

head link

[llvm-dev] beneficial optimization of undef examples needed

David,
          I’m not asking for the logic behind optimizing UB,
I’m asking for real world examples, a SPEC benchmark would do,
where optimizing UB actually proves to be beneficial.

I believe that in the real world it doesn’t happen, and I disagree
with the generalization of “constant-propagation after inlining is
beneficial” to “UB-propagation after inlining is beneficial”.

rather I believe that undef-propagation is the root cause of our problems.

But even if you disagree with my logic, we still don’t have any real world
examples showing the benefit of “UB-propagation after inlining".


Peter Lawrence.



> On Jun 19, 2017, at 8:25 AM, David Blaikie <dblaikie at gmail.com>
wrote:
> 
> 
> 
> On Mon, Jun 19, 2017 at 7:36 AM Peter Lawrence via llvm-dev <llvm-dev at
lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
> Sanjoy,
>             You have changed the subject.  We still need real world
examples
> showing how taking advantage of “undef” results in beneficial optimization.
> 
> My belief is that they don’t exist, my reasoning is this: real world
programmers
> are likely to run UBSan before compiling (or if they don’t they should),
therefore
> it is highly unlikely that any “undef” will actually exist during
compilation of their
> source code.
> 
> Wait - that ^ doesn't seem to follow. The point of undefined behavior
optimizations isn't to only break programs that would trigger UBSan. The
point is to optimize on the assumption that they won't break UBSan/exhibit
undefined behavior.
>  
> Yes, “Undef” can be created during SSA construction, but we already
discussed
> That in a previous email (its a register allocation problem).
> 
> 
> The only other way I can see of “undef” occurring in IR is if we go out of
our
> way to optimize for example
> 
>         if (   ((i64)X + (i64)Y)) < INT_MIN   ||    ((i64)X) + (i64)Y)
> INT_MAX   ) {
>                 . . .
>                 Z = X  "+nsw”  Y;
>                 . . .
>         }
> 
> Here “Z” could in theory be replaced with “undef”, but there is no point in
doing so.
> 
> Similarly with provably out-of-bounds GEP, similarly with provably invalid
pointers,
> etc, but again there is no point in doing so.
> 
> Why isn't there any point in doing so? One common situation where
exploiting UB (aka: "assuming UB can't happen") may be beneficial
is when the code in isolation is fine, but when inlined it's possible to
prove certain paths create contradictions. That way even if the conditions for
those paths can't be analyzed to prove they are unreachable (or values are
not needed, etc), the compiler can skip all that and just collapse the code
(remove the unreachable code, delete the value computation, etc) anyway.
>  
> 
> 
> So is there any other way of having “undef” appear in the IR ?
> 
> 
> Peter Lawrence.
> 
> 
> 
> > On Jun 16, 2017, at 7:45 PM, Sanjoy Das <sanjoy at
playingwithpointers.com <mailto:sanjoy at playingwithpointers.com>>
wrote:
> >
> > Hi Peter,
> >
> > Why we need an undef value is covered here:
> > http://sunfishcode.github.io/blog/2014/07/14/undef-introduction.html
<http://sunfishcode.github.io/blog/2014/07/14/undef-introduction.html>
> > (in short, to do SSA construction well).
> >
> > For a while we did not have a literal representation for poison.
> > However, in practice having both undef and poison was problematic (see
> > the paper), so we decided to ditch undef and keep poison.
> >
> > However for the new poison to provide the same functionality as the
> > old undef (which is now going away), we need a literal representation
> > for the new poison.
> >
> > -- Sanjoy
> >
> > On Fri, Jun 16, 2017 at 3:03 PM, Peter Lawrence via llvm-dev
> > <llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>> wrote:
> >> All,
> >>     These discussions seem to be based on the premise that there
is a
> >> need for the compiler to exploit undefined behavior for
performance
> >> optimization reasons.
> >>
> >> So far the only beneficial optimization I am aware of that relies
on some
> >> form of “undefined” is Dan Gohman’s original project for LP64
targets of
> >> promoting i32 induction variables to i64 and hoisting
sign-extension out
> >> of the loop.
> >>
> >> But “undef” / “poison” never appears in either the original or the
transformed
> >> IR for these types of loops, instead properties of “+nsw” are used
to
> >> justify the transformation.  The transformation does not just fall
out because
> >> we’ve done a good job at defining “undef” / “poison” IR nodes.
> >>
> >> So I’d like to see some concrete examples of where the compiler
can
> >> do useful optimization based on “undef” / “poison” appearing
explicitly
> >> In the IR,  finding some would surely advance this discussion.
> >>
> >>
> >>
> >> Peter Lawrence.
> >>
> >>
> >> _______________________________________________
> >> LLVM Developers mailing list
> >> llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>
> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170619/9c858081/attachment.html>

Sanjoy Das via llvm-dev

2017-Jun-19 16:30 UTC

head link

[llvm-dev] beneficial optimization of undef examples needed

Hi Peter,

On Mon, Jun 19, 2017 at 7:36 AM, Peter Lawrence
<peterl95124 at sbcglobal.net> wrote:>             You have changed the subject.  We still need real world
examples
Not sure how -- I though I was pretty clear on how my answer was
related to you question.
> showing how taking advantage of “undef” results in beneficial optimization.
>
> My belief is that they don’t exist, my reasoning is this: real world
programmers
> are likely to run UBSan before compiling (or if they don’t they should),
therefore
> it is highly unlikely that any “undef” will actually exist during
compilation of their
> source code.
The canonical example motivating undef is this:

  int a;
  boil initialized;
  if (condition) {
    a = 5;
    initialized = true;
  }
  ...
  int result = 0;
  if (condition) {
    result += a;
  }

You can't optimize the PHI node that you'll get in side the block
guarded by 'condition' to 5 without something like undef.
> Yes, “Undef” can be created during SSA construction, but we already
discussed
> That in a previous email (its a register allocation problem).
>
> The only other way I can see of “undef” occurring in IR is if we go out of
our
> way to optimize for example
>
>         if (   ((i64)X + (i64)Y)) < INT_MIN   ||    ((i64)X) + (i64)Y)
> INT_MAX   ) {
>                 . . .
>                 Z = X  "+nsw”  Y;
>                 . . .
>         }
>
> Here “Z” could in theory be replaced with “undef”, but there is no point in
doing so.
>
> Similarly with provably out-of-bounds GEP, similarly with provably invalid
pointers,
> etc, but again there is no point in doing so.
These would be poison, not undef?

Just to re-iterate what I've already said before:

 - If you're willing to have both undef and poison, then you don't
need a literal representation of poison to do the kind of
transformations poison allows (e.g. A s< (A +nsw 1)).
 - However, having both undef and poison is a bad idea for several
reasons outlined in the paper.
 - Since we want the new poison to "fill in" for the undef too, we
need a literal representation for the new poison to addresses cases
like the PHI node case above.

We don't care about optimizing code that actually has undefined
behavior -- we only care about optimizing code *under the assumption*
that it does not have UB.  Both the old undef and the new poison have
been carefully designed to address this use case.
> So is there any other way of having “undef” appear in the IR ?
Loading from uninitalized memory is another way you can have undef
appear in the IR today.

-- Sanjoy

llvm dev - Jun 2017 - beneficial optimization of undef examples needed

[llvm-dev] beneficial optimization of undef examples needed

[llvm-dev] beneficial optimization of undef examples needed

[llvm-dev] beneficial optimization of undef examples needed

[llvm-dev] beneficial optimization of undef examples needed