So, we've run into some test cases that are pretty alarming.

When inlining code along various different paths we can end up with this IR:

define void @f(float* %value, i8* %b) {
entry:
  %0 = load float* %value, align 4
  %1 = bitcast i8* %b to float*
  store float %0, float* %1, align 1
  ret void
}

define void @g(float* %value, i8* %b) {
entry:
  %0 = bitcast float* %value to i32*
  %1 = load i32* %0, align 4
  %2 = bitcast i8* %b to i32*
  store i32 %1, i32* %2, align 1
  ret void
}

Now, I don't really care one way or the other about these two IR inputs, but it's pretty concerning that we get these two equivalent bits of code and nothing canonicalizes one to the other.

So, the naive first-blush approach here would be to canonicalize on the first -- it has fewer instructions, after all -- but I don't think that's the right approach, for two reasons:

1) It will be a *very* narrow canonicalization that only works with overly specific sets of casted pointers.
2) It doesn't effectively move us toward the optimizer treating IR the same regardless of the pointee types on its pointers. Some day, I continue to think, we should get rid of pointee types entirely.

To see why #1 and #2 are problematic, assume another round of inlining took place and we suddenly had the following IR:

[the IR example was omitted here; it follows in the next message]

AFAICT, this is the same situation, and we still don't have a good canonicalization story.

What seems like the obvious, important, and missing canonicalization is this: when we have a loaded value that is *only* used by storing it back into memory, canonicalize the type of that *value* (ignoring the pointer types) to a single value type.

The only really suitable type for this kind of round trip is 'iN', where N matches the number of bits loaded or stored.

I have this change implemented. It is trivial and unsurprising. However, the effects are impossible to predict, so I wanted to make sure it made sense to others. Essentially, I expect random and hard-to-track-down performance fluctuations across the board. Some things may get better, others may get worse, and they will probably all be bugs elsewhere in the stack.

So, thoughts?
-Chandler
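For concreteness, a sketch of what the proposed canonical form would look like for @f above, assuming the only user of the loaded value is the store (this is illustrative, not the contents of the patch). It is exactly the shape @g already has: the value travels as i32, the bit width of the access, and pointee types only show up in bitcasts of the pointer operands:

define void @f(float* %value, i8* %b) {
entry:
  ; The loaded value is only stored back, so the value type is canonicalized
  ; to iN for the access width (i32 here); pointer operands are bitcast to match.
  %0 = bitcast float* %value to i32*
  %1 = load i32* %0, align 4
  %2 = bitcast i8* %b to i32*
  store i32 %1, i32* %2, align 1
  ret void
}

Under that rule @f and @g collapse to a single form, so later passes see one shape no matter which pointee types inlining happened to leave behind.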
On Wed, Jan 21, 2015 at 2:16 PM, Chandler Carruth <chandlerc at gmail.com> wrote:
> [...]
> To see why #1 and #2 are problematic, assume another round of inlining
> took place and we suddenly had the following IR:

And the missing IR example:

define void @f(i8* %value, i8* %b) {
entry:
  %0 = bitcast i8* %value to float*
  %1 = load float* %0, align 4
  %2 = bitcast i8* %b to float*
  store float %1, float* %2, align 1
  ret void
}

define void @g(i8* %value, i8* %b) {
entry:
  %0 = bitcast i8* %value to i32*
  %1 = load i32* %0, align 4
  %2 = bitcast i8* %b to i32*
  store i32 %1, i32* %2, align 1
  ret void
}

> AFAICT, this is the same situation, and we still don't have a good
> canonicalization story.
> [...]
> On Jan 21, 2015, at 2:18 PM, Chandler Carruth <chandlerc at gmail.com> wrote:
> [...]
> What seems like the obvious, important, and missing canonicalization is this: when we have a loaded value that is *only* used by storing it back into memory, canonicalize the type of that *value* (ignoring the pointer types) to a single value type.
>
> The only really suitable type for this kind of round trip is 'iN', where N matches the number of bits loaded or stored.
> [...]
> So, thoughts?

The first thing that springs to mind is that I don't trust the backend to get this right. I don't think it will understand when an i32 load/store would have been preferable to a float one, or vice versa. I have no evidence of this, but given how strongly typed tablegen is, I don't think it can make a good choice here. So I think we probably need to teach the backend how to undo whatever canonical form we choose when it has a reason to. And the best long-term solution is for tablegen to have sized loads/stores, not typed ones.

One (potentially expensive) way to choose the canonical form here is to look at the users of the load and see what type works best.
If we load an i32 but then bitcast it and do an FP operation on it, then a float load was best. If we just load it and then store it, then in theory either type works.

Pete
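To make Pete's point concrete, a hypothetical pair of functions (illustrative only, not from the thread) showing why the users of the load matter when picking a type. Written with an i32 load, the value has to cross back into the FP domain before the fadd; written with a float load, it is already there:

define float @use_fp_i32load(i8* %p) {
entry:
  %0 = bitcast i8* %p to i32*
  %1 = load i32* %0, align 4
  ; The only use is floating point, so the integer load forces an extra
  ; int-to-fp domain crossing that the backend has to see through.
  %2 = bitcast i32 %1 to float
  %3 = fadd float %2, 1.000000e+00
  ret float %3
}

define float @use_fp_floatload(i8* %p) {
entry:
  ; Same semantics, but the load type already matches the use.
  %0 = bitcast i8* %p to float*
  %1 = load float* %0, align 4
  %2 = fadd float %1, 1.000000e+00
  ret float %2
}

The proposed canonicalization only fires when the load's sole user is a store, so this case would not be rewritten by it, but it is the kind of analysis a user-driven choice of canonical type would need.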
define void @f(float* %value, i8* %b) {
entry:
  %0 = load float* %value, align 4
  %1 = bitcast i8* %b to float*
  store float %0, float* %1, align 1
  ret void
}

define void @g(float* %value, i8* %b) {
entry:
  %0 = bitcast float* %value to i32*
  %1 = load i32* %0, align 4
  %2 = bitcast i8* %b to i32*
  store i32 %1, i32* %2, align 1
  ret void
}

I don't think these are equivalent representations. The one with the float load and store has the potential for FP exceptions, both during the load and during the store, depending on what exact instruction set is in use. For example, floating-point loads, when done with x87, aren't even guaranteed to give back the same bit pattern, due to signaling NaNs potentially being converted into quiet NaNs by the fld instruction.

Thus, I don't think this is an example of a missing canonicalization, nor even a legal one in all circumstances. But maybe there is a subtlety of LLVM IR semantics that I am unaware of.

Kevin B. Smith

From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Chandler Carruth
Sent: Wednesday, January 21, 2015 2:17 PM
To: LLVM Developers Mailing List
Subject: [LLVMdev] RFC: Missing canonicalization in LLVM

[...]
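A hypothetical illustration of the bit-pattern concern Kevin raises (the function names and the constant are illustrative only): if the word at %src holds the single-precision signaling-NaN pattern 0x7fa00000, the i32 form is a plain bit copy, while a lowering of the float form that moves the value through an x87 register (fld/fstp) could quiet the NaN to 0x7fe00000 and raise an invalid-operation exception along the way:

; Bit-exact round trip: the stored word is always identical to the loaded one.
define void @copy_as_i32(i8* %src, i8* %dst) {
entry:
  %0 = bitcast i8* %src to i32*
  %1 = load i32* %0, align 4
  %2 = bitcast i8* %dst to i32*
  store i32 %1, i32* %2, align 1
  ret void
}

; Same copy expressed through float. Kevin's worry is that a target which
; routes the value through the x87 stack may not preserve a signaling-NaN
; bit pattern, so the two forms would be observably different there.
define void @copy_as_float(i8* %src, i8* %dst) {
entry:
  %0 = bitcast i8* %src to float*
  %1 = load float* %0, align 4
  %2 = bitcast i8* %dst to float*
  store float %1, float* %2, align 1
  ret void
}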
On Wed, Jan 21, 2015 at 3:06 PM, Smith, Kevin B <kevin.b.smith at intel.com> wrote:
> I don't think these are equivalent representations. The one with the
> float load and store has the potential for FP exceptions, both during
> the load and during the store

LLVM explicitly doesn't support FP exceptions on loads and stores. And it would break the world to add it.
On 1/21/2015 4:16 PM, Chandler Carruth wrote:
> 2) It doesn't effectively move us toward the optimizer treating IR the
> same regardless of the pointee types on its pointers. Some day, I
> continue to think, we should get rid of pointee types entirely.

Pointee? I'd think that eliminating the multiple pointer types would be a better thing to "normalize". Having a tag on a load stating the loaded type would be useful for instruction selection. Unless I'm not reading this correctly...

-Krzysztof

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
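For what it's worth, a sketch of the shape Krzysztof seems to be describing, assuming a single untyped pointer type with the accessed type carried by the load and store themselves (the 'ptr' syntax below is hypothetical, not something the IR of the time accepts):

; With one pointer type there is nothing to bitcast; the only remaining
; choice is the value type named on the access itself.
define void @f(ptr %value, ptr %b) {
entry:
  %0 = load i32, ptr %value, align 4
  store i32 %0, ptr %b, align 1
  ret void
}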