thr3ads.net - llvm dev - [LLVMdev] failed folding with constant array with opt -O3 [Sep 2014]

Peng Cheng

2014-Sep-09 16:30 UTC

[LLVMdev] failed folding with constant array with opt -O3

I have the following simplified LLVM IR, which basically returns a value
based on the first element of a constant array.

----
; ModuleID = 'simple_ir3.txt'

@f.b = constant [1 x i32] [i32 1], align 4          ; constant array with value 1 at the first element

define void @f(i32* nocapture %l0) {
entry:
  %fc_ = alloca [1 x i32]
  %f.b.v = load [1 x i32]* @f.b
  store [1 x i32] %f.b.v, [1 x i32]* %fc_
  %0 = getelementptr [1 x i32]* %fc_, i64 0, i64 0  ; load the first element of the constant array, which is actually 1
  %1 = load i32* %0
  %tobool = icmp ne i32 %1, 0             ; check whether the first element is nonzero, which is always true since the first element of the constant array is 1
  br i1 %tobool, label %2, label %4

; <label>:2               ; true branch
  store i32 1, i32* %l0;
  %3 = load i32* %l0;
  br label %4

; <label>:4
  %storemerge = phi i32 [ %3, %2 ], [ 0, %entry ]
  store i32 %storemerge, i32* %l0
  ret void
}
---

I ran opt -O3 simple_ir.txt -S, and got:

---
; ModuleID = 'simple_ir3.txt'

@f.b = constant [1 x i32] [i32 1], align 4

; Function Attrs: nounwind
define void @f(i32* nocapture %l0) #0 {
entry:
  %fc_ = alloca [1 x i32]
  store [1 x i32] [i32 1], [1 x i32]* %fc_
  %0 = getelementptr [1 x i32]* %fc_, i64 0, i64 0
  %1 = load i32* %0
  %tobool = icmp eq i32 %1, 0
  br i1 %tobool, label %3, label %2

; <label>:2                                       ; preds = %entry
  store i32 1, i32* %l0
  br label %3

; <label>:3                                       ; preds = %entry, %2
  %storemerge = phi i32 [ 1, %2 ], [ 0, %entry ]
  store i32 %storemerge, i32* %l0
  ret void
}

attributes #0 = { nounwind }
---

I would expect constant folding, or some other transformation, to be able
to fold the constant and produce the following IR:

---
define void @f(i32* nocapture %l0) #0 {
  store i32 1, i32* %l0
  ret void
}
---

How can I get the expected optimized IR? Should I update the original IR, or
use a different set of transformations?

Any suggestions or comments?


Thanks,
-Peng

Philip Reames

2014-Sep-10 03:05 UTC

One of my coworkers ran across a very similar case as well.  He's still 
reducing his test case to identify the problematic area, but based on 
the initial results, it looks like we have an overly conservative 
aliasing result being returned for two locations in disjoint elements of 
an array.  Your example is slightly different, but an imprecise alias 
query (i.e. not recognizing that the load is completely covered by the 
store) might explain it as well.

If you could file a bug with your example, that would be helpful. If you 
can further reduce it that would be nice, but you've already gotten it 
down small enough to be useful.  Thanks.

Philip

Roel Jordans

2014-Sep-10 09:26 UTC

The -debug output of opt shows that SROA was skipped due to missing
target data.

Adding something like:

target datalayout = "e-p:32:32:32-i32:32:32"

to the top seems sufficient to fix the issue at -O3.

By defining the size and storage requirements for i32, SROA is able to
rewrite the array load into a constant scalar load, which can then be
further optimized.

Cheers,
  Roel
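
The suggestion above can be sketched as follows: the original test case with the data layout line prepended. The layout string is the one quoted in the message; whether it matches your actual target is an assumption. The IR keeps the 2014-era typed-pointer syntax used in this thread, so it will not parse with current LLVM releases.

```llvm
; Data layout copied from the message above: little-endian, 32-bit pointers,
; i32 with 32-bit ABI and preferred alignment. Adjust it for your real target.
target datalayout = "e-p:32:32:32-i32:32:32"

@f.b = constant [1 x i32] [i32 1], align 4

define void @f(i32* nocapture %l0) {
entry:
  ; copy the constant array into a local alloca
  %fc_ = alloca [1 x i32]
  %f.b.v = load [1 x i32]* @f.b
  store [1 x i32] %f.b.v, [1 x i32]* %fc_
  ; read back the first element and test it against zero
  %0 = getelementptr [1 x i32]* %fc_, i64 0, i64 0
  %1 = load i32* %0
  %tobool = icmp ne i32 %1, 0
  br i1 %tobool, label %2, label %4

; <label>:2
  store i32 1, i32* %l0
  %3 = load i32* %l0
  br label %4

; <label>:4
  %storemerge = phi i32 [ %3, %2 ], [ 0, %entry ]
  store i32 %storemerge, i32* %l0
  ret void
}
```

Running the same command as in the original report (opt -O3 simple_ir.txt -S) on this file should no longer skip SROA for lack of target data, which, per the message above, seems sufficient for the fold to happen.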

Philip Reames

2014-Sep-10 16:50 UTC

I came in to an email this morning that said basically the same thing 
for the reduced example we were looking at.  However, the original IR it 
came from (before hand reduction) had the data layout set correctly, so 
there's probably still *something* going on.  It's just not what I 
thought at first.  :)

Philip

