thr3ads.net - llvm dev - [LLVMdev] Intrinsics and dead instruction/code elimination [May 2010]

o.j.sivart at gmail.com

2010-May-19 14:07 UTC

[LLVMdev] Intrinsics and dead instruction/code elimination

Hi all,

I'm interested in the impact of representing code via intrinsic functions,
rather than via instructions, when it comes to performing dead
instruction/code elimination. As a concrete example, let's consider the simple
case of the llvm.*.with.overflow.* intrinsics.

If I have a sequence (> 1) of llvm.*.with.overflow.* intrinsics, of the
form:

@global = global i1 false

define void @fun(i32 %a, i32 %b) {
entry:
  ; '*' stands for any of sadd, uadd, ssub, usub, smul or umul.
  %res1 = call {i32, i1} @llvm.*.with.overflow.i32(i32 %a, i32 %b)
  %sum1 = extractvalue {i32, i1} %res1, 0
  %obit1 = extractvalue {i32, i1} %res1, 1
  store i1 %obit1, i1* @global          ; overwritten below, so dead
  ...
  %res2 = call {i32, i1} @llvm.*.with.overflow.i32(i32 %a, i32 %b)
  %sum2 = extractvalue {i32, i1} %res2, 0
  %obit2 = extractvalue {i32, i1} %res2, 1
  store i1 %obit2, i1* @global
  ret void
}

then I assume an optimisation pass is able to eliminate the first store to
@global (of %obit1), since the second store (of %obit2) clearly clobbers the
global without any intervening load/access. However, my question is whether
representing code as an intrinsic limits further dead instruction/code
elimination. In this case, on x86 the intrinsic will produce an arithmetic
operation followed by a setcc on overflow. Given that the first store is dead,
the first setcc on overflow is also dead and can be eliminated. Is LLVM
capable of such elimination with intrinsics, or is the expectation that an
optimisation pass replaces llvm.*.with.overflow.* intrinsics with their
corresponding non-intrinsic arithmetic operations in order to achieve the same
result, or is there another solution?

Thanks in advance

Chris Lattner

2010-May-19 17:31 UTC

Intrinsics should be optimized as well as instructions are.  In this specific
case, these intrinsics should be marked readnone, which means that load/store
optimization will ignore them.  Dead code elimination will delete the intrinsic
if it is dead, etc.  Are you seeing this fail on a specific testcase?

-Chris
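
For readers who want to experiment with this from source code, here is a minimal C sketch of the pattern under discussion. It assumes a GCC- or Clang-compatible compiler; Clang lowers __builtin_sadd_overflow to llvm.sadd.with.overflow.i32. The names global_flag, fun and check_fun are hypothetical, and unlike the IR in the question, the first sum is fed into the second addition so that only the first overflow flag is dead.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for @global in the IR from the question. */
bool global_flag;

/* Two overflow-checked additions. The first flag store is overwritten by
 * the second with no read of global_flag in between, so it is a dead store.
 * The builtin itself touches no global memory, so it cannot observe the
 * first store either. The first sum stays live: it feeds the second add. */
int fun(int a, int b, int c) {
    int sum1, sum2;
    global_flag = __builtin_sadd_overflow(a, b, &sum1);    /* dead store */
    global_flag = __builtin_sadd_overflow(sum1, c, &sum2); /* live store */
    return sum2;
}

/* Usage: only the flag of the second addition is ever observable. */
void check_fun(void) {
    assert(fun(1, 2, 3) == 6);
    assert(!global_flag);

    /* The first add wraps, but its flag is overwritten by the second. */
    assert(fun(INT_MAX, 1, 0) == INT_MIN);
    assert(!global_flag);

    /* The second add wraps, and its flag is the one that is kept. */
    assert(fun(1, 2, INT_MAX) == INT_MIN + 2);
    assert(global_flag);
}
```

Compiling this with clang -O2 -S -emit-llvm (or plain -S for assembly) shows what survives optimisation. The exact output depends on the LLVM version, so it is worth inspecting rather than assuming.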

o.j.sivart at gmail.com

2010-May-19 22:13 UTC

On 20/05/2010, at 3:01 AM, Chris Lattner wrote:
> 
> Intrinsics should be optimized as well as instructions.  In this specific
> case, these intrinsics should be marked readnone, which means that load/store
> optimization will ignore them.  Dead code elimination will delete the intrinsic
> if it is dead etc.

I understand that dead code elimination is able to delete the intrinsic if it is
dead. What I'm interested in is the case where the intrinsic as a whole is not
dead: can anything eliminate the setcc-on-overflow part of the first
intrinsic? The store of obit1 is dead, so obit1 is not needed, so extracting
the overflow bit from the EFLAGS register via a setcc instruction is no longer
needed. I assume nothing is able to perform such an optimisation on intrinsics,
and my guess is that the only option is, as I said, a pass which rewrites the
first intrinsic to just its corresponding arithmetic instruction once dead
store elimination has removed the first store. Is this the case?

> Are you seeing this fail on a specific testcase?

No, I don't yet have code for the particular testcases I have in mind.
I'm just exploring the limitations of intrinsics over instructions.

> -Chris
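
The rewrite proposed in this message (replace the intrinsic with its plain arithmetic instruction once the overflow flag is dead) preserves behaviour because a checked add whose flag is ignored computes the same value as a wrapping add. The C sketch below illustrates that equivalence under the assumption of a GCC- or Clang-compatible compiler; the function names are hypothetical, and it says nothing about which LLVM pass, if any, performs the rewrite.

```c
#include <assert.h>
#include <limits.h>

/* Checked add whose overflow flag is discarded: only the sum is live. */
int add_flag_ignored(int a, int b) {
    int sum;
    (void)__builtin_sadd_overflow(a, b, &sum);
    return sum;
}

/* Plain wrapping add, written with unsigned arithmetic because signed
 * overflow is undefined in C. Converting an out-of-range result back to
 * int is implementation-defined; GCC and Clang wrap modulo 2^N. */
int add_wrapping(int a, int b) {
    return (int)((unsigned)a + (unsigned)b);
}

/* Usage: the two functions agree on every pair of sample inputs,
 * including the pairs that overflow. */
void check_equivalence(void) {
    const int samples[] = {0, 1, -1, 42, INT_MAX, INT_MIN};
    const int n = (int)(sizeof samples / sizeof samples[0]);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(add_flag_ignored(samples[i], samples[j]) ==
                   add_wrapping(samples[i], samples[j]));
}
```

Comparing the -O2 assembly of the two functions is one way to see whether a given compiler still emits a setcc for the ignored flag.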
