D S Khudia
2011-Jul-06 23:02 UTC
[LLVMdev] code generation removes duplicated instructions
Hi Renato,

I am trying to add an intrinsic call between the two similar instructions, which I'll either remove or convert to a nop in codegen. Does that seem appropriate for the purpose here?

Thanks
Daya

On Wed, Jul 6, 2011 at 11:55 AM, Renato Golin <renato.golin at arm.com> wrote:
> On 6 July 2011 15:57, D S Khudia <daya.khudia at gmail.com> wrote:
> > Since I am inserting a new basic block (contains printf statement and
> > program exit) which is jumped upon based on the result of the
> > comparison, the compiler cannot/should not optimize that away by means
> > of DCE or anything else.
>
> It most certainly can, since the comparison always yields the same
> result. The compiler can replace all that by a simple branch to
> whatever block always executes.
>
> > In the above example even the operands are not same and I guess compiler
> > cannot be that smart at -O0. I sense something is wrong with the code
> > generation for ARM.
>
> There are no hard rules stopping any compiler from running DCE or DAG
> combining/elimination or whatever at O0 or any other level.
>
> > What other way do you suggest for duplicating since you mentioned I
> > shouldn't rely on duplication the way I am doing it?
> > Thanks a lot. I really appreciate your help.
>
> What you need is a way to make sure it won't be optimised away,
> possibly even on higher O-levels. You need to add a dependency that
> the compiler cannot see through (or is not smart enough at O0 to do
> so, at least). Because this is a transient change, as a way to debug
> other problems, you're allowed to do ugly stuff.
>
> For example, you can get a fake offset from an argument, and guarantee
> that this is always zero. Or you can add a random offset internally to
> one of them, and then compare that one is exactly the offset higher
> (or lower) than the other. Or you could pass %a and %b as parameters
> and memcopy them in the caller.
>
> Anything that would make it difficult for the compiler to see that
> both are the same would do...
>
> cheers,
> --renato
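The "fake offset that is always zero" idea from the quoted reply could look roughly like the following at the C level. This is only a sketch under assumptions: the names hv_zero_offset and hv_values_agree are hypothetical, and a volatile global is used here in place of the function argument Renato mentions, so that the compiler has to load the offset at run time and cannot assume it is zero.

```c
#include <stdbool.h>

/* Hypothetical name. The object always holds zero, but because it is
 * volatile the compiler must load it at run time and cannot assume its value. */
static volatile int hv_zero_offset = 0;

/* Returns true when the primary and shadow results agree. */
static bool hv_values_agree(int input)
{
    int primary = input + 1;
    /* The shadow value depends on a run-time load, so the compiler
     * cannot prove at compile time that it equals primary. */
    int shadow = input + hv_zero_offset + 1;
    return primary == shadow;
}
```

Note that this only stops the compiler from proving the comparison constant. An optimizer may still simplify the test algebraically so that it checks little more than the offset, so the trick mostly helps at low optimization levels, in line with the "not smart enough at O0" caveat above.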
Renato Golin
2011-Jul-07 08:05 UTC
[LLVMdev] code generation removes duplicated instructions
On 7 July 2011 00:02, D S Khudia <daya.khudia at gmail.com> wrote:
> I am trying to add an intrinsic call between the similar two instructions
> which either I'll remove or convert to nop in codegen.

If the two instructions are only similar in your real example, then you need to make them similar in your test, not identical. Different offsets, different array...

If the two are identical in the real example, as they are in your test, then you don't need to worry about it, because the back-end will remove them anyway.

> Does that kind of seem appropriate for the purpose here?

If you're adding the builtin call just to mark the code in some way, I suggest using metadata. If that builtin has some function that is necessary in between two similar instructions, then it looks ok.

Sorry if I'm being vague, I'm not fully getting the original problem...

cheers,
--renato
D S Khudia
2011-Jul-07 18:15 UTC
[LLVMdev] code generation removes duplicated instructions
Ok. Let me describe the problem again in some detail.

The following is the original bitcode from a real testcase:

    bb7:
      %46 = load i32* %j, align 4
      %47 = add nsw i32 %46, 1
      store i32 %47, i32* %j, align 4
      br label %bb8

To protect the operand of the store, I duplicate the input chain of operands and insert a comparison to check whether the operands of the store are correct. As a result of this modification the code looks as follows. Here the instructions with HV in their names are the extra inserted instructions, and the relExit BB contains a printf message which is executed if the comparison fails. This is the final code which is supposed to execute on a simulator or real hardware.

    bb7:
      %46 = load i32* %j, align 4
      %47 = add nsw i32 %46, 1
      %HV7_ = add nsw i32 %46, 1
      %HVCmp13 = icmp ne i32 %47, %HV7_
      br i1 %HVCmp13, label %relExit, label %bb7.split

    bb7.split:
      store i32 %47, i32* %j, align 4
      br label %bb8

The following is the ARM code generated for the above blocks (llc with -O0):

    .LBB0_26:                               @ %bb7
                                            @   in Loop: Header=BB0_28 Depth=2
            ldr     r0, [sp, #440]
            add     r0, r0, #1
            cmp     r0, r0
            str     r0, [sp, #288]
            bne     .LBB0_88
            b       .LBB0_27
    .LBB0_27:                               @ %bb7.split
                                            @   in Loop: Header=BB0_28 Depth=2
            ldr     r0, [sp, #288]
            str     r0, [sp, #440]

The code generated for x86 for the same two blocks looks as follows (again llc with -O0):

    .LBB0_26:                               # %bb7
                                            #   in Loop: Header=BB0_28 Depth=2
            movl    564(%esp), %eax
            movl    %eax, 400(%esp)         # 4-byte Spill
            addl    $1, %eax
            movl    400(%esp), %ecx         # 4-byte Reload
            addl    $1, %ecx
            cmpl    %ecx, %eax
            movl    %eax, 396(%esp)         # 4-byte Spill
            jne     .LBB0_88
    # BB#27:                                # %bb7.split
                                            #   in Loop: Header=BB0_28 Depth=2
            movl    396(%esp), %eax         # 4-byte Reload
            movl    %eax, 564(%esp)

The problem here is that, in the case of ARM code generation, the backend has effectively removed the comparison of the two different computations (it's comparing a register with itself: cmp r0, r0). I want to have that comparison in the final code. The final code would be running on hardware or a simulator, so that whenever a transient error occurs I can detect it and trigger a recovery.

My goal here is to achieve the duplicated computation for ARM (the same as is visible in the x86 asm, e.g. adding the one twice). Is there a way I can achieve it? Does the problem make more sense now?

Thank you for your help.

Thanks
Daya
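For readers who find the IR hard to follow, the instrumented bb7 block described in this message corresponds roughly to the C below. This is a sketch under assumptions, not the pass used in the thread: the names rel_exit and checked_increment are hypothetical, and the volatile temporaries are an illustrative way of routing the shadow chain through memory so that a C compiler should not merge the two additions. The thread itself does not settle on a fix for the ARM backend.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the relExit block: print a message and terminate. */
static void rel_exit(void)
{
    printf("HV check failed: primary and shadow values differ\n");
    exit(EXIT_FAILURE);
}

/* C-level analogue of the instrumented bb7: j = j + 1 with a checked store. */
static void checked_increment(int *j)
{
    /* Assumption for illustration: volatile temporaries make each access
     * observable, so the shadow addition has to be carried out separately. */
    volatile int shadow_input;
    volatile int shadow_result;

    int loaded = *j;                    /* %46 */
    int primary = loaded + 1;           /* %47 */

    shadow_input = loaded;
    shadow_result = shadow_input + 1;   /* %HV7_ */

    if (primary != shadow_result)       /* %HVCmp13 */
        rel_exit();                     /* branch to relExit */

    *j = primary;                       /* store in bb7.split */
}
```

Calling checked_increment on a variable holding 46 leaves 47 in it on the fault-free path. The mismatch branch is only taken if the two chains disagree, which is the situation the duplication is meant to detect.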