Patrik Hägglund
2012-Feb-01 11:11 UTC
[LLVMdev] Issues with the llvm.stackrestore intrinsic
Hi, I have two problems regarding the llvm.stackrestore intrinsic. I'm running on 3.0, but a quick test on trunk also showed the same behavior. First problem: --------------- I have code like: tmp1 = call llvm.stacksave() tmp2 = alloca [do some stuff with tmp2] call llvm.stackrestore(tmp1) [some other stuff] tmp3 = call llvm.stacksave() tmp4 = alloca [do some stuff with tmp4] call llvm.stackrestore(tmp3) Then some transformation rewrites this to tmp1 = call llvm.stacksave() tmp2 = alloca [do some stuff with tmp2] call llvm.stackrestore(tmp1) [some other stuff] tmp3 = call llvm.stacksave() tmp4 = tmp2 <----- Ops!!! [do some stuff with tmp4] call llvm.stackrestore(tmp3) Unfortunately the tmp2 pointer isn't valid after the first stackrestore, since the memory it's pointing at has in fact been deallocated by the intrinsic, so the uses of it through the variable tmp4 are wrong. Maybe some dependencies between alloca and the stackrestore instrinsic are missing or how should this work? In Intrinsics.td it says // Note: we treat stacksave/stackrestore as writemem because we don't otherwise // model their dependencies on allocas. def int_stacksave : Intrinsic<[llvm_ptr_ty]>, GCCBuiltin<"__builtin_stack_save">; def int_stackrestore : Intrinsic<[], [llvm_ptr_ty]>, GCCBuiltin<"__builtin_stack_restore">; Does "GCCBuiltin" imply "writemem", or how does the comment and the code correspond? Second problem: --------------- It seems that calls to stackrestore are removed if they are close to a ret instruction. I suppose the idea is that the function epilogue will restore the stack pointer anyway, so the llvm.stackrestore can be safely removed. However, in my target, the stack restoration in the epilogue is currently done by adding the used stacksize to the stackpointer, so I do indeed need the call to the stackrestore intrinsic to be left. Am I breaking some design rule by reading the stackpointer in the epilogue or how is this supposed to work?
Anton Korobeynikov
2012-Feb-01 16:05 UTC
[LLVMdev] Issues with the llvm.stackrestore intrinsic
Hello Patrik,> In Intrinsics.td it says > > // Note: we treat stacksave/stackrestore as writemem because we don't > otherwise > // model their dependencies on allocas. > def int_stacksave : Intrinsic<[llvm_ptr_ty]>, > GCCBuiltin<"__builtin_stack_save">; > def int_stackrestore : Intrinsic<[], [llvm_ptr_ty]>, > GCCBuiltin<"__builtin_stack_restore">; > > Does "GCCBuiltin" imply "writemem", or how does the comment and the code > correspond?"writemem" is by default. Look into the beginning of this .td file for description of various flags. -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University
Patrik Hägglund
2012-Feb-03 09:32 UTC
[LLVMdev] Issues with the llvm.stackrestore intrinsic - now LoopRotation handling of alloca
Hi, I've tracked the first problem (mentioned in my previous email, quoted below) down further, ending up in the handling of alloca in LoopRotation.cpp (from trunk): // If the instruction's operands are invariant and it doesn't read or write // memory, then it is safe to hoist. Doing this doesn't change the order of // execution in the preheader, but does prevent the instruction from // executing in each iteration of the loop. This means it is safe to hoist // something that might trap, but isn't safe to hoist something that reads // memory (without proving that the loop doesn't write). if (L->hasLoopInvariantOperands(Inst) && !Inst->mayReadFromMemory() && !Inst->mayWriteToMemory() && !isa<TerminatorInst>(Inst) && !isa<DbgInfoIntrinsic>(Inst)) { Inst->moveBefore(LoopEntryBranch); continue; } The above code happily moves an alloca instruction out of the loop, to the new loop header. Shouldn't we also check on !isa<AllocaInst>(Inst) before allowing to move it? The code that breaks because of the moved alloca looks like start: [...] br loopstart loopbody: [use alloc'd mem] call stackrestore(%oldsp) br loopstart loopstart: %oldsp = call stacksave() alloca [...] brcond loopbody, end end: [use alloc'd mem] call stackrestore(%oldsp) Then LoopRotation clones the stacksave from bb3 to bb1 and _moves_ the alloca from bb3 to bb1. So the alloca is moved outside the loop instead of having it execute each lap. But there still is a stackrestore inside the loop that will "deallocate" the memory, so all but the first lap of the loop will access deallocated memory. So, shouldn't alloca instructions be cloned rather than moved? And maybe there are other instructions with side effects that also should be cloned. Patrik Hägglund wrote 2012-02-01 12:11:> Hi, > > I have two problems regarding the llvm.stackrestore intrinsic. I'm > running on 3.0, but a quick test on trunk also showed the same behavior. > > First problem: > --------------- > I have code like: > > tmp1 = call llvm.stacksave() > tmp2 = alloca > [do some stuff with tmp2] > call llvm.stackrestore(tmp1) > > [some other stuff] > > tmp3 = call llvm.stacksave() > tmp4 = alloca > [do some stuff with tmp4] > call llvm.stackrestore(tmp3) > > Then some transformation rewrites this to > > tmp1 = call llvm.stacksave() > tmp2 = alloca > [do some stuff with tmp2] > call llvm.stackrestore(tmp1) > > [some other stuff] > > tmp3 = call llvm.stacksave() > tmp4 = tmp2<----- Ops!!! > [do some stuff with tmp4] > call llvm.stackrestore(tmp3) > > Unfortunately the tmp2 pointer isn't valid after the first stackrestore, > since the memory it's pointing at has in fact been deallocated by the > intrinsic, so the uses of it through the variable tmp4 are wrong. > > Maybe some dependencies between alloca and the stackrestore instrinsic > are missing or how should this work? > > In Intrinsics.td it says > > // Note: we treat stacksave/stackrestore as writemem because we don't > otherwise > // model their dependencies on allocas. > def int_stacksave : Intrinsic<[llvm_ptr_ty]>, > GCCBuiltin<"__builtin_stack_save">; > def int_stackrestore : Intrinsic<[], [llvm_ptr_ty]>, > GCCBuiltin<"__builtin_stack_restore">; > > Does "GCCBuiltin" imply "writemem", or how does the comment and the code > correspond? > > > Second problem: > --------------- > It seems that calls to stackrestore are removed if they are close to a > ret instruction. I suppose the idea is that the function epilogue will > restore the stack pointer anyway, so the llvm.stackrestore can be safely > removed. > > However, in my target, the stack restoration in the epilogue is > currently done by adding the used stacksize to the stackpointer, so I do > indeed need the call to the stackrestore intrinsic to be left. > > Am I breaking some design rule by reading the stackpointer in the > epilogue or how is this supposed to work?
Eli Friedman
2012-Feb-03 09:57 UTC
[LLVMdev] Issues with the llvm.stackrestore intrinsic - now LoopRotation handling of alloca
2012/2/3 Patrik Hägglund <patrik.h.hagglund at ericsson.com>:> Hi, > > I've tracked the first problem (mentioned in my previous email, quoted > below) down further, ending up in the handling of alloca in > LoopRotation.cpp (from trunk): > > // If the instruction's operands are invariant and it doesn't read > or write > // memory, then it is safe to hoist. Doing this doesn't change the > order of > // execution in the preheader, but does prevent the instruction from > // executing in each iteration of the loop. This means it is safe > to hoist > // something that might trap, but isn't safe to hoist something > that reads > // memory (without proving that the loop doesn't write). > if (L->hasLoopInvariantOperands(Inst) && > !Inst->mayReadFromMemory() && !Inst->mayWriteToMemory() && > !isa<TerminatorInst>(Inst) && !isa<DbgInfoIntrinsic>(Inst)) { > Inst->moveBefore(LoopEntryBranch); > continue; > } > > The above code happily moves an alloca instruction out of the loop, to > the new loop header. Shouldn't we also check on > > !isa<AllocaInst>(Inst) > > before allowing to move it? > > > The code that breaks because of the moved alloca looks like > > start: > [...] > br loopstart > > loopbody: > [use alloc'd mem] > call stackrestore(%oldsp) > br loopstart > > loopstart: > %oldsp = call stacksave() > alloca > [...] > brcond loopbody, end > > end: > [use alloc'd mem] > call stackrestore(%oldsp) > > Then LoopRotation clones the stacksave from bb3 to bb1 and _moves_ the > alloca from bb3 to bb1. So the alloca is moved outside the loop instead > of having it execute each lap. But there still is a stackrestore inside > the loop that will "deallocate" the memory, so all but the first lap of > the loop will access deallocated memory. > > So, shouldn't alloca instructions be cloned rather than moved? And maybe > there are other instructions with side effects that also should be cloned.I haven't read your description very carefully, but your analysis looks roughly correct; we've had similar issues with stacksave/stackrestore before. See r143093, for example. -Eli
Possibly Parallel Threads
- [LLVMdev] Issues with the llvm.stackrestore intrinsic - now LoopRotation handling of alloca
- [LLVMdev] Issues with the llvm.stackrestore intrinsic - now LoopRotation handling of alloca
- [LLVMdev] Explicitly Freeing Allocas
- [LLVMdev] Interaction of stacksave/restore and stack spills
- [LLVMdev] Proposing a new 'alloca' parameter attribute to implement the Microsoft C++ ABI