Sanjoy Das
2011-Aug-29 21:41 UTC
[LLVMdev] [PATCH] Split init.trampoline into init.trampoline & adjust.trampoline
Hi! Attached patches split init.trampoline into adjust.trampoline and init.trampoline, like in gcc. As mentioned in the previous mail, I've not made a documentation patch, since I'm not sure about what the documented semantics of llvm.adjust.trampoline should be. Thanks! -- Sanjoy Das http://playingwithpointers.com -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-New-trampoline-intrinsics-for-LLVM.patch Type: text/x-diff Size: 16784 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110830/7524baf1/attachment.patch> -------------- next part -------------- A non-text attachment was scrubbed... Name: 0002-Fix-the-InstCombine-pass.patch Type: text/x-diff Size: 7613 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110830/7524baf1/attachment-0001.patch> -------------- next part -------------- A non-text attachment was scrubbed... Name: 0003-AutoUpgrade-for-llvm.init.trampoline.patch Type: text/x-diff Size: 3290 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110830/7524baf1/attachment-0002.patch> -------------- next part -------------- A non-text attachment was scrubbed... Name: 0004-Test-cases.patch Type: text/x-diff Size: 7943 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110830/7524baf1/attachment-0003.patch>
Duncan Sands
2011-Aug-31 15:56 UTC
[LLVMdev] [PATCH] Split init.trampoline into init.trampoline & adjust.trampoline
Hi Sanjoy, the first and last patches look good (except that you didn't add any tests for the auto-upgrade functionality). Comments on the other two below.> Attached patches split init.trampoline into adjust.trampoline and > init.trampoline, like in gcc. > > As mentioned in the previous mail, I've not made a documentation > patch, since I'm not sure about what the documented semantics of > llvm.adjust.trampoline should be.Here's what the GCC docs say: @hook TARGET_TRAMPOLINE_ADJUST_ADDRESS This hook should perform any machine-specific adjustment in the address of the trampoline. Its argument contains the address of the memory block that was passed to @code{TARGET_TRAMPOLINE_INIT}. In case the address to be used for a function call should be different from the address at which the template was stored, the different address should be returned; otherwise @var{addr} should be returned unchanged. If this hook is not defined, @var{addr} will be used for function calls. @end deftypefn Perhaps you can come up with something based on that. When I initially implemented the trampoline stuff I probably documented adjust_trampoline (which was later removed), so maybe you could also dig that up. By the way an example of adjust_trampoline is ARM, which or's a 1 into the address of the trampoline. When the pointer is called the processor sees the 1 and puts itself into thumb mode.> --- a/lib/Transforms/InstCombine/InstCombineCalls.cpp > +++ b/lib/Transforms/InstCombine/InstCombineCalls.cpp > @@ -12,7 +12,6 @@ > //===----------------------------------------------------------------------===// > > #include "InstCombine.h" > -#include "llvm/IntrinsicInst.h" > #include "llvm/Support/CallSite.h" > #include "llvm/Target/TargetData.h" > #include "llvm/Analysis/MemoryBuiltins.h" > @@ -821,6 +820,93 @@ Instruction *InstCombiner::tryOptimizeCall(CallInst *CI, const TargetData *TD) { > return Simplifier.NewInstruction; > } > > +// Given a call to llvm.adjust.trampoline, find the corresponding call to > +// llvm.init.trampoline.I think you should also say that it only returns a non-null value if the transform is safe to do. It's not just a question of finding the init.tramp call.> +// > +static IntrinsicInst *FindInitTrampoline(Value *Callee) { > + Callee = Callee->stripPointerCasts(); > + > + // This technique does not work (directly) if the trampoline has been > + // bitcasted even once.I find this obscure. Especially as the trampoline is almost always bitcast before being passed as a pointer to init.trampoline and adjust.trampoline. For example like it is in the testcase for this feature. However, using -instcombine, -gvn, -instcombine> + // (hence with -O2 and above) usually works, even with multiple bitcasts. It > + // probably is not worthwhile to-reduplicate the work here in the unlikely > + // case of the code being compiled with only the InstCombine pass enabled.I don't think this comment (including the bit before) should exist. Passes rely on other passes have put things in decent shape all the time, and don't bother to say so.> + > + // First check if this instruction is actually an llvm.adjust.trampoline > + IntrinsicInst *AdjustTramp = dyn_cast<IntrinsicInst>(Callee); > + if (!AdjustTramp || > + AdjustTramp->getIntrinsicID() != Intrinsic::adjust_trampoline) > + return 0; > + > + // The first argument passed to adjust_trampoline is the trampoline argument. > + // Get a hold of this. > + Value *TrampMem = AdjustTramp->getOperand(0); > + > + // The first thing to check is if there is a direct path from a > + // init.trampoline instruction to this adjust.trampoline instruction (both > + // operating on the same trampoline). Of course, this is useless if there are > + // writes to the trampoline between the two instructions. > + > + // We look at all the instructions before the adjust.trampoline in its basic > + // block. > + > + BasicBlock::iterator InstIterator = AdjustTramp, > + InstBegin = AdjustTramp->getParent()->begin(); > + IntrinsicInst *In = NULL;Why is In defined here and not locally inside the loop below?> + > + InstIterator--;What if AdjustTramp was the first instruction in the basic block? I think you should first compare InstIterator with InstBegin and only after decrement the iterator.> + > + while(InstIterator != InstBegin) { > + // Check if we got lucky. > + In = dyn_cast<IntrinsicInst>(InstIterator); > + if (In && In->getIntrinsicID() == Intrinsic::init_trampoline && > + In->getOperand(0) == TrampMem) > + return In; > + > + // If there is a write to the trampoline, then the sifting through previous > + // instructions is useless. > + if (InstIterator->mayWriteToMemory()) {Here you should just always give up. After all, the trampoline memory might be a global variable, or a pointer to it might have been squirrelled away somewhere so any call might modify the memory without using TrampMem explicitly. Not to mention that TrampMem is probably a bitcast of the actual memory (eg: alloca), and instructions might be writing to the alloca directly rather than via this bitcast. And so on. You can't do better without making use of alias analysis, and you're not allowed to do that in instcombine.> + for (unsigned i = 0; i < InstIterator->getNumOperands(); i++) > + if (InstIterator->getOperand(i) == TrampMem) > + break; > + } > + > + InstIterator--; > + } > + > + // We can also check for the case where the trampoline memory is straight out > + // of an alloca. Since we now know the origin of the memory, we can simply > + // count its uses. > + > + GetElementPtrInst *GEPI = dyn_cast<GetElementPtrInst>(TrampMem);I think you should also handle bitcast here. You can do it like this: Value *Underlying = TrampMem->stripPointerCasts(); if (!Underlying->hasOneUse() || *Underlying.use_begin() != TrampMem) return 0;> + if (GEPI && GEPI->hasOneUse()) { > + AllocaInst *AI = dyn_cast<AllocaInst>(GEPI->getOperand(0)); > + if (AI == NULL || !AI->hasOneUse()) > + return 0; > + } else { > + return 0; > + }You should do early returns if possible: if (!AI) return 0;> + > + if (!TrampMem->hasNUses(2)) > + return 0;You should do this test earlier, because if it fails you didn't need to bother with the GEP/bitcast/alloca logic. Also, I'm worried that the case of multiple adjust.trampoline calls won't be handled properly. If adjust.trampoline was declared IntrNoMem then probably GVN would unify them (depends on dominance whether this is possible). But it isn't declared like that (but maybe it should be) which makes things harder for GVN. So I reckon you should maybe allow TrampMem to have lots of uses, and check that they are all calls to adjust.trampoline or init.trampoline (at most one of these). [Having lots of adjust.trampoline calls in the original bitcode is the common case].> + > + // At this point, there are be two possibilities: either init.trampoline wasare be -> are> + // called before adjust.trampoline or after it.This reads oddly because you don't even know that there is an init.trampoline call yet. The first case can be> + // correctly optimized. The second case has undefined behavior, so we > + // optimize it anyway. > + > + Value::use_iterator I = TrampMem->use_begin(); > + if (*I == AdjustTramp) // We don't need *this* use. > + ++I; > + > + In = dyn_cast<IntrinsicInst>(*I); > + if (In && In->getIntrinsicID() == Intrinsic::init_trampoline) > + return In;You also have to check that TrampMem is used as the correct argument to init.trampoline, since a priori some evil person may be using it as the second or third argument, rather than the first! Finally I think the alloca case should come before the "hunt back in the basic block" logic, because it is the most common.> + > + // Could not find the init.trampoline call. > + return 0; > +} > + > // visitCallSite - Improvements for call and invoke instructions. > // > Instruction *InstCombiner::visitCallSite(CallSite CS) {> --- a/lib/VMCore/AutoUpgrade.cpp > +++ b/lib/VMCore/AutoUpgrade.cpp > @@ -43,6 +43,32 @@ static bool UpgradeIntrinsicFunction1(Function *F, Function *&NewFn) { > > switch (Name[0]) { > default: break; > + case 'i': > + // This upgrades the old llvm.init.trampoline to the new > + // llvm.init.trampoline and llvm.adjust.trampoline pair. > + if (Name == "init.trampoline") { > + Type *ReturnTy = FTy->getReturnType(); > + > + // The new llvm.init.trampoline returns nothing. > + if (ReturnTy->getTypeID() == Type::VoidTyID)Simpler: if (ReturnTy->isVoidTy())> + break; > + > + assert(ReturnTy->isPointerTy () && > + "old init.trampoline returns a pointer");Is this worth checking? If not (I think not) you can get rid of the variable ReturnTy altogether.> + assert(FTy->getNumParams() == 3 && "old init.trampoline takes 3 args!"); > + > + std::string NameTmp = F->getName(); > + F->setName(""); > + > + // Like in llvm.prefetch below, change the name of the old intrinsic so > + // that we can play with its type.I don't think you should mention llvm.prefetch. It's like using a "goto": the documentation is jumping to some place far away then back! If someone changes the llvm.prefetch logic one day your comment will get stale. I suggest you just remove "Like in llvm.prefetch below, ".> + NewFn = cast<Function>(M->getOrInsertFunction( > + NameTmp, > + Type::getVoidTy(M->getContext()), > + FTy->getParamType(0), FTy->getParamType(1), > + FTy->getParamType(2), (Type *)0)); > + return true; > + } > case 'p': > // This upgrades the llvm.prefetch intrinsic to accept one more parameter, > // which is a instruction / data cache identifier. The old version only > @@ -216,6 +242,34 @@ void llvm::UpgradeIntrinsicCall(CallInst *CI, Function *NewFn) { > CI->eraseFromParent(); > break; > } > + case Intrinsic::init_trampoline: { > + > + // Transform > + // %tramp = call i8* llvm.init.trampoline (i8* x, i8* y, i8* z) > + // to > + // call void llvm.init.trampoline (i8* %x, i8* %y, i8* %z) > + // %tramp = call i8* llvm.adjust.trampoline (i8* %x) > + > + Type *PointerTy = NewFn->getFunctionType()->getParamType(0); > + Function *AdjustTrampolineFn > + cast<Function>(F->getParent()->getOrInsertFunction( > + "llvm.adjust.trampoline", PointerTy, PointerTy, > + (Type *)0));I think you should use Intrinsic::getDeclaration to get AdjustTrampolineFn.> + > + IRBuilder<> Builder(C); > + Builder.SetInsertPoint(CI->getParent(), CI);You can just use: Builder.SetInsertPoint(CI);> + > + Builder.CreateCall3(NewFn, CI->getArgOperand(0), CI->getArgOperand(1), > + CI->getArgOperand(2)); > + > + CallInst *AdjustCall = Builder.CreateCall(AdjustTrampolineFn, > + CI->getArgOperand(0), > + CI->getName()); > + if (!CI->use_empty()) > + CI->replaceAllUsesWith(AdjustCall); > + CI->eraseFromParent(); > + break; > + } > } > }Please don't forget to add an auto-upgrade testcase. Ciao, Duncan.
Sanjoy Das
2011-Sep-03 13:09 UTC
[LLVMdev] [PATCH] Split init.trampoline into init.trampoline & adjust.trampoline
Hi! Thank you for the detailed review. I've attached a new set of patches. -- Sanjoy Das http://playingwithpointers.com -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-New-trampoline-intrinsics-for-LLVM.patch Type: text/x-diff Size: 16783 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110903/f3ced529/attachment.patch> -------------- next part -------------- A non-text attachment was scrubbed... Name: 0002-Fix-the-InstCombine-pass.patch Type: text/x-diff Size: 7734 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110903/f3ced529/attachment-0001.patch> -------------- next part -------------- A non-text attachment was scrubbed... Name: 0003-Test-cases-for-the-new-trampoline-intrinsics.patch Type: text/x-diff Size: 9858 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110903/f3ced529/attachment-0002.patch> -------------- next part -------------- A non-text attachment was scrubbed... Name: 0004-AutoUpgrade-for-llvm.init.trampoline.patch Type: text/x-diff Size: 2989 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110903/f3ced529/attachment-0003.patch> -------------- next part -------------- A non-text attachment was scrubbed... Name: 0005-Test-case-for-the-trampoline-AutoUpgrade-code.patch Type: text/x-diff Size: 1165 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110903/f3ced529/attachment-0004.patch>
Possibly Parallel Threads
- [LLVMdev] [PATCH] Split init.trampoline into init.trampoline & adjust.trampoline
- [LLVMdev] [RFC] Splitting init.trampoline into init.trampoline and adjust.trampoline
- [LLVMdev] [RFC] Splitting init.trampoline into init.trampoline and adjust.trampoline
- [LLVMdev] [PATCH] Split init.trampoline into init.trampoline & adjust.trampoline
- [LLVMdev] [PATCH] Go on dragonegg