Mathias Payer via llvm-dev
2017-Mar-04 18:16 UTC
[llvm-dev] Figuring out return address of a call
Hi folks,

I'm trying to figure out the return address of a function in an LLVM pass, i.e., the byte address right after the end of the call instruction (so that I can initialize a global variable with the return address of a function for a sanity check). Due to some other constraints, I have to run this pass somewhere in the midend.

At a high level, I want to find the address after a call instruction (my main target is x86_64 for now) at runtime; see the two examples below:

    100: e8 ff ff ff ff    callq func
    105: .marker

    100: ff d0             callq *%rax
    102: .marker

My approach is to find call instructions through a function pass, split the basic block *after* the call instruction, then generate a BlockAddress as follows:

    if (auto *CL = dyn_cast<CallInst>(&*I)) {
      // Block that currently contains the call.
      BasicBlock *callblock = CL->getParent();
      // Split right after the call; postblock starts at the next instruction.
      BasicBlock *postblock = callblock->splitBasicBlock(CL->getNextNode());
      // Address of the new block, intended to be the call's return address.
      BlockAddress *retaddr = BlockAddress::get(postblock);
      ...
    }

This works well except that the BlockAddress is slightly off. I run into the problem that during code generation, my BlockAddress is moved past the instructions that store the return value. E.g., if the function returns a value, %rax is first spilled somewhere and my BlockAddress points to the end of, e.g., the movq instruction.

Is there a better way to retrieve the address right after the call instruction (i.e., before the return value is stored)?

Thanks,
Mathias
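For the two encodings in the example above, the expected return address is simply the address of the call plus the length of its encoding (0x100 + 5 = 0x105 and 0x100 + 2 = 0x102). The sketch below spells out that arithmetic. It assumes only those two forms (e8 rel32 and a register-indirect ff /2 without a REX prefix); callLength and returnAddress are hypothetical helper names, not LLVM APIs, and this is not a general x86 decoder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: length in bytes of the two x86_64 call encodings
// from the example, or 0 if the bytes are not one of those two forms.
std::size_t callLength(const std::vector<std::uint8_t> &bytes) {
  if (bytes.empty())
    return 0;
  // e8 rel32: direct near call, one opcode byte plus a 4-byte displacement.
  if (bytes[0] == 0xe8)
    return 5;
  // ff /2 with a register operand: ModRM mod == 11b and reg == 010b,
  // e.g. ff d0 for callq *%rax. REX-prefixed forms are not handled.
  if (bytes[0] == 0xff && bytes.size() >= 2 &&
      (bytes[1] & 0xc0) == 0xc0 && ((bytes[1] >> 3) & 0x7) == 2)
    return 2;
  return 0;
}

// Return address = address of the call instruction + its encoded length.
// Asserts that the encoding is one of the two recognized forms.
std::uint64_t returnAddress(std::uint64_t callAddr,
                            const std::vector<std::uint8_t> &bytes) {
  std::size_t len = callLength(bytes);
  assert(len != 0 && "unrecognized call encoding");
  return callAddr + len;
}
```

Any instruction the code generator places between the call and the start of the split-off block (such as a spill of %rax) shifts the block's address past this value, which is the mismatch described above.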
Sanjoy Das via llvm-dev
2017-Mar-04 20:30 UTC
Hi Mathias,

I don't think you can do this using block addresses (since, as you said, the BlockAddress will be slightly off in most cases).

I'd suggest (ab)using the patchpoint[1] intrinsic for this purpose. Instead of calling

    @foo(i32 1, float 2.0)

you'll have to do something like

    @llvm.experimental.patchpoint(i64 <id>, i32 5, @foo, i32 2, i32 1, float 2.0)

which will get lowered to a normal 5-byte call, and to an entry in the __llvm_stackmaps section (which will state the return PC).

It may be difficult to tie a given __llvm_stackmaps entry back to a specific call in the IR (the ID is not sufficient, since duplicated patchpoint calls will share the same ID), but you should be able to reify whatever information you need to associate with a given return site as extra "live values" to patchpoint. However, using patchpoint in mid-level IR will inhibit inlining.

What are you actually trying to do with this RPC information?

On March 4, 2017 at 10:17:00 AM, Mathias Payer via llvm-dev (llvm-dev at lists.llvm.org) wrote:
> This works well except that the BlockAddress is slightly off. I run into
> the problem that during code generation, my BlockAddress is moved past
> the instructions that store arguments. E.g., if the function returns an
> argument, %rax is first spilled somewhere and my BlockAddress points to
> the end of, e.g., the movq instruction.

Btw, there is no guarantee that the store of %RAX will be the only instruction between callblock and postblock. I know the mid-level optimizer is conservative around blocks whose address has been taken, but at the very least the register allocator can emit arbitrary spills / fills there.

[1]: http://llvm.org/docs/StackMaps.html#llvm-experimental-patchpoint-intrinsic

-- Sanjoy
Mathias Payer via llvm-dev
2017-Mar-04 21:06 UTC
Hi Sanjoy,

Thanks for the quick reply, that's very helpful.

> What are you actually trying to do with this RPC information?

I'm working on an optimized/fast shadow stack to protect against ROP attacks. Most of the instrumentation could be done in the backend but some of the analysis needs to be done in the midend.

I feared someone would point me towards intrinsics. I'll try to either abuse the patchpoints as you suggested (at first glance it looks feasible) or split my pass into two stages, where I store some information in the midend and then inject the code directly in the backend to get around this "moving addresses" problem (which is likely the cleaner approach). I'll have to explore what works better.

> Btw, there is no guarantee that the store of %RAX will be the only
> instruction between callblock and postblock -- I know the mid level
> optimizer is conservative around blocks whose address has been taken,
> but at the very least the register allocator can emit arbitrary spills
> / fills there.

Yes, I tried to come up with a simple example. If the function returns a struct there's a whole bunch of spilling going on :)

Thanks again,
Mathias
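For readers unfamiliar with the term, the general idea behind a shadow stack can be modeled in a few lines: the call-site instrumentation records the expected return address, and the return instrumentation checks the address actually used against it. The sketch below is a plain software model under that assumption, not the optimized design discussed in this thread; ShadowStack, onCall, onReturn, and shadowStackExample are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a shadow stack; all names are illustrative only.
class ShadowStack {
public:
  // Call-site instrumentation: remember the expected return address.
  void onCall(std::uint64_t returnAddr) { stack_.push_back(returnAddr); }

  // Return instrumentation: pop the expected address and compare.
  // Returns false on a mismatch or if the shadow stack is empty.
  bool onReturn(std::uint64_t actualReturnAddr) {
    if (stack_.empty())
      return false;
    std::uint64_t expected = stack_.back();
    stack_.pop_back();
    return expected == actualReturnAddr;
  }

  // Number of return addresses currently recorded.
  std::size_t depth() const { return stack_.size(); }

private:
  std::vector<std::uint64_t> stack_;
};

// Usage sketch: a matching return passes, an overwritten one is detected.
inline void shadowStackExample() {
  ShadowStack ss;
  ss.onCall(0x105);
  assert(ss.onReturn(0x105));       // return address unchanged
  ss.onCall(0x102);
  assert(!ss.onReturn(0xdeadbeef)); // return address was overwritten
}
```

The check is only meaningful if the recorded value is the exact address following the call instruction, which is why a BlockAddress that lands after a spill is not usable here.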