Paul Muntean via llvm-dev
2017-Jul-05 16:22 UTC
[llvm-dev] Caller callee calling convention enforcement in C++ bin. code
Hi guys,

maybe you can help with an issue I have.

For a C++ program compiled with Clang/LLVM on Ubuntu (x86_64), I want to recover the addresses of all call instructions (C++ object dispatches), or directly the return addresses, each of which is simply the address of the instruction that follows a call.

I think this information is not obtainable at link time, since at that point we only have IR code. Please correct me if I am wrong. So my assumption is that the addresses of the call instructions, and of the instructions that follow them, become available in the compiler back end, after the IR has been lowered to machine code.

Does anybody have a suggestion about where in the compiler I should look?

Since I am new to this topic, suggestions or solutions are highly welcome.

-Paul
Reid Kleckner via llvm-dev
2017-Jul-06 15:53 UTC
[llvm-dev] Caller callee calling convention enforcement in C++ bin. code
Is it enough to compute the set of all possible return addresses, or do you need to limit the set to only C++ method calls?

If you just need the full set of return addresses for a given DSO, I'd recommend disassembling the object after linking, scraping the output for "callq" instructions, and taking the address of the next instruction. This will give you the return address "VA" (I think, in ELF parlance), which is the address of the instruction assuming the ELF binary is loaded at the address listed in its program headers. You can compute the possible return addresses at runtime by adding the difference between the actual addresses that the loader used at runtime and the on-disk p_vaddr values. You can probably discover the load addresses with dl_iterate_phdr.

If you need only some specific annotated list of return addresses, you will probably have to make complicated changes to LLVM that insert labels after certain CALL instructions and emit an object file section with relocations against those labels. This is doable but complicated. You can follow the EH label machinery to see how to insert labels into the instruction stream and create relocations against them from read-only data sections.
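The disassemble-and-scrape approach described above can be sketched roughly as follows. This is only a minimal illustration, not a tested tool: it assumes the tab-separated text layout that GNU objdump -d prints for x86-64 (address, raw bytes, mnemonic), and the function names disassemble, return_addresses and runtime_address are made up for this example. It matches any mnemonic that starts with "call", since some objdump versions print "call" rather than "callq"; instructions printed with a leading prefix mnemonic are not handled. The load_bias parameter stands for the per-object offset between link-time and runtime addresses, which dl_iterate_phdr reports as dlpi_addr.

```python
import subprocess
from typing import List


def disassemble(path: str) -> str:
    """Run `objdump -d` on a linked binary and return its text output."""
    return subprocess.run(
        ["objdump", "-d", path], check=True, capture_output=True, text=True
    ).stdout


def return_addresses(disassembly: str) -> List[int]:
    """Return the link-time VA of the instruction after each call.

    Assumes objdump's layout: "<addr>:\\t<hex bytes>\\t<mnemonic operands>".
    The return address is computed as call address + encoded length, so
    continuation lines (raw bytes without a mnemonic) are added to the
    length of the instruction they belong to.
    """
    results: List[int] = []
    current = None  # [address, byte_count, is_call] of the open instruction

    def flush() -> None:
        if current is not None and current[2]:
            results.append(current[0] + current[1])

    for line in disassembly.splitlines():
        fields = line.split("\t")
        if len(fields) < 2 or not fields[0].strip().endswith(":"):
            continue  # symbol headers, section banners, blank lines
        try:
            address = int(fields[0].strip()[:-1], 16)
            nbytes = len(bytes.fromhex(fields[1]))
        except ValueError:
            continue
        if nbytes == 0:
            continue
        text = fields[2].strip() if len(fields) > 2 else ""
        if text:
            # A mnemonic starts a new instruction; close the previous one.
            flush()
            current = [address, nbytes, text.split()[0].startswith("call")]
        elif current is not None:
            current[1] += nbytes  # continuation of a long encoding
    flush()
    return results


def runtime_address(link_time_va: int, load_bias: int) -> int:
    """Translate a link-time VA into the address seen in the running process."""
    return link_time_va + load_bias
```

For a real binary one would call return_addresses(disassemble("./a.out")), where ./a.out is a placeholder path. For a non-PIE executable the load bias is normally 0, so the link-time and runtime addresses coincide; for a shared object or PIE, each address needs the bias of the object it came from.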