I want to transform an ELF binary to LLVM IR and do some instrumentation based on LLVM.
Is there any tool that can do the transformation?
Thanks in advance.

- mudongliang
Suprateeka R Hegde
2015-Jul-17 07:36 UTC
[LLVMdev] how to transform elf binary to llvm IR?
It's not that easy. Check out projects like McSema.

--
Supra
McSema is one such tool (open source). As of now it supports translating x86 and x86_64 machine code to LLVM IR. You can find more details, including which instructions it supports, at https://github.com/trailofbits/mcsema

--
Thanks & Regards
Mayur Pandey
On 7/17/15 2:09 AM, 慕冬亮 wrote:
> I want to transform elf binary to llvm IR, and do some instrumentation
> based on llvm.
> Is there any tool which can do the transformation?

There is a tool called Revgen that might do what you need, though I don't know whether it fits your use case. Revgen can translate native code to LLVM IR, but I'm not sure whether it can translate the LLVM IR back to native code for execution.

There is also S2E, which does dynamic translation from binary code to LLVM IR; it should be able to run the code after instrumentation.

IIRC, both come from George Candea's group at EPFL. A quick Google search should help you find the code.

Regards,

John Criswell

--
John Criswell
Assistant Professor
Department of Computer Science, University of Rochester
http://www.cs.rochester.edu/u/criswell
This is not an easy task. I believe there is *NO* (open-source) tool that can fully solve this problem (statically). Correct me if I am wrong.

It would be more helpful if you could provide details about what you want to do: static or dynamic? A stripped binary, or a binary with symbolic information? Which compiler do you work with?

Check out the papers below if you are interested.

http://dl.acm.org/citation.cfm?id=2465380
http://dl.acm.org/citation.cfm?id=2462165

Shuai
Every level of translation loses a little of the semantic meaning. [I mean translation in the "human readable -> machine code" sense, not someone translating a literary work from one language to another, although subtle details are often lost there too.] This means that you can almost never completely reconstruct the code in its original form from the machine code, or the C code from the LLVM IR, or the C++ code from the output of something like cfront (the original C++ -> C translator), or the original Pascal code from the output of a Pascal to C compiler, etc.

It is, at least sometimes, possible to reconstruct something from the binary file that can then be "compiled" again [in quotes, as it's a loose term in this discussion], but the result often lacks some of the original subtlety. And there are certainly cases where the original code is very hard to derive from the machine code.

I played with a "symbolic disassembler" many years back. On "well-behaved code" it would reconstruct assembly code that could be recompiled, but it struggled with, for example, switch statements that became a PC-relative jump table: once you modify the code, it couldn't figure out what the jump targets were. That is just one example.

I'm pretty sure it's possible, at least for a human, to write code that is nearly impossible to translate back to a higher-level language. Modern compilers may not use the same kinds of obfuscation, but they certainly produce code that is complex, hard to follow, and does not use the obvious instructions for a given purpose.

--
Mats
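The jump-table problem mentioned above can be illustrated with a toy model. The sketch below is in Python and is not taken from any real disassembler: the instruction names, the `run`, `insert_probe`, and `fix_table` helpers, and the offset encoding are all made up for illustration. It assumes a switch compiled to a dispatch instruction plus a table of offsets relative to that instruction, and shows that inserting one instrumentation instruction leaves the table stale unless the rewriter knows where the table is and relocates it.

```python
# Toy model (hypothetical encoding): a "binary" is a list of instructions,
# and a switch is a "dispatch" instruction plus a table of offsets that are
# relative to the position of that dispatch instruction.

def run(code, table, selector):
    """Follow the PC-relative jump for `selector` and return the instruction reached."""
    dispatch_pc = code.index("dispatch")
    target_pc = dispatch_pc + table[selector]
    return code[target_pc]

def insert_probe(code, at):
    """Naive instrumentation: insert a probe at index `at`, shifting later code by one."""
    return code[:at] + ["probe"] + code[at:]

def fix_table(table, dispatch_pc, at):
    """Relocate offsets whose targets moved.

    Assumes the dispatch instruction sits before the insertion point, and that
    the rewriter already knows where the table is, which is the hard part.
    """
    return [off + 1 if dispatch_pc + off >= at else off for off in table]

code = ["dispatch", "case0", "case1", "case2"]
table = [1, 2, 3]

# Before any modification, selector 1 reaches case1.
assert run(code, table, 1) == "case1"

# Insert a probe in front of case1 without touching the table.
patched = insert_probe(code, 2)

# The stale table now sends selector 1 to the probe and selector 2 to case1.
assert run(patched, table, 1) == "probe"
assert run(patched, table, 2) == "case1"

# With the table relocated, the jumps land on the intended cases again.
fixed = fix_table(table, dispatch_pc=0, at=2)
assert run(patched, fixed, 1) == "case1"
assert run(patched, fixed, 2) == "case2"
```

In this model the fix is a one-liner because the table is handed to the rewriter. A tool working on a real binary first has to discover that the table exists, where it starts, and how many entries it has, which is where the symbolic disassembler described above ran into trouble.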
What we want to do is statically transform a binary (with symbolic information) to LLVM IR. I will then instrument the code at the LLVM IR level. The compiler may be LLVM clang.

2015-07-18 0:11 GMT+08:00 Shuai Wang <wangshuai901 at gmail.com>:
> It would be more helpful if you can provide details about what you want to
> do, say, static or dynamic ? stripped binary or binary with symbolic
> information?
> What compiler do you work on?
On 7/17/2015 2:09 AM, 慕冬亮 wrote:
> I want to transform elf binary to llvm IR, and do some instrumentation
> based on llvm.
> Is there any tool which can do the transformation?

It sounds like what you want to do is some form of binary translation, and, quite frankly, LLVM is going to be a poor choice. LLVM is designed to be a compiler IR, and its optimizations rely on source-level hinting information that is irrevocably lost when converted to machine code. While there do exist several projects that can do some conversion from machine code to IR (Dagger, Fracture, McSema), none of them are sufficiently robust (to my knowledge). In comparison to projects whose raison d'être is binary translation (e.g., Valgrind, Pin), you're not going to see sufficient value-add in using LLVM to outweigh the fact that you're using a very non-robust solution.

If you really want to use LLVM, I'd advise using clang to compile the C/C++ code and do instrumentation passes within the clang compilation process. I would not advise trying to do instrumentation via decompiling binaries to LLVM IR.

--
Joshua Cranmer
Thunderbird and DXR developer
Source code archæologist