* Timo Juhani Lindfors (timo.lindfors at iki.fi) wrote:
> Hi,
>
> Alexandre Gouraud <alexandre.gouraud at enst-bretagne.fr> writes:
> > if it does not already exists, could it mean it is a nonsense, then why?
>
> Why don't you compile your program directly to LLVM bitcode?

- In security testing you sometimes apply black boxing.

I've had a similar idea lately.
http://www.crazylazy.info/blog/content/x86-differently-vine-and-llvm-klee

x86 in general isn't very useful for reverse engineering purposes.
If you could use llvm-qemu to get an intermediate representation of a
specific binary and selectively execute functions symbolically, you'd
have a "fuzzer" that reaches code paths in any case. That's a much
deeper verification. If you read the KLEE research paper and take a look
at the number of overlooked bugs they were able to identify, this could
be very effective.

I don't know how to modify llvm-qemu to translate x86 to LLVM IR. This
is not trivial: qemu is a very limited "emulation". The "target" x86
won't have MSRs and specific instructions. The abstraction level is
higher. However, for unspecific targets it might scale. Marking variables
as symbolic in LLVM bytecode however...

In any case it would be interesting to be able to translate x86 to LLVM
IR. If somebody wants to give that a try, let's make a plan ;).

Have fun,
Marius
Tilmann Scheller
2009-Sep-29 19:29 UTC
[LLVMdev] converting x86 instructions to LLVM instructions
Hi Marius,

On Tue, Sep 29, 2009 at 6:05 PM, Marius <wishinet at googlemail.com> wrote:
> * Timo Juhani Lindfors (timo.lindfors at iki.fi) wrote:
>> Hi,
>>
>> Alexandre Gouraud <alexandre.gouraud at enst-bretagne.fr> writes:
>> > if it does not already exists, could it mean it is a nonsense, then why?
>>
>> Why don't you compile your program directly to LLVM bitcode?
> - In security testing you sometimes apply black boxing.

Once you use the structure of the machine code of the system under test
to generate test cases, it is no longer black box testing though :)

> I've had a similar idea lately.
> http://www.crazylazy.info/blog/content/x86-differently-vine-and-llvm-klee
>
> x86 in general isn't very useful for reverse engineering purposes.
> If you could use llvm-qemu to get an intermediate representation of a
> specific binary and selectively execute functions symbolically, you'd
> have a "fuzzer" that reaches code paths in any case. That's a much
> deeper verification. If you read the KLEE research paper and take a look
> at the number of overlooked bugs they were able to identify, this could
> be very effective.

I agree, this is an interesting idea.

> I don't know how to modify llvm-qemu to translate x86 to LLVM IR. This
> is not trivial: qemu is a very limited "emulation". The "target" x86
> won't have MSRs and specific instructions. The abstraction level is
> higher.

Actually quite the opposite is true :) The emulation is very accurate,
otherwise it would not be possible to take an arbitrary operating system
and run it without modification in full system emulation mode. And this
requires an accurate emulation of other things as well, e.g. the MMU.
After all, the authors of the "Selective Symbolic Execution" paper have
shown that llvm-qemu is suited for this purpose.

Essentially, what happens when llvm-qemu translates a basic block of
machine code is that you get a semantically equivalent version of your
machine code in the form of LLVM IR, with the LLVM IR operating on a
structure which represents the machine state (a bunch of registers and
some additional state). Regardless of how you translate machine code to
LLVM IR, you somehow need to model the machine state. I highly doubt
that LLVM IR generated by llvm-qemu looks much different from LLVM IR
generated by a hand-written frontend which goes directly from machine
code to LLVM IR.

> However, for unspecific targets it might scale. Marking variables
> as symbolic in LLVM bytecode however...

Well, as your input is machine code, you somehow need to specify in which
register you want to put your symbolic value (or at which memory
address). Then you need to map it to LLVM IR, which at least in the
register case is rather straightforward.

> In any case it would be interesting to be able to translate x86 to LLVM
> IR. If somebody wants to give that a try, let's make a plan ;).

Cheers,
Tilmann
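
To make the machine-state point above concrete, here is a minimal Python
sketch. It is not llvm-qemu code and not LLVM IR; every name in it
(CpuState, Sym, add32, translated_block, make_symbolic) is hypothetical
and chosen only for illustration. It models the machine state as a
structure holding a few registers, stands in for a translated basic
block with a function that only reads and writes that structure, and
places a symbolic value in a register before running the block. EFLAGS
and memory are left out to keep it short.

```python
from dataclasses import dataclass, field

MASK32 = 0xFFFFFFFF


@dataclass(frozen=True)
class Sym:
    """A symbolic 32-bit value, kept as an unevaluated expression string."""
    expr: str


def show(value):
    """Render a concrete or symbolic value for use inside an expression."""
    return value.expr if isinstance(value, Sym) else hex(value)


def add32(a, b):
    """32-bit add that accepts concrete ints and symbolic values."""
    if isinstance(a, Sym) or isinstance(b, Sym):
        return Sym(f"({show(a)} + {show(b)})")
    return (a + b) & MASK32


@dataclass
class CpuState:
    """Assumed machine-state structure: just three x86 registers."""
    regs: dict = field(
        default_factory=lambda: {"eax": 0, "ebx": 0, "ecx": 0}
    )


def translated_block(state):
    """Stand-in for the translation of: add eax, ebx ; mov ecx, eax.

    A real 'add' also updates EFLAGS; that is omitted here.
    """
    state.regs["eax"] = add32(state.regs["eax"], state.regs["ebx"])
    state.regs["ecx"] = state.regs["eax"]


def make_symbolic(state, reg, name):
    """Put a fresh symbolic value into the named register."""
    state.regs[reg] = Sym(name)


# Concrete run: the add wraps around at 32 bits.
concrete = CpuState()
concrete.regs.update(eax=0xFFFFFFFF, ebx=2)
translated_block(concrete)

# Symbolic run: eax holds an unknown input, so ecx becomes an expression.
symbolic = CpuState()
symbolic.regs["ebx"] = 2
make_symbolic(symbolic, "eax", "input")
translated_block(symbolic)
```

In the concrete run the block behaves like ordinary emulation. In the
symbolic run the same block code produces an expression over the input
instead of a number, which is the property a symbolic executor relies
on. Choosing the register by name is the "rather straightforward"
register case; a memory address would need a model of memory as well.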