thr3ads.net - llvm dev - [LLVMdev] converting x86 instructions to LLVM instructions [Sep 2009]

Marius

2009-Sep-29 16:05 UTC

[LLVMdev] converting x86 instructions to LLVM instructions

* Timo Juhani Lindfors (timo.lindfors at iki.fi) wrote:
> Hi,
>
> Alexandre Gouraud <alexandre.gouraud at enst-bretagne.fr> writes:
> > if it does not already exists, could it mean it is a nonsense, then why?
>
> Why don't you compile your program directly to LLVM bitcode?

- In security testing you sometimes apply black-box testing.

I've had a similar idea lately. 
http://www.crazylazy.info/blog/content/x86-differently-vine-and-llvm-klee

Raw x86 in general isn't very useful for reverse engineering purposes.
If you could use LLVM-qemu to get an intermediate representation of a
specific binary and selectively execute functions symbolically, you'd
have a "fuzzer" that reaches code paths in any case. That's a much
deeper verification. If you read the KLEE research paper and take a look
at the number of overlooked bugs they were able to identify, this could
be very effective.

I don't know how to modify llvm-qemu to translate x86 to LLVM IL. This
is not trivial: qemu is a very limited "emulation". The "target" x86
won't have MSRs and specific instructions. The abstraction level is
higher. However for unspecific targets it might scale. Marking variables
as symbolic in LLVM bytecode however...

In any case it would be interesting to be able to translate x86 to LLVM
IR. If somebody wants to give that a try let's make a plan ;).

Have fun,
Marius

Tilmann Scheller

2009-Sep-29 19:29 UTC

[LLVMdev] converting x86 instructions to LLVM instructions

Hi Marius,

On Tue, Sep 29, 2009 at 6:05 PM, Marius <wishinet at googlemail.com> wrote:
> * Timo Juhani Lindfors (timo.lindfors at iki.fi) wrote:
>> Hi,
>>
>> Alexandre Gouraud <alexandre.gouraud at enst-bretagne.fr> writes:
>> > if it does not already exists, could it mean it is a nonsense, then why?
>>
>> Why don't you compile your program directly to LLVM bitcode?
> - In security-testing you sometimes apply black boxing.

Once you use the structure of the machine code of the system under
test to generate test cases it is no longer black box testing though
:)

> I've had a similar idea lately.
> http://www.crazylazy.info/blog/content/x86-differently-vine-and-llvm-klee
>
> x86 in general for reverse engineering purposes isn't very useful.
> If you could use LLVM-qemu to get an intermediate representation of a
> specific binary and selectively execute functions symbolically, you'd
> have a "fuzzer" that reaches code-paths - in any case. That's a much
> deeper verification. If you read the KLEE research paper and take a look
> at the number of overlooked bugs they were able to identify, this could
> be very effective.

I agree, this is an interesting idea.

> I don't know how to modify llvm-qemu to translate x86 to LLVM IL. This
> is not trivial: qemu is a very limited "emulation". The "target" x86
> won't have MSRs and specific instructions. The abstraction level is
> higher.

Actually quite the opposite is true :) The emulation is very accurate,
otherwise it would not be possible to take a random operating system
and run it without modification in full system emulation mode. And
this requires an accurate emulation of other things as well, e.g. the
MMU. After all, the authors of the "Selective Symbolic Execution"
paper have shown that llvm-qemu is suited for this purpose.

Essentially what happens when llvm-qemu translates a basic block of
machine code is that you get a semantically equivalent version of your
machine code in the form of LLVM IR, with the LLVM IR operating on a
structure which represents the machine state (a bunch of registers and
some additional state). Regardless of how you translate machine code
to LLVM IR, you somehow need to model the machine state. I highly
doubt that LLVM IR generated by llvm-qemu looks much different from
LLVM IR generated by a hand-written frontend which goes directly from
machine code to LLVM IR.
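
To make the machine-state point concrete, here is a minimal Python sketch. It is not actual llvm-qemu output: `CPUState`, its fields, and the two-instruction basic block are assumptions invented for illustration. Real generated LLVM IR would express the same kind of loads and stores against a much larger state structure.

```python
from dataclasses import dataclass, field

MASK32 = 0xFFFFFFFF  # guest registers are 32 bits wide in this sketch


@dataclass
class CPUState:
    """Hypothetical, heavily reduced machine state: three registers, one flag."""
    regs: dict = field(default_factory=lambda: {"eax": 0, "ebx": 0, "ecx": 0})
    zf: bool = False  # zero flag


def translated_block(env: CPUState) -> None:
    """Stand-in for the translation of this made-up basic block:

        add eax, ebx
        sub ecx, 1

    Every guest register access becomes a load or store on `env`.
    """
    # add eax, ebx
    eax = env.regs["eax"]                  # load guest register from the state
    ebx = env.regs["ebx"]
    env.regs["eax"] = (eax + ebx) & MASK32  # 32-bit wraparound, then store back

    # sub ecx, 1
    ecx = (env.regs["ecx"] - 1) & MASK32
    env.regs["ecx"] = ecx
    # Flags are explicit state as well; only the final ZF is modeled here.
    env.zf = (ecx == 0)
```

The translated code never touches a host register by name; it only reads and writes the state structure. That is why a hand-written frontend would likely end up with a similar shape.
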

> However for unspecific targets it might scale. Marking variables
> as symbolic in LLVM bytecode however...

Well, as your input is machine code you somehow need to specify in
which register you want to put your symbolic value (or at which memory
address). Then you need to map it to LLVM IR, which at least in the
register case is rather straightforward.
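
As a rough illustration of the register case, the Python sketch below marks one guest register as symbolic before running a translated block. `Sym`, `make_symbolic`, `add32` and the register dictionary are hypothetical names, not part of KLEE or llvm-qemu. A real setup would hand the value to the symbolic execution engine instead of building expression strings.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF


@dataclass(frozen=True)
class Sym:
    """A symbolic 32-bit value, represented only by its expression text."""
    expr: str


def _text(value):
    """Render a concrete int or a symbolic value as expression text."""
    return value.expr if isinstance(value, Sym) else str(value)


def add32(a, b):
    """Add two register values; stay concrete only if both operands are ints."""
    if isinstance(a, int) and isinstance(b, int):
        return (a + b) & MASK32
    return Sym(f"({_text(a)} + {_text(b)})")


def make_symbolic(regs, reg):
    """Replace the concrete content of one register with a fresh symbol."""
    regs[reg] = Sym(f"{reg}_0")


def translated_block(regs):
    """Stand-in for the translation of a single made-up instruction: add eax, ebx."""
    regs["eax"] = add32(regs["eax"], regs["ebx"])


regs = {"eax": 0, "ebx": 7}
make_symbolic(regs, "eax")    # choose the register that holds the symbolic input
translated_block(regs)
print(regs["eax"].expr)       # (eax_0 + 7)
```

Because the register lives at a fixed slot in the state, choosing "which register" is just choosing which slot to overwrite. A symbolic memory address would additionally need a model of guest memory.
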

> In any case it would be interesting to be able to translate x86 to LLVM
> IR. If somebody wants to give that a try let's make a plan ;).

Cheers,

Tilmann
