thr3ads.net - llvm dev - [llvm-dev] Understanding optimizations below LLVM IR. [Sep 2018]

James Courtier-Dutton via llvm-dev

2018-Sep-02 15:42 UTC

[llvm-dev] Understanding optimizations below LLVM IR.

Consider the following x86_64 assembly:
cmp    $0x12,%rax
sbb    %esi,%esi
and    $0xffffffdf,%esi
add    $0x5b,%esi
It contains the SBB instruction. SBB has no direct single-instruction
equivalent in LLVM IR.
cmp: if (%rax < $0x12, unsigned compare) set the carry flag.
sbb: if carry flag set: %esi = 0xffffffff else %esi = 0;    // Note: 0xffffffff = -1
and: if carry flag set: %esi = 0xffffffdf else %esi = 0;    // Note: 0xffffffdf = -33
add: if carry flag set: %esi = 0x3a else %esi = 0x5b;
This can then be converted into:
if (%rax < $0x12) {
        %esi = 58;  // 0x3a
} else {
        %esi = 91;  // 0x5b
}
So, the SBB is used here to remove the need for a Branch instruction.
WOW, compilers are clever!!!
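
As a sanity check on the reasoning above, here is a minimal Python sketch that simulates the four instructions with explicit 32-bit wraparound and compares the result with the branch form. The function names (sbb_sequence, branch_form) are made up for illustration, and the model assumes an unsigned 64-bit compare on %rax and 32-bit arithmetic on %esi.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sbb_sequence(rax: int, esi: int = 0) -> int:
    """Simulate cmp/sbb/and/add and return the final value of %esi."""
    rax &= MASK64
    # cmp $0x12,%rax computes rax - 0x12; CF is set on an unsigned borrow.
    cf = 1 if rax < 0x12 else 0
    # sbb %esi,%esi computes esi - esi - CF, i.e. 0 or -1 (0xffffffff).
    esi = (esi - esi - cf) & MASK32
    # and $0xffffffdf,%esi leaves 0 or 0xffffffdf (-33).
    esi &= 0xFFFFFFDF
    # add $0x5b,%esi wraps to 0x3a (58) or stays at 0x5b (91).
    esi = (esi + 0x5B) & MASK32
    return esi


def branch_form(rax: int) -> int:
    """The equivalent if/else written with an ordinary comparison."""
    return 58 if (rax & MASK64) < 0x12 else 91
```

For example, sbb_sequence(5) gives 58 and sbb_sequence(0x12) gives 91, matching branch_form for the same inputs. The initial value of %esi does not matter, because the SBB subtracts the register from itself.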

I am writing a de-compiler "Binary -> LLVM IR". So, I obviously need to
treat SBB as a special case and transform it into something that can be
represented in LLVM IR.
I wish to obtain a list of all the optimizations done by LLVM that result
in assembly that cannot immediately be represented in LLVM IR.
The above being one example.
For example:
1) List all optimizations that result in an SBB instruction.

Where in LLVM should I start looking?

Kind Regards

James

Krzysztof Parzyszek via llvm-dev

2018-Sep-02 17:00 UTC

On 9/2/2018 10:42 AM, James Courtier-Dutton via llvm-dev wrote:
> I am writing a de-compiler "Binary -> LLVM IR". So, I obviously need to
> treat SBB as a special case and transform it into something that can be
> represented in LLVM IR.
Not a "special case", it's just an instruction whose function needs to
be represented in the LLVM IR somehow.

> I wish to obtain a list of all the optimizations done by LLVM that 
> result in assembly that cannot immediately be represented in LLVM IR.
That won't take you anywhere.

Think of this as a compiler that takes source programs in ELF format
(for example) and produces output in .ll format. The resulting .ll will
never look exactly like the original bitcode; the best you can get is
that it will have the same semantics. The SBB instruction uses the carry
bit and modifies the carry bit, so you need to represent the carry in
your bitcode model somehow, and then do just that: write bitcode that
produces the result of the subtraction and the value of the simulated
carry bit. Things like EFLAGS are difficult to model because they
are like a global variable, but if you assume some default value for them
at function entries, you can still "decompile" functions that use them.
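
A rough Python sketch of that idea, with the carry bit kept as explicit state that each instruction reads and writes. The names here (Flags, cmp_u, sbb32) are hypothetical, and the default of CF = 0 at function entry is just the kind of assumption described above, not something the hardware guarantees.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF


@dataclass
class Flags:
    # Simulated carry bit; 0 is an assumed default at function entry.
    cf: int = 0


def cmp_u(flags: Flags, a: int, b: int, bits: int = 64) -> None:
    """Model of CMP for the carry bit only: CF = unsigned borrow of a - b."""
    mask = (1 << bits) - 1
    flags.cf = 1 if (a & mask) < (b & mask) else 0


def sbb32(flags: Flags, dst: int, src: int) -> int:
    """Model of a 32-bit SBB: returns dst - src - CF and updates CF."""
    full = (dst & MASK32) - (src & MASK32) - flags.cf
    # A negative full-width result means a borrow out of bit 31.
    flags.cf = 1 if full < 0 else 0
    return full & MASK32
```

Each modelled instruction then produces both its result and the new value of the simulated carry, so a sequence such as cmp_u(flags, rax, 0x12) followed by sbb32(flags, esi, esi) can be emitted as ordinary data flow. In LLVM IR, the llvm.usub.with.overflow intrinsics return a value and an overflow bit in a similar way and may be a reasonable target for this pattern.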

-Krzysztof
