Dean Michael Berris via llvm-dev
2016-Jun-22 17:40 UTC
[llvm-dev] x86: How to Force 2-byte `jmp` instruction in lowering
Thanks Nirav,
I can confirm that this works when I do the compile with llc, but then when
linking to an executable with clang (patched with
http://reviews.llvm.org/D20352 and compiler-rt patched with
http://reviews.llvm.org/D21612) on Linux, I'm getting something different.
Here's a sample of the transcript, and what I'm seeing:
--->8 clang invocation 8<---
[16-06-23 3:33:42] dberris at dberris: ~/xray/llvm-build% ./bin/clang
-fxray-instrument -x c++ -std=c++11 -o test.bin test.cc -g --verbose
clang version 3.9.0 (http://llvm.org/git/clang.git
3ae26ac8b1c9c5db65f3dc0236139448b8b0520a) (http://llvm.org/git/llvm.git
8fd5dd6aa8a633eeb03b245cd0060479371fc521)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/google/home/dberris/xray/llvm-build/./bin
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.7.3
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8.4
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9.3
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Candidate multilib: x32;@mx32
Selected multilib: .;@m64
"/usr/local/google/home/dberris/xray/llvm-build/bin/clang-3.9" -cc1
-triple x86_64-unknown-linux-gnu -emit-obj -mrelax-all -disable-free
-main-file-name test.cc -mrelocation-model static -mthread-model posix
-mdisable-fp-elim -fmath-errno -masm-verbose -mconstructor-aliases
-munwind-tables -fuse-init-array -target-cpu x86-64 -v -dwarf-column-info
-debug-info-kind=limited -dwarf-version=4 -debugger-tuning=gdb
-fxray-instrument -resource-dir
/usr/local/google/home/dberris/xray/llvm-build/bin/../lib/clang/3.9.0
-internal-isystem
/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8
-internal-isystem
/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/x86_64-linux-gnu/c++/4.8
-internal-isystem
/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/x86_64-linux-gnu/c++/4.8
-internal-isystem
/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/backward
-internal-isystem /usr/local/include -internal-isystem
/usr/local/google/home/dberris/xray/llvm-build/bin/../lib/clang/3.9.0/include
-internal-externc-isystem /usr/include/x86_64-linux-gnu
-internal-externc-isystem /include -internal-externc-isystem /usr/include
-std=c++11 -fdeprecated-macro -fdebug-compilation-dir
/usr/local/google/home/dberris/xray/llvm-build -ferror-limit 19
-fmessage-length 272 -fobjc-runtime=gcc -fcxx-exceptions -fexceptions
-fdiagnostics-show-option -fcolor-diagnostics -o /tmp/test-03d46e.o -x c++
test.cc
clang -cc1 version 3.9.0 based upon LLVM 3.9.0svn default target
x86_64-unknown-linux-gnu
ignoring nonexistent directory "/include"
ignoring duplicate directory
"/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/x86_64-linux-gnu/c++/4.8"
#include "..." search starts here:
#include <...> search starts here:
/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8
/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/x86_64-linux-gnu/c++/4.8
/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/backward
/usr/local/include
/usr/local/google/home/dberris/xray/llvm-build/bin/../lib/clang/3.9.0/include
/usr/include/x86_64-linux-gnu
/usr/include
End of search list.
"/usr/bin/ld" -z relro --hash-style=gnu --eh-frame-hdr -m elf_x86_64
-dynamic-linker /lib64/ld-linux-x86-64.so.2 -o test.bin
/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../x86_64-linux-gnu/crt1.o
/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../x86_64-linux-gnu/crti.o
/usr/lib/gcc/x86_64-linux-gnu/4.8/crtbegin.o
-L/usr/lib/gcc/x86_64-linux-gnu/4.8
-L/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../x86_64-linux-gnu
-L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu
-L/usr/lib/gcc/x86_64-linux-gnu/4.8/../../..
-L/usr/local/google/home/dberris/xray/llvm-build/bin/../lib -L/lib
-L/usr/lib -whole-archive
/usr/local/google/home/dberris/xray/llvm-build/bin/../lib/clang/3.9.0/lib/linux/libclang_rt.xray-x86_64.a
-no-whole-archive /tmp/test-03d46e.o --no-as-needed -lpthread -lrt -lm
-latomic -ldl -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc
--as-needed -lgcc_s --no-as-needed
/usr/lib/gcc/x86_64-linux-gnu/4.8/crtend.o
/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../x86_64-linux-gnu/crtn.o
--->8 clang invocation 8<---
The test.cc is simply:
--->8 test.cc 8<---
#include <cstdio>
#include <cassert>
[[clang::xray_always_instrument]] void foo() { std::printf("Hello, XRay!\n"); }
void bar() { std::printf("Not instrumented\n"); }
extern "C" {
extern int __xray_patch();
}
int main(int argc, char* argv[]) {
printf("main has started.\n");
bar();
foo();
__xray_patch();
foo();
}
--->8 test.cc 8<---
A snippet of the disassembly (llvm-objdump -disassemble test.bin) looks
like:
--->8 disassembly 8<---
_Z3foov:
400cb0: e9 09 00 00 00 jmp 9 <_Z3foov+0xE>
400cb5: 66 0f 1f 84 00 00 02 00 00 nopw 512(%rax,%rax)
400cbe: 55 pushq %rbp
400cbf: 48 89 e5 movq %rsp, %rbp
400cc2: 48 83 ec 10 subq $16, %rsp
400cc6: 48 bf c5 0e 40 00 00 00 00 00 movabsq $4198085, %rdi
400cd0: b0 00 movb $0, %al
400cd2: e8 a9 f9 ff ff callq -1623 <.plt+0x30>
400cd7: 89 45 fc movl %eax, -4(%rbp)
400cda: 48 83 c4 10 addq $16, %rsp
400cde: 5d popq %rbp
400cdf: c3 retq
400ce0: 2e 66 0f 1f 84 00 00 02 00 00 nopw %cs:512(%rax,%rax)
400cea: 66 0f 1f 44 00 00 nopw (%rax,%rax)
--->8 disassembly 8<---
Having looked at this a bit, I think you're right that the jumps are being
relaxed, due to the -mrelax-all option being used by clang. The question
becomes whether it's possible to inhibit relaxation for specific
instructions at the LLVM level.
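To make the size difference concrete, here is a minimal standalone sketch of the two encodings in play. It is not LLVM code: encodeShortJmp, encodeNearJmp and jmpTarget are hypothetical helper names used only for illustration. The opcodes are the standard x86 ones (0xEB for jmp rel8, 0xE9 for jmp rel32), and the displacement is taken relative to the end of the jump instruction.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Short form: opcode 0xEB followed by a signed 8-bit displacement (2 bytes).
// Hypothetical helper, for illustration only.
std::array<std::uint8_t, 2> encodeShortJmp(int disp) {
  assert(disp >= -128 && disp <= 127 && "displacement must fit in rel8");
  return {{0xEB, static_cast<std::uint8_t>(disp & 0xFF)}};
}

// Near (relaxed) form: opcode 0xE9 followed by a signed 32-bit
// little-endian displacement (5 bytes).
std::array<std::uint8_t, 5> encodeNearJmp(std::int32_t disp) {
  const auto u = static_cast<std::uint32_t>(disp);
  return {{0xE9,
           static_cast<std::uint8_t>(u & 0xFF),
           static_cast<std::uint8_t>((u >> 8) & 0xFF),
           static_cast<std::uint8_t>((u >> 16) & 0xFF),
           static_cast<std::uint8_t>((u >> 24) & 0xFF)}};
}

// The branch target is the address of the *next* instruction plus the
// displacement, so the same displacement lands in a different place
// once the jump itself grows from 2 to 5 bytes.
std::uint64_t jmpTarget(std::uint64_t instrAddr, std::size_t instrSize,
                        std::int32_t disp) {
  return instrAddr + instrSize + static_cast<std::int64_t>(disp);
}
```

With the addresses from the disassembly above, jmpTarget(0x400cb0, 5, 9) gives 0x400cbe, i.e. _Z3foov+0xE, whereas the unrelaxed 2-byte form would land at 0x400cbb, i.e. _Z3foov+0xB.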
Cheers
On Wed, Jun 22, 2016 at 9:37 AM Nirav Davé <niravd at google.com> wrote:
> Hmm. Odd. I just rebuilt from scratch and it seems to work with
> the test/CodeGen/X86/xray-attribute-instrumentation.ll test case outputting
> straight to obj:
>
> llc -filetype=obj -o ~/a.o -mtriple=x86_64-apple-macosx <
> test/CodeGen/X86/xray-attribute-instrumentation.ll
>
> What test case are you using?
>
> In any case, the issue appears to be that llvm doesn't realize that the
> target address is resolved and erroneously applies branch relaxation to the
> jump. I don't know why a linker private symbol would make a difference.
>
> -Nirav
>
>
>
> On Wed, Jun 22, 2016 at 12:14 PM, Dean Michael Berris <dberris at google.com>
> wrote:
>
>> On Wed, Jun 22, 2016 at 6:05 AM Nirav Davé <niravd at google.com> wrote:
>>
>>> This appears to work: replace
>>>
>>> auto Target = OutContext.createLinkerPrivateTempSymbol();
>>>
>>> with
>>>
>>> auto Target = OutContext.createTempSymbol();
>>>
>>> -Nirav
>>>
>>>
>> Thanks Nirav -- I tried this but I'm still getting a "jmpq <address>"
>> with this incantation when I load and disassemble from gdb. I'm seeing a
>> 5-byte jump, followed by the nops.
>>
>> If I disassemble with llvm-objdump though I see the following:
>>
>> _Z3foov:
>> 400c10: e9 09 00 00 00 jmp 9 <_Z3foov+0xE>
>> 400c15: 66 0f 1f 84 00 00 02 00 00 nopw 512(%rax,%rax)
>>
>> I'm not sure whether the extra 0's after '0xe9 0x09' are alignment
>> padding (though I was expecting 0x90 to show up if this was an alignment
>> issue).
>>
>> Is there anything else I can try here?
>>
>> Thanks in advance!
>>
>
>
Dean Michael Berris via llvm-dev
2016-Jun-22 20:36 UTC
[llvm-dev] x86: How to Force 2-byte `jmp` instruction in lowering
Peter suggested just writing out '.byte 0xeb, 0x09', and that allowed the
jump instruction to bypass the relaxation, so that fixes my immediate
problem. The question still stands, though, whether it should be possible
to do this through the instruction builder interface.
Cheers
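As a rough illustration of what the '.byte 0xeb, 0x09' workaround produces, the sketch below builds the intended entry sled as raw bytes: the 2-byte short jump followed by the 9-byte nopw shown in the disassembly. makeEntrySled is a hypothetical helper, not part of LLVM or compiler-rt, and the 11-byte total is simply 2 + 9 as implied by the bytes quoted in this thread.

```cpp
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical helper, for illustration only: the bytes that
// `.byte 0xeb, 0x09` plus the 9-byte nopw from the disassembly add up to.
std::vector<std::uint8_t> makeEntrySled() {
  // jmp rel8 with displacement 9: skips exactly the 9 bytes that follow.
  std::vector<std::uint8_t> sled = {0xEB, 0x09};

  // nopw 512(%rax,%rax), byte-for-byte as it appears in the disassembly.
  const std::uint8_t nop9[] = {0x66, 0x0F, 0x1F, 0x84, 0x00,
                               0x00, 0x02, 0x00, 0x00};
  sled.insert(sled.end(), std::begin(nop9), std::end(nop9));
  return sled;
}
```

Because the opcode is written out as data rather than as an instruction, there is nothing for the assembler to relax, so the jump stays at 2 bytes and its displacement still covers exactly the nop that follows.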
Rafael Espíndola via llvm-dev
2016-Jun-29 02:06 UTC
[llvm-dev] x86: How to Force 2-byte `jmp` instruction in lowering
On 22 June 2016 at 16:36, Dean Michael Berris via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Peter suggested just writing out '.byte 0xeb, 0x09' and that allowed the
> jump instruction to bypass the relaxation, so that fixes my immediate
> problem. The question still stands though whether it should be possible to
> do through the instruction builder interface.
>
I don't think so. When the relax-all flag is on MC will relax all instructions.
Cheers,
Rafael