陈志伟 via llvm-dev
2020-Dec-29 14:25 UTC
[llvm-dev] LLVM trunk generates different machine code for JCC instruction w/ or w/o debug info
Hi folks, this is my first post to the llvm-dev mailing list, and definitely not the last :-)

Recently I found that an ELF file built with debug info contains different machine code from the same file built without it. Sadly, I cannot reproduce it with a small piece of code. Here is my investigation.

> clang -S -emit-llvm foo.cc -O3 -ggdb3 -o dbg.ll
> clang -S -emit-llvm foo.cc -O3 -o rel.ll

Here foo.cc is a source file from my company, 10k+ LOC, that depends on tons of third-party libraries.

The only difference between dbg.ll and rel.ll is the LLVM debug intrinsics. Emmm, looks fine.

> llc dbg.ll -o dbg.s
> llc rel.ll -o rel.s

The assembly instructions are the same. Emmm, fine again.

> llvm-mc -filetype=obj dbg.s -o dbg.o
> llvm-mc -filetype=obj rel.s -o rel.o

The two object files generated by the LLVM assembler contain DIFFERENT machine code.

> 74 19 je f20

The object compiled with debug info uses 0x74 to encode the JE instruction, while

> 0f 84 15 00 00 00 je f20

the object compiled without debug info uses 0x0f 0x84 instead.

What? Why does debug info affect the generated machine code? As an LLVM beginner, I'm willing to dive deeper to find the root cause.

Thanks in advance.

--
Zhiwei Chen
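For readers comparing the two byte sequences above: both are valid x86 encodings of the same jump. 0x74 is the short form of JE with an 8-bit displacement, and 0x0f 0x84 is the near form with a 32-bit displacement. The sketch below reproduces both from one instruction address. The helper name `encode_je` is hypothetical, and the address 0xf05 is inferred from the two displacements and the target f20 (read as hex); it is not stated in the post. This only shows that the two encodings are equivalent, not why the assembler chose differently.

```python
import struct


def encode_je(insn_addr: int, target: int, force_near: bool = False) -> bytes:
    """Encode an x86 JE at insn_addr that jumps to target.

    Displacements are relative to the end of the instruction:
    2 bytes for the short form, 6 bytes for the near form.
    """
    rel8 = target - (insn_addr + 2)
    if not force_near and -128 <= rel8 <= 127:
        # Short form: opcode 0x74 followed by a signed 8-bit displacement.
        return bytes([0x74]) + struct.pack("<b", rel8)
    # Near form: opcode 0x0F 0x84 followed by a signed 32-bit displacement.
    rel32 = target - (insn_addr + 6)
    return bytes([0x0F, 0x84]) + struct.pack("<i", rel32)


# Assumed instruction address 0xf05 (inferred, see above), target 0xf20.
short_form = encode_je(0xF05, 0xF20)
near_form = encode_je(0xF05, 0xF20, force_near=True)
print(short_form.hex(" "))
print(near_form.hex(" "))
```

Both forms land on the same target, so the difference is one of encoding size rather than control flow.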
David Blaikie via llvm-dev
2020-Dec-29 19:45 UTC
[llvm-dev] LLVM trunk generates different machine code for JCC instruction w/ or w/o debug info
Yeah, we try to ensure that LLVM's debug info doesn't change what code is generated, but it's best effort: no one's done fuzzing etc. to make it especially robust.

If you want to investigate this, I'd suggest using CReduce ( https://embed.cs.utah.edu/creduce/ ) to reduce the example to something small and manageable, and then reporting it here and/or investigating it yourself. LLVM/Clang support dumping the intermediate representation after every pass (-mllvm -dump-after-all/-print-after-all, something like that, I forget the precise spelling), so you could see where the IR or machine IR diverges between the debug-info and no-debug-info cases.
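One way to act on the suggestion of comparing per-pass dumps is sketched below. A plain diff of the two dumps is noisy because the debug build carries extra intrinsic calls and `!dbg` attachments, so this filters those out before looking for the first differing line. The function names and the regular expressions are assumptions made for illustration; they are a rough filter for textual LLVM IR only, not a complete description of debug-info syntax, and machine IR dumps would need further patterns.

```python
import re
from itertools import zip_longest

# Rough, assumed patterns for debug-only text in LLVM IR dumps.
_DBG_ATTACHMENT = re.compile(r",?\s*!dbg !\d+")
_DBG_ONLY_LINE = re.compile(r"llvm\.dbg\.|^\s*!\d+ = ")


def strip_debug(lines):
    """Drop debug-only lines and remove trailing !dbg attachments."""
    kept = []
    for line in lines:
        if _DBG_ONLY_LINE.search(line):
            continue
        kept.append(_DBG_ATTACHMENT.sub("", line).rstrip())
    return kept


def first_divergence(dbg_lines, rel_lines):
    """Return (index, dbg_line, rel_line) for the first mismatch, else None.

    The index refers to the filtered line sequence, not the raw dump.
    """
    a, b = strip_debug(dbg_lines), strip_debug(rel_lines)
    for i, (x, y) in enumerate(zip_longest(a, b)):
        if x != y:
            return i, x, y
    return None


dbg = [
    "  %1 = add i32 %a, %b, !dbg !12",
    "  call void @llvm.dbg.value(metadata i32 %1, metadata !13, "
    "metadata !DIExpression()), !dbg !14",
    "  ret i32 %1, !dbg !15",
]
rel = ["  %1 = add i32 %a, %b", "  ret i32 %1"]
print(first_divergence(dbg, rel))
```

In practice the two inputs would be the lines of the dumps captured from the debug and non-debug builds; the first reported mismatch points at the pass after which the two pipelines stop agreeing.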
Neil Nelson via llvm-dev
2020-Dec-29 19:54 UTC
[llvm-dev] LLVM trunk generates different machine code for JCC instruction w/ or w/o debug info
Bug 37728 - [meta] Make llvm passes debug info invariant
https://bugs.llvm.org/show_bug.cgi?id=37728

Further discussion on methods:
https://groups.google.com/g/llvm-dev/c/yvbWr4azdh0/m/gy1tQIzIDwAJ

Neil Nelson