Mehdi Amini via llvm-dev
2016-May-30 00:16 UTC
[llvm-dev] [cfe-dev] How to debug if LTO generate wrong code?
> On May 29, 2016, at 5:10 PM, Shi, Steven <steven.shi at intel.com> wrote:
>
> Hi Mehdi,
> GCC LTO seems to support the large code model on my side, as shown below. If the code model is linker specific, does GCC LTO use a special linker that is different from the one in GNU Binutils?

I don't know anything about GCC.
(And I doubt the GNU linker supports LTO with LLVM).

> I’m a bit surprised if neither OS X ld64 nor the gold plugin supports the large code model in LTO. Since modern systems are widely 64-bit, needing code to run at high addresses (above 2 GB) is a reasonable requirement.

The fact that we don't support it for now seems to indicate that it is not a widely requested feature, especially considering that it is really a trivial option to add.
What is the linker you're using? Are you building your own clang?

--
Mehdi

>
> $ gcc -g -O0 -flto codemodel1.c -mcmodel=large -o codemodel1_large_lto_gcc.bin
> $ objdump -dS codemodel1_large_lto_gcc.bin
>
> int main(int argc, const char* argv[])
> {
>   40048b: 55                     push   %rbp
>   40048c: 48 89 e5               mov    %rsp,%rbp
>   40048f: 48 83 ec 20            sub    $0x20,%rsp
>   400493: 89 7d ec               mov    %edi,-0x14(%rbp)
>   400496: 48 89 75 e0            mov    %rsi,-0x20(%rbp)
>     int t = global_func(argc);
>   40049a: 8b 45 ec               mov    -0x14(%rbp),%eax
>   40049d: 89 c7                  mov    %eax,%edi
>   40049f: 48 b8 76 04 40 00 00   movabs $0x400476,%rax
>   4004a6: 00 00 00
>   4004a9: ff d0                  callq  *%rax
>   4004ab: 89 45 fc               mov    %eax,-0x4(%rbp)
>     t += global_arr[7];
>   4004ae: 48 b8 20 09 60 00 00   movabs $0x600920,%rax
>   4004b5: 00 00 00
>   4004b8: 8b 40 1c               mov    0x1c(%rax),%eax
>   4004bb: 01 45 fc               add    %eax,-0x4(%rbp)
>     t += static_arr[7];
>   4004be: 48 b8 c0 0a 60 00 00   movabs $0x600ac0,%rax
>   4004c5: 00 00 00
>   4004c8: 8b 40 1c               mov    0x1c(%rax),%eax
>   4004cb: 01 45 fc               add    %eax,-0x4(%rbp)
>     t += global_arr_big[7];
>   4004ce: 48 b8 60 0c 60 00 00   movabs $0x600c60,%rax
>   4004d5: 00 00 00
>   4004d8: 8b 40 1c               mov    0x1c(%rax),%eax
>   4004db: 01 45 fc               add    %eax,-0x4(%rbp)
>     t += static_arr_big[7];
>   4004de: 48 b8 a0 19 63 00 00   movabs $0x6319a0,%rax
>   4004e5: 00 00 00
>   4004e8: 8b 40 1c               mov    0x1c(%rax),%eax
>   4004eb: 01 45 fc               add    %eax,-0x4(%rbp)
>     return t;
>   4004ee: 8b 45 fc               mov    -0x4(%rbp),%eax
> }
>
> Steven Shi
> Intel\SSG\STO\UEFI Firmware
>
> Tel: +86 021-61166522
> iNet: 821-6522
>
> From: mehdi.amini at apple.com [mailto:mehdi.amini at apple.com]
> Sent: Monday, May 30, 2016 4:28 AM
> To: Shi, Steven <steven.shi at intel.com>
> Cc: Umesh Kalappa <umesh.kalappa0 at gmail.com>; eliben at gmail.com; llvm-dev <llvm-dev at lists.llvm.org>; cfe-dev at lists.llvm.org; Rafael Espíndola <rafael.espindola at gmail.com>
> Subject: Re: [cfe-dev] [llvm-dev] How to debug if LTO generate wrong code?
>
> Hi,
>
> On May 29, 2016, at 7:36 AM, Shi, Steven <steven.shi at intel.com> wrote:
>
> Hi Mehdi,
> After deeper debugging, I found that my firmware LTO wrong-code issue occurs because the X64 code model (-mcmodel=large) is always overridden to small (-mcmodel=small) in an LTO build. I don't know how to correctly specify the large code model for my X64 firmware LTO build. I would appreciate it if you could let me know.
>
> Parts of my UEFI firmware (BIOS) have to be loaded and run at high addresses (above 2 GB) from the very beginning, so I need code that makes absolutely no assumptions about the addresses and data sections. But current LLVM LTO seems to stick to the small code model and generates a lot of code with 32-bit RIP-relative addressing, which causes CPU exceptions when run at addresses above 2 GB.
>
> Below, I simply reuse Eli's codemodel1.c example (link: http://eli.thegreenplace.net/2012/01/03/understanding-the-x64-code-models) to show the LLVM LTO code model issue.
> $ clang -g -O0 codemodel1.c -mcmodel=large -o codemodel1_large.bin
> $ clang -g -O0 codemodel1.c -mcmodel=small -o codemodel1_small.bin
> $ clang -g -O0 -flto codemodel1.c -mcmodel=large -o codemodel1_large_lto.bin
> $ clang -g -O0 -flto codemodel1.c -mcmodel=small -o codemodel1_small_lto.bin
>
> You will see that codemodel1_large_lto.bin and codemodel1_small_lto.bin are exactly the same!
> And if you disassemble codemodel1_large_lto.bin, you will see it uses the small code model (32-bit RIP-relative), not large, for addressing, as below.
>
> $ objdump -dS codemodel1_large_lto.bin
>
> int main(int argc, const char* argv[])
> {
>   4004f0: 55                     push   %rbp
>   4004f1: 48 89 e5               mov    %rsp,%rbp
>   4004f4: 48 83 ec 20            sub    $0x20,%rsp
>   4004f8: c7 45 fc 00 00 00 00   movl   $0x0,-0x4(%rbp)
>   4004ff: 89 7d f8               mov    %edi,-0x8(%rbp)
>   400502: 48 89 75 f0            mov    %rsi,-0x10(%rbp)
>     int t = global_func(argc);
>   400506: 8b 7d f8               mov    -0x8(%rbp),%edi
>   400509: e8 d2 ff ff ff         callq  4004e0 <global_func>
>   40050e: 89 45 ec               mov    %eax,-0x14(%rbp)
>     t += global_arr[7];
>   400511: 8b 04 25 4c 10 60 00   mov    0x60104c,%eax
>   400518: 03 45 ec               add    -0x14(%rbp),%eax
>   40051b: 89 45 ec               mov    %eax,-0x14(%rbp)
>     t += static_arr[7];
>   40051e: 8b 04 25 dc 11 60 00   mov    0x6011dc,%eax
>   400525: 03 45 ec               add    -0x14(%rbp),%eax
>   400528: 89 45 ec               mov    %eax,-0x14(%rbp)
>     t += global_arr_big[7];
>   40052b: 8b 04 25 6c 13 60 00   mov    0x60136c,%eax
>   400532: 03 45 ec               add    -0x14(%rbp),%eax
>   400535: 89 45 ec               mov    %eax,-0x14(%rbp)
>     t += static_arr_big[7];
>   400538: 8b 04 25 ac 20 63 00   mov    0x6320ac,%eax
>   40053f: 03 45 ec               add    -0x14(%rbp),%eax
>   400542: 89 45 ec               mov    %eax,-0x14(%rbp)
>     return t;
>   400545: 8b 45 ec               mov    -0x14(%rbp),%eax
>   400548: 48 83 c4 20            add    $0x20,%rsp
>   40054c: 5d                     pop    %rbp
>   40054d: c3                     retq
>   40054e: 66 90                  xchg   %ax,%ax
>
> So, does LTO support the large code model? How do I correctly specify the LTO code model option?
>
> Same answer as before: LTO is set up by the linker, so the option for that, if it exists, will be linker specific.
>
> As far as I can tell, neither the libLTO-based linkers (ld64 on OS X, for example) nor the gold plugin supports such an option, and the code model is always "default".
>
> I don't know about lld, CC Rafael about that.
>
> --
> Mehdi
>
> Steven Shi
> Intel\SSG\STO\UEFI Firmware
>
> Tel: +86 021-61166522
> iNet: 821-6522
>
> > -----Original Message-----
> > From: mehdi.amini at apple.com [mailto:mehdi.amini at apple.com]
> > Sent: Wednesday, May 18, 2016 4:02 AM
> > To: Umesh Kalappa <umesh.kalappa0 at gmail.com>
> > Cc: Shi, Steven <steven.shi at intel.com>; llvm-dev <llvm-dev at lists.llvm.org>;
> > cfe-dev at lists.llvm.org
> > Subject: Re: [cfe-dev] [llvm-dev] How to debug if LTO generate wrong code?
> >
> > > On May 17, 2016, at 11:21 AM, Umesh Kalappa
> > <umesh.kalappa0 at gmail.com> wrote:
> > >
> > > Steven,
> > >
> > > As Mehdi stated, the optimisation level is specific to the linker and it
> > > enables Inter-Pro opts passes, please refer function
> >
> > To be very clear: the -O option may trigger *linker* optimizations as well,
> > independently of LTO.
> >
> > --
> > Mehdi
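For readers without the linked article at hand, here is a minimal sketch of a test program with the same shape as the codemodel1.c used in the listings above. The symbol names (global_func, global_arr, static_arr, global_arr_big, static_arr_big) come from the disassembly; everything else is an assumption: the array sizes, the initializers, the body of global_func, and the name sum_samples, which holds the statements shown for main() so they can be called from a test harness. In the GCC listing the two big arrays sit 0x6319a0 - 0x600c60 = 200000 bytes apart, which is consistent with 50000 four-byte ints, but that size is still a guess.

```c
/* Hypothetical stand-in for codemodel1.c. Sizes, initializers and the
 * body of global_func are assumptions, not taken from the thread. */
#define N 100
#define NBIG 50000

int global_arr[N] = {2, 3};
static int static_arr[N] = {9, 7};
int global_arr_big[NBIG] = {5, 6};
static int static_arr_big[NBIG] = {10, 20};

/* Assumed body: any non-inlined global function will do. */
int global_func(int param)
{
    return param * 10;
}

/* Same statements as main() in the listings. The name sum_samples is
 * hypothetical; rename it to main (with argv added) to get a program
 * that can be built with the clang and gcc commands quoted above. */
int sum_samples(int argc)
{
    int t = global_func(argc);
    t += global_arr[7];
    t += static_arr[7];
    t += global_arr_big[7];
    t += static_arr_big[7];
    return t;
}
```

Each statement touches a different kind of symbol (a global function, small and big arrays, global and static linkage), so the disassembly shows at a glance which addressing form the code generator picked for each.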
Shi, Steven via llvm-dev
2016-May-30 00:44 UTC
[llvm-dev] [cfe-dev] How to debug if LTO generate wrong code?
> (And I doubt the GNU linker supports LTO with LLVM).

[Steven]: I’ve pushed GNU Binutils ld to support the LLVM gold plugin; see the details in this bug: https://sourceware.org/bugzilla/show_bug.cgi?id=20070. The new GNU ld linker works well with LLVM/Clang LTO when building IA32 code on my side. According to the ld owner's comments in the bug, the current X64 LLVM LTO issue is in the LLVM LTO plugin.

> The fact that we don't support it for now seems to indicate that it is not a widely requested feature, especially considering that it is really a trivial option to add.
> What is the linker you're using? Are you building your own clang?

[Steven]: I’m using the standard LLVM 3.8 with the new GNU ld linker mentioned above. I can build my own clang on my side if needed. I’m happy to know it is not difficult to enable the large code model in LLVM LTO and that “it is really a trivial option to add”. Could you let me know how to enable it? A lot of my work has been blocked by the large code model issue. Thank you!

Steven Shi
Intel\SSG\STO\UEFI Firmware
Tel: +86 021-61166522
iNet: 821-6522
Mehdi Amini via llvm-dev
2016-May-30 06:13 UTC
[llvm-dev] [cfe-dev] How to debug if LTO generate wrong code?
> On May 29, 2016, at 5:44 PM, Shi, Steven <steven.shi at intel.com> wrote:
>
> (And I doubt the GNU linker supports LTO with LLVM).
> [Steven]: I’ve pushed GNU Binutils ld to support the LLVM gold plugin; see the details in this bug: https://sourceware.org/bugzilla/show_bug.cgi?id=20070. The new GNU ld linker works well with LLVM/Clang LTO when building IA32 code on my side. According to the ld owner's comments in the bug, the current X64 LLVM LTO issue is in the LLVM LTO plugin.
>
> The fact that we don't support it for now seems to indicate that it is not a widely requested feature, especially considering that it is really a trivial option to add.
> What is the linker you're using? Are you building your own clang?
> [Steven]: I’m using the standard LLVM 3.8 with the new GNU ld linker mentioned above. I can build my own clang on my side if needed. I’m happy to know it is not difficult to enable the large code model in LLVM LTO and that “it is really a trivial option to add”. Could you let me know how to enable it? A lot of my work has been blocked by the large code model issue. Thank you!

I can't test it locally, but here is a starting point in the gold plugin, inspired by the code present in clang:

You need to use your linker-specific way of passing the option "-lto-use-large-codemodel=..." to the plugin.

Let me know if it works for you!

--
Mehdi

-------------- next part --------------
A non-text attachment was scrubbed...
Name: code-model-gold.patch
Type: application/octet-stream
Size: 1449 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160529/8e3ea1f2/attachment-0001.obj>
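One way to check whether the option took effect is to compare instruction encodings in the rebuilt binary against the two listings quoted earlier in this thread. The sketch below decodes the three forms that appear there: the 5-byte "e8 rel32" call and the 7-byte "8b 04 25 disp32" load from the clang LTO output, and the 10-byte "48 b8 imm64" movabs from the GCC output. As far as the encoding goes, the first two carry only 32 bits of address information (a signed displacement from the next instruction, and a sign-extended absolute address), which is why they cannot reach code or data placed above 2 GB. All function names here are made up for illustration; this is not part of the attached patch or of any LLVM API.

```c
#include <stdint.h>

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Read a little-endian 64-bit value from a byte buffer. */
static uint64_t read_le64(const uint8_t *p)
{
    return (uint64_t)read_le32(p) | ((uint64_t)read_le32(p + 4) << 32);
}

/* Target of a 5-byte "e8 rel32" call: the address of the next
 * instruction plus the sign-extended 32-bit displacement. */
uint64_t call_rel32_target(uint64_t insn_addr, const uint8_t insn[5])
{
    int64_t rel = (int32_t)read_le32(insn + 1);
    return insn_addr + 5 + (uint64_t)rel;
}

/* Address read by the 7-byte "8b 04 25 disp32" load: the 32-bit
 * displacement sign-extended to 64 bits, with no base register. */
uint64_t mov_abs32_address(const uint8_t insn[7])
{
    return (uint64_t)(int64_t)(int32_t)read_le32(insn + 3);
}

/* Immediate of the 10-byte "48 b8 imm64" movabs into %rax: a full
 * 64-bit address, so the target can live anywhere. */
uint64_t movabs_imm64(const uint8_t insn[10])
{
    return read_le64(insn + 2);
}

/* Nonzero when addr is unchanged after truncation to 32 bits and sign
 * extension back to 64, i.e. when a disp32 field can hold it. */
int fits_sign_extended_32(uint64_t addr)
{
    return (uint64_t)(int64_t)(int32_t)(uint32_t)addr == addr;
}
```

Fed the bytes from the listings, call_rel32_target(0x400509, {e8 d2 ff ff ff}) gives 0x4004e0, matching the "callq 4004e0 <global_func>" line, and movabs_imm64 on "48 b8 76 04 40 00 00 00 00 00" gives 0x400476. If the rebuilt LTO binary still shows the e8 and "8b 04 25" forms for these accesses, the large code model was not applied.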