Mehdi Amini via llvm-dev
2016-Jul-08 17:57 UTC
[llvm-dev] IPRA, interprocedural register allocation, question
> On Jul 7, 2016, at 9:17 PM, Lawrence, Peter via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Vivek,
> I am looking into these function attributes in the clang docs:
>     preserve_most
>     preserve_all
> They are not available in the 3.6.2 that I am currently using, but I hope they exist in 3.8.
>
> These should provide enough info to solve my problem:
> at the MC level, calls to functions with these attributes
> will be code-gen’ed through different “calling conventions”,
> and CALL instructions to them should have different register USE and DEF info.
>
> This CALL instruction register USE and DEF info should already be useful
> to the intra-procedural register allocator (allowing values live across these
> calls to be in what are otherwise caller-save registers).
> At least that’s how I read the MC dumps: every call instruction seems to have
> every caller-save register flagged as “imp-def”, i.e. implicitly defined by the instruction,
> and hopefully what is considered a caller-save register at a call site is defined by the callee.
> And this should be the information that IPRA takes advantage of in its bottom-up analysis.

The idea of IPRA is to *produce* a more accurate list of the registers clobbered by a function, so that at a call site the caller only needs to save/restore the minimum number of registers across the call.

> Which leads me to this question: when compiling an entire whole program at one time,
> so there is no linking and no LTO, will there ever be IPRA that works within LLC for this scenario,
> and is this an objective of your project, or are you focusing only on LTO?

LTO is just a way of exposing more to the analysis: IPRA can only “optimize” calls to functions that are codegen’d during the same compilation. With LTO, since you codegen the full program at once, you can basically optimize “every” call.

So IPRA works well without LTO, but it will only be able to operate on calls to functions that are part of the current compilation.

> I know this is not the typical “linux” scenario (dynamic linking of not only standard libraries,
> but also sometimes even application libraries, and lots of static linking because of program
> size), but it is a typical “embedded” scenario, which is where I am currently.
>
> Other thoughts or comments?

Any reason *not* to use LTO in your case?

--
Mehdi
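To make the bottom-up idea above concrete, here is a toy model in C. It is only a sketch under stated assumptions, not LLVM's implementation: the `Func` struct, the 8-bit register mask, `compute_clobbers`, and the `demo_funcs` table are all hypothetical names invented for illustration. Functions are listed callees first; a function that is not code-generated in the current compilation gets the conservative "all caller-saved registers" mask, while a visible function gets the union of its own clobbers and those of its callees.

```c
#include <stdint.h>

/* Hypothetical target with 8 caller-saved registers, one bit each. */
#define ALL_CALLER_SAVED 0xFFu
#define MAX_CALLEES 4

typedef struct {
    const char *name;
    int visible;              /* 1 if code-generated in this compilation */
    uint32_t own_clobbers;    /* registers written by the body itself */
    int callees[MAX_CALLEES]; /* indices of callees; each must be < own index */
    int num_callees;
} Func;

/* Walk the functions bottom-up (callees before callers) and compute, for
   each one, the set of registers a call to it may clobber. */
void compute_clobbers(const Func *funcs, int n, uint32_t *mask)
{
    for (int i = 0; i < n; i++) {
        if (!funcs[i].visible) {
            /* Body not seen by the compiler: assume the default convention. */
            mask[i] = ALL_CALLER_SAVED;
            continue;
        }
        uint32_t m = funcs[i].own_clobbers;
        for (int c = 0; c < funcs[i].num_callees; c++)
            m |= mask[funcs[i].callees[c]];
        mask[i] = m;
    }
}

/* Illustrative call graph (all names hypothetical). */
static const Func demo_funcs[] = {
    { "asm_copy", 0, 0x00u, {0},    0 }, /* hand-written .s, opaque */
    { "leaf",     1, 0x03u, {0},    0 }, /* clobbers r0, r1 only */
    { "mid",      1, 0x04u, {1},    1 }, /* calls leaf */
    { "top",      1, 0x10u, {0, 1}, 2 }, /* calls asm_copy and leaf */
};
```

In this model a caller of `mid` only has to protect three registers, whereas any function that reaches `asm_copy` falls back to the full caller-saved set, which mirrors the point that the analysis only helps for calls whose callee is part of the current compilation.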
Lawrence, Peter via llvm-dev
2016-Jul-12 01:51 UTC
[llvm-dev] IPRA, interprocedural register allocation, question
Mehdi,
    The external functions I need to call are all hand-written assembly language.
How would/could LTO handle that?

--Peter Lawrence.
Mehdi Amini via llvm-dev
2016-Jul-12 01:53 UTC
[llvm-dev] IPRA, interprocedural register allocation, question
> On Jul 11, 2016, at 6:51 PM, Lawrence, Peter <c_plawre at qca.qualcomm.com> wrote:
>
> Mehdi,
> The external functions I need to call are all hand-written assembly language.
> How would/could LTO handle that?

I was thinking of inline asm functions, not pure .s files.

--
Mehdi
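For routines whose bodies the compiler never sees, the `preserve_most` attribute mentioned earlier in the thread is one way to tell callers that fewer registers are clobbered. The following is a minimal sketch, assuming a clang version and target that support the attribute; the `PRESERVE_MOST` macro and the names `note_overflow` and `saturating_sum` are hypothetical. The callee is defined in C here only so the file builds on its own. In the scenario discussed it would live in a .s file, and the assembly itself would have to honor the convention, since the attribute is a promise the compiler cannot check.

```c
/* Fall back to the default convention when the compiler does not know the
   attribute, so the file still builds (without the benefit). */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

#if __has_attribute(preserve_most)
#define PRESERVE_MOST __attribute__((preserve_most))
#else
#define PRESERVE_MOST
#endif

static int overflow_count = 0;

/* Rarely taken cold path. With preserve_most the callee saves nearly all
   registers it uses, so the caller can keep values live across the call.
   noinline keeps an actual call instruction in the caller. */
PRESERVE_MOST __attribute__((noinline))
void note_overflow(void)
{
    overflow_count++;
}

/* Hot loop: sum, i, v, n and limit are all live across the call. */
int saturating_sum(const int *v, int n, int limit)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += v[i];
        if (sum > limit) {
            note_overflow();
            sum = limit;
        }
    }
    return sum;
}
```

Comparing the generated assembly for `saturating_sum` with and without the attribute should show whether the loop's live values are spilled around the call on a given target.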