vivek pandya via llvm-dev
2016-Jun-12 04:49 UTC
[llvm-dev] [GSoC 2016] [Weekly Status] Interprocedural Register Allocation
Dear Community,

The patch for Interprocedural Register Allocation (IPRA) has now been committed, thanks to Mehdi Amini. We would like you to try it and let us know your views and, more importantly, your ideas for improving it.

The test-suite run has revealed a non-trivial issue that causes run-time failures in some programs; we will investigate it further. Here are some stats:

test-suite was run with IPRA enabled, and the overall results are not very encouraging. Compile time increased by 30% on average. Many programs also show an increase in execution time (20% on average), which is a serious concern for us. About 60 tests failed at run time, which indicates an error in compilation. However, 3 tests improved in run time, by 7% on average.

The most useful lesson for me this week was to set up the LLVM development environment properly; otherwise one can end up wasting too much time building LLVM itself.

So here is a brief summary:

Implementation:
===========
The patch has been split into analysis and transformation passes. The pass responsible for register usage propagation has been made target independent. A print method and a command-line option, -print-regusage, have been added so that RegMask details can also be printed in Release builds; this makes the lit test cases testable in Release builds too. Other minor changes were made to adhere to coding and naming conventions.

Testing:
=====
test-suite has been tested with IPRA enabled.

Study and other:
============
Learned about LNT, the test-suite for LLVM, inline assembly in LLVM IR, fastcc, local functions, and the MCStreamer class. In C++ I learned about the emplace family of methods in the STL and perfect forwarding, introduced in C++11.

Plan for next week:
1) Investigate the functional-correctness issue that leads to run-time failures.
2) Profile the compilation process to verify the increase in time due to IPRA.
3) Improve IPRA by instructing codegen not to save registers for local functions.
4) Make the pass emit asm comments in the generated ASM file that indicate, at each call site, the registers clobbered by the function call.

Sincerely,
Vivek

On Sun, Jun 5, 2016 at 8:48 AM, vivek pandya <vivekvpandya at gmail.com> wrote:
> Dear Community,
>
> This week my patch was reviewed by my mentors, and I made changes based on their feedback. We also identified a problem with callee-saved registers being marked as clobbered registers, and we fixed that bug. I describe the other minor changes in the following section.
>
> The patch was expected to be committed by the end of this week, but due to an unexpected mistake I was not able to finish writing test cases. Sorry for that.
> I had built LLVM with IPRA enabled by default, and those build files were on my path! Because of that, the next time I tried to build LLVM it was terribly slow (almost 1 hour for 10% of the build). I spent too much time fixing this by playing around with environment variables, cmake options, etc.
> But I think this is a serious concern: we need to verify this time complexity, otherwise building large software with IPRA enabled would be very time consuming.
>
> The toughest part of this week was getting lit and FileCheck to work as expected, especially when an analysis pass prints info to stdio and there is also an output file generated by the llc or opt command.
>
> So here is a brief summary:
>
> Implementation:
> ===========
> RegUsageInfoCollector is now calling-convention aware, so that the RegMask does not mark callee-saved registers as clobbered. As a result, the register allocator can use callee-saved registers in the caller.
> PhysicalRegisterUsageInfo.cpp renamed to RegisterUsageInfo.cpp.
> The StringMap used in RegisterUsageInfo.cpp is replaced by a DenseMap of GlobalVariable * to RegMask.
> DummyCGSCCPass moved from TargetPassConfig.cpp to CallGraphSCCPass.h.
> Minor corrections in comments and changes to adhere to the LLVM coding standards.
>
> Testing:
> ====
> The above-mentioned changes have been tested with the SNU-Realtime benchmarks.
> Studied the lit and FileCheck tools and wrote a simple test to verify the functionality of the code.
>
> Study and other:
> ===========
> Studied some examples of lit-compatible LLVM IR with RUN comments for test cases, FileCheck syntax, and how to use it within the lit infrastructure.
> I also understand the X86 calling conventions in more detail.
> I also studied basic concepts of the LLVM IR language while reading .ll files written for lit.
> I learned about rvalue references and move semantics, introduced in C++11.
>
> Plan for next week:
> 1) Get the patch committed along with proper test cases.
> 2) Analyse the time complexity of the approach.
> 3) Move the target-specific pass to CodeGen, as it does not seem to need to be target specific.
> 4) If possible, build a large application with IPRA enabled and analyze the impact.
>
> Sincerely,
> Vivek
>
> On Sat, May 28, 2016 at 7:31 PM, vivek pandya <vivekvpandya at gmail.com> wrote:
>> Dear community,
>>
>> This is to brief you on the progress of Interprocedural Register Allocation. Those who are interested in seeing the progress in terms of code, please see http://reviews.llvm.org/D20769
>> This patch contains simple infrastructure to propagate register usage information from callee to caller in the call graph. The code generation order is changed to follow bottom-up order on the call graph; thanks to Mehdi Amini for that patch! I will write a blog post on this very soon.
>>
>> According to the proposed schedule, this week was meant for studying the related infrastructure in LLVM and finalizing an approach for IPRA. Instead I was able to implement a working (though possibly not fully correct) prototype, because I used the community bonding period to discuss and learn the related material from the mentors, and because the patch for CodeGen reordering was provided by my mentor Mehdi Amini.
>>
>> So I summarize the work done during this week as follows:
>>
>> Implementation:
>> ===========
>> The following passes have been implemented during this week: an immutable pass to store the computed RegMask; a machine function pass that iterates through each register, checks whether it is used, and creates a RegMask based on that; and a target-specific machine function pass that uses the RegMask created by the second pass and propagates the information by updating the RegMask of call instructions. To update the RegMask of an MI, a setRegMask() function has been added to MachineOperand. A command-line option -enable-ipra and a debug type -debug-only="ipra" have been added to control the optimization through llc.
>>
>> Testing:
>> ====
>> The above implementation has been tested on the SNU-Real-Time benchmark suite (http://www.cprover.org/goto-cc/examples/snu.html) and on some simple programs that use library functions (register allocation for a library function is not done by LLVM, so this optimization simply skips them).
>>
>> Study and Other:
>> ============
>> I have learned the following things in LLVM: how it stores register clobbering information; how that information is used by the register allocators through LivePhysRegs, LiveRegMatrix and other related passes; how to schedule a pass using TargetPassConfig and TargetMachine; what callee-saved registers are; and what an Immutable Pass is. Apart from that, I have also learned how to use Phabricator to send review requests. I have also read some related literature.
>>
>> The tough task this week was scheduling the passes in the proper order so that the dependencies of related passes are satisfied.
>>
>> Plan for next week:
>> 1) Perform more testing and debug any known issues.
>> 2) Fine-tune the implementation to eliminate any unnecessary work.
>> 3) During testing, I observed from the stats that IPRA does not always improve the work of the intraprocedural register allocators, and that the amount of benefit (in terms of spilled live ranges) is not deterministic. I would like to find the reasons for this behavior.
>> 4) Start implementing the target-specific pass for other targets if the review passes with no major bugs.
>>
>> Please provide any feedback or suggestions, including on the format of this email.
>>
>> I would also like to thank my mentors Mehdi Amini, Hal Finkel, Quentin Colombet, Matthias Braun and other community members for providing quick help every time I asked (I have got replies even after 8 PM (PDT)!).
>>
>> Sincerely,
>> Vivek
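The calling-convention-aware collection described in the status reports above can be illustrated with a toy model. This is only a sketch in Python, not LLVM code: the register names, the CALLEE_SAVED set, and the helpers collect_reg_mask and is_clobbered are all hypothetical. The one convention borrowed from LLVM is that a set bit in a register mask means the register is preserved across the call.

```python
# Toy model of a calling-convention-aware register usage collector.
# Register names and the callee-saved set are made up for illustration.
REGS = ["r0", "r1", "r2", "r3", "r4", "r5"]
CALLEE_SAVED = {"r4", "r5"}  # assumed: restored by the callee before it returns


def collect_reg_mask(used_regs):
    """Return a bitmask with bit i set when REGS[i] is preserved across a call."""
    mask = 0
    for i, reg in enumerate(REGS):
        # A used callee-saved register is saved and restored, so the caller
        # never observes the change; it must not be reported as clobbered.
        clobbered = reg in used_regs and reg not in CALLEE_SAVED
        if not clobbered:
            mask |= 1 << i
    return mask


def is_clobbered(mask, reg):
    """A register is clobbered by the call when its bit is clear."""
    return not (mask >> REGS.index(reg)) & 1


# A leaf function that writes r0, r1 and (after saving it) r4.
leaf_mask = collect_reg_mask({"r0", "r1", "r4"})
# Registers a caller may keep live across a call to the leaf function.
free_for_caller = [r for r in REGS if not is_clobbered(leaf_mask, r)]
print(free_for_caller)
```

In this model a caller that only knew the default calling convention would have to treat r0 through r3 as clobbered; with the collected mask it can additionally keep values in r2 and r3 across the call.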
Quentin Colombet via llvm-dev
2016-Jun-15 00:46 UTC
[llvm-dev] [GSoC 2016] [Weekly Status] Interprocedural Register Allocation
Hi Vivek,

How much of the runtime slowdown comes from the different layout of the functions in the asm file (i.e., because of the dummy SCC pass)?

Cheers,
Q
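To make the layout question concrete: generating code bottom-up on the call graph emits callees before their callers, so functions can appear in the asm file in a different order than in the source. The following is a simplified Python sketch under stated assumptions: it uses a plain depth-first post-order rather than an SCC-based traversal, and the function names and call graph are hypothetical.

```python
def bottom_up_order(call_graph, source_order):
    """Post-order DFS over the call graph: callees are listed before callers."""
    visited, order = set(), []

    def visit(fn):
        if fn in visited:  # also stops infinite recursion on call cycles
            return
        visited.add(fn)
        for callee in call_graph.get(fn, []):
            visit(callee)
        order.append(fn)

    for fn in source_order:
        visit(fn)
    return order


# Hypothetical module: functions as written in the source file.
source_order = ["main", "parse", "helper", "emit"]
call_graph = {"main": ["parse", "emit"], "parse": ["helper"], "emit": ["helper"]}
emission_order = bottom_up_order(call_graph, source_order)
print(emission_order)
```

Here main moves from first to last in the output, which is the kind of layout change that could affect run time independently of register allocation.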
vivek pandya via llvm-dev
2016-Jun-15 03:10 UTC
[llvm-dev] [GSoC 2016] [Weekly Status] Interprocedural Register Allocation
On Wed, Jun 15, 2016 at 6:16 AM, Quentin Colombet <qcolombet at apple.com> wrote:
> Hi Vivek,
>
> How much of the runtime slowdown comes from the different layout of the functions in the asm file (i.e., because of the dummy SCC pass)?
>
> Cheers,
> Q

Hello Quentin,

Please disregard the previous results: there was a major bug in the RegMask calculation, which did not consider the RegMasks of callees in the MF body when computing register usage information. That has now been fixed (as discussed with Matthias Braun and Mehdi Amini), and after this bugfix I ran test-suite with and without IPRA.

Yes, there is a run-time slowdown for some test cases, ranging from 1% to 64%; similarly, the compile-time slowdown ranges from 1% to 48%. The run-time improvement ranges from 1% to 35%, and surprisingly there is also a compile-time improvement ranging from 1% to 60%. I would request you to go through the complete results at https://docs.google.com/document/d/1cavn-POrZdhw-rrdPXV8mSvyppvOWs2rxmLgaOnd6KE/edit?usp=sharing

Also, there are no extra failures due to IPRA now, so I have removed failures from the results above.

Sincerely,
Vivek
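The bug described in this reply (register usage computed without the RegMasks of callees in the function body) can be sketched with a toy model. This is Python for illustration only, not the LLVM implementation: the function names, the register data, and the clobbers helper are hypothetical, and callee-saved handling is left out to keep the example small.

```python
# Toy model: per-function clobber sets computed in bottom-up order.
# direct_defs holds the registers each function writes itself (made-up data).
direct_defs = {"leaf": {"r0"}, "mid": {"r1"}, "top": {"r2"}}
calls = {"leaf": [], "mid": ["leaf"], "top": ["mid"]}
bottom_up = ["leaf", "mid", "top"]  # callees before callers


def clobbers(include_callees):
    """Compute the clobber set of every function, callees first."""
    result = {}
    for fn in bottom_up:
        regs = set(direct_defs[fn])
        if include_callees:
            for callee in calls[fn]:
                # Already computed, because callees come first in bottom_up.
                regs |= result[callee]
        result[fn] = regs
    return result


buggy = clobbers(include_callees=False)
fixed = clobbers(include_callees=True)
print(sorted(buggy["top"]), sorted(fixed["top"]))
```

With the buggy computation, a caller of top would believe r0 and r1 survive the call even though leaf and mid overwrite them. That kind of error would be consistent with the run-time failures seen before the fix, though the sketch does not reproduce the actual failing cases.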