Adve, Vikram Sadanand via llvm-dev
2016-Jun-19 17:26 UTC
[llvm-dev] [GSoC 2016] [Weekly Status] Interprocedural Register Allocation
Hi Vivek,

I have one question (and I apologize if I missed this in your previous messages): Do you handle, or do you expect to handle, indirect function calls? If so, how exactly are you going about doing that?

For context, I’m interested because I’ve been working on a set of passes to do profile-based devirtualization and IPO, and I’m wondering if this pass could benefit from that.

Thanks,

--Vikram
// Vikram S. Adve
// Professor, Department of Computer Science
// University of Illinois at Urbana-Champaign
// vadve at illinois.edu
// http://llvm.org

> On Jun 19, 2016, at 7:46 AM, via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Date: Sun, 19 Jun 2016 16:59:27 +0530
> From: vivek pandya via llvm-dev <llvm-dev at lists.llvm.org>
> To: Quentin Colombet <qcolombet at apple.com>
> Cc: llvm-dev <llvm-dev at lists.llvm.org>, Matthias Braun
> <matze at braunis.de>
> Subject: Re: [llvm-dev] [GSoC 2016] [Weekly Status] Interprocedural
> Register Allocation
> Message-ID:
> <CAHYgpoL+gZmmyTzujXhfcM6ikbaGZFFJj0sVtrmpgbvAT+nAoQ at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Dear Community,
>
> Please find a summary of the work done during this week as follows:
>
> Implementation:
> ===========
>
> During this week we identified a bug in IPRA caused by not considering the
> RegMasks of function calls in a given machine function. The same bug on
> AArch64 was reported by Chad Rosier, and a more detailed description can
> be found at https://llvm.org/bugs/show_bug.cgi?id=28144 . To fix this bug,
> the RegMask calculation has been modified to consider the RegMasks of
> function calls in a machine function. The patch is here:
> http://reviews.llvm.org/D21395.
>
> AsmPrinter.cpp is modified to print call preserved registers in comments at
> the call site in the generated assembly file. This was suggested by Quentin
> Colombet to improve readability of asm files while experimenting with
> RegMasks, calling conventions, etc. This simple patch can be found here:
> http://reviews.llvm.org/D21490.
>
> We have also experimented with a simple improvement to IPRA: setting the
> callee saved registers to none for local functions. We have found some
> performance improvement.
>
> Testing:
> =====
>
> After the fix for bug 28144 there are no runtime failures in the test suite.
> Also, due to bug 28144 there were about 60 runtime failures, and the total
> time taken for test suite compilation was 30% more compared to without IPRA.
> After the bug fix, the total compile time improvement with IPRA compared to
> without IPRA is about 4 to 8 minutes.
>
> Study:
> ====
>
> This week I studied the code responsible for adding spills and restores for
> callee saved registers. I also studied how calling conventions are defined in
> target specific .td files. I studied AsmPrinter.cpp, and specifically the
> emitComments() method, which is responsible for adding comments to LLVM
> generated assembly files. I also studied some linkage types in LLVM IR,
> like ‘internal’, which represents a local function in a module.
>
> Plan for next week:
>
> 1) Submit the patch related to local function optimization for review
>
> 2) Find more possible improvements
>
> 3) Get active patches committed
>
> 4) Compile large software with IPRA enabled
>
> Sincerely,
>
> Vivek
>
> On Wed, Jun 15, 2016 at 8:45 AM, vivek pandya <vivekvpandya at gmail.com>
> wrote:
>
>>
>> On Wed, Jun 15, 2016 at 8:40 AM, vivek pandya <vivekvpandya at gmail.com>
>> wrote:
>>
>>>
>>> On Wed, Jun 15, 2016 at 6:16 AM, Quentin Colombet <qcolombet at apple.com>
>>> wrote:
>>>
>>>> Hi Vivek,
>>>>
>>>> How much of the slowdown at runtime comes from the different layout of
>>>> the functions in the asm file? (I.e., because of the dummy scc pass.)
>>>>
>>>> Hello Quentin,
>>>
>>> Please do not consider the previous results, as there was a major bug in
>>> the RegMask calculation: the RegMasks of callees in the MF body were not
>>> considered while calculating register usage information. That has been
>>> fixed now (as discussed with Matthias Braun and Mehdi Amini), and after
>>> this bugfix I have run the test-suite with and without IPRA. Yes, there is
>>> a runtime slowdown for some test cases, ranging from 1% to 64%; similarly,
>>> the compile time slowdown ranges from 1% to 48%. The runtime performance
>>> improvement ranges from 1% to 35%, and surprisingly there is also a compile
>>> time improvement ranging from 1% to 60%. I would request you to go through
>>> the complete results at
>>> https://docs.google.com/document/d/1cavn-POrZdhw-rrdPXV8mSvyppvOWs2rxmLgaOnd6KE/edit?usp=sharing
>>>
>>> In the above results the baseline is IPRA and current is without IPRA. So
>> data with a red background is actually an improvement and green is a
>> regression.
>> -Vivek
>>
>>> Also, there are no extra failures due to IPRA now, so I have removed
>>> failures from the results above.
>>>
>>> Sincerely,
>>> Vivek
>>>
>>>> Cheers,
>>>> Q
>>>>
>>>> On Jun 11, 2016, at 21:49, vivek pandya via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>> Dear Community,
>>>>
>>>> The patch for Interprocedural Register Allocation has been committed now,
>>>> thanks to Mehdi Amini for that. We would like you to play with it and let
>>>> us know your views and, more importantly, ideas to improve it.
>>>>
>>>> The test-suite run has indicated some non-trivial issue that results in
>>>> runtime failures of the programs; we will be investigating it more. Here
>>>> are some stats:
>>>>
>>>> The test-suite has been tested with IPRA enabled and the overall results
>>>> are not very encouraging. On average there is a 30% increase in compile
>>>> time. Many programs also have an increase in execution time (average 20%),
>>>> which is a really serious concern for us.
>>>> About 60 tests have failed at runtime, which indicates errors in
>>>> compilation. However, 3 tests have an improvement in their runtime, and
>>>> that is 7% on average.
>>>>
>>>> This week I think a good thing for me to learn was to set up the LLVM
>>>> development environment properly, otherwise one can end up wasting too
>>>> much time building LLVM itself.
>>>>
>>>> So here is a brief summary:
>>>>
>>>> Implementation:
>>>> ===========
>>>>
>>>> The patch has been split into analysis and transformation passes. The
>>>> pass responsible for register usage propagation has been made target
>>>> independent. A print method and a command line option -print-regusage have
>>>> been added so that RegMask details can also be printed in Release builds;
>>>> this enables the lit test case to be testable in Release builds too. Other
>>>> minor changes were made to adhere to coding and naming conventions.
>>>>
>>>> Testing:
>>>> =====
>>>>
>>>> The test-suite has been tested with IPRA enabled.
>>>>
>>>> Study and other:
>>>> ============
>>>>
>>>> Learned about LNT, the test-suite for LLVM, inline assembly in LLVM IR,
>>>> fastcc, local functions, and the MCStream class. In C++ I learned about
>>>> the emplace family of methods in the STL and perfect forwarding introduced
>>>> in C++11.
>>>>
>>>> Plan for next week:
>>>>
>>>> 1) Investigate the issue related to functional correctness that leads to
>>>> runtime failures
>>>>
>>>> 2) Profile the compilation process to verify the increase in time due to
>>>> IPRA
>>>>
>>>> 3) Improve IPRA by instructing codegen to not save registers for local
>>>> functions.
>>>>
>>>> 4) Make the pass emit asm comments to indicate the registers clobbered by
>>>> a function call at the call site in the generated ASM file.
>>>>
>>>> Sincerely,
>>>>
>>>> Vivek
>>>>
>>>> On Sun, Jun 5, 2016 at 8:48 AM, vivek pandya <vivekvpandya at gmail.com>
>>>> wrote:
>>>>
>>>>> Dear Community,
>>>>>
>>>>> This week I got my patch reviewed by mentors and based on that I have
>>>>> made changes.
>>>>> Also, we identified a problem with callee-saved registers being marked
>>>>> as clobbered registers, so we fixed that bug. I have described other
>>>>> minor changes in the following section.
>>>>>
>>>>> It was expected to get the patch committed by the end of this week, but
>>>>> due to an unexpected mistake I was not able to complete writing test
>>>>> cases. Sorry for that.
>>>>> I had built llvm with ipra enabled by default, and those build files
>>>>> were on my path! Due to that, the next time I tried to build llvm it was
>>>>> terribly slow (almost 1 hour for 10% of the build). I spent too much time
>>>>> on fixing this by playing around with environment variables, cmake
>>>>> options, etc.
>>>>> But I think this is a serious concern; we need to verify this time
>>>>> complexity, otherwise building large software with IPRA enabled would be
>>>>> very time consuming.
>>>>>
>>>>> The toughest part of this week was to get lit and FileCheck to work as
>>>>> you expect them to work, especially when an analysis pass prints info on
>>>>> stdio and there is also an output file generated by the llc or opt
>>>>> command.
>>>>>
>>>>> So here is a brief summary:
>>>>>
>>>>> Implementation:
>>>>> ===========
>>>>>
>>>>> RegUsageInfoCollector is now calling convention aware, so that the
>>>>> RegMask does not mark callee saved registers as clobbered registers. Due
>>>>> to this the register allocator can use callee saved registers for the
>>>>> caller.
>>>>> PhysicalRegisterUsageInfo.cpp renamed to RegisterUsageInfo.cpp.
>>>>> The StringMap used in RegisterUsageInfo.cpp is replaced by a DenseMap of
>>>>> GlobalVariable * to RegMask.
>>>>> DummyCGSCCPass moved from TargetPassConfig.cpp to CallGraphSCCPass.h.
>>>>> Minor corrections in comments, and changes to adhere to the coding
>>>>> standards of LLVM.
>>>>>
>>>>> Testing:
>>>>> ====
>>>>>
>>>>> The above mentioned changes have been tested with the SNU-Realtime
>>>>> benchmarks.
>>>>>
>>>>> Studied the lit and FileCheck tools and wrote a simple test to verify
>>>>> the functionality of the code.
>>>>>
>>>>> Study and other:
>>>>> ===========
>>>>>
>>>>> Studied some examples of lit compatible LLVM IR with comments to RUN
>>>>> test cases, the FileCheck tool syntax, and how to use it within the lit
>>>>> infrastructure.
>>>>>
>>>>> I also understand the X86 calling convention in more detail.
>>>>>
>>>>> I also studied basic concepts of the LLVM IR language while reading .ll
>>>>> files written for lit.
>>>>>
>>>>> I learned about rvalue references and move semantics introduced in
>>>>> C++11.
>>>>>
>>>>> Plan for next week:
>>>>>
>>>>> 1) Get the patch committed along with proper test cases.
>>>>>
>>>>> 2) Analyse the time complexity of the approach.
>>>>>
>>>>> 3) Move the target specific pass to CodeGen, as it seems it is not
>>>>> required to be target specific.
>>>>>
>>>>> 4) If possible, build a large application with ipra enabled and analyze
>>>>> the impact.
>>>>>
>>>>> Sincerely,
>>>>>
>>>>> Vivek
>>>>>
>>>>> On Sat, May 28, 2016 at 7:31 PM, vivek pandya <vivekvpandya at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Dear community,
>>>>>>
>>>>>> This is to brief you on the progress of Interprocedural Register
>>>>>> Allocation. For those who are interested in seeing the progress in terms
>>>>>> of code, please consider http://reviews.llvm.org/D20769
>>>>>> This patch contains simple infrastructure to propagate register usage
>>>>>> information from callee to caller in the call graph. The code generation
>>>>>> order is changed to follow bottom-up order on the call graph. Thanks to
>>>>>> Mehdi Amini for the patch for the same! I will write a blog on this very
>>>>>> soon.
>>>>>>
>>>>>> As per the proposed schedule, this week should have been spent studying
>>>>>> the related infrastructure in LLVM and finalizing an approach for IPRA.
>>>>>> Instead I was able to implement a working (maybe not fully correct)
>>>>>> prototype, because I used the community bonding period to discuss and
>>>>>> learn related material from the mentors, and also because the patch for
>>>>>> CodeGen reordering was provided by dear mentor Mehdi Amini.
>>>>>>
>>>>>> So I conclude the work done during this week as follows:
>>>>>>
>>>>>> Implementation:
>>>>>> ===========
>>>>>> The following passes have been implemented during this week: an
>>>>>> immutable pass to store the computed RegMasks; a machine function pass
>>>>>> that iterates through each register, checks whether it is used or not,
>>>>>> and based on those details creates a RegMask; and a target specific
>>>>>> machine function pass that uses the RegMask created by the second pass
>>>>>> and propagates the information by updating the RegMasks of call
>>>>>> instructions. To update the RegMask of an MI, a setRegMask() function
>>>>>> has been added to MachineOperand. A command line option -enable-ipra
>>>>>> and a debug type -debug-only="ipra" have been added to control the
>>>>>> optimization through llc.
>>>>>>
>>>>>> Testing:
>>>>>> ====
>>>>>> The above mentioned implementation has been tested over the
>>>>>> SNU-Real-Time benchmark suite
>>>>>> (http://www.cprover.org/goto-cc/examples/snu.html) and some simple
>>>>>> programs that use library functions (for a library function, register
>>>>>> allocation is not done by LLVM, so this optimization will simply skip
>>>>>> them).
>>>>>>
>>>>>> Study and Other:
>>>>>> ============
>>>>>> I have learned the following things in LLVM: How does it store register
>>>>>> clobbering information? How is it used by register allocators through
>>>>>> LivePhysRegs, LiveRegMatrix and other related passes? How to schedule a
>>>>>> pass using TargetPassConfig and TargetMachine? What are callee saved
>>>>>> registers? What is an Immutable Pass?
>>>>>> Apart from that I have also learned how to use phabricator to send
>>>>>> review requests. I have also read some related literature.
>>>>>>
>>>>>> During this week the tough task was to schedule the passes in the
>>>>>> proper order so that the dependencies of related passes are satisfied.
>>>>>>
>>>>>> Plan for next week:
>>>>>> 1) Perform more testing and debug any known issues
>>>>>> 2) Fine-tune the implementation so as to eliminate any unnecessary work
>>>>>> 3) During the testing I have observed from the stats that IPRA does
>>>>>> not always improve the work of the intraprocedural register allocators,
>>>>>> and it is also observed that the amount of benefit (in terms of spilled
>>>>>> live ranges) is not deterministic. So I would like to find the reasons
>>>>>> for this behavior.
>>>>>> 4) Start implementing the target specific pass for other targets if the
>>>>>> review passes properly with no major bugs.
>>>>>>
>>>>>> Please provide any feedback/suggestions, including on the format of
>>>>>> this email.
>>>>>>
>>>>>> I would also like to thank my mentors Mehdi Amini, Hal Finkel, Quentin
>>>>>> Colombet, Matthias Braun and other community members for providing
>>>>>> quick help every time I asked (I have got replies even after 8 PM
>>>>>> (PDT)!).
>>>>>>
>>>>>> Sincerely,
>>>>>> Vivek
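
The RegMask fix described in the quoted report (a function's register usage must also account for the RegMasks of the calls it contains, computed bottom-up on the call graph) can be illustrated with a minimal C++ sketch. This is a toy model under stated assumptions, not LLVM code: `NumRegs`, `ClobberSet`, `Function`, and `computeClobbers` are hypothetical names made up for illustration, the machine has only 8 registers, there is no recursion, and a set bit here means "clobbered" (LLVM's real RegMask operands use the opposite polarity, where a set bit marks a preserved register).

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy model, not LLVM's API: 8 physical registers, one bit each.
// A set bit means "this register may be clobbered".
constexpr std::size_t NumRegs = 8;
using ClobberSet = std::bitset<NumRegs>;

// Hypothetical stand-in for a machine function.
struct Function {
  ClobberSet directlyUsed;          // registers written by its own instructions
  std::vector<std::size_t> callees; // indices of the functions it calls
};

// Assumes `fns` is listed callees-first (bottom-up call graph order) and that
// there is no recursion, so every callee is finished before its callers.
std::vector<ClobberSet> computeClobbers(const std::vector<Function> &fns) {
  std::vector<ClobberSet> result(fns.size());
  for (std::size_t i = 0; i < fns.size(); ++i) {
    ClobberSet clobbered = fns[i].directlyUsed;
    for (std::size_t callee : fns[i].callees) {
      assert(callee < i && "callees must be listed before their callers");
      // The step the reported bug was missing: fold in what each call clobbers.
      clobbered |= result[callee];
    }
    result[i] = clobbered;
  }
  return result;
}
```

For a three-function chain in which a leaf writes register 0, a middle function writes register 1 and calls the leaf, and a top function writes register 2 and calls the middle one, `computeClobbers` reports registers {0}, {0, 1}, and {0, 1, 2} respectively. Dropping the `|=` step would report only register 2 for the top function, which is the kind of under-approximation that can lead to runtime failures.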