search for: epilogs

Displaying 20 results from an estimated 254 matches for "epilogs".

Did you mean: epilog
2020 Mar 24
2
[RFC][AArch64] Homogeneous Prolog and Epilog for Size Optimization
Hello, I'd like to upstream our work over the time which the community would benefit from. This is a part of effort toward minimizing code size presented in here <https://llvm.org/devmtg/2020-02-23/slides/Kyungwoo-GlobalMachineOutlinerForThinLTO.pdf>. In particular, this RFC is about optimizing prolog and epilog for size. *Homogeneous Prolog and Epilog for Size Optimization, D76570
2017 Feb 27
4
[Proposal][RFC] Epilog loop vectorization
Thanks for looking into this. 1) Issues with re running vectorizer: Vectorizer might generate redundant alias checks while vectorizing epilog loop. Redundant alias checks are expensive, we like to reuse the results of already computed alias checks. With metadata we can limit the width of epilog loop, but not sure about reusing alias check result. Any thoughts on rerunning vectorizer with reusing
2020 Mar 24
2
[RFC][AArch64] Homogeneous Prolog and Epilog for Size Optimization
...an opt-out option. Regards, Kyungwoo On Tue, Mar 24, 2020 at 12:01 PM Vedant Kumar <vedant_kumar at apple.com> wrote: > This looks really interesting. In the slides, it’s mentioned that the > combination of tuning the MachineOutliner for ThinLTO and of optimizing > function prolog/epilogs improved measured run-time performance. > > What kind of performance impact do you see from simply homogenizing > prolog/epilogs? (If, say across LNT/aarch64/-Oz the performance impact is > not large, it may make sense to have homogenization enabled by default.) > > best, > ved...
2019 Jul 16
2
MachinePipeliner refactoring
...can be scheduled at an earlier cycle, but later stage. Several different combinations can occur, and each use of the Phi can have a different stage/cycle. Maintaining the map between new and old virtual registers depends on the stage, cycle, and the current block (prolog, kernel, or epilog). The epilogs are generated in reverse order due to the way the instructions need to be laid out. The first epilog block after the kernel is going to have instructions scheduled in the last stage only, then the next epilog block, if needed, will have instructions from LastStage-1 first, then the LastStage. Let’...
2017 Mar 14
10
[Proposal][RFC] Epilog loop vectorization
Summarizing the discussion on the implementation approaches. Discussed about two approaches, first running ‘InnerLoopVectorizer’ again on the epilog loop immediately after vectorizing the original loop within the same vectorization pass, the second approach where re-running vectorization pass and limiting vectorization factor of epilog loop by metadata. <Approach-2> Challenges with
2017 Mar 14
4
[Proposal][RFC] Epilog loop vectorization
On 03/14/2017 12:11 PM, Adam Nemet wrote: > >> On Mar 14, 2017, at 9:49 AM, Hal Finkel <hfinkel at anl.gov >> <mailto:hfinkel at anl.gov>> wrote: >> >> >> On 03/14/2017 11:21 AM, Adam Nemet wrote: >>> >>>> On Mar 14, 2017, at 6:00 AM, Nema, Ashutosh <Ashutosh.Nema at amd.com >>>> <mailto:Ashutosh.Nema at
2011 Nov 10
2
[LLVMdev] Possible Phi Removal Pass?
Looking through this mailing list's archives, I've found that the most common fix attempted for removing phi instructions is to use the reg2mem pass. However I'm also finding that this does no guarantee the removal of all phi instructions. I want to write a pass to remove phi instructions without changing register/memory usage. Does this sort of translation of phi
2017 Mar 14
2
[Proposal][RFC] Epilog loop vectorization
On 03/14/2017 11:21 AM, Adam Nemet wrote: > >> On Mar 14, 2017, at 6:00 AM, Nema, Ashutosh <Ashutosh.Nema at amd.com >> <mailto:Ashutosh.Nema at amd.com>> wrote: >> >> Summarizing the discussion on the implementation approaches. >> Discussed about two approaches, first running ‘InnerLoopVectorizer’ >> again on the epilog loop immediately after
2017 Feb 27
2
[Proposal][RFC] Epilog loop vectorization
> On Feb 27, 2017, at 7:27 AM, Hal Finkel <hfinkel at anl.gov> wrote: > > > On 02/27/2017 06:29 AM, Nema, Ashutosh wrote: >> Thanks for looking into this. >> >> 1) Issues with re running vectorizer: >> Vectorizer might generate redundant alias checks while vectorizing epilog loop. >> Redundant alias checks are expensive, we like to reuse the
2017 Feb 22
3
[Proposal][RFC] Epilog loop vectorization
Hi, This is a proposal about epilog loop vectorization. Currently Loop Vectorizer inserts an epilogue loop for handling loops that don't have known iteration counts. The Loop Vectorizer supports loops with an unknown trip count, unknown trip count may not be a multiple of the vector width, and the vectorizer has to execute the last few iterations as scalar code. It keeps a scalar copy of
2017 Feb 23
2
[Proposal][RFC] Epilog loop vectorization
On 02/22/2017 11:52 AM, Adam Nemet via llvm-dev wrote: > Hi Ashutosh, > >> On Feb 22, 2017, at 1:57 AM, Nema, Ashutosh via llvm-dev >> <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >> >> Hi, >> This is a proposal about epilog loop vectorization. >> Currently Loop Vectorizer inserts an epilogue loop for handling loops
2017 Mar 15
4
[Proposal][RFC] Epilog loop vectorization
On 03/14/2017 07:50 PM, Adam Nemet wrote: > >> On Mar 14, 2017, at 11:33 AM, Hal Finkel <hfinkel at anl.gov >> <mailto:hfinkel at anl.gov>> wrote: >> >> >> >> On 03/14/2017 12:11 PM, Adam Nemet wrote: >>> >>>> On Mar 14, 2017, at 9:49 AM, Hal Finkel <hfinkel at anl.gov >>>> <mailto:hfinkel at
2017 Jun 09
2
Question about Prolog/Epilog Code Insertion
Hi All, When seeing the title "Prolog/Epilog Code Insertion", I'd expect something about XXXFrameLowering.cpp (particular about emitPrologue/emitEpilogue). But the document [1] is about unwind. Is it placed at the right place/section? Thanks. [1] http://llvm.org/docs/CodeGenerator.html#prolog-epilog-code-insertion Regards, chenwj -- Wei-Ren Chen (陳韋任) Homepage:
2017 Feb 06
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi Jean-Marc, Thanks a lot for reviewing this huge assembly function! silk_warped_autocorrelation_FIX_c()'s kernel part is for( n = 0; n < length; n++ ) { tmp1_QS = silk_LSHIFT32( (opus_int32)input[ n ], QS ); /* Loop over allpass sections */ for( i = 0; i < order; i++ ) { /* Output of allpass section */ tmp2_QS = silk_SMLAWB(
2019 Jul 15
1
MachinePipeliner refactoring
Hi James: Personally, I like the idea of refactoring and more abstraction, But unfortunately, I don't know enough about the edges cases either. BTW: the prototype is still causing quite some Asseertions in PowerPC - some nodes are not generated in correct order. Best, Jinsong Ji (纪金松), PhD. XL/LLVM on Power Compiler Development E-mail: jji at us.ibm.com From: James Molloy <james at
2009 Sep 18
0
[LLVMdev] OT: intel darwin losing primary target status
I dug into this. Based on the .s files in bugzilla, the latest gcc is now adding dwarf unwind info to describe the function epilog. If you run dwarfdump --eh-frame on the .o files made with the new compiler, you'll see extra dwarf unwind instructions at the end like: ... DW_CFA_advance_loc4 (64) #<-- advance to near end of function
2017 Mar 14
2
[Proposal][RFC] Epilog loop vectorization
On 03/14/2017 11:58 AM, Michael Kuperstein wrote: > I'm still not sure about this, for a few reasons: > > 1) I'd like to try to treat epilogue loops the same way regardless of > whether the main loop was vectorized by hand or automatically. So if > someone hand-wrote an avx-512 16-wide loop, with alias checks, and we > decide it's profitable to vectorize the
2009 Apr 09
3
[LLVMdev] Calling Conventions, function prologs and epilogs.
How/where are function prologs and epilogs generated, is it bespoke C++ code or TableGen generated ? If someone could point me in the right direction please. Many thanks in advance, Aaron -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20090409/fb333...
2017 Feb 28
3
[Proposal][RFC] Epilog loop vectorization
I have tried running both gvn and newgvn but it did not helped in hoisting the alias checks: Please check, maybe I have missed something. <TestCase> void foo (char *A, char *B, char *C, int len) { int i = 0; for (i=0 ; i< len; i++) A[i] = B[i] + C[i]; } <Command> $ opt –O3 –gvn test.ll –o test.opt.ll $ opt –O3 –newgvn test.ll –o test.opt.ll “test.ll” is attached, it
2017 Jan 31
6
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi, Attached is a patch with arm neon optimizations for silk_warped_autocorrelation_FIX(). Please review. Thanks, Felicia -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.xiph.org/pipermail/opus/attachments/20170131/9a912bb4/attachment-0001.html> -------------- next part -------------- A non-text attachment was scrubbed... Name: