similar to: [RFC][AArch64] Homogeneous Prolog and Epilog for Size Optimization

Displaying 20 results from an estimated 3000 matches similar to: "[RFC][AArch64] Homogeneous Prolog and Epilog for Size Optimization"

2020 Mar 24
2
[RFC][AArch64] Homogeneous Prolog and Epilog for Size Optimization
Hi Vedant, Thanks for your interest and comment. Size-optimization improves page-faults and a start-up time for a large application, which this enabling also followed. Even though I didn't see a large regression/complaint on a CPU-bound case, which is not a typical case for mobile workload, I wanted to be precautious of enabling it by default. However, as with default outlining case, I
2017 Jun 09
2
Question about Prolog/Epilog Code Insertion
Hi All, When seeing the title "Prolog/Epilog Code Insertion", I'd expect something about XXXFrameLowering.cpp (particular about emitPrologue/emitEpilogue). But the document [1] is about unwind. Is it placed at the right place/section? Thanks. [1] http://llvm.org/docs/CodeGenerator.html#prolog-epilog-code-insertion Regards, chenwj -- Wei-Ren Chen (陳韋任) Homepage:
2010 Apr 12
1
[LLVMdev] Question. about Machinefunction pass, funtion Prolog/Epilog code, stack frame
I am new to the LLVM, and need some help with this points. 1. how can we add special code for the Prolog/Epilog for some certain functions, this should be done with machinefunction pass, rt? 2. Basically, I want to get the function stack frame, that is the size and the initial position. I found int64_t llvm::MachineFrameInfo::getObjectSize ( int *ObjectIdx* ) const[inline] This method is
2019 Jul 16
2
MachinePipeliner refactoring
Hi James, I also think that refactoring the code generation part is a great idea. That code is very complicated and difficult to maintain. I’ve wanted to rewrite that code for a long time, but just have never got to it. There are quite a few edge cases to handle (at least in the current code). I’ll take a deeper look at your patch. The abstractions that you mention, Stage and Block, are good
2010 Apr 07
3
[LLVMdev] Injecting code before function prolog
I'm trying to implement something similar to this: http://gcc.gnu.org/wiki/SplitStacks in LLVM. The reason I want this is so that I can have dynamically growing and shrinking stacks in my programming language. In order to do this, I need to be able to check for overflow of a stack frame. The methods of doing this are outlined in the link above, but my intention is to pass the current stack
2019 Jul 15
1
MachinePipeliner refactoring
Hi James: Personally, I like the idea of refactoring and more abstraction, But unfortunately, I don't know enough about the edges cases either. BTW: the prototype is still causing quite some Asseertions in PowerPC - some nodes are not generated in correct order. Best, Jinsong Ji (纪金松), PhD. XL/LLVM on Power Compiler Development E-mail: jji at us.ibm.com From: James Molloy <james at
2010 Apr 10
0
[LLVMdev] Injecting code before function prolog
On Wed, Apr 7, 2010 at 12:43 PM, Arlen Cox <arlencox at gmail.com> wrote: > I'm trying to implement something similar to this: > http://gcc.gnu.org/wiki/SplitStacks in LLVM.  The reason I want this > is so that I can have dynamically growing and shrinking stacks in my > programming language.  In order to do this, I need to be able to check > for overflow of a stack frame.
2017 Feb 06
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi Jean-Marc, Thanks a lot for reviewing this huge assembly function! silk_warped_autocorrelation_FIX_c()'s kernel part is for( n = 0; n < length; n++ ) { tmp1_QS = silk_LSHIFT32( (opus_int32)input[ n ], QS ); /* Loop over allpass sections */ for( i = 0; i < order; i++ ) { /* Output of allpass section */ tmp2_QS = silk_SMLAWB(
2017 Feb 07
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
This is a great idea. But the order (psEncC->shapingLPCOrder) can be configured to 12, 14, 16, 20 and 24 according to complexity parameter. It's hard to get a universal function to handle all these orders efficiently. Any suggestions? Thanks, Linfeng On Mon, Feb 6, 2017 at 12:40 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote: > Hi Linfeng, > > On 06/02/17 02:51 PM,
2017 Feb 07
3
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi Jean-Marc, Thanks for your suggestions. Will get back to you once we have some updates. Linfeng On Mon, Feb 6, 2017 at 5:47 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote: > Hi Linfeng, > > On 06/02/17 07:18 PM, Linfeng Zhang wrote: > > This is a great idea. But the order (psEncC->shapingLPCOrder) can be > > configured to 12, 14, 16, 20 and 24 according to
2015 Aug 16
2
[LLVMdev] Adding a stack probe function attribute
I started to implement inlining of the stack probe function based on Microsoft's inlined stack probes in https://github.com/Microsoft/llvm/tree/MS. Do we know why the stack pointer cannot be updated in a loop (which results in ideal code)? I noticed that was commented in Microsoft's code. I suspect this is due to debug or unwinding information, since it is allowed on Windows x86-32. I
2017 Apr 05
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
I attached a new patch with small cleanup (disassembly is identical as the last patch). We have done the same internal testing as usual. Also, attached 2 failed temporary versions which try to reduce code size (just for code review reference purpose). The new patch of silk_warped_autocorrelation_FIX_neon() has a code size of 3,228 bytes (with gcc). smaller_slower.c has a code size of 2,304
2019 Jul 15
2
MachinePipeliner refactoring
Hi Brendan (and friends of MachinePipeliner, +llvm-dev for openness), Over the past week or so I've been attempting to extend the MachinePipeliner to support different idioms of code generation. To make this a bit more concrete, there are two areas where the currently generated code could be improved depending on architecture: 1) The epilog blocks peel off the final iterations in reverse
2017 Apr 05
4
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Thank Jean-Marc! The speedup percentages are all relative to the entire encoder. Comparing to master, this optimization patch speeds up fixed-point SILK encoder on NEON as following: Complexity 5: 6.1% Complexity 6: 5.8% Complexity 8: 5.5% Complexity 10: 4.0% when testing on an Acer Chromebook, ARMv7 Processor rev 3 (v7l), CPU max MHz: 2116.5 Thanks, Linfeng On Wed, Apr 5, 2017 at 11:02 AM,
2011 Aug 22
0
[LLVMdev] Segmented Stacks (re-roll)
Hi Sanjoy, The patch generally looks fine except for this part: diff --git a/lib/CodeGen/StackSegmenter.cpp b/lib/CodeGen/StackSegmenter.cpp new file mode 100644 index 0000000..5ffb8f2 --- /dev/null +++ b/lib/CodeGen/StackSegmenter.cpp @@ -0,0 +1,48 @@ +//===-- StackSegmenter.h - Prolog/Epilog code insertion -------*- C++ -* --===// The comment is obviously incorrect. diff --git
2009 Apr 09
3
[LLVMdev] Calling Conventions, function prologs and epilogs.
How/where are function prologs and epilogs generated, is it bespoke C++ code or TableGen generated ? If someone could point me in the right direction please. Many thanks in advance, Aaron -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20090409/fb3336e4/attachment.html>
2011 Aug 23
2
[LLVMdev] Segmented Stacks (re-roll)
Hi! > diff --git a/lib/CodeGen/StackSegmenter.cpp b/lib/CodeGen/StackSegmenter.cpp > new file mode 100644 > index 0000000..5ffb8f2 > --- /dev/null > +++ b/lib/CodeGen/StackSegmenter.cpp > @@ -0,0 +1,48 @@ > +//===-- StackSegmenter.h - Prolog/Epilog code insertion -------*- C++ -* --===// > > The comment is obviously incorrect. Thanks. So much for lifting file
2010 Jun 04
1
[LLVMdev] Heads up: Local register allocator going away
On Jun 4, 2010, at 3:05 AM, Sylvere Teissier wrote: > > In my target the CALL instruction change the link Register %LR > In the target InstrInfo.td I have "Defs=[LR]" on the CALL instruction > definition to handle that. So your CALL instructions are clobbering your callee-saved registers, eh? ;-) > It works well with others registers allocators: when there is a call
2005 Jul 27
2
[LLVMdev] Making a pass available to llc?
On 7/26/05, Reid Spencer <reid at x10sys.com> wrote: > On Tue, 2005-07-26 at 17:25 -0700, Michael McCracken wrote: > > > Since I'm modifying llc, I have a couple small questions about that code: > > > > opt and analyze (and a couple of other places) add a verifier pass, > > but llc doesn't. > > This would seem to make sense for llc as well -
2007 Sep 06
1
[LLVMdev] Prolog/Epilog Insertion Question
I've been looking through the code for pologue/epilogoue generation and noticed this oddity: void PEI::replaceFrameIndices(MachineFunction &Fn) { [...] for (MachineBasicBlock::iterator I = BB->begin(); I != BB->end(); ) { [...] if (I->getOpcode() == FrameSetupOpcode || I->getOpcode() == FrameDestroyOpcode) { [...] } else {