Displaying 20 results from an estimated 3000 matches similar to: "[RFC][AArch64] Homogeneous Prolog and Epilog for Size Optimization"
2020 Mar 24
2
[RFC][AArch64] Homogeneous Prolog and Epilog for Size Optimization
Hi Vedant,
Thanks for your interest and comment.
Size-optimization improves page-faults and a start-up time for a large
application, which this enabling also followed.
Even though I didn't see a large regression/complaint on a CPU-bound case,
which is not a typical case for mobile workload, I wanted to be precautious
of enabling it by default.
However, as with default outlining case, I
2017 Jun 09
2
Question about Prolog/Epilog Code Insertion
Hi All,
When seeing the title "Prolog/Epilog Code Insertion", I'd expect
something about XXXFrameLowering.cpp
(particular about emitPrologue/emitEpilogue). But the document [1] is about
unwind. Is it placed at the right
place/section?
Thanks.
[1] http://llvm.org/docs/CodeGenerator.html#prolog-epilog-code-insertion
Regards,
chenwj
--
Wei-Ren Chen (陳韋任)
Homepage:
2010 Apr 12
1
[LLVMdev] Question. about Machinefunction pass, funtion Prolog/Epilog code, stack frame
I am new to the LLVM, and need some help with this points.
1. how can we add special code for the Prolog/Epilog for some
certain functions, this should be done with machinefunction pass, rt?
2. Basically, I want to get the function stack frame, that is the size and
the initial position.
I found
int64_t llvm::MachineFrameInfo::getObjectSize ( int *ObjectIdx* )
const[inline]
This method is
2019 Jul 16
2
MachinePipeliner refactoring
Hi James,
I also think that refactoring the code generation part is a great idea. That code is very complicated and difficult to maintain. I’ve wanted to rewrite that code for a long time, but just have never got to it. There are quite a few edge cases to handle (at least in the current code). I’ll take a deeper look at your patch. The abstractions that you mention, Stage and Block, are good
2010 Apr 07
3
[LLVMdev] Injecting code before function prolog
I'm trying to implement something similar to this:
http://gcc.gnu.org/wiki/SplitStacks in LLVM. The reason I want this
is so that I can have dynamically growing and shrinking stacks in my
programming language. In order to do this, I need to be able to check
for overflow of a stack frame. The methods of doing this are outlined
in the link above, but my intention is to pass the current stack
2019 Jul 15
1
MachinePipeliner refactoring
Hi James:
Personally, I like the idea of refactoring and more abstraction,
But unfortunately, I don't know enough about the edges cases either.
BTW: the prototype is still causing quite some Asseertions in PowerPC -
some nodes are not generated in correct order.
Best,
Jinsong Ji (纪金松), PhD.
XL/LLVM on Power Compiler Development
E-mail: jji at us.ibm.com
From: James Molloy <james at
2010 Apr 10
0
[LLVMdev] Injecting code before function prolog
On Wed, Apr 7, 2010 at 12:43 PM, Arlen Cox <arlencox at gmail.com> wrote:
> I'm trying to implement something similar to this:
> http://gcc.gnu.org/wiki/SplitStacks in LLVM. The reason I want this
> is so that I can have dynamically growing and shrinking stacks in my
> programming language. In order to do this, I need to be able to check
> for overflow of a stack frame.
2017 Feb 06
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi Jean-Marc,
Thanks a lot for reviewing this huge assembly function!
silk_warped_autocorrelation_FIX_c()'s kernel part is
for( n = 0; n < length; n++ ) {
tmp1_QS = silk_LSHIFT32( (opus_int32)input[ n ], QS );
/* Loop over allpass sections */
for( i = 0; i < order; i++ ) {
/* Output of allpass section */
tmp2_QS = silk_SMLAWB(
2017 Feb 07
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
This is a great idea. But the order (psEncC->shapingLPCOrder) can be
configured to 12, 14, 16, 20 and 24 according to complexity parameter.
It's hard to get a universal function to handle all these orders
efficiently. Any suggestions?
Thanks,
Linfeng
On Mon, Feb 6, 2017 at 12:40 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote:
> Hi Linfeng,
>
> On 06/02/17 02:51 PM,
2017 Feb 07
3
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi Jean-Marc,
Thanks for your suggestions. Will get back to you once we have some updates.
Linfeng
On Mon, Feb 6, 2017 at 5:47 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote:
> Hi Linfeng,
>
> On 06/02/17 07:18 PM, Linfeng Zhang wrote:
> > This is a great idea. But the order (psEncC->shapingLPCOrder) can be
> > configured to 12, 14, 16, 20 and 24 according to
2015 Aug 16
2
[LLVMdev] Adding a stack probe function attribute
I started to implement inlining of the stack probe function based on
Microsoft's inlined stack probes in
https://github.com/Microsoft/llvm/tree/MS.
Do we know why the stack pointer cannot be updated in a loop (which
results in ideal code)? I noticed that was commented in Microsoft's
code.
I suspect this is due to debug or unwinding information, since it is
allowed on Windows x86-32.
I
2017 Apr 05
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
I attached a new patch with small cleanup (disassembly is identical as the
last patch). We have done the same internal testing as usual.
Also, attached 2 failed temporary versions which try to reduce code size
(just for code review reference purpose).
The new patch of silk_warped_autocorrelation_FIX_neon() has a code size of
3,228 bytes (with gcc).
smaller_slower.c has a code size of 2,304
2019 Jul 15
2
MachinePipeliner refactoring
Hi Brendan (and friends of MachinePipeliner, +llvm-dev for openness),
Over the past week or so I've been attempting to extend the
MachinePipeliner to support different idioms of code generation. To make
this a bit more concrete, there are two areas where the currently generated
code could be improved depending on architecture:
1) The epilog blocks peel off the final iterations in reverse
2017 Apr 05
4
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Thank Jean-Marc!
The speedup percentages are all relative to the entire encoder.
Comparing to master, this optimization patch speeds up fixed-point SILK
encoder on NEON as following: Complexity 5: 6.1% Complexity 6: 5.8%
Complexity 8: 5.5% Complexity 10: 4.0%
when testing on an Acer Chromebook, ARMv7 Processor rev 3 (v7l), CPU max
MHz: 2116.5
Thanks,
Linfeng
On Wed, Apr 5, 2017 at 11:02 AM,
2011 Aug 22
0
[LLVMdev] Segmented Stacks (re-roll)
Hi Sanjoy,
The patch generally looks fine except for this part:
diff --git a/lib/CodeGen/StackSegmenter.cpp b/lib/CodeGen/StackSegmenter.cpp
new file mode 100644
index 0000000..5ffb8f2
--- /dev/null
+++ b/lib/CodeGen/StackSegmenter.cpp
@@ -0,0 +1,48 @@
+//===-- StackSegmenter.h - Prolog/Epilog code insertion -------*- C++ -* --===//
The comment is obviously incorrect.
diff --git
2009 Apr 09
3
[LLVMdev] Calling Conventions, function prologs and epilogs.
How/where are function prologs and epilogs generated, is it bespoke C++ code or TableGen generated ?
If someone could point me in the right direction please.
Many thanks in advance,
Aaron
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20090409/fb3336e4/attachment.html>
2011 Aug 23
2
[LLVMdev] Segmented Stacks (re-roll)
Hi!
> diff --git a/lib/CodeGen/StackSegmenter.cpp b/lib/CodeGen/StackSegmenter.cpp
> new file mode 100644
> index 0000000..5ffb8f2
> --- /dev/null
> +++ b/lib/CodeGen/StackSegmenter.cpp
> @@ -0,0 +1,48 @@
> +//===-- StackSegmenter.h - Prolog/Epilog code insertion -------*- C++ -* --===//
>
> The comment is obviously incorrect.
Thanks. So much for lifting file
2010 Jun 04
1
[LLVMdev] Heads up: Local register allocator going away
On Jun 4, 2010, at 3:05 AM, Sylvere Teissier wrote:
>
> In my target the CALL instruction change the link Register %LR
> In the target InstrInfo.td I have "Defs=[LR]" on the CALL instruction
> definition to handle that.
So your CALL instructions are clobbering your callee-saved registers, eh? ;-)
> It works well with others registers allocators: when there is a call
2005 Jul 27
2
[LLVMdev] Making a pass available to llc?
On 7/26/05, Reid Spencer <reid at x10sys.com> wrote:
> On Tue, 2005-07-26 at 17:25 -0700, Michael McCracken wrote:
>
> > Since I'm modifying llc, I have a couple small questions about that code:
> >
> > opt and analyze (and a couple of other places) add a verifier pass,
> > but llc doesn't.
> > This would seem to make sense for llc as well -
2007 Sep 06
1
[LLVMdev] Prolog/Epilog Insertion Question
I've been looking through the code for pologue/epilogoue generation and
noticed this oddity:
void PEI::replaceFrameIndices(MachineFunction &Fn) {
[...]
for (MachineBasicBlock::iterator I = BB->begin(); I != BB->end(); ) {
[...]
if (I->getOpcode() == FrameSetupOpcode ||
I->getOpcode() == FrameDestroyOpcode) {
[...]
} else {