thr3ads.net - similar to: "[LLVMdev] Injecting code before function prolog"

Displaying 20 results from an estimated 7000 matches similar to: "[LLVMdev] Injecting code before function prolog"

[LLVMdev] Injecting code before function prolog

2010 Apr 10

On Wed, Apr 7, 2010 at 12:43 PM, Arlen Cox <arlencox at gmail.com> wrote:
> I'm trying to implement something similar to this: http://gcc.gnu.org/wiki/SplitStacks in LLVM. The reason I want this is so that I can have dynamically growing and shrinking stacks in my programming language. In order to do this, I need to be able to check for overflow of a stack frame.

[RFC][AArch64] Homogeneous Prolog and Epilog for Size Optimization

2020 Mar 24

Hello, I'd like to upstream work we have done over time that the community would benefit from. This is part of an effort toward minimizing code size, presented here <https://llvm.org/devmtg/2020-02-23/slides/Kyungwoo-GlobalMachineOutlinerForThinLTO.pdf>. In particular, this RFC is about optimizing prolog and epilog for size. *Homogeneous Prolog and Epilog for Size Optimization, D76570

[RFC][AArch64] Homogeneous Prolog and Epilog for Size Optimization

2020 Mar 24

Hi Vedant, Thanks for your interest and comment. Size optimization reduces page faults and start-up time for a large application, and enabling this followed the same pattern. Even though I didn't see a large regression or complaint in CPU-bound cases, which are not typical for mobile workloads, I wanted to be cautious about enabling it by default. However, as with the default outlining case, I

Question about Prolog/Epilog Code Insertion

2017 Jun 09

Hi All, When seeing the title "Prolog/Epilog Code Insertion", I'd expect something about XXXFrameLowering.cpp (in particular emitPrologue/emitEpilogue). But the document [1] is about unwinding. Is it in the right place/section? Thanks. [1] http://llvm.org/docs/CodeGenerator.html#prolog-epilog-code-insertion Regards, chenwj -- Wei-Ren Chen (陳韋任) Homepage:

[LLVMdev] Problem adding a MachineBasicBlock during X86 EmitPrologue

2010 Jun 18

I'm attempting to add an error handler to functions with a custom calling convention. This error is checked upon function entry, before any code is run (specifically, I cannot allow any stack operations). Because of this, I figured a good place to do this code insertion is in EmitPrologue. I also, at this time, create the block that handles the error case. // create a new block for

[LLVMdev] ARM libgcc dependencies

2008 Nov 15

I was trying to build some code today for an ARM7TDMI, which does not have a hardware divider, and I noticed that LLVM translated divide instructions into a call to libgcc's udivsi3. Is there any way of removing this library dependency and allowing LLVM's link-time optimizer to optimize the generated division code (inline it, merge the div/mod if using both, etc.)? Thanks much, Arlen

[LLVMdev] Question. about Machinefunction pass, funtion Prolog/Epilog code, stack frame

2010 Apr 12

I am new to LLVM and need some help with these points. 1. How can we add special code to the prolog/epilog of certain functions? This should be done with a MachineFunction pass, right? 2. Basically, I want to get the function's stack frame, that is, its size and initial position. I found int64_t llvm::MachineFrameInfo::getObjectSize(int ObjectIdx) const [inline] This method is

[LLVMdev] Prolog/Epilog Insertion Question

2007 Sep 06

I've been looking through the code for prologue/epilogue generation and noticed this oddity: void PEI::replaceFrameIndices(MachineFunction &Fn) { [...] for (MachineBasicBlock::iterator I = BB->begin(); I != BB->end(); ) { [...] if (I->getOpcode() == FrameSetupOpcode || I->getOpcode() == FrameDestroyOpcode) { [...] } else {

[LLVMdev] PR19267 - Add a feature to clobber non-calle-save regs in the prolog.

2014 Mar 27

This is a feature I’m considering for the LLVM backend. Feel free to provide input in the following PR. llvm.org/pr19267 - Add a feature to clobber non-callee-save regs in the prolog. I’m copying llvm-dev because it seems like something that others must have already done or at least thought about at some point. -Andy

[PATCH] customize: Add --ssh-inject option for injecting SSH keys.

2014 Nov 02

This adds a customize option:
  virt-customize --ssh-inject USER[=KEY]
  virt-builder --ssh-inject USER[=KEY]
  virt-sysprep --ssh-inject USER[=KEY]
In each case this either injects the current (host) user's ssh pubkey into the guest user USER (adding it to ~USER/.ssh/authorized_keys in the guest), or you can specify a particular key. For example: virt-builder fedora-20 --ssh-inject root

MachinePipeliner refactoring

2019 Jul 16

Hi James, I also think that refactoring the code generation part is a great idea. That code is very complicated and difficult to maintain. I’ve wanted to rewrite that code for a long time, but just have never got to it. There are quite a few edge cases to handle (at least in the current code). I’ll take a deeper look at your patch. The abstractions that you mention, Stage and Block, are good

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Feb 06

Hi Jean-Marc, Thanks a lot for reviewing this huge assembly function! silk_warped_autocorrelation_FIX_c()'s kernel part is for( n = 0; n < length; n++ ) { tmp1_QS = silk_LSHIFT32( (opus_int32)input[ n ], QS ); /* Loop over allpass sections */ for( i = 0; i < order; i++ ) { /* Output of allpass section */ tmp2_QS = silk_SMLAWB(

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Jan 31

Hi, Attached is a patch with ARM NEON optimizations for silk_warped_autocorrelation_FIX(). Please review. Thanks, Felicia

MachinePipeliner refactoring

2019 Jul 15

Hi James: Personally, I like the idea of refactoring and more abstraction, but unfortunately I don't know enough about the edge cases either. BTW: the prototype is still causing quite a few assertions on PowerPC; some nodes are not generated in the correct order. Best, Jinsong Ji (纪金松), PhD. XL/LLVM on Power Compiler Development E-mail: jji at us.ibm.com From: James Molloy <james at

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Feb 07

This is a great idea. But the order (psEncC->shapingLPCOrder) can be configured to 12, 14, 16, 20 and 24 according to the complexity parameter. It's hard to write a universal function that handles all these orders efficiently. Any suggestions? Thanks, Linfeng
On Mon, Feb 6, 2017 at 12:40 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote:
> Hi Linfeng,
>
> On 06/02/17 02:51 PM,

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Feb 07

Hi Jean-Marc, Thanks for your suggestions. Will get back to you once we have some updates. Linfeng
On Mon, Feb 6, 2017 at 5:47 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote:
> Hi Linfeng,
>
> On 06/02/17 07:18 PM, Linfeng Zhang wrote:
> > This is a great idea. But the order (psEncC->shapingLPCOrder) can be configured to 12, 14, 16, 20 and 24 according to

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Apr 05

I attached a new patch with a small cleanup (the disassembly is identical to the last patch). We have done the same internal testing as usual. Also attached are 2 failed temporary versions that try to reduce code size (for code review reference only). The new patch of silk_warped_autocorrelation_FIX_neon() has a code size of 3,228 bytes (with gcc). smaller_slower.c has a code size of 2,304

[LLVMdev] X86 Frame info question

2004 Jun 09

The X86 backend has this code: X86TargetMachine::X86TargetMachine(const Module &M, IntrinsicLowering *IL) : .... FrameInfo(TargetFrameInfo::StackGrowsDown, 8/*16 for SSE*/, 4), That is, it uses "4" as the local area offset. Based on prior discussion this should mean that the local area starts at address ESP+4. Is this really true? On X86 the stack grows down, so

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Apr 05

Thanks, Jean-Marc! The speedup percentages are all relative to the entire encoder. Compared to master, this optimization patch speeds up the fixed-point SILK encoder on NEON as follows: Complexity 5: 6.1% Complexity 6: 5.8% Complexity 8: 5.5% Complexity 10: 4.0% when testing on an Acer Chromebook, ARMv7 Processor rev 3 (v7l), CPU max MHz: 2116.5 Thanks, Linfeng On Wed, Apr 5, 2017 at 11:02 AM,

MachinePipeliner refactoring

2019 Jul 15

Hi Brendan (and friends of MachinePipeliner, +llvm-dev for openness), Over the past week or so I've been attempting to extend the MachinePipeliner to support different idioms of code generation. To make this a bit more concrete, there are two areas where the currently generated code could be improved depending on architecture: 1) The epilog blocks peel off the final iterations in reverse
