thr3ads.net - similar to: "Why does LLVM keep some loads in the loops even after applying the O3 optimization?"

Displaying 20 results from an estimated 3000 matches similar to: "Why does LLVM keep some loads in the loops even after applying the O3 optimization?"

Why does LLVM keep some loads in the loops even after applying the O3 optimization?

2019 Mar 28

Ryan Taylor via llvm-dev <llvm-dev at lists.llvm.org> writes: > r0 gets overwritten inside the loop (assuming dst, src, src), is ldr > r0, [r5] needed to initialize r0 for the loop at each iteration? Register allocation should handle that if the load is hoisted. I'm with the others. The printf is the most likely culprit. -David > On Thu, Mar 28, 2019

[LLVMdev] LLVM CodeGen Engineer job opening with Apple's compiler team

2011 May 26

Hi all, The LLVM CodeGen and Tools team at Apple is looking for exceptional compiler engineers. This is a great opportunity to work with many of the leaders in the LLVM community. If you are interested in this position, please send your resume / CV and relevant information to evan.cheng at apple.com. Thanks, Evan Job description The Apple compiler team is seeking an engineer who is strongly

[LLVMdev] Post-inc combining

2011 Feb 07

When I compile the following program (for ARM): for(i=0;i<n2;i+=n3) { s+=a[i]; } , with GCC, I get the following loop body, with a post-modify load: .L4: add r1, r1, r3 ldr r4, [ip], r6 rsb r5, r3, r1 cmp r2, r5 add r0, r0, r4 bgt .L4 With LLVM, however, I get: .LBB0_3: @

[LLVMdev] Not enough optimisations in the SelectionDAG phase?

2012 Apr 25

For the following code fragment, ; <label>:27 ; preds = %27, %entry %28 = load volatile i32* inttoptr (i64 2149581832 to i32*), align 8 %29 = icmp slt i32 %28, 0 br i1 %29, label %27, label %loop.exit loop.exit: ; preds = %27 llc will generate following MIPS code, $BB0_1: lui $3, 32800 ori $3, $3, 1032 lw

Speex crashing on ARM with assembler optimization enabled.

2007 Dec 12

Hi, I'm trying to get speex working on an ARM board (ARM926EJ-Sid(wb) core, ARM 5TE architecture) and getting segfaults if built with the "--enable-fixed-point --enable-arm5e-asm" options. If I use just "--enable-fixed-point", then it runs fine, but once I add "--enable-arm5e-asm" it starts crashing (I use testenc to test it). Further investigation showed that it

[LLVMdev] Post-inc combining

2011 Jan 28

On Jan 27, 2011, at 11:13 PM, Jonas Paulsson wrote: > Hi, > > I would like to transform an LLVM function containing a load and an add of the base address inside a loop to a post-incremented load. In DAGCombiner.cpp::CombineToPostIndexedLoadStore(), it says it cannot fold the add for instance if it is a predecessor/successor of the load. I find this odd, as this > is exactly what I

[LLVMdev] introducing sign extending halfword loads into the LLVM IR

2013 Jan 23

Hi Bjorn, could you file a bug on llvm.org/bugs and cc me on it? Thanks, Arnold > So it appears that the ARM backend also has big problems with sign-extending loads. > > I've compiled the following loop > > short in[]; > int out[]; > int value; > > for (i = 0; i < nr; i++) { > value = in[i]; > if (value>2047) >

[LLVMdev] Post-inc combining

2011 Jan 28

Hi, I would like to transform an LLVM function containing a load and an add of the base address inside a loop to a post-incremented load. In DAGCombiner.cpp::CombineToPostIndexedLoadStore(), it says it cannot fold the add for instance if it is a predecessor/successor of the load. I find this odd, as this is exactly what I would like to handle: a simple loop with an address that is incremented in

[LLVMdev] Not enough optimisations in the SelectionDAG phase?

2012 Apr 29

On Apr 24, 2012, at 11:48 PM, Fan Dawei wrote: > For the following code fragment, > > ; <label>:27 ; preds = %27, %entry > %28 = load volatile i32* inttoptr (i64 2149581832 to i32*), align 8 > %29 = icmp slt i32 %28, 0 > br i1 %29, label %27, label %loop.exit > > loop.exit: ; preds = %27

[LLVMdev] Not enough optimisations in the SelectionDAG phase?

2012 Apr 29

On 04/29/2012 01:19 PM, Evan Cheng wrote: > On Apr 24, 2012, at 11:48 PM, Fan Dawei wrote: > >> For the following code fragment, >> >> ;<label>:27 ; preds = %27, %entry >> %28 = load volatile i32* inttoptr (i64 2149581832 to i32*), align 8 >> %29 = icmp slt i32 %28, 0 >> br i1 %29, label %27, label

[LLVMdev] Thumb-2 code generation error in Apple LLVM at all optimization levels

2011 Nov 12

This would be best reported to Apple's Radar bug database at http://bugreport.apple.com/ but its whole website has been down for a while. I have a 100% reproducible Thumb-2 code generation error that occurs at all of the levels of optimization available in the Xcode 4.2 for Snow Leopard build settings GUI: -O0, -O1, -O2, -O3 and -Os. However the bad machine code only occurs in Release

[LLVMdev] Question about ARM/vfp/NEON code generation

2011 May 27

I have a code generation question for ARM with VFP and NEON. I am generating code for the following function as a test: void FloatingPointTest(float f1, float f2, float f3) { float f4 = f1 * f2; if (f4 > f3) printf("%f\n",f2); else printf("%f\n",f3); } I have tried compiling with: 1. -mfloat-abi=softfp and -mfpu=neon 2.

[LLD] Linker Relaxation

2017 Jul 11

Here's an example using the gcc toolchain for embedded 32 bit RISC-V (my HiFive1 board): #include <stdio.h> int foo(int i){ if (i < 100){ printf("%d\n", i); } return i; } int main(){ foo(10); return 0; } After compiling to a .o with -O2 -march=RV32IC we get (just looking at foo) 00000000 <foo>: 0: 1141 addi sp,sp,-16

[LLD] Linker Relaxation

2017 Jul 11

Hi, Does lld support linker relaxation that may shrink code size? As far as I can see, lld seems to assume that the content of input sections is fixed other than patching up relocations, but I believe some targets may benefit from the extra optimization opportunity with relaxation. Specifically, I'm currently working on adding support for RISC-V in lld, and RISC-V heavily relies on linker relaxation

[PATCH v4 00/25] xen: ARMv7 with virtualization extensions

2012 Jan 09

Hello everyone, this is the fourth version of the patch series that introduces ARMv7 with virtualization extensions support in Xen. The series allows Xen and Dom0 to boot on a Cortex-A15 based Versatile Express simulator. See the following announcement email for more information about what we are trying to achieve, as well as the original git history: See

[LLVMdev] Incorrect execution of global constructor with JIT on ARM

2010 Feb 17

On 15 February 2010 14:49, Martins Mozeiko <49640f8a at gmail.com> wrote: > #include <stdio.h> > struct Global { > typedef unsigned char ArrayType[4]; > ArrayType value; > Global(const ArrayType& arg) { > for (int i = 0; i < 4; i++) this->value[i] = arg[i]; > } > }; > static const unsigned char arr[] = { 1, 2, 3, 4 }; > static const Global

[LLVMdev] LLVM 3.0 release notes ARM Target

2011 Nov 16

what do you mean by "more optimal instructions" ? -omer On Wed, Nov 16, 2011 at 1:28 AM, Joe Abbey <jabbey at arxan.com> wrote: > I've done a first pass over the past 6 months of changes and some notable > things stood out: > > * The ARM backend has reworked Set Jump Long Jump EH Lowering. > * The ARM backend includes improved support for Cortex-M > *

[PATCH RFC 00/25] xen: ARMv7 with virtualization extensions

2011 Dec 06

Hello everyone, this is the very first version of the patch series that introduces ARMv7 with virtualization extensions support in Xen. The series allows Xen and Dom0 to boot on a Cortex-A15 based Versatile Express simulator. See the following announcement email for more information about what we are trying to achieve, as well as the original git history: See

[ARM] Register pressure with -mthumb forces register reload before each call

2020 Apr 15

On Wed, 15 Apr 2020 at 03:36, John Brawn <John.Brawn at arm.com> wrote: > > > Could you please point out what I am doing wrong in the patch? > > It's because you're getting the function name by doing > callee->getName().str().c_str() > The str() call generates a temporary copy of the name which ceases to exist outside of this expression > causing the
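
The lifetime issue described in this reply can be sketched in plain C++ without LLVM. This is only an illustration under stated assumptions: `Callee`, `owned_name`, `name_length` and `demo` are hypothetical names, and `Callee::str()` stands in for any accessor that returns a `std::string` by value, as the reply says `str()` does. The patch under discussion is not shown in the snippet, so this is not its actual fix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in: str() returns a std::string by value, i.e. a fresh copy.
struct Callee {
    std::string name;
    std::string str() const { return name; }
};

// BROKEN, shown only as a comment: the temporary std::string is destroyed at
// the end of the full expression, so the returned pointer would dangle.
// const char *dangling_name(const Callee &c) { return c.str().c_str(); }

// Safer: hand back the std::string itself so the caller owns the storage.
std::string owned_name(const Callee &c) { return c.str(); }

// Using the pointer inside the same full expression is fine, because the
// temporary is still alive while strlen runs.
std::size_t name_length(const Callee &c) { return std::strlen(c.str().c_str()); }

// Small usage example: keep the string in a named variable for as long as
// the const char* obtained from it is needed.
void demo() {
    Callee c{"foo"};
    std::string n = owned_name(c);   // n outlives the pointer taken below
    const char *p = n.c_str();       // valid while n is alive and unmodified
    assert(std::strcmp(p, "foo") == 0);
}
```

The design choice is simply to keep the copy in a named `std::string` whose lifetime covers every use of the `const char *` taken from it.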

[LLVMdev] Incorrect execution of global constructor with JIT on ARM

2010 Feb 15

Hello, llvm developers! I am running LLVM with JIT on ARM. For simple programs it runs ok, but for larger code I have stumbled upon some issues. See the following C++ code to which I have reduced the problem: #include <stdio.h> struct Global { typedef unsigned char ArrayType[4]; ArrayType value; Global(const ArrayType& arg) { for (int i = 0; i < 4; i++) this->value[i] =
