thr3ads.net - similar to: "[LLVMdev] Mapping registers into memory address"

Displaying 20 results from an estimated 4000 matches similar to: "[LLVMdev] Mapping registers into memory address"

[LLVMdev] How can I remove these redundant copy between registers?

2015 May 21

[LLVMdev] How can I remove these redundant copy between registers?

Hi, I've been working on a Blackfin backend (llvm-3.6.0) based on the previous one that was removed in llvm-3.1. llc generates codes like this: 29 p1 = r2; 30 r5 = [p1]; 31 p1 = r2; 32 r6 = [p1 + 4]; 33 r5 = r6 + r5; 34 r6 = [p0 + -4]; 35 r5 *= r6; 36 p1 = r2; 37 r6 = [p1 + 8]; 38 p1 = r2; p1 and r2 are in different register classes. A p*

[LLVMdev] Modeling GPU vector registers, again (with my implementation)

2009 Feb 13

[LLVMdev] Modeling GPU vector registers, again (with my implementation)

It seems to me that LLVM sub-register is not for the following hardware architecture. All instructions of a hardware are vector instructions. All registers contains 4 32-bit FP sub-registers. They are called r0.x, r0.y, r0.z, r0.w. Most instructions write more than one elements in this way: mul r0.xyw, r1, r2 add r0.z, r3, r4 sub r5, r0, r1 Notice that the four elements of r0 are written

How to implement load/store for vector predicate register

2020 Jun 25

How to implement load/store for vector predicate register

Hi, there I am writing an backend, and I met a problem. We don't have load/store instructions for vector predicate registers(vpr for short). The hardware has 64 vector registers(vr for short) and 8 vector predicate registers. And there is no move instructions between vr and vpr. vr supports many operations, and vpr supports vpror, vprxor, vprand and vprinv operations. A vr has 512 bits, and

Debug symbols are missing in elf

2020 Apr 18

Debug symbols are missing in elf

Hello All, I was trying to add Microblaze target to LLVM backend. I was able to generate object file with relocations. and debug symbols. When I try to link this object file with microblaze GCC linker I am getting below errors and debug symbols are missing in it. mb-objdump: DWARF error: found dwarf version '15877', this reader only handles version 2, 3, 4 and 5 information

Debug symbols are missing in elf

2020 Apr 18

Debug symbols are missing in elf

On Saturday, April 18, 2020, David Blaikie <dblaikie at gmail.com> wrote: > > > On Sat, Apr 18, 2020 at 3:02 AM Nagaraju Mekala via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> Hello All, >> >> I was trying to add Microblaze target to LLVM backend. I was able to >> generate object file with relocations. and debug symbols. >>

[LLVMdev] Help with a Microblaze code generation problem.

2013 Oct 03

[LLVMdev] Help with a Microblaze code generation problem.

Sorry if this is a duplicate: I tried to send it last night and it didn't go through. I'm trimming some text to see if it helps. I have a simple program that fails on the Microblaze: int main() { unsigned long long x, y; x = 100; y = 0x8000000000000000ULL; return !(x > y); } As you can see, the test case compares two unsigned long long values. To try to track

Debug symbols are missing in elf

2020 Apr 20

Debug symbols are missing in elf

On Sat, Apr 18, 2020 at 11:11 PM David Blaikie <dblaikie at gmail.com> wrote: > > Yeah, not sure - you mention the linker produces errors, but the errors you showed looked like objdump errors? Were those errors from trying to dump the linked executable, and not errors that were produced by the linker itself? Yes, as mentioned earlier I was able to generate final executable but it

[LLVMdev] Modeling GPU vector registers, again (with my implementation)

2009 Feb 13

[LLVMdev] Modeling GPU vector registers, again (with my implementation)

On Feb 13, 2009, at 9:47 AM, Alex wrote: > It seems to me that LLVM sub-register is not for the following > hardware architecture. > > All instructions of a hardware are vector instructions. All > registers contains > 4 32-bit FP sub-registers. They are called r0.x, r0.y, r0.z, r0.w. > > Most instructions write more than one elements in this way: > > mul

How to implement load/store for vector predicate register

2020 Jun 26

How to implement load/store for vector predicate register

Hi, I am planning to expanding the pseudo instructions in XXXTargetLowering::EmitInstrWithCustomInserter(), and use temporary virtual registers as operands. If I use virtual registers, do I need to mark them as "early clobber"? I saw that sometimes they marked virtual register as "early clobber" in EmitInstrWithCustomInserter() in MIPS backend. What is the effect of marking a

[LLVMdev] Using CallingConvLower in ARM target

2009 Apr 16

[LLVMdev] Using CallingConvLower in ARM target

After wasting an inordinate amount of time trying to get test-suite to run on arm-apple-darwin so I could reproduce your results, attached is a patch that fixes the small copy&paste error of having 8-byte alignment for stack-allocated f64s instead of the proper 4-byte. I've updated the patch to the top of trunk changes as well. deep On Fri, Feb 27, 2009 at 8:31 PM, Sandeep Patel

[PATCH v1 01/27] x86/crypto: Adapt assembly for PIE support

2017 Oct 20

[PATCH v1 01/27] x86/crypto: Adapt assembly for PIE support

On 20 October 2017 at 09:24, Ingo Molnar <mingo at kernel.org> wrote: > > * Thomas Garnier <thgarnie at google.com> wrote: > >> Change the assembly code to use only relative references of symbols for the >> kernel to be PIE compatible. >> >> Position Independent Executable (PIE) support will allow to extended the >> KASLR randomization range below

[PATCH v1 01/27] x86/crypto: Adapt assembly for PIE support

2017 Oct 20

[PATCH v1 01/27] x86/crypto: Adapt assembly for PIE support

[LLVMdev] Improving the quality of debug locations / DbgValueHistoryCalculator

2016 May 12

[LLVMdev] Improving the quality of debug locations / DbgValueHistoryCalculator

> On May 12, 2016, at 11:00 AM, Francois Pichet <pichet2000 at gmail.com> wrote: > > Here is a specific case that make the debugging experiences degraded on my target: > This is a loop simplified CFG: > > BB#0: > %R5<def> = OR_rr %R0, %R49 // this is %R5 only def. > DBG_VALUE %R5, %noreg, !"argc", <!18>; line no:4 > Successors

What pattern string corresponds to CopyToReg?

2017 Nov 05

What pattern string corresponds to CopyToReg?

Hmm, okay. Then what's the problem being reported here? I'm not sure what I'm supposed to do with "LLVM ERROR: Cannot select: t1: i16 = Constant<127>".BTW, the function is: ; ModuleID = 'return.c' source_filename = "return.c" target datalayout = "E-m:e-p:16:16:16-i1:16:16-i8:16:16-i16:16:16-i32:16:16-i64:16:16-S16-n16" target triple =

Loop invariant not being optimized

2016 Nov 17

Loop invariant not being optimized

I've got an example where I think that there should be some loop-invariant optimization happening, but it's not. Here's the C code: #define DIM 8 #define UNROLL_DIM DIM typedef double InArray[DIM][DIM]; __declspec(noalias) void f1( InArray c, const InArray a, const InArray b ) { #pragma clang loop unroll_count(UNROLL_DIM) for( int i=0;i<DIM;i++) #pragma clang loop

What pattern string corresponds to CopyToReg?

2017 Nov 05

What pattern string corresponds to CopyToReg?

Well, that's the thing: I thought that was CopyToReg. I don't know what the name of the node is to load one value into a register, so I don't know how to construct such a pattern. On Sat, Nov 4, 2017 at 9:23 PM Craig Topper <craig.topper at gmail.com> wrote: > Do you have a pattern for loading an i16 immediate into a 16-bit register? > > ~Craig > > On Sat, Nov 4,

[LLVMdev] Request for merge: GHC/ARM calling convention.

2012 Jun 29

[LLVMdev] Request for merge: GHC/ARM calling convention.

Hi Renato, On 06/25/12 12:13 AM, Renato Golin wrote: > Hi Karel, > > I understand this patch has already been merged (to 3.0), so don't > take my question as stopping the merge to head, I'm just making sure I > got it right... The rest looks correct. > > + CCIfType<[v2f64], CCAssignToReg<[Q4, Q5]>>, > + CCIfType<[f64], CCAssignToReg<[D8, D9,

[LLVMdev] Modeling GPU vector registers, again (with my implementation)

2009 Feb 16

[LLVMdev] Modeling GPU vector registers, again (with my implementation)

Evan Cheng-2 wrote: > > Well, how many possible permutations are there? Is it possible to > model each case as a separate physical register? > > Evan > I don't think so. There are 4x4x4x4 = 256 permutations. For example: * xyzw: default * zxyw * yyyy: splat Even if can model each of these 256 cases as a separate physical register, how can I model the use of r0.xyzw in

Incorrect Cortex-R4/R4F/R5 ProcessorModel in ARM.td

2020 Sep 23

Incorrect Cortex-R4/R4F/R5 ProcessorModel in ARM.td

In ARM.td, I see that the ProcessorModel for cortex-r4, cortex-r4f, and cortex-r5 (as well as r7 and r8) is based on "CortexA8Model", which seems incorrect. When this was added in 2015, there were also comments associated with this configuration, such as "// FIXME: R5 has currently the same ProcessorModel as A8" (later removed). The processor model for Cortex-r52 appears to

[LLVMdev] IndVar widening in IndVarSimplify causing performance regression on GPU programs

2014 Oct 24

[LLVMdev] IndVar widening in IndVarSimplify causing performance regression on GPU programs

Hi, I noticed a significant performance regression (up to 40%) on some internal CUDA benchmarks (a reduced example presented below). The root cause of this regression seems that IndVarSimpilfy widens induction variables assuming arithmetics on wider integer types are as cheap as those on narrower ones. However, this assumption is wrong at least for the NVPTX64 target. Although the NVPTX64 target

similar to: [LLVMdev] Mapping registers into memory address