similar to: [LLVMdev] Mapping registers into memory address

Displaying 20 results from an estimated 4000 matches similar to: "[LLVMdev] Mapping registers into memory address"

2015 May 21
2
[LLVMdev] How can I remove these redundant copy between registers?
Hi, I've been working on a Blackfin backend (llvm-3.6.0) based on the previous one that was removed in llvm-3.1. llc generates codes like this: 29 p1 = r2; 30 r5 = [p1]; 31 p1 = r2; 32 r6 = [p1 + 4]; 33 r5 = r6 + r5; 34 r6 = [p0 + -4]; 35 r5 *= r6; 36 p1 = r2; 37 r6 = [p1 + 8]; 38 p1 = r2; p1 and r2 are in different register classes. A p*
2009 Feb 13
3
[LLVMdev] Modeling GPU vector registers, again (with my implementation)
It seems to me that LLVM sub-register is not for the following hardware architecture. All instructions of a hardware are vector instructions. All registers contains 4 32-bit FP sub-registers. They are called r0.x, r0.y, r0.z, r0.w. Most instructions write more than one elements in this way: mul r0.xyw, r1, r2 add r0.z, r3, r4 sub r5, r0, r1 Notice that the four elements of r0 are written
2020 Jun 25
2
How to implement load/store for vector predicate register
Hi, there I am writing an backend, and I met a problem. We don't have load/store instructions for vector predicate registers(vpr for short). The hardware has 64 vector registers(vr for short) and 8 vector predicate registers. And there is no move instructions between vr and vpr. vr supports many operations, and vpr supports vpror, vprxor, vprand and vprinv operations. A vr has 512 bits, and
2020 Apr 18
2
Debug symbols are missing in elf
Hello All, I was trying to add Microblaze target to LLVM backend. I was able to generate object file with relocations. and debug symbols. When I try to link this object file with microblaze GCC linker I am getting below errors and debug symbols are missing in it. mb-objdump: DWARF error: found dwarf version '15877', this reader only handles version 2, 3, 4 and 5 information
2020 Apr 18
2
Debug symbols are missing in elf
On Saturday, April 18, 2020, David Blaikie <dblaikie at gmail.com> wrote: > > > On Sat, Apr 18, 2020 at 3:02 AM Nagaraju Mekala via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> Hello All, >> >> I was trying to add Microblaze target to LLVM backend. I was able to >> generate object file with relocations. and debug symbols. >>
2013 Oct 03
1
[LLVMdev] Help with a Microblaze code generation problem.
Sorry if this is a duplicate: I tried to send it last night and it didn't go through. I'm trimming some text to see if it helps. I have a simple program that fails on the Microblaze: int main() { unsigned long long x, y; x = 100; y = 0x8000000000000000ULL; return !(x > y); } As you can see, the test case compares two unsigned long long values. To try to track
2020 Apr 20
2
Debug symbols are missing in elf
On Sat, Apr 18, 2020 at 11:11 PM David Blaikie <dblaikie at gmail.com> wrote: > > Yeah, not sure - you mention the linker produces errors, but the errors you showed looked like objdump errors? Were those errors from trying to dump the linked executable, and not errors that were produced by the linker itself? Yes, as mentioned earlier I was able to generate final executable but it
2009 Feb 13
0
[LLVMdev] Modeling GPU vector registers, again (with my implementation)
On Feb 13, 2009, at 9:47 AM, Alex wrote: > It seems to me that LLVM sub-register is not for the following > hardware architecture. > > All instructions of a hardware are vector instructions. All > registers contains > 4 32-bit FP sub-registers. They are called r0.x, r0.y, r0.z, r0.w. > > Most instructions write more than one elements in this way: > > mul
2020 Jun 26
2
How to implement load/store for vector predicate register
Hi, I am planning to expanding the pseudo instructions in XXXTargetLowering::EmitInstrWithCustomInserter(), and use temporary virtual registers as operands. If I use virtual registers, do I need to mark them as "early clobber"? I saw that sometimes they marked virtual register as "early clobber" in EmitInstrWithCustomInserter() in MIPS backend. What is the effect of marking a
2009 Apr 16
2
[LLVMdev] Using CallingConvLower in ARM target
After wasting an inordinate amount of time trying to get test-suite to run on arm-apple-darwin so I could reproduce your results, attached is a patch that fixes the small copy&paste error of having 8-byte alignment for stack-allocated f64s instead of the proper 4-byte. I've updated the patch to the top of trunk changes as well. deep On Fri, Feb 27, 2009 at 8:31 PM, Sandeep Patel
2017 Oct 20
1
[PATCH v1 01/27] x86/crypto: Adapt assembly for PIE support
On 20 October 2017 at 09:24, Ingo Molnar <mingo at kernel.org> wrote: > > * Thomas Garnier <thgarnie at google.com> wrote: > >> Change the assembly code to use only relative references of symbols for the >> kernel to be PIE compatible. >> >> Position Independent Executable (PIE) support will allow to extended the >> KASLR randomization range below
2017 Oct 20
1
[PATCH v1 01/27] x86/crypto: Adapt assembly for PIE support
On 20 October 2017 at 09:24, Ingo Molnar <mingo at kernel.org> wrote: > > * Thomas Garnier <thgarnie at google.com> wrote: > >> Change the assembly code to use only relative references of symbols for the >> kernel to be PIE compatible. >> >> Position Independent Executable (PIE) support will allow to extended the >> KASLR randomization range below
2016 May 12
2
[LLVMdev] Improving the quality of debug locations / DbgValueHistoryCalculator
> On May 12, 2016, at 11:00 AM, Francois Pichet <pichet2000 at gmail.com> wrote: > > Here is a specific case that make the debugging experiences degraded on my target: > This is a loop simplified CFG: > > BB#0: > %R5<def> = OR_rr %R0, %R49 // this is %R5 only def. > DBG_VALUE %R5, %noreg, !"argc", <!18>; line no:4 > Successors
2017 Nov 05
2
What pattern string corresponds to CopyToReg?
Hmm, okay. Then what's the problem being reported here? I'm not sure what I'm supposed to do with "LLVM ERROR: Cannot select: t1: i16 = Constant<127>".BTW, the function is: ; ModuleID = 'return.c' source_filename = "return.c" target datalayout = "E-m:e-p:16:16:16-i1:16:16-i8:16:16-i16:16:16-i32:16:16-i64:16:16-S16-n16" target triple =
2016 Nov 17
2
Loop invariant not being optimized
I've got an example where I think that there should be some loop-invariant optimization happening, but it's not. Here's the C code: #define DIM 8 #define UNROLL_DIM DIM typedef double InArray[DIM][DIM]; __declspec(noalias) void f1( InArray c, const InArray a, const InArray b ) { #pragma clang loop unroll_count(UNROLL_DIM) for( int i=0;i<DIM;i++) #pragma clang loop
2017 Nov 05
2
What pattern string corresponds to CopyToReg?
Well, that's the thing: I thought that was CopyToReg. I don't know what the name of the node is to load one value into a register, so I don't know how to construct such a pattern. On Sat, Nov 4, 2017 at 9:23 PM Craig Topper <craig.topper at gmail.com> wrote: > Do you have a pattern for loading an i16 immediate into a 16-bit register? > > ~Craig > > On Sat, Nov 4,
2012 Jun 29
2
[LLVMdev] Request for merge: GHC/ARM calling convention.
Hi Renato, On 06/25/12 12:13 AM, Renato Golin wrote: > Hi Karel, > > I understand this patch has already been merged (to 3.0), so don't > take my question as stopping the merge to head, I'm just making sure I > got it right... The rest looks correct. > > + CCIfType<[v2f64], CCAssignToReg<[Q4, Q5]>>, > + CCIfType<[f64], CCAssignToReg<[D8, D9,
2009 Feb 16
2
[LLVMdev] Modeling GPU vector registers, again (with my implementation)
Evan Cheng-2 wrote: > > Well, how many possible permutations are there? Is it possible to > model each case as a separate physical register? > > Evan > I don't think so. There are 4x4x4x4 = 256 permutations. For example: * xyzw: default * zxyw * yyyy: splat Even if can model each of these 256 cases as a separate physical register, how can I model the use of r0.xyzw in
2020 Sep 23
2
Incorrect Cortex-R4/R4F/R5 ProcessorModel in ARM.td
In ARM.td, I see that the ProcessorModel for cortex-r4, cortex-r4f, and cortex-r5 (as well as r7 and r8) is based on "CortexA8Model", which seems incorrect. When this was added in 2015, there were also comments associated with this configuration, such as "// FIXME: R5 has currently the same ProcessorModel as A8" (later removed). The processor model for Cortex-r52 appears to
2014 Oct 24
3
[LLVMdev] IndVar widening in IndVarSimplify causing performance regression on GPU programs
Hi, I noticed a significant performance regression (up to 40%) on some internal CUDA benchmarks (a reduced example presented below). The root cause of this regression seems that IndVarSimpilfy widens induction variables assuming arithmetics on wider integer types are as cheap as those on narrower ones. However, this assumption is wrong at least for the NVPTX64 target. Although the NVPTX64 target