thr3ads.net - similar to: "LiveInterval error with 2 dead defs"

Displaying 20 results from an estimated 400 matches similar to: "LiveInterval error with 2 dead defs"

2019 Oct 07

LiveInterval error with 2 dead defs

The associated patch caused a compilation problems on Hexagon: https://bugs.llvm.org/show_bug.cgi?id=43302 The splitting of a live interval should not be done automatically upon creation. Calling LIS->getInterval(Reg) should not go around changing the code behind the scenes. There is already a function “splitSeparateComponents” that does that. It should be added where it’s missing. --

Fwd: MachineScheduler not scheduling for latency

2019 Sep 09

Fwd: MachineScheduler not scheduling for latency

Hi, I'm trying to understand why MachineScheduler does a poor job in straight line code in cases like the one in the attached debug dump. This is on AMDGPU, an in-order target, and the problem is that the IMAGE_SAMPLE instructions have very high (80 cycle) latency, but in the resulting schedule they are often placed right next to their uses like this: 1784B %140:vgpr_32 =

MachineScheduler not scheduling for latency

2019 Sep 10

MachineScheduler not scheduling for latency

Hi Andy, Thanks for the explanations. Yes AMDGPU is in-order and has MicroOpBufferSize = 1. Re "issue limited" and instruction groups: could it make sense to disable the generic scheduler's detection of issue limitation on in-order CPUs, or on CPUs that don't define instruction groups, or some similar condition? Something like: --- a/lib/CodeGen/MachineScheduler.cpp +++

[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.

2012 Oct 25

[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.

> > PHIElim and TwoAddress passes leave SSA form. > May be a missed something in your code but %vreg48 seems to be there > after PHI elimination. PHIElim tags those kind of registers as being > PHIJoin regs, updating LiveVariables pass, so the regcoalescer is aware > of them (some SSA info is still alive but the reg coalescer will > invalidate that information after

RFC: atomic operations on SI+

2016 Mar 28

RFC: atomic operations on SI+

On Fri, Mar 25, 2016 at 02:22:11PM -0400, Jan Vesely wrote: > Hi Tom, Matt, > > I'm working on a project that needs few coherent atomic operations (HSA > mode: load, store, compare-and-swap) for std::atomic_uint in HCC. > > the attached patch implements atomic compare and swap for SI+ > (untested). I tried to stay within what was available, but there are > few issues

Bug in TableGen RegisterBankEmitter

2017 May 16

Bug in TableGen RegisterBankEmitter

On 05/16/2017 11:57 AM, Daniel Sanders wrote: >> If that's right, one possible fix would be to rename some of the subregister indices but that's likely to be quite painful. I'll have a think and see if I can come up with something nicer. > > I haven't been able to come up with a better answer for this, just an alternate choice as to where the complexity is. If we were

[LLVMdev] Subregister liveness tracking

2013 Oct 07

[LLVMdev] Subregister liveness tracking

I've been working on patches to improve subregister liveness tracking on llvm and I wanted to inform the llvm community about the overal design/motivation for them. I will send the patches to llvm-commits later today. Greetings Matthias Braun Subregisters in llvm ==================== Some targets can access registers in different ways resulting in wider or narrower accesses. For

[LLVMdev] Possible missed optimization?

2011 Mar 26

[LLVMdev] Possible missed optimization?

> > You can look at the output of -debug-only=regcoalescing to see what is > going on. > > This is the debug output i've got, some information is a bit cryptic for me so next is what i understood: ********** SIMPLE REGISTER COALESCING ********** ********** Function: foo ********** JOINING INTERVALS *********** entry: 16L %vreg0<def> = COPY %R25R24<kill>;

[LLVMdev] Possible missed optimization?

2011 Mar 26

[LLVMdev] Possible missed optimization?

On Mar 26, 2011, at 1:04 PM, Borja Ferrer wrote: > Hello Jakob, thanks for the reply. The three regclasses involved here are all subsets from each other and aren't disjoint. These are the basic descriptions of the regclasses involved to show what i mean: > > DREGS: R31R30, R29R28 down to R1R0 (16 regs) > DLDREGS: R31R30, R29R28 down to R17R16 (8 regs) > PTRREGS:

[LLVMdev] Possible missed optimization?

2011 Mar 26

[LLVMdev] Possible missed optimization?

Hello Jakob, thanks for the reply. The three regclasses involved here are all subsets from each other and aren't disjoint. These are the basic descriptions of the regclasses involved to show what i mean: DREGS: R31R30, R29R28 down to R1R0 (16 regs) DLDREGS: R31R30, R29R28 down to R17R16 (8 regs) PTRREGS: R31R30, R29R28, R27R26 (3 regs) All classes intersect each other

[LLVMdev] Subregister liveness tracking

2013 Oct 08

[LLVMdev] Subregister liveness tracking

Currently it will always spill / restore the whole vreg but only spilling the parts that are actually live would be a nice addition in the future. Looking at r192119': if "mtlo" writes to $LO and sets $HI to an unpredictable value, then it should just have an additional (dead) def operand for $hi, shouldn't it? Greetings Matthias Am 10/8/13, 11:03 AM, schrieb Akira

[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.

2012 Oct 25

[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.

When examining the debug output of regalloc, it seems that joining 32bits reg also joins 128 parent reg. If I look at the : %vreg34<def> = COPY %vreg6:sel_y; R600_Reg32:%vreg34 R600_Reg128:%vreg6 instructions ; it gets joined to : 928B%vreg34<def> = COPY %vreg48:sel_y; when vreg6 and vreg48 are joined. It's right. But joining the following copy

RFC: atomic operations on SI+

2016 Mar 25

RFC: atomic operations on SI+

Hi Tom, Matt, I'm working on a project that needs few coherent atomic operations (HSA mode: load, store, compare-and-swap) for std::atomic_uint in HCC. the attached patch implements atomic compare and swap for SI+ (untested). I tried to stay within what was available, but there are few issues that I was unsure how to address: 1.) it currently uses v2i32 for both input and output. This

[ARM] Register pressure with -mthumb forces register reload before each call

2020 Mar 31

[ARM] Register pressure with -mthumb forces register reload before each call

Hi, Compiling attached test-case, which is reduced version of of uECC_shared_secret from tinycrypt library [1], with --target=arm-linux-gnueabi -march=armv6-m -Oz -S results in reloading of register holding function's address before every call to blx: ldr r3, .LCPI0_0 blx r3 mov r0, r6 mov r1, r5 mov r2, r4 ldr r3,

[LLVMdev] Subregister liveness tracking

2013 Oct 08

[LLVMdev] Subregister liveness tracking

What I didn't mention in r192119 is that mthi/lo clobbers the other sub-register only if the contents of hi and lo are produced by mult or other arithmetic instructions (div, madd, etc.) It doesn't have this side-effect if it is produced by another mthi/lo. So I don't think making mthi/lo clobber the other half would work. For example, this is an illegal sequence of instructions,

[ARM] Register pressure with -mthumb forces register reload before each call

2020 Apr 07

[ARM] Register pressure with -mthumb forces register reload before each call

If I'm understanding what's going on in this test correctly, what's happening is: * ARMTargetLowering::LowerCall prefers indirect calls when a function is called at least 3 times in minsize * In thumb 1 (without -fno-omit-frame-pointer) we have effectively only 3 callee-saved registers (r4-r6) * The function has three arguments, so those three plus the register we need to hold the

[LLVMdev] Subregister liveness tracking

2013 Oct 09

[LLVMdev] Subregister liveness tracking

On Oct 8, 2013, at 2:06 PM, Akira Hatanaka <ahatanak at gmail.com> wrote: > What I didn't mention in r192119 is that mthi/lo clobbers the other sub-register only if the contents of hi and lo are produced by mult or other arithmetic instructions (div, madd, etc.) It doesn't have this side-effect if it is produced by another mthi/lo. So I don't think making mthi/lo clobber the

[ARM] Register pressure with -mthumb forces register reload before each call

2020 Apr 15

[ARM] Register pressure with -mthumb forces register reload before each call

Hi, I have attached WIP patch for adding foldMemoryOperand to Thumb1InstrInfo. For the following case: void f(int x, int y, int z) { void bar(int, int, int); bar(x, y, z); bar(x, z, y); bar(y, x, z); bar(y, y, x); } it calls foldMemoryOperand twice, and thus converts two calls from blx to bl. callMI->dump() shows the function name "bar" correctly, however in generated

Spill hoisting on RAL: looking for some debugging ideas

2016 Dec 22

Spill hoisting on RAL: looking for some debugging ideas

Hi, I am debugging private backend and faced interesting problem: sometimes spill hoisting creates double stores. (some output from -debug-only=regalloc). First hoisting: Checking redundant spills for 0 at 16r in %vreg19 [16r,144B:0)[144B,240B:1)[240B,280r:2)[296r,416B:3)[416B,456r:4)[472r,592B:5) 0 at 16r 1 at 144B-phi 2 at 240B-phi 3 at 296r 4 at 416B-phi 5 at 472r Merged to stack int: SS#0

[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.

2012 Oct 25

[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.

Hi Vincent, On 24/10/2012 23:26, Vincent Lejeune wrote: > Hi, > > I don't know if my llvm ir code is faulty, or if I spot a bug in the RegisterCoalescing Pass, so I'm posting my issue on the ML. Shader and print-before-all dump are given below. > > The interessing part is the vreg6/vreg48 reduction : before RegCoalescing, the machine code is : > > // BEFORE LOOP >

similar to: LiveInterval error with 2 dead defs