thr3ads.net - similar to: "[LLVMdev] Marking implicit registers as "trashed""

Displaying 20 results from an estimated 4000 matches similar to: "[LLVMdev] Marking implicit registers as "trashed""

[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.

2012 Oct 24

Hi, I don't know if my LLVM IR code is faulty or if I have spotted a bug in the RegisterCoalescing pass, so I'm posting my issue on the ML. Shader and print-before-all dump are given below. The interesting part is the vreg6/vreg48 reduction: before RegCoalescing, the machine code is: // BEFORE LOOP ... Some COPYs.... 400B %vreg47<def> = COPY %vreg2<kill>; R600_Reg32:%vreg47,%vreg2

[LLVMdev] Tracking down a SELECT expansion to predicated moves

2013 May 13

I've inherited some target code, but there is no SELECT lowering in my target. Yet somewhere, somehow, SELECT is being transformed into a predicated move. I've traced SELECT everywhere in Codegen/SelectionDAG. Any ideas on tracking this down to the point in Codegen lowering/DAG conversion where it becomes a predicated series? Again, I do not have a lowering rule in my target for SELECT. If I do an IR

[LLVMdev] Problem in X86 backend

2014 Oct 27

Hi, I'm having some trouble writing an instruction in the X86 backend. I made a new intrinsic and I wrote a custom inserter for my intrinsic in the X86 backend. Everything works fine, except for one instruction that I can't find how to write. I want to add this instruction in one of my machine basic blocks: mov [rdi], 0 How can I achieve that with the LLVM API? I tried several

TwoAddressInstructionPass bug?

2017 Nov 30

Hi, we are in the midst of some interesting work that began with setting 'guessInstructionProperties = 0' in the SystemZ backend. We have found this to be useful, and discovered many instructions where the hasSideEffects flag was incorrectly set. The attached patch and test case trigger an assert in TwoAddress. (bin/llc ./tc_TwoAddr_crash.ll

[LLVMdev] Register scavenger and SP/FP adjustments

2013 Sep 26

Consider this example:

--- ex.ll ---
declare void @bar()

; Function Attrs: nounwind optsize
define void @main() {
entry:
  %hin = alloca [256 x i32], align 4
  %xin = alloca [256 x i32], align 4
  call void @bar()
  ret void
}
-------------

Freshly built llc:

llc -O2 -march=x86 < ex.ll -print-before-all

# *** IR Dump Before Prologue/Epilogue Insertion & Frame Finalization ***: #

[InstCombine] Simplification sometimes only transforms but doesn't simplify instruction, causing side effect in other pass

2017 Aug 02

Hi, We recently found a testcase showing that simplifications in instcombine sometimes change an instruction without reducing its cost, while causing problems in the TwoAddressInstruction pass. The problem looks generic, and other simplifications may have the same issue. I would like some ideas about the best way to fix this kind of problem. The testcase:

[LLVMdev] Register scavenger and SP/FP adjustments

2013 Sep 26

CallFrameSetupOpcode is a pseudo opcode like X86::ADJCALLSTACKDOWN64. That means the code is expected to be called before the pseudo instructions are eliminated. I don't know why it's not the case for you. A quick look at the PEI code indicates the pseudos should not have been removed at the time when replaceFrameIndices is run. Evan On Sep 25, 2013, at 8:57 AM, Krzysztof

[LLVMdev] Register scavenger and SP/FP adjustments

2013 Sep 25

Hi All, I'm dealing with a problem where the spill/restore instructions inserted during scavenging span an adjustment of the SP/FP register. The result is that despite the base register (SP/FP) being changed between the spill and the restore, both store and load use the same immediate offset. I see code in the PEI (replaceFrameIndices) that is supposed to track the SP/FP adjustment:

[LLVMdev] Register scavenger and SP/FP adjustments

2013 Sep 26

The code has changed a lot over the years. It looks like the assumption was broken at some point. calculateCallsInformation() may have eliminated the pseudo setup instructions already. // If call frames are not being included as part of the stack frame, and

[LLVMdev] Adding masked vector load and store intrinsics

2014 Oct 24

> So %passthrough can *only* be undef or zeroinitializer? No, that wasn't the intent. %passthrough can be any other definition that is needed. Zero and undef were simply two possible values that illustrated some interesting behavior. Mapping of the %passthrough to the actual semantics of many vector instruction sets where the masked instructions leave the masked-off elements of the

[InstCombine] Simplification sometimes only transforms but doesn't simplify instruction, causing side effect in other pass

2017 Aug 02

On Wed, Aug 2, 2017 at 3:36 PM, Matthias Braun <mbraun at apple.com> wrote:
> So to write this in a more condensed form, you have:
>
> %v0 = ...
> %v1 = and %v0, 255
> %v2 = and %v1, 31
> use %v1
> use %v2
>
> and transform this to
> %v0 = ...
> %v1 = and %v0, 255
> %v2 = and %v0, 31
> ...
>
> This is a classical problem with instruction
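
The rewrite quoted above preserves values because every bit set in 31 is also set in 255, so masking with 255 first cannot change the result of masking with 31. A minimal Python sketch of that equivalence follows; the function names `before` and `after` are hypothetical and not taken from the thread, and the sketch models only the arithmetic, not register allocation or the TwoAddressInstruction pass.

```python
def before(v0):
    """Original form: %v2 is computed from %v1."""
    v1 = v0 & 255
    v2 = v1 & 31  # v2 reads v1
    return v1, v2


def after(v0):
    """Transformed form: %v2 is computed directly from %v0."""
    v1 = v0 & 255
    v2 = v0 & 31  # v2 now reads v0, so v0 is still needed at this point
    return v1, v2


# Exhaustive check over all 16-bit inputs: both forms yield the same values.
assert all(before(x) == after(x) for x in range(1 << 16))
```

The values are identical, but the operand of the second `and` changes from `%v1` to `%v0`. That changes which values must stay available at that point, without making the instruction any cheaper.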

[LLVMdev] Register scavenger and SP/FP adjustments

2013 Sep 26

Thanks, I'll look into that. Still, the case where the function does not call anything remains---in such a situation there are no ADJCALLSTACK pseudos, so regardless of what that function you pointed at does, there won't be any target-independent information about the SP adjustment by the time the frame index elimination runs. Would it make sense to have ADJCALLSTACK pseudos every

Greedy register allocator allocates live sub-register

2016 Mar 10

Hi all, I've come across a problem with register allocation which I have been unable to track down the root cause of.

6728B %vreg304<def> = COPY %vreg278; VRF128:%vreg304,%vreg278
6736B %vreg302<def> = COPY %vreg278; VRF128:%vreg302,%vreg278
6752B %vreg278<def,tied1> = foo %vreg278<tied0>, %vreg277, 14, pred:1, pred:%noreg, 5; VRF128:%vreg278 VRF64_l:%vreg277
* bar

[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.

2012 Oct 25

Hi Vincent, On 24/10/2012 23:26, Vincent Lejeune wrote: > Hi, > > I don't know if my LLVM IR code is faulty or if I have spotted a bug in the RegisterCoalescing pass, so I'm posting my issue on the ML. Shader and print-before-all dump are given below. > > The interesting part is the vreg6/vreg48 reduction: before RegCoalescing, the machine code is: > > // BEFORE LOOP >

[VSXFMAMutate] OldFMAReg may be wrongly rewritten

2016 Mar 05

I wonder if we can do this in a separate analysis MachineFunction SSA pass. 1) SelectionDAG will generate a pseudo instruction MutatingFMA. When it's generated it's allowed to have d = a * b + c form, where d doesn't have to be in {a, b, c}. 2) Later, the proposed pass uses an algorithm to decide for instruction MI: `%vreg0 = MutatingFMA %vreg1, %vreg2, %vreg3`, it should tie %vreg0

[LLVMdev] Virtual register problem in X86 backend

2014 Dec 10

Hi, Thx for your help... Here is the IR code: ; ModuleID = 'foo_bar.c' target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" @.str = private unnamed_addr constant [6 x i8] c"MAIN\0A\00", align 1 ; Function Attrs: nounwind uwtable define i32 @main(i32 %argc, i8** %argv) #0 { entry: %retval = alloca i32,

[VSXFMAMutate] OldFMAReg may be wrongly rewritten

2016 Feb 29

Ping? On Mon, Feb 22, 2016 at 1:06 PM Tim Shen <timshen at google.com> wrote: > On Fri, Feb 19, 2016 at 5:10 PM Tim Shen <timshen at google.com> wrote: > >> I wonder if we can fix this by making the transformation simpler, that >> is, instead of doing: >> > > I wrote a prototype (see attach) for this idea, it actually improves some > of the test cases

[VSXFMAMutate] OldFMAReg may be wrongly rewritten

2016 Mar 16

I implemented a proof of concept of a new generic MachineFunction SSA pass. The code is not readable and not efficient yet, but it shows interesting results:

In fma.ll @test_FMSUB2 (return dummy(A * B + C, A * B - D)):

before:
fmr 0, 1
xsmaddadp 3, 0, 2
xsmsubmdp 0, 2, 4
fmr 1, 3
fmr 2, 0
bl dummy2

after:
xsmsubadp 4, 1, 2
xsmaddmdp

[LLVMdev] X86 - Help on fixing a poor code generation bug

2013 Dec 05

Hi all, I noticed that the x86 backend tends to emit unnecessary vector insert instructions immediately after SSE scalar fp instructions like addss/mulss. For example:

/////////////////////////////////
__m128 foo(__m128 A, __m128 B) {
  return _mm_add_ss(A, B);
}
/////////////////////////////////

produces the sequence:

addss %xmm0, %xmm1
movss %xmm1, %xmm0

which could be easily optimized into

[LLVMdev] [PATCH] [MachineSinking] Conservatively clear kill flags after coalescing.

2014 Sep 05

Hi Quentin, Jonas looked further into the problem below, and asked me to submit his patch. Note that we have our own out-of-tree target, and we have not been able to reproduce this problem on an in-tree target. /Patrik Hägglund [MachineSinking] Conservatively clear kill flags after coalescing. This solves the problem of having a kill flag inside a loop with a definition of the register prior to
