similar to: [VSXFMAMutate] OldFMAReg may be wrongly rewritten


2016 Feb 22
0
[VSXFMAMutate] OldFMAReg may be wrongly rewritten
On Fri, Feb 19, 2016 at 5:10 PM Tim Shen <timshen at google.com> wrote: > I wonder if we can fix this by making the transformation simpler, that is, > instead of doing: > I wrote a prototype (see attached) for this idea; it actually improves some of the test cases (e.g. fma-assoc.ll: test_FMADD_ASSOC1), but pessimizes several other cases (e.g. test_FMADD_ASSOC_EXT1). I'm not
2016 Mar 25
1
[VSXFMAMutate] OldFMAReg may be wrongly rewritten
On Tue, Mar 22, 2016 at 5:13 PM Eric Christopher wrote: > I think we can probably go ahead and throw this up on Phabricator for > review. I'd probably bring in Matthias for review as well. > > Thanks! > > -eric > (Following up on the discussion on IRC) I tried to bridge the PPC backend and commuteInstruction, not sure if correctly, but I got some non-optimal results: in 12
2016 Feb 29
2
[VSXFMAMutate] OldFMAReg may be wrongly rewritten
Ping? On Mon, Feb 22, 2016 at 1:06 PM Tim Shen <timshen at google.com> wrote: > On Fri, Feb 19, 2016 at 5:10 PM Tim Shen <timshen at google.com> wrote: > >> I wonder if we can fix this by making the transformation simpler, that >> is, instead of doing: >> > > I wrote a prototype (see attached) for this idea; it actually improves some > of the test cases
2016 Mar 23
0
[VSXFMAMutate] OldFMAReg may be wrongly rewritten
I think we can probably go ahead and throw this up on Phabricator for review. I'd probably bring in Matthias for review as well. Thanks! -eric On Wed, Mar 16, 2016 at 10:53 AM Tim Shen <timshen at google.com> wrote: > I implemented a proof of concept of a new generic MachineFunction SSA > pass. The code is not readable and not efficient yet, but it shows > interesting
2016 Mar 16
2
[VSXFMAMutate] OldFMAReg may be wrongly rewritten
I implemented a proof of concept of a new generic MachineFunction SSA pass. The code is not readable and not efficient yet, but it shows interesting results: In fma.ll @test_FMSUB2 (return dummy(A * B + C, A * B - D)):
before:
  fmr 0, 1
  xsmaddadp 3, 0, 2
  xsmsubmdp 0, 2, 4
  fmr 1, 3
  fmr 2, 0
  bl dummy2
after:
  xsmsubadp 4, 1, 2
  xsmaddmdp
2016 Mar 05
2
[VSXFMAMutate] OldFMAReg may be wrongly rewritten
I wonder if we can do this in a separate analysis MachineFunction SSA pass. 1) SelectionDAG will generate a pseudo instruction MutatingFMA. When it's generated it's allowed to have d = a * b + c form, where d doesn't have to be in {a, b, c}. 2) Later, the proposed pass uses an algorithm to decide for instruction MI: `%vreg0 = MutatingFMA %vreg1, %vreg2, %vreg3`, it should tie %vreg0
2009 Jun 17
2
[LLVMdev] possible PowerPC (32bits) backend bug
I have been experimenting with the patterns that define complex instructions, and I saw a behavior that doesn't look right. I think it's a bug in the PPC backend. The 32-bit PPC .td file defines a pattern for the fnmsubs instruction like this: def : Pat<(fsub F4RC:$B, (fmul F4RC:$A, F4RC:$C)), (FNMSUBS F4RC:$A, F4RC:$C, F4RC:$B)>,
2013 Mar 25
1
[LLVMdev] Types in TableGen instruction selection patterns
Sebastian Pop wrote: > same mechanism could be useful. It would be nice to be able to write this: > > def insn : Inst<(outs i32:$dst), (ins i32:$src1, i32:$src2), > "some assembler", > [(set $dst, (Op $src1, $src2))]>; From the PPC changes, I see that this is already possible under a slightly different form: def FSUBS :
2012 Oct 20
2
[LLVMdev] RegisterCoalescing pass crashes with ImplicitDef registers
Hi, below is an output of "llc -march=r600 -mcpu=cayman -print-before-all -debug-only=regalloc file.shader" command from llvm3.2svn. The register coalescing pass crashes when joining vreg12:sel_z with vreg13 registers, because it tries to access the interval liveness of vreg13... which is undefined. I don't know if it's a bug of the pass, or if my backend should do something
2012 Jun 08
0
[LLVMdev] Strong vs. default phi elimination and single-reg classes
On Jun 7, 2012, at 7:31 PM, Hal Finkel wrote: > 112B BB#1: derived from LLVM BB %for.body, ADDRESS TAKEN > Predecessors according to CFG: BB#0 BB#1 > %vreg12<def> = PHI %vreg13, <BB#1>, %vreg11, <BB#0>;CTRRC8:%vreg12,%vreg13,%vreg11 > %vreg13<def> = COPY %vreg12<kill>; CTRRC8:%vreg13,%vreg12 > %vreg13<def> = BDNZ8 %vreg13,
2012 Jun 08
2
[LLVMdev] Strong vs. default phi elimination and single-reg classes
Hello again, I am trying to implement an optimization pass for PowerPC such that simple loops use the special "counter register" (CTR) to track the induction variable. This is helpful because, in addition to reducing register pressure, there is a combined decrement-compare-and-branch instruction BDNZ (there are also other related instructions). I started this process by converting the
2016 Mar 04
2
PHI node to different register class vs TailDuplication
Hi, We're having an issue with TailDuplication in our out-of-tree target and it's this PHI-node that seems to be the cause of the trouble: %vreg2<def> = PHI %vreg0, <BB#2>, %vreg1, <BB#3>; rN:%vreg2 aNlh_0_7:%vreg0 aNlh_rN:%vreg1 Note that the defined %vreg2 has register class "rN" while the read %vreg0 has register class "aNlh_0_7".
2012 Jun 08
2
[LLVMdev] Strong vs. default phi elimination and single-reg classes
On Thu, 7 Jun 2012 22:14:00 -0700 Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote: > > On Jun 7, 2012, at 7:31 PM, Hal Finkel wrote: > > > 112B BB#1: derived from LLVM BB %for.body, ADDRESS TAKEN > > Predecessors according to CFG: BB#0 BB#1 > > %vreg12<def> = PHI %vreg13, <BB#1>, %vreg11, > >
2012 May 09
2
[LLVMdev] register allocation problems in trunk with IMPLICIT_DEF
Hi, Recently code using IMPLICIT_DEF and INSERT_SUBREG started to break: %vreg9<def> = IMPLICIT_DEF %vreg10<def> = INSERT_SUBREG %vreg9<kill>, %vreg1<kill>, hi %vreg12<def> = sub %vreg10<kill>, %vreg11<kill> => %vreg10<def> = IMPLICIT_DEF %vreg10:hi<def> = COPY %vreg1<kill>
2012 May 09
0
[LLVMdev] register allocation problems in trunk with IMPLICIT_DEF
On May 9, 2012, at 6:27 AM, Jonas Paulsson <jonas.paulsson at ericsson.com> wrote: > Hi, > > Recently code using IMPLICIT_DEF and INSERT_SUBREG started to break: > > %vreg9<def> = IMPLICIT_DEF > %vreg10<def> = INSERT_SUBREG %vreg9<kill>, %vreg1<kill>, hi > %vreg12<def> = sub %vreg10<kill>,
2012 May 14
1
[LLVMdev] register allocation problems in trunk with IMPLICIT_DEF
I used llvm-stress to find a similar problem on x86-64. See http://llvm.org/bugs/show_bug.cgi?id=12821. BTW, llvm-stress is a great tool! /Patrik Hägglund ________________________________ From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Jakob Stoklund Olesen Sent: 9 May 2012 18:21 To: Jonas Paulsson Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev]
2017 Oct 25
3
How vregs are assigned to operands in IR
Hi, I'm trying to understand how virtual regs are assigned to operands in IR instructions. I looked into SelectionDAG but could not figure out where the assignment happens. How and where does this conversion happen? Furthermore, I want to build a map between each variable and its virtual register (x corresponds to vreg11 in the code below). I've been stuck here for a while. Any help is greatly
2011 Dec 20
0
[LLVMdev] specializing hybrid_ls_rr_sort (was: Re: Bottom-Up Scheduling?)
On Tue, 2011-12-20 at 10:35 -0600, Hal Finkel wrote: > On Mon, 2011-12-19 at 23:20 -0800, Andrew Trick wrote: > > > > On Dec 19, 2011, at 10:53 PM, Hal Finkel wrote: > > > > > Here's my "thought experiment" (from PR11589): I have a bunch of > > > load-fadd-store chains to schedule. A store takes two cycles to > > > clear > >
2017 Feb 09
2
Improving the split heuristics for the Greedy Register Allocator
On Wed, Feb 8, 2017 at 6:21 PM, Wei Mi <wmi at google.com> wrote: > I have an issue that I've been wrestling with for quite some time and I'm > hoping that someone with a deeper understanding of the register allocator > can help me with. > > Namely, I am trying to teach RA to split a live range rather than > allocating a CSR. I've attempted a very large number
2011 Dec 20
1
[LLVMdev] specializing hybrid_ls_rr_sort (was: Re: Bottom-Up Scheduling?)
On Dec 20, 2011, at 10:29 AM, Hal Finkel wrote: > On Tue, 2011-12-20 at 10:35 -0600, Hal Finkel wrote: >> On Mon, 2011-12-19 at 23:20 -0800, Andrew Trick wrote: >>> >>> On Dec 19, 2011, at 10:53 PM, Hal Finkel wrote: >>> >>>> Here's my "thought experiment" (from PR11589): I have a bunch of >>>> load-fadd-store chains to