similar to: [LLVMdev] specifying accumulator based load/stores

Displaying 20 results from an estimated 7000 matches similar to: "[LLVMdev] specifying accumulator based load/stores"

2008 Jan 17
0
[LLVMdev] specifying accumulator based load/stores
On Jan 17, 2008, at 4:12 AM, Sanjiv Gupta wrote: > I have load / store instructions that require accumulator. > So a store looks like.. > > mov 3, acc > st acc, addr > > I have specified "acc" as a separate register class containing only > one register which is the "acc". > The instr patterns are then splitted into: > > set imm:$src,
2009 Apr 16
2
[LLVMdev] How do I model MUL with multiply-accumulate instruction?
The only multiplication instruction on my target CPU is multiply-and-accumulate. The result goes into a special register that can destructively read at the end of a sequence of multiply-adds. The following sequence is required to so a simple multiply: acc r0 # clear accumulator, discarding its value (r0 reads as 0, and sinks writes) mac rSRC1, rSRC2 # multiply sources, store
2009 Apr 17
0
[LLVMdev] How do I model MUL with multiply-accumulate instruction?
On Apr 16, 2009, at 2:19 PM, Greg McGary wrote: > The only multiplication instruction on my target CPU is > multiply-and-accumulate. The result goes into a special register that > can destructively read at the end of a sequence of multiply-adds. The > following sequence is required to so a simple multiply: > > acc r0 # clear accumulator, discarding its value (r0 reads as
2013 Dec 20
2
[LLVMdev] Commutability of X86 FMA3 instructions.
Hi all, The 213 variant of the FMA3 instructions is currently marked commutable (see X86InstrFMA.td). Is that safe? According to the ISA the FMA3 instructions aren't commutable for non-numeric results, so I'd have thought commuting this would only be valid in fast-math mode? For the curious, the reason that I'm asking is that we currently always select the 213 variant, but this
2010 Jul 26
1
[LLVMdev] How to specify patterns for instructions with accumulator in selection DAG?
Hi, I am wondering how to specify the selection DAG patterns for instructions that use accumulator. For example multiply-accumulate instruction with one destination operand and two source operands: mac $dst, $src1, $src2 ;; $dst += $src1*$src2 Seems that it has a cycle in the pattern. So how do I specify it in the DAG? There are a few instructions in the ARM backend like this one, but the
2016 Feb 13
4
Register spilling fix for experimental 6502 backend
So I've been designing an experimental 6502 backend for LLVM. Link: <https://github.com/beholdnec/llvm-m6502> The 6502 only has a few registers, one accumulator and two indexes, none of which are large enough to hold an absolute pointer. Because of this, the backend really tests the power of LLVM's register allocator (RA). I've made a change to the RA that might be of interest
2013 Dec 20
2
[LLVMdev] Commutability of X86 FMA3 instructions.
Hi Kay, My patch will partially address your bug. For now I'm just looking to switch the default FMA from vfmadd213xx to vfmadd231xx. That will cause the code in PR17229 to compile as desired, but would regress code like: double foo(double a, double b, double c) { return a * b + c; } Which will now require a vmovaps + vfmadd231. If this impacts real benchmarks we could add an
2013 Dec 20
0
[LLVMdev] Commutability of X86 FMA3 instructions.
Hi Lang, Unfortunately, I don't have an answer on the commutability question, but I wanted to let you know that I filed a bug on this: http://llvm.org/bugs/show_bug.cgi?id=17229 This also shows a memory operand variant of the fma that you may want to consider in your patch and testcases. Thanks! On Thu, Dec 19, 2013 at 10:45 PM, Lang Hames <lhames at gmail.com> wrote: > Hi all,
2013 Mar 25
3
[LLVMdev] [PATCH] RegScavenger::scavengeRegister
This patch adds parameter "EliminateFI" to RegScavenger::scavengeRegister, which tells register scavenger not to eliminate frame index of the emergency spill slot if set to false. I have pseudo load, store and copy instructions which are generated during register allocation and expanded post-RA but before the final stack size is known. I use register scavenger to search for a temporary
2013 Mar 25
0
[LLVMdev] [PATCH] RegScavenger::scavengeRegister
On Mar 25, 2013, at 12:04 PM, Akira Hatanaka <ahatanak at gmail.com> wrote: > This patch adds parameter "EliminateFI" to RegScavenger::scavengeRegister, which tells register scavenger not to eliminate frame index of the emergency spill slot if set to false. > > I have pseudo load, store and copy instructions which are generated during register allocation and expanded
2013 Dec 23
2
[LLVMdev] Commutability of X86 FMA3 instructions.
Hi Elena, Thank you very much for looking in to that. I'll go ahead and remove the isCommutable flag from those instructions, since it sounds like that's the right thing to do. I would still like to change the default from the 231 variant to 213 too, as this will reduce code-size for accumulator-style loops. I have at least one benchmark that shows significant speedups when this change
2011 Mar 30
1
[LLVMdev] Bignums
Hello all! I'm working on a library with bignum support, and I wanted to try LLVM as an apparently simpler and more portable system to my current design (a Haskell script which spits out mixed C and assembly). Porting the script to use the LLVM bindings instead of the current hack was pretty easy. But I have a few remaining questions: (1) Are bignums exposed to any higher-level
2010 Aug 16
2
[LLVMdev] NumLoads/NumStores for linearscan?
Hi, Is there a way for me to collect statistics about the number of loads/stores added by the "linearscan" register allocator (just like can be done with the "local" allocator)? I still haven't grokked very well the interaction between RALinScan and Spiller... Should I add those two statistics to the spiller's class? Thanks, -- Silvio Ricardo Cordeiro --------------
2010 Jan 13
2
[LLVMdev] Cross-module function inlining
On 13 Jan 2010, at 20:34, Nick Lewycky wrote: > On 13 January 2010 12:05, Mark Muir <mark.i.r.muir at gmail.com> wrote: > > But... now there's a small problem with library calls. Symbols such as 'memset', 'malloc', etc. are being removed by global dead code elimination. They are implemented in one of the bitcode modules that are linked together (implementations
2006 Aug 21
3
[LLVMdev] Recalculating live intervals
I'm not sure about one thing: you assign stack slot to each new register you replace the spilled one with. And then you need to allocate physical registers to them. Is it possible to assign physical register to the virtual one which has a stack slot already? On 8/21/06, Fernando Magno Quintao Pereira <fernando at cs.ucla.edu> wrote: > > > > So what addIntervalsToSpills
2006 Aug 23
1
[LLVMdev] Recalculating live intervals
Fernando Magno Quintao Pereira wrote: >> I'm not sure about one thing: you assign stack slot to each new register you >> replace the spilled one with. And then you need to allocate physical >> registers to them. Is it possible to assign physical register to the virtual >> one which has a stack slot already? >> > > Yes. The stack slot is the place where the
2010 Aug 16
0
[LLVMdev] NumLoads/NumStores for linearscan?
On Aug 15, 2010, at 5:12 PM, Silvio Ricardo Cordeiro wrote: > Is there a way for me to collect statistics about the number of loads/stores added by the "linearscan" register allocator (just like can be done with the "local" allocator)? I still haven't grokked very well the interaction between RALinScan and Spiller... Should I add those two statistics to the
2018 Feb 27
0
Parallel assignments and goto
No clue, but see ?assign perhaps if you have not done so already. -- Bert Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Tue, Feb 27, 2018 at 6:51 AM, Thomas Mailund <thomas.mailund at gmail.com> wrote: > Interestingly, the
2018 Feb 27
2
Parallel assignments and goto
Interestingly, the <<- operator is also a lot faster than using a namespace explicitly, and only slightly slower than using <- with local variables, see below. But, surely, both must at some point insert values in a given environment ? either the local one, for <-, or an enclosing one, for <<- ? so I guess I am asking if there is a more low-level assignment operation I can get my
2018 Feb 26
0
Parallel assignments and goto
Following up on this attempt of implementing the tail-recursion optimisation ? now that I?ve finally had the chance to look at it again ? I find that non-local return implemented with callCC doesn?t actually incur much overhead once I do it more sensibly. I haven?t found a good way to handle parallel assignments that isn?t vastly slower than simply introducing extra variables, so I am going with