thr3ads.net - similar to: "[LLVMdev] controlling(enable/disable) FMA instruction generation"

Displaying 20 results from an estimated 5000 matches similar to: "[LLVMdev] controlling(enable/disable) FMA instruction generation"

[LLVMdev] X86 rsqrt instruction generated

2012 Nov 15

Hi, We have implemented the rsqrt instruction generation for the X86 target architecture. We have introduced a flag, -fp-rsqrt, which controls the generation of the X86 rsqrt instruction. We have observed minor effects on precision due to rsqrt and hence have put these transformations under the mentioned flag. Note that -fp-rsqrt is presently only enabled together with the -enable-unsafe-fp-math flag.
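For context, here is a rough sketch of why an approximate reciprocal square root costs precision and how a refinement step can recover most of it. The excerpt does not say how the patch lowers rsqrt, so everything here is an assumption: the injected relative error of 2^-12 is only a stand-in for a hardware estimate (Intel documents a maximum relative error of 1.5 × 2^-12 for RSQRTSS), the Newton-Raphson step is a commonly used refinement rather than something the post confirms, and all function names are hypothetical.

```python
import math


def approx_rsqrt(x, rel_err=2.0 ** -12):
    """Hypothetical stand-in for a hardware estimate.

    Returns 1/sqrt(x) with a relative error injected on purpose.
    """
    return (1.0 / math.sqrt(x)) * (1.0 + rel_err)


def newton_refine(x, y):
    """One Newton-Raphson step for the root of f(y) = 1/y**2 - x."""
    return y * (1.5 - 0.5 * x * y * y)


def rel_error(x, y):
    """Relative error of y against the library value of 1/sqrt(x)."""
    exact = 1.0 / math.sqrt(x)
    return abs(y - exact) / exact


x = 2.0
estimate = approx_rsqrt(x)            # coarse, about 12 good bits
refined = newton_refine(x, estimate)  # error shrinks roughly quadratically

print(rel_error(x, estimate), rel_error(x, refined))
```

Even after one refinement step the result is generally not bit-identical to a correctly rounded division and square root, which is consistent with keeping such a transformation behind an unsafe-math flag.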

[LLVMdev] Error running spec benchmark with FMA4 on X86

2012 Sep 06

Hi All, I am facing a miscompare error when running povray (and a few other C/C++ benchmarks) from the SPEC CPU2006 suite with FMA4 enabled (and FMA3 disabled). I have used -ffp-contract=fast to turn on this option. (Compilation options and targets pasted below.)
clang version 3.2 (trunk 163295:163308) (llvm/trunk 163295)
Target: x86_64-unknown-linux-gnu
Thread model: posix
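As background on why contracting a multiply and an add into an FMA can change numeric output at all, the sketch below compares the two roundings of an unfused a * b + c with the single rounding of a fused operation. It is not taken from the thread and does not diagnose the povray miscompare. The fused result is emulated with exact rational arithmetic, and the function names and input values are chosen purely for illustration.

```python
from fractions import Fraction


def unfused(a, b, c):
    """Multiply, round to double, then add and round again."""
    return a * b + c


def fused(a, b, c):
    """Emulated fused multiply-add: exact a*b+c, rounded once at the end."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Illustrative inputs: the exact product is 1 - 2**-60, which is not
# representable as a double and rounds to 1.0 in the unfused case.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

print(unfused(a, b, c))  # the low-order bits of the product are lost
print(fused(a, b, c))    # the low-order bits survive the single rounding
```

Both answers are legitimate floating-point results, so a benchmark that compares output against a reference produced without contraction may need a tolerance.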

[LLVMdev] X86 rcp instruction generated

2012 Nov 15

Hi, We have implemented the rcp instruction generation for the X86 target architecture. We have introduced a flag, -fp-rcp, which controls the generation of the X86 rcp instruction. We have observed minor effects on precision and hence have put these transformations under the mentioned flag. Note that -fp-rcp is presently only enabled together with the -enable-unsafe-fp-math flag. Moreover we have

[LLVMdev] X86 rsqrt instruction generated

2012 Nov 15

On Wed, Nov 14, 2012 at 10:43 PM, Chakraborty, Soham <Soham.Chakraborty at amd.com> wrote:
> Hi,
>
> We have implemented the rsqrt instruction generation for the X86 target architecture. We have introduced a flag, -fp-rsqrt, which controls the generation of the X86 rsqrt instruction.
>
> We have observed minor effects on precision due to rsqrt and

[LLVMdev] X86 rsqrt instruction generated

2012 Dec 03

Hi, Please find attached the modified patch and description. We have modified and retested the patch taking into consideration the comments and inputs provided earlier.
Thanks & Regards,
soham
-----Original Message-----
From: Eli Friedman [mailto:eli.friedman at gmail.com]
Sent: Thursday, November 15, 2012 12:59 PM
To: Chakraborty, Soham
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev]

[LLVMdev] AVX support

2010 Feb 25

On Thursday 25 February 2010 15:33:58 Jan Sjodin wrote:
> I have seen some re-factoring work done to prepare for AVX support. What are the plans (time wise) to add the AVX patterns to the backend? Has anyone thought about FMA4?
Oh yes. :) FMA4 will have a different feature bit than AVX or FMA3. FMA4 is our top priority after AVX due to Bulldozer. What would you like to see for

[LLVMdev] conditional assignment in selectionDAG

2013 Sep 17

Hi, I am trying the following transformation in the X86 selection DAG:
    lhs = rhs; // lhs and rhs are f32
    =>
    if (rhs == 0.0) lhs = rhs; else lhs = rhs';
i.e. conditionally replace rhs by rhs'. I guess it can be done using an ISD::SELECT node (for float values)? In that case I have to create a condition node. Can you please suggest/refer to how to do so? You may please suggest

[LLVMdev] Commutability of X86 FMA3 instructions.

2013 Dec 20

Hi Lang, Unfortunately, I don't have an answer on the commutability question, but I wanted to let you know that I filed a bug on this: http://llvm.org/bugs/show_bug.cgi?id=17229 This also shows a memory operand variant of the fma that you may want to consider in your patch and testcases. Thanks!
On Thu, Dec 19, 2013 at 10:45 PM, Lang Hames <lhames at gmail.com> wrote:
> Hi all,

[LLVMdev] Commutability of X86 FMA3 instructions.

2013 Dec 20

Hi all, The 213 variant of the FMA3 instructions is currently marked commutable (see X86InstrFMA.td). Is that safe? According to the ISA the FMA3 instructions aren't commutable for non-numeric results, so I'd have thought commuting this would only be valid in fast-math mode? For the curious, the reason that I'm asking is that we currently always select the 213 variant, but this

[LLVMdev] Commutability of X86 FMA3 instructions.

2013 Dec 23

Hi Elena, Thank you very much for looking into that. I'll go ahead and remove the isCommutable flag from those instructions, since it sounds like that's the right thing to do. I would still like to change the default from the 231 variant to 213 too, as this will reduce code-size for accumulator-style loops. I have at least one benchmark that shows significant speedups when this change

[LLVMdev] Commutability of X86 FMA3 instructions.

2013 Dec 20

Hi Kay, My patch will partially address your bug. For now I'm just looking to switch the default FMA from vfmadd213xx to vfmadd231xx. That will cause the code in PR17229 to compile as desired, but would regress code like:
    double foo(double a, double b, double c) { return a * b + c; }
which will now require a vmovaps + vfmadd231. If this impacts real benchmarks we could add an

undef * 0

2016 Sep 14

Hi,
> Both A and B are undef:
> LHS = (undef & undef) | (undef & undef) = undef // Since ~undef = undef
> RHS = undef
> Thus transform is correct.
LLVM documentation (http://llvm.org/docs/LangRef.html#undefined-values) suggests that it is unsafe to consider (a & undef = undef) and (a | undef = undef). "As such, it is unsafe to optimize or assume

[LLVMdev] Operand order in dag pattern matching in td files

2012 Nov 16

Hi, I have a simple question w.r.t the order of operands used in dag pattern matching in target files. Some of them seem intuitive. But I want to get it clarified anyway. I am using a pattern from X86InstrFMA.td in the below example. Consider FMA3 pattern (simplified). let Constraints = "$src1 = $dst" in { multiclass fma3s_rm<bits<8> opc, string OpcodeStr, X86MemOperand

[LLVMdev] Operand order in dag pattern matching in td files

2012 Nov 16

On 16 November 2012 13:41, Anitha B Gollamudi <anitha.boyapati at gmail.com> wrote:
> Hi,
>
> I have a simple question w.r.t the order of operands used in dag pattern matching in target files. Some of them seem intuitive. But I want to get it clarified anyway. I am using a pattern from X86InstrFMA.td in the below example. Consider FMA3 pattern (simplified).
>

order.max specification problem in the ar.ols function

2012 Nov 27

Hello, I am facing a curious problem. I have time series data to which I want to fit an auto-regressive model of order p, where p runs from 1 to 9. I am using a for loop which fits an AR(p) model for each value of p using the *ar.ols* function. I am using the following code:
    for (p in 1:9) { a = ar.ols(x=data.ts, order.max=p, demean=T, intercept=T) }
Specifying the *order.max* to be p, it gives me a

[LLVMdev] X86 Tablegen Description and VEX.W

2012 Nov 08

Hi, A question from r162999 changes: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrFMA.td?r1=162999&r2=162998&pathrev=162999 For the multiclass "fma4s", why is "mr" not inherited from "VEX_W" and "MemOp4" like those of "rm" or "rr" ? multiclass fma4s< > ... def mr : FMA4<opc, MRMSrcMem, (outs

[LLVMdev] X86 Tablegen Description and VEX.W

2012 Nov 08

On 8 November 2012 11:12, Cameron McInally <cameron.mcinally at nyu.edu> wrote:
> On Wed, Nov 7, 2012 at 10:52 PM, Anitha Boyapati <anitha.boyapati at gmail.com> wrote:
> ...
>>
>> For the multiclass "fma4s", why is "mr" not inherited from "VEX_W" and "MemOp4" like those of "rm" or "rr" ?
>

[LLVMdev] Operand order in dag pattern matching in td files

2012 Nov 16

You've unfortunately chosen a complex example. Your second question needs to be answered first: null_frag causes the pattern to be dropped. Having covered that, the reason the operands are in the order they are is that the only instruction that doesn't use null_frag is this one: defm r213 : fma3s_rm<opc213, !strconcat(OpStr, !strconcat("213", PackTy)),

[LLVMdev] adding architecture specific flag

2012 Nov 05

Hi, Can anybody please suggest where to add architecture-specific code generation flags (e.g. for X86) in LLVM? Thanks in advance. Best Regards, soham

undef * 0

2016 Sep 02

What is the value of undef * 0 in LLVM? According to its definition in the LLVM IR reference: "The string 'undef' can be used anywhere a constant is expected..." Am I correct to say that undef * 0 = 0 following this definition? Best Regards, soham
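One informal way to reason about this question is to treat undef as "any bit pattern of the type" and enumerate every result an operation could produce. The sketch below does that for 8-bit integers only. It is a simplified mental model, not how LLVM represents or folds undef, and the helper name and bit width are assumptions made for illustration.

```python
BITS = 8
MASK = (1 << BITS) - 1
ALL_VALUES = set(range(1 << BITS))  # model: undef may be any 8-bit pattern


def possible_results(op, lhs_values, rhs_values):
    """Set of all results of op over every allowed operand choice."""
    return {op(a, b) & MASK for a in lhs_values for b in rhs_values}


# undef * 0: whatever value undef takes, the product is 0.
mul_by_zero = possible_results(lambda a, b: a * b, ALL_VALUES, {0})

# 0 & undef: the result can only be 0, so treating it as undef is too broad.
and_with_zero = possible_results(lambda a, b: a & b, {0}, ALL_VALUES)

# 0xFF | undef: the result can only be 0xFF, for the same reason.
or_with_ones = possible_results(lambda a, b: a | b, {0xFF}, ALL_VALUES)

print(mul_by_zero, and_with_zero, or_with_ones)
```

Under this model, integer undef * 0 can only be 0, while replacing a & undef or a | undef with undef would admit values the original expression cannot produce. Floating-point multiplication is a different matter, since NaN and infinity times zero do not yield zero.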
