thr3ads.net - search: "fadd"

Displaying 20 results from an estimated 612 matches for "fadd".

Substitute instruction with a jump to a library code

2015 Dec 30

I'm trying to find a way to emulate a floating point instruction, say a floating point add. My understanding is that in order to do that I need to call setOperationAction(ISD::FADD, MVT::f32, Expand); setOperationAction(ISD::FADD, MVT::f64, Expand); in the MyTargetLowering::MyTargetLowering(...) constructor in MyTargetISelLowering.cpp. However, for some reason I'm still seeing a floating point add in the final assembly. I tried running my test code (provided below) on MSP430 and can see...

[LLVMdev] SLP vectorizer on AVX feature

2015 Jul 01

...in(i64 %lo, i64 %hi, float* noalias %arg0, float* noalias %arg1, float* noalias %arg2) { entrypoint: %0 = bitcast float* %arg1 to <4 x float>* %1 = load <4 x float>* %0, align 4 %2 = bitcast float* %arg2 to <4 x float>* %3 = load <4 x float>* %2, align 4 %4 = fadd <4 x float> %3, %1 %5 = bitcast float* %arg0 to <4 x float>* store <4 x float> %4, <4 x float>* %5, align 4 .... So, it could make use of <8 x float> available in that machine. But it doesn't. Then I thought, that maybe the YMM registers get used when lowe...

[LLVMdev] how to use "new instruction()"

2015 Apr 17

It seems that the problem arose because I used builder.CreateFAdd to create a <2 x double> vector-typed FADD instruction. It works if I use it to create the scalar version of FADD. I want to have an instruction like: %2 = fadd <2 x double> undef, undef. The following is the way I used to create the vectorized FADD instruction: //pInst...

[LLVMdev] how to use "new instruction()"

2015 Apr 17

I got it. Thanks, Nick. So, it is back to the previous problem. If I have the following instruction: %3 = fadd double %1, %2 and I want to change it into %6 = fadd <2 x double> %4, %5 where %4 = <double %1, double %1> and %5 = <double %2, double %2>, how can I do this? Thanks, Best On Fri, Apr 17, 2015 at 1:56 AM, Nick Lewycky <nicholas at mxc.ca> wrote: > zhi chen wr...

[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef

2014 Aug 28

> On Aug 28, 2014, at 10:58 AM, Duncan Sands <duncan.sands at deepbluecap.com> wrote: > > Hi Stephen, > >>> In the case of fadd, given that "fadd x, -0.0" is always equal to x (same bit pattern), then "fadd x, undef" can be folded to "x" (currently it is folded to undef, which is wrong). This implies that it is correct to fold "fadd undef, undef" to undef. Actually is it true that ...

[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef

2014 Aug 28

...; LLVM is used to target many platforms for which sNaNs do not trap, and indeed >> many platforms that do not have floating point exceptions at all. > > thanks for the info. All of the floating point folds that rely on snans trapping should be corrected then. > > In the case of fadd, given that "fadd x, -0.0" is always equal to x (same bit pattern), then "fadd x, undef" can be folded to "x" (currently it is folded to undef, which is wrong). This implies that it is correct to fold "fadd undef, undef" to undef. Actually is it true that ...

[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef

2014 Sep 10

...d to its result and has a single NaN > as an input should produce a NaN with the payload of the input NaN if > representable in the destination format". thanks for finding this out. > > Floating point add propagates a NaN. There is no conversion in the context of > LLVM's fadd. So, if %x in "fadd %x, -0.0" is a NaN, the result is also a NaN > with the same payload. Yes, folding "fadd %x, -0.0" to "%x" is correct. This implies that "fadd undef, undef" can be folded to "undef". > > As regards "fadd %x, und...

[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef

2014 Sep 17

Hi, Thank you for all your helpful comments. To sum up, below is the list of correct folding examples for fadd: (1) fadd %x, -0.0 -> %x (2) fadd undef, undef -> undef (3) fadd %x, undef -> NaN (undef is a NaN which is propagated) Looking through the code I found the "NoNaNs" flag accessed through an instance of the FastMathF...
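
Rules (1) and (3) in the list above can be checked against ordinary IEEE-754 double arithmetic. The following is only a sketch in Python, not LLVM code: the helper names `bits` and `same_bits` and the sample values are made up for illustration, and it assumes the host uses round-to-nearest binary64 without flushing subnormals. NaN payloads are deliberately not tested, since the thread notes they can be target dependent.

```python
import math
import struct


def bits(x: float) -> int:
    """Return the IEEE-754 binary64 bit pattern of x as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def same_bits(a: float, b: float) -> bool:
    """True when a and b are bit-for-bit identical (distinguishes +0.0 from -0.0)."""
    return bits(a) == bits(b)


# Hypothetical sample values: both zeros, normal numbers, a subnormal, infinities.
samples = [0.0, -0.0, 1.5, -2.25, 1e-310, float("inf"), float("-inf")]

# Rule (1): adding -0.0 leaves every non-NaN sample bit-for-bit unchanged.
neg_zero_is_identity = all(same_bits(x + (-0.0), x) for x in samples)

# Adding +0.0 is not an identity: (-0.0) + 0.0 gives +0.0, which has different bits.
pos_zero_changes_neg_zero = not same_bits(-0.0 + 0.0, -0.0)

# Rule (3) relies on NaN propagation: a NaN operand always yields a NaN result.
nan_propagates = all(math.isnan(x + float("nan")) for x in samples)

print(neg_zero_is_identity, pos_zero_changes_neg_zero, nan_propagates)
```

The +0.0 case is why the fold is stated for -0.0 specifically: only -0.0 preserves the sign of a negative zero operand.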

[LLVMdev] how to use "new instruction()"

2015 Apr 17

...index0 = ConstantInt::get(u32Ty, 0); Value *index1 = ConstantInt::get(u32Ty, 1); Instruction *InsertVal = InsertElementInst::Create(emptyVec, oprnd, index0, "insert"); InsertVal = InsertElementInst::Create(emptyVec, oprnd, index1, "insert"); vecVal = builder.CreateFAdd(emptyVec, emptyVec, ""); Best, Zhi On Fri, Apr 17, 2015 at 12:17 PM, Nick Lewycky <nicholas at mxc.ca> wrote: > zhi chen wrote: > >> I got it. Thanks, Nick. So, it is back to the previous problem. If I >> have the following instruction: >> >> %3 = fa...

[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef

2014 Sep 16

...one or more qNaN operands return a NaN different from the operands. I.e. operand NaN is not propagated. This happens when the "default NaN" flag is set in the FPSCR (floating point status and control register). The result in this case is some default NaN value. > > This means "fadd %x, -0.0", which is currently folded to %x by InstructionSimplify, might produce a different result if %x is a NaN. This breaks the NaN propagation rules the IEEE standard establishes and significantly reduces folding capabilities for the FP operations. > > This also applies to "fa...

[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef

2014 Sep 22

Hi Duncan, On 17.09.2014 21:10, Duncan Sands wrote: > Hi Oleg, > > On 17/09/14 18:45, Oleg Ranevskyy wrote: >> Hi, >> >> Thank you for all your helpful comments. >> >> To sum up, below is the list of correct folding examples for fadd: >> (1) fadd %x, -0.0 -> %x >> (2) fadd undef, undef -> undef >> (3) fadd %x, undef -> NaN (undef is a NaN which >> is propagated) >> >> Looking through the code I found the "NoNaNs"...

Implement VLIW Backend on LLVM (Assembler Related Questions)

2018 Dec 07

...do 3 operations/cycle; 3 operations form an instruction. One of the integer instructions looks like this: add Ri, Rj, Rk; add Rl, Rm, Rn; add Ro, Rp, Rq An int instruction and a float instruction form a VLIW instruction (bundle), e.g. { add Ri, Rj, Rk; add Rl, Rm, Rn; add Ro, Rp, Rq fadd Fi, Fj, Fk; fadd Fl, Fm, Fn; fadd Fo, Fp, Fq } I want to express the above concept in this way: // Assembly Language { add Ri, Rj, Rk add Rl, Rm, Rn add Ro, Rp, Rq fadd Fi, Fj, Fk fadd Fl, Fm, Fn fadd Fo, Fp, Fq } Q1: My first question is, the instruction encoding can only be...

Fusing contract fadd/fsub with normal fmul

2017 Jun 10

Hi, On LLVM 5.0 (current trunk), fadd/fsub and fmul that are both marked with `contract` or `fast` can be merged into an fma instruction by the backend. I'm wondering about the exact semantics of this new flag as well as `fast`; in particular, would it be valid to do this when only the `fadd`/`fsub` (and not the `fmul`) is marked w...
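
Contraction needs an explicit flag because a fused multiply-add rounds once, while a separate fmul followed by fadd rounds twice, so the two can give different results. The sketch below illustrates that difference in Python rather than LLVM. The function names and inputs are hypothetical, and the fused result is emulated with exact rational arithmetic instead of calling a hardware fma. It assumes IEEE-754 binary64 floats with round-to-nearest-even.

```python
from fractions import Fraction


def separate_multiply_add(a: float, b: float, c: float) -> float:
    """Unfused form: the product is rounded to a double before the add."""
    return a * b + c


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Emulated fma for finite inputs: compute a*b + c exactly, then round once."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


# Hypothetical inputs chosen so the exact product, 1 - 2**-54, is not representable.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

unfused = separate_multiply_add(a, b, c)  # product rounds to 1.0, so the sum is 0.0
fused = fused_multiply_add(a, b, c)       # keeps the low-order term: -2**-54

print(unfused, fused)
```

Because the results differ, fusing is an observable change in behavior, which is why the question of which instruction must carry the `contract` flag matters.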

[LLVMdev] SIMD instructions and memory alignment on X86

2013 Jul 18

Are you able to send any IR for others to reproduce this issue? On Wed, Jul 17, 2013 at 11:23 PM, Peter Newman <peter at uformia.com> wrote: > Unfortunately, this doesn't appear to be the bug I'm hitting. I applied > the fix to my source and it didn't make a difference. > > Also further testing found me getting the same behavior with other SIMD > instructions.

[LLVMdev] MCJIT generates MOVAPS on unaligned address

2014 Aug 07

MCJIT, when lowering to x86-64, generates a MOVAPS (Move Aligned Packed Single-Precision Floating-Point Values) on a non-aligned memory address: movaps 88(%rdx), %xmm0 where %rdx comes in as a function argument with only natural alignment (float*). This x86 instruction requires the memory address to be 16-byte aligned, which 88 plus an address aligned to only 4 bytes is not guaranteed to be. Here the
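
The alignment arithmetic behind this report can be spelled out with a small sketch. The helper name `is_aligned` and the range of base addresses are made up for illustration. Only the offset 88 and the 4-byte and 16-byte alignments come from the post.

```python
def is_aligned(address: int, alignment: int) -> bool:
    """True when address is a multiple of alignment."""
    return address % alignment == 0


OFFSET = 88               # displacement in `movaps 88(%rdx), %xmm0`
MOVAPS_ALIGNMENT = 16     # MOVAPS requires a 16-byte aligned memory operand
FLOAT_ALIGNMENT = 4       # natural alignment of a float*

# Hypothetical base addresses: every 4-byte aligned value in a small window.
float_aligned_bases = list(range(0, 64, FLOAT_ALIGNMENT))

# Bases for which base + 88 happens to satisfy the MOVAPS requirement.
safe_bases = [b for b in float_aligned_bases
              if is_aligned(b + OFFSET, MOVAPS_ALIGNMENT)]

print(safe_bases)
```

Since 88 leaves a remainder of 8 when divided by 16, the access is aligned only when the base itself is 8 modulo 16, which is one in four of the 4-byte aligned bases.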

[LLVMdev] SIMD instructions and memory alignment on X86

2013 Jul 18

Unfortunately, this doesn't appear to be the bug I'm hitting. I applied the fix to my source and it didn't make a difference. Also, further testing found me getting the same behavior with other SIMD instructions. The common factor is that in each case ECX is set to 0x7fffffff, and it's an operation using xmm ptr ecx+offset. Additionally, turning the optimization level passed to

[RFC] Changes to llvm.experimental.vector.reduce intrinsics

2019 Apr 05

On 05/04/2019 09:37, Simon Pilgrim via llvm-dev wrote: > On 04/04/2019 14:11, Sander De Smalen wrote: >> Proposed change: >> >> ---------------------------- >> >> In this RFC I propose changing the intrinsics for >> llvm.experimental.vector.reduce.fadd and >> llvm.experimental.vector.reduce.fmul (see options A and B). I also >> propose renaming the 'accumulator' operand to 'start value' because >> for fmul this is the start value of the reduction, rather than a >> value to which the fmul reduction is ac...

[LLVMdev] spilling & xmm register usage

2010 Sep 29

..._0 to i64 > %arrayidx.i = getelementptr float addrspace(1)* %1, i64 %8 > %tmp3.i = load float addrspace(1)* %arrayidx.i, align 4 > %tmp5.i = fmul float %tmp3.i, 1.000000e+01 > %tmp7.i = fsub float 1.000000e+00, %tmp3.i > %tmp8.i = fmul float %tmp7.i, 1.000000e+02 > %tmp9.i = fadd float %tmp5.i, %tmp8.i > %tmp20.i = fmul float %tmp7.i, 1.000000e+01 > %tmp21.i = fadd float %tmp3.i, %tmp20.i > %tmp23.i = fmul float %tmp3.i, 0x3F847AE140000000 > %tmp26.i = fmul float %tmp7.i, 0x3FA99999A0000000 > %tmp27.i = fadd float %tmp23.i, %tmp26.i > %tmp32.i = fmul...

[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef

2014 Aug 29

LLVM does not (today) try to preserve rounding mode or sNaNs. The only remaining question is whether we should be trying to preserve NaN payloads. —Owen On Aug 29, 2014, at 5:39 AM, Oleg Ranevskyy <llvm.mail.list at gmail.com> wrote: > Hi, > > So, the result of "fadd x, -0.0" might have a bit pattern different from the one of "x" depending on the value of "x" and the target. > If I get it right, the result does not necessarily compare equal to "x" in floating point comparisons. > Does this mean that folding of the above...

[LLVMdev] How to broaden the SLP vectorizer's search

2014 Aug 07

On 7 August 2014 17:33, Chad Rosier <mcrosier at codeaurora.org> wrote: > You might consider filing a bug (llvm.org/bugs) requesting a flag, but I > don't know if the code owners want to expose such a flag. I'm not sure that's a good idea as a raw access to that limit, as there are no guarantees that it'll stay the same. But maybe a flag turning some
