Displaying 20 results from an estimated 601 matches for "fadd".

2015 Jul 01
3
[LLVMdev] SLP vectorizer on AVX feature
...in(i64 %lo, i64 %hi, float* noalias %arg0, float* noalias %arg1, float* noalias %arg2) { entrypoint: %0 = bitcast float* %arg1 to <4 x float>* %1 = load <4 x float>* %0, align 4 %2 = bitcast float* %arg2 to <4 x float>* %3 = load <4 x float>* %2, align 4 %4 = fadd <4 x float> %3, %1 %5 = bitcast float* %arg0 to <4 x float>* store <4 x float> %4, <4 x float>* %5, align 4 .... So, it could make use of <8 x float> available in that machine. But it doesn't. Then I thought, that maybe the YMM registers get used when lowe...
2015 Dec 30
2
Substitute instruction with a jump to a library code
I'm trying to find a way to emulate a floating point instruction, say a floating point add. My understanding is that in order to do that I need to execute setOperationAction(ISD::FADD, MVT::f32, Expand); setOperationAction(ISD::FADD, MVT::f64, Expand); in MyTargetISelLowering.cpp, MyTargetLowering::MyTargetLowering(...). However, for some reason I'm still seeing a floating point add in the final assembly. I tried running my test code (provided below) on MSP430 and can see...
2015 Apr 17
2
[LLVMdev] how to use "new instruction()"
It seems that the problem arose because I used builder.CreateFAdd to create a <2 x double> vector-type FADD instruction. It works if I use it to create the scalar version of FADD. I want to have an instruction like: %2 = fadd <2 x double> undef, undef. The following is the way I created the vectorized FADD instruction: /...
2015 Apr 17
2
[LLVMdev] how to use "new instruction()"
I got it. Thanks, Nick. So, it is back to the previous problem. If I have the following instruction: %3 = fadd double %1, %2 I want to change it into %6 = fadd <2 x double> %4, %5 where %4 = <double %1, double %1>, %5 = <double %2, double %2>, how can I do this? Thanks, Best On Fri, Apr 17, 2015 at 1:56 AM, Nick Lewycky <nicholas at mxc.ca> wrote: > zhi chen wr...
2014 Aug 28
2
[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef
> On Aug 28, 2014, at 10:58 AM, Duncan Sands <duncan.sands at deepbluecap.com> wrote: > > Hi Stephen, > >>> In the case of fadd, given that "fadd x, -0.0" is always equal to x (same bit pattern), then "fadd x, undef" can be folded to "x" (currently it is folded to undef, which is wrong). This implies that it is correct to fold "fadd undef, undef" to undef. Actually is it true that ...
2014 Aug 28
2
[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef
...; LLVM is used to target many platforms for which sNaNs do not trap, and indeed >> many platforms that do not have floating point exceptions at all. > > thanks for the info. All of the floating point folds that rely on snans trapping should be corrected then. > > In the case of fadd, given that "fadd x, -0.0" is always equal to x (same bit pattern), then "fadd x, undef" can be folded to "x" (currently it is folded to undef, which is wrong). This implies that it is correct to fold "fadd undef, undef" to undef. Actually is it true that ...
2014 Sep 10
3
[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef
...its result and has a single NaN > as an input should produce a NaN with the payload of the input NaN if > representable in the destination format". thanks for finding this out. > > Floating point add propagates a NaN. There is no conversion in the context of > LLVM's fadd. So, if %x in "fadd %x, -0.0" is a NaN, the result is also a NaN > with the same payload. Yes, folding "fadd %x, -0.0" to "%x" is correct. This implies that "fadd undef, undef" can be folded to "undef". > > As regards "fadd %x, und...
2014 Sep 17
3
[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef
Hi, Thank you for all your helpful comments. To sum up, below is the list of correct folding examples for fadd: (1) fadd %x, -0.0 -> %x (2) fadd undef, undef -> undef (3) fadd %x, undef -> NaN (undef is a NaN which is propagated) Looking through the code I found the "NoNaNs" flag accessed through an instance of the FastMathF...
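The claim behind rule (1), that adding -0.0 leaves the bit pattern of the other operand unchanged, can be checked outside LLVM with ordinary IEEE-754 floats. The following is a minimal sketch in standard C++, assuming the default round-to-nearest mode and no fast-math compiler options; the helper names (`bits_of`, `add_is_identity`, `check_negative_zero_identity`) are hypothetical and are not part of LLVM. It only covers non-NaN inputs bit-for-bit, since NaN payload handling varies by target, as the thread itself discusses.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Return the raw IEEE-754 bit pattern of a float.
static std::uint32_t bits_of(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// True if adding `zero` to x leaves the bit pattern of x unchanged.
static bool add_is_identity(float x, float zero) {
    volatile float vx = x;     // volatile keeps the add from being folded away
    volatile float vz = zero;
    float sum = vx + vz;
    return bits_of(sum) == bits_of(x);
}

// Hypothetical self-check over a few sample inputs.
static void check_negative_zero_identity() {
    const float inf = std::numeric_limits<float>::infinity();
    const float samples[] = {0.0f, -0.0f, 1.5f, -3.25f, inf, -inf};
    for (float x : samples) {
        assert(add_is_identity(x, -0.0f));  // x + (-0.0) == x, bit for bit
    }
    // +0.0 is not an identity: (-0.0) + (+0.0) gives +0.0 under round-to-nearest.
    assert(!add_is_identity(-0.0f, 0.0f));
    // A NaN operand yields a NaN result (the payload is not checked here).
    volatile float nan = std::numeric_limits<float>::quiet_NaN();
    assert(std::isnan(nan + 1.0f));
}
```

Calling `check_negative_zero_identity()` from a test driver exercises all three observations. The contrast with +0.0 is why the fold is stated for -0.0 specifically.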
2015 Apr 17
2
[LLVMdev] how to use "new instruction()"
...index0 = ConstantInt::get(u32Ty, 0); Value *index1 = ConstantInt::get(u32Ty, 1); Instruction *InsertVal = InsertElementInst::Create(emptyVec, oprnd, index0, "insert"); InsertVal = InsertElementInst::Create(emptyVec, oprnd, index1, "insert"); vecVal = builder.CreateFAdd(emptyVec, emptyVec, ""); Best, Zhi On Fri, Apr 17, 2015 at 12:17 PM, Nick Lewycky <nicholas at mxc.ca> wrote: > zhi chen wrote: > >> I got it. Thanks, Nick. So, it is back to the previous problem. If I >> have the following instruction: >> >> %3 = fa...
2014 Sep 16
2
[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef
...one or more qNaN operands return a NaN different from the operands. I.e. operand NaN is not propagated. This happens when the "default NaN" flag is set in the FPSCR (floating point status and control register). The result in this case is some default NaN value. > > This means "fadd %x, -0.0", which is currently folded to %x by InstructionSimplify, might produce a different result if %x is a NaN. This breaks the NaN propagation rules the IEEE standard establishes and significantly reduces folding capabilities for the FP operations. > > This also applies to "fa...
2014 Sep 22
2
[LLVMdev] Bug 16257 - fmul of undef ConstantExpr not folded to undef
Hi Duncan, On 17.09.2014 21:10, Duncan Sands wrote: > Hi Oleg, > > On 17/09/14 18:45, Oleg Ranevskyy wrote: >> Hi, >> >> Thank you for all your helpful comments. >> >> To sum up, below is the list of correct folding examples for fadd: >> (1) fadd %x, -0.0 -> %x >> (2) fadd undef, undef -> undef >> (3) fadd %x, undef -> NaN (undef is a NaN which >> is propagated) >> >> Looking through the code I found the "NoNaNs"...
2018 Dec 07
3
Implement VLIW Backend on LLVM (Assembler Related Questions)
...operations/cycle, 3 operations form an instruction. One of the integer instructions looks like this: add Ri, Rj, Rk; add Rl, Rm, Rn; add Ro, Rp, Rq An int instruction and a float instruction form a VLIW instruction (bundle), e.g. { add Ri, Rj, Rk; add Rl, Rm, Rn; add Ro, Rp, Rq fadd Fi, Fj, Fk; fadd Fl, Fm, Fn; fadd Fo, Fp, Fq } I want to express the above concept in this way: // Assembly Language { add Ri, Rj, Rk add Rl, Rm, Rn add Ro, Rp, Rq fadd Fi, Fj, Fk fadd Fl, Fm, Fn fadd Fo, Fp, Fq } Q1: My first question is, the instruction encoding can...
2013 Jul 18
0
[LLVMdev] SIMD instructions and memory alignment on X86
Are you able to send any IR for others to reproduce this issue? On Wed, Jul 17, 2013 at 11:23 PM, Peter Newman <peter at uformia.com> wrote: > Unfortunately, this doesn't appear to be the bug I'm hitting. I applied > the fix to my source and it didn't make a difference. > > Also further testing found me getting the same behavior with other SIMD > instructions.
2017 Jun 10
3
Fusing contract fadd/fsub with normal fmul
Hi, On LLVM 5.0 (current trunk), fadd/fsub and fmul that are both marked with `contract` or `fast` can be merged into an fma instruction by the backend. I'm wondering about the exact semantics of this new flag, as well as `fast`; in particular, would it be valid to do this when only the `fadd`/`fsub` (and not the `fmul`) is...
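The reason such a fusion needs a flag at all is that a fused multiply-add rounds once, while a separate multiply and add round twice, so the two can give different results. The sketch below illustrates this in standard C++ with `std::fma`, assuming IEEE-754 doubles and round-to-nearest; the function names are hypothetical and the example does not model LLVM's `contract` flag itself.

```cpp
#include <cassert>
#include <cmath>

// a*b is rounded to double first, then added to c (two roundings).
static double mul_then_add(double a, double b, double c) {
    volatile double product = a * b;  // volatile forces the intermediate rounding
    return product + c;
}

// Fused multiply-add: a*b + c with a single rounding at the end.
static double fused(double a, double b, double c) {
    return std::fma(a, b, c);
}

// Hypothetical self-check: the two forms disagree for these inputs.
static void check_contraction_changes_result() {
    const double a = 1.0 + std::ldexp(1.0, -27);  // 1 + 2^-27
    const double b = 1.0 - std::ldexp(1.0, -27);  // 1 - 2^-27
    // Exact product is 1 - 2^-54, which rounds to 1.0 as a double.
    assert(mul_then_add(a, b, -1.0) == 0.0);
    // The fused form keeps the exact product, so the result is -2^-54.
    assert(fused(a, b, -1.0) == -std::ldexp(1.0, -54));
}
```

Because the fused result is the more accurate one here but still differs from the unfused result, a compiler may only substitute one for the other when the source or flags permit it.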
2013 Jul 18
2
[LLVMdev] SIMD instructions and memory alignment on X86
Unfortunately, this doesn't appear to be the bug I'm hitting. I applied the fix to my source and it didn't make a difference. Also further testing found me getting the same behavior with other SIMD instructions. The common factor is in each case, ECX is set to 0x7fffffff, and it's an operation using xmm ptr ecx+offset . Additionally, turning the optimization level passed to
2019 Apr 05
4
[RFC] Changes to llvm.experimental.vector.reduce intrinsics
...5/04/2019 09:37, Simon Pilgrim via llvm-dev wrote: > On 04/04/2019 14:11, Sander De Smalen wrote: >> Proposed change: >> >> ---------------------------- >> >> In this RFC I propose changing the intrinsics for >> llvm.experimental.vector.reduce.fadd and >> llvm.experimental.vector.reduce.fmul (see options A and B). I also >> propose renaming the 'accumulator' operand to 'start value' because >> for fmul this is the start value of the reduction, rather than a >> value to which the fmul reduction is ac...
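The role of a "start value" in a reduction can be illustrated with a plain left-to-right loop. This is only a sketch in standard C++ under the assumption that the reduction is evaluated strictly in order, starting from the supplied value; the function names are hypothetical and this is not the definition of the LLVM intrinsics, whose exact semantics are what the RFC is discussing.

```cpp
#include <cassert>
#include <vector>

// Ordered reduction: ((start + v[0]) + v[1]) + ...
static double reduce_fadd_ordered(double start, const std::vector<double>& v) {
    double acc = start;
    for (double x : v) acc += x;
    return acc;
}

// Ordered reduction: ((start * v[0]) * v[1]) * ...
static double reduce_fmul_ordered(double start, const std::vector<double>& v) {
    double acc = start;
    for (double x : v) acc *= x;
    return acc;
}

// Hypothetical self-check showing how the start value enters the result.
static void check_start_value() {
    const std::vector<double> v = {2.0, 3.0, 4.0};
    assert(reduce_fadd_ordered(0.0, v) == 9.0);   // neutral start for addition
    assert(reduce_fadd_ordered(10.0, v) == 19.0); // start acts as an accumulator
    assert(reduce_fmul_ordered(1.0, v) == 24.0);  // neutral start for multiplication
    assert(reduce_fmul_ordered(0.5, v) == 12.0);  // start scales the product
}
```

For addition the start value reads naturally as something the sum is accumulated onto, whereas for multiplication it is simply the first factor, which is consistent with the naming concern raised in the snippet.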
2014 Aug 07
3
[LLVMdev] MCJIT generates MOVAPS on unaligned address
MCJIT, when lowering to x86-64, generates a MOVAPS (Move Aligned Packed Single-Precision Floating-Point Values) on a non-aligned memory address: movaps 88(%rdx), %xmm0 where %rdx comes in as a function argument with only natural alignment (float*). This x86 instruction requires the memory address to be 16-byte aligned, which 88 plus an address aligned to only 4 bytes is not guaranteed to be. Here the
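The alignment arithmetic in this report can be worked through directly. The sketch below, in standard C++, checks whether a base address plus the 88-byte displacement lands on a 16-byte boundary; the base addresses are made-up values chosen for illustration, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// True if `address` is a multiple of `alignment` (which must be a power of two).
constexpr bool is_aligned(std::uintptr_t address, std::uintptr_t alignment) {
    return (address & (alignment - 1)) == 0;
}

// Effective address for a base pointer plus the 88-byte displacement in the snippet.
constexpr std::uintptr_t with_offset(std::uintptr_t base) {
    return base + 88;
}

// Hypothetical self-check over four 4-byte-aligned base addresses.
static void check_movaps_operand_alignment() {
    assert(!is_aligned(with_offset(0x1000), 16));  // 16-byte aligned base: 88 % 16 == 8
    assert(!is_aligned(with_offset(0x1004), 16));
    assert(is_aligned(with_offset(0x1008), 16));   // only base % 16 == 8 works
    assert(!is_aligned(with_offset(0x100C), 16));
}
```

Of the four possible residues a 4-byte-aligned pointer can have modulo 16, only one makes `base + 88` a valid MOVAPS operand, so a `float*` argument alone gives no such guarantee.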
2010 Sep 29
0
[LLVMdev] spilling & xmm register usage
..._0 to i64 > %arrayidx.i = getelementptr float addrspace(1)* %1, i64 %8 > %tmp3.i = load float addrspace(1)* %arrayidx.i, align 4 > %tmp5.i = fmul float %tmp3.i, 1.000000e+01 > %tmp7.i = fsub float 1.000000e+00, %tmp3.i > %tmp8.i = fmul float %tmp7.i, 1.000000e+02 > %tmp9.i = fadd float %tmp5.i, %tmp8.i > %tmp20.i = fmul float %tmp7.i, 1.000000e+01 > %tmp21.i = fadd float %tmp3.i, %tmp20.i > %tmp23.i = fmul float %tmp3.i, 0x3F847AE140000000 > %tmp26.i = fmul float %tmp7.i, 0x3FA99999A0000000 > %tmp27.i = fadd float %tmp23.i, %tmp26.i > %tmp32.i = fmul...
2014 Aug 07
3
[LLVMdev] How to broaden the SLP vectorizer's search
On 7 August 2014 17:33, Chad Rosier <mcrosier at codeaurora.org> wrote: > You might consider filing a bug (llvm.org/bugs) requesting a flag, but I > don't know if the code owners want to expose such a flag. I'm not sure that's a good idea as a raw access to that limit, as there are no guarantees that it'll stay the same. But maybe a flag turning some
2020 Feb 12
6
[RFC] Optional parameter tuples
...x is: %z = call @llvm.some.intrinsic(%a, %b) optional_tuple(%x, %y, %z) where from the perspective of the call site %x, %y and %z are simply additional parameters. Optional parameter tuples would be very useful for constrained fp intrinsics and vector predication. Some examples: ; Default fpenv fadd (isomorphic to the fadd instruction) %z = call double @llvm.fadd(%a, %b) ; Constrained fp add %x = call double @llvm.fadd(%a, %b) fpenv(metadata !fpround.tonearest, metadata !fpexcept.strict) ; Constrained fp add with vector predication (https://reviews.llvm.org/D57504) %x = call &l...