similar to: TypePromoteFloat loses intermediate rounding operations

Displaying 20 results from an estimated 1000 matches similar to: "TypePromoteFloat loses intermediate rounding operations"

2019 Dec 10
2
TypePromoteFloat loses intermediate rounding operations
Thanks Eli. I forgot to bring up the strict FP questions which I was working on when I found this. If we're in a strict FP function, do the fp_to_f16/f16_to_fp emitted by promoting load/store/bitcast need to be strict versions of fp_to_f16/f16_to_fp. And if so where do we get the chain, especially for the bitcast case which isn't a chained node. ~Craig On Tue, Dec 10, 2019 at 3:18 PM
2014 Jul 14
5
[LLVMdev] RFC: Do we still need @llvm.convert.to.fp16 and the reverse?
Hi all, What do people think of doing away with the @llvm.convert.to.fp16 and @llvm.convert.from.fp16 intrinsics, in favour of using "half" and fpext/fptrunc? [1] It looks like those intrinsics originally date from before "half" actually existed in LLVM, and of course the backends have grown up assuming that's what Clang will produce, so we'd have to improve their
2014 Jul 09
4
[LLVMdev] Help!!!!Help!!!! " LLVM ERROR: Cannot select: 0x9fc9680: i32 = fp32_to_fp16 0x9fc0750 [ID=16] " problem!!!!!!!!!!!!!!!!!!
On 07/09/2014 12:41 PM, Matt Arsenault wrote: > On 07/09/2014 03:30 PM, yalong at multicorewareinc.com wrote: >> Thank you Kevin!!! >> If I use fptrunc and bitcast realise NEON vcvtt ( I can sure, >> "fptrunc double %tmp to float" is right, but "fptrunc float %tmp to >> half" is wrong). My target platform is MIPS. The command as following:
2014 Jul 14
2
[LLVMdev] RFC: Do we still need @llvm.convert.to.fp16 and the reverse?
On Jul 14, 2014, at 7:23 AM, Tom Stellard <tom at stellard.net> wrote: > On Mon, Jul 14, 2014 at 01:08:54PM +0100, Tim Northover wrote: >> Hi all, >> >> What do people think of doing away with the @llvm.convert.to.fp16 and >> @llvm.convert.from.fp16 intrinsics, in favour of using "half" and >> fpext/fptrunc? [1] >> > > I am in favor
2014 Jul 10
2
[LLVMdev] Help!!!!Help!!!! " LLVM ERROR: Cannot select: 0x9fc9680: i32 = fp32_to_fp16 0x9fc0750 [ID=16] " problem!!!!!!!!!!!!!!!!!!
Hi Andrea    Thank you your replying.    I do like your letter. Add following to line to MipsISelLowering.cpp. As your words,   @llvm.convert.to.fp16  can compile successfully. However, the runtime is not right. +  setOperationAction(ISD::FP16_TO_FP32, MVT::f32, Expand);+  setOperationAction(ISD::FP32_TO_FP16, MVT::i32, Expand); Robin yalong at multicorewareinc.com  From: Andrea Di
2017 Dec 06
2
[RFC] Half-Precision Support in the Arm Backends
Thanks a lot for the suggestions! I will look into using vld1/vst1, sounds good. I am custom lowering the bitcasts, that's now the only place where FP_TO_FP16 and FP16_TO_FP nodes are created to avoid inefficient code generation. I will double check if I can't achieve the same without using these nodes (because I really would like to get completely rid of them). Cheers, Sjoerd.
2008 Jul 21
0
[LLVMdev] Extending vector operations
On Mon, Jul 21, 2008 at 1:21 PM, Stefanus Du Toit <stefanus.dutoit at rapidmind.com> wrote: > 1) Vector shl, lshr, ashr > > I think these are no-brainers. We would like to extend the semantics > of the shifting instructions to naturally apply to vectors as well. > One issue is that these operations often only support a single shift > amount for an entire vector. I assume it
2011 Sep 08
1
[LLVMdev] [cfe-dev] Proposal: floating point accuracy metadata (OpenCL related)
On Thu, Sep 08, 2011 at 11:15:06AM -0500, Villmow, Micah wrote: > Peter, > Is there a way to make this flag globally available? Metadata can be fairly expensive to handle at each node when in many cases it is a global flag and not a per operation flag. There are two main reasons why I think we shouldn't go for global flags: 1) It becomes difficult if not impossible to correctly link
2018 Jan 18
0
[RFC] Half-Precision Support in the Arm Backends
I would like to revive this thread, as I am struggling a lot with the FP16 implementation in the ARM backend. My implementation in https://reviews.llvm.org/D38315 is finished (except one case), but a more robust alternative implementation was suggested. One can indeed argue that my current implementation is a bit fragile, because it involves manually patching up the isel dags for a few cases. The
2013 Dec 05
3
[LLVMdev] X86 - Help on fixing a poor code generation bug
Hi all, I noticed that the x86 backend tends to emit unnecessary vector insert instructions immediately after sse scalar fp instructions like addss/mulss. For example: ///////////////////////////////// __m128 foo(__m128 A, __m128 B) { _mm_add_ss(A, B); } ///////////////////////////////// produces the sequence: addss %xmm0, %xmm1 movss %xmm1, %xmm0 which could be easily optimized into
2010 Nov 20
2
[LLVMdev] Poor floating point optimizations?
I wanted to use LLVM for my math parser but it seems that floating point optimizations are poor. For example consider such C code: float foo(float x) { return x+x+x; } and here is the code generated with "optimized" live demo: define float @foo(float %x) nounwind readnone { entry: %0 = fmul float %x, 2.000000e+00 ; <float> [#uses=1] %1 = fadd float %0, %x
2018 Jan 18
1
[RFC] Half-Precision Support in the Arm Backends
Hi Sjoerd, For ISel, I think having a separate register class will give you less headache. I wondering if you could get away with not touching the instructions descriptions at all, instead defining external pattens for the FullFP16 case, like so: def VCVTBHS: ASuI<0b11101, 0b11, 0b0010, 0b01, 0, (outs SPR:$Sd), (ins SPR:$Sm), IIC_fpCVTSH, "vcvtb",
2010 Nov 20
0
[LLVMdev] Poor floating point optimizations?
And also the resulting assembly code is very poor: 00460013 movss xmm0,dword ptr [esp+8] 00460019 movaps xmm1,xmm0 0046001C addss xmm1,xmm1 00460020 pxor xmm2,xmm2 00460024 addss xmm2,xmm1 00460028 addss xmm2,xmm0 0046002C movss dword ptr [esp],xmm2 00460031 fld dword ptr [esp] Especially pxor&and instead of movss (which is
2008 Jul 21
10
[LLVMdev] Extending vector operations
Hi, We would like to extend the vector operations in llvm a bit. We're hoping to get some feedback on the right way to go, or some starting points. I had previously had some discussion on this list about a subset of the changes we have in mind. All of these changes are intended to make target-independent IR (i.e. IR without machine specific intrinsics) generate better code or be
2011 Sep 08
0
[LLVMdev] [cfe-dev] Proposal: floating point accuracy metadata (OpenCL related)
Peter, Is there a way to make this flag globally available? Metadata can be fairly expensive to handle at each node when in many cases it is a global flag and not a per operation flag. > -----Original Message----- > From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] > On Behalf Of Robert Quill > Sent: Thursday, September 08, 2011 3:24 AM > To: Peter
2018 Jan 09
2
[cfe-dev] Why is #pragma STDC FENV_ACCESS not supported?
On 01/08/2018 07:06 PM, Richard Smith via llvm-dev wrote: > On 8 January 2018 at 11:15, Kaylor, Andrew via llvm-dev > <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > > Hi Kevin, > > Thanks for reaching out about this, and thanks especially for > offering to help. I've had some other priorities that have > prevented
2013 Dec 05
0
[LLVMdev] X86 - Help on fixing a poor code generation bug
Hi Andrea, Thanks for working on this. I can see two approaches to solving this problem. The first one (that you suggested) is to catch this pattern after register allocation. The second approach is to eliminate this redundancy during instruction selection. Can you please look into catching this pattern during iSel? The idea is that ADDSS does an ADD plus BLEND operations, and you can easily
2013 Nov 11
2
[LLVMdev] What's the Alias Analysis does clang use ?
Hi, LLVM community: I found basicaa seems not to tell must-not-alias for __restrict__ arguments in c/c++. It only compares two pointers and the underlying objects they point to. I wonder how clang does alias analysis for c/c++ keyword restrict. let assume we compile the following code: $cat myalias.cc float foo(float * __restrict__ v0, float * __restrict__ v1, float * __restrict__ v2, float *
2018 Jan 08
4
[cfe-dev] Why is #pragma STDC FENV_ACCESS not supported?
Hi Kevin, Thanks for reaching out about this, and thanks especially for offering to help. I've had some other priorities that have prevented me from making progress on this recently. As far as I know, there is no support at all in clang for handling the FENV_ACCESS pragma. I have a sample patch somewhere that I created to demonstrate how the front end would create the constrained intrinsics
2018 Jan 09
0
[cfe-dev] Why is #pragma STDC FENV_ACCESS not supported?
On 8 January 2018 at 11:15, Kaylor, Andrew via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Hi Kevin, > > Thanks for reaching out about this, and thanks especially for offering to > help. I've had some other priorities that have prevented me from making > progress on this recently. > > As far as I know, there is no support at all in clang for handling the >