thr3ads.net - similar to: "[LLVMdev] Cannot split vector result of AVX intrinsic _mm256_rsqrt

Displaying 20 results from an estimated 100 matches similar to: "[LLVMdev] Cannot split vector result of AVX intrinsic _mm256_rsqrt_ps"

[LLVMdev] X86 rsqrt instruction generated

2012 Nov 15

Hi, We have implemented rsqrt instruction generation for the X86 target architecture. We have introduced a flag, -fp-rsqrt, which controls the generation of the X86 rsqrt instruction. We have observed minor effects on precision due to rsqrt and hence have put these transformations under the mentioned flag. Note that -fp-rsqrt is presently only enabled together with the -enable-unsafe-fp-math flag.

[LLVMdev] X86 rsqrt instruction generated

2012 Nov 15

On Wed, Nov 14, 2012 at 10:43 PM, Chakraborty, Soham <Soham.Chakraborty at amd.com> wrote: > Hi, > > > > We have implemented rsqrt instruction generation for the X86 target > architecture. We have introduced a flag, -fp-rsqrt, which controls the > generation of the X86 rsqrt instruction. > > We have observed minor effects on precision due to rsqrt and

Pattern transformation between scalar and vector on IR.

2016 Sep 08

Hi All, I'm trying to use the RSQRT instructions for the following case on ARM (which currently uses sqrt): 1.0 / sqrt(x) The RSQRT instructions (VRSQRTE/VRSQRTS) are vector instructions, but the operation above is scalar, so a transformation must be done (transform the sqrt pattern to rsqrt). I have completed a patch for this, but I made the transformation in the backend, which will lead to additional
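
The idea behind replacing 1.0 / sqrt(x) with an estimate instruction plus refinement steps can be sketched as follows. This is only an illustration, not the patch discussed in the thread: `rsqrt_estimate` is a hypothetical stand-in for a hardware estimate (it uses the well-known bit-level approximation rather than modelling VRSQRTE), and `rsqrt_step` assumes the step operation computes (3 - a*b) / 2, which is how VRSQRTS is commonly described.

```python
import math
import struct


def rsqrt_estimate(x: float) -> float:
    """Crude estimate of 1/sqrt(x); a hypothetical stand-in for a hardware estimate."""
    # Reinterpret the single-precision bit pattern of x as an unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Classic bit-level approximation of the inverse square root.
    bits = 0x5F3759DF - (bits >> 1)
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def rsqrt_step(a: float, b: float) -> float:
    """Newton-Raphson step factor, assumed here to be (3 - a*b) / 2."""
    return (3.0 - a * b) / 2.0


def rsqrt_refined(x: float, steps: int = 2) -> float:
    """Refine the estimate y of 1/sqrt(x) with y <- y * (3 - x*y*y) / 2."""
    y = rsqrt_estimate(x)
    for _ in range(steps):
        y = y * rsqrt_step(x * y, y)
    return y


if __name__ == "__main__":
    for x in (0.25, 2.0, 10.0):
        print(x, rsqrt_refined(x), 1.0 / math.sqrt(x))
```

Each refinement step roughly squares the relative error, which is why a low-precision estimate followed by one or two steps can stand in for a full sqrt plus divide when reduced precision is acceptable.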

[LLVMdev] X86 rsqrt instruction generated

2012 Dec 03

Hi, Please find attached the modified patch and description. We have modified and retested the patch taking into consideration the comments and inputs provided earlier. Thanks & Regards, soham -----Original Message----- From: Eli Friedman [mailto:eli.friedman at gmail.com] Sent: Thursday, November 15, 2012 12:59 PM To: Chakraborty, Soham Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev]

[LLVMdev] Question about LLVM NEON intrinsics

2012 Sep 21

Hi all, I would like to know if LLVM NEON intrinsics are designed to support only 'Legal' types for NEON units. Using llc -march=arm -mcpu=cortex-a9 vmax4.ll -o vmax4.s on the following .ll code: ; ModuleID = 'vmax.ll' target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-n32" target triple =

[LLVMdev] SplitVecRes with SIGN_EXTEND_INREG unsupported

2009 Dec 10

I have code that is generating a sign extend in reg on a v8i32, but the backend does not support this data type. This then asserts in LegalizeVectorTypes.cpp:389 because there is no function to split this vector into smaller sizes. Would a correct solution be to add this case so as to trigger the SplitVecRes_BinaryOp function? This asserts on both my backend and x86 and TOT does not seem to have

RFC: Moving DAG heuristic-based transforms to MI passes

2017 Jan 27

All llvm-devs, We're going to introduce a new possible implementation for optimizations such as using reciprocal estimation instead of fdiv. In short, it replaces the fdiv instruction (which is very expensive on most CPUs) with an alternative sequence of instructions that is usually cheaper but has appropriate precision (see genReciprocalDiv in lib/Target/X86/X86InstrInfo.cpp for
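
As a rough illustration of what replacing fdiv with a reciprocal estimate means numerically, the sketch below computes a / b as a * r, where r starts as a low-precision approximation of 1/b and is refined with Newton-Raphson. It does not reproduce genReciprocalDiv or any LLVM code: `recip_estimate` is a hypothetical model that truncates the true reciprocal to about 12 significant bits, and `fast_div` is a name introduced here for illustration.

```python
import math


def recip_estimate(b: float, bits: int = 12) -> float:
    """Model a low-precision hardware estimate of 1/b (hypothetical stand-in).

    The true reciprocal is truncated to roughly `bits` significant bits.
    """
    m, e = math.frexp(1.0 / b)  # 1/b == m * 2**e with 0.5 <= |m| < 1
    scale = 1 << bits
    return math.ldexp(math.floor(m * scale) / scale, e)


def fast_div(a: float, b: float, steps: int = 1) -> float:
    """Approximate a / b as a * r, refining r with r <- r * (2 - b*r)."""
    r = recip_estimate(b)
    for _ in range(steps):
        r = r * (2.0 - b * r)
    return a * r


if __name__ == "__main__":
    print(fast_div(1.0, 3.0), 1.0 / 3.0)
```

Each step roughly squares the relative error of the estimate, so the number of steps is the knob that trades instruction count against precision.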

RFC: Moving DAG heuristic-based transforms to MI passes

2017 Jan 28

In fact, committing the change before dealing with worst-case performance is a good idea, because here we have 2 different issues. But the main idea of this RFC is an attempt to show a better approach to these kinds of transformations and to suggest using this approach in the future. At the same time, I'm trying to explain that this patch is not the performance one because the

[LLVMdev] Question about LLVM NEON intrinsics

2012 Sep 21

On Fri, Sep 21, 2012 at 1:28 AM, Sebastien DELDON-GNB <sebastien.deldon at st.com> wrote: > Hi all, > > I would like to know if LLVM Neon intrinsics are designed to support only 'Legal' types for NEON units. > Using llc -march=arm -mcpu=cortex-a9 vmax4.ll -o vmax4.s on following ll code: > > > ; ModuleID = 'vmax.ll' > target datalayout =

[LLVMdev] SplitVecRes with SIGN_EXTEND_INREG unsupported

2009 Dec 10

On Wed, Dec 9, 2009 at 8:40 PM, Villmow, Micah <Micah.Villmow at amd.com> wrote: > I have code that is generating sign extend in reg on a v8i32, but the > backend does not support this data type. This then asserts in > LegalizeVectorTypes.cpp:389 because there is no function to split this > vector into smaller sizes. Would a correct solution be to add this case so > to

[LLVMdev] RE : Question about LLVM NEON intrinsics

2012 Sep 21

Hi Eli, Thanks for the answer, it clarifies the situation for me. Do you know if there is a pass in LLVM that could be adapted to 'legalize' intrinsic calls? Or shall I define my own intrinsics for unsupported types? Best Regards Seb ________________________________________ From: Eli Friedman [eli.friedman at gmail.com] Sent: Friday, 21 September 2012 11:54 To: Sebastien

[LLVMdev] Question about LLVM NEON intrinsics

2012 Sep 21

On 21 September 2012 09:28, Sebastien DELDON-GNB <sebastien.deldon at st.com> wrote: > declare <16 x float> @llvm.arm.neon.vmaxs.v16f32(<16 x float>, <16 x float>) nounwind readnone > > llc fails with following message: > > SplitVectorResult #0: 0x2258350: v16f32 = llvm.arm.neon.vmaxs 0x2258250, 0x2258050, 0x2258150 [ORD=3] [ID=0] > > LLVM ERROR: Do not

[LLVMdev] sqrt

2010 Jan 07

On Jan 7, 2010, at 7:06 AM, Jon Harrop wrote: > > What is the state of sqrt in LLVM? > > It was an intrinsic but there are no OCaml bindings for it and, last > I looked, > it generated inefficient code on Linux due to this bug: > > http://www.llvm.org/PR3219 > > Is the intrinsic deprecated? Am I losing a lot of performance by > calling sqrt > from libm

[LLVMdev] sqrt

2010 Jan 07

What is the state of sqrt in LLVM? It was an intrinsic but there are no OCaml bindings for it and, last I looked, it generated inefficient code on Linux due to this bug: http://www.llvm.org/PR3219 Is the intrinsic deprecated? Am I losing a lot of performance by calling sqrt from libm instead of using the intrinsic? -- Dr Jon Harrop, Flying Frog Consultancy Ltd.

[LLVMdev] SplitVecRes with SIGN_EXTEND_INREG unsupported

2009 Dec 10

Thanks Eli, I'll see if I can get something working and submit a patch. Micah -----Original Message----- From: Eli Friedman [mailto:eli.friedman at gmail.com] Sent: Wednesday, December 09, 2009 11:18 PM To: Villmow, Micah Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] SplitVecRes with SIGN_EXTEND_INREG unsupported On Wed, Dec 9, 2009 at 8:40 PM, Villmow, Micah <Micah.Villmow at

[LLVMdev] SplitVectorOp from CopyFromReg

2010 Apr 20

Hello, I have a kernel that's swizzling a vector inside a loop. The vector was created before the loop. The first node in the dag is an extract subvector which calls into SplitVectorOp. The issue is that the node passed to it comes from a CopyFromReg and SplitVectorOp doesn't know what to do. Is there a reason why SplitVectorOp doesn't handle CopyFromReg nodes? If not, can I submit a

[LLVMdev] SplitVectorOp from CopyFromReg

2010 Apr 20

Hi Javier, > I have a kernel that's swizzling a vector inside a loop. The vector was > created before the loop. The first node in the dag is an extract subvector > which calls into SplitVectorOp. The issue is that the node passed to it > comes from a CopyFromReg and SplitVectorOp doesn't know what to do. Is > there a reason why SplitVectorOp doesn't handle CopyFromReg

[LLVMdev] Question about LLVM NEON intrinsics

2012 Sep 21

On Sep 21, 2012, at 2:58 AM, Sebastien DELDON-GNB <sebastien.deldon at st.com> wrote: > Hi Eli, > > Thanks for the answer, it clarifies the situation for me. Do you know if there is Pass in LLVM that could be adapted to 'legalize' intrinsics calls ? > Or shall I define my own intrinsics for non supported types ? You should never generate these sorts of intrinsics with

Updating LLVM Tests for Patch

2017 Sep 20

Hi, I am currently working on a more or less intrusive patch (D37896), which pulls optimizations on multiplications from some back-ends, e.g., (mul x, 2^N + 1) => (add (shl x, N), x) in AArch64, into the DAGCombiner to make this optimization generic across all targets. However, running the LLVM tests leads to 67 unexpected results. On 19.09.2017 at 15:58, Sanjay Patel wrote: > For the
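
The rewrite (mul x, 2^N + 1) => (add (shl x, N), x) mentioned in this message can be checked with a small sketch. This is not code from D37896 or the DAGCombiner; the function names are hypothetical, and 32-bit wrap-around arithmetic on unsigned values is an assumption made for the illustration.

```python
MASK32 = 0xFFFFFFFF  # assumption: model 32-bit wrap-around integer arithmetic


def shift_for_pow2_plus_one(c: int):
    """Return N if c == 2**N + 1 with N >= 1, otherwise None."""
    m = c - 1
    if m >= 2 and (m & (m - 1)) == 0:  # m is a power of two
        return m.bit_length() - 1
    return None


def mul_via_shift_add(x: int, c: int) -> int:
    """Multiply x by constant c, using (x << N) + x when c == 2**N + 1."""
    n = shift_for_pow2_plus_one(c)
    if n is None:
        return (x * c) & MASK32  # fall back to a plain multiply
    return (((x << n) & MASK32) + x) & MASK32


if __name__ == "__main__":
    print(shift_for_pow2_plus_one(9))  # 9 == 2**3 + 1
    print(mul_via_shift_add(7, 9), (7 * 9) & MASK32)
```

The shift-and-add form is exact under modular arithmetic, so the two results agree for every input; whether it is actually cheaper than a multiply depends on the target, which is the kind of question such a generic combine has to settle.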

Updating LLVM Tests for Patch

2017 Sep 20

There are multiple problems/questions here: 1. Make sure you've updated trunk to the latest rev before running update_llc_test_checks.py on lea-3.ll. I.e., I would only expect the output you're seeing if you're running the script on a version of that test file before r313631. After that commit, each RUN has its own check prefix, so there should be no opportunity for conflict. 2. I