thr3ads.net - search: "rsqrt"

Displaying 11 results from an estimated 11 matches for "rsqrt".

[LLVMdev] X86 rsqrt instruction generated

2012 Nov 15

Hi, We have implemented rsqrt instruction generation for the X86 target architecture. We have introduced a flag, -fp-rsqrt, which controls generation of the X86 rsqrt instruction. We have observed minor effects on precision due to rsqrt and hence have put these transformations under the mentioned flag. Note that -fp-...
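The precision effect mentioned in this post comes from the fact that a reciprocal square root instruction returns only an estimate. A minimal C sketch of the usual remedy, one Newton-Raphson refinement step, is given here. It is an illustration only and not the code from the patch: the helper names `rsqrt_estimate` and `rsqrt_refine` are hypothetical, and the bit-level guess merely stands in for a hardware estimate so that the sketch stays portable.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a hardware estimate instruction:
   a crude bit-level guess at 1/sqrt(x), accurate to a few percent.
   Assumes a positive, normal IEEE 754 single-precision input. */
float rsqrt_estimate(float x)
{
    uint32_t bits;
    float y;

    assert(x > 0.0f);
    memcpy(&bits, &x, sizeof bits);      /* reinterpret float as integer */
    bits = 0x5f3759dfu - (bits >> 1);    /* halve and negate the exponent */
    memcpy(&y, &bits, sizeof y);
    return y;
}

/* One Newton-Raphson step for f(y) = 1/(y*y) - x.
   Each step roughly doubles the number of correct bits in y. */
float rsqrt_refine(float x, float y)
{
    return y * (1.5f - 0.5f * x * y * y);
}
```

Calling `rsqrt_refine(x, rsqrt_estimate(x))` gives a result noticeably closer to `1.0f / sqrtf(x)` than the raw estimate, but it is still not guaranteed to match the correctly rounded value, which is one plausible reason to keep such a transformation behind an opt-in flag.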

[LLVMdev] X86 rsqrt instruction generated

2012 Nov 15

On Wed, Nov 14, 2012 at 10:43 PM, Chakraborty, Soham <Soham.Chakraborty at amd.com> wrote: > Hi, > > > > We have implemented rsqrt instruction generation for the X86 target > architecture. We have introduced a flag, -fp-rsqrt, which controls > generation of the X86 rsqrt instruction. > > We have observed minor effects on precision due to rsqrt and hence have put > these transformations under the menti...

[LLVMdev] X86 rsqrt instruction generated

2012 Dec 03

...aking into consideration the comments and inputs provided earlier. Thanks & Regards, soham -----Original Message----- From: Eli Friedman [mailto:eli.friedman at gmail.com] Sent: Thursday, November 15, 2012 12:59 PM To: Chakraborty, Soham Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] X86 rsqrt instruction generated On Wed, Nov 14, 2012 at 10:43 PM, Chakraborty, Soham <Soham.Chakraborty at amd.com> wrote: > Hi, > > > > We have implemented the rsqrt instruction generation for X86 target > architecture. We have introduced a flag -fp-rsqrt flag which controls >...

Pattern transformation between scalar and vector on IR.

2016 Sep 08

Hi All, I'm trying to use RSQRT instructions in the following case for ARM (sqrt is what is used now): 1.0 / sqrt(x) The RSQRT instructions (VRSQRTE/VRSQRTS) operate on vector types, but the above operation is scalar. So a transformation must be done (transform the sqrt pattern to rsqrt). I have completed a patch for this, but I made the transf...

[LLVMdev] Cannot split vector result of AVX intrinsic _mm256_rsqrt_ps

2014 Dec 13

I'm getting this on LLVM trunk: SplitVectorResult #0: 0x27e6250: v8f32 = llvm.x86.avx.rsqrt.ps.256 0x2739310, 0x2739420 [ORD=16] [ID=0] LLVM ERROR: Do not know how to split the result of this operator! clang: error: linker command failed with exit code 1 (use -v to see invocation) Oddly, when I build the same code without -flto I don't see this issue. I see a similar bug was reporte...

[LLVMdev] X86 rcp instruction generated

2012 Nov 15

...generation of the X86 rcp instruction. We have observed minor effects on precision and hence have put these transformations under the mentioned flag. Note that -fp-rcp is presently enabled only with the -enable-unsafe-fp-math flag. Moreover, we have achieved some derived optimizations along with rsqrt generation. Following are the details of the -fp-rcp flag along with its values and enabled optimizations. -fp-rcp =off - No rcp =on - y/x => y * rcp(x) // Standard =fda - Stan...
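The `y/x => y * rcp(x)` rewrite listed above can be sketched in portable C as follows. This is a hedged illustration, not the patch itself: `rcp_estimate`, `rcp_refine`, and `approx_div` are hypothetical names, the bit-level guess is only a stand-in for a hardware reciprocal estimate, and the single refinement step is an assumption about how one might recover some of the lost precision.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a hardware reciprocal estimate:
   a crude bit-level guess at 1/x, accurate to a few percent.
   Assumes a positive, normal IEEE 754 single-precision input. */
float rcp_estimate(float x)
{
    uint32_t bits;
    float r;

    assert(x > 0.0f);
    memcpy(&bits, &x, sizeof bits);   /* reinterpret float as integer */
    bits = 0x7EF311C7u - bits;        /* negate the exponent, roughly */
    memcpy(&r, &bits, sizeof r);
    return r;
}

/* One Newton-Raphson step for f(r) = 1/r - x. */
float rcp_refine(float x, float r)
{
    return r * (2.0f - x * r);
}

/* y / x rewritten as y * rcp(x), with one refinement step. */
float approx_div(float y, float x)
{
    return y * rcp_refine(x, rcp_estimate(x));
}
```

The product `y * rcp(x)` rounds twice and starts from an approximation, so `approx_div(y, x)` generally differs from `y / x` in the low bits. That is consistent with the post tying the transformation to an unsafe floating-point mode.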

RFC: Moving DAG heuristic-based transforms to MI passes

2017 Jan 27

...replacement of the fdiv instruction (which is very expensive on most CPUs) with an alternative sequence of instructions which is usually cheaper but has appropriate precision (see genReciprocalDiv in lib/Target/X86/X86InstrInfo.cpp for details). There are other similar optimizations, such as the use of rsqrt, but at the moment we're dealing with recip estimation only - see https://reviews.llvm.org/D26855 for details. The current version of the optimization is done at the DAG Combiner level, when we don't know the exact target instructions which will be used by CodeGen. As a result we don't k...

RFC: Moving DAG heuristic-based transforms to MI passes

2017 Jan 28

...hich is very expensive in most of >> CPUs) with alternative sequence of instructions which is usually >> cheaper but has appropriate precision (see genReciprocalDiv in >> lib/Target/X86/X86InstrInfo.cpp for details). There are other similar >> optimizations like usage of rsqrt, etc. but at the moment we're >> dealing with recip estimation only - see >> https://reviews.llvm.org/D26855 for details. >> >> The current version of optimization is done at DAG Combiner level >> when we don't know the exact target instructions which will b...

[LLVMdev] sqrt

2010 Jan 07

On Jan 7, 2010, at 7:06 AM, Jon Harrop wrote: > > What is the state of sqrt in LLVM? > > It was an intrinsic but there are no OCaml bindings for it and, last > I looked, > it generated inefficient code on Linux due to this bug: > > http://www.llvm.org/PR3219 > > Is the intrinsic deprecated? Am I losing a lot of performance by > calling sqrt > from libm

[LLVMdev] sqrt

2010 Jan 07

What is the state of sqrt in LLVM? It was an intrinsic but there are no OCaml bindings for it and, last I looked, it generated inefficient code on Linux due to this bug: http://www.llvm.org/PR3219 Is the intrinsic deprecated? Am I losing a lot of performance by calling sqrt from libm instead of using the intrinsic? -- Dr Jon Harrop, Flying Frog Consultancy Ltd.

Clang for the PlayStation 2

2018 Sep 01

...rocessor" (IOP), which is used for PS1 compatibility and for I/O. The EE is based on a custom chip called the R5900, which implements most of MIPS III (except the ll and sc instructions, which make little sense on a single-core CPU), as well as some instructions from MIPS IV (pref, movz/movn, rsqrt.s), and a set of SIMD instructions known as Multimedia Instructions (MMI). It also contains a non-IEEE 754 single-precision FPU (which has provided a lot of headaches). It was later re-used by Toshiba as the TX79, along with a proper FPU. The IOP is based on the MIPS I R3051A, and was also used as...
