search for: fmov

Displaying 20 results from an estimated 28 matches for "fmov".

2017 May 11
2
[GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
...rks/mediabench/mpeg2/mpeg2dec/mpeg2decode (46%): Function > Reference_IDCT: Probably due to creating all constants in the entry BB + > spilling floating point data through an X register: > > FastISel: > fadd d0, d1, d0 > str d0, [sp,#528] > GlobalISel: > fadd d0, d1, d0 > fmov x9, d0 > stur x9, [x29,#-48] > > > Good finding, I forgot to do stores in my previous fix. I’ll do them > shortly. > > > Should be fixed by r302679 > > > Thanks Quentin, > > That reduces the slow-down when enabling globalisel at -O0 from 13% (on > r302453)...
2017 May 12
2
[GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
...rks/mediabench/mpeg2/mpeg2dec/mpeg2decode (46%): Function > Reference_IDCT: Probably due to creating all constants in the entry BB + > spilling floating point data through an X register: > > FastISel: > fadd d0, d1, d0 > str d0, [sp,#528] > GlobalISel: > fadd d0, d1, d0 > fmov x9, d0 > stur x9, [x29,#-48] > > > Good finding, I forgot to do stores in my previous fix. I’ll do them > shortly. > > > Should be fixed by r302679 > > > Thanks Quentin, > > That reduces the slow-down when enabling globalisel at -O0 from 13% (on > r302453)...
2017 May 10
2
[GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
...lications/lua/lua (46%). >> SingleSource/Benchmarks/Misc/flops-2 (75%): Poor lowering of fneg: >> FastISel: >> ldur d0, [x29,#-16] >> fneg d0, d0 >> stur d0, [x29,#-16] >> GlobalISel: >> ldur d0, [x29,#-64] >> orr x8, xzr, #0x8000000000000000 >> fmov d1, x8 >> fsub d0, d1, d0 >> fmov x8, d0 >> stur x8, [x29,#-64] >> MultiSource/Benchmarks/Prolangs-C++/city/city (74%): a call to memcpy for copying 4 bytes is present with GlobalISel that isn't present with FastISel, in function vehicle::select_move(). >> Same iss...
2016 Jul 30
2
Cannot compile speexdsp 1.2rc3 on ARM64
...ndroid version which is rc2 so I'll need to integrate, but here it is: > > #if defined(__aarch64__) > static inline int32_t saturate_32bit_to_16bit(int32_t a) { > int32_t ret; > asm volatile ("sqxtn h0, %s[a]\n" > "sxtl v0.4s, v0.4h\n" > "fmov %w[ret], s0\n" > : [ret] "=&r" (ret) > : [a] "w" (a) > : "v0" ); > return ret; > } > #elif defined(__ARM_NEON__) > static inline int32_t saturate_32bit_to_16bit(int32_t a) { > int32_t ret; > asm volatile ("vmov.s3...
2017 May 12
2
[GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
..._IDCT: Probably due to creating all constants in the entry BB + > >> spilling floating point data through an X register: > >> > >> FastISel: > >> fadd d0, d1, d0 > >> str d0, [sp,#528] > >> GlobalISel: > >> fadd d0, d1, d0 > >> fmov x9, d0 > >> stur x9, [x29,#-48] > >> > >> > >> Good finding, I forgot to do stores in my previous fix. I’ll do them > >> shortly. > >> > >> > >> Should be fixed by r302679 > >> > >> > >> Thanks Quenti...
2015 Nov 23
1
[Aarch64 v2 05/18] Add Neon intrinsics for Silk noise shape quantization.
...00078 smlal2.2d v4, v0, v3 000000000000007c smlal.2d v4, v1, v2 0000000000000080 smlal2.2d v4, v1, v2 0000000000000084 ext.16b v2, v4, v4, #8 0000000000000088 add d2, d4, d2 000000000000008c sshr d2, d2, #16 0000000000000090 fmov w0, s2 0000000000000094 stp q0, q1, [x1] 0000000000000098 ret (Non-vectorized code for non-order-8 omitted.)
2017 May 09
4
[GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
...sqlite3 (71%). Same issue causes MultiSource/Applications/lua/lua (46%). * SingleSource/Benchmarks/Misc/flops-2 (75%): Poor lowering of fneg: * FastISel: ldur d0, [x29,#-16] fneg d0, d0 stur d0, [x29,#-16] * GlobalISel: ldur d0, [x29,#-64] orr x8, xzr, #0x8000000000000000 fmov d1, x8 fsub d0, d1, d0 fmov x8, d0 stur x8, [x29,#-64] * MultiSource/Benchmarks/Prolangs-C++/city/city (74%): a call to memcpy for copying 4 bytes is present with GlobalISel that isn't present with FastISel, in function vehicle::select_move(). Same issue causes SingleSource/Benchmarks/Sh...
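The two fneg lowerings quoted in this result reach the same value by different routes: FastISel negates directly, while GlobalISel materializes the bit pattern 0x8000000000000000 (that is, -0.0), moves it into a floating-point register, and subtracts from it. A minimal C sketch of that equivalence follows. The function names are hypothetical and chosen only for illustration; they do not appear in the thread, and the sketch assumes IEEE 754 doubles with the default rounding mode.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers, named for illustration only. */

/* Reinterpret a 64-bit pattern as a double (the role of "fmov d1, x8"). */
static double bits_to_double(uint64_t bits) {
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Reinterpret a double as its 64-bit pattern (the role of "fmov x8, d0"). */
static uint64_t double_to_bits(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Direct negation, which a single fneg can implement. */
static double neg_direct(double x) {
    return -x;
}

/* Negation by subtracting from -0.0, mirroring the orr/fmov/fsub sequence. */
static double neg_via_sub(double x) {
    double minus_zero = bits_to_double(UINT64_C(0x8000000000000000));
    return minus_zero - x;
}
```

For non-NaN inputs both functions should return bit-identical results, including for signed zeros, so the longer sequence costs extra instructions without changing the answer.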
2017 May 09
2
[GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
...ks/Misc/flops-2 (75%): Poor lowering of > fneg: > - FastISel: > ldur d0, [x29,#-16] > fneg d0, d0 > stur d0, [x29,#-16] > - GlobalISel: > ldur d0, [x29,#-64] > orr x8, xzr, #0x8000000000000000 > fmov d1, x8 > fsub d0, d1, d0 > fmov x8, d0 > stur x8, [x29,#-64] > - MultiSource/Benchmarks/Prolangs-C++/city/city (74%): a call to > memcpy for copying 4 bytes is present with GlobalISel that isn't present > with FastISel, in function...
2017 Nov 14
6
[GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
To give an update here, we actually are not missing a mapping. The code complains because we are copying around a fp16 into a gpr32 and that shouldn’t be done with a copy (default mapping). I extended the repairing code to issue G_ANYEXT in those cases instead of asserting. However, now, I have to teach instruction select about those ANYEXT; otherwise we’ll fall back in that case. But that’s a
2015 May 04
2
[LLVMdev] Incorrect code generated for arm64
Hi all, I’ve narrowed down a problem in my code to the following test case: - - - - typedef struct {float v[2];} vec2; typedef struct {float v[3];} vec3; vec2 getVec2(); vec3 getVec3() { vec2 myVec = getVec2(); vec3 res; res.v[0] = myVec.v[0]; res.v[1] = myVec.v[1]; res.v[2] = 1; return res; } - - - - Compiling this with any level of optimization for arm64 gives incorrect code,
2017 May 16
2
[GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
...>>> >> spilling floating point data through an X register: >>> >> >>> >> FastISel: >>> >> fadd d0, d1, d0 >>> >> str d0, [sp,#528] >>> >> GlobalISel: >>> >> fadd d0, d1, d0 >>> >> fmov x9, d0 >>> >> stur x9, [x29,#-48] >>> >> >>> >> >>> >> Good finding, I forgot to do stores in my previous fix. I’ll do them >>> >> shortly. >>> >> >>> >> >>> >> Should be fixed by...
2017 Nov 17
2
[GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
...l-abort=0 -mbig-endian -o - -S > > _Z9get_first3foo: // @_Z9get_first3foo > // BB#0: // %entry > sub sp, sp, #16 // =16 > // implicit-def: %X8 > fmov w9, s0 > mov w10, w9 > bfxil x8, x10, #0, #32 > fmov w9, s1 > mov w10, w9 > bfi x8, x10, #32, #32 > add x10, sp, #8 // =8 > str x8, [sp, #8] > ldr w9,...
2017 May 18
2
[GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
...ench/mpeg2/mpeg2dec/mpeg2decode (46%): > Function > Reference_IDCT: Probably due to creating all constants in the entry BB > + > spilling floating point data through an X register: > > FastISel: > fadd d0, d1, d0 > str d0, [sp,#528] > GlobalISel: > fadd d0, d1, d0 > fmov x9, d0 > stur x9, [x29,#-48] > > > Good finding, I forgot to do stores in my previous fix. I’ll do them > shortly. > > > Should be fixed by r302679 > > > Thanks Quentin, > > That reduces the slow-down when enabling globalisel at -O0 from 13% (on > r302453)...
2015 May 04
2
[LLVMdev] Incorrect code generated for arm64
....v[1] = myVec.v[1]; > res.v[2] = 1; > return res; > } > > .section __TEXT,__text,regular,pure_instructions > .globl _getVec3 > .align 2 > _getVec3: ; @getVec3 > ; BB#0: > stp fp, lr, [sp, #-16]! > mov fp, sp > bl _getVec2 > fmov s2, #1.000000e+00 > ldp fp, lr, [sp], #16 > ret > > > On Mon, May 4, 2015 at 1:19 PM, Simon Taylor <simontaylor1 at ntlworld.com> wrote: > Hi all, > > I’ve narrowed down a problem in my code to the following test case: &...
2015 Apr 21
2
[LLVMdev] RFC: Missing canonicalization in LLVM
...a load/store of type double 3) InstCombine canonicalizes the load/store to use i64 instead of double 4) SROA removes the load/store & inserts a phi back in, using i64 as the type. Inserts bitcast to get to double. 5) The bitcast sticks around and eventually gets translated into FMOVs (for AArch64 at least). The function findCommonType() in SROA.cpp is used to obtain the type that should be used for the new alloca that SROA wants to create. Its decision process is essentially – if all loads/stores of alloca are the same, use that type; else use the corresponding integer ty...
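The bitcast described in this result moves a value between double and i64 without changing any bits. In C, the closest well-defined equivalent is a memcpy-based type pun, sketched below. The function names are hypothetical, and whether a given compiler emits an fmov for them depends on the target and optimization level; this is only an illustration of the operation, not code from the thread.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical names, for illustration only. */

/* Same 64 bits, viewed as an integer (analogous to bitcast double -> i64). */
static uint64_t bitcast_double_to_u64(double d) {
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Same 64 bits, viewed as a double (analogous to bitcast i64 -> double). */
static double bitcast_u64_to_double(uint64_t u) {
    double d;
    memcpy(&d, &u, sizeof d);
    return d;
}
```

Because the floating-point and general-purpose register files are separate on AArch64, each such reinterpretation that survives optimization needs a register-to-register move, which is why leftover bitcasts show up as extra instructions.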
2017 Nov 27
2
[GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
...l-abort=0 -mbig-endian -o - -S > > _Z9get_first3foo: // @_Z9get_first3foo > // BB#0: // %entry > sub sp, sp, #16 // =16 > // implicit-def: %X8 > fmov w9, s0 > mov w10, w9 > bfxil x8, x10, #0, #32 > fmov w9, s1 > mov w10, w9 > bfi x8, x10, #32, #32 > add x10, sp, #8 // =8 > str x8, [sp, #8] > ldr w9,...
2017 May 19
2
[GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
...: Probably due to creating all constants in the entry BB >>> + >>> spilling floating point data through an X register: >>> >>> FastISel: >>> fadd d0, d1, d0 >>> str d0, [sp,#528] >>> GlobalISel: >>> fadd d0, d1, d0 >>> fmov x9, d0 >>> stur x9, [x29,#-48] >>> >>> >>> Good finding, I forgot to do stores in my previous fix. I’ll do them >>> shortly. >>> >>> >>> Should be fixed by r302679 >>> >>> >>> Thanks Quentin, >>...
2015 Mar 28
4
Cannot compile speexdsp 1.2rc3 on ARM64
Hi all, I built successfully with speex-1.2rc2. With speexdsp 1.2rc3, the builds for i386, X86_64, armv7 and armv7s all passed. But when I build for ARM64 (for iPhone 6), it fails with: /Applications/Xcode.app/Contents/Developer/usr/bin/make all-recursive Making all in libspeexdsp CC preprocess.lo CC jitter.lo CC mdf.lo CC fftwrap.lo CC
2016 Apr 19
0
Cannot compile speexdsp 1.2rc3 on ARM64
...t neon. I'm working off an android version which is rc2 so I'll need to integrate, but here it is: #if defined(__aarch64__) static inline int32_t saturate_32bit_to_16bit(int32_t a) { int32_t ret; asm volatile ("sqxtn h0, %s[a]\n" "sxtl v0.4s, v0.4h\n" "fmov %w[ret], s0\n" : [ret] "=&r" (ret) : [a] "w" (a) : "v0" ); return ret; } #elif defined(__ARM_NEON__) static inline int32_t saturate_32bit_to_16bit(int32_t a) { int32_t ret; asm volatile ("vmov.s32 d24[0], %[a]\n" "vqmovn.s32...
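Reading the AArch64 branch instruction by instruction, sqxtn narrows the input to 16 bits with signed saturation, sxtl sign-extends it back to 32 bits, and fmov copies the result into a general-purpose register. A portable C reference for the value that sequence should return might look like the sketch below. The `_ref` suffix is a hypothetical name added here; this excerpt of the thread shows no such fallback.

```c
#include <stdint.h>

/* Hypothetical portable reference: clamp a 32-bit value to the int16_t
   range and return it widened back to 32 bits. */
static inline int32_t saturate_32bit_to_16bit_ref(int32_t a) {
    if (a > INT16_MAX) {
        return INT16_MAX;   /* saturate high, as sqxtn would */
    }
    if (a < INT16_MIN) {
        return INT16_MIN;   /* saturate low, as sqxtn would */
    }
    return a;               /* already representable in 16 bits */
}
```

A reference like this can be useful for checking an inline-assembly version on a range of inputs, or as a fallback on targets where neither assembly branch applies.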
2005 Nov 09
1
Samba over NFS: Total and Free disk incorrect in Windows.
Hello, I have a Linux machine auto-mounting an NFS share, then sharing it out via Samba (not my idea). Everything is fine - except: On Windows machines that have mapped a drive to the Samba share the Total Size and Free Space for the mapped network drive (Samba/NFSshare) shows Total Size of 20.0 MB and Free Space of 0 bytes. To rule out a very simple Samba problem I created a new (local)