search for: pinsrw

Displaying 10 results from an estimated 10 matches for "pinsrw".

2007 Jul 20
5
[LLVMdev] Seg faulting on vector ops
...s because esp+4 isn't 16-byte aligned What is that line trying to achieve? X is at [esp+24]. There weren't any other parameters. 00000000`01b8001e f30f10c8 movss xmm1,xmm0 00000000`01b80022 8b442424 mov eax,dword ptr [esp+24h] 00000000`01b80026 660fc4c802 pinsrw xmm1,eax,2 00000000`01b8002b 89c1 mov ecx,eax 00000000`01b8002d c1e910 shr ecx,10h 00000000`01b80030 660fc4c903 pinsrw xmm1,ecx,3 00000000`01b80035 660fc4c804 pinsrw xmm1,eax,4 00000000`01b8003a 660fc4c905 pinsrw xmm1,ecx,5 00000000`01b8003f 660f...
2007 Jul 21
0
[LLVMdev] Seg faulting on vector ops
...> What is that line trying to achieve? X is at [esp+24]. There weren't > any other parameters. > > > > 00000000`01b8001e f30f10c8 movss xmm1,xmm0 > > 00000000`01b80022 8b442424 mov eax,dword ptr [esp+24h] > > 00000000`01b80026 660fc4c802 pinsrw xmm1,eax,2 > > 00000000`01b8002b 89c1 mov ecx,eax > > 00000000`01b8002d c1e910 shr ecx,10h > > 00000000`01b80030 660fc4c903 pinsrw xmm1,ecx,3 > > 00000000`01b80035 660fc4c804 pinsrw xmm1,eax,4 > > 00000000`01b8003a 660fc4c905...
2007 Jul 24
2
[LLVMdev] Seg faulting on vector ops
...s at [esp+24]. There >> weren't >> any other parameters. >> >> >> >> 00000000`01b8001e f30f10c8 movss xmm1,xmm0 >> >> 00000000`01b80022 8b442424 mov eax,dword ptr [esp+24h] >> >> 00000000`01b80026 660fc4c802 pinsrw xmm1,eax,2 >> >> 00000000`01b8002b 89c1 mov ecx,eax >> >> 00000000`01b8002d c1e910 shr ecx,10h >> >> 00000000`01b80030 660fc4c903 pinsrw xmm1,ecx,3 >> >> 00000000`01b80035 660fc4c804 pinsrw xmm1,eax,4 >>...
2007 Jul 20
0
[LLVMdev] Seg faulting on vector ops
...> > What is that line trying to achieve? X is at [esp+24]. There > weren’t any other parameters. > > > > 00000000`01b8001e f30f10c8 movss xmm1,xmm0 > > 00000000`01b80022 8b442424 mov eax,dword ptr [esp+24h] > > 00000000`01b80026 660fc4c802 pinsrw xmm1,eax,2 > > 00000000`01b8002b 89c1 mov ecx,eax > > 00000000`01b8002d c1e910 shr ecx,10h > > 00000000`01b80030 660fc4c903 pinsrw xmm1,ecx,3 > > 00000000`01b80035 660fc4c804 pinsrw xmm1,eax,4 > > 00000000`01b8003a 660fc4c905...
2007 Jul 26
0
[LLVMdev] Seg faulting on vector ops
...en't >>> any other parameters. >>> >>> >>> >>> 00000000`01b8001e f30f10c8 movss xmm1,xmm0 >>> >>> 00000000`01b80022 8b442424 mov eax,dword ptr [esp+24h] >>> >>> 00000000`01b80026 660fc4c802 pinsrw xmm1,eax,2 >>> >>> 00000000`01b8002b 89c1 mov ecx,eax >>> >>> 00000000`01b8002d c1e910 shr ecx,10h >>> >>> 00000000`01b80030 660fc4c903 pinsrw xmm1,ecx,3 >>> >>> 00000000`01b80035 660fc4c804...
2005 Aug 17
2
MMX loop filter for theora-exp
..." /* >>3 f coefs low */ +"paddw (V100),%%mm3\n" /* add 256 */ +"paddw (V100),%%mm2\n" /* add 256 */ + +" pextrw $0,%%mm2,%%esi\n" /* In MM4:MM0 we have f coefs (16bits) */ +" pextrw $1,%%mm2,%%edi\n" /* now perform MM7:MM6 = *(_bv+ f) */ +" pinsrw $0,(%2,%%esi,4),%%mm6\n" +" pinsrw $1,(%2,%%edi,4),%%mm6\n" + +" pextrw $2,%%mm2,%%esi\n" +" pextrw $3,%%mm2,%%edi\n" +" pinsrw $2,(%2,%%esi,4),%%mm6\n" +" pinsrw $3,(%2,%%edi,4),%%mm6\n" + +" pextrw $0,%%mm3,%%esi\n" +" pextrw...
2011 Oct 17
0
[LLVMdev] LLVM Build Bot failure on llmv-x86_64-ubuntu
...s like pinsr is not being generated on llvm-x86_64-ubuntu... jabbey@davinci:~$ /home/jabbey/src/osuosl/buildbot/sandbox/llvm-x86_64-ubuntu/llvm-x86_64-ubuntu/llvm/Debug+Asserts/bin/llc < /home/jabbey/src/osuosl/buildbot/sandbox/llvm-x86_64-ubuntu/llvm-x86_64-ubuntu/llvm/test/CodeGen/X86/mmx-pinsrw.ll -mtriple=x86_64-linux -mattr=+mmx,+sse2 produces: .file "<stdin>" .section .rodata.cst16,"aM",@progbits,16 .align 16 .LCPI0_0: .byte 0 # 0x0 .byte 1 # 0x1 .byte 4 # 0x4 .byte 5...
2019 Jan 24
2
[cfe-dev] _Float16 support
...ars ago for AArch64, for the pre-v8.2 mode), but vectors are fun, because of build_vector (where it helps to have the truncating behavior we have for integers, but for fp), extract_vector_elt (where you need the matching extend), and insert_vector_elt (which you have to lower using some movd and/or pinsrw trickery, if you want to avoid the generic slow via-memory fallback). Alternatively, we can immediately, in call lowering/register assignment logic (this covers the SDAG cross-BB vreg assignments Craig mentions) promote to f32 "via" i16. I'm afraid I don't remember the arguments...
2019 Jan 24
4
[cfe-dev] _Float16 support
...the pre-v8.2 mode), but vectors are fun, because of build_vector > (where it helps to have the truncating behavior we have for integers, > but for fp), extract_vector_elt (where you need the matching extend), > and insert_vector_elt (which you have to lower using some movd and/or > pinsrw trickery, if you want to avoid the generic slow via-memory > fallback). > Alternatively, we can immediately, in call lowering/register > assignment logic (this covers the SDAG cross-BB vreg assignments Craig > mentions) promote to f32 "via" i16. I'm afraid I don't r...
2019 Jan 22
4
_Float16 support
I'd like to start a discussion about how clang supports _Float16 for target architectures that don't have direct support for 16-bit floating point arithmetic. The current clang language extensions documentation says, "If half-precision instructions are unavailable, values will be promoted to single-precision, similar to the semantics of __fp16 except that the results will be stored