Displaying 10 results from an estimated 10 matches for "pinsrw".
2007 Jul 20
5
[LLVMdev] Seg faulting on vector ops
...s because esp+4 isn't 16-byte aligned
What is that line trying to achieve? X is at [esp+24]. There weren't
any other parameters.
00000000`01b8001e f30f10c8 movss xmm1,xmm0
00000000`01b80022 8b442424 mov eax,dword ptr [esp+24h]
00000000`01b80026 660fc4c802 pinsrw xmm1,eax,2
00000000`01b8002b 89c1 mov ecx,eax
00000000`01b8002d c1e910 shr ecx,10h
00000000`01b80030 660fc4c903 pinsrw xmm1,ecx,3
00000000`01b80035 660fc4c804 pinsrw xmm1,eax,4
00000000`01b8003a 660fc4c905 pinsrw xmm1,ecx,5
00000000`01b8003f 660f...
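To make the visible part of the listing easier to follow, here is a small Python model of what the four `pinsrw` instructions do to `xmm1`. It is only a sketch: the register is modelled as a list of eight 16-bit words, the helper names (`pinsrw`, `dwords`, `insert_x_lanes_1_and_2`) are made up for illustration, and nothing past the truncated last line of the listing is modelled.

```python
def pinsrw(words, value, lane):
    """Model of SSE2 PINSRW xmm, r32, imm8: write the low 16 bits of
    `value` into one of eight 16-bit lanes, leaving the others alone."""
    out = list(words)
    out[lane & 7] = value & 0xFFFF
    return out


def dwords(words):
    """View eight 16-bit lanes as four little-endian 32-bit lanes."""
    return [words[2 * i] | (words[2 * i + 1] << 16) for i in range(4)]


def insert_x_lanes_1_and_2(xmm1, x):
    """Replay the visible instructions (hypothetical helper name)."""
    eax = x & 0xFFFFFFFF          # mov eax, dword ptr [esp+24h]
    xmm1 = pinsrw(xmm1, eax, 2)   # pinsrw xmm1, eax, 2  (low half of X)
    ecx = eax >> 16               # mov ecx, eax / shr ecx, 10h
    xmm1 = pinsrw(xmm1, ecx, 3)   # pinsrw xmm1, ecx, 3  (high half of X)
    xmm1 = pinsrw(xmm1, eax, 4)   # pinsrw xmm1, eax, 4
    xmm1 = pinsrw(xmm1, ecx, 5)   # pinsrw xmm1, ecx, 5
    return xmm1


if __name__ == "__main__":
    start = [0x1111, 0x2222, 0, 0, 0, 0, 0x7777, 0x8888]
    result = insert_x_lanes_1_and_2(start, 0xDEADBEEF)
    print([hex(d) for d in dwords(result)])
```

In this model each pair of word inserts amounts to one 32-bit element insert, so the visible sequence places X in 32-bit lanes 1 and 2 and leaves lanes 0 and 3 untouched.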
2007 Jul 21
0
[LLVMdev] Seg faulting on vector ops
...> What is that line trying to achieve? X is at [esp+24]. There weren't
> any other parameters.
>
>
>
> 00000000`01b8001e f30f10c8 movss xmm1,xmm0
>
> 00000000`01b80022 8b442424 mov eax,dword ptr [esp+24h]
>
> 00000000`01b80026 660fc4c802 pinsrw xmm1,eax,2
>
> 00000000`01b8002b 89c1 mov ecx,eax
>
> 00000000`01b8002d c1e910 shr ecx,10h
>
> 00000000`01b80030 660fc4c903 pinsrw xmm1,ecx,3
>
> 00000000`01b80035 660fc4c804 pinsrw xmm1,eax,4
>
> 00000000`01b8003a 660fc4c905...
2007 Jul 24
2
[LLVMdev] Seg faulting on vector ops
...s at [esp+24]. There
>> weren't
>> any other parameters.
>>
>>
>>
>> 00000000`01b8001e f30f10c8 movss xmm1,xmm0
>>
>> 00000000`01b80022 8b442424 mov eax,dword ptr [esp+24h]
>>
>> 00000000`01b80026 660fc4c802 pinsrw xmm1,eax,2
>>
>> 00000000`01b8002b 89c1 mov ecx,eax
>>
>> 00000000`01b8002d c1e910 shr ecx,10h
>>
>> 00000000`01b80030 660fc4c903 pinsrw xmm1,ecx,3
>>
>> 00000000`01b80035 660fc4c804 pinsrw xmm1,eax,4
>>...
2007 Jul 20
0
[LLVMdev] Seg faulting on vector ops
...>
> What is that line trying to achieve? X is at [esp+24]. There
> weren’t any other parameters.
>
>
>
> 00000000`01b8001e f30f10c8 movss xmm1,xmm0
>
> 00000000`01b80022 8b442424 mov eax,dword ptr [esp+24h]
>
> 00000000`01b80026 660fc4c802 pinsrw xmm1,eax,2
>
> 00000000`01b8002b 89c1 mov ecx,eax
>
> 00000000`01b8002d c1e910 shr ecx,10h
>
> 00000000`01b80030 660fc4c903 pinsrw xmm1,ecx,3
>
> 00000000`01b80035 660fc4c804 pinsrw xmm1,eax,4
>
> 00000000`01b8003a 660fc4c905...
2007 Jul 26
0
[LLVMdev] Seg faulting on vector ops
...en't
>>> any other parameters.
>>>
>>>
>>>
>>> 00000000`01b8001e f30f10c8 movss xmm1,xmm0
>>>
>>> 00000000`01b80022 8b442424 mov eax,dword ptr [esp+24h]
>>>
>>> 00000000`01b80026 660fc4c802 pinsrw xmm1,eax,2
>>>
>>> 00000000`01b8002b 89c1 mov ecx,eax
>>>
>>> 00000000`01b8002d c1e910 shr ecx,10h
>>>
>>> 00000000`01b80030 660fc4c903 pinsrw xmm1,ecx,3
>>>
>>> 00000000`01b80035 660fc4c804...
2005 Aug 17
2
MMX loop filter for theora-exp
..." /* >>3 f coefs low */
+"paddw (V100),%%mm3\n" /* add 256 */
+"paddw (V100),%%mm2\n" /* add 256 */
+
+" pextrw $0,%%mm2,%%esi\n" /* In MM4:MM0 we have f coefs (16bits) */
+" pextrw $1,%%mm2,%%edi\n" /* now perform MM7:MM6 = *(_bv+ f) */
+" pinsrw $0,(%2,%%esi,4),%%mm6\n"
+" pinsrw $1,(%2,%%edi,4),%%mm6\n"
+
+" pextrw $2,%%mm2,%%esi\n"
+" pextrw $3,%%mm2,%%edi\n"
+" pinsrw $2,(%2,%%esi,4),%%mm6\n"
+" pinsrw $3,(%2,%%edi,4),%%mm6\n"
+
+" pextrw $0,%%mm3,%%esi\n"
+" pextrw...
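The `pextrw`/`pinsrw` pairs in the visible part of this patch form a per-lane table lookup: each 16-bit index is pulled out of `mm2`, used to address the table, and the loaded word is dropped into the matching lane of `mm6`. The Python sketch below models only that pattern. The helper names and the sample table are illustrative assumptions, not taken from the patch, and the scale factor of 4 in the addressing mode is modelled as indexing a table of wider entries whose low 16 bits are kept.

```python
def pextrw(words, lane):
    """Model of PEXTRW on an MMX register: read one of four 16-bit lanes,
    zero-extended."""
    return words[lane & 3] & 0xFFFF


def pinsrw(words, value, lane):
    """Model of PINSRW on an MMX register: write the low 16 bits of
    `value` into one of four 16-bit lanes."""
    out = list(words)
    out[lane & 3] = value & 0xFFFF
    return out


def gather4(table, index_words, dst_words):
    """Per-lane lookup: dst[lane] = low 16 bits of table[index[lane]]."""
    for lane in range(4):
        i = pextrw(index_words, lane)                  # pextrw $lane,%mm2,%esi
        dst_words = pinsrw(dst_words, table[i], lane)  # pinsrw $lane,(%2,%esi,4),%mm6
    return dst_words


if __name__ == "__main__":
    table = [10 * i for i in range(8)]   # made-up table contents
    print(gather4(table, [3, 0, 7, 1], [0, 0, 0, 0]))
```

The real code alternates between `%esi` and `%edi` so that two lookups can be in flight at once; the model ignores that scheduling detail.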
2011 Oct 17
0
[LLVMdev] LLVM Build Bot failure on llmv-x86_64-ubuntu
...s like pinsr is not being generated on llvm-x86_64-ubuntu...
jabbey@davinci:~$ /home/jabbey/src/osuosl/buildbot/sandbox/llvm-x86_64-ubuntu/llvm-x86_64-ubuntu/llvm/Debug+Asserts/bin/llc < /home/jabbey/src/osuosl/buildbot/sandbox/llvm-x86_64-ubuntu/llvm-x86_64-ubuntu/llvm/test/CodeGen/X86/mmx-pinsrw.ll -mtriple=x86_64-linux -mattr=+mmx,+sse2
produces:
.file "<stdin>"
.section .rodata.cst16,"aM",@progbits,16
.align 16
.LCPI0_0:
.byte 0 # 0x0
.byte 1 # 0x1
.byte 4 # 0x4
.byte 5...
2019 Jan 24
2
[cfe-dev] _Float16 support
...ars ago for AArch64, for the pre-v8.2 mode), but vectors are fun, because of build_vector (where it helps to have the truncating behavior we have for integers, but for fp), extract_vector_elt (where you need the matching extend), and insert_vector_elt (which you have to lower using some movd and/or pinsrw trickery, if you want to avoid the generic slow via-memory fallback).
Alternatively, we can immediately, in call lowering/register assignment logic (this covers the SDAG cross-BB vreg assignments Craig
mentions) promote to f32 "via" i16. I'm afraid I don't remember the arguments...
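As a rough illustration of the "movd and/or pinsrw trickery" mentioned above, the Python sketch below models inserting one half-precision element into an 8 x f16 vector by handling its raw 16-bit pattern as an integer. This is a model of the idea only, not LLVM's actual lowering: the function names are hypothetical, and the vector is represented as a list of eight 16-bit words.

```python
import struct


def f16_bits(value):
    """Raw IEEE 754 binary16 bit pattern of `value` as an integer."""
    return struct.unpack("<H", struct.pack("<e", value))[0]


def bits_f16(bits):
    """Inverse of f16_bits: reinterpret a 16-bit pattern as a float."""
    return struct.unpack("<e", struct.pack("<H", bits & 0xFFFF))[0]


def pinsrw(words, value, lane):
    """Model of SSE2 PINSRW: write the low 16 bits of `value` into one of
    eight 16-bit lanes."""
    out = list(words)
    out[lane & 7] = value & 0xFFFF
    return out


def insert_f16_elt(vec_words, value, lane):
    """Model of insert_vector_elt on <8 x half>: move the element's bit
    pattern through an integer and insert it as a 16-bit word."""
    return pinsrw(vec_words, f16_bits(value), lane)


if __name__ == "__main__":
    v = insert_f16_elt([0] * 8, 1.0, 3)
    print([hex(w) for w in v], bits_f16(v[3]))
```

The point of the model is that no memory round trip is involved: the element only ever exists as a 16-bit integer pattern on its way into the vector lane.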
2019 Jan 24
4
[cfe-dev] _Float16 support
...the pre-v8.2 mode), but vectors are fun, because of build_vector
> (where it helps to have the truncating behavior we have for integers,
> but for fp), extract_vector_elt (where you need the matching extend),
> and insert_vector_elt (which you have to lower using some movd and/or
> pinsrw trickery, if you want to avoid the generic slow via-memory
> fallback).
> Alternatively, we can immediately, in call lowering/register
> assignment logic (this covers the SDAG cross-BB vreg assignments Craig
> mentions) promote to f32 "via" i16. I'm afraid I don't r...
2019 Jan 22
4
_Float16 support
I'd like to start a discussion about how clang supports _Float16 for target architectures that don't have direct support for 16-bit floating point arithmetic.
The current clang language extensions documentation says, "If half-precision instructions are unavailable, values will be promoted to single-precision, similar to the semantics of __fp16 except that the results will be stored