thr3ads.net - search: "vpcmpeqb"

[PATCH] x86: AVX instruction emulation fixes

2013 Aug 28

3

...int main(int argc, char **argv)
     else
         printf("skipped\n");

+    printf("%-40s", "Testing vmovdqu %ymm2,(%ecx)...");
+    if ( stack_exec && cpu_has_avx )
+    {
+        extern const unsigned char vmovdqu_to_mem[];
+
+        asm volatile ( "vpcmpeqb %%xmm2, %%xmm2, %%xmm2\n"
+                       ".pushsection .test, \"a\", @progbits\n"
+                       "vmovdqu_to_mem: vmovdqu %%ymm2, (%0)\n"
+                       ".popsection" :: "c" (NULL) );
+
+        memcpy(instr, vmovdqu_...

[LLVMdev] [llvm-commits] Vectors of Pointers and Vector-GEP

2011 Nov 30

2

...t> %indices) {
  %pointer = getelementptr float* @lut, <8 x i32> %indices
  %values = load <8 x float*> %pointer
  ret <8 x float> %values;
}

And the final AVX2 code I'd expect would consist of a single VGATHERDPS, both on 64bits and 32bits addressing mode:

foo:
  VPCMPEQB ymm1, ymm1, ymm1 ; generate all ones
  VGATHERDPS ymm0, DWORD PTR [ymm0 * 4 + lut], ymm1
  RET

Jose

----- Original Message -----
> Hi Jose,
>
> The proposed IR change does not contribute nor hinder the usecase you
> mentioned. The case of a base + vector-inde...
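The snippet above relies on two ideas: comparing a register with itself byte by byte yields all ones, and that all-ones value then serves as the gather mask, so every lane is loaded. A minimal scalar sketch in C follows. It models the semantics only and does not use the real instructions. The helper names `cmpeq_bytes`, `set_all_ones`, and `gather8` are hypothetical, `foo` is a stand-in for the function in the email, and the model assumes non-negative indices and a little-endian-agnostic mask test on 32-bit elements that are either all ones or all zeros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of a byte-wise compare-equal, as VPCMPEQB does per lane:
   each result byte is 0xFF where the inputs match and 0x00 otherwise. */
static void cmpeq_bytes(const uint8_t *a, const uint8_t *b, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (a[i] == b[i]) ? 0xFF : 0x00;
}

/* Comparing a register with itself matches in every byte, so the result
   is all ones regardless of the previous contents. */
static void set_all_ones(uint8_t reg[32])
{
    cmpeq_bytes(reg, reg, reg, 32);
}

/* Scalar model of an 8-lane masked float gather: lane i is loaded from
   base[idx[i]] only when the top bit of 32-bit mask element i is set.
   Simplification (assumption of this sketch): indices are non-negative. */
static void gather8(float dst[8], const float *base,
                    const int32_t idx[8], const uint8_t mask_reg[32])
{
    uint32_t mask[8];
    memcpy(mask, mask_reg, sizeof mask);
    for (int i = 0; i < 8; i++) {
        assert(idx[i] >= 0);
        if (mask[i] & 0x80000000u)
            dst[i] = base[idx[i]];
    }
}

/* Hypothetical stand-in for the email's foo(): look up 8 table entries. */
static void foo(float out[8], const float *lut, const int32_t indices[8])
{
    uint8_t ymm1[32] = { 0x12, 0x34 };  /* arbitrary stale contents */
    set_all_ones(ymm1);                 /* models VPCMPEQB ymm1, ymm1, ymm1 */
    gather8(out, lut, indices, ymm1);   /* models VGATHERDPS with a full mask */
}
```

Calling `foo(out, lut, indices)` with a table and eight indices fills `out` with the corresponding table entries. The hardware gather also clears its mask register as it completes, which this sketch leaves out.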

[LLVMdev] [llvm-commits] Vectors of Pointers and Vector-GEP

2011 Nov 29

0

Hi Jose,

The proposed IR change does not contribute nor hinder the usecase you mentioned. The case of a base + vector-index should be easily addressed by an intrinsic. The pointer-vector proposal comes to support full scatter/gather instructions (such as the AVX2 gather instructions).

Nadav

-----Original Message-----
From: Jose Fonseca [mailto:jfonseca at vmware.com]
Sent: Tuesday, November

[LLVMdev] [llvm-commits] Vectors of Pointers and Vector-GEP

2011 Nov 30

0

...t> %indices) {
  %pointer = getelementptr float* @lut, <8 x i32> %indices
  %values = load <8 x float*> %pointer
  ret <8 x float> %values;
}

And the final AVX2 code I'd expect would consist of a single VGATHERDPS, both on 64bits and 32bits addressing mode:

foo:
  VPCMPEQB ymm1, ymm1, ymm1 ; generate all ones
  VGATHERDPS ymm0, DWORD PTR [ymm0 * 4 + lut], ymm1
  RET

Jose

----- Original Message -----
> Hi Jose,
>
> The proposed IR change does not contribute nor hinder the usecase you
> mentioned. The case of a base + vector-inde...

windows ABI problem with i128?

2018 Apr 26

2

...callq 5f <_start+0x4f>
5f: 48 89 55 d8          mov %rdx,-0x28(%rbp)
63: 48 89 45 d0          mov %rax,-0x30(%rbp)
67: c5 fa 6f 45 d0       vmovdqu -0x30(%rbp),%xmm0
6c: c5 fa 6f 4d e0       vmovdqu -0x20(%rbp),%xmm1
71: c5 f9 74 c1          vpcmpeqb %xmm1,%xmm0,%xmm0
75: c5 79 d7 c0          vpmovmskb %xmm0,%r8d
79: 41 81 e8 ff ff 00 00 sub $0xffff,%r8d
80: 44 89 45 cc          mov %r8d,-0x34(%rbp)
84: 74 06                je 8c <_start+0x7c>
86: eb 00                jmp 88 <_s...
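The disassembly above shows a common pattern for testing two 128-bit values for equality: compare byte by byte, collect the top bit of each result byte into a 16-bit mask, then subtract 0xffff so that the result is zero only when all 16 bytes matched. The following is a scalar sketch in C of that sequence, not the compiler's actual lowering. The helper names `cmpeq16`, `movemask16`, and `equal128` are hypothetical.

```c
#include <stdint.h>

/* Models vpcmpeqb on 16 bytes: 0xFF where the bytes match, else 0x00. */
static void cmpeq16(const uint8_t a[16], const uint8_t b[16], uint8_t out[16])
{
    for (int i = 0; i < 16; i++)
        out[i] = (a[i] == b[i]) ? 0xFF : 0x00;
}

/* Models vpmovmskb: bit i of the result is the top bit of byte i. */
static uint32_t movemask16(const uint8_t v[16])
{
    uint32_t m = 0;
    for (int i = 0; i < 16; i++)
        m |= (uint32_t)(v[i] >> 7) << i;
    return m;
}

/* Returns 1 when the two 16-byte values are equal, 0 otherwise. */
static int equal128(const uint8_t a[16], const uint8_t b[16])
{
    uint8_t eq[16];
    cmpeq16(a, b, eq);
    uint32_t r8d = movemask16(eq);
    r8d -= 0xffff;    /* models sub $0xffff,%r8d: zero iff all 16 bytes matched */
    return r8d == 0;  /* the je branch is taken when the result is zero */
}
```

A single differing byte clears one bit of the mask, so the subtraction leaves a non-zero value and the branch falls through.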

windows ABI problem with i128?

2018 Apr 26

0

...>
> 5f: 48 89 55 d8          mov %rdx,-0x28(%rbp)
> 63: 48 89 45 d0          mov %rax,-0x30(%rbp)
> 67: c5 fa 6f 45 d0       vmovdqu -0x30(%rbp),%xmm0
> 6c: c5 fa 6f 4d e0       vmovdqu -0x20(%rbp),%xmm1
> 71: c5 f9 74 c1          vpcmpeqb %xmm1,%xmm0,%xmm0
> 75: c5 79 d7 c0          vpmovmskb %xmm0,%r8d
> 79: 41 81 e8 ff ff 00 00 sub $0xffff,%r8d
> 80: 44 89 45 cc          mov %r8d,-0x34(%rbp)
> 84: 74 06                je 8c <_start+0x7c>
> 86: eb 00...

windows ABI problem with i128?

2018 Apr 26

1

...8 89 55 d8             mov %rdx,-0x28(%rbp)
> > 63: 48 89 45 d0          mov %rax,-0x30(%rbp)
> > 67: c5 fa 6f 45 d0       vmovdqu -0x30(%rbp),%xmm0
> > 6c: c5 fa 6f 4d e0       vmovdqu -0x20(%rbp),%xmm1
> > 71: c5 f9 74 c1          vpcmpeqb %xmm1,%xmm0,%xmm0
> > 75: c5 79 d7 c0          vpmovmskb %xmm0,%r8d
> > 79: 41 81 e8 ff ff 00 00 sub $0xffff,%r8d
> > 80: 44 89 45 cc          mov %r8d,-0x34(%rbp)
> > 84: 74 06                je 8c <_start+0x7c>
> >...

[LLVMdev] [llvm-commits] Vectors of Pointers and Vector-GEP

2011 Nov 29

4

----- Original Message -----
> "Rotem, Nadav" <nadav.rotem at intel.com> writes:
>
> > David,
> >
> > Thanks for the support! I sent a detailed email with the overall
> > plan. But just to reiterate, the GEP would look like this:
> >
> > %PV = getelementptr <4 x i32*> %base, <4 x i32> <i32 1, i32 2, i32
> > 3, i32
