Displaying 20 results from an estimated 33 matches for "avx512f".
2016 May 01
2
r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set
...: http://llvm.org/viewvc/llvm-project?rev=267690&view=rev
Log:
[Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set
Differential Revision: http://reviews.llvm.org/D19529
Modified:
cfe/trunk/include/clang/Basic/BuiltinsX86.def
cfe/trunk/lib/Headers/avx512fintrin.h
cfe/trunk/test/CodeGen/avx512f-builtins.c
Modified: cfe/trunk/include/clang/Basic/BuiltinsX86.def
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/BuiltinsX86.def?rev=267690&r1=267689&r2=267690&view=diff
================================================...
2016 May 15
2
r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set
...: http://llvm.org/viewvc/llvm-project?rev=267690&view=rev
Log:
[Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set
Differential Revision: http://reviews.llvm.org/D19529
Modified:
cfe/trunk/include/clang/Basic/BuiltinsX86.def
cfe/trunk/lib/Headers/avx512fintrin.h
cfe/trunk/test/CodeGen/avx512f-builtins.c
Modified: cfe/trunk/include/clang/Basic/BuiltinsX86.def
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/BuiltinsX86.def?rev=267690&r1=267689&r2=267690&view=diff
================================================...
2016 Nov 23
4
RFC: code size reduction in X86 by replacing EVEX with VEX encoding
Hi All.
This is an RFC for a proposed target specific X86 optimization for reducing code size in the encoding of AVX-512 instructions when possible.
When the AVX512F instruction set was introduced in X86 it included additional 32 registers of 512bit size each ZMM0 - ZMM31, as well as additional 16 XMM registers XMM16-XMM31 and 16 YMM registers YMM16-YMM31.
In order to encode the new registers of 16-31 and the additional instructions, a new encoding prefix calle...
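The condition under which an EVEX-encoded instruction could fall back to the shorter VEX encoding can be sketched as a simple predicate. This is only an illustration of the idea in the RFC, not the pass it proposes: `InstrDesc` and `canUseVexEncoding` are hypothetical names, and the sketch assumes that an instruction needs EVEX whenever it touches registers 16-31, operates on 512-bit vectors, uses an opmask register, or has no VEX-encoded equivalent.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified description of one instruction (not an LLVM type).
struct InstrDesc {
    std::vector<unsigned> vectorRegs; // vector register indices used (0-31)
    unsigned vectorBits;              // operand width: 128, 256 or 512
    bool usesOpmask;                  // uses AVX-512 opmask (k register) masking
    bool hasVexForm;                  // an equivalent VEX-encoded opcode exists
};

// Returns true when nothing in the instruction requires the EVEX prefix.
bool canUseVexEncoding(const InstrDesc &I) {
    // 512-bit operands, opmasks and AVX-512-only opcodes have no VEX form.
    if (!I.hasVexForm || I.usesOpmask || I.vectorBits > 256)
        return false;
    // VEX can only address registers 0-15; 16-31 require EVEX.
    for (unsigned Reg : I.vectorRegs)
        if (Reg >= 16)
            return false;
    return true;
}
```

Under these assumptions, an instruction on XMM0-XMM15 or YMM0-YMM15 without masking qualifies, while anything touching XMM16 or a ZMM register does not.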
2016 May 09
2
Building LLVM 3.8 and later with 2016 Intel C++ compiler
..."declval"
iterator_range<decltype(begin(std::declval<T>()))> drop_begin(T &&t,
int n) {
An isolated std::declval example shows that this compiler doesn't
support it. I can't really roll back to an earlier LLVM version since I
need the latest LLVM backends (avx512f etc). I also can't really use GCC
to build LLVM since I need to build other parts of the project with
Intel and linking would then be a problem.
Does this std::declval piece of code happen to be in a place which can
be turned on/off with a configure/cmake option?
Thanks,
Frank
2016 Nov 23
2
RFC: code size reduction in X86 by replacing EVEX with VEX encoding
...: code size reduction in X86 by replacing EVEX
> with VEX encoding
>
>
>
> Hi All.
>
>
>
> This is an RFC for a proposed target specific X86 optimization for
> reducing code size in the encoding of AVX-512 instructions when possible.
>
>
>
> When the AVX512F instruction set was introduced in X86 it included
> additional 32 registers of 512bit size each ZMM0 - ZMM31, as well as
> additional 16 XMM registers XMM16-XMM31 and 16 YMM registers YMM16-YMM31.
>
> In order to encode the new registers of 16-31 and the additional
> instructions, a...
2017 May 08
2
LLVM and Xeon Skylake v5
Thank you. I'm letting it auto detect by setting the target using getProcessTarget. I disabled avx512 support by passing -avx512f (and the other variants) to setMAttrs on EngineBuilder. I can see refs to avx512 in X86.td. It's the exact same executable running on Kabylake.
What does the Cannot select: specifically mean? Is there some table that doesn't have a definition for a key in it that I would need to patch up?...
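Building the list of negated attributes described above might look like the following. This is a minimal sketch, not the poster's code: `disableAvx512` is a hypothetical helper, and it assumes that every AVX-512 feature name starts with `avx512`, as the names quoted elsewhere in these threads do. The resulting strings are the kind of list one would hand to `EngineBuilder::setMAttrs`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns "-<feature>" for every feature name that
// starts with "avx512", leaving all other features out of the result.
std::vector<std::string> disableAvx512(const std::vector<std::string> &Features) {
    std::vector<std::string> MAttrs;
    for (const std::string &F : Features)
        if (F.compare(0, 6, "avx512") == 0)
            MAttrs.push_back("-" + F);
    return MAttrs;
}
```

For example, the input `{"avx2", "avx512f", "fma", "avx512vl"}` yields `{"-avx512f", "-avx512vl"}`.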
2016 Nov 24
3
RFC: code size reduction in X86 by replacing EVEX with VEX encoding
...nt: Wednesday, November 23, 2016 5:50:42 AM
Subject: [llvm-dev] RFC: code size reduction in X86 by replacing EVEX with VEX encoding
Hi All.
This is an RFC for a proposed target specific X86 optimization for reducing code size in the encoding of AVX-512 instructions when possible.
When the AVX512F instruction set was introduced in X86 it included additional 32 registers of 512bit size each ZMM0 - ZMM31, as well as additional 16 XMM registers XMM16-XMM31 and 16 YMM registers YMM16-YMM31.
In order to encode the new registers of 16-31 and the additional instructions, a new encoding prefix calle...
2017 Sep 30
2
invalid code generated on Windows x86_64 using skylake-specific features
...strdup(features.getString().c_str());
}
On this windows laptop that I am testing on, I get these values:
target_specific_cpu_args: skylake
target_specific_features:
+sse2,+cx16,-tbm,-avx512ifma,-avx512dq,-fma4,+prfchw,+bmi2,+xsavec,+fsgsbase,+popcnt,+aes,+xsaves,-avx512er,-avx512vpopcntdq,-clwb,-avx512f,-clzero,-pku,+mmx,-lwp,-xop,+rdseed,-sse4a,-avx512bw,+clflushopt,+xsave,-avx512vl,-avx512cd,+avx,-rtm,+fma,+bmi,+rdrnd,-mwaitx,+sse4.1,+sse4.2,+avx2,+sse,+lzcnt,+pclmul,-prefetchwt1,+f16c,+ssse3,+sgx,+cmov,-avx512vbmi,+movbe,+xsaveopt,-sha,+adx,-avx512pf,+sse3
It successfully creates a binary, bu...
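A feature string in this `+name,-name` form is easier to inspect once it is split into a map. The following is a rough sketch using only the standard library; `parseFeatures` and `isEnabled` are hypothetical helpers, not LLVM APIs, and entries without a leading `+` or `-` are simply skipped.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Parses "+a,-b,+c" into {a: true, b: false, c: true}.
std::map<std::string, bool> parseFeatures(const std::string &FeatureString) {
    std::map<std::string, bool> Result;
    std::istringstream In(FeatureString);
    std::string Item;
    while (std::getline(In, Item, ',')) {
        // Skip empty entries and entries without a +/- prefix.
        if (Item.size() < 2 || (Item[0] != '+' && Item[0] != '-'))
            continue;
        Result[Item.substr(1)] = (Item[0] == '+');
    }
    return Result;
}

// A feature counts as enabled only if it is present and marked with '+'.
bool isEnabled(const std::map<std::string, bool> &Features,
               const std::string &Name) {
    auto It = Features.find(Name);
    return It != Features.end() && It->second;
}
```

With the values quoted above, `isEnabled(parseFeatures(...), "avx2")` would be true and the same query for `"avx512f"` would be false.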
2016 Nov 28
2
RFC: code size reduction in X86 by replacing EVEX with VEX encoding
...ent: Wednesday, November 23, 2016 5:50:42 AM
Subject: [llvm-dev] RFC: code size reduction in X86 by replacing EVEX with VEX encoding
Hi All.
This is an RFC for a proposed target specific X86 optimization for reducing code size in the encoding of AVX-512 instructions when possible.
When the AVX512F instruction set was introduced in X86 it included additional 32 registers of 512bit size each ZMM0 - ZMM31, as well as additional 16 XMM registers XMM16-XMM31 and 16 YMM registers YMM16-YMM31.
In order to encode the new registers of 16-31 and the additional instructions, a new encoding prefix calle...
2017 May 08
2
LLVM and Xeon Skylake v5
...rection: getProcessTriple not getProcessTarget.
>
> On 8 May 2017, at 17:55, Andy Schneider via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Thank you. I'm letting it auto detect by setting the target using
> getProcessTarget. I disabled avx512 support by passing -avx512f (and the
> other variants) to setMAttrs on EngineBuilder. I can see refs to avx512 in
> X86.td. It's the exact same executable running on Kabylake.
>
> What does the Cannot select: specifically mean? Is there some table that
> doesn't have a definition for a key in it that I...
2017 Oct 01
1
invalid code generated on Windows x86_64 using skylake-specific features
...>
> On this windows laptop that I am testing on, I get these values:
>
> target_specific_cpu_args: skylake
>
> target_specific_features: +sse2,+cx16,-tbm,-avx512ifma,-
> avx512dq,-fma4,+prfchw,+bmi2,+xsavec,+fsgsbase,+popcnt,+aes,
> +xsaves,-avx512er,-avx512vpopcntdq,-clwb,-avx512f,-clzero,-pku,+mmx,-
> lwp,-xop,+rdseed,-sse4a,-avx512bw,+clflushopt,+xsave,-
> avx512vl,-avx512cd,+avx,-rtm,+fma,+bmi,+rdrnd,-mwaitx,+sse4.
> 1,+sse4.2,+avx2,+sse,+lzcnt,+pclmul,-prefetchwt1,+f16c,+
> ssse3,+sgx,+cmov,-avx512vbmi,+movbe,+xsaveopt,-sha,+adx,-avx512pf,+sse3
>
>
>...
2016 Jun 29
2
avx512 JIT backend generates wrong code on <4 x float>
...is wrong.
When I execute the exploit program on an Intel KNL the following output
is produced:
CPU name = knl
-sse4a,-avx512bw,cx16,-tbm,xsave,-fma4,-avx512vl,prfchw,bmi2,adx,-xsavec,fsgsbase,avx,avx512cd,avx512pf,-rtm,popcnt,fma,bmi,aes,rdrnd,-xsaves,sse4.1,sse4.2,avx2,avx512er,sse,lzcnt,pclmul,avx512f,f16c,ssse3,mmx,-pku,cmov,-xop,rdseed,movbe,-hle,xsaveopt,-sha,sse2,sse3,-avx512dq,
Assembly:
.text
.file "module_KFxOBX_i4_after.ll"
.globl adjmul
.align 16, 0x90
.type adjmul,@function
adjmul:
.cfi_startproc
leaq (%rdi,%r8), %rdx...
2017 Aug 17
4
unable to emit vectorized code in LLVM IR
...-math"="false"
"no-signed-zeros-fp-math"="false" "no-trapping-math"="false"
"stack-protector-buffer-size"="8" "target-cpu"="knl"
"target-features"="+adx,+aes,+avx,+avx2,+avx512cd,+avx512er,+avx512f,+avx512pf,+bmi,+bmi2,+cx16,+f16c,+fma,+fsgsbase,+fxsr,+lzcnt,+mmx,+movbe,+pclmul,+popcnt,+prefetchwt1,+rdrnd,+rdseed,+rtm,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave,+xsaveopt"
"unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.ident = !{...
2016 Jun 29
0
avx512 JIT backend generates wrong code on <4 x float>
...e the exploit program on an Intel KNL the following
> output
> is produced:
>
> CPU name = knl
> -sse4a,-avx512bw,cx16,-tbm,xsave,-fma4,-avx512vl,prfchw,bmi2,adx,-xsavec,fsgsbase,avx,avx512cd,avx512pf,-rtm,popcnt,fma,bmi,aes,rdrnd,-xsaves,sse4.1,sse4.2,avx2,avx512er,sse,lzcnt,pclmul,avx512f,f16c,ssse3,mmx,-pku,cmov,-xop,rdseed,movbe,-hle,xsaveopt,-sha,sse2,sse3,-avx512dq,
> Assembly:
> .text
> .file "module_KFxOBX_i4_after.ll"
> .globl adjmul
> .align 16, 0x90
> .type adjmul,@function
> adjmul:
> .cfi_start...
2017 Jun 21
2
AVX 512 Assembly Code Generation issues
...> a[i] = b[i] + c[i];
> }
> }
>
> I first generated its .ll file via clang:
>
> clang -S -emit-llvm test.c -o test.ll
>
> Then I optimized it:
>
> opt -S -O3 test.ll -o test_o3.ll
>
> Then I used llc for code generation:
>
> llc -mcpu=skylake-avx512 -mattr=+avx512f test_o3.ll -o test_o3.s
>
> llc -mcpu=knl -mattr=+avx512f test_o3.ll -o test_o3.s
>
>
> here is my generated code;
>
>
>
> .text
> .file "filer_o3.ll"
> .globl foo
> .p2align 4, 0x90
> .type foo,@function
> foo:...
2016 Jun 30
1
avx512 JIT backend generates wrong code on <4 x float>
...am on an Intel KNL the following
>> output
>> is produced:
>>
>> CPU name = knl
>> -sse4a,-avx512bw,cx16,-tbm,xsave,-fma4,-avx512vl,prfchw,bmi2,adx,-xsavec,fsgsbase,avx,avx512cd,avx512pf,-rtm,popcnt,fma,bmi,aes,rdrnd,-xsaves,sse4.1,sse4.2,avx2,avx512er,sse,lzcnt,pclmul,avx512f,f16c,ssse3,mmx,-pku,cmov,-xop,rdseed,movbe,-hle,xsaveopt,-sha,sse2,sse3,-avx512dq,
>> Assembly:
>> .text
>> .file "module_KFxOBX_i4_after.ll"
>> .globl adjmul
>> .align 16, 0x90
>> .type adjmul,@function
>...
2018 Apr 10
1
64 bit mask in x86vshuffle instruction
...uffleAsRotate(DL, MVT::v64i32, V1, V2,
Mask, Subtarget, DAG))
return Rotate;
// Assume that a single SHUFPS is faster than using a permv shuffle.
// If some CPU is harmed by the domain switch, we can fix it in a later
// pass.
// If we have AVX512F support, we can use VEXPAND.
if (SDValue V = lowerVectorShuffleToEXPAND(DL, MVT::v64i32, Zeroable,
Mask,
V1, V2, DAG, Subtarget))
return V;
return lowerVectorShuffleWithPERMV(DL, MVT::v64i32, Mask, V1, V2, DAG);
}
static SDValue lowerV32I64Vec...
2017 Aug 17
2
unable to emit vectorized code in LLVM IR
...> "no-signed-zeros-fp-math"="false" "no-trapping-math"="false"
>> "stack-protector-buffer-size"="8" "target-cpu"="knl"
>> "target-features"="+adx,+aes,+avx,+avx2,+avx512cd,+avx512er,
>> +avx512f,+avx512pf,+bmi,+bmi2,+cx16,+f16c,+fma,+fsgsbase,+fx
>> sr,+lzcnt,+mmx,+movbe,+pclmul,+popcnt,+prefetchwt1,+rdrnd,+
>> rdseed,+rtm,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave,+xsaveopt"
>> "unsafe-fp-math"="false" "use-soft-float"="fa...
2016 Jun 23
2
AVX512 instruction generated when JIT compiling for an avx2 architecture
...section ".note.GNU-stack","",@progbits
end assembly!
I am not sure what instruction is the offending one, but the 'vmovdqa32'
looks avx512.
I wasn't able to reproduce this with 'opt' - it generates avx2
instructions. And when I force it to use e.g. avx512f it rejects the CPU
type.
Any ideas?
Frank
2017 Oct 03
2
invalid code generated on Windows x86_64 using skylake-specific features
...ing on, I get these values:
>>>
>>> target_specific_cpu_args: skylake
>>>
>>> target_specific_features: +sse2,+cx16,-tbm,-avx512ifma,-
>>> avx512dq,-fma4,+prfchw,+bmi2,+xsavec,+fsgsbase,+popcnt,+aes,
>>> +xsaves,-avx512er,-avx512vpopcntdq,-clwb,-avx512f,-clzero,-p
>>> ku,+mmx,-lwp,-xop,+rdseed,-sse4a,-avx512bw,+clflushopt,+xsav
>>> e,-avx512vl,-avx512cd,+avx,-rtm,+fma,+bmi,+rdrnd,-mwaitx,+
>>> sse4.1,+sse4.2,+avx2,+sse,+lzcnt,+pclmul,-prefetchwt1,+
>>> f16c,+ssse3,+sgx,+cmov,-avx512vbmi,+movbe,+xsaveopt,-
>>...