thr3ads.net - search: "hasavx2"

Displaying 15 results from an estimated 15 matches for "hasavx2".

[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

2013 Sep 12

...core-avx2" platform on Haswell i7 CPUs when running -march=native. Currently it detects it as generic x86_64.

lib/Support/Host.cpp:
* Haswell is detected for CPUID Family 6 Model 60
* Similar to Ivy and Sandy Bridge we check for AVX2 since some Haswell Pentiums are SSE4.x only
* I have marked HasAVX2 as "volatile", since otherwise it gets magically zeroed (by optimizer?) when compiling clang with latest clang build from trunk

lib/Target/X86/X86Subtarget.cpp:
* Also enabling X86::FeatureFastUAMem for Haswell

Regards,
-- Adam Strzelecki | nanoant.com | twitter.com/nanoant
-----------...

[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

2013 Nov 22

...lib/Support/Host.cpp
+++ b/lib/Support/Host.cpp
@@ -138,6 +138,8 @@ std::string sys::getHostCPUName() {
   // switch, then we have full AVX support.
   const unsigned AVXBits = (1 << 27) | (1 << 28);
   bool HasAVX = ((ECX & AVXBits) == AVXBits) && OSHasAVXSupport();
+  bool HasAVX2 = HasAVX && !GetX86CpuIDAndInfo(0x7, &EAX, &EBX, &ECX, &EDX) &&
+                 (EBX & 0x20);
   GetX86CpuIDAndInfo(0x80000001, &EAX, &EBX, &ECX, &EDX);
   bool Em64T = (EDX >> 29) & 0x1;
@@ -258,6 +260,12 @@ std::string sys::getHost...

[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

2013 Sep 12

Hi Adam,

> * I have marked HasAVX2 as "volatile", since otherwise it gets
> magically zeroed (by optimizer?) when compiling clang with latest
> clang build from trunk

That's far more worrying to me than not being able to detect Haswell. I can't reproduce the problem here at the moment: both debug and release...

[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

2013 Sep 12

...elease builds give identical assembly for Host.cpp.

OK. I know the reason you cannot reproduce it: before posting the patch I decided to check for AVX before checking AVX2, just so as not to query CPUID for AVX2 when we don't have AVX1 anyway. So the problem exists with the following predicate:

(1) bool HasAVX2 = !GetX86CpuIDAndInfo(0x7, &EAX, &EBX, &ECX, &EDX) && (EBX & 0x20);

However, it works absolutely fine if I add "volatile":

(2) volatile bool HasAVX2 = !GetX86CpuIDAndInfo(0x7, &EAX, &EBX, &ECX, &EDX) &&...

[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

2013 Nov 22

Hi Adam,

> + bool HasAVX2 = HasAVX && !GetX86CpuIDAndInfo(0x7, &EAX, &EBX, &ECX, &EDX) &&
> +                (EBX & 0x20);

I don't think this guarantees %ecx is 0, does it? Wasn't that the entire reason the original code went wrong?

Cheers.
Tim.

[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

2013 Sep 12

Hi Adam,

> OK. I know the reason you cannot reproduce it, before posting
> the patch I've decided to check for AVX before checking AVX2,
> just not to cpuid AVX2 when we don't have AVX1 anyway.

I suspect it was also incompetence on my part. Given the differences I'm seeing now I can't believe there'd be *no* difference in my tests if I'd done them properly.

[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

2013 Nov 23

...port for XSAVE, XRESTORE and AVX, and XGETBV
@@ -138,15 +215,12 @@ std::string sys::getHostCPUName() {
   // switch, then we have full AVX support.
   const unsigned AVXBits = (1 << 27) | (1 << 28);
   bool HasAVX = ((ECX & AVXBits) == AVXBits) && OSHasAVXSupport();
+  bool HasAVX2 = HasAVX && MaxLeaf >= 0x7 &&
+                 !GetX86CpuIDAndInfoEx(0x7, 0x0, &EAX, &EBX, &ECX, &EDX) &&
+                 (EBX & 0x20);
   GetX86CpuIDAndInfo(0x80000001, &EAX, &EBX, &ECX, &EDX);
   bool Em64T = (EDX >> 29) &...

[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

2013 Nov 23

I agree with Tim: you need to implement a GetCpuIDAndInfoEx function in Host.cpp and pass the correct value to ecx. You also need to verify that 7 is a valid leaf, because an invalid leaf is defined to return the highest supported leaf on that processor. So if a processor supports, say, leaf 6 and not leaf 7, then an access to leaf 7 will return the data from leaf 6, causing unrelated bits to be
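
The two checks requested in this reply (confirm that leaf 7 exists, then query it with ECX explicitly set to 0) can be sketched as below. This is a minimal illustration, not the code from the patch: `CpuidRegs`, `CpuidFn`, `detectAVX2` and `hardwareCpuid` are hypothetical names introduced here, and the CPUID query is passed in as a callback so the logic can be exercised without real hardware. The only facts relied on are that CPUID leaf 0 returns the maximum basic leaf in EAX and that AVX2 is reported in EBX bit 5 (the `0x20` mask in the patch) of leaf 7, subleaf 0.

```cpp
#include <cstdint>
#include <functional>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>  // provides the __cpuid_count macro on GCC and Clang
#define SKETCH_HAVE_HW_CPUID 1
#endif

// Register values returned by a single CPUID invocation.
struct CpuidRegs {
  uint32_t eax, ebx, ecx, edx;
};

// Hypothetical abstraction over CPUID: (leaf, subleaf) -> registers.
using CpuidFn = std::function<CpuidRegs(uint32_t leaf, uint32_t subleaf)>;

// Returns true only if AVX is already known to be usable, leaf 7 is a
// valid leaf, and CPUID.(EAX=7, ECX=0):EBX bit 5 (AVX2) is set.
inline bool detectAVX2(const CpuidFn &cpuid, bool hasAVX) {
  if (!hasAVX)
    return false;  // no point querying AVX2 without AVX
  const uint32_t maxLeaf = cpuid(0, 0).eax;  // highest supported basic leaf
  if (maxLeaf < 7)
    return false;  // leaf 7 would return data from another leaf
  return (cpuid(7, 0).ebx & (1u << 5)) != 0;  // subleaf 0, explicitly
}

#ifdef SKETCH_HAVE_HW_CPUID
// Real query; __cpuid_count loads the subleaf into ECX before executing CPUID.
inline CpuidRegs hardwareCpuid(uint32_t leaf, uint32_t subleaf) {
  CpuidRegs r{0, 0, 0, 0};
  __cpuid_count(leaf, subleaf, r.eax, r.ebx, r.ecx, r.edx);
  return r;
}
#endif
```

On an x86 host built with GCC or Clang, `detectAVX2(hardwareCpuid, hasAVX)` would perform the real query, where `hasAVX` stands for the result of the existing AVX and OS-support check. Passing the subleaf explicitly is what removes the dependence on whatever happened to be in ECX.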

how to force llvm generate gather intrinsic

2016 Feb 26

If I'm understanding correctly, you're saying that vgather* is slow on all of Excavator, Haswell, Broadwell, and Skylake (client). Therefore, we will not generate it for any of those machines. Even if that's true, we should not define "gatherIsSlow()" as "hasAVX2() && !hasAVX512()". It could break for some hypothetical future processor that manages to implement it properly. The AVX2 spec includes gather; whether it's slow or fast is an implementation detail. We need a feature bit / cost model entry somewhere to signify this, so we're no...
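
The separation argued for here, between "the ISA has gather" and "gather is slow on this implementation", might look roughly like the sketch below. It is an illustration only: `SubtargetSketch`, `FeatureSlowGather` and the other names are hypothetical and are not taken from LLVM's X86 backend.

```cpp
#include <bitset>

// Hypothetical feature bits; FeatureSlowGather is a tuning flag that is
// set per CPU model, independently of which ISA extensions are present.
enum Feature { FeatureAVX2, FeatureAVX512, FeatureSlowGather, NumFeatures };

struct SubtargetSketch {
  std::bitset<NumFeatures> Bits;

  bool hasAVX2() const { return Bits[FeatureAVX2]; }
  bool hasAVX512() const { return Bits[FeatureAVX512]; }

  // ISA availability: the AVX2 spec includes gather.
  bool hasGather() const { return hasAVX2(); }

  // Tuning decision: read from its own bit, NOT derived as
  // hasAVX2() && !hasAVX512().
  bool gatherIsSlow() const { return Bits[FeatureSlowGather]; }

  bool shouldUseGather() const { return hasGather() && !gatherIsSlow(); }
};
```

With this split, a hypothetical future AVX2-only processor with a fast gather simply leaves `FeatureSlowGather` unset, and no predicate elsewhere has to change.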

[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

2013 Nov 22

>> + bool HasAVX2 = HasAVX && !GetX86CpuIDAndInfo(0x7, &EAX, &EBX, &ECX, &EDX) &&
>> +                (EBX & 0x20);
>
> I don't think this guarantees %ecx is 0, does it? Wasn't that the
> entire reason the original code went wrong?

I don’t remember really...

VSelect Instruction Error

2017 Sep 21

Hello, I am getting this error. What instruction is required to be implemented?

LLVM ERROR: Cannot select: t22: v32i32 = vselect t724, t11, t16
  t724: v32i32,ch = load<LD128[FixedStack1]> t723, FrameIndex:i64<1>, undef:i64
  t659: i64 = FrameIndex<1>
  t10: i64 = undef
  t11: v32i32,ch = load<LD128[%sunkaddr45](align=4)(tbaa=<0x481f1e8>)> t0, t8, undef:i64

how to force llvm generate gather intrinsic

2016 Feb 26

...I'm understanding correctly, you're saying that vgather* is slow on all
> of Excavator, Haswell, Broadwell, and Skylake (client). Therefore, we will
> not generate it for any of those machines.
>
> Even if that's true, we should not define "gatherIsSlow()" as "hasAVX2()
> && !hasAVX512()". It could break for some hypothetical future processor
> that manages to implement it properly. The AVX2 spec includes gather;
> whether it's slow or fast is an implementation detail. We need a feature
> bit / cost model entry somewhere to signify t...

[LLVMdev] AVX2 Cost Table in X86TargetTransformInfo

2015 May 04

Thanks Nadav for the info. It clears up my query :) Yes, it's an integer ADD, and since AVX2 supports 256-bit integer arithmetic, its cost is less than on AVX1. One query though: shouldn't the cost of integer ADD/SUB/MUL (which would be 1) then be explicitly specified in the AVX2 cost table? Because right now this entry is missing and the cost of these operations is taken from BaseTTI (which is
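
The behaviour being asked about, a per-subtarget cost table that is consulted first with a fallback to a generic base cost when no entry matches, can be sketched as follows. This is not LLVM's actual cost-table API: `CostEntry`, `findEntry` and `getArithCost` are hypothetical names, and the operation and type strings are placeholders chosen only for illustration.

```cpp
#include <string>
#include <vector>

// One row of a hypothetical per-subtarget cost table.
struct CostEntry {
  std::string Op;    // e.g. "add"
  std::string Type;  // e.g. "v8i32"
  unsigned Cost;
};

// Linear search; returns nullptr when the table has no matching row.
inline const CostEntry *findEntry(const std::vector<CostEntry> &Table,
                                  const std::string &Op,
                                  const std::string &Type) {
  for (const CostEntry &E : Table)
    if (E.Op == Op && E.Type == Type)
      return &E;
  return nullptr;
}

// Consult the subtarget-specific table first; if the entry is missing,
// fall back to the cost supplied by the generic (base) model.
inline unsigned getArithCost(const std::vector<CostEntry> &SubtargetTable,
                             const std::string &Op, const std::string &Type,
                             unsigned BaseCost) {
  if (const CostEntry *E = findEntry(SubtargetTable, Op, Type))
    return E->Cost;
  return BaseCost;
}
```

Under this scheme a missing entry is not an error; it only means the generic estimate is used, which is why adding an explicit row matters mainly when the generic estimate differs from the intended cost.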

how to force llvm generate gather intrinsic

2016 Feb 26

No. Gather operation is slow on AVX2 processors.

- Elena

From: zhi chen [mailto:zchenhn at gmail.com]
Sent: Thursday, February 25, 2016 20:48
To: Sanjay Patel <spatel at rotateright.com>
Cc: Demikhovsky, Elena <elena.demikhovsky at intel.com>; Nema, Ashutosh <Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] how to force

how to force llvm generate gather intrinsic

2016 Feb 25

It seems that http://reviews.llvm.org/D15690 only implemented gather/scatter for AVX-512, but not for AVX/AVX2. Is there any plan to enable gather for AVX/2?

Thanks.
Best, Zhi

On Thu, Feb 25, 2016 at 8:28 AM, Sanjay Patel <spatel at rotateright.com> wrote:
> I don't think gather has been enabled for AVX2 as of r261875.
> Masked load/store were enabled for AVX with:
>