Craig Topper
2013-Nov-23 04:59 UTC
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
I agree with Tim, you need to implement a GetCpuIDAndInfoEx function in Host.cpp and pass the correct value to ecx. Also you need to verify that 7 is a valid leaf because an invalid leaf is defined to return the highest supported leaf on that processor. So if a processor supports say leaf 6 and not leaf 7, then an access leaf 7 will return the data from leaf 6 causing unrelated bits to be interpretted as feature flags for AVX etc. On Fri, Nov 22, 2013 at 7:34 AM, Tim Northover <t.p.northover at gmail.com>wrote:> > I don’t remember really, but presuming the conclusions of the > discussion, seems it is fixed now. It was something about registers when > using inline assembly. Anyway this works just fine on all my Haswell > machines. > > I think that's more coincidence than anything else (something > perturbed in your host compiler's backend). If you look at > lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp there's a > GetCpuIDAndInfoEx function which specifically sets %ecx to a valid > value before executing "cpuid". > > The code in Host.cpp needs to do something similar. > > Cheers. > > Tim. > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >-- ~Craig -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20131122/4c209343/attachment.html>
Adam Strzelecki
2013-Nov-23 16:35 UTC
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
Here we go, updated patch following your advice checking max leaf and porting cpuidex for subleaf (ECX) 0. NOTE: I’ve set Haswell to be not only 60, but also 63, 69 & 70 model, following changes in Linux kernel & Xen. Also set 62 as Ivy Bridge EP aka E5 v3 (which I has in my workstation). Cheers, -- Adam Detects x86 family 6 model 60, 63, 69, 70 CPU that has AVX2 CPUID leaf 7 subleaf 0 AVX2 flag as core-avx2. Port GetX86CpuIDAndInfoEx from X86MCTargetDesc to support x86 CPUID subleafs. Also detect family 6 model 62 (0x3E) Ivy Bridge EP as core-avx-i. --- lib/Support/Host.cpp | 96 ++++++++++++++++++++++++++++++++++++++--- lib/Target/X86/X86Subtarget.cpp | 7 ++- 2 files changed, 96 insertions(+), 7 deletions(-) diff --git a/lib/Support/Host.cpp b/lib/Support/Host.cpp index 380df6b..6e9a5c9 100644 --- a/lib/Support/Host.cpp +++ b/lib/Support/Host.cpp @@ -95,6 +95,75 @@ static bool GetX86CpuIDAndInfo(unsigned value, unsigned *rEAX, unsigned *rEBX, #endif } +/// GetX86CpuIDAndInfoEx - Execute the specified cpuid with subleaf and return the +/// 4 values in the specified arguments. If we can't run cpuid on the host, +/// return true. +bool GetX86CpuIDAndInfoEx(unsigned value, unsigned subleaf, unsigned *rEAX, + unsigned *rEBX, unsigned *rECX, unsigned *rEDX) { +#if defined(__x86_64__) || defined(_M_AMD64) || defined (_M_X64) + #if defined(__GNUC__) + // gcc doesn't know cpuid would clobber ebx/rbx. Preseve it manually. + asm ("movq\t%%rbx, %%rsi\n\t" + "cpuid\n\t" + "xchgq\t%%rbx, %%rsi\n\t" + : "=a" (*rEAX), + "=S" (*rEBX), + "=c" (*rECX), + "=d" (*rEDX) + : "a" (value), + "c" (subleaf)); + return false; + #elif defined(_MSC_VER) + // __cpuidex was added in MSVC++ 9.0 SP1 + #if (_MSC_VER > 1500) || (_MSC_VER == 1500 && _MSC_FULL_VER >= 150030729) + int registers[4]; + __cpuidex(registers, value, subleaf); + *rEAX = registers[0]; + *rEBX = registers[1]; + *rECX = registers[2]; + *rEDX = registers[3]; + return false; + #else + return true; + #endif + #else + return true; + #endif +#elif defined(i386) || defined(__i386__) || defined(__x86__) || defined(_M_IX86) + #if defined(__GNUC__) + asm ("movl\t%%ebx, %%esi\n\t" + "cpuid\n\t" + "xchgl\t%%ebx, %%esi\n\t" + : "=a" (*rEAX), + "=S" (*rEBX), + "=c" (*rECX), + "=d" (*rEDX) + : "a" (value), + "c" (subleaf)); + return false; + #elif defined(_MSC_VER) + __asm { + mov eax,value + mov ecx,subleaf + cpuid + mov esi,rEAX + mov dword ptr [esi],eax + mov esi,rEBX + mov dword ptr [esi],ebx + mov esi,rECX + mov dword ptr [esi],ecx + mov esi,rEDX + mov dword ptr [esi],edx + } + return false; + #else + return true; + #endif +#else + return true; +#endif +} + static bool OSHasAVXSupport() { #if defined(__GNUC__) // Check xgetbv; this uses a .byte sequence instead of the instruction @@ -131,6 +200,14 @@ std::string sys::getHostCPUName() { unsigned Model = 0; DetectX86FamilyModel(EAX, Family, Model); + union { + unsigned u[3]; + char c[12]; + } text; + + GetX86CpuIDAndInfo(0, &EAX, text.u+0, text.u+2, text.u+1); + + unsigned MaxLeaf = EAX; bool HasSSE3 = (ECX & 0x1); bool HasSSE41 = (ECX & 0x80000); // If CPUID indicates support for XSAVE, XRESTORE and AVX, and XGETBV @@ -138,15 +215,12 @@ std::string sys::getHostCPUName() { // switch, then we have full AVX support. const unsigned AVXBits = (1 << 27) | (1 << 28); bool HasAVX = ((ECX & AVXBits) == AVXBits) && OSHasAVXSupport(); + bool HasAVX2 = HasAVX && MaxLeaf >= 0x7 && + !GetX86CpuIDAndInfoEx(0x7, 0x0, &EAX, &EBX, &ECX, &EDX) && + (EBX & 0x20); GetX86CpuIDAndInfo(0x80000001, &EAX, &EBX, &ECX, &EDX); bool Em64T = (EDX >> 29) & 0x1; - union { - unsigned u[3]; - char c[12]; - } text; - - GetX86CpuIDAndInfo(0, &EAX, text.u+0, text.u+2, text.u+1); if (memcmp(text.c, "GenuineIntel", 12) == 0) { switch (Family) { case 3: @@ -254,10 +328,20 @@ std::string sys::getHostCPUName() { // Ivy Bridge: case 58: + case 62: // Ivy Bridge EP // Not all Ivy Bridge processors support AVX (such as the Pentium // versions instead of the i7 versions). return HasAVX ? "core-avx-i" : "corei7"; + // Haswell: + case 60: + case 63: + case 69: + case 70: + // Not all Haswell processors support AVX too (such as the Pentium + // versions instead of the i7 versions). + return HasAVX2 ? "core-avx2" : "corei7"; + case 28: // Most 45 nm Intel Atom processors case 38: // 45 nm Atom Lincroft case 39: // 32 nm Atom Medfield diff --git a/lib/Target/X86/X86Subtarget.cpp b/lib/Target/X86/X86Subtarget.cpp index fa04c38..597fccb 100644 --- a/lib/Target/X86/X86Subtarget.cpp +++ b/lib/Target/X86/X86Subtarget.cpp @@ -285,7 +285,12 @@ void X86Subtarget::AutoDetectSubtargetFeatures() { (Family == 6 && Model == 0x2F) || // Westmere: Westmere-EX (Family == 6 && Model == 0x2A) || // SandyBridge (Family == 6 && Model == 0x2D) || // SandyBridge: SandyBridge-E* - (Family == 6 && Model == 0x3A))) {// IvyBridge + (Family == 6 && Model == 0x3A) || // IvyBridge + (Family == 6 && Model == 0x3E) || // IvyBridge EP + (Family == 6 && Model == 0x3C) || // Haswell + (Family == 6 && Model == 0x3F) || // ... + (Family == 6 && Model == 0x45) || // ... + (Family == 6 && Model == 0x46))) { // ... IsUAMemFast = true; ToggleFeature(X86::FeatureFastUAMem); } -- 1.8.3.4 (Apple Git-47)
Tim Northover
2013-Nov-25 09:57 UTC
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
> Here we go, updated patch following your advice checking max leaf and porting cpuidex for subleaf (ECX) 0.Thanks for the updated version Adam, that looks good! I've committed this as r195632 and asked Bill (the release manager) whether it can be merged to the 3.4 branch, I think it would be a good idea too. Cheers. Tim
Apparently Analagous Threads
- [LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
- [LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
- [LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
- [LLVMdev] Patch to compile LLVM with MSVC 2010
- [LLVMdev] Patch to compile LLVM with MSVC 2010