similar to: [LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

Displaying 20 results from an estimated 600 matches similar to: "[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)"

2013 Sep 12
0
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
Hi Adam, > * I have marked HasAVX2 as "volatile", since otherwise it gets > magically zeroed (by optimizer?) when compiling clang with latest > clang build from trunk That's far more worrying to me than not being able to detect Haswell. I can't reproduce the problem here at the moment: both debug and release builds give identical assembly for Host.cpp. I don't
2013 Sep 12
3
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
> That's far more worrying to me than not being able to detect Haswell. > I can't reproduce the problem here at the moment: both debug and > release builds give identical assembly for Host.cpp. OK. I know the reason you cannot reproduce it, before posting the patch I've decided to check for AVX before checking AVX2, just not to cpuid AVX2 when we don't have AVX1 anyway.
2013 Nov 22
2
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
> I promise I'll do the review of your code after that. Tim, I don’t want to push too much. But since there’s a 3.4 release on the horizon, maybe you could find a moment to review this patch, especially since Haswell has been out for a few months. Cheers, -- Adam --- lib/Support/Host.cpp | 8 ++++++++ lib/Target/X86/X86Subtarget.cpp | 3 ++- 2 files changed, 10 insertions(+), 1
2013 Sep 12
0
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
Hi Adam, > OK. I know the reason you cannot reproduce it, before posting > the patch I've decided to check for AVX before checking AVX2, > just not to cpuid AVX2 when we don't have AVX1 anyway. I suspect it was also incompetence on my part. Given the differences I'm seeing now I can't believe there'd be *no* difference in my tests if I'd done them properly.
2013 Nov 23
2
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
I agree with Tim, you need to implement a GetCpuIDAndInfoEx function in Host.cpp and pass the correct value to ecx. Also you need to verify that 7 is a valid leaf, because an invalid leaf is defined to return the highest supported leaf on that processor. So if a processor supports, say, leaf 6 and not leaf 7, then an access to leaf 7 will return the data from leaf 6, causing unrelated bits to be
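The check described in this reply can be sketched roughly as follows. This is a minimal sketch, not the code from the patch: `CpuidRegs`, `CpuidFn`, and `hasAVX2` are hypothetical names, and the CPUID instruction is abstracted behind a callback (an assumption made here so the logic can be exercised without x86 hardware). The mask 0x20 on EBX is the one used in the patch under review.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical container for the four registers CPUID writes.
struct CpuidRegs {
  std::uint32_t eax, ebx, ecx, edx;
};

// Hypothetical callback standing in for the CPUID instruction:
// it takes the leaf (EAX input) and the subleaf (ECX input).
using CpuidFn =
    std::function<CpuidRegs(std::uint32_t leaf, std::uint32_t subleaf)>;

// Returns true only if AVX is present, leaf 7 is a valid leaf,
// and EBX bit 5 (mask 0x20) is set when leaf 7 is queried with ECX = 0.
inline bool hasAVX2(const CpuidFn &cpuid, bool hasAVX) {
  if (!hasAVX)
    return false;
  // Leaf 0 reports the highest supported basic leaf in EAX.
  const CpuidRegs leaf0 = cpuid(0, 0);
  if (leaf0.eax < 7)
    return false; // Leaf 7 is not valid; its output would be unrelated data.
  // Pass subleaf 0 explicitly so ECX never holds leftover garbage.
  const CpuidRegs leaf7 = cpuid(7, 0);
  return (leaf7.ebx & 0x20u) != 0;
}
```

Passing the subleaf as an explicit argument, rather than relying on whatever happens to be in ECX, is the point both reviewers make in this thread.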
2013 Nov 22
0
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
Hi Adam, > + bool HasAVX2 = HasAVX && !GetX86CpuIDAndInfo(0x7, &EAX, &EBX, &ECX, &EDX) && > + (EBX & 0x20); I don't think this guarantees %ecx is 0, does it? Wasn't that the entire reason the original code went wrong? Cheers. Tim.
2013 Nov 23
0
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
Here we go, updated patch following your advice, checking max leaf and porting cpuidex for subleaf (ECX) 0. NOTE: I’ve set Haswell to be not only model 60, but also models 63, 69 & 70, following changes in the Linux kernel & Xen. Also set 62 as Ivy Bridge EP aka E5 v3 (which I have in my workstation). Cheers, -- Adam Detects x86 family 6 model 60, 63, 69, 70 CPU that has AVX2 CPUID leaf 7 subleaf
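The model mapping listed in this message could look something like the sketch below. It is an illustration only: `family6ModelName` is a hypothetical helper, the returned labels are placeholder strings rather than the CPU names LLVM reports, and the additional AVX2 requirement mentioned in the patch description is left out for brevity.

```cpp
#include <string>

// Hypothetical helper: maps an x86 family 6 model number to a
// microarchitecture label, using only the model numbers named in
// the message above. The labels are illustrative placeholders.
inline std::string family6ModelName(unsigned model) {
  switch (model) {
  case 62: // Ivy Bridge EP, per the message
    return "ivybridge";
  case 60:
  case 63:
  case 69:
  case 70: // Haswell models, per the message
    return "haswell";
  default:
    return "unknown";
  }
}
```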
2013 Nov 22
2
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
>> + bool HasAVX2 = HasAVX && !GetX86CpuIDAndInfo(0x7, &EAX, &EBX, &ECX, &EDX) && >> + (EBX & 0x20); > > I don't think this guarantees %ecx is 0, does it? Wasn't that the > entire reason the original code went wrong? I don’t remember really, but presuming the conclusions of the discussion, seems it is fixed now. It
2013 Sep 12
2
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
> Anyway, thanks very much for the information. Hopefully that'll let me > track things down. Let me know if you need some more information or dumps. > Would you mind me taking a day or so to investigate what's going on > here properly? Introducing a volatile to work around a bug in Clang > itself just seems perverse to me. (And we shouldn't let a CodeGen bug >
2013 Sep 13
0
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
Pretty sure you need to check EAX>=7 from cpuid leaf 0 before calling leaf 7, and you need to pass ECX=0 to leaf 7. See lib/Target/X86/X86Subtarget.cpp which uses a GetX86CpuIDAndInfoEx function to pass EAX and ECX to cpuid. I don't think it explains your compiler bug though. On Thu, Sep 12, 2013 at 2:12 PM, Adam Strzelecki <ono at java.pl> wrote: > > Anyway, thanks
2013 Sep 13
2
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
Actually there is no miscompile there as esi isn't needed. It is the flags that the cmove is using. 342: shr esi,0x5 345: lea rbp,[rip+0x0] # 34c <llvm::sys::getHostCPUName()+0xbc> 34c: lea r12,[rip+0x0] # 353 <llvm::sys::getHostCPUName()+0xc3> 353: cmove rbp,r12 <- this is dependent on the flags from the shift. I think your real problem is
2013 Sep 13
0
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
> I think your real problem is that garbage went into ECX instead of 0 and > caused cpuid to return 0. Ah, that looks very likely. The value seems to come from "xorl %eax, %eax" in both good object files, but a previous cpuid in the bad one. Excellent work Craig, I suspect that would have taken me days to find. Tim.
2013 Nov 22
0
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
> I don’t remember really, but presuming the conclusions of the discussion, seems it is fixed now. It was something about registers when using inline assembly. Anyway this works just fine on all my Haswell machines. I think that's more coincidence than anything else (something perturbed in your host compiler's backend). If you look at lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
2015 May 04
3
[LLVMdev] AVX2 Cost Table in X86TargetTransformInfo
Thanks Nadav for the info. It clears up my query :) Yes, it's an integer ADD, and since AVX2 supports 256-bit integer arithmetic, its cost is lower than with AVX1. One query though - shouldn't the cost of integer ADD/SUB/MUL (which would be 1) then be explicitly specified in the AVX2 cost table? Because right now this entry is missing and the cost of these operations is taken from BaseTTI (which is
2013 Sep 17
2
[LLVMdev] [PATCH] Detect SVN revision and path on Git working copy
On 17 Sep 2013, at 18:18, Mark Lacey <mark.lacey at apple.com> wrote: > This means that from build to build, depending on whether a Git change or SVN change is the most recent on my branch, I will get different results. That seems problematic. If you're not rebasing after pulling in changes from LLVM svn, then you're going to be in for a world of pain later (I, unfortunately,
2014 Sep 19
2
[LLVMdev] predicates vs. requirements [TableGen, X86InstrInfo.td]
> -----Original Message----- > From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] > On Behalf Of Tom Stellard > Sent: 19 September 2014 01:36 > To: Sanjay Patel > Cc: llvmdev at cs.uiuc.edu > Subject: Re: [LLVMdev] predicates vs. requirements [TableGen, > X86InstrInfo.td] > > On Thu, Sep 18, 2014 at 03:25:07PM -0600, Sanjay Patel wrote: >
2014 Mar 13
3
[LLVMdev] Possible bug in getCallPreservedMask for CallingConv::Intel_OCL_BI
Not sure who owns this bit of code, so sending this to the general list. It looks like there may be an unintentional fall through happening in the X86RegisterInfo::getCallPreservedMask function. http://llvm.org/docs/doxygen/html/X86RegisterInfo_8cpp_source.html case CallingConv::Intel_OCL_BI
2014 Sep 18
3
[LLVMdev] predicates vs. requirements [TableGen, X86InstrInfo.td]
I tried to add an 'OptForSize' requirement to a pattern in X86InstrSSE.td, but it appears to be ignored. However, the condition was detected when specified as a predicate. So this doesn't work: def : Pat<(v2f64 (X86VBroadcast (loadf64 addr:$src))), (VMOVDDUPrm addr:$src)>, Requires<[OptForSize]>; But this does: let Predicates = [OptForSize]
2006 Jan 23
2
linux-2.6-xen.hg: Selecting "Xen-compatible" subarchitecture makes most of the drivers unselectable
Hi list! What's the correct way to solve this problem? When you select "Xen-compatible" subarchitecture most of the drivers (network,scsi,etc) disappear from the menuconfig.. Currently linux-2.6-xen.hg repository is unusable (as is).. for dom0 and domU also. -- Pasi Kärkkäinen
2011 Sep 22
3
[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits
Hi Bruno, > Some comments: > > + // Try to synthesize horizontal adds from adds of shuffles. > + if (((Subtarget->hasSSE3() && (VT == MVT::v4f32 || VT == MVT::v2f64)) || > + (Subtarget->hasAVX() && (VT == MVT::v8f32 || VT == MVT::v4f64))) && > + isHorizontalBinOp(LHS, RHS, true)) > > 1) You probably want to do something like: >