Craig Topper
2013-Sep-13 04:36 UTC
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
Actually there is no miscompile there as esi isn't needed. The flags are which the cmove is using. 342: shr esi,0x5 345: lea rbp,[rip+0x0] # 34c <llvm::sys::getHostCPUName()+0xbc> 34c: lea r12,[rip+0x0] # 353 <llvm::sys::getHostCPUName()+0xc3> 353: cmove rbp,r12 <- this is dependent on the flags from the shift. I think your real problem is that garbage went into ECX instead of 0 and caused cpuid to return 0. On Thu, Sep 12, 2013 at 9:29 PM, Craig Topper <craig.topper at gmail.com>wrote:> Pretty sure you need to check EAX>=7 from cpuid leaf 0 before calling leaf > 7 and you need to use the pass ECX=0 to leaf 7. See > lib/Target/X86/X86Subtarget.cpp which uses a GetX86CpuIDAndInfoEx function > to pass EAX and ECX to cpuid. > > I don't think it explains your compiler bug though. > > > On Thu, Sep 12, 2013 at 2:12 PM, Adam Strzelecki <ono at java.pl> wrote: > >> > Anyway, thanks very much for the information. Hopefully that'll let me >> > track things down. >> >> Let me know if you need some more information or dumps. >> >> > Would you mind me taking a day or so to investigate what's going on >> > here properly? Introducing a volatile to work around a bug in Clang >> > itself just seems perverse to me. (And we shouldn't let a CodeGen bug >> > dictate how we implement our functions either). >> >> Looking at the assembly there's something wrong with SI that is not >> getting saved anywhere after CPUID and 0x20 bit test before it gets >> overwritten by LEA. >> >> 332: mov eax,0x7 >> 337: mov rsi,rbx >> 33a: cpuid >> 33c: xchg rsi,rbx >> 33f: and esi,0x20 >> 342: shr esi,0x5 >> 345: lea rbp,[rip+0x0] # 34c >> <llvm::sys::getHostCPUName()+0xbc> >> 34c: lea r12,[rip+0x0] # 353 >> <llvm::sys::getHostCPUName()+0xc3> >> 353: cmove rbp,r12 >> 357: lea rdi,[rsp+0x188] >> 35f: lea rsi,[rip+0x0] # 366 >> <llvm::sys::getHostCPUName()+0xd6> >> >> In both other cases (2) & (3) SI is saved into stack region. >> >> > I promise I'll do the review of your code after that. >> >> Thanks. >> >> Regards, >> -- >> Adam >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> > > > > -- > ~Craig >-- ~Craig -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130912/ec07a8c0/attachment.html>
Tim Northover
2013-Sep-13 06:21 UTC
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
> I think your real problem is that garbage went into ECX instead of 0 and > caused cpuid to return 0.Ah, that looks very likely. The value seems to come from "xorl %eax, %eax" in both good object files, but a previous cpuid in the bad one. Excellent work Craig, I suspect that would have taken me days to find. Tim.
Craig Topper
2013-Sep-13 06:36 UTC
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
It helps that I'm the one who implemented a portion of the code in X86Subtarget.cpp. On Thu, Sep 12, 2013 at 11:21 PM, Tim Northover <t.p.northover at gmail.com>wrote:> > I think your real problem is that garbage went into ECX instead of 0 and > > caused cpuid to return 0. > > Ah, that looks very likely. The value seems to come from "xorl %eax, > %eax" in both good object files, but a previous cpuid in the bad one. > > Excellent work Craig, I suspect that would have taken me days to find. > > Tim. >-- ~Craig -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130912/a7b84298/attachment.html>
Seemingly Similar Threads
- [LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
- [LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
- [LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
- [LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
- [LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)