Adam Strzelecki
2013-Sep-12 21:12 UTC
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
> Anyway, thanks very much for the information. Hopefully that'll let me > track things down.Let me know if you need some more information or dumps.> Would you mind me taking a day or so to investigate what's going on > here properly? Introducing a volatile to work around a bug in Clang > itself just seems perverse to me. (And we shouldn't let a CodeGen bug > dictate how we implement our functions either).Looking at the assembly there's something wrong with SI that is not getting saved anywhere after CPUID and 0x20 bit test before it gets overwritten by LEA. 332: mov eax,0x7 337: mov rsi,rbx 33a: cpuid 33c: xchg rsi,rbx 33f: and esi,0x20 342: shr esi,0x5 345: lea rbp,[rip+0x0] # 34c <llvm::sys::getHostCPUName()+0xbc> 34c: lea r12,[rip+0x0] # 353 <llvm::sys::getHostCPUName()+0xc3> 353: cmove rbp,r12 357: lea rdi,[rsp+0x188] 35f: lea rsi,[rip+0x0] # 366 <llvm::sys::getHostCPUName()+0xd6> In both other cases (2) & (3) SI is saved into stack region.> I promise I'll do the review of your code after that.Thanks. Regards, -- Adam
Craig Topper
2013-Sep-13 04:29 UTC
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
Pretty sure you need to check EAX>=7 from cpuid leaf 0 before calling leaf 7 and you need to use the pass ECX=0 to leaf 7. See lib/Target/X86/X86Subtarget.cpp which uses a GetX86CpuIDAndInfoEx function to pass EAX and ECX to cpuid. I don't think it explains your compiler bug though. On Thu, Sep 12, 2013 at 2:12 PM, Adam Strzelecki <ono at java.pl> wrote:> > Anyway, thanks very much for the information. Hopefully that'll let me > > track things down. > > Let me know if you need some more information or dumps. > > > Would you mind me taking a day or so to investigate what's going on > > here properly? Introducing a volatile to work around a bug in Clang > > itself just seems perverse to me. (And we shouldn't let a CodeGen bug > > dictate how we implement our functions either). > > Looking at the assembly there's something wrong with SI that is not > getting saved anywhere after CPUID and 0x20 bit test before it gets > overwritten by LEA. > > 332: mov eax,0x7 > 337: mov rsi,rbx > 33a: cpuid > 33c: xchg rsi,rbx > 33f: and esi,0x20 > 342: shr esi,0x5 > 345: lea rbp,[rip+0x0] # 34c > <llvm::sys::getHostCPUName()+0xbc> > 34c: lea r12,[rip+0x0] # 353 > <llvm::sys::getHostCPUName()+0xc3> > 353: cmove rbp,r12 > 357: lea rdi,[rsp+0x188] > 35f: lea rsi,[rip+0x0] # 366 > <llvm::sys::getHostCPUName()+0xd6> > > In both other cases (2) & (3) SI is saved into stack region. > > > I promise I'll do the review of your code after that. > > Thanks. > > Regards, > -- > Adam > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >-- ~Craig -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130912/e205b248/attachment.html>
Craig Topper
2013-Sep-13 04:36 UTC
[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
Actually there is no miscompile there as esi isn't needed. The flags are which the cmove is using. 342: shr esi,0x5 345: lea rbp,[rip+0x0] # 34c <llvm::sys::getHostCPUName()+0xbc> 34c: lea r12,[rip+0x0] # 353 <llvm::sys::getHostCPUName()+0xc3> 353: cmove rbp,r12 <- this is dependent on the flags from the shift. I think your real problem is that garbage went into ECX instead of 0 and caused cpuid to return 0. On Thu, Sep 12, 2013 at 9:29 PM, Craig Topper <craig.topper at gmail.com>wrote:> Pretty sure you need to check EAX>=7 from cpuid leaf 0 before calling leaf > 7 and you need to use the pass ECX=0 to leaf 7. See > lib/Target/X86/X86Subtarget.cpp which uses a GetX86CpuIDAndInfoEx function > to pass EAX and ECX to cpuid. > > I don't think it explains your compiler bug though. > > > On Thu, Sep 12, 2013 at 2:12 PM, Adam Strzelecki <ono at java.pl> wrote: > >> > Anyway, thanks very much for the information. Hopefully that'll let me >> > track things down. >> >> Let me know if you need some more information or dumps. >> >> > Would you mind me taking a day or so to investigate what's going on >> > here properly? Introducing a volatile to work around a bug in Clang >> > itself just seems perverse to me. (And we shouldn't let a CodeGen bug >> > dictate how we implement our functions either). >> >> Looking at the assembly there's something wrong with SI that is not >> getting saved anywhere after CPUID and 0x20 bit test before it gets >> overwritten by LEA. >> >> 332: mov eax,0x7 >> 337: mov rsi,rbx >> 33a: cpuid >> 33c: xchg rsi,rbx >> 33f: and esi,0x20 >> 342: shr esi,0x5 >> 345: lea rbp,[rip+0x0] # 34c >> <llvm::sys::getHostCPUName()+0xbc> >> 34c: lea r12,[rip+0x0] # 353 >> <llvm::sys::getHostCPUName()+0xc3> >> 353: cmove rbp,r12 >> 357: lea rdi,[rsp+0x188] >> 35f: lea rsi,[rip+0x0] # 366 >> <llvm::sys::getHostCPUName()+0xd6> >> >> In both other cases (2) & (3) SI is saved into stack region. >> >> > I promise I'll do the review of your code after that. >> >> Thanks. >> >> Regards, >> -- >> Adam >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> > > > > -- > ~Craig >-- ~Craig -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130912/ec07a8c0/attachment.html>
Maybe Matching Threads
- [LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
- [LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
- [LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
- [LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)
- [LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)