On Sep 24, 2008, at 10:16 AM, Chris Lattner wrote:

> On Sep 24, 2008, at 8:44 AM, Dan Gohman wrote:
>>> #2 is conceptually correct. The problem is 128 bit XMM0 is the same
>>> register as 32 bit (or 64 bit) XMM0. So it's not possible to define
>>> the super-register / sub-register relationship.
>>
>> I don't understand the problem with subregs here. Is it just a
>> naming issue? That can be solved by introducing alternate names,
>> like XMM0_32 and XMM0_64, for each of the subregs. They could
>> still be printed as "xmm0" in the assembly output of course.
>
> this is what the PPC64 backend does. "X0" (64-bit GPR) and "R0"
> (32-bit GPR) both print as "r0".

Somewhat different scenario there. On PPC64, these are actually distinct registers, right?

Evan

On Sep 24, 2008, at 11:34 AM, Evan Cheng wrote:

> On Sep 24, 2008, at 10:16 AM, Chris Lattner wrote:
>
>> On Sep 24, 2008, at 8:44 AM, Dan Gohman wrote:
>>>> #2 is conceptually correct. The problem is 128 bit XMM0 is the same
>>>> register as 32 bit (or 64 bit) XMM0. So it's not possible to define
>>>> the super-register / sub-register relationship.
>>>
>>> I don't understand the problem with subregs here. Is it just a
>>> naming issue? That can be solved by introducing alternate names,
>>> like XMM0_32 and XMM0_64, for each of the subregs. They could
>>> still be printed as "xmm0" in the assembly output of course.
>>
>> this is what the PPC64 backend does. "X0" (64-bit GPR) and "R0"
>> (32-bit GPR) both print as "r0".
>
> Somewhat different scenario there. On PPC64, these are actually
> distinct registers, right?

No, the 32-bit GPR is a subreg (low 32-bits) of the 64-bit register. IIRC, storing to the 32-bit GPR clears out the top half.

This seems very analogous to the SSE case. However, in full disclosure, we never fixed the coalescing performance issue for PPC64.

-Chris

On Sep 24, 2008, at 1:42 PM, Chris Lattner wrote:

> On Sep 24, 2008, at 11:34 AM, Evan Cheng wrote:
>
>> On Sep 24, 2008, at 10:16 AM, Chris Lattner wrote:
>>
>>> On Sep 24, 2008, at 8:44 AM, Dan Gohman wrote:
>>>>> #2 is conceptually correct. The problem is 128 bit XMM0 is the
>>>>> same register as 32 bit (or 64 bit) XMM0. So it's not possible to
>>>>> define the super-register / sub-register relationship.
>>>>
>>>> I don't understand the problem with subregs here. Is it just a
>>>> naming issue? That can be solved by introducing alternate names,
>>>> like XMM0_32 and XMM0_64, for each of the subregs. They could
>>>> still be printed as "xmm0" in the assembly output of course.
>>>
>>> this is what the PPC64 backend does. "X0" (64-bit GPR) and
>>> "R0" (32-bit GPR) both print as "r0".
>>
>> Somewhat different scenario there. On PPC64, these are actually
>> distinct registers, right?
>
> No, the 32-bit GPR is a subreg (low 32-bits) of the 64-bit register.
> IIRC, storing to the 32-bit GPR clears out the top half.
>
> This seems very analogous to the SSE case. However, in full
> disclosure, we never fixed the coalescing performance issue for PPC64.

There's a performance issue, maybe related, with 32-bit subregs of 64-bit GPRs on x86-64 too. CodeGen does eliminate some subreg-related copies, and actually it's improved in this aspect since LLVM 2.3, but there remain more opportunities for improvement.

Dan
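
For readers following the thread, the alternate-names idea can be sketched roughly as below. This is only an illustration in plain C++, not LLVM's actual register description (which is written in TableGen) and not any real LLVM API. The `Reg` struct, the `kRegs` table, and the helpers `findReg`, `isSubRegEq`, and `overlaps` are all hypothetical names made up for this sketch; only the register names XMM0, XMM0_64, XMM0_32 and the printed name "xmm0" come from the messages above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical register model for illustration only; not LLVM's data structures.
struct Reg {
    std::string name;     // unique internal name used by the code generator
    std::string asmName;  // name printed in the assembly output
    unsigned bits;        // register width in bits
    int superReg;         // index of the directly enclosing register, -1 if none
};

// XMM0 plus alternate names for its low 64 and low 32 bits.
// All three entries print as "xmm0".
const std::vector<Reg> kRegs = {
    {"XMM0",    "xmm0", 128, -1},
    {"XMM0_64", "xmm0",  64,  0},
    {"XMM0_32", "xmm0",  32,  1},
};

// Look up a register index by its internal name; returns -1 if unknown.
int findReg(const std::string& name) {
    for (std::size_t i = 0; i < kRegs.size(); ++i)
        if (kRegs[i].name == name) return static_cast<int>(i);
    return -1;
}

// True if register a is register b, or is nested somewhere inside b.
bool isSubRegEq(int a, int b) {
    for (int r = a; r != -1; r = kRegs[static_cast<std::size_t>(r)].superReg)
        if (r == b) return true;
    return false;
}

// Two registers overlap if one contains the other.
bool overlaps(int a, int b) {
    return isSubRegEq(a, b) || isSubRegEq(b, a);
}
```

With distinct internal names, the super-register / sub-register relationship becomes expressible (`XMM0_32` is inside `XMM0_64`, which is inside `XMM0`), while the printer still emits "xmm0" for all three. The sketch says nothing about the coalescing cost mentioned in the thread.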