On Wed, September 24, 2008 12:10 am, Evan Cheng wrote:
>
> On Sep 23, 2008, at 7:17 PM, David Greene wrote:
>
>> Chris Lattner wrote:
>>> On Sep 23, 2008, at 11:26 AM, David Greene wrote:
>>>
>>>> Are there any examples of using tablegen to generate multiple machine
>>>> instructions from a single pattern? Or do these cases always have to be
>>>> manually expanded?
>>>
>>> PPC has a bunch of examples, for example:
>>>
>>> // Arbitrary immediate support. Implement in terms of LIS/ORI.
>>> def : Pat<(i32 imm:$imm),
>>>           (ORI (LIS (HI16 imm:$imm)), (LO16 imm:$imm))>;
>>
>> Yep, I actually found some x86 ones buried in the .td files. :)
>>
>> So now I have a couple of other questions.
>>
>> I wrote a pattern that looks something like the above in form, but how
>> do I tell the selection DAG to prefer my pattern over another that
>> already exists. I can't easily just disable that other pattern because
>> it generates Machine Instruction opcode enums that are assumed to be
>> available in other parts of the x86 codegen.
>
> Try AddedComplexity = n to increase "goodness" of the pattern. It's a
> bit of a hack.
>
>> So given two patterns that match the same thing, what's the tiebreaker?
>> I thought it was order in the .td file but that doesn't appear to be the
>> case. I put my pattern first and it isn't selected. I change the other
>> pattern slightly so it won't match anything and then my pattern gets
>> used (so I know my pattern is valid).
>>
>> Also, I really wanted to express this pattern as transforming from one
>> DAG to another, not down to machine instructions. I saw this in
>> x86InstSSE.td:
>>
>> // FIXME: may not be able to eliminate this movss with coalescing the src and
>> // dest register classes are different. We really want to write this pattern
>> // like this:
>> // def : Pat<(f32 (vector_extract (v4f32 VR128:$src), (iPTR 0))),
>> //           (f32 FR32:$src)>;
>>
>> (this is actually a very useful and important pattern, I wish it was
>> available!)
>
> Right. It would be nice to be able to eliminate the unnecessary
> movss. It hasn't shown up on my radar so I haven't really thought out
> the right way to model this. I can see a couple of options:
>
> 1. Treat these instructions as cross register class copies. The src
>    and dst classes are different (VR128 and FR32) but "compatible".
> 2. Model it as extract_subreg which coalescer can eliminate.
>
> #2 is conceptually correct. The problem is 128 bit XMM0 is the same
> register as 32 bit (or 64 bit) XMM0. So it's not possible to define
> the super-register / sub-register relationship.

I don't understand the problem with subregs here. Is it just a
naming issue? That can be solved by introducing alternate names,
like XMM0_32 and XMM0_64, for each of the subregs. They could
still be printed as "xmm0" in the assembly output of course.

Dan

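A toy model of the tiebreaker question quoted above, not LLVM's actual
code. It assumes that candidate patterns are tried in descending order of
a complexity score and that AddedComplexity is simply added to that score,
which would explain why position in the .td file made no difference. The
Pattern class, the match_order function, and the numeric scores are all
hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    name: str
    base_complexity: int       # made-up score derived from the pattern's shape
    added_complexity: int = 0  # models the AddedComplexity knob

    @property
    def score(self) -> int:
        return self.base_complexity + self.added_complexity


def match_order(patterns):
    """Return patterns in the order this model tries them: highest score first."""
    # sorted() is stable, so list order only matters when scores are equal.
    return sorted(patterns, key=lambda p: -p.score)


existing = Pattern("existing", base_complexity=5)
mine = Pattern("mine", base_complexity=3)
boosted = Pattern("mine", base_complexity=3, added_complexity=10)

# "mine" is listed first in both cases, as in the .td file experiment.
first_without_boost = match_order([mine, existing])[0].name
first_with_boost = match_order([boosted, existing])[0].name
```

In this model the new pattern loses despite being listed first, and wins
once its score is raised above the existing pattern's.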
On Sep 24, 2008, at 8:44 AM, Dan Gohman wrote:

>>> So given two patterns that match the same thing, what's the tiebreaker?
>>> I thought it was order in the .td file but that doesn't appear to be the
>>> case. I put my pattern first and it isn't selected. I change the other
>>> pattern slightly so it won't match anything and then my pattern gets
>>> used (so I know my pattern is valid).
>>>
>>> Also, I really wanted to express this pattern as transforming from one
>>> DAG to another, not down to machine instructions. I saw this in
>>> x86InstSSE.td:
>>>
>>> // FIXME: may not be able to eliminate this movss with coalescing the src and
>>> // dest register classes are different. We really want to write this pattern
>>> // like this:
>>> // def : Pat<(f32 (vector_extract (v4f32 VR128:$src), (iPTR 0))),
>>> //           (f32 FR32:$src)>;
>>>
>>> (this is actually a very useful and important pattern, I wish it was
>>> available!)
>>
>> Right. It would be nice to be able to eliminate the unnecessary
>> movss. It hasn't shown up on my radar so I haven't really thought out
>> the right way to model this. I can see a couple of options:
>>
>> 1. Treat these instructions as cross register class copies. The src
>>    and dst classes are different (VR128 and FR32) but "compatible".
>> 2. Model it as extract_subreg which coalescer can eliminate.
>>
>> #2 is conceptually correct. The problem is 128 bit XMM0 is the same
>> register as 32 bit (or 64 bit) XMM0. So it's not possible to define
>> the super-register / sub-register relationship.
>
> I don't understand the problem with subregs here. Is it just a
> naming issue? That can be solved by introducing alternate names,
> like XMM0_32 and XMM0_64, for each of the subregs. They could
> still be printed as "xmm0" in the assembly output of course.

Right. That's a workable solution. However, it still adds complexity:

    XMM0_32 = MOVPS2SSrr XMM0

We need to teach the allocator that the two registers are the "same" and
that this is an identity copy.

Evan

>
> Dan
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

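A rough sketch of the identity-copy point above, under stated assumptions:
it is not how the LLVM register allocator or coalescer is implemented. It
assumes each alias name (XMM0_32, XMM0_64) can be mapped to the physical
register it occupies, and that MOVPS2SSrr can be treated as a plain copy.
The PHYSICAL table, COPY_OPCODES set, and both functions are hypothetical.

```python
# Hypothetical alias table: every register name maps to the physical
# register it occupies (names follow the XMM0_32 / XMM0_64 suggestion).
PHYSICAL = {
    "XMM0": "xmm0", "XMM0_64": "xmm0", "XMM0_32": "xmm0",
    "XMM1": "xmm1", "XMM1_64": "xmm1", "XMM1_32": "xmm1",
}

# Opcodes treated as plain copies in this model (an assumption).
COPY_OPCODES = {"MOVPS2SSrr"}


def is_identity_copy(opcode, dst, src):
    """True if the instruction copies a physical register onto itself."""
    return opcode in COPY_OPCODES and PHYSICAL[dst] == PHYSICAL[src]


def drop_identity_copies(instrs):
    """Remove copies whose source and destination share a physical register."""
    return [i for i in instrs if not is_identity_copy(*i)]


# Each instruction is an (opcode, dst, src) tuple.
program = [
    ("MOVPS2SSrr", "XMM0_32", "XMM0"),  # same physical register: removable
    ("MOVPS2SSrr", "XMM1_32", "XMM0"),  # different registers: must stay
]
remaining = drop_identity_copies(program)
```

The extra table is the added complexity in question: without it, XMM0_32
and XMM0 look like unrelated registers and the first copy cannot be removed.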
On Sep 24, 2008, at 8:44 AM, Dan Gohman wrote:

>> #2 is conceptually correct. The problem is 128 bit XMM0 is the same
>> register as 32 bit (or 64 bit) XMM0. So it's not possible to define
>> the super-register / sub-register relationship.
>
> I don't understand the problem with subregs here. Is it just a
> naming issue? That can be solved by introducing alternate names,
> like XMM0_32 and XMM0_64, for each of the subregs. They could
> still be printed as "xmm0" in the assembly output of course.

This is what the PPC64 backend does. "X0" (64-bit GPR) and "R0" (32-bit
GPR) both print as "r0".

-Chris

On Sep 24, 2008, at 10:16 AM, Chris Lattner wrote:

> On Sep 24, 2008, at 8:44 AM, Dan Gohman wrote:
>>> #2 is conceptually correct. The problem is 128 bit XMM0 is the same
>>> register as 32 bit (or 64 bit) XMM0. So it's not possible to define
>>> the super-register / sub-register relationship.
>>
>> I don't understand the problem with subregs here. Is it just a
>> naming issue? That can be solved by introducing alternate names,
>> like XMM0_32 and XMM0_64, for each of the subregs. They could
>> still be printed as "xmm0" in the assembly output of course.
>
> this is what the PPC64 backend does. "X0" (64-bit GPR) and "R0" (32-bit
> GPR) both print as "r0".

Somewhat different scenario there. On PPC64, these are actually distinct
registers, right?

Evan

>
> -Chris