On Wednesday 24 September 2008 02:10, Evan Cheng wrote:> > I wrote a pattern that looks something like the above in form, but how > > do I tell the selection DAG to prefer my pattern over another that > > already exists. I can't easily just disable that other pattern > > because > > it generates Machine Instruction opcode enums that are assumed to be > > available in other parts of the x86 codegen. > > Try AddedComplexity = n to increase "goodness" of the pattern. It's a > bit of a hack.Ah, AddedComplexity == goodness. I thought it == badness so I'd tried to add it to the pattern I wanted to disable. Thanks for the pointer.> 1. Treat these instructions as cross register class copies. The src > and dst classes are different (VR128 and FR32) but "compatible".This seems reasonable.> 2. Model it as extract_subreg which coalescer can eliminate. > > #2 is conceptually correct. The problem is 128 bit XMM0 is the same > register as 32 bit (or 64 bit) XMM0. So it's not possible to define > the super-register / sub-register relationship.I'm not seeing how this is "conceptually correct." It's a vector extract, not a subregister. It's just that we want to reuse the same register. Perhaps the answer is to add vector extract support to the coalescer, in the same way you added subregister support. I don't understand the nitty gritty of that, though.> That leaves us with #1. I have added support to coalesce cross-class > copies (see CrossClassJoin in SimpleRegisterCoalescing.cpp).Yep.> Unfortunately, it breaks a few tests and I haven't had the time to > look into them. If that's done, we just need to add the concept of > "compatible register classes" and mark MOVPS2SSrr as a copy and *it > should just work*.What about getting tblgen support for the pattern in the .td file? That would be another way to tackle this and would open up a whole bunch of other opportunities. Instcombine could be entirely expressed as a set of tblgen patterns, for example, which is desireable from a maintenance perspective and well as new development. It's much easier to write patterns than to go through all of the manual examination that currently exists in instcombine. -Dave
On Sep 24, 2008, at 9:15 AM, David Greene wrote:> >> 1. Treat these instructions as cross register class copies. The src >> and dst classes are different (VR128 and FR32) but "compatible". > > This seems reasonable. > >> 2. Model it as extract_subreg which coalescer can eliminate. >> >> #2 is conceptually correct. The problem is 128 bit XMM0 is the same >> register as 32 bit (or 64 bit) XMM0. So it's not possible to define >> the super-register / sub-register relationship. > > I'm not seeing how this is "conceptually correct." It's a vector > extract, not > a subregister. It's just that we want to reuse the same register.It is though. Sub-register is a machine specific concept. It means vector_extract can be modeled as subreg_extract on this machine. Nothing is wrong with thatt.> > > Perhaps the answer is to add vector extract support to the > coalescer, in > the same way you added subregister support. I don't understand the > nitty > gritty of that, though.I don't think that's a good idea. Conceptually vector_extract is very different from a move.> > >> That leaves us with #1. I have added support to coalesce cross-class >> copies (see CrossClassJoin in SimpleRegisterCoalescing.cpp). > > Yep.As Dan pointed out, #2 is also a workable solution.> >> Unfortunately, it breaks a few tests and I haven't had the time to >> look into them. If that's done, we just need to add the concept of >> "compatible register classes" and mark MOVPS2SSrr as a copy and *it >> should just work*. > > What about getting tblgen support for the pattern in the .td file? > That would > be another way to tackle this and would open up a whole bunch of other > opportunities. Instcombine could be entirely expressed as a set of > tblgen > patterns, for example, which is desireable from a maintenance > perspective > and well as new development. It's much easier to write patterns > than to go > through all of the manual examination that currently exists in > instcombine.I don't think that would work. We still have to model the value as being produced by an instruction. So either we have to select it to a target specific instruction or a target independent instruction (i.e. extract_subreg). After thinking about this some more, I think #2 is a better solution. Adding XMM0_32 etc. teaches codegen that only lower 32-bits of the registers are used. Perhaps this can open up additional optimization opportunities. On the other hand, adding these registers means introducing more aliasing which has compile time implication. Evan> > > -Dave > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
On Wednesday 24 September 2008 12:20, Evan Cheng wrote:> > I'm not seeing how this is "conceptually correct." It's a vector > > extract, not > > a subregister. It's just that we want to reuse the same register. > > It is though. Sub-register is a machine specific concept. It means > vector_extract can be modeled as subreg_extract on this machine. > Nothing is wrong with thatt.I didn't mean to imply anything was "wrong." It just strikes me as kind of strange, in a mind-warping kind of way. :)> > Perhaps the answer is to add vector extract support to the > > coalescer, in > > the same way you added subregister support. I don't understand the > > nitty > > gritty of that, though. > > I don't think that's a good idea. Conceptually vector_extract is very > different from a move.Ok, I agree.> >> That leaves us with #1. I have added support to coalesce cross-class > >> copies (see CrossClassJoin in SimpleRegisterCoalescing.cpp). > > > > Yep. > > As Dan pointed out, #2 is also a workable solution.Yes, I like Dan's proposal.> > What about getting tblgen support for the pattern in the .td file? > > That would > > be another way to tackle this and would open up a whole bunch of other > > opportunities. Instcombine could be entirely expressed as a set of > > tblgen > > patterns, for example, which is desireable from a maintenance > > perspective > > and well as new development. It's much easier to write patterns > > than to go > > through all of the manual examination that currently exists in > > instcombine. > > I don't think that would work. We still have to model the value as > being produced by an instruction. So either we have to select it to a > target specific instruction or a target independent instruction (i.e. > extract_subreg).That's right. The pattern doesn't know if the rest of the vector is going to be used elsewhere so we need dataflow information and that implies it needs to be done in the coalescer (or some other transformation pass).> After thinking about this some more, I think #2 is a better solution. > Adding XMM0_32 etc. teaches codegen that only lower 32-bits of the > registers are used. Perhaps this can open up additional optimization > opportunities. On the other hand, adding these registers means > introducing more aliasing which has compile time implication.What about XMM0_64? What about things like AVX which applies the GPR aliasing scheme to vectors? I think this is the right way to go but we need to do things in a comprehensive way so we can expand as needed. -Dave