Daniel Sanders via llvm-dev
2018-Nov-15 18:00 UTC
[llvm-dev] [RFC] Tablegen-erated GlobalISel Combine Rules
> On Nov 13, 2018, at 08:01, David Greene <dag at cray.com> wrote: > > Daniel Sanders via llvm-dev <llvm-dev at lists.llvm.org> writes: > >> That's an interesting idea. Certainly tablegenerating InstCombine >> ought to be possible and sharing code sounds like it ought to be >> doable. MIR and IR are pretty similar especially after IRTranslator >> (which is a direct translation) through to the Legalizer (which is the >> first point target instructions can until targets make custom >> passes). From the Legalizer to ISel, there's still likely to be a fair >> amount of overlap between the two as a lot of the G_* opcodes directly >> correspond to LLVM-IR instructions. The tricky bit will be the escape >> hatches into C++ would need to either have Instruction/MachineInstr >> versions or would need to accept both. > > Yes. I wonder if templates can help with this. I'm thinking of > LoopInfo, which is parameterized with BasicBlock/Loop or > MachineBasicBlock/MachineLoop, as a guide. I don't know if Instruction > and MachineInstr have enough APIs in common, though. Perhaps a > longer-term effort could be made to develop a common API with the same > spellings for both, at least enough for dagcombine purposes.Unfortunately, Instruction is a class-hierachy based design with specialized getters for each subclass whereas MachineInstr is a single one-size-fits-all class. It's likely to be difficult to unify them with templates. A common API sounds feasible to some degree, certainly the G_* opcodes and the Instruction hierarchy ought to be able to agree on abstraction that can interact with both, and MachineMemOperand ought to be able to agree with LoadInst/StoreInst/Atomic*Inst> Your initial response to Nicolai's suggestion to use TableGen's dag type > triggered something else in my mind. The TableGen SelectionDAG stuff is > really restrictive in a lot of ways because the TableGen dag type is > used to express actual DAGs and you rightly pointed out that that's too > unwieldy for dagcombine. It's unwieldy for isel too, which has many of > the same issues (multiple outputs, use of things like EXTRACT_SUBREG and > so on). It's never handled selecting to multiple instructions very well > either. I wonder if it would be possible to express isel patterns using > TableGen's dag type in the way that Nicolai suggests for dagcombine. In > a lot of ways, isel is "just another kind of dagcombine."I expect it to be able to deal with ISel too. The main difference between the two is that ISel's apply step is required to constrain operands to a register class whereas Combine doesn't need to do that but can choose to. I'm intending to share code between the tablegen passes for Combine and ISel so that it will be the same underlying code generator. The syntax could also work for the Legalizer, and RegBankAlloc too. The legalizer is a degenerate case of matching a single instruction's opcode, types, and MachineMemOperand so it's probably not worth doing there, but using it in RegBankAlloc to specify the alternative code for each bank and the constraints to apply could be useful. The RegBankAlloc pass boils down to lots of rules of the form: %2(s32) = ADD %0(s32), %1(s32) => %2:GPR = ADD %0:GPR, %1:GPR with some contradictory rules like: %1(s32) = FNEG %0(s32) => %1:FPR = FNEG %0: FPR %1(s32) = FNEG %0(s32) => %1:GPR = XOR %0:GPR, i32 0x80000000 RegBankAlloc's job is then to resolve the contradictions in the most globally efficient way. For example, if the input is already in FPR's then the higher latency of an FNEG might beat the cost of transferring it to GPR and using the lower-latency XOR. However, if the result needs to be in a GPR anyway then it's better to copy to GPR before the FNEG and use the lower-latency XOR instead.>>> The use of '%' vs. '$' here really threw me for a loop. I would have >>> expected '$D' and '$S' everywhere. What's the significance of '%' >>> vs. '$'? I know MIR uses '%' for names but must that be the case in >>> these match blocks? >> >> In MIR, '%' and '$' have a semantic difference when used on operands. >> '%foo' is a virtual register named foo but '$foo' is the physical register foo. > > Ok, thanks for the explanation. I have not worked extensively with MIR > yet. > >> The main reason I didn't pick something distinct from either >> (e.g. '${foo}') is that I'd like to minimize the need to modify the >> MIR parser to support pattern-specific syntax. > > Sure, that makes sense. > >>> My understanding is that "root" means the sink here, but later in the >>> "upside-down" example, "root" means the source. It seems to me the use >>> of "root" here is a misnomer/confusing. I liked Nicolai's ideas about >>> specifying the insert point. Why do users need to know about "root" at >>> all? Is there some other special meaning attached to it? >> >> It doesn't correspond with any property of the DAG being matched. It's >> the entry point that the combine algorithm uses to begin an attempt to >> match the rule. In DAGCombine terms, it's the first node checked for >> the rule inside the call to DAGCombine::visit(SDNode *). > > Got it. "root" has a specific meeting for a tree/dag and I think I was > getting hung up on that, but I see how it has meaning as the starting > point of a match. I was going to say maybe we can find another name for > it, but now I can't think of a better one. :)The best I can think of is 'matchpoint' or 'matchfrom'. As mentioned on the thread with Nicolai, we might be able to drop it altogether in favour of a let statement which can a longer name which will help to clarify it.> -David
David Greene via llvm-dev
2018-Nov-15 18:57 UTC
[llvm-dev] [RFC] Tablegen-erated GlobalISel Combine Rules
Daniel Sanders via llvm-dev <llvm-dev at lists.llvm.org> writes:> Unfortunately, Instruction is a class-hierachy based design with > specialized getters for each subclass whereas MachineInstr is a single > one-size-fits-all class. It's likely to be difficult to unify them > with templates. A common API sounds feasible to some degree, certainly > the G_* opcodes and the Instruction hierarchy ought to be able to > agree on abstraction that can interact with both, and > MachineMemOperand ought to be able to agree with > LoadInst/StoreInst/Atomic*InstYeah, that's along the lines of what I was thinking. Long way off, of course. :)> I expect it to be able to deal with ISel too. The main difference > between the two is that ISel's apply step is required to constrain > operands to a register class whereas Combine doesn't need to do that > but can choose to. I'm intending to share code between the tablegen > passes for Combine and ISel so that it will be the same underlying > code generator.Cool. If combine can choose to enforce constraints, then the mechanism is there for isel to use.> The syntax could also work for the Legalizer, and RegBankAlloc too.I hadn't thought about legalizer and things like RegBankAlloc. That's neat! Even though legalizer is a special case, there still may be value in making its specification more declarative. -David
Quentin Colombet via llvm-dev
2018-Nov-16 17:25 UTC
[llvm-dev] [RFC] Tablegen-erated GlobalISel Combine Rules
Hi Daniel, I finally read the proposal. I have a couple of comments that re-enforce what I was thinking initially: it is too early IMHO to develop a full tablegen backend for this. I believe we still miss ways to represent things that matters (see my comments below) and I am afraid we still don't know all that needs to be represented.> It can reach across basic blocks when it's safe to do so. We may need to introduce a means to limit the inter-basic-block reach as I can foresee situations where it may pull expensive operations into a loop if it's not careful about thisI think we would need a syntax for that and in particular, I don't think a generic heuristic based on frequency can solely solve the problem (i.e., what you're hinting in your RFC). Another thing that we miss has to do with the number of users. E.g., should a pattern only be applied if there is only one user? Is it valid if there are more that one users, if we cannot move something into the destination block can we still apply the pattern is we move the result in the source block, etc. The number of users may not be that relevant, but this gives use is a notion of profitability (it is just how SDISel does this). This actually ties back to the inter-basic-block case: is this pattern profitable if it is between basic blocks with different frequencies. In other words, I believe we should start to think in terms of how profitable is a pattern. E.g., the source pattern as some cost and the destination patterns as some other cost. Applying the pattern should give us a profitability score, which would be higher when it produces dead code (i.e., what the oneUse check tries to identify in SDISel). Following that logic, we may get away with all the disabling specific patterns kind of thing. I.e., if something is not profitable it does not get applied. All in all, the syntax that your suggesting is reasonable (though IMHO MIR is going to limit our ability to share more logic with IR, but maybe that's fine to give up on that goal?), but I feel it actually distracts us from thinking about the problems we are solving and thus, I am hesitant to stamp a work that would write a new tablegen backend. That said, again, I am not the one doing the work! Cheers, -Quentin [snip]