thr3ads.net - llvm dev - [llvm-dev] [RFC] Tablegen-erated GlobalISel Combine Rules [Nov 2018]

If this information is useful, please help other people find it:
Share via:

Daniel Sanders via llvm-dev

2018-Nov-15 18:00 UTC

[llvm-dev] [RFC] Tablegen-erated GlobalISel Combine Rules

> On Nov 13, 2018, at 08:01, David Greene <dag at cray.com> wrote:
> 
> Daniel Sanders via llvm-dev <llvm-dev at lists.llvm.org> writes:
> 
>> That's an interesting idea. Certainly tablegenerating InstCombine
>> ought to be possible and sharing code sounds like it ought to be
>> doable. MIR and IR are pretty similar especially after IRTranslator
>> (which is a direct translation) through to the Legalizer (which is the
>> first point target instructions can until targets make custom
>> passes). From the Legalizer to ISel, there's still likely to be a
fair
>> amount of overlap between the two as a lot of the G_* opcodes directly
>> correspond to LLVM-IR instructions. The tricky bit will be the escape
>> hatches into C++ would need to either have Instruction/MachineInstr
>> versions or would need to accept both.
> 
> Yes.  I wonder if templates can help with this.  I'm thinking of
> LoopInfo, which is parameterized with BasicBlock/Loop or
> MachineBasicBlock/MachineLoop, as a guide.  I don't know if Instruction
> and MachineInstr have enough APIs in common, though.  Perhaps a
> longer-term effort could be made to develop a common API with the same
> spellings for both, at least enough for dagcombine purposes.
Unfortunately, Instruction is a class-hierachy based design with specialized
getters for each subclass whereas MachineInstr is a single one-size-fits-all
class. It's likely to be difficult to unify them with templates. A common
API sounds feasible to some degree, certainly the G_* opcodes and the
Instruction hierarchy ought to be able to agree on abstraction that can interact
with both, and MachineMemOperand ought to be able to agree with
LoadInst/StoreInst/Atomic*Inst
> Your initial response to Nicolai's suggestion to use TableGen's dag
type
> triggered something else in my mind.  The TableGen SelectionDAG stuff is
> really restrictive in a lot of ways because the TableGen dag type is
> used to express actual DAGs and you rightly pointed out that that's too
> unwieldy for dagcombine.  It's unwieldy for isel too, which has many of
> the same issues (multiple outputs, use of things like EXTRACT_SUBREG and
> so on).  It's never handled selecting to multiple instructions very
well
> either.  I wonder if it would be possible to express isel patterns using
> TableGen's dag type in the way that Nicolai suggests for dagcombine. 
In
> a lot of ways, isel is "just another kind of dagcombine."
I expect it to be able to deal with ISel too. The main difference between the
two is that ISel's apply step is required to constrain operands to a
register class whereas Combine doesn't need to do that but can choose to.
I'm intending to share code between the tablegen passes for Combine and ISel
so that it will be the same underlying code generator.

The syntax could also work for the Legalizer, and RegBankAlloc too. The
legalizer is a degenerate case of matching a single instruction's opcode,
types, and MachineMemOperand so it's probably not worth doing there, but
using it in RegBankAlloc to specify the alternative code for each bank and the
constraints to apply could be useful. The RegBankAlloc pass boils down to lots
of rules of the form:
	%2(s32) = ADD %0(s32), %1(s32) => %2:GPR = ADD %0:GPR, %1:GPR
with some contradictory rules like:
	%1(s32) = FNEG %0(s32) => %1:FPR = FNEG %0: FPR
	%1(s32) = FNEG %0(s32) => %1:GPR = XOR %0:GPR, i32 0x80000000
RegBankAlloc's job is then to resolve the contradictions in the most
globally efficient way. For example, if the input is already in FPR's then
the higher latency of an FNEG might beat the cost of transferring it to GPR and
using the lower-latency XOR. However, if the result needs to be in a GPR anyway
then it's better to copy to GPR before the FNEG and use the lower-latency
XOR instead.
>>> The use of '%' vs. '$' here really threw me for a
loop.  I would have
>>> expected '$D' and '$S' everywhere.  What's the
significance of '%'
>>> vs. '$'?  I know MIR uses '%' for names but must
that be the case in
>>> these match blocks?
>> 
>> In MIR, '%' and '$' have a semantic difference when
used on operands.
>> '%foo' is a virtual register named foo but '$foo' is
the physical register foo.
> 
> Ok, thanks for the explanation.  I have not worked extensively with MIR
> yet.
> 
>> The main reason I didn't pick something distinct from either
>> (e.g. '${foo}') is that I'd like to minimize the need to
modify the
>> MIR parser to support pattern-specific syntax.
> 
> Sure, that makes sense.
> 
>>> My understanding is that "root" means the sink here, but
later in the
>>> "upside-down" example, "root" means the source.
It seems to me the use
>>> of "root" here is a misnomer/confusing.  I liked
Nicolai's ideas about
>>> specifying the insert point.  Why do users need to know about
"root" at
>>> all?  Is there some other special meaning attached to it?
>> 
>> It doesn't correspond with any property of the DAG being matched.
It's
>> the entry point that the combine algorithm uses to begin an attempt to
>> match the rule. In DAGCombine terms, it's the first node checked
for
>> the rule inside the call to DAGCombine::visit(SDNode *).
> 
> Got it.  "root" has a specific meeting for a tree/dag and I think
I was
> getting hung up on that, but I see how it has meaning as the starting
> point of a match.  I was going to say maybe we can find another name for
> it, but now I can't think of a better one.  :)
The best I can think of is 'matchpoint' or 'matchfrom'. As
mentioned on the thread with Nicolai, we might be able to drop it altogether in
favour of a let statement which can a longer name which will help to clarify it.
>                               -David

David Greene via llvm-dev

2018-Nov-15 18:57 UTC

head link

[llvm-dev] [RFC] Tablegen-erated GlobalISel Combine Rules

Daniel Sanders via llvm-dev <llvm-dev at lists.llvm.org> writes:
> Unfortunately, Instruction is a class-hierachy based design with
> specialized getters for each subclass whereas MachineInstr is a single
> one-size-fits-all class. It's likely to be difficult to unify them
> with templates. A common API sounds feasible to some degree, certainly
> the G_* opcodes and the Instruction hierarchy ought to be able to
> agree on abstraction that can interact with both, and
> MachineMemOperand ought to be able to agree with
> LoadInst/StoreInst/Atomic*Inst
Yeah, that's along the lines of what I was thinking.  Long way off, of
course.  :)
> I expect it to be able to deal with ISel too. The main difference
> between the two is that ISel's apply step is required to constrain
> operands to a register class whereas Combine doesn't need to do that
> but can choose to. I'm intending to share code between the tablegen
> passes for Combine and ISel so that it will be the same underlying
> code generator.
Cool.  If combine can choose to enforce constraints, then the mechanism
is there for isel to use.
> The syntax could also work for the Legalizer, and RegBankAlloc too.
I hadn't thought about legalizer and things like RegBankAlloc.  That's
neat!  Even though legalizer is a special case, there still may be value
in making its specification more declarative.

                               -David

Quentin Colombet via llvm-dev

2018-Nov-16 17:25 UTC

head link

[llvm-dev] [RFC] Tablegen-erated GlobalISel Combine Rules

Hi Daniel,

I finally read the proposal.

I have a couple of comments that re-enforce what I was thinking
initially: it is too early IMHO to develop a full tablegen backend for
this. I believe we still miss ways to represent things that matters
(see my comments below) and I am afraid we still don't know all that
needs to be represented.
> It can reach across basic blocks when it's safe to do so. We may need
to introduce a means to limit the inter-basic-block reach as I can foresee
situations where it may pull expensive operations into a loop if it's not
careful about this
I think we would need a syntax for that and in particular, I don't
think a generic heuristic based on frequency can solely solve the
problem (i.e., what you're hinting in your RFC).

Another thing that we miss has to do with the number of users. E.g.,
should a pattern only be applied if there is only one user? Is it
valid if there are more that one users, if we cannot move something
into the destination block can we still apply the pattern is we move
the result in the source block, etc.

The number of users may not be that relevant, but this gives use is a
notion of profitability (it is just how SDISel does this). This
actually ties back to the inter-basic-block case: is this pattern
profitable if it is between basic blocks with different frequencies.
In other words, I believe we should start to think in terms of how
profitable is a pattern. E.g., the source pattern as some cost and the
destination patterns as some other cost. Applying the pattern should
give us a profitability score, which would be higher when it produces
dead code (i.e., what the oneUse check tries to identify in SDISel).

Following that logic, we may get away with all the disabling specific
patterns kind of thing. I.e., if something is not profitable it does
not get applied.

All in all, the syntax that your suggesting is reasonable (though IMHO
MIR is going to limit our ability to share more logic with IR, but
maybe that's fine to give up on that goal?), but I feel it actually
distracts us from thinking about the problems we are solving and thus,
I am hesitant to stamp a work that would write a new tablegen backend.

That said, again, I am not the one doing the work!

Cheers,
-Quentin
[snip]

llvm dev - Nov 2018 - [RFC] Tablegen-erated GlobalISel Combine Rules

[llvm-dev] [RFC] Tablegen-erated GlobalISel Combine Rules

[llvm-dev] [RFC] Tablegen-erated GlobalISel Combine Rules

[llvm-dev] [RFC] Tablegen-erated GlobalISel Combine Rules