Matthijs Kooijman
2008-Nov-03 17:05 UTC
[LLVMdev] Multi-instruction patterns, tablegen and chains
Hi all,
I'm trying some stuff with tblgen and it is doing things I didn't
expect.
As for some background, I have this RD (read) instruction, which reads a value
from an external output. In our architecture, we have two types of registers:
buses and registers. The RD instructions puts its result on bus, while the
consumer of that data wants to have it in a register. To accomplish this, a
move instruction must be inserted.
My RD and MOVE instructions are simply defined as follows:
class MontiumInst< bits<6> op,
dag outs,
dag ins,
string asmstr,
list<dag> pattern
> : Instruction
{
bits<6> opcode = op;
dag OutOperandList = outs;
dag InOperandList = ins;
let AsmString = asmstr;
let Pattern = pattern;
}
def RD : MontiumInst< 0x00,
(outs Bus:$dst),
(ins i32imm:$addr),
"$dst = rd $addr",
[]
>;
def MOVE : MontiumInst< 0x00,
(outs DatapathReg:$dst),
(ins Bus:$src),
"$dst = $src",
[]
>;
In the isel input, there is a "rd" SDNode, which I'd like to turn
into a RD and
a MOVE instruction. Since I can't do that in the Instruction's pattern,
I added
the folowing generic Pat:
def : Pat< (rd imm:$addr),
(MOVE (RD imm:$addr))
>;
At first, this seems to work as expected. However, when I have two rd nodes in
my input, both reading from the same "address", these nodes get merged
into a
single RD node. Since reads actually read from a FIFO, this should really not
happen.
Setting hasSideEffects = 1 on the RD Instruction seemed like a logical idea,
but only resulting in tblgen telling me he already inferred that. After some
discussion, it seems that the problem here is that the RD nodes don't get a
chain input and output in the selected DAG.
As a second try, I set hasCtrlDep = 1 on the RD instruction, assuming it would
ensure that RD would always use chain operands (the comment on hasCtrlDep says
"Does this instruction r/w ctrl-flow chains?"). However, this
didn't help at
all. In fact, it seems that the value of hasCtrlDep is totally ignored by
tblgen. It gets assigned to an instance variable at
utils/TableGen/CodeGenInstruction.cpp:98, but that variable doesn't seem to
be
used anywhere.
Some further investigation of the code turns out that EmitResultCode in
utils/TableGen/DAGISelEmitter.cpp tries to find out if an output node requires
chain operands, but gets it wrong. Around line 913, it does the following:
Try to find the pattern belonging to the current Instruction. This is the
first pattern listed in the Instruction definition, which does not exist for
the above RD and Move instructions.
If there is no such pattern, and this is the root node (MOVE in the above
Pat), then use the pattern from the Pat node (rd imm:$addr above).
Then, from the pattern you found, see if any node in the pattern is an
SDNode that has the SDNPHasChain property set. If so, the current
instruction needs chain operands.
Applying the above (slightly simplified) logic to the my Pat result
"(MOVE (RD imm:$addr))", we find the following:
* The first node, MOVE, has no associated pattern. It is however the root
node, so we use "(rd imm:$addr)" as its pattern. This pattern has
the
SDNPHasChain property set, so the resulting MOVE node needs a chain.
* The second node, RD, has no associated pattern. It is also not the root
node, so we can't find any pattern for it. Thus, it does not get any
chain
operands.
As a result, the chain operands end up at the MOVE instruction (which really
doesn't need them) instead of at the RD instruction.
I tried fixing this by associating a pattern with the RD instruction, that has
the SDNPHasChain property set. In particular, I tried
"(set Bus:$reg, (rd imm:$addr))". This works to some degree: The Emit
function
generated by the Pat node now properly sets up the chain on the RD
instruction. However, the pattern from the RD Instruction is now matched
before the Pat node, so now the MOVE node never gets created at al...
It seems like tblgen is wrong here, and should probably just respect the
hasCtrlDep property. Alternatively, one might want to say that in the
"(MOVE (RD imm:$addr))" result dag, the RD node is to be considered
the "root"
node, in the sense that it should inherit the properties of the input pattern.
Am I doing something wrong here, or is tblgen mistaken?
Gr.
Matthijs
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20081103/70bae391/attachment.sig>
Dan Gohman
2008-Nov-04 01:42 UTC
[LLVMdev] Multi-instruction patterns, tablegen and chains
Hi Matthijs, On Nov 3, 2008, at 9:05 AM, Matthijs Kooijman wrote:> > Am I doing something wrong here, or is tblgen mistaken?I think you're doing something that is beyond what tblgen is currently prepared for. But it's interesting :-). Having rd and RD have explicit chain operands and results sounds like the right thing to do. This is needed to prevent them from being reordered or CSE'd. Having tblgen pretend that the MOVE isn't the root seems a bit counter-intuitive though. I think it would be useful to have a way to explicitly tell tblgen which node(s) in the output pattern to hook up the chain(s) to. I believe this is related to a problem David Greene is seeing, where he has a case involving multiple chains. A fully general approach would be to make chain operands explicit in patterns, and to invent some syntax for identifying chain results. It might look something like this: def : Pat<(rd ch:$input_chain, imm:$addr)[$output_chain], (MOVE (RD ch:$input_chain, imm:$addr)[$output_chain])>; This would be a pretty big change though. An alternative that's a bit less ambitious would be to have tablegen search the output pattern for nodes which are declared to support chain operands/results, and if it finds exactly one such node, use that node to hook up all the chain operands/results. If it finds more than one, it could issue an error. This would probably be simpler and less invasive, and probably enough for your specific example, though it wouldn't be as general-purpose. Dan
Matthijs Kooijman
2008-Nov-04 08:12 UTC
[LLVMdev] Multi-instruction patterns, tablegen and chains
Hi Dan,> Having tblgen pretend that the MOVE isn't the root seems a bit > counter-intuitive though.I didn't really mean making RD the root, but rather telling tablegen that RD is the "primary" node, corresponding to the input pattern. This would allow the properties of the input rd SDNode be properly transferred to the output RD node instead of the MOVE node. I doubt this is a very elegant solution, nor will it work in the general case, but this is what I meant anyway.> I think it would be useful to have a way to explicitly tell > tblgen which node(s) in the output pattern to hook up the chain(s) > to. I believe this is related to a problem David Greene is seeing, > where he has a case involving multiple chains. > A fully general approach would be to make chain operands explicit > in patterns, and to invent some syntax for identifying chain > results. It might look something like this: > > def : Pat<(rd ch:$input_chain, imm:$addr)[$output_chain], > (MOVE (RD ch:$input_chain, imm:$addr)[$output_chain])>; > > This would be a pretty big change though.So you're basically allowing for a way to specify outputs beyond the first output in a dag? That does make sense, and might be useful for other cases to, perhaps? I won't be able to implement this, however, since I'm only prototyping just yet...> An alternative that's a bit less ambitious would be to have > tablegen search the output pattern for nodes which are declared > to support chain operands/results, and if it finds exactly one > such node, use that node to hook up all the chain > operands/results. If it finds more than one, it could issue > an error. This would probably be simpler and less invasive, > and probably enough for your specific example, though it > wouldn't be as general-purpose.Currently, I think that the code supports having multiple nodes with chain just fine (when I hook up a pattern to my RD instruction, I get an Emit function that hooks up the chain to both RD and MOVE. I think it might even join the two output chains together, not exactly sure about that. Not sure if this is really what I want, but it would work. So, the main thing needed is some way of telling that an instruction is chained, other than giving it a pattern. Using hasCtrlDep for this makes sense, but I'm afraid I won't be having the time nor expertise to implement this... I'd be glad to test it though :-) Gr. Matthijs -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Digital signature URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20081104/ae57b8d4/attachment.sig>
David Greene
2008-Nov-10 20:45 UTC
[LLVMdev] Multi-instruction patterns, tablegen and chains
On Monday 03 November 2008 19:42, Dan Gohman wrote: [Been on jury duty, catching up on things...]> I think it would be useful to have a way to explicitly tell > tblgen which node(s) in the output pattern to hook up the chain(s) > to. I believe this is related to a problem David Greene is seeing, > where he has a case involving multiple chains.Yes, it sounds like a similar issue. I have a patch that I hope to post this week to start some more discussion. I'll have to look at your ideas more closely. I'm still confused by what tblgen tries to do and what chains should be inserted where... -Dave