Matthijs Kooijman
2008-Nov-03 17:05 UTC
[LLVMdev] Multi-instruction patterns, tablegen and chains
Hi all, I'm trying some stuff with tblgen and it is doing things I didn't expect. As for some background, I have this RD (read) instruction, which reads a value from an external output. In our architecture, we have two types of registers: buses and registers. The RD instructions puts its result on bus, while the consumer of that data wants to have it in a register. To accomplish this, a move instruction must be inserted. My RD and MOVE instructions are simply defined as follows: class MontiumInst< bits<6> op, dag outs, dag ins, string asmstr, list<dag> pattern > : Instruction { bits<6> opcode = op; dag OutOperandList = outs; dag InOperandList = ins; let AsmString = asmstr; let Pattern = pattern; } def RD : MontiumInst< 0x00, (outs Bus:$dst), (ins i32imm:$addr), "$dst = rd $addr", [] >; def MOVE : MontiumInst< 0x00, (outs DatapathReg:$dst), (ins Bus:$src), "$dst = $src", [] >; In the isel input, there is a "rd" SDNode, which I'd like to turn into a RD and a MOVE instruction. Since I can't do that in the Instruction's pattern, I added the folowing generic Pat: def : Pat< (rd imm:$addr), (MOVE (RD imm:$addr)) >; At first, this seems to work as expected. However, when I have two rd nodes in my input, both reading from the same "address", these nodes get merged into a single RD node. Since reads actually read from a FIFO, this should really not happen. Setting hasSideEffects = 1 on the RD Instruction seemed like a logical idea, but only resulting in tblgen telling me he already inferred that. After some discussion, it seems that the problem here is that the RD nodes don't get a chain input and output in the selected DAG. As a second try, I set hasCtrlDep = 1 on the RD instruction, assuming it would ensure that RD would always use chain operands (the comment on hasCtrlDep says "Does this instruction r/w ctrl-flow chains?"). However, this didn't help at all. In fact, it seems that the value of hasCtrlDep is totally ignored by tblgen. It gets assigned to an instance variable at utils/TableGen/CodeGenInstruction.cpp:98, but that variable doesn't seem to be used anywhere. Some further investigation of the code turns out that EmitResultCode in utils/TableGen/DAGISelEmitter.cpp tries to find out if an output node requires chain operands, but gets it wrong. Around line 913, it does the following: Try to find the pattern belonging to the current Instruction. This is the first pattern listed in the Instruction definition, which does not exist for the above RD and Move instructions. If there is no such pattern, and this is the root node (MOVE in the above Pat), then use the pattern from the Pat node (rd imm:$addr above). Then, from the pattern you found, see if any node in the pattern is an SDNode that has the SDNPHasChain property set. If so, the current instruction needs chain operands. Applying the above (slightly simplified) logic to the my Pat result "(MOVE (RD imm:$addr))", we find the following: * The first node, MOVE, has no associated pattern. It is however the root node, so we use "(rd imm:$addr)" as its pattern. This pattern has the SDNPHasChain property set, so the resulting MOVE node needs a chain. * The second node, RD, has no associated pattern. It is also not the root node, so we can't find any pattern for it. Thus, it does not get any chain operands. As a result, the chain operands end up at the MOVE instruction (which really doesn't need them) instead of at the RD instruction. I tried fixing this by associating a pattern with the RD instruction, that has the SDNPHasChain property set. In particular, I tried "(set Bus:$reg, (rd imm:$addr))". This works to some degree: The Emit function generated by the Pat node now properly sets up the chain on the RD instruction. However, the pattern from the RD Instruction is now matched before the Pat node, so now the MOVE node never gets created at al... It seems like tblgen is wrong here, and should probably just respect the hasCtrlDep property. Alternatively, one might want to say that in the "(MOVE (RD imm:$addr))" result dag, the RD node is to be considered the "root" node, in the sense that it should inherit the properties of the input pattern. Am I doing something wrong here, or is tblgen mistaken? Gr. Matthijs -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Digital signature URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20081103/70bae391/attachment.sig>
Dan Gohman
2008-Nov-04 01:42 UTC
[LLVMdev] Multi-instruction patterns, tablegen and chains
Hi Matthijs, On Nov 3, 2008, at 9:05 AM, Matthijs Kooijman wrote:> > Am I doing something wrong here, or is tblgen mistaken?I think you're doing something that is beyond what tblgen is currently prepared for. But it's interesting :-). Having rd and RD have explicit chain operands and results sounds like the right thing to do. This is needed to prevent them from being reordered or CSE'd. Having tblgen pretend that the MOVE isn't the root seems a bit counter-intuitive though. I think it would be useful to have a way to explicitly tell tblgen which node(s) in the output pattern to hook up the chain(s) to. I believe this is related to a problem David Greene is seeing, where he has a case involving multiple chains. A fully general approach would be to make chain operands explicit in patterns, and to invent some syntax for identifying chain results. It might look something like this: def : Pat<(rd ch:$input_chain, imm:$addr)[$output_chain], (MOVE (RD ch:$input_chain, imm:$addr)[$output_chain])>; This would be a pretty big change though. An alternative that's a bit less ambitious would be to have tablegen search the output pattern for nodes which are declared to support chain operands/results, and if it finds exactly one such node, use that node to hook up all the chain operands/results. If it finds more than one, it could issue an error. This would probably be simpler and less invasive, and probably enough for your specific example, though it wouldn't be as general-purpose. Dan
Matthijs Kooijman
2008-Nov-04 08:12 UTC
[LLVMdev] Multi-instruction patterns, tablegen and chains
Hi Dan,> Having tblgen pretend that the MOVE isn't the root seems a bit > counter-intuitive though.I didn't really mean making RD the root, but rather telling tablegen that RD is the "primary" node, corresponding to the input pattern. This would allow the properties of the input rd SDNode be properly transferred to the output RD node instead of the MOVE node. I doubt this is a very elegant solution, nor will it work in the general case, but this is what I meant anyway.> I think it would be useful to have a way to explicitly tell > tblgen which node(s) in the output pattern to hook up the chain(s) > to. I believe this is related to a problem David Greene is seeing, > where he has a case involving multiple chains. > A fully general approach would be to make chain operands explicit > in patterns, and to invent some syntax for identifying chain > results. It might look something like this: > > def : Pat<(rd ch:$input_chain, imm:$addr)[$output_chain], > (MOVE (RD ch:$input_chain, imm:$addr)[$output_chain])>; > > This would be a pretty big change though.So you're basically allowing for a way to specify outputs beyond the first output in a dag? That does make sense, and might be useful for other cases to, perhaps? I won't be able to implement this, however, since I'm only prototyping just yet...> An alternative that's a bit less ambitious would be to have > tablegen search the output pattern for nodes which are declared > to support chain operands/results, and if it finds exactly one > such node, use that node to hook up all the chain > operands/results. If it finds more than one, it could issue > an error. This would probably be simpler and less invasive, > and probably enough for your specific example, though it > wouldn't be as general-purpose.Currently, I think that the code supports having multiple nodes with chain just fine (when I hook up a pattern to my RD instruction, I get an Emit function that hooks up the chain to both RD and MOVE. I think it might even join the two output chains together, not exactly sure about that. Not sure if this is really what I want, but it would work. So, the main thing needed is some way of telling that an instruction is chained, other than giving it a pattern. Using hasCtrlDep for this makes sense, but I'm afraid I won't be having the time nor expertise to implement this... I'd be glad to test it though :-) Gr. Matthijs -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Digital signature URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20081104/ae57b8d4/attachment.sig>
David Greene
2008-Nov-10 20:45 UTC
[LLVMdev] Multi-instruction patterns, tablegen and chains
On Monday 03 November 2008 19:42, Dan Gohman wrote: [Been on jury duty, catching up on things...]> I think it would be useful to have a way to explicitly tell > tblgen which node(s) in the output pattern to hook up the chain(s) > to. I believe this is related to a problem David Greene is seeing, > where he has a case involving multiple chains.Yes, it sounds like a similar issue. I have a patch that I hope to post this week to start some more discussion. I'll have to look at your ideas more closely. I'm still confused by what tblgen tries to do and what chains should be inserted where... -Dave