Hi David,> [(set GR32:$rD, globaladdr:$addr)]> It seems to have somehow managed to create a cycle in the DAG, which is > of course wrong. But how?When I write a similar pattern into the ARM .td files and look at (from the build directory) lib/Target/ARM/ARMGenDAGISel.inc, I see: /*56478*/ /*SwitchOpcode*/ 13, TARGET_VAL(ISD::GlobalAddress),// ->56494 /*56481*/ OPC_RecordNode, // #0 = $src /*56482*/ OPC_CheckType, MVT::i32, /*56484*/ OPC_CheckPatternPredicate, 4, // (!Subtarget->isThumb()) /*56486*/ OPC_MorphNodeTo, TARGET_VAL(ARM::MOVi32imm), 0, 1/*#VTs*/, MVT::i32, 1/*#Ops*/, 0, // Src: (globaladdr:i32):$src - Complexity = 3 // Dst: (MOVi32imm:i32 (globaladdr:i32):$src) The first lines are fairly unimportant, but the last two show what the DAG thinks it's doing and make it clear why the recursion happens. It's because the globaladdr node doesn't actually get changed during matching, so LLVM is just going to encounter it again at the next step and perform the same substitution. I'd want that Dst globaladdr to be a tglobaladdr (target globaladdr) since LLVM knows it shouldn't try to select target nodes. Existing targets mostly do that during ISelLowering for various reasons, but for your case a separate pattern can probably do the job: def : Pat<(globaladdr:$addr), (MOVar tglobaladdr:$addr)>;> (2) Even when I do manage to generate machine instructions, they're all > discarded because they're dead. This is leading to very fast but > slightly flawed code (all I get is a return instruction). > > My best guess is that this is because I'm incorrectly wiring up the > function outputs, resulting in the compiler discarding the instructions > because the result isn't used.That sounds most likely. If you run "llc -print-after-all" you should be able to see the MachineInstr dataflow before it all gets removed. For example a simple "ret i32* @glob" function (on ARM again) gives me: BB#0: derived from LLVM BB %0 %vreg0<def> = MOVi32imm <ga:@addr>; GPR:%vreg0 %R0<def> = COPY %vreg0; GPR:%vreg0 BX_RET pred:14, pred:%noreg, %R0<imp-use> The important point here is that the BX_RET has some kind of <imp-use> of the registers that will be used to return (%R0 in this case). It gets added first in XYZTargetLowering::LowerReturn, and then selected in the .td file using a variadic node. Which other target's LowerReturn are you using? That's probably the first thing we should check is doing its job. The pictures from of "llc -view-isel-dags" and "llc -view-sched-dags" on a simple function returning something would be very useful there. If you can upload them somewhere (or attach them) I'll take a look and see if anything looks wrong. Cheers. Tim.
On 3/9/14, 8:02 AM, Tim Northover wrote: [...]> I'd want that Dst globaladdr to be a tglobaladdr (target globaladdr) > since LLVM knows it shouldn't try to select target nodes. Existing > targets mostly do that during ISelLowering for various reasons, but > for your case a separate pattern can probably do the job: > > def : Pat<(globaladdr:$addr), (MOVar tglobaladdr:$addr)>;So if I've understood what this does correctly: instead of just describing a machine instruction to LLVM and letting it figure out how to transform the DAG tree into machine instructions, we're giving it an explicit rule about how to transform globaladdr nodes? That seems sensible... but I'm afraid it doesn't work, dying in precisely the same way as above. My rules look like this: def MOVar : S32< (outs GR32:$rD), (ins i32imm:$addr), // i32imm is wrong here "mov $rD, $addr", [(set GR32:$rD, tglobaladdr:$addr)]>;def : Pat<(globaladdr:$addr), (MOVar tglobaladdr:$addr)>; In the crash dump, the DAG is dumped. The beginning looks like this: Offending node: 0x7fd14880c010: i32 = <<Unknown Machine Node #65508>> 0x7fd14880c010 [ORD=2] [ID=1] 0x7fd14880c010: i32 = <<Unknown Machine Node #65508>> 0x7fd14880c010 [ORD=2] [ID=1] ...etc... (I'm pretty sure that machine node #65508 is the MOVar, but how do I find out for sure?) My gut feeling is that we have a node n, and the pattern fragment above is transforming this to (MOVar n) *at the same address*, so n points at the new transformed node. Thus, forming a cycle. The other targets don't suffer from this because they represent address as complex patterns and use custom-written selectors to pull the information out of the globaladdr node and represent it in a different way. I'd really rather not do this, as I'd rather like to try and suss out pattern based codegen before trying to customise it; is it actually possible to do this using pure patterns? [...]> The important point here is that the BX_RET has some kind of <imp-use> > of the registers that will be used to return (%R0 in this case). It > gets added first in XYZTargetLowering::LowerReturn, and then selected > in the .td file using a variadic node.Yep --- it wasn't marked as variadic, and so my RET instruction had no inputs. Fixing this makes it generate code quite happily. Ta. -- ┌─── dg@cowlark.com ───── http://www.cowlark.com ───── │ │ "You cannot truly appreciate _Atlas Shrugged_ until you have read it │ in the original Klingon." --- Sea Wasp on r.a.sf.w -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 876 bytes Desc: OpenPGP digital signature URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140309/89c47e47/attachment.sig>
On 3/9/14, 10:40 PM, David Given wrote:> That seems > sensible... but I'm afraid it doesn't work, dying in precisely the same > way as above.So I think I've reached the bottom of this. It looks like the generated table loses type information. At one point I had a rule that looked like this: def MOVr : S16< (outs GR32:$rD), (ins GR32:$rS), "mov $rD, $rS", [(set GR32:$rD, GR32:$rS)]>;This ended up last on the opcode list, and was always invoked if no other instruction matched --- because the selection pattern is basically 'x = y' from the point of view of the instruction selector, and therefore fits everything. I believe that the same thing's applying to the globaladdr issue. I had rules that look like this:> def MOVar : S32< > (outs GR32:$rD), (ins i32imm:$addr), // i32imm is wrong here > "mov $rD, $addr", > [(set GR32:$rD, tglobaladdr:$addr)] >> ; > def : Pat<(globaladdr:$addr), (MOVar tglobaladdr:$addr)>;...but that (set) ends up as a rule which compiles to 'x = y' and everything goes horribly wrong. What I eventually did was to copy what the ARM target does: add code to TargetLowering::LowerOperation() to explicitly convert the globaladdr nodes into wrapped machine-specific nodes. My MOVar now looks like: def MOVar : S32< (outs GR32:$rD), (ins i32imm:$addr), // i32imm is wrong here "mov $rD, $addr", [(set GR32:$rD, (WrappedGlobal tglobaladdr:$addr))]>;The WrappedGlobal has no semantic meaning, simply being a machine-specific ISD node with a single parameter. I'm not quite sure *why* this works; it is, after all, the failing rule above should do precisely the same thing, but in a less cryptic way. I suspect there's an llvm bug here, but I'm not quite sure what it is. It would be nice if tablegen was a bit more rigorous about detecting invalid patterns (is there an option I'm missing?). -- ┌─── dg@cowlark.com ───── http://www.cowlark.com ───── │ │ "You cannot truly appreciate _Atlas Shrugged_ until you have read it │ in the original Klingon." --- Sea Wasp on r.a.sf.w -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 876 bytes Desc: OpenPGP digital signature URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140311/4fd6f34e/attachment.sig>