Jakob Stoklund Olesen
2012-Aug-21 22:45 UTC
[LLVMdev] Let's get rid of neverHasSideEffects
On Aug 21, 2012, at 3:02 PM, Chris Lattner <clattner at apple.com> wrote:> > On Aug 21, 2012, at 2:02 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote: > >> All, >> >> TableGen likes to infer the MCID::UnmodeledSideEffects flag from an instruction's pattern. When an instruction doesn't have a pattern, it is assumed to have side effects. > > Hi Jakob, > > I don't understand what you're saying. Are you proposing that all properties (may load, store, side effects) be explicitly added to all instructions, and the pattern only be used to produce warnings?Yes. The side effect inference is worse than the load/store inference, but features like this work best when they work all the time. If all instructions had patterns, and we could accurately infer properties, it wouldn't be a problem.> Won't this hugely bloat the .td files?Not really, TableGen has a fairly convenient syntax for bulk flagging: let mayLoad = 1 in { let Defs = [AL,EFLAGS,AX], Uses = [AX] in def DIV8m : I<0xF6, MRM6m, (outs), (ins i8mem:$src), // AX/[mem8] = AL,AH "div{b}\t$src", [], IIC_DIV8_MEM>; let Defs = [AX,DX,EFLAGS], Uses = [AX,DX] in def DIV16m : I<0xF7, MRM6m, (outs), (ins i16mem:$src), // DX:AX/[mem16] = AX,DX "div{w}\t$src", [], IIC_DIV16>, OpSize; let Defs = [EAX,EDX,EFLAGS], Uses = [EAX,EDX] in // EDX:EAX/[mem32] = EAX,EDX def DIV32m : I<0xF7, MRM6m, (outs), (ins i32mem:$src), "div{l}\t$src", [], IIC_DIV32>; // RDX:RAX/[mem64] = RAX,RDX let Defs = [RAX,RDX,EFLAGS], Uses = [RAX,RDX] in def DIV64m : RI<0xF7, MRM6m, (outs), (ins i64mem:$src), "div{q}\t$src", [], IIC_DIV64>; } It is a complete coincidence that these instructions happen to be missing neverHasSideEffects = 1 ;) Often, you can set the mayLoad flag as part of a multiclass when you define the *rm variant. X86 is the worst case because so many instructions can fold loads and stores. You can handle most of them by setting the mayLoad and mayStore flags in the big ArithBinOp multiclasses. You can see the problem in the big ArithBinOp_RF multiclass in X86InstrArithmetic.td. It's not obvious at a glance which of those instructions have patterns, and as a result we forgot to set neverHasSideEffects on the *_REV and *8i8 variants. I don't think it would detract to add "let mayLoad = 1 in {…}" around the *rm groups etc. Or you could set the flags on the various BinOpRMW super classes. /jakob
On Aug 21, 2012, at 3:45 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:>> I don't understand what you're saying. Are you proposing that all properties (may load, store, side effects) be explicitly added to all instructions, and the pattern only be used to produce warnings? > > Yes. > > The side effect inference is worse than the load/store inference, but features like this work best when they work all the time. If all instructions had patterns, and we could accurately infer properties, it wouldn't be a problem.Ok, I agree with that. I think it's crazy that we still can't write a pattern for copy/move instructions, so we have to smatter attributes on them. What are the other cases that we can't write a pattern for them? I agree that it is sometimes convenient to write Pat<> patterns instead of putting a pattern on an instruction: why not make tblgen also infer predicates from Pat patterns that expand to a single instruction?>> Won't this hugely bloat the .td files? > > Not really, TableGen has a fairly convenient syntax for bulk flagging: > > let mayLoad = 1 in { > let Defs = [AL,EFLAGS,AX], Uses = [AX] in > def DIV8m : I<0xF6, MRM6m, (outs), (ins i8mem:$src), // AX/[mem8] = AL,AH > "div{b}\t$src", [], IIC_DIV8_MEM>; > let Defs = [AX,DX,EFLAGS], Uses = [AX,DX] in > def DIV16m : I<0xF7, MRM6m, (outs), (ins i16mem:$src), // DX:AX/[mem16] = AX,DX > "div{w}\t$src", [], IIC_DIV16>, OpSize; > let Defs = [EAX,EDX,EFLAGS], Uses = [EAX,EDX] in // EDX:EAX/[mem32] = EAX,EDX > def DIV32m : I<0xF7, MRM6m, (outs), (ins i32mem:$src), > "div{l}\t$src", [], IIC_DIV32>; > // RDX:RAX/[mem64] = RAX,RDX > let Defs = [RAX,RDX,EFLAGS], Uses = [RAX,RDX] in > def DIV64m : RI<0xF7, MRM6m, (outs), (ins i64mem:$src), > "div{q}\t$src", [], IIC_DIV64>; > } > > It is a complete coincidence that these instructions happen to be missing neverHasSideEffects = 1 ;)It's also a serious tblgen deficiency that we can't write patterns for these. Look at the terrible code in X86ISelDAGToDAG.cpp that is required to match these.> Often, you can set the mayLoad flag as part of a multiclass when you define the *rm variant. X86 is the worst case because so many instructions can fold loads and stores. You can handle most of them by setting the mayLoad and mayStore flags in the big ArithBinOp multiclasses. > > You can see the problem in the big ArithBinOp_RF multiclass in X86InstrArithmetic.td. It's not obvious at a glance which of those instructions have patterns, and as a result we forgot to set neverHasSideEffects on the *_REV and *8i8 variants. I don't think it would detract to add "let mayLoad = 1 in {…}" around the *rm groups etc.Personally, I don't like the direction of making everything be redundantly specified with "let" clauses and in the patterns. I agree that it is a problem that we're not inferring from Pat<> patterns, and that not all instructions are expressible with patterns, but I'd rather we solve *those* problems than throw out inference. -Chris
Jakob Stoklund Olesen
2012-Aug-22 17:41 UTC
[LLVMdev] Let's get rid of neverHasSideEffects
On Aug 21, 2012, at 4:41 PM, Chris Lattner <clattner at apple.com> wrote:> On Aug 21, 2012, at 3:45 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote: >>> I don't understand what you're saying. Are you proposing that all properties (may load, store, side effects) be explicitly added to all instructions, and the pattern only be used to produce warnings? >> >> Yes. >> >> The side effect inference is worse than the load/store inference, but features like this work best when they work all the time. If all instructions had patterns, and we could accurately infer properties, it wouldn't be a problem. > > Ok, I agree with that. I think it's crazy that we still can't write a pattern for copy/move instructions, so we have to smatter attributes on them. What are the other cases that we can't write a pattern for them?Off the top of my head, I'm sure Jim and Owen could pile on: - Copy/move instructions. We specifically don't want isel to emit these instructions. We prefer a generic COPY which can be recognized quickly and completely by SSA passes and coalescing/regalloc. The only attribute that needs to be set on a copy/move instruction is neverHasSideEffects - precisely the default I want to change. - Big chunks of X86InstrSystem.td. Some of these have intrinsics, some don't. We want accurate attributes on all of them. - Assembly-only instructions. Some instructions have variants that can be written in assembly, but that isel would never produce. - Disassembly-only instructions. Some variants can't even be written in assembly, but a disassembler can decode it, and we want accurate attributes for binary analysis. - Instructions that are produced by later codegen passes. Opcodes for branch relaxation, cbz formation on ARM, the 27 variants of xmm loads and stores on x86, etc. - The non-pseudo versions of the x87 instruction set. - Instructions that produce non-existant types. ARM's VLD1D3 instruction class loads 3 consecutive D-registers. There is no v3f64 or v3i64 type, and I don't think we want there to be. - Instructions that produce illegal types. ARM's VLD1D4 instruction class could be modeled as a v4f64 load, but we don't support other operations on that type, and it wouldn't make sense to make it legal. If I put a pattern on such an instruction, I would rather have TableGen tell me that the pattern will never be able to match. - Instructions that don't map 1-1 to their intrinsics. Some NEON intrinsics expand to two instructions. - Instructions that read or write specific physregs, like DIV and MUL. Clearly a TableGen defect, as you mentioned. - Instructions that define multiple values. Another TableGen shortcoming. In many of these cases, we don't ever want isel emitting the instructions. I think adding disabled patterns to such instructions as a way of specifying properties would be backwards.> I agree that it is sometimes convenient to write Pat<> patterns instead of putting a pattern on an instruction: why not make tblgen also infer predicates from Pat patterns that expand to a single instruction?I would love to treat Pat<> and instruction patterns equally. Keep reading.> Personally, I don't like the direction of making everything be redundantly specified with "let" clauses and in the patterns. I agree that it is a problem that we're not inferring from Pat<> patterns, and that not all instructions are expressible with patterns, but I'd rather we solve *those* problems than throw out inference.We need to fix the current approach. It appears capricious to someone who isn't closely familiar, and that is most of us. Two features are directly harmful: Inference silently falls back to plain guessing when it fails, and when it works, you get a warning about any redundantly specified properties. I would compare it to a clang warning: "'x' is initialized, but I can prove that the value is always overwritten before being used." You remove the initializer on x to get rid of the warning. When the code later changes to read the uninitialized value, clang silently guesses that you wanted x to be 7. We would drive Doug out of town with pitchforks if clang did something like that. We could improve the inference coverage to include Pat<>'s, and I think we should. However, it would only add caprice. Somebody changes a single-instruction Pat<> to a multi-instruction pattern, and we expect him to know that he needs to decorate the instruction with attributes? That's not a reasonable expectation. A tolerable inference approach is possible, but it absolutely needs to: - Don't guess when it fails, but complain loudly if some attributes are not given explicitly, and - Permit and verify explicitly set attributes when it does work. I don't want to do that because I think we can do better. I want to: - Explicitly specify all instruction properties that differ from the sane defaults (not a load, not a store, no side effects). Exploiting the regularity in an instruction set to do this easily is actually one of TableGen's strong sides. - Use the current inference code to emit warnings instead. This would significantly improve TableGen diagnostics. - Analyze single- and multi-instruction Pat<>'s the same way instruction patterns are analyzed. Notice how easy it is to analyze multi-instruction patterns compared to the near-impossible task of using them for inference. This has real benefits. If you factor your instruction classes well, you can tag entire classes of instructions with mayLoad/mayStore very easily. Obscure opcodes that would never get a pattern are likely to be tagged correctly, and you get free verification of the instructions that do have patterns. Any kind of pattern. It also improves readability by making in-instruction patterns and Pat<>'s equals, and by treating all instruction properties equally. There will be much fewer surprises for the casual visitor. /jakob
Jakob Stoklund Olesen
2012-Aug-22 21:26 UTC
[LLVMdev] Let's get rid of neverHasSideEffects
On Aug 21, 2012, at 4:41 PM, Chris Lattner <clattner at apple.com> wrote:> Personally, I don't like the direction of making everything be redundantly specified with "let" clauses and in the patterns. I agree that it is a problem that we're not inferring from Pat<> patterns, and that not all instructions are expressible with patterns, but I'd rather we solve *those* problems than throw out inference.After discussing this, we have reached a workable compromise. Here's what we'll do: The inferred instruction properties will become tristate and default to Unset: bit hasSideEffects = ?; bit mayLoad = ?; bit mayStore = ?; TableGen will attempt to infer these properties from any instruction patterns. If inference fails and a property is unset, TableGen will issue an error instead of guessing. If inference succeeds and a property is set to an inconsistent value, TableGen will issue an error. If inference succeeds and a property is set to a consistent value, TableGen will happily and silently go about its business. This will also handle migration for out-of-tree targets. They can replace 'neverHasSideEffects = 1' with 'hasSideEffects = 0', and any instructions without patterns will be pointed out for audit. /jakob