thr3ads.net - llvm dev - [LLVMdev] Let's get rid of neverHasSideEffects [Aug 2012]

If this information is useful, please help other people find it:
Share via:

Jakob Stoklund Olesen

2012-Aug-21 22:45 UTC

[LLVMdev] Let's get rid of neverHasSideEffects

On Aug 21, 2012, at 3:02 PM, Chris Lattner <clattner at apple.com> wrote:
> 
> On Aug 21, 2012, at 2:02 PM, Jakob Stoklund Olesen <stoklund at
2pi.dk> wrote:
> 
>> All,
>> 
>> TableGen likes to infer the MCID::UnmodeledSideEffects flag from an
instruction's pattern. When an instruction doesn't have a pattern, it is
assumed to have side effects.
> 
> Hi Jakob,
> 
> I don't understand what you're saying.  Are you proposing that all
properties (may load, store, side effects) be explicitly added to all
instructions, and the pattern only be used to produce warnings?
Yes.

The side effect inference is worse than the load/store inference, but features
like this work best when they work all the time. If all instructions had
patterns, and we could accurately infer properties, it wouldn't be a
problem.
> Won't this hugely bloat the .td files?
Not really, TableGen has a fairly convenient syntax for bulk flagging:

let mayLoad = 1 in {
let Defs = [AL,EFLAGS,AX], Uses = [AX] in
def DIV8m  : I<0xF6, MRM6m, (outs), (ins i8mem:$src),   // AX/[mem8] = AL,AH
               "div{b}\t$src", [], IIC_DIV8_MEM>;
let Defs = [AX,DX,EFLAGS], Uses = [AX,DX] in
def DIV16m : I<0xF7, MRM6m, (outs), (ins i16mem:$src),  // DX:AX/[mem16] =
AX,DX
               "div{w}\t$src", [], IIC_DIV16>, OpSize;
let Defs = [EAX,EDX,EFLAGS], Uses = [EAX,EDX] in    // EDX:EAX/[mem32] = EAX,EDX
def DIV32m : I<0xF7, MRM6m, (outs), (ins i32mem:$src),
               "div{l}\t$src", [], IIC_DIV32>;
// RDX:RAX/[mem64] = RAX,RDX
let Defs = [RAX,RDX,EFLAGS], Uses = [RAX,RDX] in
def DIV64m : RI<0xF7, MRM6m, (outs), (ins i64mem:$src),
                "div{q}\t$src", [], IIC_DIV64>;
}

It is a complete coincidence that these instructions happen to be missing
neverHasSideEffects = 1 ;)

Often, you can set the mayLoad flag as part of a multiclass when you define the
*rm variant. X86 is the worst case because so many instructions can fold loads
and stores. You can handle most of them by setting the mayLoad and mayStore
flags in the big ArithBinOp multiclasses.

You can see the problem in the big ArithBinOp_RF multiclass in
X86InstrArithmetic.td. It's not obvious at a glance which of those
instructions have patterns, and as a result we forgot to set neverHasSideEffects
on the *_REV and *8i8 variants. I don't think it would detract to add
"let mayLoad = 1 in {…}" around the *rm groups etc.

Or you could set the flags on the various BinOpRMW super classes.

/jakob

Chris Lattner

2012-Aug-21 23:41 UTC

head link

[LLVMdev] Let's get rid of neverHasSideEffects

On Aug 21, 2012, at 3:45 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk>
wrote:>> I don't understand what you're saying.  Are you proposing that
all properties (may load, store, side effects) be explicitly added to all
instructions, and the pattern only be used to produce warnings?
> 
> Yes.
> 
> The side effect inference is worse than the load/store inference, but
features like this work best when they work all the time. If all instructions
had patterns, and we could accurately infer properties, it wouldn't be a
problem.
Ok, I agree with that.  I think it's crazy that we still can't write a
pattern for copy/move instructions, so we have to smatter attributes on them. 
What are the other cases that we can't write a pattern for them?

I agree that it is sometimes convenient to write Pat<> patterns instead of
putting a pattern on an instruction: why not make tblgen also infer predicates
from Pat patterns that expand to a single instruction?
>> Won't this hugely bloat the .td files?
> 
> Not really, TableGen has a fairly convenient syntax for bulk flagging:
> 
> let mayLoad = 1 in {
> let Defs = [AL,EFLAGS,AX], Uses = [AX] in
> def DIV8m  : I<0xF6, MRM6m, (outs), (ins i8mem:$src),   // AX/[mem8] =
AL,AH
>               "div{b}\t$src", [], IIC_DIV8_MEM>;
> let Defs = [AX,DX,EFLAGS], Uses = [AX,DX] in
> def DIV16m : I<0xF7, MRM6m, (outs), (ins i16mem:$src),  // DX:AX/[mem16]
= AX,DX
>               "div{w}\t$src", [], IIC_DIV16>, OpSize;
> let Defs = [EAX,EDX,EFLAGS], Uses = [EAX,EDX] in    // EDX:EAX/[mem32] =
EAX,EDX
> def DIV32m : I<0xF7, MRM6m, (outs), (ins i32mem:$src),
>               "div{l}\t$src", [], IIC_DIV32>;
> // RDX:RAX/[mem64] = RAX,RDX
> let Defs = [RAX,RDX,EFLAGS], Uses = [RAX,RDX] in
> def DIV64m : RI<0xF7, MRM6m, (outs), (ins i64mem:$src),
>                "div{q}\t$src", [], IIC_DIV64>;
> }
> 
> It is a complete coincidence that these instructions happen to be missing
neverHasSideEffects = 1 ;)
It's also a serious tblgen deficiency that we can't write patterns for
these.  Look at the terrible code in X86ISelDAGToDAG.cpp that is required to
match these.
> Often, you can set the mayLoad flag as part of a multiclass when you define
the *rm variant. X86 is the worst case because so many instructions can fold
loads and stores. You can handle most of them by setting the mayLoad and
mayStore flags in the big ArithBinOp multiclasses.
> 
> You can see the problem in the big ArithBinOp_RF multiclass in
X86InstrArithmetic.td. It's not obvious at a glance which of those
instructions have patterns, and as a result we forgot to set neverHasSideEffects
on the *_REV and *8i8 variants. I don't think it would detract to add
"let mayLoad = 1 in {…}" around the *rm groups etc.
Personally, I don't like the direction of making everything be redundantly
specified with "let" clauses and in the patterns.  I agree that it is
a problem that we're not inferring from Pat<> patterns, and that not
all instructions are expressible with patterns, but I'd rather we solve
*those* problems than throw out inference.

-Chris

Jakob Stoklund Olesen

2012-Aug-22 17:41 UTC

head link

[LLVMdev] Let's get rid of neverHasSideEffects

On Aug 21, 2012, at 4:41 PM, Chris Lattner <clattner at apple.com> wrote:
> On Aug 21, 2012, at 3:45 PM, Jakob Stoklund Olesen <stoklund at
2pi.dk> wrote:
>>> I don't understand what you're saying.  Are you proposing
that all properties (may load, store, side effects) be explicitly added to all
instructions, and the pattern only be used to produce warnings?
>> 
>> Yes.
>> 
>> The side effect inference is worse than the load/store inference, but
features like this work best when they work all the time. If all instructions
had patterns, and we could accurately infer properties, it wouldn't be a
problem.
> 
> Ok, I agree with that.  I think it's crazy that we still can't
write a pattern for copy/move instructions, so we have to smatter attributes on
them.  What are the other cases that we can't write a pattern for them?
Off the top of my head, I'm sure Jim and Owen could pile on:

- Copy/move instructions. We specifically don't want isel to emit these
instructions. We prefer a generic COPY which can be recognized quickly and
completely by SSA passes and coalescing/regalloc. The only attribute that needs
to be set on a copy/move instruction is neverHasSideEffects - precisely the
default I want to change.

- Big chunks of X86InstrSystem.td. Some of these have intrinsics, some
don't. We want accurate attributes on all of them.

- Assembly-only instructions. Some instructions have variants that can be
written in assembly, but that isel would never produce.

- Disassembly-only instructions. Some variants can't even be written in
assembly, but a disassembler can decode it, and we want accurate attributes for
binary analysis.

- Instructions that are produced by later codegen passes. Opcodes for branch
relaxation, cbz formation on ARM, the 27 variants of xmm loads and stores on
x86, etc.

- The non-pseudo versions of the x87 instruction set.

- Instructions that produce non-existant types. ARM's VLD1D3 instruction
class loads 3 consecutive D-registers. There is no v3f64 or v3i64 type, and I
don't think we want there to be.

- Instructions that produce illegal types. ARM's VLD1D4 instruction class
could be modeled as a v4f64 load, but we don't support other operations on
that type, and it wouldn't make sense to make it legal. If I put a pattern
on such an instruction, I would rather have TableGen tell me that the pattern
will never be able to match.

- Instructions that don't map 1-1 to their intrinsics. Some NEON intrinsics
expand to two instructions.

- Instructions that read or write specific physregs, like DIV and MUL. Clearly a
TableGen defect, as you mentioned.

- Instructions that define multiple values. Another TableGen shortcoming.

In many of these cases, we don't ever want isel emitting the instructions. I
think adding disabled patterns to such instructions as a way of specifying
properties would be backwards.
> I agree that it is sometimes convenient to write Pat<> patterns
instead of putting a pattern on an instruction: why not make tblgen also infer
predicates from Pat patterns that expand to a single instruction?
I would love to treat Pat<> and instruction patterns equally. Keep
reading.
> Personally, I don't like the direction of making everything be
redundantly specified with "let" clauses and in the patterns.  I agree
that it is a problem that we're not inferring from Pat<> patterns, and
that not all instructions are expressible with patterns, but I'd rather we
solve *those* problems than throw out inference.
We need to fix the current approach. It appears capricious to someone who
isn't closely familiar, and that is most of us. Two features are directly
harmful: Inference silently falls back to plain guessing when it fails, and when
it works, you get a warning about any redundantly specified properties.

I would compare it to a clang warning: "'x' is initialized, but I
can prove that the value is always overwritten before being used." You
remove the initializer on x to get rid of the warning. When the code later
changes to read the uninitialized value, clang silently guesses that you wanted
x to be 7. We would drive Doug out of town with pitchforks if clang did
something like that.

We could improve the inference coverage to include Pat<>'s, and I
think we should. However, it would only add caprice. Somebody changes a
single-instruction Pat<> to a multi-instruction pattern, and we expect him
to know that he needs to decorate the instruction with attributes? That's
not a reasonable expectation.

A tolerable inference approach is possible, but it absolutely needs to:
- Don't guess when it fails, but complain loudly if some attributes are not
given explicitly, and
- Permit and verify explicitly set attributes when it does work.

I don't want to do that because I think we can do better. I want to:

- Explicitly specify all instruction properties that differ from the sane
defaults (not a load, not a store, no side effects). Exploiting the regularity
in an instruction set to do this easily is actually one of TableGen's strong
sides.

- Use the current inference code to emit warnings instead. This would
significantly improve TableGen diagnostics.

- Analyze single- and multi-instruction Pat<>'s the same way
instruction patterns are analyzed. Notice how easy it is to analyze
multi-instruction patterns compared to the near-impossible task of using them
for inference.

This has real benefits. If you factor your instruction classes well, you can tag
entire classes of instructions with mayLoad/mayStore very easily. Obscure
opcodes that would never get a pattern are likely to be tagged correctly, and
you get free verification of the instructions that do have patterns. Any kind of
pattern.

It also improves readability by making in-instruction patterns and
Pat<>'s equals, and by treating all instruction properties equally.
There will be much fewer surprises for the casual visitor.

/jakob

Jakob Stoklund Olesen

2012-Aug-22 21:26 UTC

head link

[LLVMdev] Let's get rid of neverHasSideEffects

On Aug 21, 2012, at 4:41 PM, Chris Lattner <clattner at apple.com> wrote:
> Personally, I don't like the direction of making everything be
redundantly specified with "let" clauses and in the patterns.  I agree
that it is a problem that we're not inferring from Pat<> patterns, and
that not all instructions are expressible with patterns, but I'd rather we
solve *those* problems than throw out inference.
After discussing this, we have reached a workable compromise. Here's what
we'll do:

The inferred instruction properties will become tristate and default to Unset:

  bit hasSideEffects = ?;
  bit mayLoad = ?;
  bit mayStore = ?;

TableGen will attempt to infer these properties from any instruction patterns.

If inference fails and a property is unset, TableGen will issue an error instead
of guessing.

If inference succeeds and a property is set to an inconsistent value, TableGen
will issue an error.

If inference succeeds and a property is set to a consistent value, TableGen will
happily and silently go about its business.

This will also handle migration for out-of-tree targets. They can replace
'neverHasSideEffects = 1' with 'hasSideEffects = 0', and any
instructions without patterns will be pointed out for audit.

/jakob

Maybe Matching Threads

Search for more possibly parallel threads

llvm dev - Aug 2012 - [LLVMdev] Let's get rid of neverHasSideEffects

[LLVMdev] Let's get rid of neverHasSideEffects

[LLVMdev] Let's get rid of neverHasSideEffects

[LLVMdev] Let's get rid of neverHasSideEffects

[LLVMdev] Let's get rid of neverHasSideEffects

Maybe Matching Threads