thr3ads.net - llvm dev - [LLVMdev] DAGCombiner rant [Jan 2009]

If this information is useful, please help other people find it:
Share via:

Scott Michel

2009-Jan-26 04:04 UTC

[LLVMdev] DAGCombiner rant

Yes, it was I who put that rant in the commit log and it's justified. Worse,
it's unreasonable to actually go through all of DAGCombiner's code and
check
to see if certain kinds of constants, e.g., i64, are legal during a
particular phase of DAGCombiner. DAGCombiner does good work and the backends
are supposed to be good citizens. CellSPU is certainly trying to be a good
citizen, no matter how frustrating that becomes on certain days.

We need to document this knowledge somewhere:

 (a) If you're going to legalize operations, you should constrain yourself
to using target independent nodes insofar as feasible. Think twice before
adding a new pseudo-instruction: it prevents DAGCombiner optimizations and
sometimes DAGcombiner will unintentionally work around your operation
legalizations in unexpected ways (i.e., transforming (fabs:f64 arg) to
(bitconvert:f64 (and:i64 (bitconvert:i64 arg), 0x7fffffffffffffff)))

(b) If you add new pseudo-instructions (i.e., target-specific nodes in your
ISelLowering source), make sure you know how to PerformDAGCombine. If
there's no possible way to do PerformDAGCombine on your new
pseudo-instructions, see (a).

(c) If at all possible, handle target-specific instruction selection, e.g.,
64-bit loads or other speciality instruction selection, in your ISelDAGToDAG
source. For example, CellSPU cannot lower 64-bit add, substract or multiply
operations using target-independent nodes. These are custom
instruction-selected using pseudo-instructions that only the instruction
selector will see.

(d) If (a), (b) or (c) don't work, then, and only then, create a new
target-dependent node and its corresponding ISD-extended pseudo instruction.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20090125/8511b1a2/attachment.html>

Dan Gohman

2009-Jan-28 21:42 UTC

head link

[LLVMdev] DAGCombiner rant

Hi Scott,

I'm not clear on what you're saying here; some of your
points below seem to be contradictory. The advice to
use target-independent nodes when feasible seems
sound to me, so I wrote up a comment about it in
SelectionDAGNodes.h. If you can formulate your
thoughts in the form of specific documentation changes,
that would be helpful.

In theory, DAGCombiner is supposed to check if an operation
is legal before using it, when it is running after legalize.
It seems in practice it often doesn't do this. Our usual
response when people hit this problem is "then fix it".
Are you saying that the extent of the problem makes this
infeasible?

Dan

On Jan 25, 2009, at 8:04 PM, Scott Michel wrote:
> Yes, it was I who put that rant in the commit log and it's  
> justified. Worse, it's unreasonable to actually go through all of  
> DAGCombiner's code and check to see if certain kinds of constants,  
> e.g., i64, are legal during a particular phase of DAGCombiner.  
> DAGCombiner does good work and the backends are supposed to be good  
> citizens. CellSPU is certainly trying to be a good citizen, no  
> matter how frustrating that becomes on certain days.
>
> We need to document this knowledge somewhere:
>
>  (a) If you're going to legalize operations, you should constrain  
> yourself to using target independent nodes insofar as feasible.  
> Think twice before adding a new pseudo-instruction: it prevents  
> DAGCombiner optimizations and sometimes DAGcombiner will  
> unintentionally work around your operation legalizations in  
> unexpected ways (i.e., transforming (fabs:f64 arg) to  
> (bitconvert:f64 (and:i64 (bitconvert:i64 arg), 0x7fffffffffffffff)))
>
> (b) If you add new pseudo-instructions (i.e., target-specific nodes  
> in your ISelLowering source), make sure you know how to  
> PerformDAGCombine. If there's no possible way to do  
> PerformDAGCombine on your new pseudo-instructions, see (a).
>
> (c) If at all possible, handle target-specific instruction  
> selection, e.g., 64-bit loads or other speciality instruction  
> selection, in your ISelDAGToDAG source. For example, CellSPU cannot  
> lower 64-bit add, substract or multiply operations using target- 
> independent nodes. These are custom instruction-selected using  
> pseudo-instructions that only the instruction selector will see.
>
> (d) If (a), (b) or (c) don't work, then, and only then, create a new  
> target-dependent node and its corresponding ISD-extended pseudo  
> instruction.
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Scott Michel

2009-Jan-30 20:10 UTC

head link

[LLVMdev] DAGCombiner rant

On Wed, Jan 28, 2009 at 1:42 PM, Dan Gohman <gohman at apple.com> wrote:
> Hi Scott,
>
> I'm not clear on what you're saying here; some of your
> points below seem to be contradictory. The advice to
> use target-independent nodes when feasible seems
> sound to me, so I wrote up a comment about it in
> SelectionDAGNodes.h. If you can formulate your
> thoughts in the form of specific documentation changes,
> that would be helpful.

Sorry if I wasn't clear, since I was venting some of my frustration with my
continual battle with DAGCombine and the CellSPU backend. DAGCombine should
operate according to the principle of least surprise; yet, I'm continually
being surprised.

My biggest frustration, until recently, was finding new i64 constants being
generated by DAGCombine after operation legalization. Prior to that,
InstCombine was inserting i64 multiplications; I ended up biting the bullet
and generating the full 48+ instruction sequence for i64 mul for CellSPU in
terms of the i16 mulipliers that Cell's SPU actually support.I moved all of
i64 constant generation to instruction selection, when originally, i64
constants were legalized by target-specific nodes.

My approach now is the path of least pain and better citizenship within the
LLVM framework itself.

Specfic guidance for backend developers:

(a) Avoid target-specific nodes.
(b) Legalize operations in terms of existing ISD instructions.
(c) Don't be surprised if DAGCombiner is too smart for your backend's
capabilities.
(d) The size and scope of changing DAGCombiner is daunting, so if you can
avoid changing or altering DAGCombiner, don't change DAGCombiner. For
example, how many changes need to be made to check of i64 constants are
legal during a particular phase of DAGCombiner -- you have to do a
__complete__ job, not just patch the place that just bit you in the
backside.
(e) Handle as much target-specific <foo> during instruction selection
(e.g.,
i64 constants). IFF there's no way to handle <foo> during instruction
selection, then consider adding a new target-specific node for operation
legalization.
(f) If in doubt about any of the above suggestions, refer to (e) and (a).

Regarding (d), I'd rather do a complete job than a half-baked job if I were
going to submit a patch against DAGCombiner. Looking at the code, even using
NetBeans' IDE, tracking down "illegal" i64 constants is a daunting
task. The
path of least pain was to move i64 constant generation to instruction
selection and leave DAGCombiner alone. Basically, the guidance is "Do as
much of what you can during instruction selection, rewrite operations as
target-independent nodes insofar as feasible, and only as a last resort, add
new target-specific nodes because the pain threshold for altering
DAGCombiner or SelectionDAGLegalize is sufficiently intolerable." (Yes, all
one needs to look for are occurances of
"DAG->getConstant(MVT::i64", but
there's the possibility of either getting it wrong or being incomplete.)

In (e), however, most backed developers will immediately notice that they
have to run a set of predicates that will be executed a second time by the
generated Select() code. There's probably a better way to do that.

Example: Cell's SPU is a vector processor with a "unform register
set",
meaning that scalars are always in slot 0 of the vector (mostly.) Unformity
has to be preserved at all times. Hence, at least in CellSPU's case, i64
constants __have__ to be represented as BUILD_VECTOR nodes and the various
selection predicates are run against the node to determine if the result is
a simple constant load that propagates to all vector elements or whether a
constant pool spill needs to occur (depends on the constant and avoiding a
CP spill.) If a predicate succeeds, then the generated Select() method is
invoked, which runs all of the predicates a second time.

Hope that clears up some of the confusion.

-scooter
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20090130/04507975/attachment.html>

Maybe Matching Threads

Search for more seemingly similar threads

llvm dev - Jan 2009 - [LLVMdev] DAGCombiner rant

[LLVMdev] DAGCombiner rant

[LLVMdev] DAGCombiner rant

[LLVMdev] DAGCombiner rant

Maybe Matching Threads