Guo, Xiaoyi
2013-Apr-29  23:48 UTC
[LLVMdev] [PATCH] Propagate DAG node ordering during legalization and instruction selection
Hi,
We've recently encountered a problem in our compiler where the line number
in debug info jumps back and force even at O0. This is caused by DAG node
ordering not being properly kept during legalization and instruction selection.
There are still uncaught cases after applying the patch mentioned here.
So I have decided to implement the approach suggested by Andy as below. i.e.
maintain the node ordering as a field inside the DAG node and force anyone
creating the DAG node to provide the ordering. When new DAG nodes are created
inside DAG builder, DAG builder maintains the ordering of the current
instruction being processed and provide that ordering to the DAG node creating
routine. When new DAG nodes are created after DAG builder, e.g., during
legalization, the original node's ordering would be transferred to the new
node.
Since it's going to involve a lot of changes, I'd like to get feedback
on the idea and the interface changes before I make changes to all the call
sites.
Attached is a diff of the first batch of changes, which includes interface
changes: a new wrapper class, new fields, interface changes to
SelectionDAG::getXXX() functions.
Your feedback would be appreciated.
Thanks,
Xiaoyi
    From: Andrew Trick <atrick at apple.com>
    Subject: Re: [PATCH] Propagate DAG node ordering during legalization and
instruction selection
    Date: March 20, 2013 12:01:48 AM PDT
    To: Justin Holewinski <justin.holewinski at gmail.com>
    Cc: llvm-commits <llvm-commits at cs.uiuc.edu>
    On Mar 19, 2013, at 1:17 PM, Justin Holewinski <justin.holewinski at
gmail.com> wrote:
        Updated patch attached.
        I've addressed the CSE during legalization issue.  Now ordering is
only propagated if the new node does not have an ordering (is zero), or has an
ordering that is greater than the replaced node.
        As for compile time, I think I may have had some other machine
interference in the 8% figure.  I can't reproduce that now, and both LLVM
unit tests and LNT are not showing any statistically significant differences.  I
see some variation across runs in LNT, but it looks to be machine noise as I see
both regressions and improvements in different benchmarks in different runs.  I
see +/- 0.5% in the unit tests, but that goes both ways.  Both tests use
release+asserts build.
    Your patch looks fine but doesn't go far enough. I'd like to add an
IROrder field to SDNode (eventually we might be able to make it redundant with
NodeId, although there would be temporary points at which nodes have to share an
IROrder). That would remove any concerns about the compile-time of potentially
frequent DenseMap lookup. But I really want to do it to help ensure that IROrder
remains present and valid as a topological order. Then we don't need a
"source order" scheduler at all, which would be really great. Just
emit MIs for the selected nodes in place, and break some physreg interferences.
    Ideally, anyone who creates an SDNode needs to track down the IROrder.
Rather than propagating IROrder when we replace all uses, we would do it when we
morph or CSE the node, similar to DebugLoc. Ensuring topological order is
another aspect of the problem that can be dealt with later.
    It's a big infrastructure project. But if you find any part of this plan
will help you, progress toward that goal is welcome.
    -Andy
        On Tue, Mar 19, 2013 at 1:07 PM, Justin Holewinski <justin.holewinski
at gmail.com> wrote:
            On Tue, Mar 19, 2013 at 12:47 PM, Justin Holewinski
<justin.holewinski at gmail.com> wrote:
                On Tue, Mar 19, 2013 at 2:26 AM, Evan Cheng <evan.cheng at
apple.com> wrote:
                    Sent from my iPad
                    On Mar 18, 2013, at 2:02 PM, Justin Holewinski
<justin.holewinski at gmail.com> wrote:
                    Compile-time impact is negligible for a release build on the
unit tests.  There is about an 8% impact with assertions enabled.
                    Unit tests are much too small for measuring compile time. 8%
for assertion build is massive. Why are there such large discrepancy?
                I've been trying to get measurements from LNT, but I'm
getting too much run-to-run variation.  A few benchmarks show significant
changes (both positive and negative), but the affected benchmarks are diffe
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130429/ae3a5e7b/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: DAGIROrder.1.diff
Type: application/octet-stream
Size: 46730 bytes
Desc: DAGIROrder.1.diff
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130429/ae3a5e7b/attachment.obj>
Andrew Trick
2013-Apr-30  00:40 UTC
[LLVMdev] [PATCH] Propagate DAG node ordering during legalization and instruction selection
Xiaoyi, Thanks for doing this! I think it's critical, not just for debug information, but for the many people who have started using pre-RA-sched=source. This looks more efficient and maintainable to me than the previous approach. Preserving the order for everything except a certain class of nodes (e.g. constants) make DAG serialization straightforward. It just needs to insert copies in rare cases, but won't need to find an optimal order. I'm particularly happy that we can remove uncertainty from the IR->MI conversion. It will be much easier to debug and write tests. My only concern is merge conflicts in out-of-tree <target>ISelLowering and <target>ISelDAGToDAG. I personally don't think it will be too bad, since, as Nadav suggested, it's just a mechanical process of converting DebugLoc definitions: DebugLoc dl = N->getDebugLoc() -> SDLoc dl(N); The getMachineNode call sites themselves should rarely need to change. Also, we just had a significant API change in this area less than 2 weeks ago. It's a good time to make this change and do it right once and for all. But it will be a giant diff. If you think that the changes will be too much, or if anyone else is concerned about merge conflicts, we can consider supporting a backward compatible API. -Andy On Apr 29, 2013, at 4:48 PM, "Guo, Xiaoyi" <Xiaoyi.Guo at amd.com> wrote:> Hi, > > We’ve recently encountered a problem in our compiler where the line number in debug info jumps back and force even at O0. This is caused by DAG node ordering not being properly kept during legalization and instruction selection. There are still uncaught cases after applying the patch mentioned here. > > So I have decided to implement the approach suggested by Andy as below. i.e. maintain the node ordering as a field inside the DAG node and force anyone creating the DAG node to provide the ordering. When new DAG nodes are created inside DAG builder, DAG builder maintains the ordering of the current instruction being processed and provide that ordering to the DAG node creating routine. When new DAG nodes are created after DAG builder, e.g., during legalization, the original node’s ordering would be transferred to the new node. > > Since it’s going to involve a lot of changes, I’d like to get feedback on the idea and the interface changes before I make changes to all the call sites. > > Attached is a diff of the first batch of changes, which includes interface changes: a new wrapper class, new fields, interface changes to SelectionDAG::getXXX() functions. > > Your feedback would be appreciated. > > Thanks, > Xiaoyi > > > From: Andrew Trick <atrick at apple.com> > > Subject: Re: [PATCH] Propagate DAG node ordering during legalization and instruction selection > > Date: March 20, 2013 12:01:48 AM PDT > > To: Justin Holewinski <justin.holewinski at gmail.com> > > Cc: llvm-commits <llvm-commits at cs.uiuc.edu> > > > > On Mar 19, 2013, at 1:17 PM, Justin Holewinski <justin.holewinski at gmail.com> wrote: > > > Updated patch attached. > > > I've addressed the CSE during legalization issue. Now ordering is only propagated if the new node does not have an ordering (is zero), or has an ordering that is greater than the replaced node. > > > As for compile time, I think I may have had some other machine interference in the 8% figure. I can't reproduce that now, and both LLVM unit tests and LNT are not showing any statistically significant differences. I see some variation across runs in LNT, but it looks to be machine noise as I see both regressions and improvements in different benchmarks in different runs. I see +/- 0.5% in the unit tests, but that goes both ways. Both tests use release+asserts build. > > > > Your patch looks fine but doesn't go far enough. I'd like to add an IROrder field to SDNode (eventually we might be able to make it redundant with NodeId, although there would be temporary points at which nodes have to share an IROrder). That would remove any concerns about the compile-time of potentially frequent DenseMap lookup. But I really want to do it to help ensure that IROrder remains present and valid as a topological order. Then we don't need a "source order" scheduler at all, which would be really great. Just emit MIs for the selected nodes in place, and break some physreg interferences. > > Ideally, anyone who creates an SDNode needs to track down the IROrder. Rather than propagating IROrder when we replace all uses, we would do it when we morph or CSE the node, similar to DebugLoc. Ensuring topological order is another aspect of the problem that can be dealt with later. > > It's a big infrastructure project. But if you find any part of this plan will help you, progress toward that goal is welcome. > > -Andy > > > On Tue, Mar 19, 2013 at 1:07 PM, Justin Holewinski <justin.holewinski at gmail.com> wrote: > > > On Tue, Mar 19, 2013 at 12:47 PM, Justin Holewinski <justin.holewinski at gmail.com> wrote: > > > > On Tue, Mar 19, 2013 at 2:26 AM, Evan Cheng <evan.cheng at apple.com> wrote: > > > > > Sent from my iPad > > On Mar 18, 2013, at 2:02 PM, Justin Holewinski <justin.holewinski at gmail.com> wrote: > > > > Compile-time impact is negligible for a release build on the unit tests. There is about an 8% impact with assertions enabled. > > > Unit tests are much too small for measuring compile time. 8% for assertion build is massive. Why are there such large discrepancy? > > > I've been trying to get measurements from LNT, but I'm getting too much run-to-run variation. A few benchmarks show significant changes (both positive and negative), but the affected benchmarks are diffe > > > <DAGIROrder.1.diff>_______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130429/496262b0/attachment.html>
Eric Christopher
2013-Apr-30  07:54 UTC
[LLVMdev] [PATCH] Propagate DAG node ordering during legalization and instruction selection
On Tue, Apr 30, 2013 at 12:48 AM, Guo, Xiaoyi <Xiaoyi.Guo at amd.com> wrote:> Hi,**** > > ** ** > > We’ve recently encountered a problem in our compiler where the line number > in debug info jumps back and force even at O0. This is caused by DAG node > ordering not being properly kept during legalization and instruction > selection. There are still uncaught cases after applying the patch > mentioned here.**** > > ** >Another option is checking how much, at O0, the fast instruction selector is falling back to the DAG - if you have one at least. Fast isel doesn't reorder to the degree that selection on the DAG does. -eric> ** > > So I have decided to implement the approach suggested by Andy as below. > i.e. maintain the node ordering as a field inside the DAG node and force > anyone creating the DAG node to provide the ordering. When new DAG nodes > are created inside DAG builder, DAG builder maintains the ordering of the > current instruction being processed and provide that ordering to the DAG > node creating routine. When new DAG nodes are created after DAG builder, > e.g., during legalization, the original node’s ordering would be > transferred to the new node.**** > > ** ** > > Since it’s going to involve a lot of changes, I’d like to get feedback on > the idea and the interface changes before I make changes to all the call > sites.**** > > ** ** > > Attached is a diff of the first batch of changes, which includes interface > changes: a new wrapper class, new fields, interface changes to > SelectionDAG::getXXX() functions.**** > > ** ** > > Your feedback would be appreciated.**** > > ** ** > > Thanks,**** > > Xiaoyi**** > > > > From: Andrew Trick <atrick at apple.com> > > Subject: Re: [PATCH] Propagate DAG node ordering during legalization > and instruction selection > > Date: March 20, 2013 12:01:48 AM PDT > > To: Justin Holewinski <justin.holewinski at gmail.com> > > Cc: llvm-commits <llvm-commits at cs.uiuc.edu> > > > > > On Mar 19, 2013, at 1:17 PM, Justin Holewinski < > justin.holewinski at gmail.com> wrote: > > > Updated patch attached. > > > I've addressed the CSE during legalization issue. Now ordering is > only propagated if the new node does not have an ordering (is zero), or has > an ordering that is greater than the replaced node. > > > As for compile time, I think I may have had some other machine > interference in the 8% figure. I can't reproduce that now, and both LLVM > unit tests and LNT are not showing any statistically significant > differences. I see some variation across runs in LNT, but it looks to be > machine noise as I see both regressions and improvements in different > benchmarks in different runs. I see +/- 0.5% in the unit tests, but that > goes both ways. Both tests use release+asserts build. > > > > Your patch looks fine but doesn't go far enough. I'd like to add an > IROrder field to SDNode (eventually we might be able to make it redundant > with NodeId, although there would be temporary points at which nodes have > to share an IROrder). That would remove any concerns about the compile-time > of potentially frequent DenseMap lookup. But I really want to do it to help > ensure that IROrder remains present and valid as a topological order. Then > we don't need a "source order" scheduler at all, which would be really > great. Just emit MIs for the selected nodes in place, and break some > physreg interferences. > > Ideally, anyone who creates an SDNode needs to track down the IROrder. > Rather than propagating IROrder when we replace all uses, we would do it > when we morph or CSE the node, similar to DebugLoc. Ensuring topological > order is another aspect of the problem that can be dealt with later. > > It's a big infrastructure project. But if you find any part of this > plan will help you, progress toward that goal is welcome. > > -Andy > > > On Tue, Mar 19, 2013 at 1:07 PM, Justin Holewinski < > justin.holewinski at gmail.com> wrote: > > > On Tue, Mar 19, 2013 at 12:47 PM, Justin Holewinski < > justin.holewinski at gmail.com> wrote: > > > > On Tue, Mar 19, 2013 at 2:26 AM, Evan Cheng < > evan.cheng at apple.com> wrote: > > > > > Sent from my iPad > > On Mar 18, 2013, at 2:02 PM, Justin Holewinski < > justin.holewinski at gmail.com> wrote: > > > > Compile-time impact is negligible for a release build > on the unit tests. There is about an 8% impact with assertions enabled. > > > Unit tests are much too small for measuring compile > time. 8% for assertion build is massive. Why are there such large > discrepancy? > > > I've been trying to get measurements from LNT, but I'm > getting too much run-to-run variation. A few benchmarks show significant > changes (both positive and negative), but the affected benchmarks are diffe > **** > > ** ** > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130430/bb3e96b6/attachment.html>
Guo, Xiaoyi
2013-Apr-30  18:00 UTC
[LLVMdev] [PATCH] Propagate DAG node ordering during legalization and instruction selection
Hi Eric,
Sorry I wasn't clear. The problem happened in the "source" pre-RA
scheduler, which relies on DAG node ordering to schedule the nodes.
Xiaoyi
From: Eric Christopher [mailto:echristo at gmail.com]
Sent: Tuesday, April 30, 2013 12:54 AM
To: Guo, Xiaoyi
Cc: LLVM Developers Mailing List
Subject: Re: [LLVMdev] [PATCH] Propagate DAG node ordering during legalization
and instruction selection
On Tue, Apr 30, 2013 at 12:48 AM, Guo, Xiaoyi <Xiaoyi.Guo at
amd.com<mailto:Xiaoyi.Guo at amd.com>> wrote:
Hi,
We've recently encountered a problem in our compiler where the line number
in debug info jumps back and force even at O0. This is caused by DAG node
ordering not being properly kept during legalization and instruction selection.
There are still uncaught cases after applying the patch mentioned here.
Another option is checking how much, at O0, the fast instruction selector is
falling back to the DAG - if you have one at least. Fast isel doesn't
reorder to the degree that selection on the DAG does.
-eric
So I have decided to implement the approach suggested by Andy as below. i.e.
maintain the node ordering as a field inside the DAG node and force anyone
creating the DAG node to provide the ordering. When new DAG nodes are created
inside DAG builder, DAG builder maintains the ordering of the current
instruction being processed and provide that ordering to the DAG node creating
routine. When new DAG nodes are created after DAG builder, e.g., during
legalization, the original node's ordering would be transferred to the new
node.
Since it's going to involve a lot of changes, I'd like to get feedback
on the idea and the interface changes before I make changes to all the call
sites.
Attached is a diff of the first batch of changes, which includes interface
changes: a new wrapper class, new fields, interface changes to
SelectionDAG::getXXX() functions.
Your feedback would be appreciated.
Thanks,
Xiaoyi
    From: Andrew Trick <atrick at apple.com<mailto:atrick at
apple.com>>
    Subject: Re: [PATCH] Propagate DAG node ordering during legalization and
instruction selection
    Date: March 20, 2013 12:01:48 AM PDT
    To: Justin Holewinski <justin.holewinski at
gmail.com<mailto:justin.holewinski at gmail.com>>
    Cc: llvm-commits <llvm-commits at cs.uiuc.edu<mailto:llvm-commits at
cs.uiuc.edu>>
    On Mar 19, 2013, at 1:17 PM, Justin Holewinski <justin.holewinski at
gmail.com<mailto:justin.holewinski at gmail.com>> wrote:
        Updated patch attached.
        I've addressed the CSE during legalization issue.  Now ordering is
only propagated if the new node does not have an ordering (is zero), or has an
ordering that is greater than the replaced node.
        As for compile time, I think I may have had some other machine
interference in the 8% figure.  I can't reproduce that now, and both LLVM
unit tests and LNT are not showing any statistically significant differences.  I
see some variation across runs in LNT, but it looks to be machine noise as I see
both regressions and improvements in different benchmarks in different runs.  I
see +/- 0.5% in the unit tests, but that goes both ways.  Both tests use
release+asserts build.
    Your patch looks fine but doesn't go far enough. I'd like to add an
IROrder field to SDNode (eventually we might be able to make it redundant with
NodeId, although there would be temporary points at which nodes have to share an
IROrder). That would remove any concerns about the compile-time of potentially
frequent DenseMap lookup. But I really want to do it to help ensure that IROrder
remains present and valid as a topological order. Then we don't need a
"source order" scheduler at all, which would be really great. Just
emit MIs for the selected nodes in place, and break some physreg interferences.
    Ideally, anyone who creates an SDNode needs to track down the IROrder.
Rather than propagating IROrder when we replace all uses, we would do it when we
morph or CSE the node, similar to DebugLoc. Ensuring topological order is
another aspect of the problem that can be dealt with later.
    It's a big infrastructure project. But if you find any part of this plan
will help you, progress toward that goal is welcome.
    -Andy
        On Tue, Mar 19, 2013 at 1:07 PM, Justin Holewinski <justin.holewinski
at gmail.com<mailto:justin.holewinski at gmail.com>> wrote:
            On Tue, Mar 19, 2013 at 12:47 PM, Justin Holewinski
<justin.holewinski at gmail.com<mailto:justin.holewinski at
gmail.com>> wrote:
                On Tue, Mar 19, 2013 at 2:26 AM, Evan Cheng <evan.cheng at
apple.com<mailto:evan.cheng at apple.com>> wrote:
                    Sent from my iPad
                    On Mar 18, 2013, at 2:02 PM, Justin Holewinski
<justin.holewinski at gmail.com<mailto:justin.holewinski at
gmail.com>> wrote:
                    Compile-time impact is negligible for a release build on the
unit tests.  There is about an 8% impact with assertions enabled.
                    Unit tests are much too small for measuring compile time. 8%
for assertion build is massive. Why are there such large discrepancy?
                I've been trying to get measurements from LNT, but I'm
getting too much run-to-run variation.  A few benchmarks show significant
changes (both positive and negative), but the affected benchmarks are diffe
_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu<mailto:LLVMdev at cs.uiuc.edu>        
http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130430/3cfe7f30/attachment.html>
Apparently Analagous Threads
- [LLVMdev] [PATCH] Propagate DAG node ordering during legalization and instruction selection
- [LLVMdev] [PATCH] Propagate DAG node ordering during legalization and instruction selection
- [LLVMdev] [PATCH] Propagate DAG node ordering during legalization and instruction selection
- [LLVMdev] Ordering not assigned to DAG Nodes create after DAG builder
- [LLVMdev] Ordering not assigned to DAG Nodes create after DAG builder