thr3ads.net - llvm dev - [LLVMdev] [llvm] r184698 - Add a flag to defer vectorization into a phase after the inliner and its [Jun 2013]

Nadav Rotem

2013-Jun-24 21:59 UTC

[LLVMdev] [llvm] r184698 - Add a flag to defer vectorization into a phase after the inliner and its

> Just for the record, I have no real expectation that this is a good idea
> yet... But it's hard to collect numbers without a flag of some kind, and
> it's also really annoying to craft this flag given the current pass manager,
> so I figured I would get a skeleton in place that folks could experiment with,
> and we could keep or delete based on this discussion and any numbers.

I agree. Adding a temporary flag is a good way to allow people to test this
change with minimal effort.  This is what we did when Jeffrey Yasskin wanted to
check the vectorizer a few weeks ago.

> I see some potential issues:
>
> * We run a loop pass later again with the associated compile time cost (?)
>
> All of these share a common thread: the vectorizer somewhat inherently
> loses information, and thus doing it during the iterative pass manager is very
> risky as later iterations may be hampered by it.

I agree. The vectorizer is a *lowering* pass and, much like LSR, it loses
information.  A few months ago some of us talked about this and came up with a
general draft for the ideal pass ordering.
If I remember correctly the plan was that the second half of the pipe should
start with GVN (which currently runs after the loop passes). After that come the
loop passes, the Vectorizers (loop vectorization first), and finally LSR,
Lower-switch, CGP, etc.  I think that when we discussed this most people argued
that the inliner should be before GVN and the loop passes. It would be
interesting to see the performance numbers for the new pass order.

Nadav 

Chandler Carruth

2013-Jun-24 22:09 UTC

On Mon, Jun 24, 2013 at 2:59 PM, Nadav Rotem <nrotem at apple.com> wrote:
> I agree. The vectorizer is a *lowering* pass and, much like LSR, it
> loses information.  A few months ago some of us talked about this and came
> up with a general draft for the ideal pass ordering.
>
Where? On the mailing list?

> If I remember correctly the plan was that the second half of the pipe
> should start with GVN (which currently runs after the loop passes). After
> that come the loop passes, the Vectorizers (loop vectorization first), and
> finally LSR, Lower-switch, CGP, etc.  I think that when we discussed this
> most people argued that the inliner should be before GVN and the loop
> passes. It would be interesting to see the performance numbers for the new
> pass order.
>
This doesn't make a lot of sense to me yet.

The inliner, GVN, and the loop passes run together, *iteratively*. They are
neither before nor after one another. And this is important as it allows
iterative simplification in the inliner. It is one of the most critical
optimizations for C++ code that LLVM does.

We can't sink all of the loop passes out of the iterative pass model
either, because deleting loops, simplifying them, etc. all directly feed
the iterative simplification needed by GVN and the inliner.

We need a *second* loop pass that happens after the iterative CGSCC walk
which does the further optimizations such as (potentially) indvars, the
vectorizers, LSR, lower-switch, CGP, CG. I think we actually want most of
the post CGSCC module passes to run after the vectorizers and before LSR to
fold away constants and globals that look different after vectorization
compared to before, but aren't significantly shifted by LSR and CGP.
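
The point about iterative simplification feeding the inliner can be illustrated with a toy model. This is only a sketch under stated assumptions, not LLVM code: the call graph is assumed acyclic (real CGSCC walks handle cycles), "simplifying" a function is modeled as halving its size, and `THRESHOLD`, `bottom_up`, `run`, `SIZES` and `CALLS` are all hypothetical names and numbers chosen for illustration.

```python
# Toy model, not LLVM code. Each function has a size and a list of callees.
# A callee is inlined when its size is at most THRESHOLD (a made-up number).
THRESHOLD = 10


def bottom_up(calls, root):
    """Post-order walk of an acyclic call graph: callees before callers."""
    seen, order = set(), []

    def visit(f):
        if f in seen:
            return
        seen.add(f)
        for callee in calls[f]:
            visit(callee)
        order.append(f)

    visit(root)
    return order


def run(sizes, calls, root, simplify_in_walk):
    """Return the (callee, caller) pairs that get inlined.

    When simplify_in_walk is True, each function is simplified (size halved,
    a stand-in for GVN, loop deletion, etc.) right after its callees have
    been considered for inlining, so its callers see the smaller body.
    """
    sizes = dict(sizes)  # work on a copy
    inlined = []
    for f in bottom_up(calls, root):
        for callee in calls[f]:
            if sizes[callee] <= THRESHOLD:
                sizes[f] += sizes[callee]
                inlined.append((callee, f))
        if simplify_in_walk:
            sizes[f] //= 2
    return inlined


# Hypothetical three-function program: main -> mid -> leaf.
SIZES = {"leaf": 16, "mid": 4, "main": 4}
CALLS = {"leaf": [], "mid": ["leaf"], "main": ["mid"]}

if __name__ == "__main__":
    print("interleaved:", run(SIZES, CALLS, "main", True))
    print("inline only:", run(SIZES, CALLS, "main", False))
```

In this toy, `leaf` only drops under the threshold because it is simplified before `mid` is visited, so the interleaved run inlines along the whole chain while the inline-only run stops at `mid`. That is the dependency that makes it hard to move the simplifying loop passes out of the walk.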

Chris Lattner

2013-Jun-24 22:17 UTC

On Jun 24, 2013, at 3:09 PM, Chandler Carruth <chandlerc at gmail.com> wrote:
> The inliner, GVN, and the loop passes run together, *iteratively*. They are
> neither before nor after one another. And this is important as it allows
> iterative simplification in the inliner. It is one of the most critical
> optimizations for C++ code that LLVM does.
>
> We can't sink all of the loop passes out of the iterative pass model
> either, because deleting loops, simplifying them, etc. all directly feed the
> iterative simplification needed by GVN and the inliner.
>
> We need a *second* loop pass that happens after the iterative CGSCC walk
> which does the further optimizations such as (potentially) indvars, the
> vectorizers, LSR, lower-switch, CGP, CG. I think we actually want most of the
> post CGSCC module passes to run after the vectorizers and before LSR to fold
> away constants and globals that look different after vectorization compared to
> before, but aren't significantly shifted by LSR and CGP.

In terms of mental model, is it best to think of vectorization as being a loop
pass or as a late lowering pass?

What about when we get more aggressive loop transformations like blocking, strip
mining, fusion, etc?

-Chris

Andrew Trick

2013-Jun-25 02:16 UTC

On Jun 24, 2013, at 3:09 PM, Chandler Carruth <chandlerc at gmail.com> wrote:
> On Mon, Jun 24, 2013 at 2:59 PM, Nadav Rotem <nrotem at apple.com> wrote:
>> I agree. The vectorizer is a *lowering* pass and, much like LSR, it
>> loses information.  A few months ago some of us talked about this and came
>> up with a general draft for the ideal pass ordering.
>
> Where? On the mailing list?

These discussions had more to do with formalizing the use of target information
within IR passes (legality, instruction-level cost). There were some list
threads and offline discussion. I set this aside because I wasn’t sure how it
was going to fit with some of the other work in progress, particularly LTO.

I don’t think there’s any controversy over the high-level goals. But there will
be controversy when we start proposing concrete pass ordering changes.

When I return to work mid-July, I’d be happy to send out some proposed changes
for discussion. The first step will be an improved interface for IR-level cost
metrics, which we already agreed to some time ago.

>> If I remember correctly the plan was that the second half of the pipe
>> should start with GVN (which currently runs after the loop passes). After
>> that come the loop passes, the Vectorizers (loop vectorization first), and
>> finally LSR, Lower-switch, CGP, etc.  I think that when we discussed this
>> most people argued that the inliner should be before GVN and the loop
>> passes. It would be interesting to see the performance numbers for the new
>> pass order.
>
> This doesn't make a lot of sense to me yet.
>
> The inliner, GVN, and the loop passes run together, *iteratively*. They are
> neither before nor after one another. And this is important as it allows
> iterative simplification in the inliner. It is one of the most critical
> optimizations for C++ code that LLVM does.
>
> We can't sink all of the loop passes out of the iterative pass model
> either, because deleting loops, simplifying them, etc. all directly feed the
> iterative simplification needed by GVN and the inliner.
>
> We need a *second* loop pass that happens after the iterative CGSCC walk
> which does the further optimizations such as (potentially) indvars, the
> vectorizers, LSR, lower-switch, CGP, CG. I think we actually want most of the
> post CGSCC module passes to run after the vectorizers and before LSR to fold
> away constants and globals that look different after vectorization compared to
> before, but aren't significantly shifted by LSR and CGP.

I don't want to start a centi-thread yet, but here's a very rough idea
(leaving many things out):

Canonicalize {
  Func {
    SimpCFG
    SROA-1
    EarlyCSE
  }
  CGSCC {
    Inline
    EarlyCSE
    SimpCFG
    InstCombine
    Early Loop Opts {
      LoopSimplify
      Rotate
      Obvious-Full-Unroll
    }
    SROA-2
    InstCombine
    GVN
    Reassociate
    Late Loop Opts {
      LICM
      Unswitch
    }
    SCCP
    InstCombine
    JT
    CVP
    DCE
  }
}
Lower {
  Target Loop Opts {
    IndvarSimplify
    Vectorize/Unroll
    LSR
  }
  SLP Vectorize
}

We might need to pull some things like exit value replacement out of
IndvarSimplify into target-independent loop opts.

-Andy
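
For anyone who wants to experiment with orderings like the one sketched above, the nesting can be written down as plain data and flattened into a single execution order. This is a minimal sketch, not LLVM's pass manager API: the pass names are copied from the outline as-is, while `PIPELINE`, `flatten`, `phase_of` and `ORDER` are hypothetical helpers introduced here for illustration.

```python
# Plain-Python model of the rough pipeline outline; not LLVM's PassManager API.
# A (name, [children]) tuple is a phase or sub-pipeline; a string is one pass.
PIPELINE = [
    ("Canonicalize", [
        ("Func", ["SimpCFG", "SROA-1", "EarlyCSE"]),
        ("CGSCC", [
            "Inline", "EarlyCSE", "SimpCFG", "InstCombine",
            ("Early Loop Opts", ["LoopSimplify", "Rotate", "Obvious-Full-Unroll"]),
            "SROA-2", "InstCombine", "GVN", "Reassociate",
            ("Late Loop Opts", ["LICM", "Unswitch"]),
            "SCCP", "InstCombine", "JT", "CVP", "DCE",
        ]),
    ]),
    ("Lower", [
        ("Target Loop Opts", ["IndvarSimplify", "Vectorize/Unroll", "LSR"]),
        "SLP Vectorize",
    ]),
]


def flatten(nodes, path=()):
    """Yield (phase_path, pass_name) pairs in the order the outline lists them."""
    for node in nodes:
        if isinstance(node, tuple):
            name, children = node
            yield from flatten(children, path + (name,))
        else:
            yield path, node


def phase_of(pass_name):
    """Top-level phase containing the first occurrence of pass_name."""
    for path, name in flatten(PIPELINE):
        if name == pass_name:
            return path[0]
    raise KeyError(pass_name)


# Flat list of pass names, repeated passes (e.g. InstCombine) included.
ORDER = [name for _, name in flatten(PIPELINE)]

if __name__ == "__main__":
    for path, name in flatten(PIPELINE):
        print(" > ".join(path), ":", name)
```

With the outline in this form it is easy to check ordering properties before touching a real pipeline, for example that `ORDER.index("Inline") < ORDER.index("Vectorize/Unroll")` or that `phase_of("LSR")` is `"Lower"`. The flattening ignores the fact that the CGSCC phase is applied per SCC and may iterate, so it describes listing order only.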
