thr3ads.net - llvm dev - [LLVMdev] Removing the separation between opt and codegen? [Jun 2012]


Hal Finkel

2012-Jun-29 03:54 UTC

[LLVMdev] Removing the separation between opt and codegen?

Hello,

One important next step in turning LLVM into a first-class
autovectorizing compiler will be to incorporate target information into
the vectorization logic. To really make good decisions regarding what
is profitable to vectorize, and how that vectorization should be done,
it will be important for the vectorization pass(es) to understand the
underlying target capabilities. The same will hold true for various
kinds of loop iteration-space transformations.

As I recall, Chris suggested to me some months ago the following
work-around: allow optimization passes to access target lowering info
only when it is available. Specifically this means that only for
frontends (like clang) that link in both the optimization passes and
codegen, we would provide some mechanism for providing a TLI instance
to the optimization passes. While I think this could certainly be made
to work, it seems suboptimal. It would mean that 'opt' could no longer
perform the same level of optimization as 'clang' with equivalent
inputs. That being the case, I think that over time 'opt' would simply
fall out of use. My general question is this: What do we gain by
keeping a strict separation between the
(mostly-target-independent) optimization layer and the codegen layer?
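
The "only when it is available" work-around can be sketched roughly as follows. This is a minimal illustration, not LLVM's actual interface: `TargetInfoSketch` and `chooseVectorFactor` are hypothetical names, and the register widths are assumed values. A null pointer stands for a driver that did not link in codegen.

```cpp
#include <cassert>

// Hypothetical stand-in for the target lowering information (TLI) discussed
// above; this is NOT LLVM's TargetLowering interface.
struct TargetInfoSketch {
  unsigned VectorRegisterBits; // assumed width, e.g. 128 for SSE-class registers
};

// Choose how many elements of ElementBits bits to process per vector.
// TI may be null, which models a tool that has no codegen linked in.
unsigned chooseVectorFactor(const TargetInfoSketch *TI, unsigned ElementBits) {
  assert(ElementBits > 0 && "element width must be non-zero");
  if (!TI)
    return 1; // no target information: stay scalar
  unsigned Lanes = TI->VectorRegisterBits / ElementBits;
  return Lanes > 1 ? Lanes : 1;
}
```

Calling `chooseVectorFactor(nullptr, 32)` yields 1, while passing a 128-bit target description yields 4. The same input therefore optimizes differently depending on which tool ran the pass, which is the divergence between 'opt' and 'clang' described above.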

To partially answer my own question, I can think of one advantage: It
keeps us from being lazy. Specifically, it forces us to keep a single
canonical expression form that is handed to the backends. This eases the
maintenance burden by forcing a certain amount of generality into the
whole system and by limiting target-specific variants of the
canonical expression forms. This makes it harder to break things in odd
ways with seemingly-innocuous changes.

I fear, however, that this leads to a system which is generally
good, but not great on any particular target. Furthermore, it is
sometimes very difficult or impossible for the backends to undo bad
decisions made by the target-independent optimization layer. I think
it is time to reconsider this separation and make optimization a truly
target-dependent process where needed. Obviously, we should not make
target-dependent decisions where they're not necessary, and we should
introduce appropriate abstraction layers to characterize target
differences. Nevertheless, the most efficient and maintainable way to
provide target information to the optimization passes will be to
provide that information directly from the backend code (and
associated tablegen files).

I would like to hear other opinions on this.

Thanks again,
Hal

-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory

Das, Dibyendu

2012-Jun-29 05:14 UTC


Hal-

I generally agree with what you are saying here. Based on my recent experience
with working on a partial-simdizer (not llvm) I found that even to decide which
instructions to group for good simdization requires some knowledge of the
underlying target.

Let's take an instruction like haddps, which adds the components of a vector
register in a certain way. Whether such an instruction is supported by the
target does impact your simdization choice. Furthermore, the cost of haddps may
also decide how and where to simdize. Hence the simdization-choice phase, which
should (theoretically) be fairly target-independent, needs to have some
knowledge of the target. Whether this can be abstracted away in some form can
be discussed.
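
One way such an abstraction might look is a small cost query that the simdization phase consults. The sketch below is only an illustration: `SimdTargetSketch`, `ReductionPlan`, `planFourLaneSum`, and all cost numbers are hypothetical and not measured on any hardware. It picks a strategy for summing the four lanes of a vector, which takes two haddps-style horizontal adds, two shuffle-plus-add steps, or three scalar adds.

```cpp
#include <cassert>

// Hypothetical target description; the cost fields are abstract units.
struct SimdTargetSketch {
  bool HasHorizontalAdd;      // e.g. an haddps-style instruction exists
  unsigned HorizontalAddCost; // cost of one horizontal add
  unsigned ShuffleCost;       // cost of one lane shuffle
  unsigned VectorAddCost;     // cost of one vertical vector add
  unsigned ScalarAddCost;     // cost of one scalar add
};

enum ReductionPlan { ScalarAdds, ShuffleAndAdd, HorizontalAdd };

// Pick the cheapest way to sum the four lanes of one vector.
ReductionPlan planFourLaneSum(const SimdTargetSketch &T) {
  assert(T.ScalarAddCost > 0 && "scalar add cost must be non-zero");
  // Scalar baseline: three adds (lane extraction ignored for brevity).
  unsigned Best = 3 * T.ScalarAddCost;
  ReductionPlan Plan = ScalarAdds;
  // Two shuffle+add steps, each halving the number of live lanes.
  unsigned ShuffleTotal = 2 * (T.ShuffleCost + T.VectorAddCost);
  if (ShuffleTotal < Best) {
    Best = ShuffleTotal;
    Plan = ShuffleAndAdd;
  }
  // Two horizontal adds, considered only if the target supports them.
  if (T.HasHorizontalAdd && 2 * T.HorizontalAddCost < Best) {
    Best = 2 * T.HorizontalAddCost;
    Plan = HorizontalAdd;
  }
  return Plan;
}
```

Both the availability flag and the cost field change the answer, so the grouping decision cannot be made without some view of the target, even if that view is as narrow as this struct.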

-Dibyendu


Rotem, Nadav

2012-Jun-29 07:05 UTC


>> seems suboptimal. It would mean that 'opt' could no longer perform the
>> same level of optimization as 'clang' with equivalent inputs. That being
>> the case, I think that over time 'opt' would simply fall out of use. My
>> general question is this: What do we gain by keeping a strict separation
>> between the (mostly-target-independent) optimization layer and the
>> codegen layer?

I was under the impression that opt would also be able to benefit from the
added capabilities. After all, opt is just a driver, and can be taught to
understand the 'march' and 'mcpu' flags.

I agree that it is important to allow opt to access the TLI, for two reasons:
1. We use opt to test our code.
2. Vectorizers may want to serve domain-specific languages, which may not
   necessarily use clang.

>> I fear, however, that this leads to a system which is generally good,
>> but not great on any particular target. Furthermore, it is sometimes
>> very difficult or impossible for the backends to undo bad decisions
>> made by the target-independent optimization layer.

Yes, but this is a general compiler problem. Early optimizations have no
knowledge of how they affect later stages. For example, we don't consider
register pressure when we inline a function. The problem is even more severe
with vectorizing compilers. One problem that I mentioned in the past is that on
64-bit systems, 32-bit scalars are promoted to 64-bit numbers. Later on, the
vectorizer attempts to vectorize such a value, but it is much more difficult
to work with vectors of i64s. For example, array indices which were i32 values
are now vectors of i64s, which can't be used for scatter/gather operations
(which use i32 indices).
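
Part of the cost of that promotion is simple lane arithmetic, sketched below. The helper names are hypothetical and the 256-bit register width is an assumption chosen for illustration. The sketch covers only the lane-count effect, not the i32 index requirement of scatter/gather operations.

```cpp
#include <cassert>

// Number of elements of ElementBits bits that fit in one vector register.
unsigned lanesPerRegister(unsigned RegisterBits, unsigned ElementBits) {
  assert(ElementBits > 0 && RegisterBits % ElementBits == 0 &&
         "element width must evenly divide the register width");
  return RegisterBits / ElementBits;
}

// Number of vector registers needed to hold NumIndices index values.
unsigned registersNeeded(unsigned NumIndices, unsigned RegisterBits,
                         unsigned ElementBits) {
  unsigned Lanes = lanesPerRegister(RegisterBits, ElementBits);
  return (NumIndices + Lanes - 1) / Lanes; // round up
}
```

With an assumed 256-bit register, eight i32 indices fit in one register (`registersNeeded(8, 256, 32)` is 1), while the same eight indices promoted to i64 need two (`registersNeeded(8, 256, 64)` is 2).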
>> Nevertheless, the most efficient and maintainable way to provide target
>> information to the optimization passes will be to provide that
>> information directly from the backend code (and associated tablegen
>> files).
>>
>> I would like to hear other opinions on this.
