thr3ads.net - llvm dev - [llvm-dev] [RFC] Register Rematerialization (remat) Extension [Sep 2016]

If this information is useful, please help other people find it:
Share via:

vivek pandya via llvm-dev

2016-Sep-19 18:17 UTC

[llvm-dev] [RFC] Register Rematerialization (remat) Extension

On Mon, Sep 19, 2016 at 6:21 PM, James Molloy <james at jamesmolloy.co.uk>
wrote:
> Hi,
>
> I've been looking at this myself for ARM, and came up with a much
simpler
> solution: lower immediate materializations to a post-RA pseudo and expand
> the chain of materialization instructions after register allocation /
> remat. Remat only sees one instruction with no dependencies.
>
> Did you look down this route and discount it?
>No actually I am not much familiar with this topic so I mostly reply on
research papers available.
But your idea seems to be simple and good solution but I am not sure if
this can cover every possible cases.

Do you have a patch for this? I can work on this to make it work for other
architectures for which this will be beneficial.

Sincerely,
Vivek
>
> Cheers,
>
> James
>
> On Wed, 14 Sep 2016 at 02:43 Gerolf Hoflehner via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> On Sep 12, 2016, at 10:14 AM, Andrew Trick via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>
>> On Sep 12, 2016, at 8:51 AM, vivek pandya via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>
>> 1 ) As LLVM MI is already in SSA form before reg allocation so for LLVM
I
>> think it does not require to build SSA graph and converting it back
after
>> optimization completed as mentioned in [1]
>>
>> 2 ) We would like to add a pass similar to SCCP.cpp (Sparse Conditional
>> Constant
>> Propagation based on Wegman and Zadeck's work
http://dl.acm.org/citation.
>> cfm?id=103136) as desribed in [1]. This pass will be scheduled to run
>> before register allocation.
>>
>> 3 ) Output of the pass added in Step 2 will be a Map of def to
>> instructions pointers (instructions which can be used to remat the
given
>> live range). The map will contain live ranges which is due to single
>> instruction and multiple instructions.
>>
>>
>> LiveIntervals maintains a quasi-SSA form via VNInfo. It does not allow
>> efficient def-use queries, but use-def is there, which is all that you
>> should need.
>>
>>
>> I also only see a narrow and specific remat cost problem in the
example:
>> is it cheaper is issue a  chain of instructions rather than a fill? And
for
>> this a use-def is sufficient.
>>
>>
>> It would be great to have better remat during regalloc, but please try
to
>> avoid building additional state that needs to be maintained.
>>
>>
>> You proposed a fairly complex scheme. The question then is always is it
>> worth it? To answer that question you would need to investigate and
break
>> down the current remat problems (spills but should remat, remat but
should
>> spill, should remat at a different location, etc) eg. for the llvm test
>> suite, and show that your algorithms solves the most important ones.
>>
>>
>>
>>
>> -Andy
>>
>>
>> 4 ) The remat APIs defined in LiveRangeEdit.cpp will use analysis from
>> the Map
>> when a spill is required for RA.
>>
>> 5 ) The remat transformation APIs like rematerializeAt() will be
teached
>> to remat
>> live ranges with multiple instructions too.
>>
>> 6 ) A cost analysis will be require to decide between remat and spill.
>> This should be based on at least two factors register pressure and
spill
>> cost
>>
>> Few points:
>> --------------
>> * The analysis pass to be addes as per (2) will use target specific
>> information
>> from TargetInstrInfo.cpp as the current remat infrastructure uses.
>>
>> * This approach will not be on demand as the current approach is (i.e
>> remat specific
>> code will be executed only if there is a spill) so the pass in (2) can
be
>> an
>> overhead so we may want it to enable only for higher level of
>> optimization.
>>
>> * Will it be possible to use existing SCCP.cpp code with few
modification
>> to lattice
>> and related mathematical operation so that it can serve both purpose?
>>
>> * No changes in current register allocators or spill framework will be
>> required
>> because remat entry point will be LiveRangeEdit.
>>
>> Any other way with less overhead is always welcomed.
>> Please help us developing a plan to implement this.
>>
>> Hoping for comments!
>>
>> Sincerely,
>> Vivek
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160919/dfa6894a/attachment.html>

Quentin Colombet via llvm-dev

2016-Sep-19 18:27 UTC

head link

[llvm-dev] [RFC] Register Rematerialization (remat) Extension

Hi Vivek,
> On Sep 19, 2016, at 11:17 AM, vivek pandya via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> 
> 
> On Mon, Sep 19, 2016 at 6:21 PM, James Molloy <james at
jamesmolloy.co.uk <mailto:james at jamesmolloy.co.uk>> wrote:
> Hi,
> 
> I've been looking at this myself for ARM, and came up with a much
simpler solution: lower immediate materializations to a post-RA pseudo and
expand the chain of materialization instructions after register allocation /
remat. Remat only sees one instruction with no dependencies.
> 
> Did you look down this route and discount it?
> No actually I am not much familiar with this topic so I mostly reply on
research papers available.
> But your idea seems to be simple and good solution but I am not sure if
this can cover every possible cases.
This is the way all targets deal with simple rematerialization involving several
instructions in LLVM AFAIK.

Basically, the target defines a pseudo instruction that encodes this sequence of
instructions and expands it after register allocation. This is a case by case
thing, there is no patch that can be generalized for other target.
For instance, look at the expansion of AArch64::MOVi64imm.

The bottom line is, our rematerialization scheme is currently limited, but I am
not sure your proposal get us beyond what we already support.

Cheers,
Q.
> 
> Do you have a patch for this? I can work on this to make it work for other
architectures for which this will be beneficial.
> 
> Sincerely,
> Vivek
> 
> Cheers,
> 
> James
> 
> On Wed, 14 Sep 2016 at 02:43 Gerolf Hoflehner via llvm-dev <llvm-dev at
lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>> On Sep 12, 2016, at 10:14 AM, Andrew Trick via llvm-dev <llvm-dev at
lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>> 
>> 
>>> On Sep 12, 2016, at 8:51 AM, vivek pandya via llvm-dev <llvm-dev
at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>>> 
>>> 
>>> 1 ) As LLVM MI is already in SSA form before reg allocation so for
LLVM I think it does not require to build SSA graph and converting it back after
optimization completed as mentioned in [1]
>>> 
>>> 2 ) We would like to add a pass similar to SCCP.cpp (Sparse
Conditional Constant
>>> Propagation based on Wegman and Zadeck's work
http://dl.acm.org/citation.cfm?id=103136
<http://dl.acm.org/citation.cfm?id=103136>) as desribed in [1]. This pass
will be scheduled to run before register allocation.
>>> 
>>> 3 ) Output of the pass added in Step 2 will be a Map of def to
instructions pointers (instructions which can be used to remat the given live
range). The map will contain live ranges which is due to single instruction and
multiple instructions.
>> 
>> LiveIntervals maintains a quasi-SSA form via VNInfo. It does not allow
efficient def-use queries, but use-def is there, which is all that you should
need.
> 
> I also only see a narrow and specific remat cost problem in the example: is
it cheaper is issue a  chain of instructions rather than a fill? And for this a
use-def is sufficient.
> 
>> 
>> It would be great to have better remat during regalloc, but please try
to avoid building additional state that needs to be maintained.
> 
> You proposed a fairly complex scheme. The question then is always is it
worth it? To answer that question you would need to investigate and break down
the current remat problems (spills but should remat, remat but should spill,
should remat at a different location, etc) eg. for the llvm test suite, and show
that your algorithms solves the most important ones.
> 
> 
> 
>> 
>> -Andy
>> 
>>> 
>>> 4 ) The remat APIs defined in LiveRangeEdit.cpp will use analysis
from the Map
>>> when a spill is required for RA.
>>> 
>>> 5 ) The remat transformation APIs like rematerializeAt() will be
teached to remat
>>> live ranges with multiple instructions too.
>>> 
>>> 6 ) A cost analysis will be require to decide between remat and
spill. This should be based on at least two factors register pressure and spill
cost
>>> 
>>> Few points:
>>> --------------
>>> * The analysis pass to be addes as per (2) will use target specific
information
>>> from TargetInstrInfo.cpp as the current remat infrastructure uses.
>>> 
>>> * This approach will not be on demand as the current approach is
(i.e remat specific
>>> code will be executed only if there is a spill) so the pass in (2)
can be an
>>> overhead so we may want it to enable only for higher level of
optimization.
>>> 
>>> * Will it be possible to use existing SCCP.cpp code with few
modification to lattice
>>> and related mathematical operation so that it can serve both
purpose?
>>> 
>>> * No changes in current register allocators or spill framework will
be required
>>> because remat entry point will be LiveRangeEdit.
>>> 
>>> Any other way with less overhead is always welcomed.
>>> Please help us developing a plan to implement this.
>>> 
>>> Hoping for comments!
>>> 
>>> Sincerely,
>>> Vivek
>>> 
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>> 
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160919/6e01b1fc/attachment-0001.html>

Hal Finkel via llvm-dev

2016-Sep-26 17:12 UTC

head link

[llvm-dev] [RFC] Register Rematerialization (remat) Extension

----- Original Message -----
> From: "Quentin Colombet via llvm-dev" <llvm-dev at
lists.llvm.org>
> To: "vivek pandya" <vivekvpandya at gmail.com>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>, "Nirav
Rana"
> <h2015087 at pilani.bits-pilani.ac.in>, "Matthias Braun"
> <matze at braunis.de>
> Sent: Monday, September 19, 2016 1:27:10 PM
> Subject: Re: [llvm-dev] [RFC] Register Rematerialization (remat)
> Extension
> Hi Vivek,
> > On Sep 19, 2016, at 11:17 AM, vivek pandya via llvm-dev <
> > llvm-dev at lists.llvm.org > wrote:
> 
> > On Mon, Sep 19, 2016 at 6:21 PM, James Molloy <
> > james at jamesmolloy.co.uk > wrote:
> 
> > > Hi,
> > 
> 
> > > I've been looking at this myself for ARM, and came up with a
much
> > > simpler solution: lower immediate materializations to a post-RA
> > > pseudo and expand the chain of materialization instructions after
> > > register allocation / remat. Remat only sees one instruction with
> > > no
> > > dependencies.
> > 
> 
> > > Did you look down this route and discount it?
> > 
> 
> > No actually I am not much familiar with this topic so I mostly
> > reply
> > on research papers available.
> 
> > But your idea seems to be simple and good solution but I am not
> > sure
> > if this can cover every possible cases.
> 
> This is the way all targets deal with simple rematerialization
> involving several instructions in LLVM AFAIK.
> Basically, the target defines a pseudo instruction that encodes this
> sequence of instructions and expands it after register allocation.
> This is a case by case thing, there is no patch that can be
> generalized for other target.
> For instance, look at the expansion of AArch64:: MOVi64imm.
> The bottom line is, our rematerialization scheme is currently
> limited, but I am not sure your proposal get us beyond what we
> already support.I might have misunderstood the proposal, but why do you say that? The problem is
not limited to constants (as perhaps evidenced by Ivan Baev's talk at the
2014 dev meeting). One basic thing we should get, for example, is:

q = ...; 
r = ...; 
for (...) { 
// complicated stuff 
foo(q, r, q - r); // should prefer to remat (q-r) here instead of spilling. 
} 

Also, this is all perhaps related to
https://llvm.org/bugs/show_bug.cgi?id=25373#c9

Thanks again, 
Hal 
> Cheers,
> Q.
> > Do you have a patch for this? I can work on this to make it work
> > for
> > other architectures for which this will be beneficial.
> 
> > Sincerely,
> 
> > Vivek
> 
> > > Cheers,
> > 
> 
> > > James
> > 
> 
> > > On Wed, 14 Sep 2016 at 02:43 Gerolf Hoflehner via llvm-dev <
> > > llvm-dev at lists.llvm.org > wrote:
> > 
> 
> > > > > On Sep 12, 2016, at 10:14 AM, Andrew Trick via llvm-dev
<
> > > > > llvm-dev at lists.llvm.org > wrote:
> > > > 
> > > 
> > 
> 
> > > > > > On Sep 12, 2016, at 8:51 AM, vivek pandya via
llvm-dev <
> > > > > > llvm-dev at lists.llvm.org > wrote:
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > 1 ) As LLVM MI is already in SSA form before reg
allocation
> > > > > > so
> > > > > > for
> > > > > > LLVM I think it does not require to build SSA
graph and
> > > > > > converting
> > > > > > it back after optimization completed as mentioned
in [1]
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > 2 ) We would like to add a pass similar to
SCCP.cpp (Sparse
> > > > > > Conditional Constant
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > Propagation based on Wegman and Zadeck's work
> > > > > > http://dl.acm.org/citation.cfm?id=103136 ) as
desribed in
> > > > > > [1].
> > > > > > This
> > > > > > pass will be scheduled to run before register
allocation.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > 3 ) Output of the pass added in Step 2 will be a
Map of def
> > > > > > to
> > > > > > instructions pointers (instructions which can be
used to
> > > > > > remat
> > > > > > the
> > > > > > given live range). The map will contain live
ranges which
> > > > > > is
> > > > > > due
> > > > > > to
> > > > > > single instruction and multiple instructions.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > LiveIntervals maintains a quasi-SSA form via VNInfo. It
does
> > > > > not
> > > > > allow efficient def-use queries, but use-def is there,
which
> > > > > is
> > > > > all
> > > > > that you should need.
> > > > 
> > > 
> > 
> 
> > > > I also only see a narrow and specific remat cost problem in
the
> > > > example: is it cheaper is issue a chain of instructions
rather
> > > > than
> > > > a fill? And for this a use-def is sufficient.
> > > 
> > 
> 
> > > > > It would be great to have better remat during regalloc,
but
> > > > > please
> > > > > try to avoid building additional state that needs to be
> > > > > maintained.
> > > > 
> > > 
> > 
> 
> > > > You proposed a fairly complex scheme. The question then is
> > > > always
> > > > is
> > > > it worth it? To answer that question you would need to
> > > > investigate
> > > > and break down the current remat problems (spills but should
> > > > remat,
> > > > remat but should spill, should remat at a different
location,
> > > > etc)
> > > > eg. for the llvm test suite, and show that your algorithms
> > > > solves
> > > > the most important ones.
> > > 
> > 
> 
> > > > > -Andy
> > > > 
> > > 
> > 
> 
> > > > > > 4 ) The remat APIs defined in LiveRangeEdit.cpp
will use
> > > > > > analysis
> > > > > > from the Map
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > when a spill is required for RA.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > 5 ) The remat transformation APIs like
rematerializeAt()
> > > > > > will
> > > > > > be
> > > > > > teached to remat
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > live ranges with multiple instructions too.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > 6 ) A cost analysis will be require to decide
between remat
> > > > > > and
> > > > > > spill. This should be based on at least two
factors
> > > > > > register
> > > > > > pressure and spill cost
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > Few points:
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > --------------
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > * The analysis pass to be addes as per (2) will
use target
> > > > > > specific
> > > > > > information
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > from TargetInstrInfo.cpp as the current remat
> > > > > > infrastructure
> > > > > > uses.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > * This approach will not be on demand as the
current
> > > > > > approach
> > > > > > is
> > > > > > (i.e
> > > > > > remat specific
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > code will be executed only if there is a spill) so
the pass
> > > > > > in
> > > > > > (2)
> > > > > > can be an
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > overhead so we may want it to enable only for
higher level
> > > > > > of
> > > > > > optimization.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > * Will it be possible to use existing SCCP.cpp
code with
> > > > > > few
> > > > > > modification to lattice
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > and related mathematical operation so that it can
serve
> > > > > > both
> > > > > > purpose?
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > * No changes in current register allocators or
spill
> > > > > > framework
> > > > > > will
> > > > > > be required
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > because remat entry point will be LiveRangeEdit.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > Any other way with less overhead is always
welcomed.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > Please help us developing a plan to implement
this.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > Hoping for comments!
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > Sincerely,
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > Vivek
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > _______________________________________________
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > LLVM Developers mailing list
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > llvm-dev at lists.llvm.org
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > >
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > _______________________________________________
> > > > 
> > > 
> > 
> 
> > > > > LLVM Developers mailing list
> > > > 
> > > 
> > 
> 
> > > > > llvm-dev at lists.llvm.org
> > > > 
> > > 
> > 
> 
> > > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> > > > 
> > > 
> > 
> 
> > > > _______________________________________________
> > > 
> > 
> 
> > > > LLVM Developers mailing list
> > > 
> > 
> 
> > > > llvm-dev at lists.llvm.org
> > > 
> > 
> 
> > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> > > 
> > 
> 
> > _______________________________________________
> 
> > LLVM Developers mailing list
> 
> > llvm-dev at lists.llvm.org
> 
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-- 

Hal Finkel 
Lead, Compiler Technology and Programming Languages 
Leadership Computing Facility 
Argonne National Laboratory 
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160926/459a2828/attachment.html>

llvm dev - Sep 2016 - [RFC] Register Rematerialization (remat) Extension

[llvm-dev] [RFC] Register Rematerialization (remat) Extension

[llvm-dev] [RFC] Register Rematerialization (remat) Extension

[llvm-dev] [RFC] Register Rematerialization (remat) Extension