thr3ads.net - llvm dev - [llvm-dev] A use of RDF to extend register Remat [Oct 2016]

If this information is useful, please help other people find it:
Share via:

vivek pandya via llvm-dev

2016-Oct-18 16:23 UTC

[llvm-dev] A use of RDF to extend register Remat

Dear Community,

I would like to discuss few points to use RDF to extend register remat
scope. Mr. Krzysztof and I have started discussion this on private mail.
But I think now it would be better to include community.
Interested community member kindly previous discussion (at the end of mail)
before starting here.

After analyzing if RDF can be used for solving Remat, we think that problem
with RDF is that since it is post-RA, rematable register values also will
be spilled and all its use will be surrounded with load-spill instructions.
So identifying rematable sequence of instructions will be possible with RDF
but its uses will not be associated with these instructions with any
use-def chain because live-ranges have been split by spill-relaod.

One solution that I can think of is that during RA when a spill code is
inserted , it can add a dummy instruction as a maker that can be used after
post-RA to identify live-range (reg value) which was spilled and which
instructions is using it after reloading (i.e where to use remated value).
The post-RA pass can use RDF to construct remat sequence and use it and
remove spill/restore as well as the marker instruction (dummy) inserted
during the spill code generation. Is this possible with out changing much
of Spill code framework? What all steps are required to add such a dummy
/debug instruction at MIR ?

In case if RDF graph is not useable for this problem then simply use-def
chain on MIR can be traversed in bottom up manner to identify remat
instruction sequence.

Please share your thoughts.
Vivek

On Tue, Oct 11, 2016 at 12:34 AM, Krzysztof Parzyszek <
kparzysz at codeaurora.org> wrote:
> Hi Vivek,
> Yes, RDF would help with that, even if these instructions were not in the
> same block.
>
> Once you have the node corresponding to the use of R3 in the last
> statement, you can get the node corresponding to the reaching def of that
> node. This def node would be a member of the "oris" statement.
Within that
> statement you would then look for use nodes and you'd find the use node
for
> R3: "oris r3, *r3*, 35809". From that node, you'd follow the
reaching def
> and this would give you: "sldi *r3*, r3, 32". Then you'd look
for use nodes
> and you'd get "sldi r3, *r3*, 32". Following this traversal
would
> eventually get you to the first "lis".
>
> Getting from use-node to def-node is a single step.
> Visiting other def/use nodes from the same statement, given a def-node is
> linear in terms of number of def/use nodes in that statement.
>
> If you want to rematerialize r30, then the whole sequence would need to
> use r30 (instead of r3). This is because r3 may not be available at the
> point where r30 is to be rematerialized. Finding out if r3 is live or not
> is also possible, but would require extra analysis.
>
> Getting an instruction, given an RDF node is easy:
> - def/use nodes have a pointer to MachineOperand,
> - statement node has a pointer to MachineInstr,
> - block node has a pointer to MachineBasicBlok.
>
> I don't think there are functions to do the inverse, i.e. given a
> MachineInstr, find the corresponding node. If necessary, such functions
> could be added, but existing RDF applications were simply traversing the
> RDF graph itself (instead of traversing the function and then finding
> corresponding nodes in the graph).
>
> Overall, RDF would definitely provide you with the information you are
> looking for.
>
> You can even try to see what you get without regmasks. Regmasks are used
> by most targets to provide a whole set of registers in a single machine
> operand, and they are mostly used in call instructions, where they indicate
> which registers are clobbered. This is mostly to avoid having tens of
> operands on these instructions. Hexagon doesn't use regmasks (I guess
they
> were not present when the backend was first added, and we never switched
> over). R3 is used by PowerPC as a parameter/return value register, so a
> call should modify it explicitly (and not via regmask). For other
> registers, a regmask would simply be ignored (right now), which could lead
> to wrong results.
>
> -Krzysztof
>
>
>
> On 10/10/2016 1:34 PM, vivek pandya wrote:
>
>> Hello Sir,
>>
>> I have a problem that I think can be solved with RDF. But I am not able
>> to identify runtime complexity for this.
>> Consider following powerpc asm :
>>         lis 3, 12414
>>         ori 3, 3, 27470
>>         sldi 3, 3, 32
>>         oris 3, 3, 35809
>>         ori 30, 3, 20615
>> here what I am looking for is that value in reg 30 is rematerilizable
>> (can be recalculated) because this whole sequence is using constants to
>> calculate the value that is in reg 30.
>> So this is how I think I can use RDFGraph here, for statement
>> ori 30, 3, 20615
>> reaching defs for reg 3 can be traced back on RDFGraph if it found to
be
>> (rematerilizable) then continue going backward on RDFGraph untill a
live
>> range of reg 3 does not end (in this example it will be lis 3, 12414)
>> This is very primitive example and also this may require RDFLiveness
>> analysis.
>>
>> But do you think this problem is feasible to solve with RDF if yes than
>> can you help me to build a concrete idea? (we may switch to llvm-dev if
>> this seems a possible path).
>> Ultimately this can be used to avoid spill by recalculating the value
>> for a register. This also means that it may require to access RDFGraph
>> based on the register i.e if RAX is required to be spilled I would like
>> to check if RAX at statement X is having rematerilizable value or not.
>> Also this may require to access instruction for a given RDFNode.
>> Also will it be a liner time algorithm in terms of number of nodes in
>> RDFGraph?
>>
>> Please share your thoughts.
>>
>> Sincerely,
>> Vivek
>>
>
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
> by The Linux Foundation
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20161018/d49df6bb/attachment-0001.html>

Quentin Colombet via llvm-dev

2016-Oct-21 20:38 UTC

head link

[llvm-dev] A use of RDF to extend register Remat

Hi Vivek,

What is the scope of your extended rematerialization support?

Indeed, rematerialization is usually a way to avoid spilling, but proposed
solution sounds more like an optimization to get rid of some load/store pairs.
It seems to me clumsy to extend rematerialization after RA, because that means
we have to check and recompute extra things to be able to do it. For instance,
we have to rebuild the live-ranges, check for redefinition etc.

In my opinion, if you want to stay away of the register allocator, it would make
more sense to explore a rematerialization algorithm working on SSA that uses
register pressure to take rematerialization decisions.

Anyhow, whatever is the direction we want to pursue, I believe it is best to
start with motivating examples where the current framework is not sufficient and
see what would work best.

Cheers,
-Quentin > On Oct 18, 2016, at 9:23 AM, vivek pandya via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> Dear Community,
> 
> I would like to discuss few points to use RDF to extend register remat
scope. Mr. Krzysztof and I have started discussion this on private mail.  But I
think now it would be better to include community.
> Interested community member kindly previous discussion (at the end of mail)
before starting here.
> 
> After analyzing if RDF can be used for solving Remat, we think that problem
with RDF is that since it is post-RA, rematable register values also will be
spilled and all its use will be surrounded with load-spill instructions. So
identifying rematable sequence of instructions will be possible with RDF but its
uses will not be associated with these instructions with any use-def chain
because live-ranges have been split by spill-relaod.
> 
> One solution that I can think of is that during RA when a spill code is
inserted , it can add a dummy instruction as a maker that can be used after
post-RA to identify live-range (reg value) which was spilled and which
instructions is using it after reloading (i.e where to use remated value). The
post-RA pass can use RDF to construct remat sequence and use it and remove
spill/restore as well as the marker instruction (dummy) inserted during the
spill code generation. Is this possible with out changing much of Spill code
framework? What all steps are required to add such a dummy /debug instruction at
MIR ?
> 
> In case if RDF graph is not useable for this problem then simply use-def
chain on MIR can be traversed in bottom up manner to identify remat instruction
sequence.
> 
> Please share your thoughts.
> Vivek
> 
> On Tue, Oct 11, 2016 at 12:34 AM, Krzysztof Parzyszek <kparzysz at
codeaurora.org <mailto:kparzysz at codeaurora.org>> wrote:
> Hi Vivek,
> Yes, RDF would help with that, even if these instructions were not in the
same block.
> 
> Once you have the node corresponding to the use of R3 in the last
statement, you can get the node corresponding to the reaching def of that node.
This def node would be a member of the "oris" statement. Within that
statement you would then look for use nodes and you'd find the use node for
R3: "oris r3, *r3*, 35809". From that node, you'd follow the
reaching def and this would give you: "sldi *r3*, r3, 32". Then
you'd look for use nodes and you'd get "sldi r3, *r3*, 32".
Following this traversal would eventually get you to the first "lis".
> 
> Getting from use-node to def-node is a single step.
> Visiting other def/use nodes from the same statement, given a def-node is
linear in terms of number of def/use nodes in that statement.
> 
> If you want to rematerialize r30, then the whole sequence would need to use
r30 (instead of r3). This is because r3 may not be available at the point where
r30 is to be rematerialized. Finding out if r3 is live or not is also possible,
but would require extra analysis.
> 
> Getting an instruction, given an RDF node is easy:
> - def/use nodes have a pointer to MachineOperand,
> - statement node has a pointer to MachineInstr,
> - block node has a pointer to MachineBasicBlok.
> 
> I don't think there are functions to do the inverse, i.e. given a
MachineInstr, find the corresponding node. If necessary, such functions could be
added, but existing RDF applications were simply traversing the RDF graph itself
(instead of traversing the function and then finding corresponding nodes in the
graph).
> 
> Overall, RDF would definitely provide you with the information you are
looking for.
> 
> You can even try to see what you get without regmasks. Regmasks are used by
most targets to provide a whole set of registers in a single machine operand,
and they are mostly used in call instructions, where they indicate which
registers are clobbered. This is mostly to avoid having tens of operands on
these instructions. Hexagon doesn't use regmasks (I guess they were not
present when the backend was first added, and we never switched over). R3 is
used by PowerPC as a parameter/return value register, so a call should modify it
explicitly (and not via regmask). For other registers, a regmask would simply be
ignored (right now), which could lead to wrong results.
> 
> -Krzysztof
> 
> 
> 
> On 10/10/2016 1:34 PM, vivek pandya wrote:
> Hello Sir,
> 
> I have a problem that I think can be solved with RDF. But I am not able
> to identify runtime complexity for this.
> Consider following powerpc asm :
>         lis 3, 12414
>         ori 3, 3, 27470
>         sldi 3, 3, 32
>         oris 3, 3, 35809
>         ori 30, 3, 20615
> here what I am looking for is that value in reg 30 is rematerilizable
> (can be recalculated) because this whole sequence is using constants to
> calculate the value that is in reg 30.
> So this is how I think I can use RDFGraph here, for statement
> ori 30, 3, 20615
> reaching defs for reg 3 can be traced back on RDFGraph if it found to be
> (rematerilizable) then continue going backward on RDFGraph untill a live
> range of reg 3 does not end (in this example it will be lis 3, 12414)
> This is very primitive example and also this may require RDFLiveness
> analysis.
> 
> But do you think this problem is feasible to solve with RDF if yes than
> can you help me to build a concrete idea? (we may switch to llvm-dev if
> this seems a possible path).
> Ultimately this can be used to avoid spill by recalculating the value
> for a register. This also means that it may require to access RDFGraph
> based on the register i.e if RAX is required to be spilled I would like
> to check if RAX at statement X is having rematerilizable value or not.
> Also this may require to access instruction for a given RDFNode.
> Also will it be a liner time algorithm in terms of number of nodes in
> RDFGraph?
> 
> Please share your thoughts.
> 
> Sincerely,
> Vivek
> 
> -- 
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
by The Linux Foundation
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20161021/710fa696/attachment.html>

vivek pandya via llvm-dev

2016-Oct-22 04:22 UTC

head link

[llvm-dev] A use of RDF to extend register Remat

On Sat, Oct 22, 2016 at 2:08 AM, Quentin Colombet <qcolombet at apple.com>
wrote:
> Hi Vivek,
>
> Hello Quentin,
> What is the scope of your extended rematerialization support?
>
> Sorry for staling http://lists.llvm.org/pipermail/llvm-dev/2016-September/104847.html

but I would like to quote a simple example from previous discussion:
 void foo(long);
 void bar() {
   for (int i = 0; i < 1600; ++i)
     foo(3494348345984503943);
 }

 $ clang -O3 -S -o - ~/tmp/tl.c -target powerpc64
 ...
 # BB#0:                                 # %entry
...
        lis 3, 12414
        ori 3, 3, 27470
        sldi 3, 3, 32
        oris 3, 3, 35809
        ori 30, 3, 20615
...

So my aim is to remat a value which is constructed with multiple
instructions (mostly on an architecture where complex addressing mode is
not supported)

Indeed, rematerialization is usually a way to avoid spilling, but
proposed> solution sounds more like an optimization to get rid of some load/store
> pairs.
> It seems to me clumsy to extend rematerialization after RA, because that
> means we have to check and recompute extra things to be able to do it. For
> instance, we have to rebuild the live-ranges, check for redefinition etc.
>I agree.
>
> In my opinion, if you want to stay away of the register allocator, it
> would make more sense to explore a rematerialization algorithm working on
> SSA that uses register pressure to take rematerialization decisions.
>Do you here mean working on MIR during RA phase itself ? In that case are
you referring to use size of VirtRegMap to get estimation of register
pressure or any other technique?

Also I have studied a patch for GCC for remat problem (how ever this is for
onny SH architecture) https://gcc.gnu.org/ml/gcc-
patches/2003-12/txt00058.txt
in this patch particularly df.c file . In this patch to find out complex
patterns it just travers use-def chains. I think similar thing can be
implemented at MIR level. I also thinks that it would be better to spend
more time to think about cost functions to decide if it will be good to
remat or not. Also try to address related bug https://llvm.org/bugs/show_
bug.cgi?id=25373#c9

Sincerely,
Vivek

Anyhow, whatever is the direction we want to pursue, I believe it is
best> to start with motivating examples where the current framework is not
> sufficient and see what would work best.
>
> Cheers,
> -Quentin
>
> On Oct 18, 2016, at 9:23 AM, vivek pandya via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Dear Community,
>
> I would like to discuss few points to use RDF to extend register remat
> scope. Mr. Krzysztof and I have started discussion this on private mail.
> But I think now it would be better to include community.
> Interested community member kindly previous discussion (at the end of
> mail) before starting here.
>
> After analyzing if RDF can be used for solving Remat, we think that
> problem with RDF is that since it is post-RA, rematable register values
> also will be spilled and all its use will be surrounded with load-spill
> instructions. So identifying rematable sequence of instructions will be
> possible with RDF but its uses will not be associated with these
> instructions with any use-def chain because live-ranges have been split by
> spill-relaod.
>
> One solution that I can think of is that during RA when a spill code is
> inserted , it can add a dummy instruction as a maker that can be used after
> post-RA to identify live-range (reg value) which was spilled and which
> instructions is using it after reloading (i.e where to use remated value).
> The post-RA pass can use RDF to construct remat sequence and use it and
> remove spill/restore as well as the marker instruction (dummy) inserted
> during the spill code generation. Is this possible with out changing much
> of Spill code framework? What all steps are required to add such a dummy
> /debug instruction at MIR ?
>
> In case if RDF graph is not useable for this problem then simply use-def
> chain on MIR can be traversed in bottom up manner to identify remat
> instruction sequence.
>
> Please share your thoughts.
> Vivek
>
> On Tue, Oct 11, 2016 at 12:34 AM, Krzysztof Parzyszek <
> kparzysz at codeaurora.org> wrote:
>
>> Hi Vivek,
>> Yes, RDF would help with that, even if these instructions were not in
the
>> same block.
>>
>> Once you have the node corresponding to the use of R3 in the last
>> statement, you can get the node corresponding to the reaching def of
that
>> node. This def node would be a member of the "oris"
statement. Within that
>> statement you would then look for use nodes and you'd find the use
node for
>> R3: "oris r3, *r3*, 35809". From that node, you'd follow
the reaching def
>> and this would give you: "sldi *r3*, r3, 32". Then you'd
look for use nodes
>> and you'd get "sldi r3, *r3*, 32". Following this
traversal would
>> eventually get you to the first "lis".
>>
>> Getting from use-node to def-node is a single step.
>> Visiting other def/use nodes from the same statement, given a def-node
is
>> linear in terms of number of def/use nodes in that statement.
>>
>> If you want to rematerialize r30, then the whole sequence would need to
>> use r30 (instead of r3). This is because r3 may not be available at the
>> point where r30 is to be rematerialized. Finding out if r3 is live or
not
>> is also possible, but would require extra analysis.
>>
>> Getting an instruction, given an RDF node is easy:
>> - def/use nodes have a pointer to MachineOperand,
>> - statement node has a pointer to MachineInstr,
>> - block node has a pointer to MachineBasicBlok.
>>
>> I don't think there are functions to do the inverse, i.e. given a
>> MachineInstr, find the corresponding node. If necessary, such functions
>> could be added, but existing RDF applications were simply traversing
the
>> RDF graph itself (instead of traversing the function and then finding
>> corresponding nodes in the graph).
>>
>> Overall, RDF would definitely provide you with the information you are
>> looking for.
>>
>> You can even try to see what you get without regmasks. Regmasks are
used
>> by most targets to provide a whole set of registers in a single machine
>> operand, and they are mostly used in call instructions, where they
indicate
>> which registers are clobbered. This is mostly to avoid having tens of
>> operands on these instructions. Hexagon doesn't use regmasks (I
guess they
>> were not present when the backend was first added, and we never
switched
>> over). R3 is used by PowerPC as a parameter/return value register, so a
>> call should modify it explicitly (and not via regmask). For other
>> registers, a regmask would simply be ignored (right now), which could
lead
>> to wrong results.
>>
>> -Krzysztof
>>
>>
>>
>> On 10/10/2016 1:34 PM, vivek pandya wrote:
>>
>>> Hello Sir,
>>>
>>> I have a problem that I think can be solved with RDF. But I am not
able
>>> to identify runtime complexity for this.
>>> Consider following powerpc asm :
>>>         lis 3, 12414
>>>         ori 3, 3, 27470
>>>         sldi 3, 3, 32
>>>         oris 3, 3, 35809
>>>         ori 30, 3, 20615
>>> here what I am looking for is that value in reg 30 is
rematerilizable
>>> (can be recalculated) because this whole sequence is using
constants to
>>> calculate the value that is in reg 30.
>>> So this is how I think I can use RDFGraph here, for statement
>>> ori 30, 3, 20615
>>> reaching defs for reg 3 can be traced back on RDFGraph if it found
to be
>>> (rematerilizable) then continue going backward on RDFGraph untill a
live
>>> range of reg 3 does not end (in this example it will be lis 3,
12414)
>>> This is very primitive example and also this may require
RDFLiveness
>>> analysis.
>>>
>>> But do you think this problem is feasible to solve with RDF if yes
than
>>> can you help me to build a concrete idea? (we may switch to
llvm-dev if
>>> this seems a possible path).
>>> Ultimately this can be used to avoid spill by recalculating the
value
>>> for a register. This also means that it may require to access
RDFGraph
>>> based on the register i.e if RAX is required to be spilled I would
like
>>> to check if RAX at statement X is having rematerilizable value or
not.
>>> Also this may require to access instruction for a given RDFNode.
>>> Also will it be a liner time algorithm in terms of number of nodes
in
>>> RDFGraph?
>>>
>>> Please share your thoughts.
>>>
>>> Sincerely,
>>> Vivek
>>>
>>
>> --
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted
>> by The Linux Foundation
>>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20161022/0c3dacab/attachment-0001.html>

Maybe Matching Threads

Search for more seemingly similar threads

llvm dev - Oct 2016 - A use of RDF to extend register Remat

[llvm-dev] A use of RDF to extend register Remat

[llvm-dev] A use of RDF to extend register Remat

[llvm-dev] A use of RDF to extend register Remat

Maybe Matching Threads