thr3ads.net - llvm dev - [LLVMdev] Predication before CodeGen [Oct 2007]

If this information is useful, please help other people find it:
Share via:

Nikhil A. Patil

2007-Oct-07 07:51 UTC

[LLVMdev] Predication before CodeGen

Hi,

I am  planning to generate code for a peculiar architecture with _no_ branch
instructions (!), but with predicated loads and stores to memory. This means
the architecture is not Turing complete, is going to waste a lot of
computation, and any input program that can hope to get compiled for this
architecture must have loops that can be fully unrolled, and all its
functions must get fully inlined.

Since I have no branches, I think I will need to predicate code before DAG
Selection. I was thinking of doing this:
1) Run an analysis on the LLVM bitcode that calculates the condition under
which an instruction will be "reached" -- this is straightforward
since my
input program is not allowed to have cycles in the CFG.
2) Run a pass that requires the above analysis and uses it to:
   - merge all basic blocks in topological sort order (which exists, because
CFG is acyclic).
   - insert appropriate instructions to generate the predicates.
   - change all PHI-nodes into Select nodes.
   - predicate memory operations (well, at least the stores).

It is this final predication step that I am not sure how to handle. Since
LLVM does not have predicated load/store instructions, will I have to
upgrade the memory operations to a call or something that would then have to
be custom handled by the SelectionDAG mechanism?

Or should I just leave the load/stores alone, and somehow, later after the
machine instructions are created predicate them using this predicate
information? If so, how do I ensure that the dependency on the predicate
generating instructions is preserved (the predicate instructions are not
dce'd away)?

Or should I be trying to do the predication right before Legalize? If so,
I'll need to think about it a little more.

Thank you for your time :)
nikhil
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20071007/9f2a1407/attachment.html>

Nikhil A. Patil

2007-Oct-07 07:57 UTC

head link

[LLVMdev] Predication before CodeGen

On 07/10/2007, Nikhil A. Patil <patil.nikhil at gmail.com>
wrote:> Hi,
>
> I am  planning to generate code for a peculiar architecture with _no_
branch
> instructions (!), but with predicated loads and stores to memory. This
means
> the architecture is not Turing complete, is going to waste a lot of
> computation, and any input program that can hope to get compiled for this
> architecture must have loops that can be fully unrolled, and all its
> functions must get fully inlined.
>
> Since I have no branches, I think I will need to predicate code before DAG
> Selection. I was thinking of doing this:
oops, I meant to say before "Initial SelectionDAG construction"
> 1) Run an analysis on the LLVM bitcode that calculates the condition under
> which an instruction will be "reached" -- this is straightforward
since my
> input program is not allowed to have cycles in the CFG.
> 2) Run a pass that requires the above analysis and uses it to:
>    - merge all basic blocks in topological sort order (which exists,
because
> CFG is acyclic).
>    - insert appropriate instructions to generate the predicates.
>    - change all PHI-nodes into Select nodes.
>    - predicate memory operations (well, at least the stores).
>
> It is this final predication step that I am not sure how to handle. Since
> LLVM does not have predicated load/store instructions, will I have to
> upgrade the memory operations to a call or something that would then have
to
> be custom handled by the SelectionDAG mechanism?
>
> Or should I just leave the load/stores alone, and somehow, later after the
> machine instructions are created predicate them using this predicate
> information? If so, how do I ensure that the dependency on the predicate
> generating instructions is preserved (the predicate instructions are not
> dce'd away)?
>
> Or should I be trying to do the predication right before Legalize? If so,
> I'll need to think about it a little more.
>
> Thank you for your time :)
> nikhil

Evan Cheng

2007-Oct-08 06:17 UTC

head link

[LLVMdev] Predication before CodeGen

On Oct 7, 2007, at 12:51 AM, Nikhil A. Patil wrote:
> Hi,
>
> I am  planning to generate code for a peculiar architecture with  
> _no_ branch instructions (!), but with predicated loads and stores  
> to memory. This means the architecture is not Turing complete, is  
> going to waste a lot of computation, and any input program that can  
> hope to get compiled for this architecture must have loops that can  
> be fully unrolled, and all its functions must get fully inlined.
>
> Since I have no branches, I think I will need to predicate code  
> before DAG Selection. I was thinking of doing this:
> 1) Run an analysis on the LLVM bitcode that calculates the  
> condition under which an instruction will be "reached" -- this is
> straightforward since my input program is not allowed to have  
> cycles in the CFG.
> 2) Run a pass that requires the above analysis and uses it to:
>    - merge all basic blocks in topological sort order (which  
> exists, because CFG is acyclic).
>    - insert appropriate instructions to generate the predicates.
>    - change all PHI-nodes into Select nodes.
>    - predicate memory operations (well, at least the stores).
I am not aware of a mechanism to completely remove branches for any  
general program. Don't you have to restrict it a loop with known  
iteration count?
>
> It is this final predication step that I am not sure how to handle.  
> Since LLVM does not have predicated load/store instructions, will I  
> have to upgrade the memory operations to a call or something that  
> would then have to be custom handled by the SelectionDAG mechanism?
Target specific instructions can have predicate operands. See ARM.  
I've implemented a if-converter but it runs after register allocation  
so it won't fit your needs.

Ideally (because that's what everyone does) you want a pass that  
operate on 3-address machine specific instructions. Then you can  
predicate code and merge blocks together before you do scheduling.  
The complication our codegen scheme is we do instruction scheduling  
on DAGs and only translate into 3-address code at the end of scheduling.

Sometime down the road we will be implementing whole function  
instruction scheduling. Once that's done, you can perhaps do a  
predication pass that operate on the DAG that represent the whole  
function. But that's not going to help you right now. So for now,  
your choice is either 1) Operate on DAGs that represent basic blocks,  
predicate nodes and merge multiple DAGs  into a larger DAG. 2)  
Operate on the machine instructions and machine basic blocks after  
instruction scheduling and redo scheduling afterwards.

Evan
>
> Or should I just leave the load/stores alone, and somehow, later  
> after the machine instructions are created predicate them using  
> this predicate information? If so, how do I ensure that the  
> dependency on the predicate generating instructions is preserved  
> (the predicate instructions are not dce'd away)?
>
> Or should I be trying to do the predication right before Legalize?  
> If so, I'll need to think about it a little more.
>
> Thank you for your time :)
> nikhil
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Nikhil A. Patil

2007-Oct-08 14:32 UTC

head link

[LLVMdev] Predication before CodeGen

On 08/10/2007, Evan Cheng <evan.cheng at apple.com>
wrote:>
> On Oct 7, 2007, at 12:51 AM, Nikhil A. Patil wrote:
>
> > Hi,
> >
> > I am  planning to generate code for a peculiar architecture with
> > _no_ branch instructions (!), but with predicated loads and stores
> > to memory. This means the architecture is not Turing complete, is
> > going to waste a lot of computation, and any input program that can
> > hope to get compiled for this architecture must have loops that can
> > be fully unrolled, and all its functions must get fully inlined.
> >
> > Since I have no branches, I think I will need to predicate code
> > before DAG Selection. I was thinking of doing this:
> > 1) Run an analysis on the LLVM bitcode that calculates the
> > condition under which an instruction will be "reached" --
this is
> > straightforward since my input program is not allowed to have
> > cycles in the CFG.
> > 2) Run a pass that requires the above analysis and uses it to:
> >    - merge all basic blocks in topological sort order (which
> > exists, because CFG is acyclic).
> >    - insert appropriate instructions to generate the predicates.
> >    - change all PHI-nodes into Select nodes.
> >    - predicate memory operations (well, at least the stores).
>
> I am not aware of a mechanism to completely remove branches for any
> general program. Don't you have to restrict it a loop with known
> iteration count?
Yes, I am going to have to live with that restriction. I also need to
assume no recursive functions, so all calls can be fully inlined.
>
> >
> > It is this final predication step that I am not sure how to handle.
> > Since LLVM does not have predicated load/store instructions, will I
> > have to upgrade the memory operations to a call or something that
> > would then have to be custom handled by the SelectionDAG mechanism?
>
> Target specific instructions can have predicate operands. See ARM.
> I've implemented a if-converter but it runs after register allocation
> so it won't fit your needs.
>
> Ideally (because that's what everyone does) you want a pass that
> operate on 3-address machine specific instructions. Then you can
> predicate code and merge blocks together before you do scheduling.
> The complication our codegen scheme is we do instruction scheduling
> on DAGs and only translate into 3-address code at the end of scheduling.
>
> Sometime down the road we will be implementing whole function
> instruction scheduling. Once that's done, you can perhaps do a
> predication pass that operate on the DAG that represent the whole
> function. But that's not going to help you right now. So for now,
> your choice is either 1) Operate on DAGs that represent basic blocks,
> predicate nodes and merge multiple DAGs  into a larger DAG. 2)
> Operate on the machine instructions and machine basic blocks after
> instruction scheduling and redo scheduling afterwards.
Thanks, that solves my confusion. Actually I did think of this earlier
but rejected the idea since this will mean that the Legalize phase is
going to generate "illegal" output -- it will still have branches in
it. I guess that can't be avoided anyway, because all input programs
aren't even legal :)

Also, is there a straightforward way to directly go from LLVM bitcode
to some generic form of machine instructions -- something like
lib/Target/Generic? The docs mention an "old-style 'simple'
instruction selector, which effectively peephole selects each LLVM
instruction into a series of machine instructions."  Is this still
available?

Thanks,
nikhil
>
> Evan
>
> >
> > Or should I just leave the load/stores alone, and somehow, later
> > after the machine instructions are created predicate them using
> > this predicate information? If so, how do I ensure that the
> > dependency on the predicate generating instructions is preserved
> > (the predicate instructions are not dce'd away)?
> >
> > Or should I be trying to do the predication right before Legalize?
> > If so, I'll need to think about it a little more.
> >
> > Thank you for your time :)
> > nikhil
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

Reasonably Related Threads

Search for more possibly parallel threads

llvm dev - Oct 2007 - [LLVMdev] Predication before CodeGen

[LLVMdev] Predication before CodeGen

[LLVMdev] Predication before CodeGen

[LLVMdev] Predication before CodeGen

[LLVMdev] Predication before CodeGen

Reasonably Related Threads