thr3ads.net - llvm dev - [LLVMdev] VLIW Scheduling [Sep 2005]

If this information is useful, please help other people find it:
Share via:

Tzu-Chien Chiu

2005-Sep-14 08:27 UTC

[LLVMdev] VLIW Scheduling

VLIW (Very Long Instruction Word) is a long instruction format (called
"group" hereafter) contains several instructions. These instructions
are not dependent on each other and could be issued in a single cycle.

At this moment there is no correspondent class for VLIW. MachineInstr
object can only represent one instruction. Usually the number of
instructions in a group is fixed. The grouping could be implicitly
maintained by inserting dummy nops to the instruction sequence. For
example, there are at most two  instructions in a VLIW, if items in
MachineBasicBlock::Insts is:

  MUL
  ADD
  ADD
  NOP
  MUL
  NOP

It implies the scheduling:

  MUL - ADD
  ADD - NOP
  MUL - NOP

Each pairs could be issued in parallel.

However, "one single MachineInstr" is the "unit" of many
passes, e.g.
LiveVariables and LiveIntervals. Without modifications, some code-gen
result may be inefficient (or incorrect).

For example:

  mul %reg1025, %reg1024, 1
  add %reg1026, %reg1024, 2

Suppose the life interval of %reg1024 ends at ADD (no use after ADD).
These two instructions are not dependent and could be scheduled
together:

  mul %reg1024, %reg1025, 1 -  add %reg1026, %reg1025, 2

The life intervals of  vi%reg1024 and %reg1025 are actually _not_
interfered with each other,

  %reg1024 => r1
  %reg1025 => r1
  %reg1026 => r2

  mul  r1, r1, 1 -  add r2, r1, 2

Note the physical register r1 could be re-used. r1 is the source
register of both instructions, its valued is fetched and used as the
inputs to MUL and ADD, before the MUL updates r1. It's not an
anti-dependency.

But without modification, the live interval analysis and the register
allocator will consider %reg1024 and %reg1025, and allocate different
physical registers to each of them. What's the minimal changes I have
to make to "cheat" the LiveIntervals and LiveVariables?

What else I may have to change for VLIW scheduling?

-- 
Tzu-Chien Chiu,
3D Graphics Hardware Architect
<URL:http://www.csie.nctu.edu.tw/~jwchiu>

Chris Lattner

2005-Sep-14 17:09 UTC

head link

[LLVMdev] VLIW Scheduling

On Wed, 14 Sep 2005, Tzu-Chien Chiu wrote:> VLIW (Very Long Instruction Word) is a long instruction format (called
> "group" hereafter) contains several instructions. These
instructions
> are not dependent on each other and could be issued in a single cycle.
>
> At this moment there is no correspondent class for VLIW. MachineInstr
> object can only represent one instruction. Usually the number of
> instructions in a group is fixed. The grouping could be implicitly
> maintained by inserting dummy nops to the instruction sequence. For
> example, there are at most two  instructions in a VLIW, if items in
> MachineBasicBlock::Insts is:
>
>  MUL
>  ADD
>  ADD
>  NOP
>  MUL
>  NOP
>
> It implies the scheduling:
>
>  MUL - ADD
>  ADD - NOP
>  MUL - NOP
>
> Each pairs could be issued in parallel.
Yup, this is currently the approach we're taking with IA64, for example. 
It has bundles which are similar in spirit to what a VLIW has (though not 
identical).
> However, "one single MachineInstr" is the "unit" of
many passes, e.g.
> LiveVariables and LiveIntervals. Without modifications, some code-gen
> result may be inefficient (or incorrect).
>
> For example:
>
>  mul %reg1025, %reg1024, 1
>  add %reg1026, %reg1024, 2
>
> Suppose the life interval of %reg1024 ends at ADD (no use after ADD).
> These two instructions are not dependent and could be scheduled
> together:
>
>  mul %reg1024, %reg1025, 1 -  add %reg1026, %reg1025, 2
>
> The life intervals of  vi%reg1024 and %reg1025 are actually _not_
> interfered with each other,
>
>  %reg1024 => r1
>  %reg1025 => r1
>  %reg1026 => r2
>
>  mul  r1, r1, 1 -  add r2, r1, 2
I see.
> Note the physical register r1 could be re-used. r1 is the source
> register of both instructions, its valued is fetched and used as the
> inputs to MUL and ADD, before the MUL updates r1. It's not an
> anti-dependency.
Right.
> But without modification, the live interval analysis and the register
> allocator will consider %reg1024 and %reg1025, and allocate different
> physical registers to each of them. What's the minimal changes I have
> to make to "cheat" the LiveIntervals and LiveVariables?
> What else I may have to change for VLIW scheduling?
I think the best way to handle this is to make "macro instructions". 
The
current pipeline we have is this:

1. Instruction Selection   (LLVM code to a target DAG)
2. Pre-pass Scheduling     (target DAG to machineinstrs)
3. Register allocation
4. ...

Given this setup, I think it would make the most sense for you, to have a 
target-specific phase #2.5, that takes the individual operations and makes 
the VLIW bundles you need for proper register allocation.  In the case 
above, this would turn into:

  mul_add  r1, r1, 1  ,   r2, r1, 2

The problem with this is that you'll obviously have an N^2 explosion of 
instructions.  If this isn't an option for you, you could set up a few 
classes of macro ops, so you could have something like this:

  vliw_rri_rri  "mul", r1, r1, 1   ,  "add", r2, r1, 2

where "mul" and "add" are immediates that are rendered as
the appropriate
opcodes by the asmprinter.  If you did this, you'd only need one 
instruction for each "pattern" (in this case, a VLIW packet where the 
first takes a register/register/immediate and the second takes the same).

This should expose to the RA the property that all inputs are read before 
the outputs are produced.

If you have your own custom isel/scheduler, it might make sense to have 
the scheduler build these packets for you.

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/

Possibly Parallel Threads

Search for more apparently analagous threads

llvm dev - Sep 2005 - [LLVMdev] VLIW Scheduling

[LLVMdev] VLIW Scheduling

[LLVMdev] VLIW Scheduling

Possibly Parallel Threads