thr3ads.net - llvm dev - [LLVMdev] Question about an unusual jump instruction [Jul 2012]

Michele Scandale

2012-Jul-25 07:48 UTC

[LLVMdev] Question about an unusual jump instruction

Dear all,

I'm working on an exploratory backend for LLVM. The instruction set I'm targeting
has an instruction (called DECJNZ) that decrements a register and, if the
decremented value is not zero, jumps (with a relative jump) to a given offset.
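
As a rough illustration of the semantics just described (a plain C++ model, not backend code), the instruction can be viewed as a function that decrements its register in place and reports whether the branch is taken. The names `decjnz` and `countIterations` are hypothetical and exist only for this sketch.

```cpp
#include <cassert>

// Toy model of DECJNZ: decrement the register in place and report whether
// the branch is taken, i.e. whether the decremented value is non-zero.
// The function name is hypothetical and only mirrors the description above.
bool decjnz(int &reg) {
  --reg;
  return reg != 0;
}

// A countdown do/while loop written around the modelled instruction.
// Returns how many times the loop body ran.
int countIterations(int a) {
  assert(a > 0);           // assumption: the trip count is positive
  int i = a;
  int iterations = 0;
  do {
    ++iterations;          // stands in for the loop body
  } while (decjnz(i));     // --i; branch back while i != 0
  return iterations;
}
```

Under this model, a loop entered with a positive count `a` runs its body exactly `a` times, which is the behaviour the single instruction is meant to provide.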

I've described this instruction in TableGen as follows:

def DECJNZ : Instruction {
  let Namespace = "MyTarget";
  let OutOperandList = (outs GprRegs:$R0);
  let InOperandList = (ins GprRegs:$R1, imm16:$dest);
  let AsmString = "DECJNZ $R0, $dest";
  let isBranch = 1;
  let isTerminator = 1;
  let Constraints = "$R1 = $R0";  // the counter is read and written in place
  let Defs = [SR];                // SR is implicitly defined
}

I would like to create an optimization pass to make countable loops faster by 
using this instruction.

The simplest loop that I would like to optimize is like:

//////////////////////////
int i = a;
do {
   // loop body
   --i;
} while (i != 0);
//////////////////////////


After instruction selection I have something like:

BB0:
   %vreg0<def> = COPY %R0; // R0 contains 'a'
   J <#BB1>
BB1:
   %vreg1<def> = PHI %vreg0, <#BB0>, %vreg3, <#BB3>
   J <#BB2>
BB2:
   // loop body
BB3:
   %vreg3<def> = ADDI %vreg1<kill>, -1
   CMPNE %vreg3, 0, %SR<implicit,def>
   JNZ <#BB1>
   J <#BB4>
BB4:
   // end

With the optimization pass I replace the decrement, comparison and conditional 
jump with the DECJNZ. The resulting code will be:

BB0:
   %vreg0<def> = COPY %R0; // R0 contains 'a'
   J <#BB1>
BB1:
   %vreg1<def> = PHI %vreg0, <#BB0>, %vreg3, <#BB3>
   J <#BB2>
BB2:
   // loop body
BB3:
   %vreg3<def> = DECJNZ %vreg1<kill>, <#BB1>, %SR<implicit,def>
   J <#BB4>
BB4:
   // end

The first problem was related to PHIElimination: while eliminating the PHI node, a
copy was inserted before the DECJNZ, because it's a terminator instruction, but
the copy should use the value defined by the DECJNZ.
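
To make the copy-placement problem concrete, here is a toy interpreter for the loop after PHI elimination. It is only a sketch in plain C++, not LLVM code: `runLoop`, `phiTmp` and the flag `copyBeforeTerminator` are hypothetical names, and the initial value of `vreg3` is assumed to be 0 although on a real machine it would be undefined.

```cpp
#include <cassert>

// Toy interpreter for the loop after PHI elimination. All names are
// hypothetical. `phiTmp` stands for the register that carries the PHI
// value from the predecessor blocks into BB1.
// Returns the number of times the loop body runs (capped at `limit`).
int runLoop(int a, bool copyBeforeTerminator, int limit = 1000) {
  assert(a > 0);            // assumption: the trip count is positive
  int vreg3 = 0;            // not yet defined on the first iteration
  int phiTmp = a;           // BB0: phiTmp = COPY vreg0
  int iterations = 0;
  while (iterations < limit) {
    int vreg1 = phiTmp;     // BB1: vreg1 = COPY phiTmp (the lowered PHI)
    ++iterations;           // BB2: loop body
    if (copyBeforeTerminator)
      phiTmp = vreg3;       // BB3: copy placed before DECJNZ reads a stale vreg3
    vreg3 = vreg1 - 1;      // BB3: vreg3 = DECJNZ vreg1, BB1
    bool taken = vreg3 != 0;
    if (!copyBeforeTerminator)
      phiTmp = vreg3;       // copy that sees the value DECJNZ just defined
    if (!taken)
      break;                // fall through to BB4
  }
  return iterations;
}
```

In this model, `runLoop(3, false)` runs the body three times, while `runLoop(3, true)` does not, because each copy picks up the value from the previous iteration instead of the one the terminator is about to define.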

To solve this problem I wrote a preprocessing pass that runs just before
PHIElimination and changes the opcode of every PHI that has at least one source
value generated by a DECJNZ, so that PHIElimination ignores it. A second pass
runs just after PHIElimination, lowers the marked PHIs in a custom way, and
updates the live-variable information.

In some tests I hit a second problem, caused by spilling the register used as
the loop counter. The spill store is generated after the definition of the
value, so it is inserted between the DECJNZ and the J.

I think that with another pass I could manually move the spill store to the
beginning of the destination basic block, but I think that's not enough to
preserve the semantics of the code.
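
The spill problem can be sketched with a similar toy model in plain C++. Everything here is hypothetical (`StorePlacement`, `runSpilledLoop`, the iteration cap), and it assumes the counter is reloaded from its stack slot at the top of each iteration. It only illustrates why a store placed after the DECJNZ in the same block is skipped whenever the branch is taken.

```cpp
#include <cassert>

// Where the spill store of the decremented counter ends up (hypothetical).
enum class StorePlacement { AfterDecjnz, InBothSuccessors };

// Toy model of the loop with a spilled counter. Returns the number of body
// executions, or -1 if the counter never reaches zero within `limit`.
int runSpilledLoop(int a, StorePlacement placement, int limit = 1000) {
  assert(a > 0);              // assumption: the trip count is positive
  int slot = a;               // BB0: counter spilled to its stack slot
  for (int iterations = 1; iterations <= limit; ++iterations) {
    int reg = slot;           // reload the counter from the stack slot
    // ... loop body ...
    --reg;                    // DECJNZ: decrement ...
    bool taken = reg != 0;    // ... and branch back if non-zero
    // A store placed after DECJNZ in the same block only runs on fall-through.
    if (placement == StorePlacement::InBothSuccessors || !taken)
      slot = reg;
    if (!taken)
      return iterations;
  }
  return -1;                  // the stack slot was never updated
}
```

With `AfterDecjnz` and a count above 1 the slot keeps its initial value, so the model never terminates on its own. Storing on both outgoing paths fixes the toy, but it says nothing about other predecessors of the branch target (BB1 is also entered from BB0), which is consistent with the doubt raised above about whether moving the store is enough.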

Is my approach correct? Is there a cleaner and more elegant way to support
this kind of instruction? I tried to look for a similar instruction in other
targets, but I've found nothing...

Thanks in advance for future replies.

Regards,

Michele Scandale

Eli Friedman

2012-Jul-25 08:07 UTC

On Wed, Jul 25, 2012 at 12:48 AM, Michele Scandale
<michele.scandale at gmail.com> wrote:
> [...]
> I tried to look for a similar instruction in other targets, but I've found
> nothing...

See PPCCTRLoops.cpp in the PPC backend.

-Eli

Michele Scandale

2012-Jul-25 08:55 UTC

On 25/07/2012 10:07, Eli Friedman wrote:
> See PPCCTRLoops.cpp in the PPC backend.
>
> -Eli

I took a quick look at PPCCTRLoops.cpp, but I don't think it's the same as my
case. The instruction I have explicitly defines a GPR, while BDNZ defines and
uses an implicit dedicated register. I also based my optimization pass on what I
saw in the Hexagon target, but because I have an explicit GPR as the loop
counter, my instruction defines a virtual register, and the fact that the
instruction is also a terminator creates all the problems I listed in my
previous mail.

Regards,
Michele Scandale
