Hi, Gang:
I remember there were dissenting voices when you checked in the code.
I agree with them, although I didn't reply to your mail on open64's mailing
list.
The transformation you illustrate involves two operations:
1) promote the WHILE-loop into a DO-loop (i.e., a non-countable loop into a
countable loop)
2) get rid of the trip-count dec/inc and compare.
1) is irrelevant to HW loops. Any scalar optimizer should handle 1).
It is not difficult at all to handle 2) in CodeGen, and it is unnecessary
to introduce an operator just for that purpose.
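The two operations above can be sketched in C. This is only an illustration
built from the while-loop quoted below; the function names (sum_while,
trip_count, sum_counted) are hypothetical and are not part of Open64 or LLVM,
and the for-loop merely stands in for a hardware loop whose counter would be
maintained by the hardware rather than by instructions in the loop.

```c
/* Original form from the thread: a WHILE loop whose exit test and
 * counter update (k3 >= 10, k3--) are ordinary instructions. */
int sum_while(int k1, int k3) {
    int sum = 0;
    while (k3 >= 10) {
        sum += k1;
        k3--;
    }
    return sum;
}

/* Operation 1: the loop is countable, so the trip count can be
 * computed once before the loop. It is k3 - 9 when the loop is
 * entered at all, and 0 otherwise. */
int trip_count(int k3) {
    return (k3 >= 10) ? (k3 - 9) : 0;
}

/* Operation 2: the body no longer touches k3. The counter n below is
 * a software stand-in for the hardware loop counter (an assumption
 * made for illustration only). */
int sum_counted(int k1, int k3) {
    int sum = 0;
    for (int n = trip_count(k3); n > 0; n--) {
        sum += k1;
    }
    return sum;
}
```

For any k3 small enough that k1 times the trip count does not overflow an
int, both functions return the same sum; they differ only in whether the
dec/compare on k3 appears in the loop body.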
Shuxin
On 11/22/2012 06:03 AM, Gang Yu wrote:
> I am the designer of the open64 hwloop structure, but I am not a student.
>
> Hope the following helps:
>
> To transform a loop into a hwloop, we need help from the optimizer. For
> example, it turns
>
> while(k3>=10){
>     sum+=k1;
>     k3 --;
> }
>
> into the form:
>
> zdl_loop(k3-9) {
>     sum+=k1;
> }
>
> So, we introduce a new ZDLBR operator in WHIRL (the open64 optimizer
> intermediate representation), which represents the loop in WHIRL as:
>
> LABEL L2050 0 {line: 0}
> LOOP_INFO 0 1 1
> I4I4LDID 73 <1,2,.preg_I4> T<4,.predef_I4,4> # k3
> I4I4LDID 77 <1,2,.preg_I4> T<4,.predef_I4,4> # <preg>
> END_LOOP_INFO
> I4I4LDID 74 <1,2,.preg_I4> T<4,.predef_I4,4> # k1
> I4I4LDID 75 <1,2,.preg_I4> T<4,.predef_I4,4> # sum
> I4ADD
> I4STID 75 <1,2,.preg_I4> T<4,.predef_I4,4> # sum {line: 5}
> ZDLBR L2050 {line: 0}
>
> Then, we let cg do the rest. Such a design abstracts the general
> operations in the optimizer and leaves the target-specific part in cg,
> where ZDLBR remains a simulated op until cg loop optimization has
> finished. We implement multi-level nested hwloops with this approach.
> We believe GCC's three doloop expand names do the same.
>
> More details, please take a look at
>
> http://wiki.open64.net/index.php/Zero_Delay_Loop
>
> Thanks
> Gang
>
Hi Shuxin,

Promoting a while-loop to a do-loop is the job of loop induction
recognition, not of this transformation. We do the scalar transform for
hwloop in the optimizer because it is troublesome to distinguish the
trip-counting code from the real production code and to eliminate it in cg;
we would have to write customized code to handle this general task for
every target. So we rely on the optimizer's DCE so that the trip-count code
is hidden in the emitted WHIRL. That greatly simplifies the design,
especially the interaction with cg unroll. You can see in the code that we
added validity-check functionality, yet the code became smaller and more
stable.

Gang

On 2012-11-23, at 3:17, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
> Hi, Gang:
>
> I remember there were dissenting voices when you checked in the code.
> I agree with them, although I didn't reply to your mail on open64's
> mailing list.
>
> The transformation you illustrate involves two operations:
> 1) promote the WHILE-loop into a DO-loop (i.e., a non-countable loop into
> a countable loop)
> 2) get rid of the trip-count dec/inc and compare.
>
> 1) is irrelevant to HW loops. Any scalar optimizer should handle 1).
> It is not difficult at all to handle 2) in CodeGen, and it is unnecessary
> to introduce an operator just for that purpose.
>
> Shuxin
>
> On 11/22/2012 06:03 AM, Gang Yu wrote:
>> I am the designer of the open64 hwloop structure, but I am not a student.
>>
>> Hope the following helps:
>>
>> To transform a loop into a hwloop, we need help from the optimizer. For
>> example, it turns
>>
>> while(k3>=10){
>>     sum+=k1;
>>     k3 --;
>> }
>>
>> into the form:
>>
>> zdl_loop(k3-9) {
>>     sum+=k1;
>> }
>>
>> So, we introduce a new ZDLBR operator in WHIRL (the open64 optimizer
>> intermediate representation), which represents the loop in WHIRL as:
>>
>> LABEL L2050 0 {line: 0}
>> LOOP_INFO 0 1 1
>> I4I4LDID 73 <1,2,.preg_I4> T<4,.predef_I4,4> # k3
>> I4I4LDID 77 <1,2,.preg_I4> T<4,.predef_I4,4> # <preg>
>> END_LOOP_INFO
>> I4I4LDID 74 <1,2,.preg_I4> T<4,.predef_I4,4> # k1
>> I4I4LDID 75 <1,2,.preg_I4> T<4,.predef_I4,4> # sum
>> I4ADD
>> I4STID 75 <1,2,.preg_I4> T<4,.predef_I4,4> # sum {line: 5}
>> ZDLBR L2050 {line: 0}
>>
>> Then, we let cg do the rest. Such a design abstracts the general
>> operations in the optimizer and leaves the target-specific part in cg,
>> where ZDLBR remains a simulated op until cg loop optimization has
>> finished. We implement multi-level nested hwloops with this approach.
>> We believe GCC's three doloop expand names do the same.
>>
>> More details, please take a look at
>>
>> http://wiki.open64.net/index.php/Zero_Delay_Loop
>>
>> Thanks
>> Gang
>>
Hi, Gang:
I don't want to discuss Open64 internals on the LLVM mailing list. Let us
focus only on the design per se.
This mail and your previous mail combined give me the impression that the
only reason you introduced a specific operator for HW loops in the scalar
optimizer is that you have a hard time figuring out the trip count in
CodeGen.
This might be true for Open64's CodeGen (I don't want to discuss this issue
on this mailing list), but in general it is not true for other compilers.
I'm dubious about "it greatly simplifies the design". The downstream passes
need to be fully aware of this new operator, which doesn't make things any
simpler.
Thanks
Shuxin
On 11/22/2012 02:56 PM, Gang Yu wrote:
> Hi Shuxin,
>
> Promoting a while-loop to a do-loop is the job of loop induction
> recognition, not of this transformation. We do the scalar transform for
> hwloop in the optimizer because it is troublesome to distinguish the
> trip-counting code from the real production code and to eliminate it in
> cg; we would have to write customized code to handle this general task
> for every target. So we rely on the optimizer's DCE so that the
> trip-count code is hidden in the emitted WHIRL. That greatly simplifies
> the design, especially the interaction with cg unroll. You can see in the
> code that we added validity-check functionality, yet the code became
> smaller and more stable.
>
> Gang
>