Hi, Gang:

I remember there were dissenting voices when you checked in the code. I agree with them, although I didn't reply to your mail on Open64's mailing list.

The transformation you illustrate involves two operations:
  1) promoting a WHILE-loop into a DO-loop (i.e. a noncountable loop into a countable loop)
  2) getting rid of the trip-count dec/inc and compare.

1) is irrelevant to HW loops. Any scalar optimizer should handle 1). It is not difficult at all to handle 2) in CodeGen, and it is unnecessary to introduce an operator just for that purpose.

Shuxin

On 11/22/2012 06:03 AM, Gang Yu wrote:
> I am the designer of the open64 hwloop structure, but I am not a student.
>
> Hope the following helps:
>
> To transform a loop into a hwloop, we need help from the optimizer. For
> example, we turn
>
>     while(k3>=10){
>       sum+=k1;
>       k3 --;
>     }
>
> into the form:
>
>     zdl_loop(k3-9) {
>       sum+=k1;
>     }
>
> So we introduce a new ZDLBR whirl (open64 optimizer intermediate)
> operator, which represents the loop in whirl as:
>
>     LABEL L2050 0 {line: 0}
>     LOOP_INFO 0 1 1
>     I4I4LDID 73 <1,2,.preg_I4> T<4,.predef_I4,4> # k3
>     I4I4LDID 77 <1,2,.preg_I4> T<4,.predef_I4,4> # <preg>
>     END_LOOP_INFO
>     I4I4LDID 74 <1,2,.preg_I4> T<4,.predef_I4,4> # k1
>     I4I4LDID 75 <1,2,.preg_I4> T<4,.predef_I4,4> # sum
>     I4ADD
>     I4STID 75 <1,2,.preg_I4> T<4,.predef_I4,4> # sum {line: 5}
>     ZDLBR L2050 {line: 0}
>
> Then we let cg do its work. Such a design abstracts the general
> operations in the optimizer and leaves the target-specific part in cg,
> where the loop remains a simulated op until cg loop optimization has
> finished. We implement multi-level nested hwloops with this approach.
> We believe GCC's 3 doloop expand names do the same.
>
> For more details, please take a look at
>
> http://wiki.open64.net/index.php/Zero_Delay_Loop
>
> Thanks
> Gang
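For readers following the thread, the `while(k3>=10)` to `zdl_loop(k3-9)` rewrite quoted above can be checked with a small C sketch. This is only an illustration under stated assumptions: `zdl_trip_count` and `sum_counted` are hypothetical names, a plain `for` loop stands in for the hardware loop, and the guard for `k3 < 10` is an assumption, since the thread only shows the `k3-9` count.

```c
#include <assert.h>

/* Original form from the thread: the loop tests and decrements k3 itself. */
static int sum_while(int sum, int k1, int k3) {
    while (k3 >= 10) {
        sum += k1;
        k3--;
    }
    return sum;
}

/* Number of iterations the loop above runs: k3 - 9 when k3 >= 10, else 0.
   Hypothetical helper; the zero case is an assumption not shown in the thread. */
static int zdl_trip_count(int k3) {
    return k3 >= 10 ? k3 - 9 : 0;
}

/* Counted form: the body no longer mentions k3 at all.
   A plain for loop stands in for the hardware loop here. */
static int sum_counted(int sum, int k1, int k3) {
    int n = zdl_trip_count(k3);
    for (int i = 0; i < n; i++) {
        sum += k1;
    }
    return sum;
}
```

For example, with `k3 = 12` both forms run the body three times. Note that the sketch only compares `sum`: the while form also leaves `k3` at 9 on exit, so a compiler would need to preserve that value if `k3` were used after the loop.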
Hi Shuxin,

Promoting a while-loop to a do-loop is the job of loop induction recognition, not of this transformation. We do the scalar transform for hwloop in the optimizer because it is troublesome to distinguish the trip-counting code from the real production code and eliminate it in cg; we would have to write customized code to handle this general work in every target. So we take the help of the optimizer's DCE and keep the trip-count code hidden in the emitted whirl. That greatly simplifies the design, especially the interaction with cg unroll. You can see in the code that we added validity-check functionality, yet the code got smaller and more stable.

Gang

On 2012-11-23, 3:17, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
> Hi, Gang:
>
> I remember there were dissenting voices when you checked in the code.
> I agree with them, although I didn't reply to your mail on Open64's mailing list.
>
> The transformation you illustrate involves two operations:
>   1) promoting a WHILE-loop into a DO-loop (i.e. a noncountable loop into a countable loop)
>   2) getting rid of the trip-count dec/inc and compare.
>
> 1) is irrelevant to HW loops. Any scalar optimizer should handle 1).
> It is not difficult at all to handle 2) in CodeGen, and it is unnecessary
> to introduce an operator just for that purpose.
>
> Shuxin
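Gang's point about letting DCE remove the trip-count code can be illustrated with a toy mark-based dead-code pass in C. This is a sketch under assumptions, not Open64's implementation: the `stmt` structure, the `mark_live` function and the three-statement loop bodies are all hypothetical, and `k3` is assumed not to be used after the loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Variables of the toy loop; NONE marks an unused operand slot. */
enum var { NONE, K1, K3, SUM };

/* One statement: defines `def` from up to two uses. `root` marks statements
   that must stay (a branch, or a value that is live after the loop). */
struct stmt {
    enum var def;
    enum var use[2];
    bool root;
};

/* Mark-based DCE: keep the roots, then keep every statement that defines a
   variable used by a kept statement, repeating until nothing changes. */
static void mark_live(const struct stmt *s, size_t n, bool *keep) {
    for (size_t i = 0; i < n; i++)
        keep[i] = s[i].root;
    bool changed = true;
    while (changed) {
        changed = false;
        for (size_t i = 0; i < n; i++) {
            if (!keep[i])
                continue;
            for (size_t u = 0; u < 2; u++) {
                enum var v = s[i].use[u];
                if (v == NONE)
                    continue;
                for (size_t j = 0; j < n; j++) {
                    if (!keep[j] && s[j].def == v) {
                        keep[j] = true;
                        changed = true;
                    }
                }
            }
        }
    }
}

/* While form: the branch tests k3, so the decrement feeds a root. */
static const struct stmt while_body[] = {
    { SUM,  { SUM, K1 },    true  },  /* sum = sum + k1 (live after loop) */
    { K3,   { K3, NONE },   false },  /* k3 = k3 - 1 */
    { NONE, { K3, NONE },   true  },  /* if (k3 >= 10) goto top */
};

/* Counted form: the back edge (ZDLBR in the thread) names no variable. */
static const struct stmt counted_body[] = {
    { SUM,  { SUM, K1 },    true  },  /* sum = sum + k1 (live after loop) */
    { K3,   { K3, NONE },   false },  /* k3 = k3 - 1 */
    { NONE, { NONE, NONE }, true  },  /* counted back edge */
};
```

Running `mark_live` on `while_body` keeps all three statements, because the compare uses `k3`. On `counted_body` the decrement is left unmarked, since nothing that must stay depends on it, which is the effect Gang describes: once the back edge no longer reads the counter, a generic pass can drop the counting code without target-specific logic.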
Hi, Gang:

I don't want to discuss Open64 internals on the LLVM mailing list. Let us focus only on the design per se.

Your two mails combined give me the impression that the only reason you introduced a specific operator for HW loops in the scalar optimizer is that you have a hard time figuring out the trip count in CodeGen. This might be true for Open64's CodeGen (I don't want to discuss this issue on this mailing list), but in general it is not true for other compilers.

I'm dubious that it "greatly simplifies the design". The downstream passes need to be fully aware of this new operator, which doesn't make things any simpler.

Thanks
Shuxin

On 11/22/2012 02:56 PM, Gang Yu wrote:
> Hi Shuxin,
>
> Promoting a while-loop to a do-loop is the job of loop induction
> recognition, not of this transformation. We do the scalar transform for
> hwloop in the optimizer because it is troublesome to distinguish the
> trip-counting code from the real production code and eliminate it in cg;
> we would have to write customized code to handle this general work in
> every target. So we take the help of the optimizer's DCE and keep the
> trip-count code hidden in the emitted whirl. That greatly simplifies the
> design, especially the interaction with cg unroll. You can see in the
> code that we added validity-check functionality, yet the code got
> smaller and more stable.
>
> Gang