"Валерий Хаменя"
2003-Aug-26 15:44 UTC
[LLVMdev] repeated recursion with "frozen" arguments
Hi llvm-devels, there is a very simple using case concerning LLVM and I am wondering whether it might be optimized at LLVM layer: //----------- int rec_func(int x, int y) { if(y<0) return x; return y + rec_func(x, y-1); // should we really push x? } void main() { rec_func(1, 1000); rec_func(2, 2000); } //----------- Guys, don't focus on a stupid mathematics beyond this function :) Please focus your attention at the fact, that argument `x' is once supplied at the top level calls in `main' and never changed anymore from within `rec_func' function. This example is just quite close to a "lexical closures" specifics. The similar things we have when we pass `this' pointer to non-static member functions in C++ (why should we pass `this' pointer always if it was not changed?). However, I guess in classical code generation approaches argument `x' (like `this' pointer) is always pushed into a stack with _every_ call. But, it is possible to generate individual code implementations for every top-level call of `rec_func' to avoid big number of extra `push'. In terms of C++: there should be a possibility to regenerate non-static member functions for a given instantiated object of the classes where performance is critical, e.g. for classes like valarray. So, my questions are: Q1. should/might the optimization like this be resolved at LLVM layer? if "yes" then Q2. is LLVM capable of doing things like that today? Thanks. -- Valery
> there is a very simple using case concerning LLVM and I am wondering > whether it might be optimized at LLVM layer: > > //----------- > int rec_func(int x, int y) { > if(y<0) return x; > return y + rec_func(x, y-1); // should we really push x? > }Probably not, at least not in the near future. At some point I have had thoughts about implementing a tail recursion pass, but without prior transformation, this function would not be a candidate.> Please focus your attention at the fact, that argument `x' > is once supplied at the top level calls in `main' and > never changed anymore from within `rec_func' function. > This example is just quite close to a "lexical closures" specifics.The problem with "lexical closures", at least in this instance, is that you have to pass a pointer to the closure around: if you only have one parameter which can use it, you aren't saving anything.> The similar things we have when we pass `this' pointer > to non-static member functions in C++ (why should we > pass `this' pointer always if it was not changed?).In C++, the 'this' object pointer can actually be thought of as a closure in some sense already...> However, I guess in classical code generation approaches > argument `x' (like `this' pointer) is always pushed into > a stack with _every_ call.Absolutely.> But, it is possible to generate individual code implementations > for every top-level call of `rec_func' to avoid big number of > extra `push'.In terms of LLVM you could certainly do this. For example you could translate it to use an explicit stack: int rec_func(int x, int y) { if(y<0) return x; return y + rec_func(x, y-1); // should we really push x? } Or you could just perform an aggressive dead code elimination pass which would realize that this function returns the same value for X no matter what the value of Y is. There are other transformations that you can do to work on this kind of thing, the functional language community has a lot of ways to deal with this.> In terms of C++: there should be a possibility to regenerate > non-static member functions for a given instantiated object > of the classes where performance is critical, e.g. for classes > like valarray.Sure, but you still need the 'this' pointer, to know WHICH valarray you are talking about... ?> So, my questions are: > Q1. should/might the optimization like this be resolved at LLVM layer?Yes, LLVM is designed for spiffy optimizations like this. :)> if "yes" then > Q2. is LLVM capable of doing things like that today?Nope, not yet. Contributions are always appreciated though. :) -Chris -- http://llvm.cs.uiuc.edu/ http://www.nondot.org/~sabre/Projects/
Valery A.Khamenya
2003-Aug-26 16:32 UTC
[LLVMdev] repeated recursion with "frozen" arguments
Hello Chris, Tuesday, August 26, 2003, 11:02:45 PM, you wrote:>> there is a very simple using case concerning LLVM and I am wondering >> whether it might be optimized at LLVM layer: >> >> //----------- >> int rec_func(int x, int y) { >> if(y<0) return x; >> return y + rec_func(x, y-1); // should we really push x? >> }CL> Probably not, at least not in the near future. At some point I have had CL> thoughts about implementing a tail recursion pass, but without prior CL> transformation, this function would not be a candidate. wait... if you break this C-example at the place as you did then it is absolutely not what I have meant. Indeed, sense of my example comes iff the top-level call is given, i.e. it might be optimized in the context of concrete external call. CL> The problem with "lexical closures", at least in this instance, is CL> that you have to pass a pointer to the closure around: if you only CL> have one parameter which can use it, you aren't saving anything. generally you are right. But only generally :) In particular, my example showed a feature of typical lexical closure.>> The similar things we have when we pass `this' pointer >> to non-static member functions in C++ (why should we >> pass `this' pointer always if it was not changed?).CL> In C++, the 'this' object pointer can actually be thought of as a CL> closure in some sense already... this is exactly what I have meant. If the C example I gave might be optimized with LLVM engine, than `this' and lexical closures might be boosted as well.>> But, it is possible to generate individual code implementations >> for every top-level call of `rec_func' to avoid big number of >> extra `push'.CL> In terms of LLVM you could certainly do this. For example you could CL> translate it to use an explicit stack: CL> int rec_func(int x, int y) { CL> if(y<0) return x; CL> return y + rec_func(x, y-1); // should we really push x? CL> } then I have to dig LLVM docs about explicit stack :) CL> Or you could just perform an aggressive dead code elimination pass CL> which would realize that this function returns the same value for CL> X no matter what the value of Y is. you misunderstood this function a bit, but it is not important, dead code elimination was out of my scope.>> In terms of C++: there should be a possibility to regenerate >> non-static member functions for a given instantiated object >> of the classes where performance is critical, e.g. for classes >> like valarray.CL> Sure, but you still need the 'this' pointer, to know WHICH valarray you CL> are talking about... ? sure. But what if this `this' was set once for 1mio calls? Today I have a lot heavy math function working with two spectra and every time I have to think twice before making another member function which is invoked in deeper stack frames. In other words, just imagine the situation, when for a given object really a lot of *its* members function are called without changing `this'. Today compilers can't eliminate this _extra_ `this'. And real C++ code has tendency to be slower. I'd say it is not just a problem of current compilers. It is problem of ultimate static compilation. In cases when `this' comes in run-time even a super-smart static compilers can't recompile a function with "built-in" `this'.>> So, my questions are: >> Q1. should/might the optimization like this be resolved at LLVM layer?CL> Yes, LLVM is designed for spiffy optimizations like this. :) nicccce.>> if "yes" then >> Q2. is LLVM capable of doing things like that today?CL> Nope, not yet. Contributions are always appreciated though. :) now I have to prepare for sleeping, that's for sure. Probably this nice patch from me doesn't come today :)) Have a nice day! -- Best regards, Valery A.Khamenya mailto:khamenya at mail.ru Local Time: 23:01
The optimization you are describing is called procedure cloning, where the body of a procedure is copied and then specialized for particular call sites. It is usually driven by interprocedural constant propagation, and often by profiling to find which cases are frequent enough to make it worthwhile (here the recursive calls would make both cases -- x=1 and x=2 -- worthwhile). LLVM would be very well suited to this though we don't have any of the above capabilities yet. --Vikram http://www.cs.uiuc.edu/~vadve> -----Original Message----- > From: llvmdev-admin at cs.uiuc.edu > [mailto:llvmdev-admin at cs.uiuc.edu]On Behalf Of Валерий Хаменя > Sent: Tuesday, August 26, 2003 9:54 AM > To: llvmdev at cs.uiuc.edu > Subject: [LLVMdev] repeated recursion with "frozen" arguments > > > Hi llvm-devels, > > there is a very simple using case concerning LLVM and I am > wondering whether it might be optimized at LLVM layer: > > //----------- > int rec_func(int x, int y) { > if(y<0) return x; > return y + rec_func(x, y-1); // should we really push x? > } > void main() { > rec_func(1, 1000); > rec_func(2, 2000); > } > //----------- > > Guys, don't focus on a stupid mathematics beyond this function :) > Please focus your attention at the fact, that argument `x' > is once supplied at the top level calls in `main' and > never changed anymore from within `rec_func' function. > > This example is just quite close to a "lexical closures" specifics. > > The similar things we have when we pass `this' pointer > to non-static member functions in C++ (why should we > pass `this' pointer always if it was not changed?). > > However, I guess in classical code generation approaches > argument `x' (like `this' pointer) is always pushed into > a stack with _every_ call. > > But, it is possible to generate individual code implementations > for every top-level call of `rec_func' to avoid big number of > extra `push'. > > In terms of C++: there should be a possibility to regenerate > non-static member functions for a given instantiated object > of the classes where performance is critical, e.g. for classes > like valarray. > > So, my questions are: > > Q1. should/might the optimization like this be resolved at LLVM layer? > > if "yes" then > > Q2. is LLVM capable of doing things like that today? > > Thanks. > -- > Valery > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://mail.cs.uiuc.edu/mailman/listinfo/llvmdev >
Hi LLVM-devels> The optimization you are describing is called procedure cloning, > [...] > LLVM would be very well suited to this though we don't have any > of the above capabilities yet.oh, anyway nice to here that it might come. I am looking towards LLVM abilities like that, because I believe they could really boost the family of languages for functional programming. And I do believe that with a help of LLVM those languages should bring better performance then classically used imperative languages with single static compilation. -- Valery A.Khamenya
Possibly Parallel Threads
- [LLVMdev] repeated recursion with "frozen" arguments
- [LLVMdev] repeated recursion with "frozen" arguments
- [LLVMdev] repeated recursion with "frozen" arguments
- [LLVMdev] repeated recursion with "frozen" arguments
- [LLVMdev] repeated recursion with "frozen" arguments