Philip, I post here because I think AlwaysInliner should inline it. I want to detect the indirect calls for Inliner, and I want to hear inputs. let me define indirect call first in my idea. In one single expression, one function may be subject to bitcast more than one time. we can detect this situation and treat it as a regular call of last function, is that okay? thanks, --lx On Mon, Jan 5, 2015 at 7:32 AM, Philip Reames <listmail at philipreames.com> wrote:> On 01/04/2015 12:04 AM, Liu Xin wrote: > >> >> %294= callfloatbitcast (float(float, float*)* @__gpu_modff to >> float(float, i64)*)(float%293, i64 %preg.212.addr.0) >> >> as you may know, some gpu backends don't support function call. we need >> to make sure to inline all functions here. however, Inliner can not figure >> out that this is a valid callsite in this form. actually, it is. in C >> words, cast a function and then call should be treat as callsite, right? >> >> Generally, the inliner doesn't do much with indirect calls, but given > there is no simpler canonical case here, I expect we'll have to. > > Its possible we might even want to define this as a direct call. I'm not > sure what the expectations are with regards to the type of the function > being called and the type of the callsite. I suspect a lot of code would > get confused if getCalledFunction returned __gpu_modff with it's unbitcast > type. That's possibly something we should fix though. > > We'll want to get other folks input here, but a small patch to the inliner > to handle this case would seem reasonable to me. > > Philip >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150105/5016835d/attachment.html>
Hi lx, Philip I've seen an instcombine which helps with this situation. It fires when the function types on both sides of the bitcast have the same number of operands and compatible types. It then adds bitcasts on the arguments and removes the one on the called function. I don't have IR to hand, but it would be worth passing your IR through instcombine to see if that helps you. The idea of improving the inliner is also great, but you may find that it's needed for cases other than this one if i'm right about the instcombine. Thanks Pete Sent from my iPhone> On Jan 5, 2015, at 3:16 AM, Liu Xin <navy.xliu at gmail.com> wrote: > > Philip, > > I post here because I think AlwaysInliner should inline it. I want to detect the indirect calls for Inliner, and I want to hear inputs. > > let me define indirect call first in my idea. In one single expression, one function may be subject to bitcast more than one time. we can detect this situation and treat it as a regular call of last function, is that okay? > > thanks, > --lx > > >> On Mon, Jan 5, 2015 at 7:32 AM, Philip Reames <listmail at philipreames.com> wrote: >>> On 01/04/2015 12:04 AM, Liu Xin wrote: >>> >>> %294= callfloatbitcast (float(float, float*)* @__gpu_modff to float(float, i64)*)(float%293, i64 %preg.212.addr.0) >>> >>> as you may know, some gpu backends don't support function call. we need to make sure to inline all functions here. however, Inliner can not figure out that this is a valid callsite in this form. actually, it is. in C words, cast a function and then call should be treat as callsite, right? >> Generally, the inliner doesn't do much with indirect calls, but given there is no simpler canonical case here, I expect we'll have to. >> >> Its possible we might even want to define this as a direct call. I'm not sure what the expectations are with regards to the type of the function being called and the type of the callsite. I suspect a lot of code would get confused if getCalledFunction returned __gpu_modff with it's unbitcast type. That's possibly something we should fix though. >> >> We'll want to get other folks input here, but a small patch to the inliner to handle this case would seem reasonable to me. >> >> Philip > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150105/4d0373db/attachment.html>
On Mon, Jan 5, 2015 at 1:40 AM, Pete Cooper <peter_cooper at apple.com> wrote:> Hi lx, Philip > > I've seen an instcombine which helps with this situation. It fires when > the function types on both sides of the bitcast have the same number of > operands and compatible types. It then adds bitcasts on the arguments and > removes the one on the called function. >It indeed does, InstCombiner::transformConstExprCastCall.> > I don't have IR to hand, but it would be worth passing your IR through > instcombine to see if that helps you. >The following should be a sufficiently workable example of what we would hope to transform. target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" define i32 @g(i32* %a) { entry: %call = tail call i32 bitcast (i32 (i64)* @f to i32 (i32*)*)(i32* %a) ret i32 %call } declare i32 @f(i64) define i32 @h(i64 %a) { entry: %call = tail call i32 bitcast (i32 (i32*)* @g to i32 (i64)*)(i64 %a) ret i32 %call }> > The idea of improving the inliner is also great, but you may find that > it's needed for cases other than this one if i'm right about the > instcombine. >Sadly, the combine fails because InstCombine queries CastInst::isBitCastable to determine the castable-ness of the parameter type and the argument type. It isn't bitcastable though, it's ptrtoint/inttoptr castable. The following patch opens up the optimization: --- a/lib/Transforms/InstCombine/InstCombineCalls.cpp +++ b/lib/Transforms/InstCombine/InstCombineCalls.cpp @@ -1456,7 +1456,7 @@ bool InstCombiner::transformConstExprCastCall(CallSite CS) { Type *ParamTy = FT->getParamType(i); Type *ActTy = (*AI)->getType(); - if (!CastInst::isBitCastable(ActTy, ParamTy)) + if (!CastInst::isBitOrNoopPointerCastable(ActTy, ParamTy, DL)) return false; // Cannot transform this parameter value. if (AttrBuilder(CallerPAL.getParamAttributes(i + 1), i + 1). @@ -1551,7 +1551,7 @@ bool InstCombiner::transformConstExprCastCall(CallSite CS) { if ((*AI)->getType() == ParamTy) { Args.push_back(*AI); } else { - Args.push_back(Builder->CreateBitCast(*AI, ParamTy)); + Args.push_back(Builder->CreateBitOrPointerCast(*AI, ParamTy)); } // Add any parameter attributes. Running opt -instcombine -inline -instcombine with this patch results in: define i32 @g(i32* %a) { entry: %0 = ptrtoint i32* %a to i64 %call = tail call i32 @f(i64 %0) ret i32 %call } declare i32 @f(i64) define i32 @h(i64 %a) { entry: %call.i = tail call i32 @f(i64 %a) ret i32 %call.i }> > Thanks > Pete > > Sent from my iPhone > > On Jan 5, 2015, at 3:16 AM, Liu Xin <navy.xliu at gmail.com> wrote: > > Philip, > > I post here because I think AlwaysInliner should inline it. I want to > detect the indirect calls for Inliner, and I want to hear inputs. > > let me define indirect call first in my idea. In one single expression, > one function may be subject to bitcast more than one time. we can detect > this situation and treat it as a regular call of last function, is that > okay? > > thanks, > --lx > > > On Mon, Jan 5, 2015 at 7:32 AM, Philip Reames <listmail at philipreames.com> > wrote: > >> On 01/04/2015 12:04 AM, Liu Xin wrote: >> >>> >>> %294= callfloatbitcast (float(float, float*)* @__gpu_modff to >>> float(float, i64)*)(float%293, i64 %preg.212.addr.0) >>> >>> as you may know, some gpu backends don't support function call. we need >>> to make sure to inline all functions here. however, Inliner can not figure >>> out that this is a valid callsite in this form. actually, it is. in C >>> words, cast a function and then call should be treat as callsite, right? >>> >>> Generally, the inliner doesn't do much with indirect calls, but given >> there is no simpler canonical case here, I expect we'll have to. >> >> Its possible we might even want to define this as a direct call. I'm not >> sure what the expectations are with regards to the type of the function >> being called and the type of the callsite. I suspect a lot of code would >> get confused if getCalledFunction returned __gpu_modff with it's unbitcast >> type. That's possibly something we should fix though. >> >> We'll want to get other folks input here, but a small patch to the >> inliner to handle this case would seem reasonable to me. >> >> Philip >> > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150105/07d3bc01/attachment.html>
hi, Pete, I understand InstCombine may simplify bitcast to nothing, but the order matters. AlwaysInliner happens in very early stage, and we never call it again. if InstCombine works, can we invoke it before Inliner? thanks, --lx On Mon, Jan 5, 2015 at 5:40 PM, Pete Cooper <peter_cooper at apple.com> wrote:> Hi lx, Philip > > I've seen an instcombine which helps with this situation. It fires when > the function types on both sides of the bitcast have the same number of > operands and compatible types. It then adds bitcasts on the arguments and > removes the one on the called function. > > I don't have IR to hand, but it would be worth passing your IR through > instcombine to see if that helps you. > > The idea of improving the inliner is also great, but you may find that > it's needed for cases other than this one if i'm right about the > instcombine. > > Thanks > Pete > > Sent from my iPhone > > On Jan 5, 2015, at 3:16 AM, Liu Xin <navy.xliu at gmail.com> wrote: > > Philip, > > I post here because I think AlwaysInliner should inline it. I want to > detect the indirect calls for Inliner, and I want to hear inputs. > > let me define indirect call first in my idea. In one single expression, > one function may be subject to bitcast more than one time. we can detect > this situation and treat it as a regular call of last function, is that > okay? > > thanks, > --lx > > > On Mon, Jan 5, 2015 at 7:32 AM, Philip Reames <listmail at philipreames.com> > wrote: > >> On 01/04/2015 12:04 AM, Liu Xin wrote: >> >>> >>> %294= callfloatbitcast (float(float, float*)* @__gpu_modff to >>> float(float, i64)*)(float%293, i64 %preg.212.addr.0) >>> >>> as you may know, some gpu backends don't support function call. we need >>> to make sure to inline all functions here. however, Inliner can not figure >>> out that this is a valid callsite in this form. actually, it is. in C >>> words, cast a function and then call should be treat as callsite, right? >>> >>> Generally, the inliner doesn't do much with indirect calls, but given >> there is no simpler canonical case here, I expect we'll have to. >> >> Its possible we might even want to define this as a direct call. I'm not >> sure what the expectations are with regards to the type of the function >> being called and the type of the callsite. I suspect a lot of code would >> get confused if getCalledFunction returned __gpu_modff with it's unbitcast >> type. That's possibly something we should fix though. >> >> We'll want to get other folks input here, but a small patch to the >> inliner to handle this case would seem reasonable to me. >> >> Philip >> > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150105/bda1a63b/attachment.html>
Apparently Analagous Threads
- [LLVMdev] should AlwaysInliner inline this case?
- [LLVMdev] should AlwaysInliner inline this case?
- [LLVMdev] should AlwaysInliner inline this case?
- [LLVMdev] Is there pass to break down <4 x float> to scalars
- [LLVMdev] Does Mips resolve hazard in pre-ra-sched or post-ra-sched?