Philip Reames
2015-Mar-06 00:39 UTC
[LLVMdev] Optimizing out redundant alloca involving byval params
Reid is right that this would go in memcpyopt, but there's an active discussion on the commit list which will solve this through a different mechanism. There's an active desire to avoid teaching GVN and related pieces (of which memcpyopt is one) about first class aggregates. We don't have enough active users of the feature to justify maintaining the complexity.

If you haven't already seen it, this background may help:
http://llvm.org/docs/Frontend/PerformanceTips.html#avoid-loads-and-stores-of-large-aggregate-type

The current proposal is to convert such aggregate loads and stores into their component pieces. If that happens, your example should come "for free", provided that the same example works when you break down the FCA (first class aggregate) into its component pieces. If it doesn't, please say so.

Philip

On 03/05/2015 04:21 PM, Reid Kleckner wrote:
> I think lib/Transforms/Scalar/MemCpyOptimizer.cpp might be the right
> place for this, considering that most frontends will use memcpy for
> that copy anyway. It already has some logic for byval args.
>
> On Thu, Mar 5, 2015 at 3:51 PM, Mircea Trofin <mtrofin at google.com> wrote:
>
> > Hello all,
> >
> > I'm trying to find the pass that would convert from:
> >
> > define void @main(%struct* byval %ptr) {
> >   %val = load %struct* %ptr
> >   %val.ptr = alloca %struct
> >   store %struct %val, %struct* %val.ptr
> >   call void @extern_func(%struct* byval %val.ptr)
> >   ret void
> > }
> >
> > to this:
> >
> > define void @main(%struct* byval %ptr) {
> >   call void @extern_func(%struct* byval %ptr)
> >   ret void
> > }
> >
> > First, am I missing something - would this be a correct optimization?
> >
> > Thank you,
> > Mircea.
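Reid's remark that most frontends use memcpy for this copy can be pictured with the variant below. This is only a sketch in the older typed-pointer IR syntax used in this thread. The definition of %struct as two i32 fields (8 bytes, alignment 4), the names %dst and %src, and the declarations are assumptions added to make the example self-contained. The thread does not confirm whether memcpyopt actually forwards %ptr to the call in this exact case.

```llvm
; Assumption: %struct is two i32 fields, so 8 bytes with 4-byte alignment.
%struct = type { i32, i32 }

; Assumption: the callee is only declared; its body is not part of the thread.
declare void @extern_func(%struct* byval)

; memcpy intrinsic as declared in the LLVM releases that use this IR syntax:
; (dest, src, length, alignment, isvolatile).
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture readonly, i64, i32, i1)

define void @main(%struct* byval %ptr) {
  ; Local temporary that receives a copy of the incoming byval argument.
  %val.ptr = alloca %struct
  %dst = bitcast %struct* %val.ptr to i8*
  %src = bitcast %struct* %ptr to i8*
  ; Copy the whole struct at once instead of an aggregate load/store pair.
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 8, i32 4, i1 false)
  ; The byval call copies its argument again, which is why the temporary
  ; looks redundant as long as nothing writes to %ptr in between.
  call void @extern_func(%struct* byval %val.ptr)
  ret void
}
```

Newer LLVM releases spell pointer types and the memcpy intrinsic differently, so the text would need adjusting before feeding it to a current opt.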
Mircea Trofin
2015-Mar-06 02:16 UTC
[LLVMdev] Optimizing out redundant alloca involving byval params
Thanks!

Philip, do you mean I should transform the original IR to something like this? (This is what -expand-struct-regs produces when applied to my original input.)

define void @main(%struct* byval %ptr) {
  %val.index = getelementptr %struct* %ptr, i32 0, i32 0
  %val.field = load i32* %val.index
  %val.index1 = getelementptr %struct* %ptr, i32 0, i32 1
  %val.field2 = load i32* %val.index1
  %val.ptr = alloca %struct
  %val.ptr.index = getelementptr %struct* %val.ptr, i32 0, i32 0
  store i32 %val.field, i32* %val.ptr.index
  %val.ptr.index4 = getelementptr %struct* %val.ptr, i32 0, i32 1
  store i32 %val.field2, i32* %val.ptr.index4
  call void @extern_func(%struct* byval %val.ptr)
  ret void
}

If so, would you mind pointing me to the phase that would reduce this? (I'm assuming that's what you meant by "for free": there's an existing phase I could use.)

Thank you.
Mircea.
Philip Reames
2015-Mar-06 20:01 UTC
[LLVMdev] Optimizing out redundant alloca involving byval params
On 03/05/2015 06:16 PM, Mircea Trofin wrote:
> Thanks!
>
> Philip, do you mean I should transform the original IR to something
> like this?

Yes.

> (...which is what -expand-struct-regs can do, when applied to my
> original input)

Sorry, what? This doesn't appear to be a pass in ToT. Are you using an older version of LLVM? If so, none of my comments will apply.

> define void @main(%struct* byval %ptr) {
>   %val.index = getelementptr %struct* %ptr, i32 0, i32 0
>   %val.field = load i32* %val.index
>   %val.index1 = getelementptr %struct* %ptr, i32 0, i32 1
>   %val.field2 = load i32* %val.index1
>   %val.ptr = alloca %struct
>   %val.ptr.index = getelementptr %struct* %val.ptr, i32 0, i32 0
>   store i32 %val.field, i32* %val.ptr.index
>   %val.ptr.index4 = getelementptr %struct* %val.ptr, i32 0, i32 1
>   store i32 %val.field2, i32* %val.ptr.index4
>   call void @extern_func(%struct* byval %val.ptr)
>   ret void
> }
>
> If so, would you mind pointing me to the phase that would reduce this?
> (I'm assuming that's what you meant by "for free" - there's an
> existing phase I could use)

I would expect GVN to get this. If you can run this through a full -O3 pass order and get the right result, isolating the pass in question should be easy.
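To try Philip's suggestion, the expanded function needs a type definition and a callee declaration before opt will accept it. The file below is a sketch of such a reproducer, again in the older IR syntax used in this thread. The two-field layout of %struct, the bare declaration of @extern_func, and the file name repro.ll are assumptions. Nothing here claims that the alloca is actually removed; the thread leaves that open.

```llvm
; Assumption: two i32 fields, matching the field indices 0 and 1 used below.
%struct = type { i32, i32 }

; Assumption: external callee with unknown body.
declare void @extern_func(%struct* byval)

define void @main(%struct* byval %ptr) {
  ; Load each field of the incoming byval struct.
  %val.index = getelementptr %struct* %ptr, i32 0, i32 0
  %val.field = load i32* %val.index
  %val.index1 = getelementptr %struct* %ptr, i32 0, i32 1
  %val.field2 = load i32* %val.index1
  ; Rebuild the struct field by field in a fresh stack slot.
  %val.ptr = alloca %struct
  %val.ptr.index = getelementptr %struct* %val.ptr, i32 0, i32 0
  store i32 %val.field, i32* %val.ptr.index
  %val.ptr.index4 = getelementptr %struct* %val.ptr, i32 0, i32 1
  store i32 %val.field2, i32* %val.ptr.index4
  call void @extern_func(%struct* byval %val.ptr)
  ret void
}

; Commands to try (repro.ll is a hypothetical file name):
;   opt -S -O3 repro.ll -o after.ll
;       Writes the optimized IR as text; check whether the alloca survives.
;   opt -O3 -print-after-all -disable-output repro.ll
;       Dumps the IR after every pass to stderr, so the first dump in which
;       the alloca is gone (if any) identifies the responsible pass.
```

If the alloca is still present in after.ll, that is the "it doesn't work" case Philip asked to hear about.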