Johannes Doerfert via llvm-dev
2018-Aug-23 15:25 UTC
[llvm-dev] [RFC] "Properly" Derive Function/Argument/Parameter Attributes
After I spend some time working with the function attribute* deduction pass** [1,3], I would like to propose a "proper" organization***. Why? Because we do not derive nearly as many attributes as we could****, while we do maintain various (separate and diffently organized) "data-flow-like analyses" to do so. What else? I propose a single optimistic data-flow (fixpoint) loop to perform the deduction (similar to [2,3] but for all propagation forms and not only call sites -> callee). Given that a pessimistic initialization allows to terminate early, we can then also vary the compile time investment. Where is the catch? It will require a substantial rewrite with (some) parts that cannot be split in small independent patches. While I believe these to be easy to review*****, I want to know if (1) there is interest in having better attribute deduction, and (2) there are volunteers to review such patches. I do appreciate any input on this proposal. Cheers, Johannes -------- * It derives function attributes but also parameter and argument attributes. ** lib/Transforms/IPO/FunctionAttrs.cpp *** This statement is intended to sound harsh. I believe it to be appropriate because it is hard to find consistency in the current implementation. Different parts perform argument deduction similarly but still different. This does lead to code duplication and missed deductions as there is no information exchange going on. **** This includes missing attributes in existing propagations e.g., dereferencability (see also [0,1]), missing forms of propagation, e.g., call sites -> callee (see [2,3]), as well as missing deductions due to the fact that there is no (global) fixpoint iteration. Additional examples to showcase some current shortcomings are attached to this mail. ***** In the sense that they will "just contain" the implementation of a fixpoint data-flow analysis. The logic to determine/eliminate attributes will be similar to the one we have now. [0] https://reviews.llvm.org/D48387 [1] https://reviews.llvm.org/D50107 [2] https://reviews.llvm.org/D4609 [3] https://reviews.llvm.org/D50125 -- Johannes Doerfert PhD Student / Researcher Compiler Design Lab (Professor Hack) / Argonne National Laboratory Saarland Informatics Campus, Germany / Lemont, IL 60439, USA Building E1.3, Room 4.31 Tel. +49 (0)681 302-57521 : doerfert at cs.uni-saarland.de / jdoerfert at anl.gov Fax. +49 (0)681 302-3065 : http://www.cdl.uni-saarland.de/people/doerfert -------------- next part -------------- ; RUN: opt -S -functionattrs -enable-nonnull-arg-prop %s | FileCheck %s ; ; This is an evolved example to stress test SCC parameter attribute propagation. ; The SCC in this test is made up of the following six function, three of which ; are internal and three externally visible: ; ; static int* internal_ret0_nw(int *n0, int *w0); ; static int* internal_ret1_rw(int *r0, int *w0); ; static int* internal_ret1_rrw(int *r0, int *r1, int *w0); ; int* external_ret2_nrw(int *n0, int *r0, int *w0); ; int* external_sink_ret2_nrw(int *n0, int *r0, int *w0); ; int* external_source_ret2_nrw(int *n0, int *r0, int *w0); ; ; The top four functions call each other while the "sink" function will not ; call anything and the "source" function will not be called in this module. ; The names of the functions define the returned parameter (X for "_retX_"), ; as well as how the parameters are (transitively) used (n = readnone, ; r = readonly, w = writeonly). ; ; Currently we get the following statistics: ; 1 functionattrs - Number of functions marked as norecurse ; 6 functionattrs - Number of functions marked as nounwind ; 1 functionattrs - Number of arguments marked nocapture ; 1 functionattrs - Number of arguments marked readnone ; 1 functionattrs - Number of arguments marked readonly ; 1 functionattrs - Number of arguments marked returned ; -- ; 11 ; ; This basically describes that we only derived attributes for the "sink" ; function (except nounwind which is properly derived). ; ; ; What we should get is something along the lines of: ; 1 functionattrs - Number of functions marked as norecurse ; 6 functionattrs - Number of functions marked argmemonly ; 6 functionattrs - Number of functions marked as nounwind ; 16 functionattrs - Number of arguments marked nocapture ; 4 functionattrs - Number of arguments marked readnone ; 6 functionattrs - Number of arguments marked writeonly ; 6 functionattrs - Number of arguments marked readonly ; 6 functionattrs - Number of arguments marked returned ; -- ; 51 ; target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" ; Function Attrs: noinline nounwind uwtable define dso_local i32* @external_ret2_nrw(i32* %n0, i32* %r0, i32* %w0) #0 { entry: %call = call i32* @internal_ret0_nw(i32* %n0, i32* %w0) %call1 = call i32* @internal_ret1_rrw(i32* %r0, i32* %r0, i32* %w0) %call2 = call i32* @external_sink_ret2_nrw(i32* %n0, i32* %r0, i32* %w0) %call3 = call i32* @internal_ret1_rw(i32* %r0, i32* %w0) ret i32* %call3 } ; Function Attrs: noinline nounwind uwtable define internal i32* @internal_ret0_nw(i32* %n0, i32* %w0) #0 { entry: %r0 = alloca i32, align 4 %r1 = alloca i32, align 4 %tobool = icmp ne i32* %n0, null br i1 %tobool, label %if.end, label %if.then if.then: ; preds = %entry br label %return if.end: ; preds = %entry store i32 3, i32* %r0, align 4 store i32 5, i32* %r1, align 4 store i32 1, i32* %w0, align 4 %call = call i32* @internal_ret1_rrw(i32* %r0, i32* %r1, i32* %w0) %call1 = call i32* @external_ret2_nrw(i32* %n0, i32* %r0, i32* %w0) %call2 = call i32* @external_ret2_nrw(i32* %n0, i32* %r1, i32* %w0) %call3 = call i32* @external_sink_ret2_nrw(i32* %n0, i32* %r0, i32* %w0) %call4 = call i32* @external_sink_ret2_nrw(i32* %n0, i32* %r1, i32* %w0) %call5 = call i32* @internal_ret0_nw(i32* %n0, i32* %w0) br label %return return: ; preds = %if.end, %if.then %retval.0 = phi i32* [ %call5, %if.end ], [ %n0, %if.then ] ret i32* %retval.0 } ; Function Attrs: noinline nounwind uwtable define internal i32* @internal_ret1_rrw(i32* %r0, i32* %r1, i32* %w0) #0 { entry: %0 = load i32, i32* %r0, align 4 %tobool = icmp ne i32 %0, 0 br i1 %tobool, label %if.end, label %if.then if.then: ; preds = %entry br label %return if.end: ; preds = %entry %call = call i32* @internal_ret1_rw(i32* %r0, i32* %w0) %1 = load i32, i32* %r0, align 4 %2 = load i32, i32* %r1, align 4 %add = add nsw i32 %1, %2 store i32 %add, i32* %w0, align 4 %call1 = call i32* @internal_ret1_rw(i32* %r1, i32* %w0) %call2 = call i32* @internal_ret0_nw(i32* %r0, i32* %w0) %call3 = call i32* @internal_ret0_nw(i32* %w0, i32* %w0) %call4 = call i32* @external_ret2_nrw(i32* %r0, i32* %r1, i32* %w0) %call5 = call i32* @external_ret2_nrw(i32* %r1, i32* %r0, i32* %w0) %call6 = call i32* @external_sink_ret2_nrw(i32* %r0, i32* %r1, i32* %w0) %call7 = call i32* @external_sink_ret2_nrw(i32* %r1, i32* %r0, i32* %w0) %call8 = call i32* @internal_ret0_nw(i32* %r1, i32* %w0) br label %return return: ; preds = %if.end, %if.then %retval.0 = phi i32* [ %call8, %if.end ], [ %r1, %if.then ] ret i32* %retval.0 } ; Function Attrs: noinline nounwind uwtable define dso_local i32* @external_sink_ret2_nrw(i32* %n0, i32* %r0, i32* %w0) #0 { entry: %tobool = icmp ne i32* %n0, null br i1 %tobool, label %if.end, label %if.then if.then: ; preds = %entry br label %return if.end: ; preds = %entry %0 = load i32, i32* %r0, align 4 store i32 %0, i32* %w0, align 4 br label %return return: ; preds = %if.end, %if.then ret i32* %w0 } ; Function Attrs: noinline nounwind uwtable define internal i32* @internal_ret1_rw(i32* %r0, i32* %w0) #0 { entry: %0 = load i32, i32* %r0, align 4 %tobool = icmp ne i32 %0, 0 br i1 %tobool, label %if.end, label %if.then if.then: ; preds = %entry br label %return if.end: ; preds = %entry %call = call i32* @internal_ret1_rrw(i32* %r0, i32* %r0, i32* %w0) %1 = load i32, i32* %r0, align 4 store i32 %1, i32* %w0, align 4 %call1 = call i32* @internal_ret0_nw(i32* %r0, i32* %w0) %call2 = call i32* @internal_ret0_nw(i32* %w0, i32* %w0) %call3 = call i32* @external_sink_ret2_nrw(i32* %r0, i32* %r0, i32* %w0) %call4 = call i32* @external_ret2_nrw(i32* %r0, i32* %r0, i32* %w0) br label %return return: ; preds = %if.end, %if.then %retval.0 = phi i32* [ %call4, %if.end ], [ %w0, %if.then ] ret i32* %retval.0 } ; Function Attrs: noinline nounwind uwtable define dso_local i32* @external_source_ret2_nrw(i32* %n0, i32* %r0, i32* %w0) #0 { entry: %call = call i32* @external_sink_ret2_nrw(i32* %n0, i32* %r0, i32* %w0) %call1 = call i32* @external_ret2_nrw(i32* %n0, i32* %r0, i32* %w0) ret i32* %call1 } -------------- next part -------------- ; RUN: opt -S -functionattrs -enable-nonnull-arg-prop %s | FileCheck %s ; In the following SCC, one function (scc_bottom_vs_top_fn_top) cannot be ; marked nonnull. We should still mark the others as they will always return ; nonnull pointers. However, this only works if we initialize our abstract ; state with bottom (not top), thus optimistically assume nonnull until proven ; otherwise. declare nonnull i8* @ret_nonnull() ; CHECK: define i8* @scc_bottom_vs_top_fn_top define i8* @scc_bottom_vs_top_fn_top() { call i8* @scc_bottom_vs_top_fn_scc0() call i8* @scc_bottom_vs_top_fn_scc1() ret i8* null } ; CHECK: define nonnull i8* @scc_bottom_vs_top_fn_scc0 define i8* @scc_bottom_vs_top_fn_scc0() { ; The null returing function is called here but the result is unused. call i8* @scc_bottom_vs_top_fn_top() br i1 undef, label %bb0, label %bb1 bb0: %ret0 = call i8* @ret_nonnull() ret i8* %ret0 bb1: %ret1 = call i8* @scc_bottom_vs_top_fn_scc1() ret i8* %ret1 } ; CHECK: define nonnull i8* @scc_bottom_vs_top_fn_scc1 define i8* @scc_bottom_vs_top_fn_scc1() { ; The null returing function is called here but the result is unused. call i8* @scc_bottom_vs_top_fn_top() br i1 undef, label %bb0, label %bb1 bb0: %ret0 = call i8* @ret_nonnull() ret i8* %ret0 bb1: %ret1 = call i8* @scc_bottom_vs_top_fn_scc0() ret i8* %ret1 } -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 228 bytes Desc: Digital signature URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180823/b4ab1a06/attachment.sig>
Hal Finkel via llvm-dev
2018-Sep-01 23:37 UTC
[llvm-dev] [RFC] "Properly" Derive Function/Argument/Parameter Attributes
Hi, Johannes, I'm certainly in favor of rewriting the attribute deduction so that it's more consistent and more powerful. There's a lot more that we could do. I'll help to review the patches. On 08/23/2018 10:25 AM, Johannes Doerfert via llvm-dev wrote:> After I spend some time working with the function attribute* deduction > pass** [1,3], I would like to propose a "proper" organization***. > > > > Why? > > Because we do not derive nearly as many attributes as we could****, > while we do maintain various (separate and diffently organized) > "data-flow-like analyses" to do so. > > > > What else? > > I propose a single optimistic data-flow (fixpoint) loop to perform the > deduction (similar to [2,3] but for all propagation forms and not only > call sites -> callee).Right now, as I recall, we apply an optimistic fixed-point optimization to each SCC, moving up the call graph from leaves to root. When you say that you want a more-general inference, besides just call sites -> callees (and the existing callees -> callers), can you please elaborate? Doing both of these at the same time as part of a optimistic fixed-point optimization seems to imply that the iteration will involve the entire call graph at once (or, at least, each connected component at once). Is that what you have in mind?> Given that a pessimistic initialization allows > to terminate early, we can then also vary the compile time investment.This is certainly a true statement, but I don't understand what you're proposing. Are you saying that we should have both an optimistic and pessimistic mode? Thanks again, Hal> > > > Where is the catch? > > It will require a substantial rewrite with (some) parts that cannot be > split in small independent patches. While I believe these to be easy > to review*****, I want to know if > (1) there is interest in having better attribute deduction, and > (2) there are volunteers to review such patches. > > > > I do appreciate any input on this proposal. > > Cheers, > Johannes > > -------- > > * It derives function attributes but also parameter and argument > attributes. > > ** lib/Transforms/IPO/FunctionAttrs.cpp > > *** This statement is intended to sound harsh. I believe it to be > appropriate because it is hard to find consistency in the current > implementation. Different parts perform argument deduction similarly > but still different. This does lead to code duplication and missed > deductions as there is no information exchange going on. > > **** This includes missing attributes in existing propagations > e.g., dereferencability (see also [0,1]), missing forms of > propagation, e.g., call sites -> callee (see [2,3]), as well as > missing deductions due to the fact that there is no (global) > fixpoint iteration. Additional examples to showcase some current > shortcomings are attached to this mail. > > ***** In the sense that they will "just contain" the implementation of a > fixpoint data-flow analysis. The logic to determine/eliminate > attributes will be similar to the one we have now. > > > [0] https://reviews.llvm.org/D48387 > [1] https://reviews.llvm.org/D50107 > [2] https://reviews.llvm.org/D4609 > [3] https://reviews.llvm.org/D50125 > > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-- Hal Finkel Lead, Compiler Technology and Programming Languages Leadership Computing Facility Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180901/b1dbf713/attachment.html>
Johannes Doerfert via llvm-dev
2018-Sep-04 01:27 UTC
[llvm-dev] [RFC] "Properly" Derive Function/Argument/Parameter Attributes
Hi Hal, thanks for your interest and questions. On 09/01, Hal Finkel via llvm-dev wrote:> Hi, Johannes, > > I'm certainly in favor of rewriting the attribute deduction so that it's > more consistent and more powerful. There's a lot more that we could do. > I'll help to review the patches.Even if we only do it to clean up the code, it is probably worth it. Deriving new/different kinds of attributes, e.g., "dereferencable" and "align", is another good reason. Finally, I detailed how we currently derive attributes (below) to highlight the traversal duplications and pessimistic choices we currently make.> On 08/23/2018 10:25 AM, Johannes Doerfert via llvm-dev wrote: > > After I spend some time working with the function attribute* deduction > > pass** [1,3], I would like to propose a "proper" organization***. > > > > > > > > Why? > > > > Because we do not derive nearly as many attributes as we could****, > > while we do maintain various (separate and diffently organized) > > "data-flow-like analyses" to do so. > > > > > > > > What else? > > > > I propose a single optimistic data-flow (fixpoint) loop to perform the > > deduction (similar to [2,3] but for all propagation forms and not only > > call sites -> callee). > > Right now, as I recall, we apply an optimistic fixed-point optimization > to each SCC, moving up the call graph from leaves to root.Before I describe what I plan to below, I want to elaborate on the current situation. Let me try to summarize it in a bit more detail: We have two inference passes post order (po) and reverse post order (rpo) of the call graph (CG). The second is not very interesting as it is neither actual rpo nor do we use it for anything but to detect non-recursive functions based on non-recursive callers (which seems to be a special situation but I was unable to find any details on that). Now the po traversal does many things, separately, in the following order: - addArgumentReturnedAttrs() -- "returned" parameter attribute. Traversal of all blocks to find an argument that (1) has the same type as the return type, and (2) is returned by all "return" instructions. This is not optimistic and return values of other functions in the SCC will prevent the deduction. - addReadAttrs() -- "readonly"/"readnone" function attributes. Traverses all instructions in the SCC and optimistically determines "readonly"/"readnone" function attributes. - addArgumentAttrs() -- "nonnull"/"nocapture"/"readonly"/"readnone" "writeonly" parameter attributes Traverse the uses of arguments through the SCC to determine "nocapture"/"readonly"/"readnone"/"writeonly" attributes. Additionally perform the callee -> call site deduction for "nonnull" attributes if the call site is for sure executed. Note that "returned" pointers (in the SCC) are assumed captured and therefore neither "readonly", "readnone" nor "writeonly" even though they do not outlive the function/SCC. - addNoAliasAttrs() -- "noalias" function return value Check for "malloc-like" SCCs by verifying that if all return values are "NULL", "undef" or "noalias" return values of calls. - addNonNullAttrs() -- "nonnull" function return value Check for "nonnull" return values by verifying all function return values are non-null. However, this is not optimistic. Returning the return value of another function in the SCC is considered potentially null. - removeConvergentAttrs() -- remove "convergent" function attribute Traverse all instructions in the SCC and remove the "convergent" attribute if all contained call sites are "non-convergent". - addNoRecurseAttrs() -- "norecurse" function attribute Traverse all instructions in a singleton SCC and mark it as "norecurse" if there is no self or potentially recursive call.> When you say that you want a more-general inference, besides just call > sites -> callees (and the existing callees -> callers), can you please > elaborate? Doing both of these at the same time as part of a > optimistic fixed-point optimization seems to imply that the iteration > will involve the entire call graph at once (or, at least, each > connected component at once). Is that what you have in mind?There is certainly a large design space that allows variations. If we properly traverse the SCCs in post order and optimistically derive all attributes, it should already improve on the current state. However, the callees -> call sites deduction requires to look into the callers of the SCC as well, assuming all are known. For the best result we probably need to combine three deduction methods: - If something is known for all call sites it is known for the callee. - If something is know for the callee it is known for all call sites. - If something can be derived from the code, e.g., guaranteed executed calls, it is known for the function. One optimistic fixpoint iteration for the whole call graph is beneficial if we can use information derived by one method for another one. This is the case if we have domain knowledge, e.g., "read-only" annotations from the user or the front-end on an interior node in the call graph. For descriptive properties, like "non-null", this can also work very well. If we expect additional attributes of the latter kind, or more annotations through the front-ends/users, we should consider this solution.> > Given that a pessimistic initialization allows to terminate early, > > we can then also vary the compile time investment. > > This is certainly a true statement, but I don't understand what you're > proposing. Are you saying that we should have both an optimistic and > pessimistic mode?I was just saying that we can easily use the same framework for both, optimistic and pessimistic deduction: - If we want a fast result, or be able to stop the fixpoint iteration early, we start with a pessimistic state, - If we want the best possible result, we start with an optimistic state. Does this make sense? Cheers, Johannes> Thanks again, Hal > > > > > > > > > Where is the catch? > > > > It will require a substantial rewrite with (some) parts that > > cannot be split in small independent patches. While I believe > > these to be easy to review*****, I want to know if (1) there is > > interest in having better attribute deduction, and (2) there are > > volunteers to review such patches. > > > > > > > > I do appreciate any input on this proposal. > > > > Cheers, Johannes > > > > -------- > > > > * It derives function attributes but also parameter and argument > > attributes. > > > > ** lib/Transforms/IPO/FunctionAttrs.cpp > > > > *** This statement is intended to sound harsh. I believe it to be > > appropriate because it is hard to find consistency in the current > > implementation. Different parts perform argument deduction similarly > > but still different. This does lead to code duplication and missed > > deductions as there is no information exchange going on. > > > > **** This includes missing attributes in existing propagations > > e.g., dereferencability (see also [0,1]), missing forms of > > propagation, e.g., call sites -> callee (see [2,3]), as well as > > missing deductions due to the fact that there is no (global) > > fixpoint iteration. Additional examples to showcase some current > > shortcomings are attached to this mail. > > > > ***** In the sense that they will "just contain" the implementation > > of a fixpoint data-flow analysis. The logic to determine/eliminate > > attributes will be similar to the one we have now. > > > > > > [0] https://reviews.llvm.org/D48387 [1] > > https://reviews.llvm.org/D50107 [2] https://reviews.llvm.org/D4609 > > [3] https://reviews.llvm.org/D50125 > > > > > > > > _______________________________________________ LLVM Developers > > mailing list llvm-dev at lists.llvm.org > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > -- Hal Finkel Lead, Compiler Technology and Programming Languages > Leadership Computing Facility Argonne National Laboratory >> _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-- Johannes Doerfert PhD Student / Researcher Compiler Design Lab (Professor Hack) / Argonne National Laboratory Saarland Informatics Campus, Germany / Lemont, IL 60439, USA Building E1.3, Room 4.31 Tel. +49 (0)681 302-57521 : doerfert at cs.uni-saarland.de / jdoerfert at anl.gov Fax. +49 (0)681 302-3065 : http://www.cdl.uni-saarland.de/people/doerfert -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 228 bytes Desc: Digital signature URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180904/e8650b22/attachment.sig>