----- Original Message -----
> From: "Bjorn De Sutter" <bjorn.desutter at elis.ugent.be>
> To: llvmdev at cs.uiuc.edu
> Sent: Tuesday, November 27, 2012 6:49:39 AM
> Subject: Re: [LLVMdev] loop pragmas
>
> I am thinking about another use of annotations that fits in a longer
> term vision, which centers around feeding compilers with information
> from higher-level tools such as precompilers.
>
> Deciding how to map a portable piece of software to a heterogeneous
> multicore processor and get the best performance for a range of
> widely varying architectures, requires much higher level code
> analysis than what is possible in "standard" compilers on their
> relatively low-level IRs. To have portable performance and high
> programmer productivity, an application will need to be written in a
> higher-level language, like Julia, MATLAB, ... that can deliver much
> of the needed information to the compilers, for example by using
> parallel programming patterns (see Berkeley Parlab), and that allows
> a compiler to choose among different implementations of algorithms
> and data structures (see Petabricks). Building compiler front-ends,
> middle-ends and back-ends from scratch that can use all information
> available in such programs and that produce high-quality code for a
> range of architectures is undoable for most if not all research
> labs.
>
> And it should also not be necessary: many excellent lower-level
> language compilers already exist, some of which support a wide range
> of architectures, such as LLVM (OoO CPUs, VLIWs, GPUs, CGRAs in my
> own backend and in Samsung's proprietary SRP backend, etc.). So for
> research purposes and hopefully later also real-world development if
> the research results are good, ideally we would have to only develop
> the necessary precompiler tools that take in very high-level code
> and that produce tuned lower-level (C or bitcode or LLVM IR or ...)
> after high-level analysis and target-dependent optimisations and
> selections, after which LLVM then takes care of the further
> low-level optimizations and specific code generation for the
> different targets in the heterogeneous multicore.
>
> To facilitate research in this direction and use LLVM in such a tool
> flow, it is absolutely necessary that both the precompiler and the
> researcher doing manual experiments can steer the low-level LLVM
> compiler with code annotations such as loop pragmas or attributes,
> simply because once the code is in a lower-level form such as C or
> bitcode or LLVM IR, there is not enough information available in the
> code itself to select the best low-level transformations.
>
> It is interesting to note that in such an approach, the precompiler
> probably would be trained with machine learning. If this is the
> case, it might also be able to learn which annotations actually
> influence the compiler and which do not, for example because they
> are destroyed before some optimization pass is executed. So the
> precompiler can learn to generate code tuned for specific targets
> taking into account all limitations of the compiler that will do the
> actual code generation, including its incomplete or stupid or
> whatever support for pragmas and other such things...

+1

This still leaves the question of exactly how to attach metadata to loops, etc.

 -Hal

>
> Best,
>
> Bjorn
>
> On 21 Nov 2012, at 18:56, Krzysztof Parzyszek
> <kparzysz at codeaurora.org> wrote:
>
> > On 11/21/2012 11:32 AM, Tobias Grosser wrote:
> >> On 11/21/2012 03:45 PM, Krzysztof Parzyszek wrote:
> >>>
> >>> I'm thinking of this in terms of parallelization directives. The
> >>> optimizations that rely on such annotations would need to be done
> >>> as early as possible, before any optimization that could
> >>> invalidate them. If the annotation can become false, you are
> >>> right---it's probably not a good idea to have it as the medium.
> >>
> >> If we use metadata to model annotations, we need to ensure that it
> >> is either correct or in case a transformation can not guarantee the
> >> correctness of the meta data, that it is removed.
> >
> > Yes, that is not hard to accomplish.
> >
> >>> Other types of annotations that are "harmless" are probably good
> >>> to have, for example "unroll-by" (assuming that this is a
> >>> suggestion to the compiler, not an order).
> >>
> >> To my knowledge, we are avoiding to allow the user to 'tune' the
> >> compiler. Manual tuning may be good for a certain piece of
> >> hardware, but will have negative effects on other platforms.
> >
> > A lot of ISV code is meant to run on a particular platform, or on a
> > small set of target platforms. Such code is often hand-tuned and
> > the tuning directives will be different for different targets. I
> > see no reason why we shouldn't allow that. As a matter of fact,
> > not allowing it will make us less competitive.
> >
> >> Instead of providing facilities to tune the hardware, we should
> >> understand why LLVM does not choose the right unrolling factor.
> >
> > Because in general it's impossible. User's hints are always
> > welcome.
> >
> > -Krzysztof
> >
> > --
> > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> > hosted by The Linux Foundation
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

--
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory

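The two points argued in the quoted exchange (metadata must be either correct or removed, and "unroll-by" is a suggestion rather than an order) can be illustrated with a small toy model. This is only a sketch under stated assumptions: the `Loop` class, `run_pass`, `choose_unroll_factor`, and the hint names are all hypothetical and do not correspond to any LLVM API.

```python
from dataclasses import dataclass, field


@dataclass
class Loop:
    name: str
    trip_count: int
    # Hypothetical hint store, e.g. {"unroll-by": 4, "parallel": True}.
    metadata: dict = field(default_factory=dict)


def run_pass(loop, transform, preserves=()):
    """Run a transformation, then drop every hint it did not promise to keep."""
    transform(loop)
    loop.metadata = {k: v for k, v in loop.metadata.items() if k in preserves}
    return loop


def choose_unroll_factor(loop, default=2):
    """Treat "unroll-by" as a suggestion: honor it only if it divides the trip count."""
    hint = loop.metadata.get("unroll-by")
    if isinstance(hint, int) and hint > 0 and loop.trip_count % hint == 0:
        return hint
    return default  # fall back to the compiler's own (here: fixed) choice


def peel_first_iteration(loop):
    # Toy transformation: peeling changes the iteration count.
    loop.trip_count -= 1


loop = Loop("L0", trip_count=16, metadata={"unroll-by": 4, "parallel": True})
factor_before = choose_unroll_factor(loop)

# The peeling pass only vouches for "parallel", so "unroll-by" is removed.
run_pass(loop, peel_first_iteration, preserves=("parallel",))
factor_after = choose_unroll_factor(loop)
```

The design choice here is that dropping is the default: a pass has to opt in to keeping a hint, so a pass that knows nothing about annotations can never leave a stale one behind.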
On 27 Nov 2012, at 15:03, Hal Finkel <hfinkel at anl.gov> wrote:

> ----- Original Message -----
>> From: "Bjorn De Sutter" <bjorn.desutter at elis.ugent.be>
>> To: llvmdev at cs.uiuc.edu
>> Sent: Tuesday, November 27, 2012 6:49:39 AM
>> Subject: Re: [LLVMdev] loop pragmas
>>
>> I am thinking about another use of annotations that fits in a longer
>> term vision, which centers around feeding compilers with information
>> from higher-level tools such as precompilers.
>>
>> ...
>>
>> To facilitate research in this direction and use LLVM in such a tool
>> flow, it is absolutely necessary that both the precompiler and the
>> researcher doing manual experiments can steer the low-level LLVM
>> compiler with code annotations such as loop pragmas or attributes,
>> simply because once the code is in a lower-level form such as C or
>> bitcode or LLVM IR, there is not enough information available in the
>> code itself to select the best low-level transformations.
>>
>> ...
>
> +1
>
> This still leaves the question of exactly how to attach metadata to loops, etc.
> -Hal
>

What about the following:

1) During parsing, pragmas, attributes, and similar constructs are
collected and stored in some kind of annotation container. Every
occurrence of a pragma, attribute, etc. ends up as a pseudo-instruction
in the IR that references one annotation in the container. For each
annotation, generic information is stored: its type, potential
parameters, line number information, etc.

2) Additional passes can extend the information tracked for each
annotation. For example, for loop pragmas, a small pass could compute
the nesting depth of each pragma. For loop parameters related to data
dependences, the loop annotation can be converted into per-instruction
metadata, etc. Additional pseudo-instructions can also be inserted.
Similar passes can check which annotations are still meaningful, or
aggregate information from multiple annotations. This might be useful,
for example, to omit loop annotations on loops that were unrolled
completely, or to aggregate annotations when loops are fused (e.g. by
Polly).

3) Each and every code transformation pass is responsible for either
maintaining the annotations or invalidating them. The default behavior
would be to invalidate all annotations. It is up to the developers of
passes that use the annotations to make sure that the annotations
survive until their pass runs. A second step would be to invalidate
only the annotations referenced by pseudo-instructions that are
involved in the transformations. This can also be done in
post-processing steps of transformations: for example, if some
pseudo-instruction is not at the loop nest level indicated by the
annotation it references, the data is invalid. Or, if more than one
pseudo-instruction refers to an annotation after some transformation,
this points to duplication, which might invalidate that annotation.

Bjorn

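To make the three steps of this proposal concrete, here is a minimal toy model of an annotation container, the pseudo-instructions that reference it, and a post-processing check. Everything in it is an assumption made for illustration: the class and function names are hypothetical, the IR is reduced to a flat list of markers, and treating an annotation with no remaining marker as invalid is a choice of this sketch, not something the proposal states.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Annotation:
    kind: str                         # e.g. "unroll", "ivdep"
    params: dict                      # pragma parameters
    line: int                         # source line of the pragma
    nest_depth: Optional[int] = None  # filled in by a later pass (step 2)
    valid: bool = True


@dataclass
class PseudoInst:
    annotation_id: int  # index into the container
    depth: int          # loop nesting depth where the marker currently sits


class AnnotationContainer:
    def __init__(self):
        self.annotations = []

    def add(self, kind, line, **params):
        """Step 1: record a pragma and return the id its marker will carry."""
        self.annotations.append(Annotation(kind, params, line))
        return len(self.annotations) - 1

    def invalidate_all(self):
        """Step 3 default: a pass that knows nothing drops everything."""
        for ann in self.annotations:
            ann.valid = False


def record_depths(container, ir):
    """Step 2: a small pass that stores each pragma's nesting depth."""
    for inst in ir:
        container.annotations[inst.annotation_id].nest_depth = inst.depth


def validate(container, ir):
    """Step 3 refinement: post-processing check after a transformation."""
    refs = Counter(inst.annotation_id for inst in ir)
    for idx, ann in enumerate(container.annotations):
        if refs[idx] != 1:  # marker was duplicated (or lost)
            ann.valid = False
    for inst in ir:
        ann = container.annotations[inst.annotation_id]
        if ann.nest_depth is not None and inst.depth != ann.nest_depth:
            ann.valid = False  # marker moved to another nest level


container = AnnotationContainer()
a = container.add("unroll", line=10, factor=4)
b = container.add("ivdep", line=20)
c = container.add("parallel", line=30)
ir = [PseudoInst(a, 1), PseudoInst(b, 2), PseudoInst(c, 1)]
record_depths(container, ir)

# After some transformation: a's marker was duplicated, b's was hoisted.
ir_after = [PseudoInst(a, 1), PseudoInst(a, 1), PseudoInst(b, 1), PseudoInst(c, 1)]
validate(container, ir_after)
states = [ann.valid for ann in container.annotations]
```

Only the untouched annotation `c` remains valid afterwards; the duplicated and the moved one are flagged without the transformation itself having to know about annotations.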
On 11/27/2012 8:03 AM, Hal Finkel wrote:
>
> This still leaves the question of exactly how to attach metadata to loops, etc.

In one implementation, the loops had a fixed structure: guard branch,
preheader, header and loop body, optional epilog. The structure was
clearly identified by various means (flags, bits, etc.). No
optimization in the optimizer (roughly a counterpart of the LLVM's
bitcode optimizer) was allowed to alter it. For example, CFG
simplification could change branches all it wanted, *except* the
branches that were a part of the loop structure. A "continue" inside
of a loop would not create a nested loop, etc. This worked like a dream
(hence my limited appreciation for SimplifyCFG...).

In that implementation, the metadata was placed in the loop header, or
the loop preheader, depending on what information it was. Since any
code that would try to move things around in a loop would have to know
what it's doing (i.e. would have to be aware of the loop structure and
what the rules were), there was no risk that the metadata would become
accidentally separated from the loop.

I'd love to see something like that in LLVM, help make it happen, etc,
etc. Of course, other ideas are welcome as well. :)

-Krzysztof

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

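A rough sketch of the idea described above, with flagged loop-structure branches that CFG simplification must not touch and metadata stored on the header and preheader. This is a toy model, not the implementation referred to in the message: the class names, the `loop_structural` flag, the block names, and the metadata keys are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Branch:
    src: str
    dst: str
    loop_structural: bool = False  # the "flag" marking loop-skeleton edges


@dataclass
class StructuredLoop:
    guard: str
    preheader: str
    header: str
    body: list
    epilog: Optional[str] = None
    header_md: dict = field(default_factory=dict)     # hints about the loop itself
    preheader_md: dict = field(default_factory=dict)  # hints about loop entry


def simplify_cfg(branches, forwarding):
    """Thread branches through empty forwarding blocks, but never touch
    branches that belong to the loop structure.
    Assumes `forwarding` (empty block -> its only successor) has no cycles."""
    for br in branches:
        if br.loop_structural:
            continue
        while br.dst in forwarding:
            br.dst = forwarding[br.dst]
    return branches


loop = StructuredLoop(
    guard="guard", preheader="preheader", header="header", body=["body", "latch"],
    header_md={"unroll-by": 4}, preheader_md={"trip-count-hint": 100},
)
branches = [
    Branch("guard", "preheader", loop_structural=True),
    Branch("preheader", "header", loop_structural=True),
    Branch("latch", "header", loop_structural=True),  # the backedge
    Branch("body", "cont"),                           # ordinary branch
]
# Both "preheader" and "cont" are empty blocks that just jump onward.
forwarding = {"preheader": "header", "cont": "latch"}
simplify_cfg(branches, forwarding)
```

An unconstrained simplifier would thread `guard` straight to `header` and make the empty preheader dead, taking `preheader_md` with it. With the flag, only the ordinary `body` branch is rewritten.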
Krzysztof Parzyszek wrote:
> On 11/27/2012 8:03 AM, Hal Finkel wrote:
> >
> > This still leaves the question of exactly how to attach metadata to loops, etc.
>
> In one implementation, the loops had a fixed structure: guard
> branch, preheader, header and loop body, optional epilog. The
> structure was clearly identified by various means (flags, bits,
> etc.). No optimization in the optimizer (roughly a counterpart of
> the LLVM's bitcode optimizer) was allowed to alter it. For example,
> CFG simplification could change branches all it wanted, *except* the
> branches that were a part of the loop structure. A "continue"
> inside of a loop would not create a nested loop, etc. This worked
> like a dream (hence my limited appreciation for SimplifyCFG...).
>
> In that implementation, the metadata was placed in the loop header,
> or the loop preheader, depending on what information it was. Since
> any code that would try to move things around in a loop would have
> to know what it's doing (i.e. would have to be aware of the loop
> structure and what the rules were), there was no risk that the
> metadata would become accidentally separated from the loop.
>
> I'd love to see something like that in LLVM, help make it happen,

+1

Sebastian

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

----- Original Message -----
> From: "Krzysztof Parzyszek" <kparzysz at codeaurora.org>
> To: llvmdev at cs.uiuc.edu
> Sent: Tuesday, November 27, 2012 10:58:05 AM
> Subject: Re: [LLVMdev] loop pragmas
>
> On 11/27/2012 8:03 AM, Hal Finkel wrote:
> >
> > This still leaves the question of exactly how to attach metadata to
> > loops, etc.
>
> In one implementation, the loops had a fixed structure: guard branch,
> preheader, header and loop body, optional epilog. The structure was
> clearly identified by various means (flags, bits, etc.). No
> optimization in the optimizer (roughly a counterpart of the LLVM's
> bitcode optimizer) was allowed to alter it. For example, CFG
> simplification could change branches all it wanted, *except* the
> branches that were a part of the loop structure. A "continue" inside
> of a loop would not create a nested loop, etc. This worked like a
> dream (hence my limited appreciation for SimplifyCFG...).
>
> In that implementation, the metadata was placed in the loop header,
> or the loop preheader, depending on what information it was. Since
> any code that would try to move things around in a loop would have to
> know what it's doing (i.e. would have to be aware of the loop
> structure and what the rules were), there was no risk that the
> metadata would become accidentally separated from the loop.
>
> I'd love to see something like that in LLVM, help make it happen,
> etc, etc. Of course, other ideas are welcome as well. :)

Can you please write up a description of exactly what you have in mind?
What information would go in the header or the preheader (or maybe be
attached to the backedges, etc.)? Would this be metadata or intrinsics
(or both)? Would basic-block-level metadata be a better fit? (A patch
for this was proposed some months ago by Ralf Karrenberg; I just
figured I'd mention it in case it is useful here.)

Obviously we don't want to limit the optimization space available in
order to preserve this metadata, and so we need to work out what
happens in the case of loop fusion, splitting, etc.

 -Hal

>
> -Krzysztof

--
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory
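One possible way to frame the open question about fusion and splitting is a per-hint policy table that says what each transformation may do with each kind of metadata, with "drop" as the default for anything unknown. The sketch below is purely illustrative: the `POLICY` table, the hint names, and the chosen rules are assumptions, not a description of how LLVM behaves. The "parallel" rule reflects that fusing two parallel loops can introduce a dependence between iterations of the fused loop, whereas splitting a dependence-free loop cannot.

```python
# Hypothetical policy table: what each transformation may do with each hint.
POLICY = {
    # Tuning hint: a wrong value costs performance only.
    "unroll-by": {"fuse": "if_equal", "split": "copy"},
    # Correctness claim: fusion may create cross-iteration dependences,
    # so the claim is dropped unless the fusion pass re-proves it.
    "parallel": {"fuse": "drop", "split": "copy"},
}


def fuse_metadata(a, b):
    """Metadata for the fused loop: keep a hint only if the policy allows it
    and both source loops carried it with the same value."""
    out = {}
    for key, value in a.items():
        rule = POLICY.get(key, {}).get("fuse", "drop")
        if rule == "if_equal" and key in b and b[key] == value:
            out[key] = value
    return out


def split_metadata(md):
    """Metadata for the two loops produced by splitting: copy only hints
    whose policy says they hold for each part; unknown hints are dropped."""
    kept = {k: v for k, v in md.items()
            if POLICY.get(k, {}).get("split", "drop") == "copy"}
    return dict(kept), dict(kept)


same = fuse_metadata({"parallel": True, "unroll-by": 4},
                     {"parallel": True, "unroll-by": 4})
differ = fuse_metadata({"parallel": True, "unroll-by": 4},
                       {"parallel": True, "unroll-by": 8})
first, second = split_metadata({"parallel": True, "vendor-x": 1})
```

With this framing, the optimizer is never prevented from fusing or splitting a loop; the only cost of a transformation that the table does not cover is that the hint is lost.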