On Oct 7, 2013, at 3:49 PM, Daniel Berlin <dberlin at dberlin.org> wrote:

> On Wed, Sep 25, 2013 at 5:50 PM, Filip Pizlo <fpizlo at apple.com> wrote:
>> Hi all,
>>
>> TBAA on loads and stores is super useful for conveying high-level type knowledge to LLVM transformations. But often translating a high-level language into LLVM IR will result in runtime calls that are known at the high level to only affect (either by reading or by writing) some restricted subset of the heap. These aren't read-only calls; that is often too strong of a guarantee. A great example would be a GC barrier or GC allocation that may arise if LLVM is used as a backend for a compiler for a higher-level language. There are probably other examples that arise in high-level languages, but let's use allocation as an example for now. I can envision two ways of representing allocation in LLVM IR:
>>
>> Method #1:
>>
>> %object = call @runtime_allocate(... /* pass size and/or other data */)
>>
>> Method #2:
>>
>> SomeBlock:
>>   ... ; code preceding the allocation
>>   %objectFast = ... ; IR for the inlined allocation fast path, which produces %objectFast but may yield null if the fast path fails and we need to take the slow path
>>   %didFail = icmp eq %objectFast, null
>>   br %didFail, label %AllocationSlowPath, label %Continue
>> AllocationSlowPath:
>>   %objectSlow = call @runtime_allocate_slow_path(... /* pass size and/or other data */)
>>   br label %Continue
>> Continue:
>>   %object = phi [%objectFast, SomeBlock], [%objectSlow, AllocationSlowPath]
>>
>> In both of these cases, the call instruction will appear to clobber the world, leading to more pessimistic optimization. Moreover, it's not always correct to mark the allocation as being readonly; it may read or write observable state. But if you don't like the allocation example then you can imagine that a GC store barrier would have an almost identical pattern to #2 and it would definitely not be readonly. Anyway, such a pattern would lead to fewer loads being eliminated or hoisted, fewer stores being eliminated or sunk, etc. - all due to those pesky calls. But the high-level code generator that produces this IR will know for sure that @runtime_allocate or @runtime_allocate_slow_path can only affect a very narrow subset of the heap: namely, it will only read and write the GC's data structures. So most of the loads and stores surrounding the allocation, that arose from source-level loads and stores, would definitely not have any interference with the call, and this would be easy to convey using TBAA.
>>
>> Hence if we could decorate the calls to these runtime functions as only reading or writing certain TBAA types,
>
> Out of curiosity, why is this tied to TBAA types, rather than variables or something else?
> IE what analysis are you performing that gives you type answers instead of answers in terms of local/global variables?

TBAA's utility for non-C languages (like JavaScript, in my case) isn't for reasoning about user-facing types, but for creating what you could alternatively think of as "abstract heaps". TBAA is perfect for this.

For example, I might know that a certain function clobbers a field called 'foo'. In my frontend I create an "abstract heap" for each field name, and this gets lowered to LLVM TBAA. I already use this for telling LLVM about dependencies between loads and stores. If I emit a store as a result of lowering a JS "o.foo = ..." operation, then I will ascribe to it a TBAA node for the "foo" abstract heap. This is just one example of the many different abstract heaps (i.e. "TBAA nodes") that my compiler uses. Note that most of the loads and stores that I'm concerned with do not involve allocas (i.e. locals) or "global variables" in any kind of way that LLVM could reason about. A mechanism that only works for locals and globals would be of zero use to me, since all of my allocas are trivially killed by mem2reg and I don't have any globals.

> The information you are talking about here is traditional mod/ref info that is completely independent of TBAA.

You lost me. I'm writing a compiler that generates LLVM IR. I have high-level knowledge that allows me to prove the set of abstract heap locations that may be read or written by any call, load, or store.

What do you suggest as a mechanism by which I can communicate that information to LLVM?

Are you saying that such a mechanism already exists and that I should use that instead?

As far as I can tell, TBAA is the cleanest such mechanism that I have found thus far, and its only shortcoming is that I cannot use it to annotate calls.

> The more specific the info you can get from the metadata, the better off you are eliminating loads/stores
>
> Otherwise, yes, it's certainly incredibly helpful to the other optimizations.
>
> However, unless there is a good reason, i'd suggest you just make it generic call read/write metadata that talks about values (be they incoming arguments, alloca'd memory, or whatever else).

You lost me. How does this help a high-level compiler convey to LLVM that a particular call and a particular load or store don't have any dependency? My proposal achieves this, albeit at a coarse-grained level: you have to express your knowledge of dependencies using a hierarchy of abstract sets of heap locations (i.e. exactly what TBAA gives you).

> It's actually a lot easier to reason about individual heap values than entire types, unless, again, you only have an analysis that is providing answers about types (or your plan was simply to use the existing TBAA info and do a bottom-up closure over the entire callgraph, which would provide some value, but lose a lot of info due to external function calls)

The goal is to allow a front-end that already has knowledge about some high-level types to convey that knowledge to LLVM when generating LLVM IR. In my front-end, I actually don't really do much analysis to produce the TBAA information - it mostly just falls out naturally from my own IR.

I'm not sure what you mean about "individual heap values". If you mean global variables, then that's of no use to me. I don't have any global variables. It's interesting that using TBAA for this completely subsumes any approach based on globals: you could ascribe a distinct TBAA node to each global and get the same effect. But TBAA is more powerful because you can annotate something that you've already proven about a memory location, and this information can be orthogonal to the IR used to compute that memory location. This makes TBAA ideal for conveying dependency information from a high-level compiler to LLVM at the time of LLVM IR generation.

-Filip
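A minimal sketch of the kind of lowering described here, using the scalar TBAA metadata format of the time (the node names and the helper function are invented for illustration, not taken from any actual frontend):

; Scalar TBAA nodes, one per abstract heap: !{ name, parent }.
!0 = metadata !{metadata !"js-heap"}                        ; root: the whole abstract heap
!1 = metadata !{metadata !"js-heap.field.foo", metadata !0} ; abstract heap for field "foo"
!2 = metadata !{metadata !"js-heap.field.bar", metadata !0} ; abstract heap for field "bar"

; Lowering of a JS store "o.foo = v": the store carries the "foo" node, so it
; is known not to alias any load or store tagged with the sibling node !2.
define void @store_foo(i64* %fooSlot, i64 %v) {
  store i64 %v, i64* %fooSlot, align 8, !tbaa !1
  ret void
}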
On Mon, Oct 7, 2013 at 4:22 PM, Filip Pizlo <fpizlo at apple.com> wrote:
> On Oct 7, 2013, at 3:49 PM, Daniel Berlin <dberlin at dberlin.org> wrote:
>> On Wed, Sep 25, 2013 at 5:50 PM, Filip Pizlo <fpizlo at apple.com> wrote:
>>> [original proposal snipped; quoted in full above]
>>>
>>> Hence if we could decorate the calls to these runtime functions as only reading or writing certain TBAA types,
>>
>> Out of curiosity, why is this tied to TBAA types, rather than variables or something else?
>> IE what analysis are you performing that gives you type answers instead of answers in terms of local/global variables?
>
> TBAA's utility for non-C languages (like JavaScript, in my case) isn't for reasoning about user-facing types, but for creating what you could alternatively think of as "abstract heaps". TBAA is perfect for this.

No. A nested hierarchy that describes conflicts among possible abstract heap locations is perfect for this. TBAA is not quite the same. See below.

> For example, I might know that a certain function clobbers a field called 'foo'. In my frontend I create an "abstract heap" for each field name, and this gets lowered to LLVM TBAA.

So let's start simple.

You are representing an abstract set of heap locations. Are they actually in any way related to the *existing type tree* that is being created for TBAA? Or are you adding new values to it that are not really part of a type hierarchy at all, but represent some other relation among abstract heaps?

From your description (you create new things that represent abstract heap locations and lower that into a nested tree), it sounds like the answer is "you are adding new locations that represent a different tree than the existing TBAA". If that is not true, what is the actual relation to the existing information? TBAA in the LLVM compiler, at least as currently produced, while theoretically compatible with this scheme, is not quite the same. The langref says "metadata is added to the IR to describe a type system of a higher level language." If you are adding a new tree just to represent new abstract heap location info, you are essentially reusing it simply because it represents a nested set structure. I don't see a lot of value in that sort of conflation, hence the question.

If not, can you describe in more detail how you are generating this info? Because that is what i get from your description.

>> The information you are talking about here is traditional mod/ref info that is completely independent of TBAA.
>
> You lost me. I'm writing a compiler that generates LLVM IR. I have high-level knowledge that allows me to prove the set of abstract heap locations that may be read or written by any call, load, or store.
>
> What do you suggest as a mechanism by which I can communicate that information to LLVM?

I'm suggesting you create an attribute that deals with the abstract heap locations that you are describing, instead of overloading and tying it into an existing set of information that has a different purpose (TBAA, which at least in the current world, represents a nested type tree that describes access rules of a language).

> Are you saying that such a mechanism already exists and that I should use that instead?
>
> As far as I can tell, TBAA is the cleanest such mechanism that I have found thus far, and its only shortcoming is that I cannot use it to annotate calls.

The fact that some other mechanism is kinda-sorta-like-what-you-want does not necessarily mean it should be tied into it :)

Hence, I am trying to understand the actual relation between the TBAA tree as it exists now, and what you are suggesting creating. For example, i could perform pointer analysis in clang and generate a TBAA tree from that. It would work. It would be easy. It doesn't mean it is necessarily a great approach.

>> The more specific the info you can get from the metadata, the better off you are eliminating loads/stores
>>
>> Otherwise, yes, it's certainly incredibly helpful to the other optimizations.
>>
>> However, unless there is a good reason, i'd suggest you just make it generic call read/write metadata that talks about values (be they incoming arguments, alloca'd memory, or whatever else).
>
> You lost me. How does this help a high-level compiler convey to LLVM that a particular call and a particular load or store don't have any dependency?

You add the mod/ref info to the calls. You add attributes to the loads/stores. It becomes another disambiguation method to tell whether the load/store and the call have a dependency.

> My proposal achieves this, albeit at a coarse-grained level: you have to express your knowledge of dependencies using a hierarchy of abstract sets of heap locations (i.e. exactly what TBAA gives you).

TBAA does not actually give you that. It gives you a tree that describes a type hierarchy according to the memory access rules of some language. This is transformable into sets of possible heap locations, but it is not actually the same.

I get why you may want to reuse it. The structure is close to what you will want. The attributes already exist on load/stores. However, much like the tbaa.struct info, unless this actually represents the same tree that TBAA does, it should probably not be part of the same tree.
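Neither form of call annotation existed in LLVM IR at the time of this thread; the following is only a sketch of the two shapes being debated, with made-up metadata kind names (!tbaa.call, !callmod) and an assumed @runtime_allocate signature:

declare i8* @runtime_allocate(i64)

define i8* @example(i64 %size) {
  ; (a) The proposal in this thread: bound the call's effects with a node from
  ;     the frontend-controlled TBAA tree ("!tbaa.call" is a made-up kind name).
  %a = call i8* @runtime_allocate(i64 %size), !tbaa.call !1

  ; (b) The alternative suggested above: generic read/write metadata on the
  ;     call that names abstract locations directly, with no type tree
  ;     involved ("!callmod" is likewise made up).
  %b = call i8* @runtime_allocate(i64 %size), !callmod !2
  ret i8* %a
}

!0 = metadata !{metadata !"js-heap"}
!1 = metadata !{metadata !"js-heap.gc-structures", metadata !0} ; TBAA node for the GC's own data
!2 = metadata !{metadata !"gc-structures"}                      ; bare abstract location, not part of any tree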
I think we're talking cross-purposes at this point. Let me try to step back and concisely restate what I'm trying to do:

- I am writing a frontend for a high-level language and I emit LLVM IR.

- I control the TBAA tree. I don't use it to express the high-level language's types, since that would make absolutely no sense. JavaScript doesn't have types. But by the time my compiler is generating LLVM IR, it will have come up with meaningful constraints on aliasing, and those constraints are naturally expressed using TBAA.

- Some calls that I emit in LLVM IR are neither readnone/readonly nor do they clobber the world - instead they clobber a known set of memory locations, and that set of memory locations is already expressible in the TBAA representation that I control.

Which part of the above do you disagree with?

On Oct 7, 2013, at 4:57 PM, Daniel Berlin <dberlin at dberlin.org> wrote:

> On Mon, Oct 7, 2013 at 4:22 PM, Filip Pizlo <fpizlo at apple.com> wrote:
>> On Oct 7, 2013, at 3:49 PM, Daniel Berlin <dberlin at dberlin.org> wrote:
>>> On Wed, Sep 25, 2013 at 5:50 PM, Filip Pizlo <fpizlo at apple.com> wrote:
>>>> [original proposal snipped; quoted in full above]
>>>>
>>>> Hence if we could decorate the calls to these runtime functions as only reading or writing certain TBAA types,
>>>
>>> Out of curiosity, why is this tied to TBAA types, rather than variables or something else?
>>> IE what analysis are you performing that gives you type answers instead of answers in terms of local/global variables?
>>
>> TBAA's utility for non-C languages (like JavaScript, in my case) isn't for reasoning about user-facing types, but for creating what you could alternatively think of as "abstract heaps". TBAA is perfect for this.
>
> No. A nested hierarchy that describes conflicts among possible abstract heap locations is perfect for this. TBAA is not quite the same. See below.
>
>> For example, I might know that a certain function clobbers a field called 'foo'. In my frontend I create an "abstract heap" for each field name, and this gets lowered to LLVM TBAA.
>
> So let's start simple.
>
> You are representing an abstract set of heap locations. Are they actually in any way related to the *existing type tree* that is being created for TBAA?

Yes. Because I am also creating the TBAA type tree. I control it entirely.

> Or are you adding new values to it that are not really part of a type hierarchy at all, but represent some other relation among abstract heaps?
> From your description (you create new things that represent abstract heap locations and lower that into a nested tree), it sounds like the answer is "you are adding new locations that represent a different tree than the existing TBAA".

What do you mean by "the existing TBAA"? To my understanding, LLVM itself never creates TBAA nodes; they arise entirely out of the frontend that emits LLVM IR. I control the frontend in this case.

> If that is not true, what is the actual relation to the existing information?
> TBAA in the LLVM compiler, at least as currently produced, while theoretically compatible with this scheme, is not quite the same.
> The langref says "metadata is added to the IR to describe a type system of a higher level language."

My understanding is that LLVM doesn't produce TBAA. It consumes TBAA. Hence it's more meaningful to reason about TBAA in terms of its semantics rather than hypothesizing about how and why someone would produce it.

We agree that TBAA is a nested set structure that describes heap locations. I'm observing that it would be useful to express bounds on which heap locations calls read or write. Therefore, TBAA can be used to convey those sets.

> If you are adding a new tree just to represent new abstract heap location info, you are essentially reusing it simply because it represents a nested set structure. I don't see a lot of value in that sort of conflation, hence the question.

If we agree that it would be good to have a system for bounding what a call reads or writes using nested sets, then we have two choices:

- Reuse TBAA, which can already be used to express nested sets of memory locations.

- Invent a whole new type system that is structurally identical to TBAA but that goes by a different name.

Are you seriously proposing the latter?

> If not, can you describe in more detail how you are generating this info?
> Because that is what i get from your description.
>
>>> The information you are talking about here is traditional mod/ref info that is completely independent of TBAA.
>>
>> You lost me. I'm writing a compiler that generates LLVM IR. I have high-level knowledge that allows me to prove the set of abstract heap locations that may be read or written by any call, load, or store.
>>
>> What do you suggest as a mechanism by which I can communicate that information to LLVM?
>
> I'm suggesting you create an attribute that deals with the abstract heap locations that you are describing, instead of overloading and tying it into an existing set of information that has a different purpose (TBAA, which at least in the current world, represents a nested type tree that describes access rules of a language).

So, you're suggesting inventing a new representation for type hierarchies, that would have the following properties:

- The same rules as TBAA for load/store.

- The same rules as what I propose for call.

- Identical to TBAA in all other respects, except that it would have a different name.

Am I understanding your proposal correctly?

>> Are you saying that such a mechanism already exists and that I should use that instead?
>>
>> As far as I can tell, TBAA is the cleanest such mechanism that I have found thus far, and its only shortcoming is that I cannot use it to annotate calls.
>
> The fact that some other mechanism is kinda-sorta-like-what-you-want does not necessarily mean it should be tied into it :)
> Hence, I am trying to understand the actual relation between the TBAA tree as it exists now

What TBAA tree? Are you referring to the TBAA format as described in http://llvm.org/docs/LangRef.html#tbaa-metadata, or some specific TBAA tree that exists for some specific frontend for some specific language?

> , and what you are suggesting creating.
> For example, i could perform pointer analysis in clang and generate a TBAA tree from that. It would work. It would be easy. It doesn't mean it is necessarily a great approach.

Would your pointer analysis generate disjoint sets, or nested sets? For example, if you did a Steensgaard-style pointer analysis, then TBAA *would* be the most ideal way of representing the results of such an analysis.

One way to view my front end's aliasing knowledge is that I'm doing something like a unification-based analysis. I actually end up with a three-level TBAA hierarchy (the world, with subtypes for my disjoint sets of heap locations; some of those sets then have further subtypes).

>>> The more specific the info you can get from the metadata, the better off you are eliminating loads/stores
>>>
>>> Otherwise, yes, it's certainly incredibly helpful to the other optimizations.
>>>
>>> However, unless there is a good reason, i'd suggest you just make it generic call read/write metadata that talks about values (be they incoming arguments, alloca'd memory, or whatever else).
>>
>> You lost me. How does this help a high-level compiler convey to LLVM that a particular call and a particular load or store don't have any dependency?
>
> You add the mod/ref info to the calls. You add attributes to the loads/stores. It becomes another disambiguation method to tell whether the load/store and the call have a dependency.

Are you suggesting that I override AliasAnalysis? I'm trying to make it possible for a frontend to convey this information using LLVM IR, rather than having to override internal LLVM classes in order to provide this information. Hence there is a question of what format to use. I'm arguing that TBAA is that format.

>> My proposal achieves this, albeit at a coarse-grained level: you have to express your knowledge of dependencies using a hierarchy of abstract sets of heap locations (i.e. exactly what TBAA gives you).
>
> TBAA does not actually give you that. It gives you a tree that describes a type hierarchy according to the memory access rules of some language.

I disagree. It's true that this is one use-case for TBAA, but that's not how I use it. I don't believe that I'm under any obligation to use it that way. TBAA is just a format that LLVM consumes, that can be used by a producer of LLVM IR to convey aliasing information.

> This is transformable into sets of possible heap locations, but it is not actually the same.
> I get why you may want to reuse it.
> The structure is close to what you will want. The attributes already exist on load/stores. However, much like the tbaa.struct info, unless this actually represents the same tree that TBAA does, it should probably not be part of the same tree.
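For reference, a sketch of the three-level hierarchy described above (the world; disjoint sets of heap locations; further subtypes of some of those sets), again with invented node names in the scalar TBAA format of the time:

; Level 1: the root ("the world").
!0 = metadata !{metadata !"world"}
; Level 2: disjoint sets of heap locations.
!1 = metadata !{metadata !"gc-structures", metadata !0}
!2 = metadata !{metadata !"named-properties", metadata !0}
; Level 3: further subdivisions of some of those sets.
!3 = metadata !{metadata !"named-properties.foo", metadata !2}
!4 = metadata !{metadata !"named-properties.bar", metadata !2}

Under the usual TBAA rules, accesses tagged !3 and !4 cannot alias each other, either of them may alias an access tagged !2, and none of them may alias an access tagged !1.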