On Sun, Feb 5, 2017 at 12:25 PM, Nuno Lopes <nunoplopes at sapo.pt> wrote:

> Hi Daniel,
>
> Many thanks for working on this!
> SSI/e-SSA is the only way I'm aware of for doing efficient sparse
> analyses, so I'm definitely in favor of adding support for it in LLVM!
>
> I read the discussion so far and did a cursory review of the patches, and
> I have just a few questions:
>
> - Is your plan to keep e-SSA form all the time (do it once and maintain it
> throughout the pipeline), or do it in the passes that need it and then get
> rid of it at the end of the pass?

At the moment, get rid of it.

> My concern (similar to Sanjoy's) is the increased overhead for matching
> patterns and so on.

So far i have measured exactly none, FWIW.

I'm also not worried because, given how much we try to match tons of
useless variants, i'm positive i could make it net-zero compile time no
matter what happened by removing wasted matching time :)

> For example, how would instcombine work?

Assuming you wished to use it there, the intrinsic already has the
returned attribute set, which states, specifically, that it always
returns its first argument.

If instcombine doesn't take advantage, it already has a problem with
intrinsics marked with the returned attribute :)
(though yeah, they currently only exist in backends)

As for how you would make it work, there is no magic, of course.
We either change the matchers to see through it or we don't.
Both are valid options, with their own tradeoffs. Nobody has yet
approached me about adding it to instcombine, so i haven't tried to
formulate an opinion which way i'd go :)

Note that other intrinsics, such as noalias, have the same issue.

> Would we change the m_*() matchers to know how to see through the
> intrinsic?
>
> - What you've implemented never creates phi nodes, right? (as opposed to
> the ABCD paper)

Correct.
It originally did, but it's not possible to sanely create phi nodes for
assumes in all cases.

> At least it seems that the current algorithm bails out if the
> successor has multiple predecessors.

This is because it's not possible to place info in such cases without phi
nodes, and even then it's not clear that it makes sense.

Also note that such cases are all critical edges (otherwise i can't see
how they have info to propagate :P), and if you break the critical edges,
it works just fine.

The need to split critical edges is pretty much true for maximal results
for any optimization.

> This might be an ok abstraction for GVN, but for things like CVP it's
> probably not. CVP merges information incoming from multiple edges (as any
> other fancier abstractions we may want to have in the future will).

It's important to note: we sort phi node uses into the predecessor block
they belong to, so that restriction does *not* apply to the typical phi
node use case.
IE given:

define i32 @test12(i32 %x) {
  %cmp = icmp eq i32 %x, 0
  br i1 %cmp, label %cond_true, label %cond_false

cond_true:
  br label %ret

cond_false:
  br label %ret

ret:
  %res = phi i32 [ %x, %cond_true ], [ %x, %cond_false ]
  ret i32 %res
}

You will get:

; CHECK-LABEL: @test12(
; CHECK-NEXT:  [[CMP:%.*]] = icmp eq i32 [[X:%.*]], 0
; CHECK-NEXT:  br i1 [[CMP]], label [[COND_TRUE:%.*]], label [[COND_FALSE:%.*]]
; CHECK:       cond_true:
; CHECK-NEXT:  [[X_0:%.*]] = call i32 @llvm.ssa.copy.i32(i32 [[X]])
; CHECK-NEXT:  br label [[RET:%.*]]
; CHECK:       cond_false:
; CHECK-NEXT:  [[X_1:%.*]] = call i32 @llvm.ssa.copy.i32(i32 [[X]])
; CHECK-NEXT:  br label [[RET]]
; CHECK:       ret:
; CHECK-NEXT:  [[RES:%.*]] = phi i32 [ [[X_0]], [[COND_TRUE]] ], [ [[X_1]], [[COND_FALSE]] ]
; CHECK-NEXT:  ret i32 [[RES]]

(as you can see, we test this)

So the cases i think you are thinking about are not problems, except in
one degenerate case: where a critical edge leads directly to the use.

In such cases, note that the merges it performs can be proved to miss
information or be non-optimal in the case of critical edges, even as it
tries now.

All that said ...

> Any plans to support this? (i.e., is the current algorithm "easily"
> upgradable to support this scenario?)

It is easy to do, but it is ... messy, IMHO.

Note that such a thing is generally a mess.

Here is the degenerate case. As an example:

define i32 @f1(i32 %x) {
bb0:
  %cmp = icmp eq i32 %x, 0
  br i1 %cmp, label %bb2, label %bb1
bb1:
  br label %bb2
bb2:
  %cond = phi i32 [ %x, %bb0 ], [ %x, %bb1 ]
  %foo = add i32 %cond, %x
  ret i32 %foo
}

The critical edge from bb0 to bb2 gives us no place to put predicateinfo
for the true branch in the current algorithm (we will place the false
info).

You could also always place it before each branch it's for (because that
block should dominate all uses in the conditional edges already).

The actual placement is irrelevant, BTW. In the renaming algorithm, by the
time it goes to place them, it already knows what it *should* dominate.

The only thing it gets "wrong" is the phi uses, because they are placed in
the predecessor blocks right now as we sort.

But because we see them there, we can detect this case and change
placement.
Because they are guaranteed to be at the beginning of the successor block,
you are guaranteed that, if you want to insert, the only thing that can be
between them and the def you want to insert there is other phi uses in the
same boat.
So you can look ahead in the stream, find that def, insert it, use it, and
pretend everything's great (without even pushing it onto the stack!)

This is tricky in a one-pass algorithm, as it's really changing a simple
renaming automaton into something that has two modes, "critical phi use"
and "regular". In critical-phi-use mode, it finds the def, inserts it, and
keeps processing until it hits a non-phi use. Then it shifts back into
regular mode.

But there's also only exactly one case all of the above work affects:
the case where the phi node with the use is a direct successor of the
branch, such that the edge to that use is critical.

In any case, this is also precisely the problem that splitting critical
edges resolves, and if you use break-crit-edges on the above, you get
right answers all the time :)

Note also that the thing we replace, propagateEquality in GVN, also
restricts itself to single predecessors, and simply attempts to propagate
directly into critical phi node uses to avoid the issue (which we can't
do, since NewGVN is an analysis followed by an eliminator).
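To make the splitting point concrete, here is roughly what @f1 turns into
after break-crit-edges plus PredicateInfo. This is a hand-written sketch,
not actual pass output; the block and copy names are made up:

declare i32 @llvm.ssa.copy.i32(i32 returned)

define i32 @f1(i32 %x) {
bb0:
  %cmp = icmp eq i32 %x, 0
  br i1 %cmp, label %bb0.bb2_crit_edge, label %bb1

bb0.bb2_crit_edge:                          ; new block from the edge split
  ; the true-edge info now has a legal placement point
  %x.0 = call i32 @llvm.ssa.copy.i32(i32 %x)
  br label %bb2

bb1:
  ; false-edge info, as before
  %x.1 = call i32 @llvm.ssa.copy.i32(i32 %x)
  br label %bb2

bb2:
  %cond = phi i32 [ %x.0, %bb0.bb2_crit_edge ], [ %x.1, %bb1 ]
  ; this use is not dominated by either edge, so it keeps the original %x
  %foo = add i32 %cond, %x
  ret i32 %foo
}

With the edge split, the merge in bb2 is no longer lossy: a client can see
that %x.0 is the "x == 0" version and %x.1 the "x != 0" version.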
So yes, we can fix this case if we need to.

> - For a function/intrinsic marked as "returned(1) readnone", the obvious
> optimization to do would be to RAUW the return with the argument (and kill
> the function call altogether -- assuming readnone allows that; but I lost
> track of that discussion). Does instcombine do that?
> I understand that piggybacking on this existing feature is a good thing,
> but I'm just wondering if, say, InstCombine won't simply revert e-SSA?

NewGVN is the only pass that uses it ATM, and it destroys the form.
The current expectation is that anything that uses it will destroy it.
This is the deliberate outcome of this discussion right now :)

If it becomes slow enough that we want to keep it up to date, IMHO, we can
burn that bridge when we come to it.

At the point at which we are trying to keep predicateinfo up to date in a
large number of places, i'd argue we should just move to e-SSA/SSI as the
default and be done with it, since that's what you'll have effectively
done anyway.

I think it would be reasonable to see where it gets used, and if it's
enough places, make a decision about what to do.

> - In processBranch/processAssume you also consider branches on ANDs/ORs of
> comparisons, and then each comparison is processed individually. However,
> this seems incorrect for one of the branches:
>
>   if (a && b) {
>     ...
>   } else {
>     // holds: !a OR !b
>     use(a)
>     use(b)
>   }
>
> Right now it seems that the information that is attached to the else
> branch is !a AND !b, which would be incorrect.

I could be pedantic and say the only information attached is a comparison,
the branch, and whether the edge is the true edge or the false edge :)
Which is correct.

It also chains the info to give you the entire string of comparisons that
were applied.

However, in practice you are right, and clients are not likely to get this
right with what i've given them.

Since in practice the only useful derived info is in the true branch of an
and, and the false branch of an or, i'm going to make it not attach info
to the other branch.

Unless you can see a case where it makes sense to?

> I haven't seen the client of this analysis, so this is just speculation,
> but the interface doesn't seem to have any indication for this case. Or
> maybe I'm misreading the code :)

No, you are correct. It just happens that the only client is smart enough
to do something sane (i think :P)

> - Now slightly unrelated: do you know of other representations capable of
> handling relational analyses? For example, with e-SSA:
>
>   if (x+y > 0) {
>     // predicate info attached to 'x+y' only
>     use(x)
>   }

We could do that with this pass, actually.

It's a question of where you terminate the operand-finding.

You could actually just make it DFS the comparison operation and collect
operands until you hit all the leaves, and insert for those. This would be
a simple modification to collectCmpOperands.
(It will still be smart enough to only insert if there are actual uses
affected by the info.)

We don't do that ATM, but there isn't anything i can see that would stop
you.

The goals i set were to replace propagateEquality in GVN with NewGVN, so
we pretty much only produce info that we can directly propagate.

It all depends on how many names you want to generate.

I believe the study i saw, which tried splitting at different places, came
to the conclusion that the best "bang for buck" was e-SSA. IE the percent
more you get from splitting in more places was not worth the cost.
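For illustration, a hand-written sketch of what that leaf-collecting
variant of collectCmpOperands would produce for the x+y > 0 example.
Function, block, and value names are made up, and the comments mark which
copies exist today versus under the hypothetical extension:

declare i32 @llvm.ssa.copy.i32(i32 returned)

define i32 @rel(i32 %x, i32 %y) {
entry:
  %add = add i32 %x, %y
  %cmp = icmp sgt i32 %add, 0
  br i1 %cmp, label %then, label %exit

then:
  ; today collectCmpOperands only considers the compare's own operands, so
  ; only %add gets a copy, and the use of %x below still sees plain %x:
  %add.0 = call i32 @llvm.ssa.copy.i32(i32 %add)
  ; the leaf-collecting variant would also produce something like
  ;   %x.0 = call i32 @llvm.ssa.copy.i32(i32 %x)
  ; (only for %x, since %y has no use here that the info could affect)
  %use = add i32 %x, %add.0
  ret i32 %use

exit:
  ret i32 0
}

How many of those extra names are worth creating is exactly the
bang-for-buck trade-off mentioned above.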
But when i spoke with the authors of that study about it, their biggest
time-waste was actually "we insert a lot of useless phis for operations
that never get used, because computing liveness is hard". But as i've
shown here, you don't even need to explicitly compute liveness to solve
that problem (see the sketch at the end of this mail). So who knows, maybe
it does pay to split everywhere once that cost is eliminated.

I'd put this one in the "evaluation necessary" bucket.

> So at use(x) no information will be available, even if we know that
> 'x+y > 0' holds there. I don't think LLVM has any analysis that can track
> these kinds of symbolic relations, but I was just curious if there's
> anything out there that can handle such analyses?

The folks i know who do this, in fact, start with e-SSA and go from there,
with the caveats i listed :)
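To make the liveness point concrete, a small hand-written sketch
(hypothetical functions; the behavior it illustrates is the "only insert
where there are actual affected uses" rule described above):

declare i32 @llvm.ssa.copy.i32(i32 returned)

; %x has no uses below the branch, so there is nothing to rename and no
; copy gets inserted at all -- no liveness computation needed to see that.
define void @no_affected_uses(i32 %x) {
  %cmp = icmp eq i32 %x, 0
  br i1 %cmp, label %t, label %f
t:
  ret void
f:
  ret void
}

; %x is only used on the true side, so only that edge gets a copy.
define i32 @one_affected_use(i32 %x) {
  %cmp = icmp eq i32 %x, 0
  br i1 %cmp, label %t, label %f
t:
  %x.0 = call i32 @llvm.ssa.copy.i32(i32 %x)   ; the use below is renamed
  ret i32 %x.0
f:
  ret i32 1
}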
(and D29519 was updated to not insert in and/or on the "wrong" edges, and
tests added to ensure it happens)

On Sun, Feb 5, 2017 at 1:25 PM, Daniel Berlin <dberlin at dberlin.org> wrote:
> [the previous message, quoted in full -- snipped]
Thanks for the answers! The plan makes sense to me.

Regarding phis, what about diamonds, e.g.:

define i32 @f(i32 %x) {
  br .., label %bb0, label %bb1
bb0:
  %cmp = icmp sgt i32 %x, 0          ; x > 0
  br i1 %cmp, label %bb2, label %bb3
bb1:
  %x2 = add nuw nsw i32 %x, 1
  %cmp2 = icmp sgt i32 %x2, 2        ; x+1 > 2, i.e. x > 1
  br i1 %cmp2, label %bb2, label %bb3
bb2:
  %x3 = phi i32 [ %x, %bb0 ], [ %x2, %bb1 ]
  ; CVP says: %x3 is > 0
  ...
  br label %bb3
bb3:
  ...
}

CVP can infer that %x3 > 0 because the union of the intervals given to the
phi node implies that.
Sure, we can split those edges, but maybe adding the predicate info to
blocks bb0 & bb1 would solve the problem?

>> - In processBranch you also consider branches on ANDs/ORs of comparisons,
>> and then each comparison is processed individually. However, this seems
>> incorrect for one of the branches:
>>
>>   if (a && b) {
>>     ...
>>   } else {
>>     // holds: !a OR !b
>>     use(a)
>>     use(b)
>>   }
>>
>> Right now it seems that the information that is attached to the else
>> branch is !a AND !b, which would be incorrect.
>
> I could be pedantic and say the only information attached is a comparison,
> the branch, and whether the edge is the true edge or the false edge :)
> Which is correct.
>
> It also chains the info to give you the entire string of comparisons that
> were applied.
>
> However, in practice you are right, and clients are not likely to get this
> right with what i've given them.
>
> Since in practice the only useful derived info is in the true branch of an
> and, and the false branch of an or, i'm going to make it not attach info
> to the other branch.
>
> Unless you can see a case where it makes sense to?

For CVP there is. For example:

  if (x > 2 && x < 5) {
    ...
  } else {
    // known: x <= 2 || x >= 5
    // CVP produces a range for x: [5, 3)  (no loss of information here at all)
    if (x == 4) ...   // CVP folds this to false
  }

So CVP can handle (simple) disjunctive information. Other ValueTracking
analyses handle simple patterns as well, though probably at this time those
can't use this new stuff unless we go all in with e-SSA.
Not sure how to export the information to clients, though. Supporting
arbitrary boolean combinations of comparisons seems tricky, but maybe
sticking to just one level of and/or is ok.

I'm mentioning CVP because it *really* needs to be refactored to use
e-SSA/SSI. The current code is slow, is very limited in scope (w/ somewhat
arbitrary throttling), and is too complicated.

Thanks,
Nuno
On Sun, Feb 5, 2017 at 3:41 PM, Nuno Lopes <nunoplopes at sapo.pt> wrote:

> Thanks for the answers! The plan makes sense to me.
>
> Regarding phis, what about diamonds, e.g.:
>
> [diamond example snipped]
>
> CVP can infer that %x3 > 0 because the union of the intervals given to the
> phi node implies that.
> Sure, we can split those edges, but maybe adding the predicate info to
> blocks bb0 & bb1 would solve the problem?

Right, and these are all critical edges, as you say :)

I will actually try a trick where we insert the def twice and try to place
them before the phi node uses. We sort phi node uses by incoming block, so
they should end up next to each other.

> [earlier exchange about and/or branches in processBranch snipped]
>
> For CVP there is. For example:
>
>   if (x > 2 && x < 5) {
>     ...
>   } else {
>     // known: x <= 2 || x >= 5
>     // CVP produces a range for x: [5, 3)  (no loss of information here at all)
>     if (x == 4) ...   // CVP folds this to false
>   }

Okay, so it does try to handle simple disjunctions (there is an IR sketch
of this case at the end of this mail).

> So CVP can handle (simple) disjunctive information. Other ValueTracking
> analyses handle simple patterns as well, though probably at this time those
> can't use this new stuff unless we go all in with e-SSA.
> Not sure how to export the information to clients, though. Supporting
> arbitrary boolean combinations of comparisons seems tricky, but maybe
> sticking to just one level of and/or is ok.

I mean, we can certainly mark and link the info however we want.
I'm just not sure yet what the best way for clients to use it is.

Let me think about this as a followup, since it at least is now "correct"
for the obvious clients.

> I'm mentioning CVP because it *really* needs to be refactored to use
> e-SSA/SSI. The current code is slow, is very limited in scope (w/ somewhat
> arbitrary throttling), and is too complicated.

Note that with patches to do LVI in DFS postorder instead of BFS order, it
actually should be close to ideal :)
If CVP moves forward and queries LVI in RPO order, and LVI is doing PO, it
should be as close to O(1) work per LVI call as you can get.

Of course, it's still a mess, code wise, but ...
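Here is that disjunction example rendered as IR, as a rough sketch (names
and exact opcode choices are made up; the point is only which fact holds
on the false edge):

define i32 @disj(i32 %x) {
entry:
  %cmp1 = icmp sgt i32 %x, 2
  %cmp2 = icmp slt i32 %x, 5
  %and = and i1 %cmp1, %cmp2
  br i1 %and, label %then, label %else

then:
  ret i32 0

else:
  ; on this edge the combined fact is x <= 2 || x >= 5, i.e. the wrapped
  ; ConstantRange [5, 3) for %x; 4 is not in that range, so CVP can fold
  ; %is4 to false
  %is4 = icmp eq i32 %x, 4
  %r = zext i1 %is4 to i32
  ret i32 %r
}

Whether PredicateInfo should hand clients anything on that edge (after the
D29519 update it currently does not) is exactly the open question above.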