On Mon, Apr 27, 2015 at 5:32 PM, Sanjoy Das
<sanjoy at playingwithpointers.com> wrote:
> On Mon, Apr 27, 2015 at 4:21 PM, Daniel Berlin <dberlin at dberlin.org> wrote:
>> You can't win here (believe me, I've tried, and better people than me have
>> tried, for years :P).
>> No matter what you do, the partitioning will never be 100% precise. The
>> only way to solve that in general is to pairwise query over the
>> partitioning.
>>
>> Your basic problem is that the alias analyses do not all come up with the
>> same partitioning of the heap.
>>
>> Worse, some are not even transitive, or even symmetric.
>
> Am I correct in saying that the lack of transitivity fundamentally
> cannot be helped -- a MayAlias composed with a MayAlias is not
> necessarily a MayAlias.
Correct. I also imagine you can abuse noalias to get a noalias b, b
noalias c, a mayalias c.
(But I haven't thought too hard about it.)
> The simplest example of this I can come up with is {A ,
> C?A:B , B} where A `NoAlias` B and C is a runtime value that cannot be
> computed at compile-time. Then A `MayAlias` C?A:B and C?A:B
> `MayAlias` B but A `NoAlias` B.
Yes.
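In IR, that example looks something like the following (a minimal sketch;
the function and value names are made up, and the query results in the
comments are what you would expect from BasicAA given the noalias
arguments):

define void @example(double* noalias %A, double* noalias %B, i1 %C) {
entry:
; %P is %A or %B depending on a value only known at runtime.
%P = select i1 %C, double* %A, double* %B
; Expected results:
;   alias(%A, %B) -> NoAlias   (distinct noalias arguments)
;   alias(%A, %P) -> MayAlias
;   alias(%P, %B) -> MayAlias
; i.e. MayAlias composed with MayAlias does not give MayAlias here.
%v = load double, double* %P, align 8
store double %v, double* %A, align 8
ret void
}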
>
>> This means one single variable may be in multiple disjoint alias sets in
>> reality.
>
> Agreed. I did not propose solving the problem within ASTracker for
> this very reason.
>
>> (For example, GEP, PHI will give you different AA results than PHI, GEP in
>> some cases. That is mostly a FIXME, but there are plenty of cases you can
>> come up with where the info tells you something about one variable and its
>> relationship to another, but not to a third. The non-symmetric case is
>> rarer, but again, possible.)
>>
>> Let's say you modify AliasSetTracker to not collapse, and handle multiple
>> disjoint partitions at once.
>>
>> (Here is where I will draw the equivalence between this and what we've done
>> before in GCC :) )
>>
>> The only way to maintain correctness (in general) and not collapse is to
>> maintain multiple alias sets, and say which are affected by which
>> instructions.
>>
>> (IE you have 1 instruction, and it may MOD 5 different alias sets, each of
>> which represents a different partitioning of the heap).
>>
>> You will discover very quickly that, for most partitionings you can think of,
>> the cost of maintaining the alias sets over a set of instructions is
>> greater than the cost of the additional queries you may come up with in the
>> first place.
>>
>> In other words, depending on how precise your partitioning is, you may have
>> variables that clobber 10 or 20 or 30 of the disjoint alias sets. Handling
>> this is a lot more expensive than saying "hey, I want to move this one load
>> up past another instruction. Does it really conflict?". This is because you
>> really only care about individual aliases in the alias set at any given
>> time, but unless you go through all the addresses ahead of time, and have a
>> way to say "okay alias set tracker, I only care about variables x, y, z, and
>> even then, only x in block a, y in block b, etc.", you are paying the cost of
>> doing whatever queries are necessary to prove something about *all*
>> variables in an alias set.
>>
>> MemorySSA was essentially built with this problem in mind (and thus tries
>> to provide chains that you can query on). You ask it about a load, and it
>> can tell you in O(1) the nearest dominating thing that MODs it (IE the
>> nearest dominating instruction for which getModRefInfo(Instruction,
>> Location) says Mod).
>
> So my proposal is somewhat related to what you're suggesting --
> tracking the stores through a separate ASTracker lets me basically ask
> a conservative version of getModRefInfo over the entire loop body.
Yes.
MemorySSA just lets you do this per-load instead.
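For a trivial straight-line case, the per-load annotations look roughly like
this (a made-up sketch, using the same annotation style as the example
further down):

define i32 @sketch(i32* noalias %p, i32* noalias %q) {
entry:
; 1 = MemoryDef(liveOnEntry)
store i32 1, i32* %p
; 2 = MemoryDef(1)
store i32 2, i32* %q
; MemoryUse(1) -- the store to %q does not Mod %p, so the nearest
; clobbering access for this load is the first store.
%v = load i32, i32* %p
ret i32 %v
}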
> And given that the loop body will be strongly connected, I think the
> list of clobbering stores reported by MemorySSA will be exactly equal
> to the list of stores reported by AST; control flow should not matter
> since everything in the loop body reaches everything else.
Depends how good AST is.
Given a loop like this:
; Function Attrs: nounwind ssp
define void @test10(i32 %N, double* nocapture %G) #0 {
entry:
%0 = add i32 %N, -1
%1 = icmp sgt i32 %0, 1
br i1 %1, label %bb.nph, label %return
bb.nph: ; preds = %entry
%tmp = sext i32 %0 to i64
%tmp8 = add i64 %tmp, -1
br label %bb
bb: ; preds = %bb, %bb.nph
%indvar = phi i64 [ 0, %bb.nph ], [ %tmp11, %bb ]
%scevgep = getelementptr double, double* %G, i64 %indvar
%tmp9 = add i64 %indvar, 2
%scevgep10 = getelementptr double, double* %G, i64 %tmp9
%tmp11 = add i64 %indvar, 1
%scevgep12 = getelementptr double, double* %G, i64 %tmp11
%2 = load double, double* %scevgep12, align 8
%3 = load double, double* %scevgep10, align 8
%4 = fadd double %2, %3
%5 = load double, double* %scevgep, align 8
%6 = fadd double %4, %5
store double %6, double* %scevgep12, align 8
%exitcond = icmp eq i64 %tmp11, %tmp8
br i1 %exitcond, label %return, label %bb
return: ; preds = %bb, %entry
ret void
}
MemorySSA will return that the load of scevgep12 is not clobbered in the loop
(on one path through the phi it leads all the way back to the entry
block; on the other path it also leads all the way back to the entry block,
because scevgep12 becomes scevgep10 along that path), that the load of
scevgep10 gets back to live-on-entry and is not clobbered by the loop, and
that the load of scevgep is clobbered by the store below it.
IE:
; Function Attrs: nounwind ssp
define void @test10(i32 %N, double* nocapture %G) #0 {
entry:
%0 = add i32 %N, -1
%1 = icmp sgt i32 %0, 1
br i1 %1, label %bb.nph, label %return
bb.nph: ; preds = %entry
%tmp = sext i32 %0 to i64
%tmp8 = add i64 %tmp, -1
br label %bb
bb: ; preds = %bb, %bb.nph
; 3 = MemoryPhi({bb.nph,liveOnEntry},{bb,1})
%indvar = phi i64 [ 0, %bb.nph ], [ %tmp11, %bb ]
%scevgep = getelementptr double, double* %G, i64 %indvar
%tmp9 = add i64 %indvar, 2
%scevgep10 = getelementptr double, double* %G, i64 %tmp9
%tmp11 = add i64 %indvar, 1
%scevgep12 = getelementptr double, double* %G, i64 %tmp11
; MemoryUse(liveOnEntry)
%2 = load double, double* %scevgep12, align 8
; MemoryUse(liveOnEntry)
%3 = load double, double* %scevgep10, align 8
%4 = fadd double %2, %3
; MemoryUse(3)
%5 = load double, double* %scevgep, align 8
%6 = fadd double %4, %5
; 1 = MemoryDef(3)
store double %6, double* %scevgep12, align 8
%exitcond = icmp eq i64 %tmp11, %tmp8
br i1 %exitcond, label %return, label %bb
return: ; preds = %bb, %entry
; 2 = MemoryPhi({entry,liveOnEntry},{bb,1})
ret void
}
>
> -- Sanjoy