Denis Antrushin via llvm-dev
2019-Nov-28 14:02 UTC
[llvm-dev] GVN and MemoryDependencyAnalysis questions
Hi all,
I am looking at bug https://bugs.llvm.org/show_bug.cgi?id=42733 and I don't
understand how
MemoryDependenceAnalysis is supposed to work together with GVN.
Could someone clarify few questions?
1. Function comment of MemoryDependenceAnalysis::getNonLocalCallDependency()
says that
it returns "...the set of instructions that either define or clobber the
value".
I read it as it can return incomplete set of dependencies (excluding NonLocal
ones).
How then it can be used by GVN, which needs full set of dependencies to work
correctly?
More generally, what is the purpose of such interface at all? Am I missing
something here?
2. With empty cache, MemDepAnalysis in fact adds Invalid (NonLocal) Dependency
to the
result if PHITransAddr cannot fully translate it, but after processing of other
blocks
caused entry for address to appear in cache, it will be ignored being NonLocal
and not reported.
This is confusing, as I do not expect for cache to affect result?
3. What is the _exact_ purpose of 'BBSkipFirstBlockPair Pair' field of
NonLocalPinterInfo struct?
Looks like it serves to record entry BB of address query, but I see that it can
change until
cache entry is not empty. Is it intended behavior?
4. PHITransAddr can translate GEP to another GEP which resides in a block from
which original
query block is not reachable. In the example below, it can translate GEP %tmp28
from bb26 to
GEP %tmp14 from bb12, which is return block! Is it bug? If not, what's the
purpose of it?
(MustDominate flag is set to false here - this is how GVN calls into
MemDepAnalysis)
A testcase from PR42733 looks like this:
bb8: ; preds = %bb6
%tmp9 = load atomic i64 addrspace(1)*, i64 addrspace(1)** %arg unordered,
align 8
br i1 %arg2, label %bb11, label %bb10
bb10: ; preds = %bb8
br label %bb15
bb11: ; preds = %bb8
br i1 %arg3, label %bb12, label %bb18
bb12: ; preds = %bb11
%tmp13 = phi i64 addrspace(1)* [ %tmp9, %bb11 ]
<<< Note this PHI!
%tmp14 = getelementptr inbounds i64, i64 addrspace(1)* %tmp13, i64 8
store atomic i64 1, i64 addrspace(1)* %tmp14 unordered, align 8
ret i64 0
bb15: ; preds = %bb26, %bb10
%tmp16 = phi i64 addrspace(1)* [ %tmp9, %bb10 ], [ %tmp27, %bb26 ]
%tmp17 = phi i64 [ %tmp7, %bb10 ], [ 0, %bb26 ]
switch i32 %arg4, label %bb19 [
i32 0, label %bb26
i32 1, label %bb23
]
bb18: ; preds = %bb11
br label %bb19
bb19: ; preds = %bb18, %bb15, %bb6
%tmp20 = phi i64 addrspace(1)* [ %tmp16, %bb15 ], [ null, %bb6 ], [ %tmp9,
%bb18 ]
%tmp21 = getelementptr inbounds i64, i64 addrspace(1)* %tmp20, i64 8
%tmp22 = load atomic i64, i64 addrspace(1)* %tmp21 unordered, align 8, !tbaa
!0
br label %bb6
bb23: ; preds = %bb15
%tmp24 = getelementptr inbounds i64, i64 addrspace(1)* %tmp16, i64 8
%tmp25 = load atomic i64, i64 addrspace(1)* %tmp24 unordered, align 8
call void @baz(i64 %tmp25) #0
ret i64 0
bb26: ; preds = %bb15
call void @bar()
%tmp27 = load atomic i64 addrspace(1)*, i64 addrspace(1)** %arg unordered,
align 8
%tmp28 = getelementptr inbounds i64, i64 addrspace(1)* %tmp27, i64 8
%tmp29 = load atomic i64, i64 addrspace(1)* %tmp28 unordered, align 8
%tmp30 = getelementptr inbounds i64, i64 addrspace(1)* %tmp27, i64 40
store atomic i64 %tmp29, i64 addrspace(1)* %tmp30 unordered, align 4
br label %bb15
GVN performs incorrect load replacement for %tmp25 at bb23.
On first GVN iteration, we start querying MD for %tmp25.
It takes GEP %tmp24 and walks CFG up to bb15 and from there it walks to bb10
trying to translate %tmp9
(branch to bb6 with query for %tmp27 resolve successfully, so I won't
mention it further)
Here it stops, because PHI node %tmp13 at bb12 prevents PHITransAddr from
finding node %tmp14 (it is not direct user of
%tmp9). So PHITransAddr returns null, and getNonLocalPointerDepFromBB() adds
invalid/nonlocal MemDepResultEntry to the
result vector. (Finally we have two deps reported on first iteration: %tmp28
from bb6 and invalid entry from bb10)
Later on, redundant PHI %tmp13 = phi i64 addrspace(1)* [ %tmp9, %bb11 ] is
deleted and during processing of block
bb19, GEP %tmp14 gets into cache as NonLocal entry with starting block of bb10.
(Note, %tmp14 _cannot_ reach bb10!)
Then, since on first iteration we have changed something, GVN proceeds to second
iteration.
Process repeats, but now translation of PHI %tmp16 translates to %tmp14 and
founds valid cache
entry for it (it now is direct user of %tmp9).
But since it is NonLocal entry, it is simply ignored and not reported back to
GVN as memory dependency...
Any idea how to properly fix this?
I have seen a review in which all uses of invalidated cache entry are
invalidated as well.
While it looks like it will fix my problem, it does not sounds right in this
case...
Thanks,
Denis