Denis Antrushin via llvm-dev
2019-Nov-28 14:02 UTC
[llvm-dev] GVN and MemoryDependencyAnalysis questions
Hi all, I am looking at bug https://bugs.llvm.org/show_bug.cgi?id=42733 and I don't understand how MemoryDependenceAnalysis is supposed to work together with GVN. Could someone clarify few questions? 1. Function comment of MemoryDependenceAnalysis::getNonLocalCallDependency() says that it returns "...the set of instructions that either define or clobber the value". I read it as it can return incomplete set of dependencies (excluding NonLocal ones). How then it can be used by GVN, which needs full set of dependencies to work correctly? More generally, what is the purpose of such interface at all? Am I missing something here? 2. With empty cache, MemDepAnalysis in fact adds Invalid (NonLocal) Dependency to the result if PHITransAddr cannot fully translate it, but after processing of other blocks caused entry for address to appear in cache, it will be ignored being NonLocal and not reported. This is confusing, as I do not expect for cache to affect result? 3. What is the _exact_ purpose of 'BBSkipFirstBlockPair Pair' field of NonLocalPinterInfo struct? Looks like it serves to record entry BB of address query, but I see that it can change until cache entry is not empty. Is it intended behavior? 4. PHITransAddr can translate GEP to another GEP which resides in a block from which original query block is not reachable. In the example below, it can translate GEP %tmp28 from bb26 to GEP %tmp14 from bb12, which is return block! Is it bug? If not, what's the purpose of it? (MustDominate flag is set to false here - this is how GVN calls into MemDepAnalysis) A testcase from PR42733 looks like this: bb8: ; preds = %bb6 %tmp9 = load atomic i64 addrspace(1)*, i64 addrspace(1)** %arg unordered, align 8 br i1 %arg2, label %bb11, label %bb10 bb10: ; preds = %bb8 br label %bb15 bb11: ; preds = %bb8 br i1 %arg3, label %bb12, label %bb18 bb12: ; preds = %bb11 %tmp13 = phi i64 addrspace(1)* [ %tmp9, %bb11 ] <<< Note this PHI! %tmp14 = getelementptr inbounds i64, i64 addrspace(1)* %tmp13, i64 8 store atomic i64 1, i64 addrspace(1)* %tmp14 unordered, align 8 ret i64 0 bb15: ; preds = %bb26, %bb10 %tmp16 = phi i64 addrspace(1)* [ %tmp9, %bb10 ], [ %tmp27, %bb26 ] %tmp17 = phi i64 [ %tmp7, %bb10 ], [ 0, %bb26 ] switch i32 %arg4, label %bb19 [ i32 0, label %bb26 i32 1, label %bb23 ] bb18: ; preds = %bb11 br label %bb19 bb19: ; preds = %bb18, %bb15, %bb6 %tmp20 = phi i64 addrspace(1)* [ %tmp16, %bb15 ], [ null, %bb6 ], [ %tmp9, %bb18 ] %tmp21 = getelementptr inbounds i64, i64 addrspace(1)* %tmp20, i64 8 %tmp22 = load atomic i64, i64 addrspace(1)* %tmp21 unordered, align 8, !tbaa !0 br label %bb6 bb23: ; preds = %bb15 %tmp24 = getelementptr inbounds i64, i64 addrspace(1)* %tmp16, i64 8 %tmp25 = load atomic i64, i64 addrspace(1)* %tmp24 unordered, align 8 call void @baz(i64 %tmp25) #0 ret i64 0 bb26: ; preds = %bb15 call void @bar() %tmp27 = load atomic i64 addrspace(1)*, i64 addrspace(1)** %arg unordered, align 8 %tmp28 = getelementptr inbounds i64, i64 addrspace(1)* %tmp27, i64 8 %tmp29 = load atomic i64, i64 addrspace(1)* %tmp28 unordered, align 8 %tmp30 = getelementptr inbounds i64, i64 addrspace(1)* %tmp27, i64 40 store atomic i64 %tmp29, i64 addrspace(1)* %tmp30 unordered, align 4 br label %bb15 GVN performs incorrect load replacement for %tmp25 at bb23. On first GVN iteration, we start querying MD for %tmp25. It takes GEP %tmp24 and walks CFG up to bb15 and from there it walks to bb10 trying to translate %tmp9 (branch to bb6 with query for %tmp27 resolve successfully, so I won't mention it further) Here it stops, because PHI node %tmp13 at bb12 prevents PHITransAddr from finding node %tmp14 (it is not direct user of %tmp9). So PHITransAddr returns null, and getNonLocalPointerDepFromBB() adds invalid/nonlocal MemDepResultEntry to the result vector. (Finally we have two deps reported on first iteration: %tmp28 from bb6 and invalid entry from bb10) Later on, redundant PHI %tmp13 = phi i64 addrspace(1)* [ %tmp9, %bb11 ] is deleted and during processing of block bb19, GEP %tmp14 gets into cache as NonLocal entry with starting block of bb10. (Note, %tmp14 _cannot_ reach bb10!) Then, since on first iteration we have changed something, GVN proceeds to second iteration. Process repeats, but now translation of PHI %tmp16 translates to %tmp14 and founds valid cache entry for it (it now is direct user of %tmp9). But since it is NonLocal entry, it is simply ignored and not reported back to GVN as memory dependency... Any idea how to properly fix this? I have seen a review in which all uses of invalidated cache entry are invalidated as well. While it looks like it will fix my problem, it does not sounds right in this case... Thanks, Denis