thr3ads.net - llvm dev - [llvm-dev] [MemorySSA] Potential CachingMemorySSAWalker bug [May 2016]

If this information is useful, please help other people find it:
Share via:

Geoff Berry via llvm-dev

2016-May-02 19:22 UTC

[llvm-dev] [MemorySSA] Potential CachingMemorySSAWalker bug

I've put my changes to EarlyCSE that trigger this case up on phab here: 
http://reviews.llvm.org/D19821.  These changes depend on 
http://reviews.llvm.org/D19664 so that will need to be applied first.  
With these changes applied, the original attached .ll file should 
trigger this bug when compiled with opt -early-cse -early-cse-use-memoryssa

On 5/2/2016 2:34 PM, Daniel Berlin wrote:> (IE the thing we cache has to be the same no matter which version is 
> called. If it can't be for some reason, we can't cache results from
> both in the same cache. If it can be, i suspect there is a bug in 
> making that work :P)
>
> I suspect the latter, and not the former.
>
>
> On Mon, May 2, 2016 at 11:32 AM, Daniel Berlin <dberlin at dberlin.org 
> <mailto:dberlin at dberlin.org>> wrote:
>
>     I suspect something is pulling the RHS of the memorydef and
>     caching it for calls it should not be used for.
>     In particular, i suspect we are about to discover we can't cache
>     the results from both versions of getClobberingMemoryAccess
>     together, or that the cache is not always getting consistently
>     written.
>
>
>
>
>     On Mon, May 2, 2016 at 11:16 AM, George Burgess IV
>     <george.burgess.iv at gmail.com <mailto:george.burgess.iv at
gmail.com>>
>     wrote:
>
>         Yeah, that sounds like a fun bug. I'll take a look later today
>         and see what I can find out. :)
>
>         On Fri, Apr 29, 2016 at 2:55 PM, Geoff Berry
>         <gberry at codeaurora.org <mailto:gberry at
codeaurora.org>> wrote:
>
>             Hi guys,
>
>             I think I have run into another CachingMemorySSAWalker
>             cache bug. It's a bit tricky to reproduce, so I'd like
to
>             start by trying to show you what is happening when running
>             EarlyCSE with my local changes to use MemorySSA.  I've
>             attached a debug log that shows that the value returned by
>             getClobberingMemoryAccess(Inst) after a call to
>             removeMemoryAccess is wrong.  The MemorySSA node in
>             question is MemoryUse(7), and the corruption happens after
>             a call to remove MemoryUse(2), at which point its clobber
>             value changes to '1 = MemoryDef(liveOnEntry)'.  The
>             interesting thing is that is doesn't seem to be the first
>             call to getClobberingMemoryAccess after the removal that
>             causes the corruption, but rather the second.  You'll
>             notice that I added calls to getClobberingMemoryAccess
>             when doing MSSA.dump(), which is what I'm using to attempt
>             to figure out when the cache gets corrupted.
>
>             Hopefully this is enough information to debug the
>             problem.  If not perhaps we can look at getting my
>             EarlyCSE changes checked in in a disabled state so you can
>             reproduce the problem directly.  I'm also happy to help
>             debug it farther.
>
>             -- 
>             Geoff Berry
>             Employee of Qualcomm Innovation Center, Inc.
>               Qualcomm Innovation Center, Inc. is a member of Code Aurora
Forum, a Linux Foundation Collaborative Project
>
>
>
>
-- 
Geoff Berry
Employee of Qualcomm Innovation Center, Inc.
  Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
Foundation Collaborative Project

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160502/eb576e30/attachment.html>

George Burgess IV via llvm-dev

2016-May-03 18:53 UTC

head link

[llvm-dev] [MemorySSA] Potential CachingMemorySSAWalker bug

So, after looking at this, I think this is WAI, and the wonkiness is a
result of an arbitrary limit somewhere in BasicAA.

I say this because if you remove any unused GEP from your example, the
loads (MemoryUse(7), MemoryUse(7)) are both optimized to MemoryUse(1) when
we build MemorySSA. Add useless GEP back, and they turn back to
MemoryUse(7). This is valid, because BasicAA presumably sees that the
malloc'ed pointer doesn't escape, and it's marked with noalias. So,
BasicAA
can assume that it isn't clobbered by @foo.

With respect to why the clobbers are *different*, it looks like the walker
first sees the code as

load i32, i32* %i27
load i32, i32* %i28

...Which some transform then changes to

load i32, i32* %i27
load i32, i32* %i27

...And then requerys the walker.

Because the MemoryLocation on the second load changes, when you call
getClobberingAccess on it, we won't find a cache entry (indeed, we end up
walking all defs up to @malloc in this query). Because this lookup is
happening after some useless GEPs have been removed, AA is more aggressive
here, and lets us optimize the clobber of the second load.

This doesn't happen for the first load because its cache is still valid
(its MemoryLocation is still %i27).

I can't think of a good way to address this (AA offering more aggressive
results for seemingly unrelated IR changes) in MSSA. Danny -- any ideas? :/

On Mon, May 2, 2016 at 12:22 PM, Geoff Berry <gberry at codeaurora.org>
wrote:
> I've put my changes to EarlyCSE that trigger this case up on phab here:
> http://reviews.llvm.org/D19821.  These changes depend on
> http://reviews.llvm.org/D19664 so that will need to be applied first.
> With these changes applied, the original attached .ll file should trigger
> this bug when compiled with opt -early-cse -early-cse-use-memoryssa
> On 5/2/2016 2:34 PM, Daniel Berlin wrote:
>
> (IE the thing we cache has to be the same no matter which version is
> called. If it can't be for some reason, we can't cache results from
both in
> the same cache. If it can be, i suspect there is a bug in making that work
> :P)
>
>
> I suspect the latter, and not the former.
>
>
> On Mon, May 2, 2016 at 11:32 AM, Daniel Berlin <dberlin at
dberlin.org>
> wrote:
>
>> I suspect something is pulling the RHS of the memorydef and caching it
>> for calls it should not be used for.
>> In particular, i suspect we are about to discover we can't cache
the
>> results from both versions of getClobberingMemoryAccess together, or
that
>> the cache is not always getting consistently written.
>>
>>
>>
>>
>> On Mon, May 2, 2016 at 11:16 AM, George Burgess IV <
>> <george.burgess.iv at gmail.com>george.burgess.iv at
gmail.com> wrote:
>>
>>> Yeah, that sounds like a fun bug. I'll take a look later today
and see
>>> what I can find out. :)
>>>
>>> On Fri, Apr 29, 2016 at 2:55 PM, Geoff Berry < <gberry at
codeaurora.org>
>>> gberry at codeaurora.org> wrote:
>>>
>>>> Hi guys,
>>>>
>>>> I think I have run into another CachingMemorySSAWalker cache
bug.  It's
>>>> a bit tricky to reproduce, so I'd like to start by trying
to show you what
>>>> is happening when running EarlyCSE with my local changes to use
MemorySSA.
>>>> I've attached a debug log that shows that the value
returned by
>>>> getClobberingMemoryAccess(Inst) after a call to
removeMemoryAccess is
>>>> wrong.  The MemorySSA node in question is MemoryUse(7), and the
corruption
>>>> happens after a call to remove MemoryUse(2), at which point its
clobber
>>>> value changes to '1 = MemoryDef(liveOnEntry)'.  The
interesting thing is
>>>> that is doesn't seem to be the first call to
getClobberingMemoryAccess
>>>> after the removal that causes the corruption, but rather the
second.
>>>> You'll notice that I added calls to
getClobberingMemoryAccess when doing
>>>> MSSA.dump(), which is what I'm using to attempt to figure
out when the
>>>> cache gets corrupted.
>>>> Hopefully this is enough information to debug the problem.  If
not
>>>> perhaps we can look at getting my EarlyCSE changes checked in
in a disabled
>>>> state so you can reproduce the problem directly.  I'm also
happy to help
>>>> debug it farther.
>>>>
>>>> --
>>>> Geoff Berry
>>>> Employee of Qualcomm Innovation Center, Inc.
>>>>  Qualcomm Innovation Center, Inc. is a member of Code Aurora
Forum, a Linux Foundation Collaborative Project
>>>>
>>>>
>>>
>>
>
> --
> Geoff Berry
> Employee of Qualcomm Innovation Center, Inc.
>  Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
Foundation Collaborative Project
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160503/78ef42f5/attachment-0001.html>

Daniel Berlin via llvm-dev

2016-May-03 19:03 UTC

head link

[llvm-dev] [MemorySSA] Potential CachingMemorySSAWalker bug

We shouldn't really fix MSSA here. As we push MSSA through the compiler,
and as the new pass manager stuff gets pushed forward, we should eventually
be able to have a nice pass for all the little expensive things that
basicaa is doing, and recompute it/update it as necessary.

TL;DR BasicAA was meant to be stateless, and we should move the things that
really require state to do well out of it and compute those separately,
which will let us get rid of arbitrary walking limits.

In particular, i expect the limit you are hitting is the decomposed gep
limit.  This is really a forward propagation problem being solved by
walking in reverse :)
The solution is to just solve it as a forward propagation problem and
update results when we need to - because it's just a forward propagation
problem, updating is simply rerunning the propagation problem for things
that have changed after removing their lattice values. it should only
propagate just what is actually necessary.




On Tue, May 3, 2016 at 11:53 AM, George Burgess IV <
george.burgess.iv at gmail.com> wrote:
> So, after looking at this, I think this is WAI, and the wonkiness is a
> result of an arbitrary limit somewhere in BasicAA.
>
> I say this because if you remove any unused GEP from your example, the
> loads (MemoryUse(7), MemoryUse(7)) are both optimized to MemoryUse(1) when
> we build MemorySSA. Add useless GEP back, and they turn back to
> MemoryUse(7). This is valid, because BasicAA presumably sees that the
> malloc'ed pointer doesn't escape, and it's marked with noalias.
So, BasicAA
> can assume that it isn't clobbered by @foo.
>
> With respect to why the clobbers are *different*, it looks like the walker
> first sees the code as
>
> load i32, i32* %i27
> load i32, i32* %i28
>
> ...Which some transform then changes to
>
> load i32, i32* %i27
> load i32, i32* %i27
>
> ...And then requerys the walker.
>
> Because the MemoryLocation on the second load changes, when you call
> getClobberingAccess on it, we won't find a cache entry (indeed, we end
up
> walking all defs up to @malloc in this query). Because this lookup is
> happening after some useless GEPs have been removed, AA is more aggressive
> here, and lets us optimize the clobber of the second load.
>
> This doesn't happen for the first load because its cache is still valid
> (its MemoryLocation is still %i27).
>
> I can't think of a good way to address this (AA offering more
aggressive
> results for seemingly unrelated IR changes) in MSSA. Danny -- any ideas? :/
>
> On Mon, May 2, 2016 at 12:22 PM, Geoff Berry <gberry at
codeaurora.org>
> wrote:
>
>> I've put my changes to EarlyCSE that trigger this case up on phab
here:
>> http://reviews.llvm.org/D19821.  These changes depend on
>> http://reviews.llvm.org/D19664 so that will need to be applied first.
>> With these changes applied, the original attached .ll file should
trigger
>> this bug when compiled with opt -early-cse -early-cse-use-memoryssa
>> On 5/2/2016 2:34 PM, Daniel Berlin wrote:
>>
>> (IE the thing we cache has to be the same no matter which version is
>> called. If it can't be for some reason, we can't cache results
from both in
>> the same cache. If it can be, i suspect there is a bug in making that
work
>> :P)
>>
>>
>> I suspect the latter, and not the former.
>>
>>
>> On Mon, May 2, 2016 at 11:32 AM, Daniel Berlin <dberlin at
dberlin.org>
>> wrote:
>>
>>> I suspect something is pulling the RHS of the memorydef and caching
it
>>> for calls it should not be used for.
>>> In particular, i suspect we are about to discover we can't
cache the
>>> results from both versions of getClobberingMemoryAccess together,
or that
>>> the cache is not always getting consistently written.
>>>
>>>
>>>
>>>
>>> On Mon, May 2, 2016 at 11:16 AM, George Burgess IV <
>>> <george.burgess.iv at gmail.com>george.burgess.iv at
gmail.com> wrote:
>>>
>>>> Yeah, that sounds like a fun bug. I'll take a look later
today and see
>>>> what I can find out. :)
>>>>
>>>> On Fri, Apr 29, 2016 at 2:55 PM, Geoff Berry < <gberry at
codeaurora.org>
>>>> gberry at codeaurora.org> wrote:
>>>>
>>>>> Hi guys,
>>>>>
>>>>> I think I have run into another CachingMemorySSAWalker
cache bug.
>>>>> It's a bit tricky to reproduce, so I'd like to
start by trying to show you
>>>>> what is happening when running EarlyCSE with my local
changes to use
>>>>> MemorySSA.  I've attached a debug log that shows that
the value returned by
>>>>> getClobberingMemoryAccess(Inst) after a call to
removeMemoryAccess is
>>>>> wrong.  The MemorySSA node in question is MemoryUse(7), and
the corruption
>>>>> happens after a call to remove MemoryUse(2), at which point
its clobber
>>>>> value changes to '1 = MemoryDef(liveOnEntry)'.  The
interesting thing is
>>>>> that is doesn't seem to be the first call to
getClobberingMemoryAccess
>>>>> after the removal that causes the corruption, but rather
the second.
>>>>> You'll notice that I added calls to
getClobberingMemoryAccess when doing
>>>>> MSSA.dump(), which is what I'm using to attempt to
figure out when the
>>>>> cache gets corrupted.
>>>>> Hopefully this is enough information to debug the problem. 
If not
>>>>> perhaps we can look at getting my EarlyCSE changes checked
in in a disabled
>>>>> state so you can reproduce the problem directly.  I'm
also happy to help
>>>>> debug it farther.
>>>>>
>>>>> --
>>>>> Geoff Berry
>>>>> Employee of Qualcomm Innovation Center, Inc.
>>>>>  Qualcomm Innovation Center, Inc. is a member of Code
Aurora Forum, a Linux Foundation Collaborative Project
>>>>>
>>>>>
>>>>
>>>
>>
>> --
>> Geoff Berry
>> Employee of Qualcomm Innovation Center, Inc.
>>  Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
Linux Foundation Collaborative Project
>>
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160503/e2995988/attachment.html>

llvm dev - May 2016 - [MemorySSA] Potential CachingMemorySSAWalker bug

[llvm-dev] [MemorySSA] Potential CachingMemorySSAWalker bug

[llvm-dev] [MemorySSA] Potential CachingMemorySSAWalker bug

[llvm-dev] [MemorySSA] Potential CachingMemorySSAWalker bug