thr3ads.net - llvm dev - [LLVMdev] Pointer vs Integer classification (was Re: make DataLayout a mandatory part of Module) [Feb 2014]

Philip Reames

2014-Feb-24 19:17 UTC

[LLVMdev] Pointer vs Integer classification (was Re: make DataLayout a mandatory part of Module)

On 02/24/2014 12:45 AM, Andrew Trick wrote:
> On Feb 21, 2014, at 10:37 AM, Philip Reames
> <listmail at philipreames.com> wrote:
>
>> On 02/14/2014 05:55 PM, Philip Reames wrote:
>>> Splitting out a conversation which started in "make DataLayout a
>>> mandatory part of Module" since the topic has decidedly changed.
>>> This also relates to the email "RFC: GEP as canonical form for
>>> pointer addressing" I just sent.
>>>
>>> On 02/10/2014 05:25 PM, Nick Lewycky wrote:
>>>> ...
>>>>
>>>> We're supposed to have the llvm.gcroots intrinsic for this
>>>> purpose, but you note that it prevents gc roots from being in
>>>> registers (they must be in memory somewhere, usually on the
>>>> stack), and that fixing it is more work than is reasonable.
>>> This is slightly off, but probably close to what I actually said
>>> even if not quite what I meant.  :)
>>>
>>> I'm going to skip this and respond with a fuller explanation
>>> Monday.  I'd written an explanation once, realized it was wrong,
>>> and decided I should probably revisit when fully awake.
>>>
>>> Fundamentally, I believe that gc.roots could be made to work, even
>>> with decent (but not optimal) performance in the end.  We may even
>>> contribute some patches towards fixing issues with the gc.root
>>> mechanism just to make a fair comparison.  I just don't believe
>>> it's the right approach or the best way to reach the end goal.
>> So, not quite on Monday, but I did get around to writing up an
>> explanation of what's wrong with using gcroot.  It turned out to be
>> much longer than I expected, so I turned it into a blog post:
>> http://www.philipreames.com/Blog/2014/02/21/why-not-use-gcroot/
>>
>> The very short version: gcroot loses roots (for any GC) due to bad
>> interaction with the optimizer, and gcroot doesn't capture all
>> copies of a pointer root which fundamentally breaks collectors which
>> relocate roots.  The only way I know to make gcroot (in its current
>> form) work reliably for all collectors is to insert safepoints very
>> early, which has highly negative performance impacts.  There are some
>> (potentially) cheaper but ugly hacks available if you don't need to
>> relocate roots.
>>
>> There's also going to be a follow up post on implementation
>> problems, but that's completely separate from the fundamental
>> problems.
>
> Thanks for the writeup. FWIW my understanding of gcroot has always
> been that the call to invoke GC is “extern” and not readonly, so we
> can’t do store->load forwarding on the escaped pointer across it. I
> have never used gcroot myself.

Andy, I'm not clear what you're trying to say here.  Could you
rephrase?  In particular, what do you mean by "call to invoke GC"?

Philip

Andrew Trick

2014-Feb-24 19:27 UTC

On Feb 24, 2014, at 11:17 AM, Philip Reames <listmail at philipreames.com>
wrote:
> Andy, I'm not clear what you're trying to say here.  Could you
> rephrase?  In particular, what do you mean by "call to invoke GC"?

I mean a call site that we think of as a safepoint could potentially
call into the runtime and block while GC runs. We can’t let LLVM hoist
loads or sink stores across that call.

-Andy
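
To make the point about forwarding across a safepoint concrete, here is a minimal toy model in Python. It is not LLVM code and not how gcroot is implemented; `ToyHeap`, `safepoint`, and both `run_*` functions are hypothetical names invented for this sketch. It assumes a collector that moves every rooted object and rewrites only the registered root slots, so a value loaded before the call and reused after it (the effect of store->load forwarding) points at a stale address.

```python
class ToyHeap:
    """Toy relocating heap: objects live at integer addresses."""

    def __init__(self):
        self.objects = {}      # address -> payload
        self.root_slots = []   # registered root slots (one-element lists)
        self.next_addr = 0x1000

    def alloc(self, payload):
        addr = self.next_addr
        self.next_addr += 0x10
        self.objects[addr] = payload
        return addr

    def register_root(self, slot):
        # Stands in for marking a stack slot as a GC root.
        self.root_slots.append(slot)

    def safepoint(self):
        """Opaque call that may run GC: moves every rooted object and
        rewrites only the registered slots."""
        survivors = {}
        forwarding = {}
        for slot in self.root_slots:
            old = slot[0]
            if old not in forwarding:
                new = self.next_addr
                self.next_addr += 0x10
                survivors[new] = self.objects[old]
                forwarding[old] = new
            slot[0] = forwarding[old]
        self.objects = survivors


def run_with_reload(heap):
    slot = [heap.alloc("payload")]
    heap.register_root(slot)
    heap.safepoint()
    # Reload from the root slot after the call: sees the new address.
    return heap.objects.get(slot[0])


def run_with_forwarded_value(heap):
    slot = [heap.alloc("payload")]
    heap.register_root(slot)
    cached = slot[0]           # value carried across the safepoint
    heap.safepoint()
    # The cached copy was never updated, so the lookup misses.
    return heap.objects.get(cached)
```

In this model `run_with_reload` still finds the object, while `run_with_forwarded_value` does not, which is why the call has to be treated as a barrier for loads and stores of the root slot.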

Philip Reames

2014-Feb-24 21:26 UTC

On 02/24/2014 11:27 AM, Andrew Trick wrote:
> On Feb 24, 2014, at 11:17 AM, Philip Reames
> <listmail at philipreames.com> wrote:
>> Andy, I'm not clear what you're trying to say here.  Could you
>> rephrase?  In particular, what do you mean by "call to invoke GC"?
>
> I mean a call site that we think of as a safepoint could potentially
> call into the runtime and block while GC runs. We can’t let LLVM
> hoist loads or sink stores across that call.

Ah, okay.  I think I get where you're coming from.

For call safepoints, if you assume the call itself prevents the 
optimization, you're mostly okay.  This is problematic if you want to 
have a safepoint on a read-only call (for example), but could be hacked 
around.

The problem comes up with backedge, function entry, and function return
safepoints.  Given there is no example in tree (or out of tree that I
know of) which uses these, it's a little hard to tell how it's supposed
to work.  My belief is that the findCustomSafePoints callback on
GCStrategy is supposed to insert these.  The problem is that the pass
that calls it is a MachineFunction pass, which runs long after
optimization.

The alternate approach - which I believe you're assuming - is to insert
calls for each safepoint explicitly before optimization.  (In the blog
post, I refer to this as "early safepoint insertion".)  Unless I'm
badly misreading both documentation and code, that's not how gcroot is
expecting to be used.  I agree that early safepoint insertion would
work from a correctness standpoint for the first example I listed.  It
doesn't solve the second case though.

Philip
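
As a rough illustration of what "early safepoint insertion" means for the non-call cases above, the following Python sketch places an explicit poll call at function entry, on each loop backedge, and before the return. It is only a model of the idea, not LLVM IR and not the gcroot API; `safepoint_poll`, `poll_log`, and `sum_values` are hypothetical names, and a real poll would check a runtime flag and possibly block for GC rather than append to a list.

```python
poll_log = []  # records which polls ran, in order (illustration only)


def safepoint_poll(kind):
    """Hypothetical poll: a real runtime would check a 'GC requested'
    flag here and block while the collector runs."""
    poll_log.append(kind)


def sum_values(values):
    safepoint_poll("entry")          # function entry safepoint
    total = 0
    for v in values:
        total += v
        safepoint_poll("backedge")   # one poll per loop iteration
    safepoint_poll("return")         # function return safepoint
    return total
```

Because each poll is an ordinary call that exists before any optimization runs, the optimizer has to treat it as a possible write to the root slots. That is the source of both the correctness benefit and the performance cost discussed above.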
