thr3ads.net - llvm dev - [LLVMdev] Code for late safepoint placement available [Jun 2014]

If this information is useful, please help other people find it:
Share via:

Philip Reames

2014-Jun-05 16:45 UTC

[LLVMdev] Code for late safepoint placement available

Thanks for the comments and for taking a look.

On 06/05/2014 02:19 AM, David Chisnall wrote:> Hi Philip,
>
> The first thing that I notice on looking at the code is the lack of
comments.  For example, about the only comment that I see in include
/llvm/IR/Statepoint.h is a note telling me that a class is only intended to be
used on the stack.  Doxygen comments and, for a feature like this, some
high-level overview of how it's intended to be used are essential.Agreed.  This is mandatory before upstreaming.  We've gone through a 
couple of implementations and before sharing I tried to go through and 
rip out the actively misleading comments.  Unfortunately, that didn't 
leave much left.  :(> We currently use address spaces on our architecture to differentiate
between traditional pointers and fat pointers (both of which are supported on
our system in hardware).  It would be nice, rather than hard-coding an address
space, to have a property on DataLayout that would tell you if a specific
address space contained GC pointers.  We could then designate an address space
that was both GCable and contained fat pointers, and another for non-GCable fat
pointers.Making this configurable is definitely desirable.  I like your idea of 
including a list of address spaces in data layout which contain GC 
pointers and making that query-able.  That sounds quite reasonable.  
Unless anyone makes another proposal, I'll run with this.
>
> The tests don't look to be in a sensible format for LLVM - in
particular, LLVM tests should not depend on clang.  It would be better, I think,
to commit the IR generated from your Python script than the script itself
(although putting it in utils might be helpful for people to regenerate them).By the point we upstream this, those particular tests should probably 
just be removed.  When we were iterating on possible designs, it was 
useful not having the tests in IR form.  At this point, those tests have 
lost most of their value.  Creating a set of IR tests which use address 
spaces properly and cover all the same corner cases is probably the 
right approach.

Philip>
> On 4 Jun 2014, at 17:35, Philip Reames <listmail at philipreames.com>
wrote:
>
>> As I've mentioned on the mailing list a couple of times over the
last few months, we've been working on an approach for supporting precise
fully relocating garbage collection in LLVM.  I am happy to announce that we now
have a version of the code available for public view and discussion.
>>
>> https://github.com/AzulSystems/llvm-late-safepoint-placement
>>
>> Our goal is to eventually see this merged into the LLVM tree. 
There's a fair amount of cleanup that needs to happen before that point, but
we are actively working towards that eventual goal.
>>
>> Please note that there are a couple of known issues with the current
version (see the README).  This is best considered a proof of concept
implementation and is not yet ready for production use.  We will be addressing
the remaining issues over the next few weeks and will be sharing updates as they
occur.
>>
>> In the meantime, I'd like to get the discussion started on how
these changes will eventually land in tree.  Part of the reason for sharing the
code in an early state is to be able to build a history of working in the open,
and to to able to merge minor fixes into the main LLVM repository before trying
to upstream the core changes.  We are aware this is a fairly major change set
and are happy to work within the community process in that regard.
>>
>> I've included a list of specific questions I know we'd like to
get feedback on, but general comments or questions are also very welcome.
>>
>> Open Topics:
>> 	• How should we factor the core GC support for review?  Our current
intent is to separate logically distinct pieces, and share each layer one at a
time.  (e.g. first infrastructure enhancements, then intrinsics and codegen
support, then verifiers, then safepoint insertion passes)  Is this the right
approach?
>> 	• How configurable does the GC support need to be for inclusion in
LLVM?  Currently, we expect the frontend to mark GC pointers using address
spaces.  Do we need to support alternate mechanisms?  If so, what interface
should this take?
>> 	• How should we approach removing the existing partial support for
garbage collection? (gcroot)  Do we want to support both going forward?  Do we
need to provide a forward migration path in bitcode?  Given the usage is
generally though MCJIT, we would prefer we simply deprecate the existing gcroot
support and target it for complete removal a couple of releases down the road..
>> 	• What programmatic interface should we present at the IR level and
where should it live?  We're moving towards a CallSite like interface for
statepoints, gc_relocates, and gc_results call sites.  Is this the right
approach?  If so, should it live in the IR subtree, or Support?  (Note: The
current code is only about 40% migrated to the new interface.)
>> 	• To support invokable calls with safepoints, we need to make the
statepoint intrinsic invokable.  This is new for intrinsics in LLVM.  Is there
any reason that InvokeInst must be a subclass of CallInst? (rather than a view
over either calls or invokes like CallSite)  Would changes to support invokable
intrinsics be accepted upstream?  Alternate approaches are welcome.
>> 	• Is the concept of an abstract VM state something LLVM should know
about?  If so, how should it be represented?  We're actively exploring this
topic, but don't have strong opinions on the topic yet.
>> 	• Our statepoint shares a lot in the way of implementation and
semantics with patchpoint and stackmap.  Is it better to submit new intrinsics,
or try to identify a single intrinsic which could represent both?  Our current
feeling is to keep them separate semantically, but share implementation where
possible.
>>
>> Yours,
>> Philip (& team)
>>
>> p.s. Sanjoy, one of my co-workers,  will be helping to answer questions
as they arise.
>>
>> p.p.s. For those wondering why the current gcroot mechanism isn't
sufficient, I covered that in a previous blog post:
>> [1] http://www.philipreames.com/Blog/2014/02/21/why-not-use-gcroot/
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Talin

2014-Jun-10 18:15 UTC

head link

[LLVMdev] Code for late safepoint placement available

On Thu, Jun 5, 2014 at 9:45 AM, Philip Reames <listmail at
philipreames.com>
wrote:
> Thanks for the comments and for taking a look.
>
>
> On 06/05/2014 02:19 AM, David Chisnall wrote:
>
>> Hi Philip,
>>
>> The first thing that I notice on looking at the code is the lack of
>> comments.  For example, about the only comment that I see in include
>> /llvm/IR/Statepoint.h is a note telling me that a class is only
intended to
>> be used on the stack.  Doxygen comments and, for a feature like this,
some
>> high-level overview of how it's intended to be used are essential.
>>
> Agreed.  This is mandatory before upstreaming.  We've gone through a
> couple of implementations and before sharing I tried to go through and rip
> out the actively misleading comments.  Unfortunately, that didn't leave
> much left.  :(
>
>  We currently use address spaces on our architecture to differentiate
>> between traditional pointers and fat pointers (both of which are
supported
>> on our system in hardware).  It would be nice, rather than hard-coding
an
>> address space, to have a property on DataLayout that would tell you if
a
>> specific address space contained GC pointers.  We could then designate
an
>> address space that was both GCable and contained fat pointers, and
another
>> for non-GCable fat pointers.
>>
> Making this configurable is definitely desirable.  I like your idea of
> including a list of address spaces in data layout which contain GC pointers
> and making that query-able.  That sounds quite reasonable.  Unless anyone
> makes another proposal, I'll run with this.

Does it need to be a list, or could it be a range? It the set of GC address
spaces was contiguous over some range of integers, then you'd only need to
store a min / max value, which might make the implementation simpler.
>
>
>
>> The tests don't look to be in a sensible format for LLVM - in
particular,
>> LLVM tests should not depend on clang.  It would be better, I think, to
>> commit the IR generated from your Python script than the script itself
>> (although putting it in utils might be helpful for people to regenerate
>> them).
>>
> By the point we upstream this, those particular tests should probably just
> be removed.  When we were iterating on possible designs, it was useful not
> having the tests in IR form.  At this point, those tests have lost most of
> their value.  Creating a set of IR tests which use address spaces properly
> and cover all the same corner cases is probably the right approach.
>
> Philip
>
>
>> On 4 Jun 2014, at 17:35, Philip Reames <listmail at
philipreames.com> wrote:
>>
>>  As I've mentioned on the mailing list a couple of times over the
last
>>> few months, we've been working on an approach for supporting
precise fully
>>> relocating garbage collection in LLVM.  I am happy to announce that
we now
>>> have a version of the code available for public view and
discussion.
>>>
>>> https://github.com/AzulSystems/llvm-late-safepoint-placement
>>>
>>> Our goal is to eventually see this merged into the LLVM tree. 
There's a
>>> fair amount of cleanup that needs to happen before that point, but
we are
>>> actively working towards that eventual goal.
>>>
>>> Please note that there are a couple of known issues with the
current
>>> version (see the README).  This is best considered a proof of
concept
>>> implementation and is not yet ready for production use.  We will be
>>> addressing the remaining issues over the next few weeks and will be
sharing
>>> updates as they occur.
>>>
>>> In the meantime, I'd like to get the discussion started on how
these
>>> changes will eventually land in tree.  Part of the reason for
sharing the
>>> code in an early state is to be able to build a history of working
in the
>>> open, and to to able to merge minor fixes into the main LLVM
repository
>>> before trying to upstream the core changes.  We are aware this is a
fairly
>>> major change set and are happy to work within the community process
in that
>>> regard.
>>>
>>> I've included a list of specific questions I know we'd like
to get
>>> feedback on, but general comments or questions are also very
welcome.
>>>
>>> Open Topics:
>>>         • How should we factor the core GC support for review?  Our
>>> current intent is to separate logically distinct pieces, and share
each
>>> layer one at a time.  (e.g. first infrastructure enhancements, then
>>> intrinsics and codegen support, then verifiers, then safepoint
insertion
>>> passes)  Is this the right approach?
>>>         • How configurable does the GC support need to be for
inclusion
>>> in LLVM?  Currently, we expect the frontend to mark GC pointers
using
>>> address spaces.  Do we need to support alternate mechanisms?  If
so, what
>>> interface should this take?
>>>         • How should we approach removing the existing partial
support
>>> for garbage collection? (gcroot)  Do we want to support both going
forward?
>>>  Do we need to provide a forward migration path in bitcode?  Given
the
>>> usage is generally though MCJIT, we would prefer we simply
deprecate the
>>> existing gcroot support and target it for complete removal a couple
of
>>> releases down the road..
>>>         • What programmatic interface should we present at the IR
level
>>> and where should it live?  We're moving towards a CallSite like
interface
>>> for statepoints, gc_relocates, and gc_results call sites.  Is this
the
>>> right approach?  If so, should it live in the IR subtree, or
Support?
>>>  (Note: The current code is only about 40% migrated to the new
interface.)
>>>         • To support invokable calls with safepoints, we need to
make
>>> the statepoint intrinsic invokable.  This is new for intrinsics in
LLVM.
>>>  Is there any reason that InvokeInst must be a subclass of
CallInst?
>>> (rather than a view over either calls or invokes like CallSite) 
Would
>>> changes to support invokable intrinsics be accepted upstream? 
Alternate
>>> approaches are welcome.
>>>         • Is the concept of an abstract VM state something LLVM
should
>>> know about?  If so, how should it be represented?  We're
actively exploring
>>> this topic, but don't have strong opinions on the topic yet.
>>>         • Our statepoint shares a lot in the way of implementation
and
>>> semantics with patchpoint and stackmap.  Is it better to submit new
>>> intrinsics, or try to identify a single intrinsic which could
represent
>>> both?  Our current feeling is to keep them separate semantically,
but share
>>> implementation where possible.
>>>
>>> Yours,
>>> Philip (& team)
>>>
>>> p.s. Sanjoy, one of my co-workers,  will be helping to answer
questions
>>> as they arise.
>>>
>>> p.p.s. For those wondering why the current gcroot mechanism
isn't
>>> sufficient, I covered that in a previous blog post:
>>> [1] http://www.philipreames.com/Blog/2014/02/21/why-not-use-gcroot/
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>
>>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>


-- 
-- Talin
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140610/a4c83f15/attachment.html>

Philip Reames

2014-Jun-13 16:23 UTC

head link

[LLVMdev] Code for late safepoint placement available

On 06/10/2014 11:15 AM, Talin wrote:>
>         We currently use address spaces on our architecture to
>         differentiate between traditional pointers and fat pointers
>         (both of which are supported on our system in hardware).  It
>         would be nice, rather than hard-coding an address space, to
>         have a property on DataLayout that would tell you if a
>         specific address space contained GC pointers.  We could then
>         designate an address space that was both GCable and contained
>         fat pointers, and another for non-GCable fat pointers.
>
>     Making this configurable is definitely desirable.  I like your
>     idea of including a list of address spaces in data layout which
>     contain GC pointers and making that query-able.  That sounds quite
>     reasonable.  Unless anyone makes another proposal, I'll run with
this.
>
>
> Does it need to be a list, or could it be a range? It the set of GC 
> address spaces was contiguous over some range of integers, then you'd 
> only need to store a min / max value, which might make the 
> implementation simpler.A range is also fine. David, does that meet your use case as well?

Philip
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140613/134bafd0/attachment.html>

llvm dev - Jun 2014 - [LLVMdev] Code for late safepoint placement available

[LLVMdev] Code for late safepoint placement available

[LLVMdev] Code for late safepoint placement available

[LLVMdev] Code for late safepoint placement available