thr3ads.net - llvm dev - [LLVMdev] LLVM and managed languages [Dec 2011]

If this information is useful, please help other people find it:
Share via:

Talin

2011-Jul-01 18:05 UTC

[LLVMdev] LLVM and managed languages

So I've been using LLVM for about 4 years now, and I've posted a lot on
this
list about specific issues. What I would like to do is step back for a
moment and give my "big picture" assessment of LLVM overall,
particularly
with respect to developing a "managed" language like Java / C# or my
own
language, Tart.

Obviously, I feel that LLVM is the best choice out there, otherwise I would
of switched to using other alternatives long ago. At the same time, however,
I've observed that the great majority of experimental languages - Groovy,
Scala, and many others - are built on top of the Java virtual machine, which
seems to be the platform of choice for language experimenters. In a similar
vein, if you spend any time studying the academic research on garbage
collection algorithms, you'll see that the JVM also appears to be the
platform of choice here as well.

One question to ask is, what factors would weigh heavily in the decision to
choose the JVM over LLVM when implementing a managed language? On the
positive side, LLVM's mature and simple APIs make it a breeze to manipulate
types and functions, and it's modular architecture means I can use the parts
I want without being forced to use parts I don't need. And the ability to go
straight to machine code means that I don't need to ship a complex virtual
machine with my executables.

I would also like to list the areas where I *don't need help* from LLVM -
that is, these are things I can easily handle myself in the frontend:

   - Dynamic dispatch and virtual methods / multi-methods
   - Boxing and unboxing of value types
   - Reflection
   - Memory management

I mention these items for two reasons: First, because these are services
that *would* be provided by a JVM, and I want people to know that the lack
of these items in LLVM does not in any way count as a detriment. And
secondly, because I've occasionally heard people on this list ask for
features like these, and I think it would be a waste of time to spend effort
implementing something that the frontend can easily handle.

Next, I want to list the areas where I think LLVM is lacking:

The first issue is more of a community-related one, which is that IMHO LLVM
doesn't have a lot of forward momentum in the experimental languages space.
The vast majority of contributions to the LLVM code base fall into three
categories: Clang-related work, ports to new backends, and optimization
research. Parts of LLVM which don't fall into these categories - like
garbage collection for example - tend to languish.

Note that this is not a criticism, but merely an observation - if it were a
criticism I'd have to blame myself as much as anyone else. There are good
reasons why things are as they are - the few people like myself who are
working on non-C-like languages generally don't have the energy to both work
on their front ends and at the same time do any serious amount of work on
the LLVM infrastructure. (You all saw what happened with the union types
fiasco - I contributed all of the front-end parts for declaring and
serializing unions, in the hopes that someone more knowledgeable would pick
up the effort of implementing them on the code generation side - but that
never happened and the feature got removed after six months of no progress.)

I think that LLVM's best hope in this department is that one of these
experimental languages becomes popular enough to attract a serious
contributor base of it's own, and that some of that effort can bleed off
into improving LLVM. Some of that was actually starting to happen in the
case of unladen-swallow, before that project was shut down.

The rest of my issues are more specific:

*Garbage collection is still way too difficult*. The biggest problem is the
inability to track SSA values - it requires the frontend to generate very
inefficient and error-prone code, manually spilling SSA values to the stack
around nearly every function call. I've written proposals for improving the
situation, which I won't repeat here - but realistically there's no way
that
I am ever going to have time learn to enough about LLVM's code generation
passes to implement something like this myself.

Another area which I'd like to see supported is stack walking - that is,
starting from the current call frame, iterate through all of the call frames
above it. Currently the only way to do this is via platform-specific
assembly language - I'd think it would be relatively simple to make an
intrinsic that does this. (Note that the current stack intrinsics are
unusable, because they are (a) unreliable, and (b) inefficient. Taking an
integer index of a stack frame as a parameter is not the right way to do
it.)

I have to say, if there's any single issue that could make me give up on
LLVM and switch over to using something like a JVM, it would be this -
because as far as I can tell, LLVM's garbage collection features are in
exactly the same state they were in when I started working with LLVM four
years ago.

*Generating debug info* is another area which is very hard to use. The new
DIBuilder class was a significant improvement, but the whole area is still
very complex, it's extremely easy to make mistakes and then spend days or
weeks trying to figure out what went wrong - there's no type safety or
validation that can prevent you from making such mistakes. Over the last 4
years I've spent many man-months of time wrestling with DWARF.

(One also wonders if DWARF is even the right answer. Especially compared to
the JVM - the Java debuggers are able to derive much of their information
from the reflection and class metadata that's already there in the class
files, so from that perspective DWARF adds a huge amount of overhead.)

*Platform ABI limitations* - Currently LLVM requires the frontend developer
to know quite a lot about the platform ABI - for example whether you are
allowed to pass a struct of a certain size as a value type rather than as a
reference. The thing is, experimental languages like mine generally don't
care so much about ABI compatibility - sure, we'd like to be able to call C
library functions once in a while (and we don't mind doing the extra work in
those cases), but most of the time we just want to pass a data type around
and expect it to work. Requiring the use of different techniques on
different platforms makes the situation considerably more complex.

*Light-weight coroutines* would be a "nice to have", as would better
*concurrency
primitives*. These are things I could do on my own, but it would be better,
I think, to have them in LLVM - because in my view of the world, anything
that requires lots of architecture-specific knowledge ideally belongs on the
LLVM side of the line.

There's been a lot of discussion about divide-by-zero errors and other
*non-declared
exceptions*. Having this available would be a great help.

*Named structure types* - as per Chris's proposal - would be a major
simplification, as it would allow declaration of pointer fields without
having to import the code that defines the structure of the thing that the
pointer is pointing to.

Anyway that's pretty much my list of big-ticket items. As you can see,
anyone who cares about these issues would have to think very hard before
choosing LLVM over the JVM as an implementation platform.

-- 
-- Talin
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20110701/e6f6eba9/attachment.html>

Andrew Trick

2011-Jul-01 23:41 UTC

head link

[LLVMdev] LLVM and managed languages

Hi Talin,

I have some questions below. If these topics have already been discussed in
earlier threads, kindly point me there. I'm aware of your GC proposal, but
the rest is new to me.

On Jul 1, 2011, at 11:05 AM, Talin wrote:> Garbage collection is still way too difficult. The biggest problem is the
inability to track SSA values - it requires the frontend to generate very
inefficient and error-prone code, manually spilling SSA values to the stack
around nearly every function call. I've written proposals for improving the
situation, which I won't repeat here - but realistically there's no way
that I am ever going to have time learn to enough about LLVM's code
generation passes to implement something like this myself.
> 
> Another area which I'd like to see supported is stack walking - that
is, starting from the current call frame, iterate through all of the call frames
above it. Currently the only way to do this is via platform-specific assembly
language - I'd think it would be relatively simple to make an intrinsic that
does this. (Note that the current stack intrinsics are unusable, because they
are (a) unreliable, and (b) inefficient. Taking an integer index of a stack
frame as a parameter is not the right way to do it.)
> 
> I have to say, if there's any single issue that could make me give up
on LLVM and switch over to using something like a JVM, it would be this -
because as far as I can tell, LLVM's garbage collection features are in
exactly the same state they were in when I started working with LLVM four years
ago.
JVM implementers have done a lot of work in their respective compiler backend to
support moving collectors. I'm not aware of any efficient solution short of
making the backend aware of object pointer types. i.e. Machine operands need to
carry these types.

For stack walking, what do you need in addition to DWARF unwind? Or is
generating correct unwind tables the problem? I'm not sure what stack
intrinsics you mean... returnaddress and frameaddress? They don't always
work if you provide the necessary target options? It almost sounds like you want
more control than libunwind provides? Or do you want more efficiency?
> Light-weight coroutines would be a "nice to have", as would
better concurrency primitives.
This could be interesting. Can you explain in more detail or just point me to
some examples?
> There's been a lot of discussion about divide-by-zero errors and other
non-declared exceptions. Having this available would be a great help.
What is a non-declared exception (no throw statement in the source code?), and
do you think something is missing from LLVM IR? I don't see any difficulty
expressing Java/C# exceptions in LLVM IR. Do you want to build high-level
language semantics into LLVM IR to avoid duplicating effort across Java/C#/Tart
frontends? Or do you want better optimization/codegen in these cases
(language-friendly backend)?

Keep in mind that adding higher level IR constructs only inhibits optimization.
Java bytecode bakes language semantics into opcodes as a compression technique.
It's not a form that is amenable to optimization.

Maybe what you really want is a managed language toolkit for use in frontends
such as yours?

Backing up now to the overall topic, which of these general themes fits your
experience:
1) You want more efficient support for implementing advanced runtime features
for a given target
2) You want more ABI abstraction for easier porting across a range of targets

-Andy
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20110701/824d9bb0/attachment.html>

Andrew Trick

2011-Jul-01 23:49 UTC

head link

[LLVMdev] LLVM and managed languages

On Jul 1, 2011, at 4:41 PM, Andrew Trick wrote:> Java bytecode bakes language semantics into opcodes as a compression
technique. It's not a form that is amenable to optimization.
I mispoke slightly. "compressing" the bytecode is as much an
interpreter performance optimization as it is an I/O optimization.

By "not amenable to optimization" I meant high level bytecode is not a
suitable IR for an optimizing compiler.

I just assumed you're not developing an interpreted language. If so, then
develop your own bytecode or use the JVM :)

-Andy

Talin

2011-Jul-02 00:33 UTC

head link

[LLVMdev] LLVM and managed languages

On Fri, Jul 1, 2011 at 4:41 PM, Andrew Trick <atrick at apple.com> wrote:
> Hi Talin,
>
> I have some questions below. If these topics have already been discussed in
> earlier threads, kindly point me there. I'm aware of your GC proposal,
but
> the rest is new to me.
>
> On Jul 1, 2011, at 11:05 AM, Talin wrote:
>
> *Garbage collection is still way too difficult*. The biggest problem is
> the inability to track SSA values - it requires the frontend to generate
> very inefficient and error-prone code, manually spilling SSA values to the
> stack around nearly every function call. I've written proposals for
> improving the situation, which I won't repeat here - but realistically
> there's no way that I am ever going to have time learn to enough about
> LLVM's code generation passes to implement something like this myself.
>
> Another area which I'd like to see supported is stack walking - that
is,
> starting from the current call frame, iterate through all of the call
frames
> above it. Currently the only way to do this is via platform-specific
> assembly language - I'd think it would be relatively simple to make an
> intrinsic that does this. (Note that the current stack intrinsics are
> unusable, because they are (a) unreliable, and (b) inefficient. Taking an
> integer index of a stack frame as a parameter is not the right way to do
> it.)
>
> I have to say, if there's any single issue that could make me give up
on
> LLVM and switch over to using something like a JVM, it would be this -
> because as far as I can tell, LLVM's garbage collection features are in
> exactly the same state they were in when I started working with LLVM four
> years ago.
>
>
> JVM implementers have done a lot of work in their respective compiler
> backend to support moving collectors. I'm not aware of any efficient
> solution short of making the backend aware of object pointer types. i.e.
> Machine operands need to carry these types.
>
> So, I have a working implementation of a moving collector that uses theLLVM intrinsics. This consists of three components:

   - My frontend code that tells LLVM which data values contain traceable
   roots.
   - My custom GCStrategy class which uses information about the roots to
   generate a stack map for every function identifying the relative offset of
   each root.
   - My runtime library which traverses those stack maps.

All of this is pretty standard stuff. What's nice about it is that LLVM
doesn't really need to know any of the details of my garbage collection
algorithm, or even what a root looks like - I'm sure you saw my earlier
discussion of discriminated unions as roots, where the collector needs
access to the discriminator field to determine whether the payload is a
pointer or not. Since the llvm.gcroot intrinsic causes the root to be passed
to the strategy with no assumptions about its internal structure, that gives
me a lot of flexibility.

For having SSA values as roots, the situation is obviously more complex. One
simplifying assumption is that we don't try and trace register values
directly (because a structure may have been 'sliced' into multiple
registers
and it would be hard for the GCStrategy to handle that), but rather do what
we do now and say that the collector can only trace objects which live in
memory. Thus, we still need to spill SSA values into memory and reload them
- but that work should be done by LLVM and not by the frontend, because LLVM
has far more detailed knowledge of value lifetimes.

The other hard bit is how to 'mark' arbitrary values as roots. Right now
values are marked using the llvm.gcroot intrinsics, but that only works on
allocas. My current thinking is that we add a second bit to the
"Struct"
type (the first bit being isPacked) called "isRoot". If you want to
declare
a pointer as a root, then wrap it in a struct having that bit set - so you
have to add an extra 0 into your GEP to get the value out, but that's
trivial. Since a struct can contain any other data type, it makes it easy to
create roots having any desired structure. Again, these get communicated
from the frontend to the strategy pass without interpretation by LLVM,
except for one stipulation, which is that the structure isn't
'sliced', that
the offset given to the strategy pass points to a complete copy of the
struct.

> For stack walking, what do you need in addition to DWARF unwind? Or is
> generating correct unwind tables the problem? I'm not sure what stack
> intrinsics you mean... returnaddress and frameaddress? They don't
always
> work if you provide the necessary target options? It almost sounds like you
> want more control than libunwind provides? Or do you want more efficiency?
>
> Here's what my stack walker for x86 platforms looks like now:
struct CallFrame {
  CallFrame * prevFrame;
  void * returnAddr;
};

void GC_traceStack(tart_object * traceAction) {
  CallFrame * framePtr;
  #if _MSC_VER
    #if SIZEOF_VOID_PTR == 4
      __asm {
        mov framePtr, ebp
      }
    #else
      __asm {
        mov framePtr, rbp
      }
    #endif
  #else
    #if SIZEOF_VOID_PTR == 4
      __asm__("movl %%ebp, %0" :"=r"(framePtr));
    #else
      __asm__("movq %%rbp, %0" :"=r"(framePtr));
    #endif
  #endif

  while (framePtr != NULL) {
    void * returnAddr = framePtr->returnAddr;
    framePtr = framePtr->prevFrame;
    TraceDescriptor * tdesc = GC_lookupStackFrameDesc(returnAddr);
    if (tdesc != NULL) {
      TraceAction_traceDescriptors(traceAction, (void *)framePtr, tdesc);
    }
  }
}


Basically it gets the bp register as a pointer, and treats it as the head of
a linked list. It iterates through the list nodes until it finds NULL. For
each node, it finds the return address (which is stored next to the node
pointer), and looks that up in a map which returns the stack frame
descriptor for that function. As you can see this is extremely simple and
fast.

If I were designing a set of llvm intrinsics to replicate this, I would have
three functions: llvm.callframe.current() simply returns the value of the bp
register; llvm.callframe.next() takes the address of the previous stack
frame and returns its parent; and llvm.callframe.returnaddress() takes the
address of a call frame and returns the associated return address. These
three intrinsics should (I hope) be implementable on any platform that LLVM
targets, and would (I would think) each lower into a single instruction.

This would avoid me having to have a different set of inline assembly for
each different processor family that LLVM supports.
> *Light-weight coroutines* would be a "nice to have", as would
better *concurrency
> primitives*.
>
>
> This could be interesting. Can you explain in more detail or just point me
> to some examples?
>
> I'm thinking of the kind of things that Go or IBM's X10 language
does, orpossibly Python's generators. Each language is of course going to create its
own model of concurrency, but that model is going to be created from
lower-level primitives that are either provided by LLVM (things like
stack-swapping and thread-local variables) or provided by the various
operating system libraries (thread creation and semaphores and such.)
> There's been a lot of discussion about divide-by-zero errors and other
*non-declared
> exceptions*. Having this available would be a great help.
>
> What is a non-declared exception (no throw statement in the source code?),
> and do you think something is missing from LLVM IR? I don't see any
> difficulty expressing Java/C# exceptions in LLVM IR. Do you want to build
> high-level language semantics into LLVM IR to avoid duplicating effort
> across Java/C#/Tart frontends? Or do you want better optimization/codegen
in
> these cases (language-friendly backend)?
>
> There's been several threads on this lately, and I am by no means an
expertin this area, but I do understand that the basic problem of jumping out from
the middle of a basic block is a hard one to solve - but I would like to see
it solved. I can work with whatever solution you guys come up with.

> Keep in mind that adding higher level IR constructs only inhibits
> optimization. Java bytecode bakes language semantics into opcodes as a
> compression technique. It's not a form that is amenable to
optimization.
>
> Maybe what you really want is a managed language toolkit for use in
> frontends such as yours?
>
> Backing up now to the overall topic, which of these general themes fits
> your experience:
> 1) You want more efficient support for implementing advanced runtime
> features for a given target
> 2) You want more ABI abstraction for easier porting across a range of
> targets
>
> I think #2 is more interesting to me. It's not hard to do #1 for a
givenplatform, as there are libraries a-plenty to do most of this stuff.

Think of it this way: The name of the project is LLVM: "Low-Level Virtual
Machine". I realize that's a bit of a misnomer, in that LLVM isn't
primarily
about executing code, but more about preparing code for execution. (I
suppose we could call it LLISEE - Low Level Instruction Set and Execution
Environment - but that's a mouthful. You could also call it LLVVM - Low
Level "Virtual" Virtual Machine, in that the VM doesn't actually
exist). But
the "VM" part of the name seems to imply that it's not just a
compiler, but
a framework for execution and for building environments in which execution
happens.

-Andy



-- 
-- Talin
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20110701/74ea83a5/attachment.html>

Chris Lattner

2011-Jul-03 16:07 UTC

head link

[LLVMdev] LLVM and managed languages

Hi Talin,

Interesting post,

On Jul 1, 2011, at 11:05 AM, Talin wrote:> Garbage collection is still way too difficult. The biggest problem is the
inability to track SSA values - it requires the
> Light-weight coroutines would be a "nice to have", as would
better concurrency primitives. These are things I could
Tackling the "still way too difficult" and "should be more
batteries included" aspects of your post, I think that there is a lot of
room for an LLVM subproject that provides a "language implementors
toolbox" in the form of some really well documented and standardized
runtime libraries that provide GC, synchronization, etc capabilities.  Having
them available and reusable (and sub-settable!) would allow code sharing across
a wide range of different llvm language implementation.

These are the capabilities that the JVM provides languages implemented on top of
it, and are clearly useful.  Of course the power of LLVM is that it doesn't
force you to use a *particular* set of synch primitives, a particular memory
management scheme, etc.  However, that doesn't mean that we can't
provide a toolbox of parts to choose from, just like we provide a standard
optimization sequence which clients can choose to use or not.

-Chris
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20110703/7afde528/attachment.html>

Talin

2011-Jul-05 03:44 UTC

head link

[LLVMdev] LLVM and managed languages

On Sun, Jul 3, 2011 at 9:07 AM, Chris Lattner <clattner at apple.com>
wrote:
> Hi Talin,
>
> Interesting post,
>
> On Jul 1, 2011, at 11:05 AM, Talin wrote:
>
> *Garbage collection is still way too difficult*. The biggest problem is
> the inability to track SSA values - it requires the
>
>
> *Light-weight coroutines* would be a "nice to have", as would
better *concurrency
> primitives*. These are things I could
>
>
> Tackling the "still way too difficult" and "should be more
batteries
> included" aspects of your post, I think that there is a lot of room
for an
> LLVM subproject that provides a "language implementors toolbox"
in the form
> of some really well documented and standardized runtime libraries that
> provide GC, synchronization, etc capabilities.  Having them available and
> reusable (and sub-settable!) would allow code sharing across a wide range
of
> different llvm language implementation.
>
> Well, sort of. Things like GC tracing of SSA values can't really be
done asa library, i'm assuming it requires changes to the code generator.

Things like iterating stack frames can be done as an add-on library, but I
think they would work better as intrinsics - which again requires changes to
the code generators.

Things like stack switching could go either way I guess.

Most of the things that *could* go in an add-on library I could write
myself. The primary exception is cases where the library code has to be
different for each target platform - in such cases what you'd really like is
for each different platform-specific implementation to have a maintainer who
is an expert on that platform.

This is not to say that such a library as you propose would not be useful
and convenient.

Of course, I'm assuming that the typical language implementor has skills and
needs similar to my own, which might not be true. In general, I try to be an
expert on the first half of the dragon book, and leave the second half to
experts who know far more than I do.

> These are the capabilities that the JVM provides languages implemented on
> top of it, and are clearly useful.  Of course the power of LLVM is that it
> doesn't force you to use a *particular* set of synch primitives, a
> particular memory management scheme, etc.  However, that doesn't mean
that
> we can't provide a toolbox of parts to choose from, just like we
provide a
> standard optimization sequence which clients can choose to use or not.
>
> -Chris
>

-- 
-- Talin
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20110704/c4ea0d76/attachment.html>

Talin

2011-Jul-05 18:35 UTC

head link

[LLVMdev] LLVM and managed languages

On Sun, Jul 3, 2011 at 9:07 AM, Chris Lattner <clattner at apple.com>
wrote:
> Hi Talin,
>
> Interesting post,
>
> On Jul 1, 2011, at 11:05 AM, Talin wrote:
>
> *Garbage collection is still way too difficult*. The biggest problem is
> the inability to track SSA values - it requires the
>
>
> *Light-weight coroutines* would be a "nice to have", as would
better *concurrency
> primitives*. These are things I could
>
>
> Tackling the "still way too difficult" and "should be more
batteries
> included" aspects of your post, I think that there is a lot of room
for an
> LLVM subproject that provides a "language implementors toolbox"
in the form
> of some really well documented and standardized runtime libraries that
> provide GC, synchronization, etc capabilities.  Having them available and
> reusable (and sub-settable!) would allow code sharing across a wide range
of
> different llvm language implementation.
>
> A few additional thoughts: The language implementer's library that youpropose would include a whole bunch of stuff that I didn't mention in the
original post because it's stuff that I already have, but which might be
useful to other language implementers. Some examples:

   - DWARF exception table decoding functions, useful for writing
   personality functions.
   - Other functions for accessing DWARF data at runtime - useful for
   generating stack dumps.
   - Aligned memory allocators. It's useful for some garbage collection
   algorithms to be able to allocate heap blocks that are aligned on a large
   power-of-two boundary (like 64K boundary) so that you can quickly determine
   which memory pool an address belongs to by looking at the upper bits of the
   address. (Mainly this would just be a standardized interface around the
   platform's C library functions that do this, such as posix_memalign.)
   - An efficient map class for mapping function return addresses to a
   pointer. This map would be initialized on program start from a table of
   tuples and would be immutable after that. One use of this is finding the
   stack map for a given function return address.
   - Integer/float parsing and printing functions for building your
   language's equivalent of 'printf'.
   - Helper functions for finding and running static constructors.
   - Getting and setting thread-local data in a way that uses the __thread
   attribute if available, and pthreads if not. (I notice that
   pthread_getspecific is extremely efficient on OS X.)
   - Low-level debugging support.

One requirement is that all of these functions should be written in such as
way as to not require the C++ runtime libraries, only C libs.

These are the capabilities that the JVM provides languages implemented
on> top of it, and are clearly useful.  Of course the power of LLVM is that it
> doesn't force you to use a *particular* set of synch primitives, a
> particular memory management scheme, etc.  However, that doesn't mean
that
> we can't provide a toolbox of parts to choose from, just like we
provide a
> standard optimization sequence which clients can choose to use or not.
>
> -Chris
>


-- 
-- Talin
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20110705/28534ba8/attachment.html>

Geoff Reedy

2011-Jul-06 16:04 UTC

head link

[LLVMdev] LLVM and managed languages

On Fri, Jul 01, 2011 at 11:05:44AM -0700, Talin said> So I've been using LLVM for about 4 years now, and I've posted a
lot on this
> list about specific issues. What I would like to do is step back for a
> moment and give my "big picture" assessment of LLVM overall,
particularly
> with respect to developing a "managed" language like Java / C# or
my own
> language, Tart.
I'm working on an LLVM backend for the Scala compiler[1], so I'm very
interested in the issues you discuss here. Some of them I have
encountered myself already and others I can see on the horizon. I am new
to the list and offer apologies and request pointers if I cover ground
that's been trodden before.

[1] http://greedy.github.com/scala/
> I would also like to list the areas where I *don't need help* from LLVM
-
> that is, these are things I can easily handle myself in the frontend:
> 
>    - Dynamic dispatch and virtual methods / multi-methods
>    - Boxing and unboxing of value types
>    - Reflection
>    - Memory management
> 
> I mention these items for two reasons: First, because these are services
> that *would* be provided by a JVM, and I want people to know that the lack
> of these items in LLVM does not in any way count as a detriment. And
> secondly, because I've occasionally heard people on this list ask for
> features like these, and I think it would be a waste of time to spend
effort
> implementing something that the frontend can easily handle.
Agree 100%.
> The rest of my issues are more specific:
> 
> *Garbage collection is still way too difficult*. The biggest problem is the
> inability to track SSA values - it requires the frontend to generate very
> inefficient and error-prone code, manually spilling SSA values to the stack
> around nearly every function call. I've written proposals for improving
the
> situation, which I won't repeat here - but realistically there's no
way that
> I am ever going to have time learn to enough about LLVM's code
generation
> passes to implement something like this myself.
I'd appreciate a pointer to the previous proposal and discussion. I
assume that you're talking about automatic spilling of SSA roots to
stack slots at safe points? Is it possible that you could attach
metadata to these SSA values and generate the appropriate LLVM IR to
spill all the values that dominate the safe point in the GC strategy?
> Another area which I'd like to see supported is stack walking - that
is,
> starting from the current call frame, iterate through all of the call
frames
> above it. Currently the only way to do this is via platform-specific
> assembly language - I'd think it would be relatively simple to make an
> intrinsic that does this. (Note that the current stack intrinsics are
> unusable, because they are (a) unreliable, and (b) inefficient. Taking an
> integer index of a stack frame as a parameter is not the right way to do
> it.)
Do you have something in mind similar to the interface provided by
libunwind (http://www.nongnu.org/libunwind/docs.html) but lighter weight
and integrated with LLVM?
> I have to say, if there's any single issue that could make me give up
on
> LLVM and switch over to using something like a JVM, it would be this -
> because as far as I can tell, LLVM's garbage collection features are in
> exactly the same state they were in when I started working with LLVM four
> years ago.
In my mind one of the frustrations trying to use the GC support in LLVM
is the somewhat sketchy documentation. I'd be nice to have a
straightforward end-to-end example. Since you've implemented a GC using
this support maybe you could write something up. I know I'd appreciate
it a lot. Do you find that the status table in the GC document is still
accurate? It seems like the features you'd like to see implemented most
here are

 - Emitting code at safe points
 - Register maps
 - Liveness analysis (it appears that the new lifetime intrinsics may
   enable this)
> *Platform ABI limitations* - Currently LLVM requires the frontend developer
> to know quite a lot about the platform ABI - for example whether you are
> allowed to pass a struct of a certain size as a value type rather than as a
> reference. The thing is, experimental languages like mine generally
don't
> care so much about ABI compatibility - sure, we'd like to be able to
call C
> library functions once in a while (and we don't mind doing the extra
work in
> those cases), but most of the time we just want to pass a data type around
> and expect it to work. Requiring the use of different techniques on
> different platforms makes the situation considerably more complex.
Is it reasonable to add a new calling convention that like fastcc
doesn't need to worry about conforming to any ABI but instead of trying
to go fast, provides maximum flexibility?
> *Light-weight coroutines* would be a "nice to have", as would
better
> *concurrency
> primitives*. These are things I could do on my own, but it would be better,
> I think, to have them in LLVM - because in my view of the world, anything
> that requires lots of architecture-specific knowledge ideally belongs on
the
> LLVM side of the line.
> 
> There's been a lot of discussion about divide-by-zero errors and other
> *non-declared
> exceptions*. Having this available would be a great help.
Agreed. I have read some of the proposals about this and it does indeed
seem tricky. At the least the instructions that could raise an exception
would have to be marked in some way so that side-affecting code isn't
moved past them.
> *Named structure types* - as per Chris's proposal - would be a major
> simplification, as it would allow declaration of pointer fields without
> having to import the code that defines the structure of the thing that the
> pointer is pointing to.
I would really like to see this. It would also help when inspecting IR.
Right now if types have been uniqued all structures with the same layout
get collapsed to a single name which makes things awkward to read.

-- Geoff

Talin

2011-Jul-06 17:43 UTC

head link

[LLVMdev] LLVM and managed languages

On Wed, Jul 6, 2011 at 9:04 AM, Geoff Reedy <geoff at
programmer-monk.net>wrote:
> On Fri, Jul 01, 2011 at 11:05:44AM -0700, Talin said
> > So I've been using LLVM for about 4 years now, and I've posted
a lot on
> this
> > list about specific issues. What I would like to do is step back for a
> > moment and give my "big picture" assessment of LLVM overall,
particularly
> > with respect to developing a "managed" language like Java /
C# or my own
> > language, Tart.
>
> I'm working on an LLVM backend for the Scala compiler[1], so I'm
very
> interested in the issues you discuss here. Some of them I have
> encountered myself already and others I can see on the horizon. I am new
> to the list and offer apologies and request pointers if I cover ground
> that's been trodden before.
>
> [1] http://greedy.github.com/scala/
>
> > I would also like to list the areas where I *don't need help* from
LLVM -
> > that is, these are things I can easily handle myself in the frontend:
> >
> >    - Dynamic dispatch and virtual methods / multi-methods
> >    - Boxing and unboxing of value types
> >    - Reflection
> >    - Memory management
> >
> > I mention these items for two reasons: First, because these are
services
> > that *would* be provided by a JVM, and I want people to know that the
> lack
> > of these items in LLVM does not in any way count as a detriment. And
> > secondly, because I've occasionally heard people on this list ask
for
> > features like these, and I think it would be a waste of time to spend
> effort
> > implementing something that the frontend can easily handle.
>
> Agree 100%.
>
> > The rest of my issues are more specific:
> >
> > *Garbage collection is still way too difficult*. The biggest problem
is
> the
> > inability to track SSA values - it requires the frontend to generate
very
> > inefficient and error-prone code, manually spilling SSA values to the
> stack
> > around nearly every function call. I've written proposals for
improving
> the
> > situation, which I won't repeat here - but realistically
there's no way
> that
> > I am ever going to have time learn to enough about LLVM's code
generation
> > passes to implement something like this myself.
>
> I'd appreciate a pointer to the previous proposal and discussion. I
> assume that you're talking about automatic spilling of SSA roots to
> stack slots at safe points? Is it possible that you could attach
> metadata to these SSA values and generate the appropriate LLVM IR to
> spill all the values that dominate the safe point in the GC strategy?
>
> Actually, my ideas have changed somewhat, so I'll start a new thread
withthe most recent incarnation.

>  > Another area which I'd like to see supported is stack walking -
that is,
> > starting from the current call frame, iterate through all of the call
> frames
> > above it. Currently the only way to do this is via platform-specific
> > assembly language - I'd think it would be relatively simple to
make an
> > intrinsic that does this. (Note that the current stack intrinsics are
> > unusable, because they are (a) unreliable, and (b) inefficient. Taking
an
> > integer index of a stack frame as a parameter is not the right way to
do
> > it.)
>
> Do you have something in mind similar to the interface provided by
> libunwind (http://www.nongnu.org/libunwind/docs.html) but lighter weight
> and integrated with LLVM?
>
> You're in luck - I happen to have a document that explains this very
thing:
https://docs.google.com/document/pub?id=1-ws0KYo47S0CgqpwkjfWDBJ8wFhW_0UYKxPIJ0TyKrQ
> I have to say, if there's any single issue that could make me give up
on
> > LLVM and switch over to using something like a JVM, it would be this -
> > because as far as I can tell, LLVM's garbage collection features
are in
> > exactly the same state they were in when I started working with LLVM
four
> > years ago.
>
> In my mind one of the frustrations trying to use the GC support in LLVM
> is the somewhat sketchy documentation. I'd be nice to have a
> straightforward end-to-end example. Since you've implemented a GC using
> this support maybe you could write something up. I know I'd appreciate
> it a lot. Do you find that the status table in the GC document is still
> accurate? It seems like the features you'd like to see implemented most
> here are
>
>  - Emitting code at safe points
>  - Register maps
>  - Liveness analysis (it appears that the new lifetime intrinsics may
>   enable this)
>
> > *Platform ABI limitations* - Currently LLVM requires the frontend
> developer
> > to know quite a lot about the platform ABI - for example whether you
are
> > allowed to pass a struct of a certain size as a value type rather than
as
> a
> > reference. The thing is, experimental languages like mine generally
don't
> > care so much about ABI compatibility - sure, we'd like to be able
to call
> C
> > library functions once in a while (and we don't mind doing the
extra work
> in
> > those cases), but most of the time we just want to pass a data type
> around
> > and expect it to work. Requiring the use of different techniques on
> > different platforms makes the situation considerably more complex.
>
> Is it reasonable to add a new calling convention that like fastcc
> doesn't need to worry about conforming to any ABI but instead of trying
> to go fast, provides maximum flexibility?
>
> I'll have to defer to the experts on this one.
> > *Light-weight coroutines* would be a "nice to have", as
would better
> > *concurrency
> > primitives*. These are things I could do on my own, but it would be
> better,
> > I think, to have them in LLVM - because in my view of the world,
anything
> > that requires lots of architecture-specific knowledge ideally belongs
on
> the
> > LLVM side of the line.
> >
> > There's been a lot of discussion about divide-by-zero errors and
other
> > *non-declared
> > exceptions*. Having this available would be a great help.
>
> Agreed. I have read some of the proposals about this and it does indeed
> seem tricky. At the least the instructions that could raise an exception
> would have to be marked in some way so that side-affecting code isn't
> moved past them.
>
> There have been a couple of proposals put forward, but nothing hasmaterialized yet. Here is one:

http://code.google.com/p/llvm-stack-switch/wiki/Proposal


There's also this thread:

http://comments.gmane.org/gmane.comp.compilers.llvm.devel/38987


>  > *Named structure types* - as per Chris's proposal - would be a
major
> > simplification, as it would allow declaration of pointer fields
without
> > having to import the code that defines the structure of the thing that
> the
> > pointer is pointing to.
>
> I would really like to see this. It would also help when inspecting IR.
> Right now if types have been uniqued all structures with the same layout
> get collapsed to a single name which makes things awkward to read.
>
> Yep!
> -- Geoff
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>


-- 
-- Talin
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20110706/88114ee9/attachment.html>

Jon Harrop

2011-Dec-04 21:38 UTC

head link

[LLVMdev] LLVM and managed languages

Hi Talin,

I wrote HLVM:

  http://www.ffconsultancy.com/ocaml/hlvm/

The whole project took a few man months. HLVM provides an ML-like type system,
unboxed tuples, discriminated unions, TCO, generic printing, C FFI, POSIX
threads, mark-sweep GC and both JIT and standalone compilation. I wrote several
(accurate) garbage collectors for it including a stop-the-world mark-sweep
collector and they took a few days each. I did so by implementing my own shadow
stack along the lines of Henderson's "Accurate garbage collection in an
uncooperative environment". HLVM was only ever a hobby project. IMHO, the
results were incredible and are a testament to the awesomeness of LLVM.

HLVM prototyped many interesting ideas but the most relevant here is the use of
value types to avoid heap allocation in order to alleviate the garbage
collector. For example, HLVM is about 3x faster than Java on the following ray
tracing benchmark because HLVM does no allocation or collection whilst tracing
whereas Java does a huge amount:

 
http://flyingfrogblog.blogspot.com/2010/01/hlvm-on-ray-tracer-language-comparison.html

I believe that benchmark shows that even the most heavily optimized GC can be
thrashed by a program that simply avoids GC. Value types make this possible and
the lack of value types on the JVM is, therefore, a major issue.

Now, I have some concerns about the statements being made here regarding GC
support in LLVM. Let me begin by addressing the statement:
> Garbage collection is still way too difficult.
This is completely untrue. You can easily write your own GC simply by storing
roots explicitly in your own shadow stack. Contrary to what many people told me
before I did this, performance is fine for many applications:

http://flyingfrogblog.blogspot.com/2009/03/current-shadow-stack-overheads-in-hlvm.html

I believe the conclusion you meant to draw is that writing a GC using LLVM's
highly experimental GC support is way too difficult but that is a very different
statement.

This is a really important point because it means that all of the major changes
discussed so far are just optimizations beyond a simple shadow stack. Moreover,
I don't think we have any idea how valuable these optimizations might be.
Therefore, I think this discussion needs to start with an objective survey of
the possible approaches and some performance estimates.

HLVM's benchmarks indicate that the shadow stack overhead is usually under
20% but can be as high as 70% with purely functional code. HLVM currently pushes
all roots onto the shadow stack as soon as they come into scope and resets the
shadow stack pointer at each function exit. A better approach would be to defer
pushing until safe points (if any) and push only live roots (i.e. those referred
to again after the safe point).

I believe HLVM could attain competitive performance even on purely functional
data structures by using a non-moving mark-region collector. I have written
several prototypes in C++ but they are very much work-in-progress. As the GCs
are non-moving there is no need to restore pointers after a safe point, which
sounds more efficient to me. The downside is that temporaries are no longer
allocated by bumping a pointer into the nursery generation but my mark-region GC
uses a small bitwise LUT to find the next free block in the current region which
is almost as fast.

Others have said that the best possible performance is obtained by making the
back-end aware of GC pointers. I can well believe that but the cost of doing so
with LLVM is so large that I think it will be necessary to sacrifice optimality
for something almost as good that is attainable with a lot less effort. In fact,
I would much rather see a separate project that sits on top of LLVM and provides
a safe high-level garbage collected VM for language experimenters to use.

Incidentally, if anyone wants to get up to speed on GC design I highly recommend
Richard Jones' new book on the subject:

http://www.amazon.co.uk/Garbage-Collection-Handbook-Management-Algorithms/dp/1420082795

Cheers,
Jon.

From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On
Behalf Of Talin
Sent: 01 July 2011 19:06
To: LLVM Developers Mailing List
Subject: [LLVMdev] LLVM and managed languages

So I've been using LLVM for about 4 years now, and I've posted a lot on
this list about specific issues. What I would like to do is step back for a
moment and give my "big picture" assessment of LLVM overall,
particularly with respect to developing a "managed" language like Java
/ C# or my own language, Tart.

Obviously, I feel that LLVM is the best choice out there, otherwise I would of
switched to using other alternatives long ago. At the same time, however,
I've observed that the great majority of experimental languages - Groovy,
Scala, and many others - are built on top of the Java virtual machine, which
seems to be the platform of choice for language experimenters. In a similar
vein, if you spend any time studying the academic research on garbage collection
algorithms, you'll see that the JVM also appears to be the platform of
choice here as well.

One question to ask is, what factors would weigh heavily in the decision to
choose the JVM over LLVM when implementing a managed language? On the positive
side, LLVM's mature and simple APIs make it a breeze to manipulate types and
functions, and it's modular architecture means I can use the parts I want
without being forced to use parts I don't need. And the ability to go
straight to machine code means that I don't need to ship a complex virtual
machine with my executables.

I would also like to list the areas where I don't need help from LLVM - that
is, these are things I can easily handle myself in the frontend:
• Dynamic dispatch and virtual methods / multi-methods
• Boxing and unboxing of value types
• Reflection
• Memory management
I mention these items for two reasons: First, because these are services that
would be provided by a JVM, and I want people to know that the lack of these
items in LLVM does not in any way count as a detriment. And secondly, because
I've occasionally heard people on this list ask for features like these, and
I think it would be a waste of time to spend effort implementing something that
the frontend can easily handle.

Next, I want to list the areas where I think LLVM is lacking:

The first issue is more of a community-related one, which is that IMHO LLVM
doesn't have a lot of forward momentum in the experimental languages space.
The vast majority of contributions to the LLVM code base fall into three
categories: Clang-related work, ports to new backends, and optimization
research. Parts of LLVM which don't fall into these categories - like
garbage collection for example - tend to languish.

Note that this is not a criticism, but merely an observation - if it were a
criticism I'd have to blame myself as much as anyone else. There are good
reasons why things are as they are - the few people like myself who are working
on non-C-like languages generally don't have the energy to both work on
their front ends and at the same time do any serious amount of work on the LLVM
infrastructure. (You all saw what happened with the union types fiasco - I
contributed all of the front-end parts for declaring and serializing unions, in
the hopes that someone more knowledgeable would pick up the effort of
implementing them on the code generation side - but that never happened and the
feature got removed after six months of no progress.)

I think that LLVM's best hope in this department is that one of these
experimental languages becomes popular enough to attract a serious contributor
base of it's own, and that some of that effort can bleed off into improving
LLVM. Some of that was actually starting to happen in the case of
unladen-swallow, before that project was shut down.

The rest of my issues are more specific:

Garbage collection is still way too difficult. The biggest problem is the
inability to track SSA values - it requires the frontend to generate very
inefficient and error-prone code, manually spilling SSA values to the stack
around nearly every function call. I've written proposals for improving the
situation, which I won't repeat here - but realistically there's no way
that I am ever going to have time learn to enough about LLVM's code
generation passes to implement something like this myself.

Another area which I'd like to see supported is stack walking - that is,
starting from the current call frame, iterate through all of the call frames
above it. Currently the only way to do this is via platform-specific assembly
language - I'd think it would be relatively simple to make an intrinsic that
does this. (Note that the current stack intrinsics are unusable, because they
are (a) unreliable, and (b) inefficient. Taking an integer index of a stack
frame as a parameter is not the right way to do it.)


I have to say, if there's any single issue that could make me give up on
LLVM and switch over to using something like a JVM, it would be this - because
as far as I can tell, LLVM's garbage collection features are in exactly the
same state they were in when I started working with LLVM four years ago.

Generating debug info is another area which is very hard to use. The new
DIBuilder class was a significant improvement, but the whole area is still very
complex, it's extremely easy to make mistakes and then spend days or weeks
trying to figure out what went wrong - there's no type safety or validation
that can prevent you from making such mistakes. Over the last 4 years I've
spent many man-months of time wrestling with DWARF.

(One also wonders if DWARF is even the right answer. Especially compared to the
JVM - the Java debuggers are able to derive much of their information from the
reflection and class metadata that's already there in the class files, so
from that perspective DWARF adds a huge amount of overhead.)

Platform ABI limitations - Currently LLVM requires the frontend developer to
know quite a lot about the platform ABI - for example whether you are allowed to
pass a struct of a certain size as a value type rather than as a reference. The
thing is, experimental languages like mine generally don't care so much
about ABI compatibility - sure, we'd like to be able to call C library
functions once in a while (and we don't mind doing the extra work in those
cases), but most of the time we just want to pass a data type around and expect
it to work. Requiring the use of different techniques on different platforms
makes the situation considerably more complex.

Light-weight coroutines would be a "nice to have", as would better
concurrency primitives. These are things I could do on my own, but it would be
better, I think, to have them in LLVM - because in my view of the world,
anything that requires lots of architecture-specific knowledge ideally belongs
on the LLVM side of the line.

There's been a lot of discussion about divide-by-zero errors and other
non-declared exceptions. Having this available would be a great help.

Named structure types - as per Chris's proposal - would be a major
simplification, as it would allow declaration of pointer fields without having
to import the code that defines the structure of the thing that the pointer is
pointing to.

Anyway that's pretty much my list of big-ticket items. As you can see,
anyone who cares about these issues would have to think very hard before
choosing LLVM over the JVM as an implementation platform.

-- 
-- Talin

Talin

2011-Dec-06 00:18 UTC

head link

[LLVMdev] LLVM and managed languages

Comments inline.

On Sun, Dec 4, 2011 at 1:38 PM, Jon Harrop <
jonathandeanharrop at googlemail.com> wrote:
> Hi Talin,
>
> I wrote HLVM:
>
>  http://www.ffconsultancy.com/ocaml/hlvm/
>
> The whole project took a few man months. HLVM provides an ML-like type
> system, unboxed tuples, discriminated unions, TCO, generic printing, C FFI,
> POSIX threads, mark-sweep GC and both JIT and standalone compilation. I
> wrote several (accurate) garbage collectors for it including a
> stop-the-world mark-sweep collector and they took a few days each. I did so
> by implementing my own shadow stack along the lines of Henderson's
> "Accurate garbage collection in an uncooperative environment".
HLVM was
> only ever a hobby project. IMHO, the results were incredible and are a
> testament to the awesomeness of LLVM.
>
> HLVM prototyped many interesting ideas but the most relevant here is the
> use of value types to avoid heap allocation in order to alleviate the
> garbage collector. For example, HLVM is about 3x faster than Java on the
> following ray tracing benchmark because HLVM does no allocation or
> collection whilst tracing whereas Java does a huge amount:
>
>
>
http://flyingfrogblog.blogspot.com/2010/01/hlvm-on-ray-tracer-language-comparison.html
>
> I believe that benchmark shows that even the most heavily optimized GC can
> be thrashed by a program that simply avoids GC. Value types make this
> possible and the lack of value types on the JVM is, therefore, a major
> issue.
>
> I've looked over the HLVM code a bit in the past. Note that I do havevalue types in my language, although they are not ubiquitous.

> Now, I have some concerns about the statements being made here regarding
> GC support in LLVM. Let me begin by addressing the statement:
>
> > Garbage collection is still way too difficult.
>
> This is completely untrue. You can easily write your own GC simply by
> storing roots explicitly in your own shadow stack. Contrary to what many
> people told me before I did this, performance is fine for many
applications:
>
>
>
http://flyingfrogblog.blogspot.com/2009/03/current-shadow-stack-overheads-in-hlvm.html
>
> I'm afraid I'm going to have to disagree.
First off, since you mentioned shadow-stack: All that shadow stack does is
tell you at what offsets, for each stack frame, the stack root pointers
are. Now, my garbage collector doesn't use shadow-stack, it uses the *real*
stack - that is, instead of maintaining a parallel linked list of stack
frames which has to be updated on function entry and exit, it uses the
actual call frame pointers to traverse the list of call frames. This avoids
the need to update a global variable (even a thread-local global) each time
a call frame is entered and exited, which is what shadow-stack requires.

The actual code to do this is fairly trivial - much less code involved than
shadow stack, but there is one minor hitch - you have to re-implement it
for each different processor, since different processors store the call
frame pointer in a different register. I think it would be nice if LLVM
supplied a standard way to do this, but it's not something that is a
serious obstacle to me.

But all of that is unimportant - my stack traversal is semantically
equivalent to shadow-stack. None of the rest of what I'm about to say would
make any difference if I was using shadow stack. That's not where the
problems lie.

The main problem, as I see it, is the design of the llvm.gcroot()
intrinsic. The first problem with it is that the argument to llvm.gcroot
must be a memory address - specifically, it must be the result of an alloca
instruction. That means that even if you have nice value types, they have
to live in memory in order for them to be traced. You can't keep a value
type in a register or an SSA value. If you do have some data type in a
register, and it contains pointers that need to be traced, that value must
be written to memory at each call site, and reloaded from memory after the
call returns. (The latter is only true if you have a moving collector,
which I do.)

A second problem is that you are required to explicitly declare as stack
root not only your local variables, but any intermediate results of
expression. In other words, the "root-ness" of a value doesn't
automatically follow along with the value, as it would with a type-based
approach. The result is that the frontend ends up generating horrible,
horrible code that is littered with extra allocas, calls to llvm.gcroot(),
and lots of extra loads and stores.

A third problem is that the optimizer *has* to know about garbage
collection, because the optimizer can do things that might invalidate your
garbage collector's assumptions. For example, the optimizer might decide to
slice a struct in half, placing each half in a register - at which point,
your table of offsets showing where the traceable pointers are is now
meaningless. From what people have told me, LLVM currently handles this by
simply disabling most optimizations for any value which is marked with
llvm.gcroot(). If "root-ness" were a property of a type, then the
optimizer
would be able to do whatever it wanted (mostly I guess), and then you'd be
able to scan the output of the optimizer and identify roots by their types.
Stack roots would only need to be traced if they were in a range of code
where they were live. And so on.

I believe the conclusion you meant to draw is that writing a GC
using> LLVM's highly experimental GC support is way too difficult but that is
a
> very different statement.
>
> I don't know what you mean by "highly experimental GC
support". As far asI know, LLVM's GC support consists of the combination of shadow-stack + the
llvm.gc intrinsic functions. And as far as I know, there have been no
changes to either of these in over 4 years. Which is to say, it's clearly
not an area which is the focus of anyone's attention.

> This is a really important point because it means that all of the major
> changes discussed so far are just optimizations beyond a simple shadow
> stack. Moreover, I don't think we have any idea how valuable these
> optimizations might be. Therefore, I think this discussion needs to start
> with an objective survey of the possible approaches and some performance
> estimates.
>
> Increasing performance is only half of the story. The other half is makingit easy to use. For me personally, I've managed to work around all of the
issues that I've mentioned, but it took me a long time to get it right, and
even today it's an ongoing burden - the IR code that my frontend generates
is unreadable due to it being cluttered with extra instructions needed to
deal with GC roots, which would not be needed if LLVM had a better means of
indicating roots.

In other words, I started this thread not because I needed the things I was
asking for - I've already worked around the lack of them - but rather
because I want those who come after to me to have a better experience than
I had. My premise at the start of the thread is that more people choose the
JVM over LLVM as a host platform for language experimentation, and I still
think this is true.

> HLVM's benchmarks indicate that the shadow stack overhead is usually
under
> 20% but can be as high as 70% with purely functional code. HLVM currently
> pushes all roots onto the shadow stack as soon as they come into scope and
> resets the shadow stack pointer at each function exit. A better approach
> would be to defer pushing until safe points (if any) and push only live
> roots (i.e. those referred to again after the safe point).
>
> I believe HLVM could attain competitive performance even on purely
> functional data structures by using a non-moving mark-region collector. I
> have written several prototypes in C++ but they are very much
> work-in-progress. As the GCs are non-moving there is no need to restore
> pointers after a safe point, which sounds more efficient to me. The
> downside is that temporaries are no longer allocated by bumping a pointer
> into the nursery generation but my mark-region GC uses a small bitwise LUT
> to find the next free block in the current region which is almost as fast.
>
> Others have said that the best possible performance is obtained by making
> the back-end aware of GC pointers. I can well believe that but the cost of
> doing so with LLVM is so large that I think it will be necessary to
> sacrifice optimality for something almost as good that is attainable with a
> lot less effort. In fact, I would much rather see a separate project that
> sits on top of LLVM and provides a safe high-level garbage collected VM for
> language experimenters to use.
>
> I can't say how much difference in performance there would be if LLVM
hada GC-aware optimizer, as opposed to what we have now, which is that
GC-related code is mostly unoptimized (again, from what people have told
me, not from my own observation).

> Incidentally, if anyone wants to get up to speed on GC design I highly
> recommend Richard Jones' new book on the subject:
>
>
>
http://www.amazon.co.uk/Garbage-Collection-Handbook-Management-Algorithms/dp/1420082795
>
> Cheers,
> Jon.
>
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at
cs.uiuc.edu] On
> Behalf Of Talin
> Sent: 01 July 2011 19:06
> To: LLVM Developers Mailing List
> Subject: [LLVMdev] LLVM and managed languages
>
> So I've been using LLVM for about 4 years now, and I've posted a
lot on
> this list about specific issues. What I would like to do is step back for a
> moment and give my "big picture" assessment of LLVM overall,
particularly
> with respect to developing a "managed" language like Java / C# or
my own
> language, Tart.
>
> Obviously, I feel that LLVM is the best choice out there, otherwise I
> would of switched to using other alternatives long ago. At the same time,
> however, I've observed that the great majority of experimental
languages -
> Groovy, Scala, and many others - are built on top of the Java virtual
> machine, which seems to be the platform of choice for language
> experimenters. In a similar vein, if you spend any time studying the
> academic research on garbage collection algorithms, you'll see that the
JVM
> also appears to be the platform of choice here as well.
>
> One question to ask is, what factors would weigh heavily in the decision
> to choose the JVM over LLVM when implementing a managed language? On the
> positive side, LLVM's mature and simple APIs make it a breeze to
manipulate
> types and functions, and it's modular architecture means I can use the
> parts I want without being forced to use parts I don't need. And the
> ability to go straight to machine code means that I don't need to ship
a
> complex virtual machine with my executables.
>
> I would also like to list the areas where I don't need help from LLVM -
> that is, these are things I can easily handle myself in the frontend:
> • Dynamic dispatch and virtual methods / multi-methods
> • Boxing and unboxing of value types
> • Reflection
> • Memory management
> I mention these items for two reasons: First, because these are services
> that would be provided by a JVM, and I want people to know that the lack of
> these items in LLVM does not in any way count as a detriment. And secondly,
> because I've occasionally heard people on this list ask for features
like
> these, and I think it would be a waste of time to spend effort implementing
> something that the frontend can easily handle.
>
> Next, I want to list the areas where I think LLVM is lacking:
>
> The first issue is more of a community-related one, which is that IMHO
> LLVM doesn't have a lot of forward momentum in the experimental
languages
> space. The vast majority of contributions to the LLVM code base fall into
> three categories: Clang-related work, ports to new backends, and
> optimization research. Parts of LLVM which don't fall into these
categories
> - like garbage collection for example - tend to languish.
>
> Note that this is not a criticism, but merely an observation - if it were
> a criticism I'd have to blame myself as much as anyone else. There are
good
> reasons why things are as they are - the few people like myself who are
> working on non-C-like languages generally don't have the energy to both
> work on their front ends and at the same time do any serious amount of work
> on the LLVM infrastructure. (You all saw what happened with the union types
> fiasco - I contributed all of the front-end parts for declaring and
> serializing unions, in the hopes that someone more knowledgeable would pick
> up the effort of implementing them on the code generation side - but that
> never happened and the feature got removed after six months of no
progress.)
>
> I think that LLVM's best hope in this department is that one of these
> experimental languages becomes popular enough to attract a serious
> contributor base of it's own, and that some of that effort can bleed
off
> into improving LLVM. Some of that was actually starting to happen in the
> case of unladen-swallow, before that project was shut down.
>
> The rest of my issues are more specific:
>
> Garbage collection is still way too difficult. The biggest problem is the
> inability to track SSA values - it requires the frontend to generate very
> inefficient and error-prone code, manually spilling SSA values to the stack
> around nearly every function call. I've written proposals for improving
the
> situation, which I won't repeat here - but realistically there's no
way
> that I am ever going to have time learn to enough about LLVM's code
> generation passes to implement something like this myself.
>
> Another area which I'd like to see supported is stack walking - that
is,
> starting from the current call frame, iterate through all of the call
> frames above it. Currently the only way to do this is via platform-specific
> assembly language - I'd think it would be relatively simple to make an
> intrinsic that does this. (Note that the current stack intrinsics are
> unusable, because they are (a) unreliable, and (b) inefficient. Taking an
> integer index of a stack frame as a parameter is not the right way to do
> it.)
>
>
> I have to say, if there's any single issue that could make me give up
on
> LLVM and switch over to using something like a JVM, it would be this -
> because as far as I can tell, LLVM's garbage collection features are in
> exactly the same state they were in when I started working with LLVM four
> years ago.
>
> Generating debug info is another area which is very hard to use. The new
> DIBuilder class was a significant improvement, but the whole area is still
> very complex, it's extremely easy to make mistakes and then spend days
or
> weeks trying to figure out what went wrong - there's no type safety or
> validation that can prevent you from making such mistakes. Over the last 4
> years I've spent many man-months of time wrestling with DWARF.
>
> (One also wonders if DWARF is even the right answer. Especially compared
> to the JVM - the Java debuggers are able to derive much of their
> information from the reflection and class metadata that's already there
in
> the class files, so from that perspective DWARF adds a huge amount of
> overhead.)
>
> Platform ABI limitations - Currently LLVM requires the frontend developer
> to know quite a lot about the platform ABI - for example whether you are
> allowed to pass a struct of a certain size as a value type rather than as a
> reference. The thing is, experimental languages like mine generally
don't
> care so much about ABI compatibility - sure, we'd like to be able to
call C
> library functions once in a while (and we don't mind doing the extra
work
> in those cases), but most of the time we just want to pass a data type
> around and expect it to work. Requiring the use of different techniques on
> different platforms makes the situation considerably more complex.
>
> Light-weight coroutines would be a "nice to have", as would
better
> concurrency primitives. These are things I could do on my own, but it would
> be better, I think, to have them in LLVM - because in my view of the world,
> anything that requires lots of architecture-specific knowledge ideally
> belongs on the LLVM side of the line.
>
> There's been a lot of discussion about divide-by-zero errors and other
> non-declared exceptions. Having this available would be a great help.
>
> Named structure types - as per Chris's proposal - would be a major
> simplification, as it would allow declaration of pointer fields without
> having to import the code that defines the structure of the thing that the
> pointer is pointing to.
>
> Anyway that's pretty much my list of big-ticket items. As you can see,
> anyone who cares about these issues would have to think very hard before
> choosing LLVM over the JVM as an implementation platform.
>
> --
> -- Talin
>
>

-- 
-- Talin
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20111205/1566ad37/attachment.html>

Possibly Parallel Threads

Search for more reasonably related threads

llvm dev - Dec 2011 - [LLVMdev] LLVM and managed languages

[LLVMdev] LLVM and managed languages

[LLVMdev] LLVM and managed languages

[LLVMdev] LLVM and managed languages

[LLVMdev] LLVM and managed languages

[LLVMdev] LLVM and managed languages

[LLVMdev] LLVM and managed languages

[LLVMdev] LLVM and managed languages

[LLVMdev] LLVM and managed languages

[LLVMdev] LLVM and managed languages

[LLVMdev] LLVM and managed languages

[LLVMdev] LLVM and managed languages

Possibly Parallel Threads