Hi Talin,
I wrote HLVM:
http://www.ffconsultancy.com/ocaml/hlvm/
The whole project took a few man-months. HLVM provides an ML-like type system,
unboxed tuples, discriminated unions, tail call optimization (TCO), generic
printing, a C FFI, POSIX threads, a mark-sweep GC and both JIT and standalone
compilation. I wrote several (accurate) garbage collectors for it, including a
stop-the-world mark-sweep collector, and they took a few days each. I did so by
implementing my own shadow stack along the lines of Henderson's "Accurate
garbage collection in an uncooperative environment". HLVM was only ever a hobby
project. IMHO, the results were incredible and are a testament to the
awesomeness of LLVM.
HLVM prototyped many interesting ideas, but the most relevant here is the use of
value types to avoid heap allocation and, in turn, to take pressure off the
garbage collector. For example, HLVM is about 3x faster than Java on the
following ray-tracing benchmark because HLVM does no allocation or collection
whilst tracing, whereas Java does a huge amount:
http://flyingfrogblog.blogspot.com/2010/01/hlvm-on-ray-tracer-language-comparison.html
I believe that benchmark shows that even the most heavily optimized GC can be
thrashed by a program that simply avoids GC. Value types make this possible and
the lack of value types on the JVM is, therefore, a major issue.
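To make that concrete, here is a rough C++ sketch (my own illustration, not code
from HLVM or from the benchmark) of the difference between a boxed vector and a
value-type vector in a ray-tracer-style inner loop:

  // Boxed representation: every intermediate result is a separate heap
  // object that the GC must later trace and reclaim.
  struct BoxedVec { double x, y, z; };

  BoxedVec *addBoxed(const BoxedVec *a, const BoxedVec *b) {
      return new BoxedVec{a->x + b->x, a->y + b->y, a->z + b->z};  // one allocation per operation
  }

  // Value-type representation: the same data passed and returned by value,
  // so intermediates live in registers or on the stack.
  struct Vec { double x, y, z; };

  Vec add(Vec a, Vec b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }

  double hotLoop(int n) {
      Vec accum{0.0, 0.0, 0.0};
      for (int i = 0; i < n; ++i)
          accum = add(accum, Vec{1.0, 2.0, 3.0});  // zero heap traffic, zero GC work
      return accum.x + accum.y + accum.z;
  }

The lack of value types means the JVM ends up with something like the first form
for compound data, and that is where the allocation and collection in the
benchmark comes from.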
Now, I have some concerns about the statements being made here regarding GC
support in LLVM. Let me begin by addressing the statement:
> Garbage collection is still way too difficult.
This is completely untrue. You can easily write your own GC simply by storing
roots explicitly in your own shadow stack. Contrary to what many people told me
before I did this, performance is fine for many applications:
http://flyingfrogblog.blogspot.com/2009/03/current-shadow-stack-overheads-in-hlvm.html
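For anyone who hasn't seen the technique, here is a minimal C++ sketch of a
Henderson-style shadow stack; the names and layout are my own invention for
illustration and are not HLVM's actual implementation:

  #include <cstddef>

  // One frame of the shadow stack. The roots live in an ordinary struct on
  // the native stack and the frames are chained together, so the collector
  // can find every root without any help from the code generator.
  struct ShadowFrame {
      ShadowFrame *prev;     // caller's frame
      std::size_t numRoots;  // number of root slots in use
      void *roots[8];        // fixed upper bound, purely for this sketch
  };

  // Head of the chain (one per thread in a real system).
  static ShadowFrame *gc_top = nullptr;

  // The marking phase simply walks the chain.
  void markRoots(void (*mark)(void *)) {
      for (ShadowFrame *f = gc_top; f != nullptr; f = f->prev)
          for (std::size_t i = 0; i < f->numRoots; ++i)
              if (f->roots[i]) mark(f->roots[i]);
  }

  // What the compiled form of a function with two GC roots looks like.
  void *exampleFunction(void *arg) {
      ShadowFrame frame = {gc_top, 2, {arg, nullptr}};
      gc_top = &frame;          // push the frame on entry
      frame.roots[1] = nullptr; // e.g. the result of an allocation, kept visible to the GC
      gc_top = frame.prev;      // pop the frame on every exit path
      return frame.roots[1];
  }

Because the collector only ever looks at these frames, nothing else in the
generated code needs any special support from the back-end.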
I believe the conclusion you meant to draw is that writing a GC using LLVM's
highly experimental GC support is way too difficult, but that is a very different
statement.
This is a really important point because it means that all of the major changes
discussed so far are just optimizations beyond a simple shadow stack. Moreover,
I don't think we have any idea how valuable these optimizations might be.
Therefore, I think this discussion needs to start with an objective survey of
the possible approaches and some performance estimates.
HLVM's benchmarks indicate that the shadow stack overhead is usually under
20% but can be as high as 70% with purely functional code. HLVM currently pushes
all roots onto the shadow stack as soon as they come into scope and resets the
shadow stack pointer at each function exit. A better approach would be to defer
pushing until safe points (if any) and push only live roots (i.e. those referred
to again after the safe point).
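Roughly, the difference looks like this (hand-written C++ standing in for the
code a compiler would emit; the shadow stack itself and the helper names are
illustrative assumptions, not HLVM's):

  #include <cstddef>

  void *shadowStack[1024];           // illustrative global shadow stack
  std::size_t shadowSP = 0;

  void *gcAlloc(std::size_t n) { return ::operator new(n); }  // stand-in for an allocating safe point
  void consume(void *) {}                                      // stand-in for ordinary work

  // Current scheme: push every root as soon as it exists, reset on exit.
  void *eager(void *p) {
      std::size_t savedSP = shadowSP;
      shadowStack[shadowSP++] = p;   // pushed whether or not a safe point follows
      void *q = gcAlloc(16);         // safe point
      shadowStack[shadowSP++] = q;   // pushed even though no further safe point follows
      consume(p);
      consume(q);
      shadowSP = savedSP;            // reset the shadow stack pointer at function exit
      return q;
  }

  // Proposed scheme: push only the roots that are live across a safe point,
  // and only just before that safe point.
  void *deferred(void *p) {
      std::size_t savedSP = shadowSP;
      shadowStack[shadowSP++] = p;   // p is referenced again after the call, so it must be pushed
      void *q = gcAlloc(16);         // safe point
      consume(p);
      consume(q);                    // q is never live across a safe point, so it is never pushed
      shadowSP = savedSP;
      return q;
  }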
I believe HLVM could attain competitive performance even on purely functional
data structures by using a non-moving mark-region collector. I have written
several prototypes in C++ but they are very much a work in progress. As the GCs
are non-moving, there is no need to restore pointers after a safe point, which
sounds more efficient to me. The downside is that temporaries are no longer
allocated by bumping a pointer into the nursery generation; instead, my
mark-region GC uses a small bitwise lookup table (LUT) to find the next free
block in the current region, which is almost as fast.
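As a rough illustration of that allocation path (C++, with block and region
sizes invented for the sketch rather than taken from HLVM):

  #include <cstdint>

  const int kBlocks = 256;                 // blocks per region (arbitrary for this sketch)
  const int kBlockSize = 64;               // bytes per block (arbitrary for this sketch)

  struct Region {
      uint8_t bitmap[kBlocks / 8];         // 1 bit per block: 1 = allocated (must start zeroed)
      uint8_t blocks[kBlocks][kBlockSize];
  };

  // 256-entry lookup table: index of the first zero bit in a byte, or 8 if none.
  uint8_t firstFreeBit[256];

  void initLUT() {
      for (int byte = 0; byte < 256; ++byte) {
          firstFreeBit[byte] = 8;
          for (int bit = 0; bit < 8; ++bit)
              if (!(byte & (1 << bit))) { firstFreeBit[byte] = bit; break; }
      }
  }

  // Allocate one block: scan the bitmap a byte at a time, one table lookup
  // per probe. In the common case the first probe succeeds, so the fast path
  // is a load, a table lookup and a bitwise OR.
  void *allocBlock(Region &r) {
      for (int i = 0; i < kBlocks / 8; ++i) {
          uint8_t bit = firstFreeBit[r.bitmap[i]];
          if (bit < 8) {
              r.bitmap[i] |= (uint8_t)(1 << bit);
              return r.blocks[i * 8 + bit];
          }
      }
      return nullptr;                      // region full: fall back to the next region
  }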
Others have said that the best possible performance is obtained by making the
back-end aware of GC pointers. I can well believe that, but the cost of doing so
with LLVM is so large that I think it will be necessary to sacrifice optimality
for something almost as good that is attainable with a lot less effort. In fact,
I would much rather see a separate project that sits on top of LLVM and provides
a safe high-level garbage collected VM for language experimenters to use.
Incidentally, if anyone wants to get up to speed on GC design I highly recommend
Richard Jones' new book on the subject:
http://www.amazon.co.uk/Garbage-Collection-Handbook-Management-Algorithms/dp/1420082795
Cheers,
Jon.
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On
Behalf Of Talin
Sent: 01 July 2011 19:06
To: LLVM Developers Mailing List
Subject: [LLVMdev] LLVM and managed languages
So I've been using LLVM for about 4 years now, and I've posted a lot on
this list about specific issues. What I would like to do is step back for a
moment and give my "big picture" assessment of LLVM overall,
particularly with respect to developing a "managed" language like Java
/ C# or my own language, Tart.
Obviously, I feel that LLVM is the best choice out there; otherwise I would have
switched to an alternative long ago. At the same time, however,
I've observed that the great majority of experimental languages - Groovy,
Scala, and many others - are built on top of the Java virtual machine, which
seems to be the platform of choice for language experimenters. In a similar
vein, if you spend any time studying the academic research on garbage collection
algorithms, you'll see that the JVM also appears to be the platform of
choice here as well.
One question to ask is, what factors would weigh heavily in the decision to
choose the JVM over LLVM when implementing a managed language? On the positive
side, LLVM's mature and simple APIs make it a breeze to manipulate types and
functions, and its modular architecture means I can use the parts I want
without being forced to use parts I don't need. And the ability to go
straight to machine code means that I don't need to ship a complex virtual
machine with my executables.
I would also like to list the areas where I don't need help from LLVM - that
is, these are things I can easily handle myself in the frontend:
• Dynamic dispatch and virtual methods / multi-methods
• Boxing and unboxing of value types
• Reflection
• Memory management
I mention these items for two reasons: First, because these are services that
would be provided by a JVM, and I want people to know that the lack of these
items in LLVM does not in any way count as a detriment. And secondly, because
I've occasionally heard people on this list ask for features like these, and
I think it would be a waste of effort to implement something that the frontend
can easily handle.
Next, I want to list the areas where I think LLVM is lacking:
The first issue is more of a community-related one, which is that IMHO LLVM
doesn't have a lot of forward momentum in the experimental languages space.
The vast majority of contributions to the LLVM code base fall into three
categories: Clang-related work, ports to new backends, and optimization
research. Parts of LLVM that don't fall into these categories - garbage
collection, for example - tend to languish.
Note that this is not a criticism, but merely an observation - if it were a
criticism I'd have to blame myself as much as anyone else. There are good
reasons why things are as they are - the few people like myself who are working
on non-C-like languages generally don't have the energy to both work on
their front ends and at the same time do any serious amount of work on the LLVM
infrastructure. (You all saw what happened with the union types fiasco - I
contributed all of the front-end parts for declaring and serializing unions, in
the hopes that someone more knowledgeable would pick up the effort of
implementing them on the code generation side - but that never happened and the
feature got removed after six months of no progress.)
I think that LLVM's best hope in this department is that one of these
experimental languages becomes popular enough to attract a serious contributor
base of its own, and that some of that effort can bleed off into improving
LLVM. Some of that was actually starting to happen in the case of
unladen-swallow, before that project was shut down.
The rest of my issues are more specific:
Garbage collection is still way too difficult. The biggest problem is the
inability to track SSA values - it requires the frontend to generate very
inefficient and error-prone code, manually spilling SSA values to the stack
around nearly every function call. I've written proposals for improving the
situation, which I won't repeat here - but realistically there's no way
that I am ever going to have time to learn enough about LLVM's code
generation passes to implement something like this myself.
Another area which I'd like to see supported is stack walking - that is,
starting from the current call frame, iterating through all of the call frames
above it. Currently the only way to do this is via platform-specific assembly
language - I'd think it would be relatively simple to make an intrinsic that
does this. (Note that the current stack intrinsics are unusable, because they
are (a) unreliable, and (b) inefficient. Taking an integer index of a stack
frame as a parameter is not the right way to do it.)
I have to say, if there's any single issue that could make me give up on
LLVM and switch over to using something like a JVM, it would be this - because
as far as I can tell, LLVM's garbage collection features are in exactly the
same state they were in when I started working with LLVM four years ago.
Generating debug info is another area that is very hard to get right. The new
DIBuilder class was a significant improvement, but the whole area is still very
complex: it's extremely easy to make mistakes and then spend days or weeks
trying to figure out what went wrong, and there's no type safety or validation
to prevent such mistakes. Over the last 4 years I've spent many man-months
wrestling with DWARF.
(One also wonders whether DWARF is even the right answer, especially compared to
the JVM: Java debuggers are able to derive much of their information from the
reflection and class metadata that's already there in the class files, so from
that perspective DWARF adds a huge amount of overhead.)
Platform ABI limitations - Currently LLVM requires the frontend developer to
know quite a lot about the platform ABI - for example whether you are allowed to
pass a struct of a certain size as a value type rather than as a reference. The
thing is, experimental languages like mine generally don't care so much
about ABI compatibility - sure, we'd like to be able to call C library
functions once in a while (and we don't mind doing the extra work in those
cases), but most of the time we just want to pass a data type around and expect
it to work. Requiring the use of different techniques on different platforms
makes the situation considerably more complex.
Light-weight coroutines would be a "nice to have", as would better
concurrency primitives. These are things I could do on my own, but it would be
better, I think, to have them in LLVM - because in my view of the world,
anything that requires lots of architecture-specific knowledge ideally belongs
on the LLVM side of the line.
There's been a lot of discussion about divide-by-zero errors and other
non-declared exceptions. Having support for these would be a great help.
Named structure types - as per Chris's proposal - would be a major
simplification, as it would allow declaration of pointer fields without having
to import the code that defines the structure of the thing that the pointer is
pointing to.
Anyway that's pretty much my list of big-ticket items. As you can see,
anyone who cares about these issues would have to think very hard before
choosing LLVM over the JVM as an implementation platform.
--
-- Talin