Hi Folks --

Let me pose a question similar to the one Joachim poses below. I too am trying to evaluate whether LLVM will be of use to me in building a compiler and garbage collection mechanism for a byte-code VM that I have built. Although there are multiple types of code that can be created with this system, all of them eventually translate to execution of one or more of these byte codes -- with the exception of code written directly in Intel assembly language (which I understand can be emitted as-is and left untouched by the LLVM compiler if desired).

There are about 32 core opcodes that constitute the basic instruction set, and I can envision mapping each of these to some sequence of LLVM IR. There is also a much larger set of "extended opcodes" that are executed by the same core instruction execution loop but which are coded using the built-in Intel assembler and added dynamically by the system. I could envision also going to the trouble of mapping each of these to a sequence of LLVM IR instructions, and then emitting a series of LLVM IR sequences purely based on the sequence of VM opcodes encountered in a scan of code compiled for the VM.

I'm hoping that such a product could then be submitted to all the LLVM optimizations and result in better Intel assembly code generation than what I have hand-coded myself (in my implementations of either the core or the extended opcodes -- and especially in the Intel code sequences resulting from the use of these opcodes in combination). So the first question is simply to ask for a validation of this thinking and whether such a strategy seems feasible.

The second question pertains to this discussion thread. If, at the end of the day, all I am trying to do is compile for an 80x86 platform (ideally targeting Windows, Linux and the Mac) and don't need to target multiple processors, then LLVM should add significant value for me, provided the answer to the first question is that the strategy is sound. And most of the discussion in this thread about platform-specific issues shouldn't apply if I only have one processor type to target. Am I thinking about this correctly?

Any insights from some of you old hands would be greatly appreciated. Thanks.

Mike

Date: Wed, 05 Oct 2011 17:10:36 -0500
From: greened at obbligato.org (David A. Greene)
Subject: Re: [LLVMdev] LLVM IR is a compiler IR
To: Joachim Durchholz <jo at durchholz.org>
Cc: llvmdev at cs.uiuc.edu

Joachim Durchholz <jo at durchholz.org> writes:

> Now that the dust begins to settle... I'm wondering whether LLVM is for me.
>
> I'm working on something that can be used to create software for
> different environments: C/C++, JVM, CLR, Parrot, etc.
> I.e. one language for different environments, but not write once, run
> anywhere.
>
> Now what would be the role of LLVM in such an infrastructure?
> Just backend for C/C++ linkage, and I should go and look elsewhere for
> JVM/CLR/whateverVM?
> Should I look into LLVM subprojects? Which ones?

It depends on what you want to do with the IR. If you want to create object files, LLVM is great. You just need to map the semantics of the various HLLs onto the LLVM IR language, as with any translator. For any kind of code-generator-ish thing, it's hard to beat LLVM IR, IMHO.

If you want to JIT, then some of LLVM IR's limitations will impact the speed of code generation, as Dan outlined.
If you want to do fancy transformations that use or analyze high-level language semantics, LLVM IR may not be right for you, as most of that information is lost by the time the code has been converted to LLVM IR.

-Dave
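To make the "map the semantics onto LLVM IR" step concrete, here is a minimal sketch, assuming the C++ IRBuilder API, of how a single core opcode such as Mike's '2*' (multiply top of stack by 2) might be lowered. The function and variable names are hypothetical, and exact header paths and signatures vary between LLVM releases.

#include "llvm/IR/IRBuilder.h"   // older releases: "llvm/Support/IRBuilder.h"

using namespace llvm;

// 'tos' is the SSA value currently modelling the VM's top-of-stack cell.
// A translator would call one such emitter per opcode while scanning the
// bytecode, threading the returned value into the next emitter.
static Value *emitTwoStar(IRBuilder<> &B, Value *tos) {
  // 2* is a left shift by one; once a whole opcode sequence is in IR,
  // the optimizers are free to fold this with neighbouring arithmetic.
  return B.CreateShl(tos, ConstantInt::get(tos->getType(), 1), "tos.x2");
}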
Michael Clagett <mclagett at hotmail.com> writes:

> There's about 32 core op codes that constitute the basic instruction
> set and I can envision mapping each of these to some sequence of LLVM
> IR. There's also a whole lot more "extended opcodes" that are
> executed by the same core instruction execution loop but which are
> coded using the built-in Intel assembler and added dynamically by the
> system. I could envision also going to the trouble of mapping each of
> these to a sequence of LLVM IR instructions and then being able to
> emit a series of LLVM IR sequences purely based on the sequence of vm
> opcodes encountered in a scan of code compiled for the vm.
>
> I'm hoping that such a product could then be submitted to all the LLVM
> optimizations and result in better Intel assembly code generation than
> what I have hand-coded myself (in my implementations of either the
> core or the extended opcodes -- and especially in the intel code
> sequences resulting from the use of these opcodes in sequences
> together). So first question is simply to ask for a validation of
> this thinking and whether such a strategy seems feasible.

Let me make sure I'm understanding you correctly. You want to map each of your opcodes into an LLVM sequence and then use the LLVM optimizations and JIT to generate efficient native code implementations? Then you would invoke those implementations during interpretation?

Or is it that you want to take a bytecode program, map it to LLVM IR, run it through optimizations and codegen to produce a native executable?

Either one of these will work, and LLVM seems like a good match as long as you don't expect the optimizations to understand the higher-level semantics of your opcodes (without some work by you, at least).

I don't quite grasp any benefit to the first use, as I would just go ahead and generate the optimal native code sequence for each opcode once and be done with it. No LLVM needed at all. So I suspect this is not what you want to do.

-Dave
Sorry for the noise, but this is the message I meant to send to the list rather than replying to David directly. Unfortunately, what I just sent to the list was his message to me.

From: mclagett at hotmail.com
To: greened at obbligato.org
Subject: RE: [LLVMdev] LLVM IR is a compiler IR
Date: Thu, 6 Oct 2011 19:44:11 +0000

Thanks for your prompt reply. My answers are below, at the end of your message.

> Let me make sure I'm understanding you correctly. You want to map each
> of your opcodes into an LLVM sequence and then use the LLVM optimizations
> and JIT to generate efficient native code implementations? Then you
> would invoke those implementations during interpretation?
>
> Or is it that you want to take a bytecode program, map it to LLVM IR,
> run it through optimizations and codegen to produce a native executable?
>
> Either one of these will work and LLVM seems like a good match as long
> as you don't expect the optimizations to understand the higher-level
> semantics of your opcodes (without some work by you, at least).
>
> I don't quite grasp any benefit to the first use as I would just go
> ahead and generate the optimal native code sequence for each opcode once
> and be done with it. No LLVM needed at all. So I suspect this is not
> what you want to do.
>
> -Dave

It is actually the first of your alternatives above that I was hoping to achieve, and the reason I was thinking this would be valuable is twofold. First, I don't have so much faith in the quality or optimal character of my own byte code implementations. The 32 core opcodes tend to be high-level implementations of low-level operations on the elements of the virtual machine -- things like stack operations on the two built-in stacks of the virtual machine, or movements to and from the VM's address register, or access to and from the addresses stored there. Other core opcodes include '2*' (multiply top of stack by 2), 'COM' (one's complement of top of stack), ';' (jump the VM instruction pointer to the address on top of the return stack), and that sort of thing. The tops of the data and return stacks are mapped to registers EAX and EDI, respectively, and the address register is mapped to ESI.
But any stack operations that involve 'push'-ing, 'pop'-ing, 'dup'-ing, 'drop'-ing, etc. end up going to memory, where the bulk of the stack storage lives. Each of these core primitives is on average around 10 Intel instructions long, and many of the extended opcodes that have been coded exist to bypass more costly sequences of the core primitives they replace. It was my general feeling that a good SSA-based compilation mechanism like LLVM's could do a better job of maximizing the use of the Intel architecture's limited resources than I could.

Moreover, as long as code remains at the VM instruction level, these resources are even more constrained than usual. EDX needs to be preserved to hold the VM instruction pointer, and EAX, ESI and EDI need to be preserved for the purposes outlined above. So there is an advantage in compiling VM opcode sequences to assembly that can violate these invariants over a longer stretch of code and use the entire register set for an extended period. Similar considerations apply to simply reducing, from 10 instructions down to 1 or 2, operations that at the VM level require the stack but that at the Intel assembly level would more naturally be handled in registers.

Finally, I just have the general feeling that more significant compiler optimizations can be effected across sequences of my VM opcode implementations. This is a general feeling, but one I hope is fairly well-founded.

Hope that explains my thinking better. Does that change at all your view of the benefits that I might achieve from LLVM?

Thanks.

Mike
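For concreteness, here is a rough sketch of the translation loop this implies -- an illustration under assumptions, not existing code -- keeping the VM's data stack as a vector of SSA values while scanning the bytecode, so that pushes, pops and dups become register-level data flow that LLVM can optimize across opcode boundaries. The opcode enum, instruction struct and helper names are invented, and IRBuilder signatures vary between LLVM releases.

#include <vector>
#include "llvm/IR/IRBuilder.h"

using namespace llvm;

enum class Op { Lit, Dup, TwoStar, Add, Drop };   // tiny made-up subset

struct Insn { Op op; int64_t imm; };

static void emitSequence(IRBuilder<> &B, const std::vector<Insn> &code) {
  std::vector<Value *> stack;                     // virtual data stack of SSA values
  Type *i32 = Type::getInt32Ty(B.getContext());
  for (const Insn &I : code) {
    switch (I.op) {
    case Op::Lit:  stack.push_back(ConstantInt::get(i32, I.imm)); break;
    case Op::Dup:  stack.push_back(stack.back()); break;
    case Op::TwoStar:
      stack.back() = B.CreateShl(stack.back(), ConstantInt::get(i32, 1));
      break;
    case Op::Add: {
      Value *b = stack.back(); stack.pop_back();
      stack.back() = B.CreateAdd(stack.back(), b);
      break;
    }
    case Op::Drop: stack.pop_back(); break;
    }
  }
  // Values still live on 'stack' here would be spilled back to the real VM
  // stack (and EAX for the top element) at the boundary of the compiled
  // region, restoring the register-mapping invariants described above.
}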
Michael Clagett <mclagett at hotmail.com> writes:

> It is actually the first of your alternatives above that I was hoping
> to achieve and the reason I was thinking this would be valuable is
> twofold. First, I don't have so much faith in the quality or optimal
> character of my own byte code implementations. The core 32 opcodes
> tend to be high-level implementations of low-level operations in
> dealing with the elements of the virtual machine

Ok, so these are mildly complex operations. It makes sense to start with some kind of machine-generated asm implementation to get optimized performance. Don't write the opcode implementations in asm directly. Write them in a high-level language and compile them to native code. See below.

> The top of data and return stacks are mapped to register EAX and EDI,
> respectively, and the address reg is mapped to ESI.

Does this have to be the case? See below.

> It was my general feeling that a good SSA-based compilation mechanism
> like that of LLVM could do a better job at maximizing the use of the
> Intel's limited resources than I could.

As an alternative to using the JIT, would it be possible to implement each opcode in its own interpreter function and compile them statically? Of course there would be call overhead interpreting each opcode. Once you've got that, you could apply various techniques such as threading the interpreter (not multiprocessing, but threading the interpreter as in http://en.wikipedia.org/wiki/Threaded_code) to eliminate the overhead.

I don't think there's any particular reason to rely on the JIT unless you want to take it a bit further and optimize a specific sequence of opcodes seen when interpreting a specific program. But then we're getting into the various Futamura transformations. :)

> Moreover, as long as code remains at the VM instruction level, these
> resources are even more constrained than usual. EDX needs to be
> preserved to hold the VM instruction pointer. EAX, ESI and EDI need
> to be preserved for the purposes outlined above.

Why do those registers need to be preserved? Imagine the interpreter were written completely in a high-level language. The compiler doesn't care which register holds a stack pointer, data pointer, etc. as long as the virtual machine's state is consistent.

> Similar considerations apply to simply reducing from 10 instructions
> to 1 or 2 instructions operations that at the VM level require the
> stack, but that at the intel assembler level would more naturally be
> handled in registers.

Ah, ok, this is interesting. You want to change the execution model on-the-fly. A grad school colleague of mine did something very similar to this, translating a stack machine into a register machine. Of course he's a hardware nut, so he designed hardware to do it. :) Unfortunately, I don't think he ever published anything on it.

Doing the threading thing mentioned above or the JIT/Dynamo thing mentioned below can both accomplish this, I think, and without any register constraints, if I'm understanding you correctly.

> Finally, I just have the general feeling that more significant
> compiler optimizations can be effected across sequences of what are my
> vm opcode implementations. This is a general feeling, but I'm hoping
> fairly well-founded.

Yes, that's true. See the Futamura reference above. Given a VM and an input program, you can in fact generate an optimized executable. This is the logical extension of what you're getting at. For this kind of thing a JIT makes sense.
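As an illustration of the threaded-code idea (again a sketch under assumptions, not code from this thread): a direct-threaded dispatch loop using the GCC/Clang computed-goto extension, with a made-up four-opcode VM. Each handler jumps straight to the next handler, so the central dispatch loop and per-opcode call/return overhead disappear, and the compiler is free to keep the VM state in whatever registers it likes.

#include <cstdint>
#include <vector>

struct VM { std::vector<int32_t> data; size_t ip = 0; };

// Made-up encoding: 0 = LIT <byte>, 1 = 2*, 2 = +, 3 = HALT.
int32_t run(VM &vm, const std::vector<uint8_t> &code) {
  static void *dispatch[] = { &&op_lit, &&op_twostar, &&op_add, &&op_halt };
#define NEXT goto *dispatch[code[vm.ip++]]
  NEXT;
op_lit:     vm.data.push_back(code[vm.ip++]);               NEXT;
op_twostar: vm.data.back() <<= 1;                           NEXT;
op_add:   { int32_t b = vm.data.back(); vm.data.pop_back();
            vm.data.back() += b; }                          NEXT;
op_halt:    return vm.data.back();
#undef NEXT
}

For example, run(vm, {0, 5, 1, 3}) pushes 5, doubles it, and halts with 10 on the data stack.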
You might have a look at what the HP people did with Dynamo. They got a lot of performance out of translating PA-RISC to PA-RISC by doing exactly what you describe.

> Hope that explains my thinking better. Does that change at all your
> view of the benefits that I might achieve from LLVM?

It doesn't change it, in the sense that I think LLVM will work well for this. JIT speed could be an issue, but that will be amortized if the opcode sequence is executed enough times.

One way to speed up the JIT is to pre-generate a set of instruction templates for each opcode that get filled in with specific information available at runtime. See the papers on DyC for some examples; I believe the Dynamo folks also took this route. This would be quite extensive work in LLVM but would be very valuable, I think. Partial evaluation papers may also be useful to explore.

HTH.

-Dave
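Since the JIT cost is amortized only once a sequence is hot, here is one last hedged sketch of the bookkeeping an implementation might use: count executions of a bytecode region in the interpreter and only hand it to the LLVM JIT after it crosses a threshold. The threshold value and the compile callback are placeholders for illustration, not anything LLVM provides.

#include <cstdint>
#include <unordered_map>

// Signature of a region compiled by the (not shown) LLVM-based translator.
using Compiled = int32_t (*)(void *vmState);

struct HotSpotCache {
  std::unordered_map<uint32_t, uint32_t> counts;    // region start -> hit count
  std::unordered_map<uint32_t, Compiled> compiled;  // region start -> native code
  static constexpr uint32_t kThreshold = 1000;      // tuning knob, arbitrary here

  // 'compile' is whatever builds IR for the region, runs the optimizers and
  // asks the JIT for a function pointer. Returns nullptr while the region is
  // still cold, in which case the caller stays in the plain interpreter.
  Compiled lookupOrCompile(uint32_t regionStart, Compiled (*compile)(uint32_t)) {
    auto it = compiled.find(regionStart);
    if (it != compiled.end())
      return it->second;                            // reuse previous compilation
    if (++counts[regionStart] >= kThreshold) {
      Compiled fn = compile(regionStart);
      compiled[regionStart] = fn;
      return fn;
    }
    return nullptr;
  }
};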