Hi Folks --

Let me pose a question similar to the one Joachim poses below. I too am trying to evaluate whether LLVM will be of use to me in building a compiler and garbage collection mechanism for a byte-code VM that I have built. Although there are multiple types of code that can be created with this system, all of them eventually translate to execution of one or more of these byte codes -- with the exception of code written directly in Intel assembly language (which I understand can be emitted as-is and left untouched by the LLVM compiler if desired).

There are about 32 core opcodes that constitute the basic instruction set, and I can envision mapping each of these to some sequence of LLVM IR. There is also a much larger set of "extended opcodes" that are executed by the same core instruction execution loop but which are coded using the built-in Intel assembler and added dynamically by the system. I could envision also going to the trouble of mapping each of these to a sequence of LLVM IR instructions, and then emitting a series of LLVM IR sequences purely based on the sequence of VM opcodes encountered in a scan of code compiled for the VM.

I'm hoping that such a product could then be submitted to all the LLVM optimizations and result in better Intel assembly code generation than what I have hand-coded myself (in my implementations of either the core or the extended opcodes -- and especially in the Intel code sequences resulting from the use of these opcodes in combination). So the first question is simply to ask for a validation of this thinking and whether such a strategy seems feasible.

The second question pertains to this discussion thread. If, at the end of the day, all I am trying to do is compile for an 80x86 platform (ideally targeting Windows, Linux and the Mac) and don't need to target multiple processors, then LLVM should add significant value for me, provided the answer to the first question is that the strategy is sound. And most of the discussion in this thread about platform-specific issues shouldn't apply if I only have one processor type to target. Am I thinking about this correctly?

Any insights from some of you old hands would be greatly appreciated. Thanks.

Mike

Date: Wed, 05 Oct 2011 17:10:36 -0500
From: greened at obbligato.org (David A. Greene)
Subject: Re: [LLVMdev] LLVM IR is a compiler IR
To: Joachim Durchholz <jo at durchholz.org>
Cc: llvmdev at cs.uiuc.edu

Joachim Durchholz <jo at durchholz.org> writes:

> Now that the dust begins to settle... I'm wondering whether LLVM is for me.
>
> I'm working on something that can be used to create software for
> different environments: C/C++, JVM, CLR, Parrot, etc.
> I.e. one language for different environments, but not write once, run
> anywhere.
>
> Now what would be the role of LLVM in such an infrastructure?
> Just backend for C/C++ linkage, and I should go and look elsewhere for
> JVM/CLR/whateverVM?
> Should I look into LLVM subprojects? Which ones?

It depends on what you want to do with the IR. If you want to create object files, LLVM is great. You just need to map the semantics of the various HLLs onto the LLVM IR language, as with any translator. For any kind of code-generator-ish thing, it's hard to beat LLVM IR, IMHO.

If you want to JIT, then some of LLVM IR's limitations will impact the speed of code generation, as Dan outlined.
If you want to do fancy transformations that use or analyze high-level language semantics, LLVM IR may not be right for you, as most of that information is lost by the time the code has been converted to LLVM IR.

-Dave
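To make the "map the semantics onto LLVM IR" step concrete, here is a minimal sketch, assuming the C++ IRBuilder API, of how a single core opcode such as Mike's '2*' (multiply top of stack by 2) might be lowered. The function and variable names are hypothetical, and exact header paths and signatures vary between LLVM releases.

#include "llvm/IR/IRBuilder.h"   // older releases: "llvm/Support/IRBuilder.h"

using namespace llvm;

// 'tos' is the SSA value currently modelling the VM's top-of-stack cell.
// A translator would call one such emitter per opcode while scanning the
// bytecode, threading the returned value into the next emitter.
static Value *emitTwoStar(IRBuilder<> &B, Value *tos) {
  // 2* is a left shift by one; once a whole opcode sequence is in IR,
  // the optimizers are free to fold this with neighbouring arithmetic.
  return B.CreateShl(tos, ConstantInt::get(tos->getType(), 1), "tos.x2");
}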
Michael Clagett <mclagett at hotmail.com> writes:

> There's about 32 core op codes that constitute the basic instruction
> set and I can envision mapping each of these to some sequence of LLVM
> IR. There's also a whole lot more "extended opcodes" that are
> executed by the same core instruction execution loop but which are
> coded using the built-in Intel assembler and added dynamically by the
> system. I could envision also going to the trouble of mapping each of
> these to a sequence of LLVM IR instructions and then being able to
> emit a series of LLVM IR sequences purely based on the sequence of vm
> opcodes encountered in a scan of code compiled for the vm.
>
> I'm hoping that such a product could then be submitted to all the LLVM
> optimizations and result in better Intel assembly code generation than
> what I have hand-coded myself (in my implementations of either the
> core or the extended opcodes -- and especially in the intel code
> sequences resulting from the use of these opcodes in sequences
> together). So first question is simply to ask for a validation of
> this thinking and whether such a strategy seems feasible.

Let me make sure I'm understanding you correctly. You want to map each of your opcodes into an LLVM sequence and then use the LLVM optimizations and JIT to generate efficient native code implementations? Then you would invoke those implementations during interpretation?

Or is it that you want to take a bytecode program, map it to LLVM IR, run it through optimizations and codegen to produce a native executable?

Either one of these will work, and LLVM seems like a good match as long as you don't expect the optimizations to understand the higher-level semantics of your opcodes (without some work by you, at least).

I don't quite grasp any benefit to the first use, as I would just go ahead and generate the optimal native code sequence for each opcode once and be done with it. No LLVM needed at all. So I suspect this is not what you want to do.

-Dave
Sorry for the noise, but this is the message I meant to send to the list rather than replying to David directly. Unfortunately, what I just sent to the list was his message to me.

From: mclagett at hotmail.com
To: greened at obbligato.org
Subject: RE: [LLVMdev] LLVM IR is a compiler IR
Date: Thu, 6 Oct 2011 19:44:11 +0000

Thanks for your prompt reply. My answers are below, at the end of your message.

> Let me make sure I'm understanding you correctly. You want to map each
> of your opcodes into an LLVM sequence and then use the LLVM optimizations
> and JIT to generate efficient native code implementations? Then you
> would invoke those implementations during interpretation?
>
> Or is it that you want to take a bytecode program, map it to LLVM IR,
> run it through optimizations and codegen to produce a native executable?
>
> Either one of these will work and LLVM seems like a good match as long
> as you don't expect the optimizations to understand the higher-level
> semantics of your opcodes (without some work by you, at least).
>
> I don't quite grasp any benefit to the first use as I would just go
> ahead and generate the optimal native code sequence for each opcode once
> and be done with it. No LLVM needed at all. So I suspect this is not
> what you want to do.
>
> -Dave

It is actually the first of your alternatives above that I was hoping to achieve, and the reason I was thinking this would be valuable is twofold. First, I don't have so much faith in the quality or optimal character of my own byte code implementations. The 32 core opcodes tend to be high-level implementations of low-level operations on the elements of the virtual machine -- things like stack operations on the two built-in stacks of the virtual machine, or movements to and from the VM's address register, or access to and from the addresses stored there. Other core opcodes include '2*' (multiply top of stack by 2), 'COM' (one's complement of top of stack), ';' (jump the VM instruction pointer to the address on top of the return stack), and that sort of thing. The tops of the data and return stacks are mapped to registers EAX and EDI, respectively, and the address register is mapped to ESI.
But any stack operations that involve 'push'-ing, 'pop'-ing, 'dup'-ing, 'drop'-ing, etc. end up going to memory, where the bulk of the stack storage lives. Each of these core primitives is on average around 10 Intel instructions long, and many of the extended opcodes that have been coded exist to bypass more costly sequences of the core primitives they replace. It was my general feeling that a good SSA-based compilation mechanism like LLVM's could do a better job of maximizing the use of the Intel architecture's limited resources than I could.

Moreover, as long as code remains at the VM instruction level, these resources are even more constrained than usual. EDX needs to be preserved to hold the VM instruction pointer, and EAX, ESI and EDI need to be preserved for the purposes outlined above. So there is an advantage in compiling VM opcode sequences to assembly that can violate these invariants over a longer stretch of code and use the entire register set for an extended period. Similar considerations apply to simply reducing, from 10 instructions down to 1 or 2, operations that at the VM level require the stack but that at the Intel assembly level would more naturally be handled in registers.

Finally, I just have the general feeling that more significant compiler optimizations can be effected across sequences of my VM opcode implementations. This is a general feeling, but one I hope is fairly well-founded.

Hope that explains my thinking better. Does that change at all your view of the benefits that I might achieve from LLVM?

Thanks.

Mike
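For concreteness, here is a rough sketch of the translation loop this implies -- an illustration under assumptions, not existing code -- keeping the VM's data stack as a vector of SSA values while scanning the bytecode, so that pushes, pops and dups become register-level data flow that LLVM can optimize across opcode boundaries. The opcode enum, instruction struct and helper names are invented, and IRBuilder signatures vary between LLVM releases.

#include <vector>
#include "llvm/IR/IRBuilder.h"

using namespace llvm;

enum class Op { Lit, Dup, TwoStar, Add, Drop };   // tiny made-up subset

struct Insn { Op op; int64_t imm; };

static void emitSequence(IRBuilder<> &B, const std::vector<Insn> &code) {
  std::vector<Value *> stack;                     // virtual data stack of SSA values
  Type *i32 = Type::getInt32Ty(B.getContext());
  for (const Insn &I : code) {
    switch (I.op) {
    case Op::Lit:  stack.push_back(ConstantInt::get(i32, I.imm)); break;
    case Op::Dup:  stack.push_back(stack.back()); break;
    case Op::TwoStar:
      stack.back() = B.CreateShl(stack.back(), ConstantInt::get(i32, 1));
      break;
    case Op::Add: {
      Value *b = stack.back(); stack.pop_back();
      stack.back() = B.CreateAdd(stack.back(), b);
      break;
    }
    case Op::Drop: stack.pop_back(); break;
    }
  }
  // Values still live on 'stack' here would be spilled back to the real VM
  // stack (and EAX for the top element) at the boundary of the compiled
  // region, restoring the register-mapping invariants described above.
}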
Michael Clagett <mclagett at hotmail.com> writes:

> It is actually the first of your alternatives above that I was hoping
> to achieve and the reason I was thinking this would be valuable is
> twofold. First, I don't have so much faith in the quality or optimal
> character of my own byte code implementations. The core 32 opcodes
> tend to be high-level implementations of low-level operations in
> dealing with the elements of the virtual machine

Ok, so these are mildly complex operations. It makes sense to start with some kind of machine-generated asm implementation to get optimized performance. Don't write the opcode implementations in asm directly. Write them in a high-level language and compile them to native code. See below.

> The top of data and return stacks are mapped to register EAX and EDI,
> respectively, and the address reg is mapped to ESI.

Does this have to be the case? See below.

> It was my general feeling that a good SSA-based compilation mechanism
> like that of LLVM could do a better job at maximizing the use of the
> Intel's limited resources than I could.

As an alternative to using the JIT, would it be possible to implement each opcode in its own interpreter function and compile them statically? Of course there would be call overhead interpreting each opcode. Once you've got that, you could apply various techniques such as threading the interpreter (not multiprocessing, but threading the interpreter as in http://en.wikipedia.org/wiki/Threaded_code) to eliminate the overhead.

I don't think there's any particular reason to rely on the JIT unless you want to take it a bit further and optimize a specific sequence of opcodes seen when interpreting a specific program. But then we're getting into the various Futamura transformations. :)

> Moreover, as long as code remains at the VM instruction level, these
> resources are even more constrained than usual. EDX needs to be
> preserved to hold the VM instruction pointer. EAX, ESI and EDI need
> to be preserved for the purposes outlined above.

Why do those registers need to be preserved? Imagine the interpreter were written completely in a high-level language. The compiler doesn't care which register holds a stack pointer, data pointer, etc. as long as the virtual machine's state is consistent.

> Similar considerations apply to simply reducing from 10 instructions
> to 1 or 2 instructions operations that at the VM level require the
> stack, but that at the intel assembler level would more naturally be
> handled in registers.

Ah, ok, this is interesting. You want to change the execution model on-the-fly. A grad school colleague of mine did something very similar to this, translating a stack machine into a register machine. Of course he's a hardware nut, so he designed hardware to do it. :) Unfortunately, I don't think he ever published anything on it.

Doing the threading thing mentioned above or the JIT/Dynamo thing mentioned below can both accomplish this, I think, and without any register constraints, if I'm understanding you correctly.

> Finally, I just have the general feeling that more significant
> compiler optimizations can be effected across sequences of what are my
> vm opcode implementations. This is a general feeling, but I'm hoping
> fairly well-founded.

Yes, that's true. See the Futamura reference above. Given a VM and an input program, you can in fact generate an optimized executable. This is the logical extension of what you're getting at. For this kind of thing a JIT makes sense.
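As an illustration of the threaded-code idea (again a sketch under assumptions, not code from this thread): a direct-threaded dispatch loop using the GCC/Clang computed-goto extension, with a made-up four-opcode VM. Each handler jumps straight to the next handler, so the central dispatch loop and per-opcode call/return overhead disappear, and the compiler is free to keep the VM state in whatever registers it likes.

#include <cstdint>
#include <vector>

struct VM { std::vector<int32_t> data; size_t ip = 0; };

// Made-up encoding: 0 = LIT <byte>, 1 = 2*, 2 = +, 3 = HALT.
int32_t run(VM &vm, const std::vector<uint8_t> &code) {
  static void *dispatch[] = { &&op_lit, &&op_twostar, &&op_add, &&op_halt };
#define NEXT goto *dispatch[code[vm.ip++]]
  NEXT;
op_lit:     vm.data.push_back(code[vm.ip++]);               NEXT;
op_twostar: vm.data.back() <<= 1;                           NEXT;
op_add:   { int32_t b = vm.data.back(); vm.data.pop_back();
            vm.data.back() += b; }                          NEXT;
op_halt:    return vm.data.back();
#undef NEXT
}

For example, run(vm, {0, 5, 1, 3}) pushes 5, doubles it, and halts with 10 on the data stack.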
You might have a look at what the HP people did with Dynamo. They got a lot of performance out of translating PA-RISC to PA-RISC by doing exactly what you describe.

> Hope that explains my thinking better. Does that change at all your
> view of the benefits that I might achieve from LLVM?

It doesn't change it, in the sense that I think LLVM will work well for this. JIT speed could be an issue, but that will be amortized if the opcode sequence is executed enough times.

One way to speed up the JIT is to pre-generate a set of instruction templates for each opcode that get filled in with specific information available at runtime. See the papers on DyC for some examples; I believe the Dynamo folks also took this route. This would be quite extensive work in LLVM but would be very valuable, I think. Partial evaluation papers may also be useful to explore.

HTH.

-Dave
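Since the JIT cost is amortized only once a sequence is hot, here is one last hedged sketch of the bookkeeping an implementation might use: count executions of a bytecode region in the interpreter and only hand it to the LLVM JIT after it crosses a threshold. The threshold value and the compile callback are placeholders for illustration, not anything LLVM provides.

#include <cstdint>
#include <unordered_map>

// Signature of a region compiled by the (not shown) LLVM-based translator.
using Compiled = int32_t (*)(void *vmState);

struct HotSpotCache {
  std::unordered_map<uint32_t, uint32_t> counts;    // region start -> hit count
  std::unordered_map<uint32_t, Compiled> compiled;  // region start -> native code
  static constexpr uint32_t kThreshold = 1000;      // tuning knob, arbitrary here

  // 'compile' is whatever builds IR for the region, runs the optimizers and
  // asks the JIT for a function pointer. Returns nullptr while the region is
  // still cold, in which case the caller stays in the plain interpreter.
  Compiled lookupOrCompile(uint32_t regionStart, Compiled (*compile)(uint32_t)) {
    auto it = compiled.find(regionStart);
    if (it != compiled.end())
      return it->second;                            // reuse previous compilation
    if (++counts[regionStart] >= kThreshold) {
      Compiled fn = compile(regionStart);
      compiled[regionStart] = fn;
      return fn;
    }
    return nullptr;
  }
};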