Hello, I am Ramon Garcia Fernandez. My interest in LLVM is to develop a frontend for Java virtual machine bytecode, so that Java programs can be run under LLVM.

You may ask why not use the Java virtual machine. Although it may be improved, it has some misfeatures. This is what I have learned. It makes communication with native code too expensive. Passing an array between native code and the virtual machine requires a copy of the data. Why? Because Java uses copying garbage collectors, so the virtual machine may move an object at any time. The implementation of generational garbage collection in Java uses a separate area of memory for each generation, so that when an object is promoted from the young generation to the old one its storage must be moved. This may give some performance advantage, by keeping young objects close together in memory, but at the cost of making the exchange of data with native code expensive. In particular, data copying is required for reading and writing files, sending or receiving data over the network, or drawing. Since Java is not often used for numerical analysis or other tasks that require little data exchange with the outside world, I disagree that an implementation with a copying collector is good for most applications.

A more obvious problem is, of course, that it is not possible to compile Java code statically and save the result on disk.

So I am starting to write a compiler from Java bytecode to LLVM bytecode. For now I am designing, dealing with things such as how to assign stack positions to the operands of each instruction.

My target is to deliver something simple. Operations such as classloader creation and dynamic class loading will not be supported.

Hoping that this is the start of a long term cooperation,

Ramon
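The stack-position question raised in the message above can be illustrated with a small sketch. It is not the design Ramon describes, only one common way to approach it: walk the bytecode while tracking the operand-stack depth, and name each stack position (s0, s1, ...) so that stack instructions become three-address statements. The class `StackSlots`, the method `translate`, and the textual instruction format are hypothetical, and only a handful of integer opcodes in straight-line code are handled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not the frontend's code: names operand-stack positions. */
final class StackSlots {
    /** Translates straight-line code using iload, bipush, iadd, imul, istore. */
    static List<String> translate(String... code) {
        List<String> out = new ArrayList<>();
        int depth = 0; // number of values currently on the operand stack
        for (String insn : code) {
            String[] parts = insn.split(" ");
            switch (parts[0]) {
                case "iload":  // push a local variable into the next free slot
                    out.add("s" + depth + " = local" + parts[1]);
                    depth++;
                    break;
                case "bipush": // push a small constant into the next free slot
                    out.add("s" + depth + " = " + parts[1]);
                    depth++;
                    break;
                case "iadd":
                case "imul": {
                    // pops two operands, pushes one: the result reuses the lower slot
                    if (depth < 2) throw new IllegalStateException("stack underflow: " + insn);
                    String op = parts[0].equals("iadd") ? "+" : "*";
                    int rhs = depth - 1;
                    int lhs = depth - 2;
                    out.add("s" + lhs + " = s" + lhs + " " + op + " s" + rhs);
                    depth--;
                    break;
                }
                case "istore": // pop the top slot into a local variable
                    if (depth < 1) throw new IllegalStateException("stack underflow: " + insn);
                    depth--;
                    out.add("local" + parts[1] + " = s" + depth);
                    break;
                default:
                    throw new IllegalArgumentException("unsupported opcode: " + insn);
            }
        }
        return out;
    }
}
```

For example, `StackSlots.translate("iload 1", "iload 2", "bipush 3", "imul", "iadd", "istore 0")` computes local0 = local1 + local2 * 3 in terms of slots s0 to s2. In an LLVM setting each slot could plausibly become a stack allocation that the mem2reg pass later promotes to SSA registers, but that choice is an assumption here, not something stated in the thread.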
Have you looked at this?
http://llvm.org/viewvc/llvm-project/java/trunk/

Someone started a Java frontend. Perhaps you could finish it?

-Tanya

On Feb 2, 2008, at 5:48 PM, Ramón García wrote:
> Hello, I am Ramon Garcia Fernandez. My interest in LLVM is to develop
> an interface for Java virtual machine bytecodes, so that Java programs
> can be run under LLVM.
> [...]
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
No, I didn't. I am going to look at it.

Ramon

On Feb 3, 2008 3:09 AM, Tanya Lattner <tonic at nondot.org> wrote:
> Have you looked at this?
> http://llvm.org/viewvc/llvm-project/java/trunk/
I have just worked with this code. The architecture is fine, and I think that this code should be reused. It needs updating, however, because it does not compile with LLVM 2.1 (I prefer to use a stable version to focus my work, and port to LLVM 2.2 later).

One incompatibility I have seen is that this Java frontend requires C++ exceptions, but LLVM is compiled with -fno-exceptions. For now, I am compiling with -fexceptions. Should exceptions be removed from the code of the Java frontend?

Also, I have doubts about whether the changes I made to get it built are correct. I will ask more questions later.

This could be a work plan:

* Get the Java frontend built.
* Implement exception handling (and the jsr/ret bytecodes used for finally blocks).
* Implement garbage collection.
* Support JAR files.

This should give a usable Java implementation. But there is still very hard work to be done. The difficult part is dynamic class loading, reflection and creation of classloaders. This would make it possible to use LLVM for Java server applications such as Tomcat or JBoss. I am not sure if this work is possible without funding a full time position.

Just some questions to think about. To what extent does LLVM support dynamic code loading? Is it possible to get code loaded at runtime? Could this break assumptions made by interprocedural optimization? (A function may be called in unexpected ways.)

Another difficult part is optimization. In order to get good performance, references should be converted to values whenever possible. Recent virtual machines support escape analysis, so that local references can be converted into values that are allocated on the stack and released with it. This should be generalized to references that are class members.

Java code is particularly hard to optimize because almost every method call is a virtual call. Is inlining possible? Only if one assumes that no other class overrides the called method. Programmers could declare methods final, but this is rarely done. The assumption may be checked for all loaded classes, but for classes not yet loaded (and which may be loaded dynamically), who knows?

But this is a very long-term feature. For now, let us have fun completing the doable parts.

Best regards,
Ramon
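The inlining condition discussed at the end of the message above can be sketched in Java itself using reflection. This is only an illustration of the check, not how any particular virtual machine or the LLVM Java frontend does it; `Devirtualize`, `canBindStatically`, `Shape`, and `Square` are hypothetical names, and only methods without parameters are considered.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

/** Hypothetical sketch: may a call be bound statically, given the classes loaded so far? */
final class Devirtualize {
    // Example hierarchy used only for illustration.
    static class Shape {
        double area() { return 0.0; }
        String name() { return "shape"; }
    }
    static class Square extends Shape {
        @Override double area() { return 1.0; }
    }

    /** Looks up a no-argument method declared directly in c, or returns null. */
    private static Method declared(Class<?> c, String name) {
        try {
            return c.getDeclaredMethod(name);
        } catch (NoSuchMethodException e) {
            return null;
        }
    }

    /**
     * Returns true if base.name() has exactly one possible target among the
     * loaded classes. The answer holds only until another class is loaded.
     */
    static boolean canBindStatically(Class<?> base, String name, Class<?>... loaded) {
        Method m = declared(base, name);
        if (m == null) throw new IllegalArgumentException("no such method: " + name);
        int mods = m.getModifiers();
        // final, static and private methods can never be overridden
        if (Modifier.isFinal(mods) || Modifier.isStatic(mods) || Modifier.isPrivate(mods)
                || Modifier.isFinal(base.getModifiers())) {
            return true;
        }
        for (Class<?> c : loaded) {
            boolean isSubclass = c != base && base.isAssignableFrom(c);
            if (isSubclass && declared(c, name) != null) {
                return false; // a loaded subclass overrides the method
            }
        }
        return true;
    }
}
```

With `Shape` and `Square` both loaded, `area` cannot be bound statically because `Square` overrides it, while `name` can. The limitation Ramon points out remains: a class loaded later may override `name`, so a compiler relying on this check needs some way to revisit the decision.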
You probably want to sit down and have a long talk with Jeroen Frijters, the principal behind the IKVM project. Note that you will have to deal with ClassLoaders at some level, regardless of what you'd prefer.

> A more obvious problem is, of course, that it is not possible to
> compile Java code statically and save the result in the disk.

That is untrue--last time I checked, gcj does this out of the box. Several other tools used to (TowerJ, I think its name was), but the demand for this turned out to be nil and they folded. Most of Java's appeal lies in its ability to dynamically link libraries.

And quite frankly, the overhead of passing native data across that JNI boundary is generally pretty tiny, unless you do some truly idiotic things in either your Java or your JNI/C++ code. I still wouldn't want to do it in a tight loop, mind you, but it's generally not more than a handful of assembly instructions. (This is what I've been told, anyway--I haven't pored over the OpenJDK sources to find the actual code that does the translation.)

Having said that, I think a JVM->LLVM bytecode converter is a really cool idea. But I think you're ultimately going to come to the same decision IKVM did, which is to support ClassLoading as well as static loading.

Ted Neward
Java, .NET, XML Services
Consulting, Teaching, Speaking, Writing
http://www.tedneward.com

> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of Ramón García
> Sent: Saturday, February 02, 2008 5:48 PM
> To: llvmdev at cs.uiuc.edu
> Subject: [LLVMdev] Introducing myself, and Java project.
> [...]
Sorry for the confusion about the JNI overhead; perhaps I wasn't clear. The big overhead of calling through JNI arises when one passes an array, because in that case the data must be copied (the JNI interface lets the implementation choose whether to copy, but the current JDK implementation always copies). This means that for a Java application to read data from a file, to fetch bytes from a network connection, or to paint a bitmap on the screen, the data must be copied between virtual machine memory and native memory around the operation. The reason is that the garbage collector moves objects, so native code cannot assume that the memory position of an object or array is fixed.

See, for instance, the specification of Get<PrimitiveType>ArrayElements:
http://java.sun.com/j2se/1.5.0/docs/guide/jni/spec/functions.html#wp17382

Ramon

> And quite frankly, the overhead of passing native data across that JNI
> boundary is generally pretty tiny, unless you do some truly idiotic things
> in either your Java or your JNI/C++ code. I still wouldn't want to do it in
> a tight loop, mind you, but it's generally not more than a handful of
> assembly instructions. (This is what I've been told, anyway--I haven't pored
> over the OpenJDK sources to find the actual code that does the translation.)
>
> Having said that, I think a JVM->LLVM bytecode converter is a really cool
> idea. But I think you're ultimately going to come to the same decision IKVM
> did, which is to support ClassLoading as well as static loading.