Hi,

David Chisnall via llvm-dev wrote:
> On 19 Jul 2016, at 04:06, Lorenzo Laneve via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> My idea was to create a complete backend treating Java as a normal platform, to enable LLVM to compile programs to Java Bytecode (.class) and Java Archive files (.jar). This could be useful in situations where we need to compile a program for a platform still not natively supported by LLVM.
>>
>> I don't know if it exists already, I've heard about this "LLJVM" but I don't think it does the same thing as my idea.
>> What do you think?
>
> I think that it will be difficult. Java bytecode is intrinsically designed to be memory safe, whereas LLVM IR is not. There is no equivalent of inttoptr or ptrtoint in Java bytecode and the closest equivalent of a GEP is to retrieve a field from an object (though that’s only really for GEP + load/store).
>
> You could potentially do something a bit ugly and treat all of LLVM memory as one big ByteBuffer object, and make pointers indexes into this, but then you’d make it very hard for your LLVM-originating code to interoperate with Java-originating code and so you’d have to write a lot of code to take the place of the system call layer.

The caveat here is that Java has this "private" but-not-really-in-practice API called sun.misc.Unsafe that can be used to access native memory. So you can have (I'm paraphrasing, the method names may not match):

    long addr = unsafe.allocateMemory(64);  // size in bytes
    unsafe.putInt(addr + 48, 9001);
    int val = unsafe.getInt(addr + 48);

etc. You may even get decent performance out of this, since JIT compilers tend to have to optimize these well (they're commonly used in the implementation of some popular JDK classes).

But you're right that it will still be difficult to naively inter-operate between Java and C++ objects. Which is why it will be an interesting research project. :)

-- Sanjoy

> Oh, and I doubt that you’ll find many more platforms that have a fully functional JVM than are LLVM targets. Even big-endian MIPS64 is not well-supported by Java (JamVM - a pure interpreter - is the only thing that we’ve managed to find that works).
>
> David
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
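
For readers following along, the "one big ByteBuffer, pointers are indexes" idea quoted above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not anything proposed in the thread: the class name FlatMemory, the bump allocator, the little-endian byte order and the reserved offset 0 are all hypothetical choices. Because a "pointer" is just an int offset, inttoptr and ptrtoint become no-ops and GEP-style address arithmetic is plain integer addition. The offsets and values mirror the paraphrased putInt/getInt example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical flat "address space": one ByteBuffer, pointers are int offsets. */
final class FlatMemory {
    private final ByteBuffer heap;
    // Bump pointer. Offsets below 8 are never handed out, so 0 can act as null.
    private int next = 8;

    FlatMemory(int sizeInBytes) {
        // Little-endian is an arbitrary choice for this sketch.
        this.heap = ByteBuffer.allocate(sizeInBytes).order(ByteOrder.LITTLE_ENDIAN);
    }

    /** Stand-in for malloc: returns a "pointer", i.e. an offset into the heap. */
    int allocate(int bytes) {
        int addr = next;
        int end = addr + ((bytes + 7) & ~7); // round the size up to 8 bytes
        if (bytes < 0 || end < addr || end > heap.capacity()) {
            throw new OutOfMemoryError("flat heap exhausted");
        }
        next = end;
        return addr;
    }

    /** Store a 32-bit value at an absolute offset (bounds-checked by ByteBuffer). */
    void storeInt(int addr, int value) {
        heap.putInt(addr, value);
    }

    /** Load a 32-bit value from an absolute offset. */
    int loadInt(int addr) {
        return heap.getInt(addr);
    }

    public static void main(String[] args) {
        FlatMemory mem = new FlatMemory(1024);
        int addr = mem.allocate(64);
        mem.storeInt(addr + 48, 9001);              // pointer arithmetic is integer addition
        System.out.println(mem.loadInt(addr + 48)); // prints 9001
    }
}
```

Note that nothing in this heap is visible to ordinary Java objects, which is exactly the interoperability problem raised above: every call into Java-originating code would need glue that copies data in and out of the buffer.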
On 19 Jul 2016, at 15:52, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>
> But you're right that it will still be difficult to naively
> inter-operate between Java and C++ objects. Which is why it will be
> an interesting research project. :)

If that’s your goal, then you might have better luck doing source-to-source than going via LLVM IR. In the past, I’ve managed to do most of Objective-C (not goto, pretty much everything else) -> JavaScript and Dart using clang AST visitors, with Objective-C classes being represented as native objects, with a bit of glue code to paper over the differences in the object models. With C++, you would most likely want to implement subclassing as composition and make each C++ class a Java interface so that you could do all of the kinds of casts required for multiple inheritance.

There have been a few attempts at C to Java compilation, though I’m not sure of the current status of any of them. gcj implements Java classes using the same ABI as C++ classes, effectively treating Java as a subset of C++ (plus garbage collection). It might be interesting to start with the same subset of C++ that Microsoft has used for one of their various C++-on-the-CLR implementations and work from there. I think Alp Toker had some parsing for MS managed C++ extensions and CLR code generation working a few years back, but I’m not sure what happened to his code.

David
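
To make the "subclassing as composition, each C++ class a Java interface" suggestion concrete, here is one possible shape for the generated Java. It is a hand-written sketch, not output from any real translator, and every name in it (Shape, Named, NamedSquare and the *Impl classes) is hypothetical. The C++ input it imagines is shown in the leading comment. It does not attempt to model virtual calls from a base back into a derived override, which a real mapping would also need.

```java
// Hypothetical C++ input being modelled (not taken from the thread):
//   struct Shape { double side; virtual double area(); };
//   struct Named { std::string name; virtual std::string label(); };
//   struct NamedSquare : Shape, Named {};

final class MultipleInheritanceSketch {
    // Each C++ class becomes a Java interface...
    interface Shape { double area(); }
    interface Named { String label(); }
    // ...and a derived class becomes an interface extending all of its bases.
    interface NamedSquare extends Shape, Named {}

    // The state and behaviour of each base live in a separate implementation class.
    static final class ShapeImpl implements Shape {
        private final double side;
        ShapeImpl(double side) { this.side = side; }
        @Override public double area() { return side * side; }
    }

    static final class NamedImpl implements Named {
        private final String name;
        NamedImpl(String name) { this.name = name; }
        @Override public String label() { return name; }
    }

    // Subclassing as composition: the derived object owns one instance of
    // each base and forwards calls to it.
    static final class NamedSquareImpl implements NamedSquare {
        private final Shape shapeBase;
        private final Named namedBase;

        NamedSquareImpl(String name, double side) {
            this.shapeBase = new ShapeImpl(side);
            this.namedBase = new NamedImpl(name);
        }

        @Override public double area() { return shapeBase.area(); }
        @Override public String label() { return namedBase.label(); }
    }

    public static void main(String[] args) {
        NamedSquare sq = new NamedSquareImpl("unit", 2.0);
        Shape asShape = sq;               // upcast to the first base
        Named asNamed = (Named) asShape;  // cross-cast between bases, as dynamic_cast would do
        System.out.println(asNamed.label() + " " + asShape.area()); // prints "unit 4.0"
    }
}
```

Because the derived interface extends every base interface, both upcasts and cross-casts are ordinary Java reference casts, which is the property the paragraph above is after.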
On 19 July 2016 at 15:52, Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> But you're right that it will still be difficult to naively
> inter-operate between Java and C++ objects. Which is why it will be
> an interesting research project. :)

If you're trying to bypass/replace JNI, you're in for a surprise. :)

The number of bugs I found while interacting with Java from C or C++ on different VMs (MS, Sun, OpenJDK) was astounding. Apart from the usual C++ class layout (which may be better in gcj as David says), we had corruption in the stack because the VMs weren't understanding the unwind information.

I originally found the stack bug in 2002 on Windows, later checked in 2008 and it was still there. I'd be surprised if that's fixed, and even more surprised if that's the only remaining problem.

And those were only through JNI, a relatively safe interface. If you try to send C++ directly to Java Bytecode, you'll find a huge list of "implementation details" that are not just undefined, but thoroughly undocumented and different on purpose (like memory allocation, signals, asynchronous I/O, threads, etc).

Good luck! :)

cheers,
--renato
I thought about something like that, but I don't think it's a good idea. Writing an AST visitor in Clang, for example, would be cool, but it wouldn't be open to other frontends, and I think this is a job for LLVM.

What about java-* attributes that can be put on certain IR operations, either to describe structures the backend needs to know about for Java Bytecode, or to mark operations that should be translated in a specific way for Java? These attributes could be added to the IR modules optionally, like debug info.

> On Jul 19, 2016, at 5:28 PM, David Chisnall <david.chisnall at cl.cam.ac.uk> wrote:
>
>> On 19 Jul 2016, at 15:52, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>>
>> But you're right that it will still be difficult to naively
>> inter-operate between Java and C++ objects. Which is why it will be
>> an interesting research project. :)
>
> If that’s your goal, then you might have better luck doing source-to-source than going via LLVM IR. In the past, I’ve managed to do most of Objective-C (not goto, pretty much everything else) -> JavaScript and Dart using clang AST visitors, with Objective-C classes being represented as native objects, with a bit of glue code to paper over the differences in the object models. With C++, you would most likely want to implement subclassing as composition and make each C++ class a Java interface so that you could do all of the kinds of casts required for multiple inheritance.
>
> There have been a few attempts at C to Java compilation, though I’m not sure of the current status of any of them. gcj implements Java classes using the same ABI as C++ classes, effectively treating Java as a subset of C++ (plus garbage collection). It might be interesting to start with the same subset of C++ that Microsoft has used for one of their various C++-on-the-CLR implementations and work from there. I think Alp Toker had some parsing for MS managed C++ extensions and CLR code generation working a few years back, but I’m not sure what happened to his code.
>
> David
>