Hi Tilmann,

> Nevertheless, it is unlikely that llvm-qemu will ever be much faster
> than regular qemu (by replacing its code generator completely, which
> it currently does), which is due to the fact that regular qemu has a
> very lightweight code generator (it basically only copies blocks of
> memory and performs some patching to them and only does static
> register allocation) which generates reasonably good code, with a very
> low overhead for compilation time. In contrast the LLVM JIT generates
> really high quality code (in fact the JIT and the static compiler
> share the same code generator), but at a higher price in terms of
> compilation time.

How about storing generated code on disc?  Or the intermediate IR?  I'd
typically use the same things under qemu day after day and would be
happy to slowly build up a cache on disc.  Perhaps starting qemu with a
'spend time and add to cache' option when I'm getting it to learn, and a
'use only what's in the cache already' option when I haven't got the
time to wait.

Cheers,


Ralph.
On Sun, Apr 6, 2008 at 1:11 PM, Ralph Corderoy <ralph at inputplus.co.uk> wrote:

> How about storing generated code on disc?  Or the intermediate IR?  I'd
> typically use the same things under qemu day after day and would be
> happy to slowly build up a cache on disc.  Perhaps starting qemu with a
> 'spend time and add to cache' option when I'm getting it to learn, and a
> 'use only what's in the cache already' option when I haven't got the
> time to wait.

A similar approach is used by FX!32 to translate Win32 x86 binaries to
Alpha binaries (to run them on Windows NT for Alpha). Basically FX!32
performs incremental static recompilation. On the first run the binary
runs under an interpreter and profiling data is gathered (e.g. the
targets of indirect branches, as they usually can't be determined
statically). FX!32 runs a background process which uses the gathered
data to statically recompile the binary (doing optimizations which would
be too expensive to do at runtime). The next time the user executes the
binary, he is already executing native Alpha code, and in case he
executes parts of the program which he didn't reach the last time, or
which FX!32 couldn't determine statically, FX!32 will fall back to the
interpreter and collect profiling data again. This way the binary is
translated incrementally from x86 to Alpha.

In principle a system like FX!32 could be implemented with llvm-qemu.
For user-mode emulation it is certainly possible (these days the demand
for such a system seems rather low, though). However, for system
emulation (unlike user-mode emulation, system emulation also requires
MMU emulation) I would say static recompilation is not feasible, since
addresses change very frequently, e.g. a program running on the system
is very likely to be loaded at different addresses on every invocation,
so you would end up falling back to the interpreter most of the time.

Greetings,

Tilmann Scheller
Ralph Corderoy wrote:
> Hi Tilmann,
>
>> Nevertheless, it is unlikely that llvm-qemu will ever be much faster
>> than regular qemu (by replacing its code generator completely, which
>> it currently does), which is due to the fact that regular qemu has a
>> very lightweight code generator (it basically only copies blocks of
>> memory and performs some patching to them and only does static
>> register allocation) which generates reasonably good code, with a very
>> low overhead for compilation time. In contrast the LLVM JIT generates
>> really high quality code (in fact the JIT and the static compiler
>> share the same code generator), but at a higher price in terms of
>> compilation time.

One of the things I noticed in the last message on llvm-qemu was that
you were compiling output from qemu with llvm-gcc.  Is this correct?  If
so, then wouldn't that be one of the sources of overhead when using
LLVM?

It would make more sense to me (but also be more implementation work) to
link the LLVM JIT and CodeGen libraries directly into Qemu and to forgo
the Qemu->C->LLVM translation process.  Not only should this speed it up
(because you're removing the C preprocessing, C parsing, and execve()
overhead), but it should also give you more control over what LLVM is
doing (you can use LLVM's JIT infrastructure to re-codegen functions at
run-time and things like that).

-- John T.
On Sun, Apr 6, 2008 at 6:32 PM, John Criswell <criswell at uiuc.edu> wrote:

> One of the things I noticed in the last message on llvm-qemu was that
> you were compiling output from qemu with llvm-gcc.  Is this correct?

Not really, llvm-gcc is only invoked once, when llvm-qemu is being
built.

Qemu's code generator works like this: a source machine instruction is
decomposed into several simpler micro ops (basically the IR of qemu).
For every micro op there is a C implementation, which at build time is
compiled for the respective target architecture. Also at build time, the
actual code generator is generated by the tool dyngen, which parses the
compiled micro-op object file and generates a C function which receives
a micro-op stream and generates the corresponding machine code by
concatenating the respective machine code for the micro ops and doing
some additional patching (like inserting parameters for the micro ops).

What I did for llvm-qemu was to write llvm-dyngen, which instead of an
ELF file reads in a bitcode op.o (generated by compiling op.c with
llvm-gcc) and creates a function which basically concatenates and
patches the LLVM IR of the ops and then JITs the result. Every source
architecture has different micro ops, thus llvm-dyngen needs to be run
for each architecture to create its code generator.

Greetings,

Tilmann