Thank you very much for the feedback. I have tried to address the issues
raised in this updated proposal. If you have any suggestions or comments,
feel free to tell me.

Thanks in advance,

Tilmann


* Proposal for Google Summer of Code Project

** Using LLVM as a backend for QEMU's dynamic binary translation

*** Terms:
- host architecture: the architecture of the CPU QEMU is running on
- target architecture: the architecture of the program which is being
  executed within QEMU


*** Abstract:
The goal of this project is to modify the QEMU dynamic binary translator
to use components of the LLVM compiler infrastructure, turning it into a
highly optimizing dynamic binary translator in order to increase the
performance of QEMU even further. Instead of directly emitting code for
the host architecture QEMU is running on, the target code is first
translated to LLVM IR, then a selection of LLVM's optimization functions
is applied to the IR and as a last step the LLVM JIT is used to generate
code from the optimized IR for the host architecture. Since the
translation to LLVM IR, the optimization and the code generation come
at the cost of additional translation time, it is not feasible to apply
this process to every piece of code; otherwise the total execution time
would increase instead of decrease. A program typically spends about 90%
of its time within 10% of its code, so it is critical to get these 10%
to execute fast. Parts of the other 90% of the code might execute only
once or a few times, and the extra time spent generating optimized code
for them would not pay off. Therefore the idea is to identify the
"hotspots" by counting how many times a piece of code has been executed,
e.g. at basic block level, and to perform an optimizing translation once
a certain threshold is hit, falling back to the current binary
translation of QEMU otherwise.
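As a rough illustration of the hotspot counting described above, here is
a minimal sketch in C. All names (block_profile, block_tier,
note_execution, HOT_THRESHOLD, TABLE_SIZE) and the threshold value are
hypothetical and are not part of QEMU or LLVM. The point where a real
implementation would invoke the LLVM-based translator is only marked by
a comment.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: none of these names exist in QEMU or LLVM. */
#define HOT_THRESHOLD 3u    /* illustrative value, would need tuning */
#define TABLE_SIZE    1024u /* must be a power of two */

typedef enum { TIER_FAST, TIER_OPTIMIZED } block_tier;

typedef struct {
    uint32_t   pc;    /* target address of the block's first instruction */
    uint32_t   count; /* executions observed while in the fast tier */
    block_tier tier;  /* translator that produced the current host code */
    int        used;  /* nonzero once the slot holds a block */
} block_profile;

static block_profile profile_table[TABLE_SIZE];

/* Find the profile slot for a block. The table is direct-mapped, so a
 * colliding block simply evicts the previous entry and starts from zero.
 * Shifting by 2 assumes 4-byte aligned instructions (e.g. ARM); for other
 * targets it still works but may cause more collisions. */
static block_profile *profile_for(uint32_t pc)
{
    assert((TABLE_SIZE & (TABLE_SIZE - 1u)) == 0u);
    block_profile *p = &profile_table[(pc >> 2) & (TABLE_SIZE - 1u)];
    if (!p->used || p->pc != pc) {
        memset(p, 0, sizeof *p);
        p->pc = pc;
        p->tier = TIER_FAST;
        p->used = 1;
    }
    return p;
}

/* Record one execution of the block at pc and return the tier to run. */
block_tier note_execution(uint32_t pc)
{
    block_profile *p = profile_for(pc);
    if (p->tier == TIER_FAST && ++p->count >= HOT_THRESHOLD) {
        /* A real system would translate the block to LLVM IR, run the
         * selected optimizations and JIT the result at this point. */
        p->tier = TIER_OPTIMIZED;
    }
    return p->tier;
}
```

With HOT_THRESHOLD set to 3, the first two calls to note_execution for a
given address return TIER_FAST and the third returns TIER_OPTIMIZED. A
reasonable threshold is roughly the break-even point: if the optimizing
translation costs C more than the fast one and saves s per execution, it
only pays off after more than C/s executions.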
Detailed speed measurements will be performed in order to evaluate the
efficiency of this approach, especially in comparison to the approach
currently used by QEMU.


*** Benefits:
QEMU will benefit from this project through an expected increase in
speed, while remaining portable.
Through this project LLVM will effectively get front ends for all target
architectures supported by QEMU (at the moment these are x86, ARM,
SPARC, PowerPC and MIPS). This lays the ground for applying LLVM to
binary code, for example: optimizing binaries for which no source code
is available, instrumenting binary code (e.g. for performance analysis),
analyzing binary code to assist in reverse engineering, or static
recompilation (depending on the instruction set this requires additional
runtime code).
This project is a first step towards making LLVM suitable for static or
dynamic binary translation, and may thereby attract new LLVM users who
are interested in this subject.
It will show the applicability of LLVM in an emulation environment,
especially with regard to dynamic binary translation. It can also be
used as a basis to try out concepts like profile-guided optimization or
static optimization in the context of an emulator.
Also, since the LLVM JIT will be used for the final code generation,
QEMU can be hosted on any architecture targeted by the LLVM JIT (at the
moment these are x86, x86-64, PowerPC and PowerPC 64), at least as far
as code generation is concerned. Further adjustments to QEMU might be
necessary, though, to get QEMU to run on an architecture which is
supported by the LLVM JIT but not by QEMU.


*** Deliverables:
- a version of QEMU with an optimizing dynamic binary translator
  utilizing LLVM components
- a set of test suites created during development (with at least 80%
  statement coverage)
- all documentation necessary to understand and maintain the software


*** Plan:
The development of the software will be done within the three-month
timeframe of GSoC. Weekly status reports will be given.

Week 1:
- get familiar with LLVM and QEMU
- write small test programs for certain LLVM components, or even a
  simple prototype
- get to know LLVM example programs
Week 2, 3, 4:
- modify QEMU's dynamic binary translator to emit LLVM IR
- create tests to verify the translation
Week 5, 6:
- integrate LLVM JIT into QEMU's dynamic binary translator
- perform first speed measurements
Week 7, 8:
- integrate LLVM optimizations into QEMU
- perform more speed measurements, select useful optimizations
Week 9, 10:
- test the system extensively
- write final documentation
Week 11, 12:
- time buffer to deal with unexpected events


*** Qualification:
I'm a graduate student studying Software Engineering at the University
of Stuttgart in Germany. I have a strong interest in compiler technology
and see this project as a great opportunity to gain experience in this
field. I have taken a compiler building class and plan to focus my
future studies in this area.
Emulation is another area I'm interested in. I wrote a Game Boy Advance
emulator in C from scratch and a GP32 emulator based on QEMU (also C).
While doing this I gained a basic understanding of the QEMU codebase.
I'm currently involved in a university project which develops a testing
tool for glass box tests for Java and COBOL, which allows gathering
certain coverage metrics, and which will be open-sourced later this year.
I have decent experience with C and Java and I'm familiar with C++. I
also have a deep understanding of the ARM architecture and I'm familiar
with x86.
This project is a big chance for me to give something back to the open source community, especially since both LLVM and QEMU can profit from this project.

Hi Tilmann,

This looks good. I have just a few comments below but they are minor.

On Sun, 2007-03-25 at 15:51 +0200, Tilmann Scheller wrote:
> Thank you very much for the feedback, I tried to address the brought up
> issues in this updated proposal. In case you have any suggestions or
> comments feel free to tell me.
>
> Thanks in Advance
>
> Tilmann
>
>
> * Proposal for Google Summer of Code Project
>
> ** Using LLVM as a backend for QEMU's dynamic binary translation
>
> *** Terms:
> - host architecture: the architecture of the CPU QEMU is running on
> - target architecture: the architecture of the program which is being
> executed within QEMU
>
>
> *** Abstract:
> The goal of this project is to modify the QEMU dynamic binary translator
> to use components of the LLVM compiler infrastructure to turn it into a
> highly optimizing dynamic binary translator in order to increase the
> performance of QEMU even further. Instead of directly emitting code for
> the host architecture QEMU is running on, the target code is first
> translated to LLVM IR, then a selection of LLVM's optimization functions
> is applied to the IR and as a last step the LLVM JIT is used to generate

is -> are
also, I'd drop "as a last step"

> code from the optimized IR for the host architecture. Since the
> translation to LLVM IR, the optimization and the code generation comes
> at a cost of an increased execution time, it's not feasible to apply
> this process to any piece of code, else the execution time would be even
> lower. Especially since on average a program spends 90% of its time
> within 10% of the code it is critical to get these 10% to execute fast,
> for the other 90% of the code parts might only execute once or only a
> few times and the extra time spent to generate the optimized code would
> not pay off. Therefore the idea is to identify the "hotspots" by
> counting how many times a piece of code has been executed, e.g. on basic
> block level, and performing an optimizing translation once a certain

It's probably easier to do at the function level, at least initially.

> threshold is hit or falling back to the current binary translation of
> QEMU if not.

This is better, but you might want to indicate your ideas for
identifying those hotspots if you can do it in a few words. Otherwise,
you can, perhaps, put those details in a web page referenced from this
proposal (which Google encourages).

> Detailed speed measurements will be performed in order to evaluate the
> efficiency of this approach, especially in comparison to the approach
> currently used by QEMU.
>
>
> *** Benefits:
> QEMU will largely benefit from this project through an expected increase
> in speed, while remaining portable.
> Through this project LLVM will effectively get front ends for all target
> architectures supported by QEMU (at the moment these are x86, ARM,
> SPARC, PowerPC and MIPS). This lays the ground for the application of
> LLVM on binary code which could be e.g. the optimization of binaries
> where no source code is available, the instrumentation of binary code
> (e.g. for performance analysis), program analysis of binary code to
> assist in reverse engineering or static recompilation (depending on the
> instruction set this requires additional runtime code).
> This project is a first step to enhance LLVM to be suitable for static
> or dynamic binary translation and thereby attracting new users for LLVM
> which are interested in this subject.
> It will show the applicability of LLVM in an emulation environment,
> especially in regard to dynamic binary translation. It can also be used
> as a basis to try out concepts like profile-guided optimization or
> static optimization in the context of an emulator.

Much better :)

> Also since the LLVM JIT will be used for the final code generation QEMU
> can be hosted on any architecture targeted by the LLVM JIT (at the
> moment this are x86, x86-64, PowerPC and PowerPC 64), at least
> concerning code generation. Further adjustments to QEMU might be
> necessary though to get QEMU to run on a certain architecture which is
> supported by the LLVM JIT but not by QEMU.
>
>
> *** Deliverables:
> - a version of QEMU with an optimizing dynamic binary translator
> utilizing LLVM components
> - a set of test suites which are created during the development (with at
> least 80% statement coverage)
> - all necessary documentation to understand and be able to maintain the
> software
>
>
> *** Plan:
> The development of the software will be done within the three month
> timeframe of GSoC. Weekly status reports will be given.
>
> Week 1:
> - get familiar with LLVM and QEMU
> - write small test programs for certain LLVM components, or even a
> simple prototype
> - get to know LLVM example programs
> Week 2, 3, 4:
> - modify QEMU's dynamic binary translator to emit LLVM IR
> - create tests to verify the translation
> Week 5, 6:
> - integrate LLVM JIT into QEMU's dynamic binary translator
> - perform first speed measurements
> Week 7, 8:
> - integrate LLVM optimizations into QEMU
> - perform more speed measurements, select useful optimizations
> Week 9, 10:
> - test the system extensively
> - write final documentation
> Week 11, 12:
> - time buffer to deal with unexpected events

Looks good.

>
> *** Qualification:
> I'm a graduate student studying Software Engineering at the University
> of Stuttgart in Germany. I have a strong interest in compiler technology
> and see this project as a great opportunity to gain experience in this
> field. I have taken a compiler building class and plan to focus my
> future studies in this area.
> Emulation is another area i'm interested in. I wrote a Game Boy Advance
> emulator in C from scratch and a GP32 emulator based on QEMU (also C).
> While doing this I gained a basic understanding of the QEMU codebase.
> I'm currently involved in a university project which develops a testing
> tool for glass box tests for Java and COBOL, which allows to gather
> certain coverage metrics, and which will be opensourced later this year.
> I have decent experience with C and Java and i'm familiar with C++. Also
> I have a deep understanding of the ARM architecture and I'm familiar
> with x86.
> This project is a big chance for me to give something back to the open
> source community, especially since both LLVM and QEMU can profit from
> this project.

Nice job. A few minor tweaks and "ship" it :)

Reid.

> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

On Sun, 25 Mar 2007, Tilmann Scheller wrote:
> Thank you very much for the feedback, I tried to address the brought up
> issues in this updated proposal. In case you have any suggestions or
> comments feel free to tell me.

FWIW, you might be interested in this thesis:
http://llvm.org/pubs/2004-05-JoshiMSThesis.html

It has a chapter about turning Alpha machine code into LLVM.

-Chris

--
http://nondot.org/sabre/
http://llvm.org/