First, is there a way to search the archives for this list? I apologize in advance if I have stepped on a FAQ.

My goal is to execute legacy binary machine code from a very old, one-of-a-kind computer on a variety of modern computers. I already wrote an emulator for the legacy machine that executes the old machine code. However, my emulator is just an interpreter and therefore has some limitations:

- The emulator spends a lot of time in an executive loop that fetches legacy instructions, decodes them, and jumps to the appropriate C functions that emulate each legacy instruction. The executive loop also has to handle emulated interrupts, support single-step debugging, etc. (a simplified sketch of this loop appears at the end of this message).

- The emulator is compiled and run on only a few modern hardware/operating system combinations. The emulator is fairly portable, but extensive optimizations on some platforms restrict capabilities on other platforms.

- The emulator executes the legacy machine code unmodified, which is good, but it means opportunities for optimization are lost. The legacy machine code is full of dead code, jumps to jumps, redundant sub-expressions, unnecessary memory accesses, etc. Back in the old days, compilers really didn't optimize at all; they generated horrible code that was sometimes hand-modified.

My idea is to convert my emulator into a translator that emits LLVM IR, either directly or via calls to the LLVM library, and then execute the result via JIT or native code compilation.

Is this a reasonable approach?
Can this approach be used even when the legacy code is self-modifying? After a code modification, a re-translation and re-JIT would be needed.

Are there any general suggestions?
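For reference, the heart of the current executive loop looks roughly like the C sketch below. The opcodes, register layout, and 16-bit little-endian operand encoding here are invented purely for illustration; the real legacy instruction set is different.

    #include <stdint.h>

    /* Simplified sketch of the executive loop: fetch, decode, dispatch.
       Opcodes, registers, and operand encoding are invented for
       illustration only. */
    enum { OP_HALT = 0x00, OP_LOAD = 0x01, OP_ADD = 0x02, OP_JMP = 0x03 };

    typedef struct {
        uint16_t pc;          /* legacy program counter */
        uint16_t acc;         /* legacy accumulator     */
        uint8_t  mem[65536];  /* emulated legacy memory */
        int      halted;
    } LegacyCpu;

    static uint16_t fetch16(LegacyCpu *c)        /* read a 16-bit operand */
    {
        uint16_t v = (uint16_t)(c->mem[c->pc] | (c->mem[(uint16_t)(c->pc + 1)] << 8));
        c->pc += 2;
        return v;
    }

    void run(LegacyCpu *c)
    {
        while (!c->halted) {
            /* the real loop also polls emulated interrupts and the
               single-step debugger hook at this point */
            uint8_t op = c->mem[c->pc++];               /* fetch              */
            switch (op) {                               /* decode + dispatch  */
            case OP_LOAD: c->acc  = c->mem[fetch16(c)]; break;
            case OP_ADD:  c->acc += c->mem[fetch16(c)]; break;
            case OP_JMP:  c->pc   = fetch16(c);         break;
            case OP_HALT:
            default:      c->halted = 1;                break;
            }
        }
    }

In the real emulator each case calls out to a separate C function, which is where most of the dispatch overhead comes from.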
Harry Metcalfe
2008-Jun-24 08:28 UTC
[LLVMdev] Advice - llvm as binary to binary translator ?
Hi Erik,

I'm currently writing an IA-32 to LLVM IR translator. I'm only midway through, but I can certainly say that there have been more difficulties than I anticipated when I began!

I think that it is a reasonable approach, perhaps especially in your case, since you have an emulator already. Automatic static translation is equivalent to the halting problem for IA-32 code, though perhaps it wouldn't be for yours (what architecture are you using?). A dynamic phase is therefore necessary for me -- if it is for you too, you'll have a leg up.

Self-modifying code is both hideous and unusual, and very difficult to deal with. I'm leaving it to one side.

General thoughts: are you sure that LLVM IR is suitable? You may be better off with a lower-level representation. At least in my case, LLVM enforces a level of structure that doesn't exist in machine code. That's something you'll also probably have to deal with.

Its type system also hampers the modification of translated code, so it's advantageous to ensure that you won't need to change any code once it is translated. This is of particular importance when you're trying to figure out the bounds of an array, and things like that: a change to the size of an array is a change of its type, which means it's much easier just to get the size of the array right in the first place. I'm currently in the process of altering my code so that a lot more analysis takes place before translation even begins!

Finally, how will you deal with memory accesses and aliasing? This is certainly the thorniest problem, and it's the one my dynamic phase exists to solve.

Do email me off-list if you like -- it sounds like we're pursuing similar lines of inquiry!

Harry
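P.S. To make the direct-emission route concrete: one way to sidestep some of the structure and aliasing questions is to keep the entire legacy machine state in globals (one array for memory, one global per register) and emit explicit loads and stores against them. Below is a minimal C sketch of a text emitter for an invented "add memory word to accumulator" instruction; the globals, opcode, and exact IR syntax (which varies between LLVM releases) are all assumptions, not anything from your machine.

    #include <stdio.h>
    #include <stdint.h>

    /* Emit textual LLVM IR for one invented legacy instruction:
       "ADD addr"  ->  acc += 16-bit word at legacy address addr.
       Assumes the module prologue already declared
           @acc        = global i16 0
           @legacy_mem = global [65536 x i8] zeroinitializer
       so all legacy state is reached through explicit loads and stores,
       and no pointer structure has to be recovered from the old code.
       (Endianness and alignment of the 16-bit load are glossed over.) */
    static int tmp;   /* running counter for SSA value names %t0, %t1, ... */

    static void emit_add_mem(FILE *out, uint16_t addr)
    {
        fprintf(out, "  %%t%d = getelementptr i8, ptr @legacy_mem, i32 %u\n",
                tmp, (unsigned)addr);
        fprintf(out, "  %%t%d = load i16, ptr %%t%d\n", tmp + 1, tmp);
        fprintf(out, "  %%t%d = load i16, ptr @acc\n", tmp + 2);
        fprintf(out, "  %%t%d = add i16 %%t%d, %%t%d\n",
                tmp + 3, tmp + 1, tmp + 2);
        fprintf(out, "  store i16 %%t%d, ptr @acc\n", tmp + 3);
        tmp += 4;
    }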
You may also want to take a look at Valgrind. It is capable of translating IA-32/64, PPC-32/64, etc. to its own SSA-style IR, and it has backends from that IR to the very same architectures. You can also build a backend to LLVM and let it further optimize the generated code (although Valgrind already has its own optimizers).

Building such translators is not easy business; I can tell you that from experience. Depending on what you want to achieve, I would reuse something that already exists.

Nuno
Jonathan S. Shapiro
2008-Jun-24 18:33 UTC
[LLVMdev] Advice - llvm as binary to binary translator ?
This is off-topic. I would reply directly to Erik, but I seem to have missed the note that had his reply address.

Erik, Harry:

The evidence in the literature supporting non-trivial optimization in a dynamic binary-to-binary translator should be viewed with caution. Our suspicion when we started the HDTrans work was that all the fancy optimizations accomplish, in most cases, is to (almost) make up for the computational cost of generating the IR in the first place. The numbers we achieved on IA-32 to IA-32 translation seem to support that view. There are special cases where this might not be true, and of course there are specific counter-examples, but the HDTrans numbers apparently came as a shock to some people in the dynamic translator world.

Erik mentioned in particular a number of control flow optimizations that are very easily handled in a direct emitter (i.e. with no IR). Dead code is also naturally eliminated by any dynamic translator. Redundant expressions are not, but the performance advantage of modern processors over legacy machines is so large that eliminating those may not be worthwhile. Newer machines typically have more registers and/or fast L1 caches, both of which are your friend in a dynamic translator.

Given this, if performance actually matters, your time and effort might be better spent crafting a really tight binary decoder that pays careful attention to cache utilization, and then implementing a relatively modest number of branch-to-branch and call/return optimizations that are fairly simple to do with a low-level direct translator.

Regardless of whether you accept my argument about the value of optimization, it's clear from the HDTrans numbers that you can lose a lot more performance in decoding and IR generation than the published research work has generally acknowledged.

Please don't hesitate to contact me directly if I can be helpful.

shap
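P.S. For the branch-to-branch case, one simple approach in a direct translator is to follow chains of unconditional jumps in the legacy image before emitting the host branch. A small C sketch, with an invented jump opcode and 16-bit little-endian target encoding:

    #include <stdint.h>

    #define OP_JMP 0x03   /* invented opcode for an unconditional jump */

    /* Collapse a chain of unconditional jumps to its final target so the
       emitted host branch skips the intermediate hops.  The hop limit
       guards against a chain that loops back on itself. */
    static uint16_t resolve_jump_chain(const uint8_t *mem, uint16_t target)
    {
        for (int hops = 0; hops < 16 && mem[target] == OP_JMP; hops++)
            target = (uint16_t)(mem[target + 1] | (mem[target + 2] << 8));
        return target;
    }

The translator then emits a single host branch to resolve_jump_chain(mem, original_target) instead of materializing the intermediate jumps.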