Chris Lattner wrote:
>
> On Oct 26, 2009, at 1:00 AM, Howard Chu wrote:
>
>> Hi, just read the LLVM 2.6 release announcement, the bit about llvm-mc
>> caught my attention. I've been looking for a tool to disassemble x86 object
>> files into an IR and then reassemble them into x86_64 object code. The
>> immediate use for them would be to convert driver blobs that some vendors
>> provide for their hardware (e.g. the Lucent modem driver) so they can be
>> used in a 64 bit kernel. From the release announcement it looks like
>> llvm-mc isn't ready for this purpose yet, was just curious if this kind of
>> task was anywhere on its roadmap. Thanks...
>
> We don't have anything like that planned, but do plan to do an
> assembler and disassembler. The disassembler (for x86-16/32/64) is
> iterating on review comments before it goes in. The assembler is
> currently being built out and will initially support macho.
> Translating X86-32 to X86-64 sounds tricky but it could probably be
> built on some of this infrastructure.

Thanks for the response. I guess the real question is how much functionality
the disassembler will have. If it only disassembles to assembly source files
that's one thing. If it can go all the way to the LLVM IR that should make
going to anything else pretty trivial.

--
Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Howard Chu wrote:
> Thanks for the response. I guess the real question is how much functionality
> the disassembler will have. If it only disassembles to assembly source files
> that's one thing. If it can go all the way to the LLVM IR that should make
> going to anything else pretty trivial.

By the way, another obvious use for this feature would be to re-optimize
packaged binaries. E.g. in the Microsoft world there are a lot of apps out
there that are optimized specifically for Intel CPUs and don't run as well as
they ought to on AMD CPUs. Disassembling the binary into LLVM IR and then
re-optimizing it would allow consumers to extract the maximum performance out
of software they've purchased, even if the vendor has no interest in
supporting them in this way.

Another obvious next step would be to allow re-compiling one platform's
binaries to run on another CPU architecture. For most POSIX-based OSes it
would only require a thin wrapper library to map the runtime environment from
one platform to the other. Going down this direction would kind of reduce the
need for the Fat-ELF project underway; if you can easily retarget a binary
from one platform to another there's no reason to ship multiple binary formats
in a single executable/object file.

--
Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
> Another obvious next step would be to allow re-compiling one platform's
> binaries to run on another CPU architecture. For most POSIX-based OSes it
> would only require a thin wrapper library to map the runtime environment from
> one platform to the other. Going down this direction would kind of reduce the
> need for the Fat-ELF project underway; if you can easily retarget a binary
> from one platform to another there's no reason to ship multiple binary formats
> in a single executable/object file.

How would one deal with things like sizeof(void *) being different on
different platforms, or platform-specific hacks in preprocessor conditionals?

I suppose you could use that thin POSIX wrapper to pass the MAP_32BIT flag (at
least on Linux) to mmap, and then what you're really getting is not the 64-bit
address space, but the extra registers and target-specific optimizations.

I'm pretty sure there aren't plans to use llvm-mc to disassemble binaries to
LLVM IR. Recently some folks were discussing other ways of doing a similar
x86 -> LLVM translation on this list, though, so you might check the archives.

Reid
On Oct 26, 2009, at 7:39 PM, Howard Chu wrote:
>> We don't have anything like that planned, but do plan to do an
>> assembler and disassembler. The disassembler (for x86-16/32/64) is
>> iterating on review comments before it goes in. The assembler is
>> currently being built out and will initially support macho.
>> Translating X86-32 to X86-64 sounds tricky but it could probably be
>> built on some of this infrastructure.
>
> Thanks for the response. I guess the real question is how much
> functionality the disassembler will have. If it only disassembles to
> assembly source files that's one thing. If it can go all the way to
> the LLVM IR that should make going to anything else pretty trivial.

It will disassemble encoded instruction bytes (CD 21) into an MCInst, which
can then be printed to a string such as "int $0x21".

-Chris
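To make the bytes -> instruction object -> string pipeline described above concrete, here is a minimal sketch in Python. It is not LLVM's API: the `Inst` class, `decode`, `to_att` and `disassemble` are hypothetical names made up for illustration, and only three one-byte x86 opcodes are handled. The point is only that the decoder yields a structured object first, and printing to assembly text is a separate step. The hex byte 21 is printed as `$0x21` in AT&T syntax.

```python
from dataclasses import dataclass


@dataclass
class Inst:
    """Hypothetical stand-in for a decoded instruction (not LLVM's MCInst)."""
    mnemonic: str
    operands: tuple  # immediate operands, as integers
    size: int        # number of bytes the encoding occupies


def decode(code: bytes, offset: int = 0) -> Inst:
    """Decode one instruction at `offset`; only a toy subset is supported."""
    opcode = code[offset]
    if opcode == 0xCD:                      # INT imm8
        return Inst("int", (code[offset + 1],), 2)
    if opcode == 0x90:                      # NOP
        return Inst("nop", (), 1)
    if opcode == 0xC3:                      # near RET
        return Inst("ret", (), 1)
    raise ValueError(f"unhandled opcode 0x{opcode:02x}")


def to_att(inst: Inst) -> str:
    """Print a decoded instruction as AT&T-style text."""
    if not inst.operands:
        return inst.mnemonic
    ops = ", ".join(f"${op:#x}" for op in inst.operands)
    return f"{inst.mnemonic} {ops}"


def disassemble(code: bytes) -> list:
    """Decode every instruction in `code` and return the printed lines."""
    lines = []
    offset = 0
    while offset < len(code):
        inst = decode(code, offset)
        lines.append(to_att(inst))
        offset += inst.size
    return lines


print(disassemble(bytes([0xCD, 0x21])))
```

Nothing in this sketch recovers types, control flow or calling conventions, which is roughly the gap between "disassemble to text" and "lift to LLVM IR" that the thread is discussing.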
Even if you could disassemble to IR, you could not recompile it for another
platform, for the same reason that LLVM IR does not give you
platform-independent C and C++: code that is excluded by preprocessor
conditionals never gets into the generated code, and sizes, which can be
hardcoded by optimisations and the like, may be platform dependent.

Also, transforming machine code or even assembler to LLVM IR would be
non-trivial...
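As a rough illustration of the "hardcoded sizes" point, the sketch below models a record like `struct { void *data; uint32_t length; }` with Python's `struct` module. The record, the helper names and the packed little-endian layout without padding are all assumptions chosen for the example, not anything taken from LLVM. Code compiled for a 32-bit target has the offset of `length` (4) baked in as a constant; applying that constant to a record laid out with 64-bit pointers reads part of the pointer instead.

```python
import struct

# Assumed layouts: little-endian, no padding, pointer followed by a uint32.
POINTER_FORMATS = {32: "<I", 64: "<Q"}


def length_offset(pointer_bits: int) -> int:
    """Offset of the `length` field, i.e. the size of the pointer before it."""
    return struct.calcsize(POINTER_FORMATS[pointer_bits])


def pack_record(pointer_bits: int, data_ptr: int, length: int) -> bytes:
    """Lay out the record as a compiler for that pointer width would."""
    return struct.pack(POINTER_FORMATS[pointer_bits] + "I", data_ptr, length)


def read_length(raw: bytes, offset: int) -> int:
    """Read a uint32 at a fixed offset, as already-compiled code would."""
    return struct.unpack_from("<I", raw, offset)[0]


raw64 = pack_record(64, 0x1122334455667788, 7)

# Using the offset that matches the layout gives the real length.
print(read_length(raw64, length_offset(64)))

# Using the offset baked into 32-bit code reads the upper half of the pointer.
print(hex(read_length(raw64, length_offset(32))))
```

Once the constant 4 is in the machine code, nothing in the binary says it came from `sizeof(void *)`, so a translator cannot tell which constants need adjusting for the new target.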