Aidan Steele
2011-Dec-19 04:29 UTC
[LLVMdev] Disassembly arbitrary machine-code byte arrays
Hi,

My apologies if this appears to be a very trivial question -- I have tried to solve this on my own and I am stuck. Any assistance that could be provided would be immensely appreciated.

What is the absolute bare minimum that I need to do to disassemble an array of, say, ARM machine code bytes? Or an array of Thumb machine code bytes? For example, I might have an array of unsigned chars -- how could I go about decoding these into MCInst objects? Does such a decoding process take place in one fell swoop or do I parse the stream one instruction at a time? Can I ask it to "decode the next 10 bytes"?

What follows is my (feeble) attempt at getting started. It probably doesn't help that I am only familiar with C and Objective-C and find C++ syntax absolutely bewildering.

Kind regards,
Aidan Steele

int main (int argc, const char *argv[])
{
    LLVMInitializeARMTargetInfo();
    LLVMInitializeARMTargetMC();
    LLVMInitializeARMAsmParser();
    LLVMInitializeARMDisassembler();

    const llvm::Target Target;

    llvm::OwningPtr<const llvm::MCSubtargetInfo>
        STI(Target.createMCSubtargetInfo("", "", ""));
    llvm::OwningPtr<const llvm::MCDisassembler>
        disassembler(Target.createMCDisassembler(*STI));

    llvm::OwningPtr<llvm::MemoryBuffer> Buffer;
    llvm::MemoryBuffer::getFile(llvm::StringRef("/path/to/file.bin"), Buffer);
    llvm::MCInst Inst;
    uint64_t Size = 0;

    disassembler->getInstruction(Inst, Size, *Buffer.take(), 0,
                                 llvm::nulls(), llvm::nulls());

    // llvm::StringRef TheArchString("arm-apple-darwin");
    // std::string normalized = llvm::Triple::normalize(TheArchString);
    //
    // llvm::Triple TheTriple;
    // TheTriple.setArch(llvm::Triple::arm);
    // TheTriple.setOS(llvm::Triple::Darwin);
    // TheTriple.setVendor(llvm::Triple::Apple);
    // llvm::Target *TheTarget = NULL;

    return 0;
}
James Molloy
2011-Dec-19 09:23 UTC
Hi Aidan,

The easiest thing I can do is to point you to the source of the "llvm-mc" tool, which does exactly what you ask in its "-disassemble" mode. The code is rather small, so it should be easy to work out.

tools/llvm-mc

Cheers,

James
Kevin Enderby
2011-Dec-19 20:14 UTC
Hi Aidan,

The C-based interface you could use is declared in llvm/include/llvm-c/Disassembler.h. It contains:

/**
 * Disassemble a single instruction using the disassembler context specified in
 * the parameter DC. The bytes of the instruction are specified in the
 * parameter Bytes, and contains at least BytesSize number of bytes. The
 * instruction is at the address specified by the PC parameter. If a valid
 * instruction can be disassembled, its string is returned indirectly in
 * OutString whose size is specified in the parameter OutStringSize. This
 * function returns the number of bytes in the instruction or zero if there was
 * no valid instruction.
 */
size_t LLVMDisasmInstruction(LLVMDisasmContextRef DC, uint8_t *Bytes,
                             uint64_t BytesSize, uint64_t PC,
                             char *OutString, size_t OutStringSize);

This is used in Darwin's otool(1), which is an objdump(1)-like tool. This interface ends up in the libLTO shared library.

Kev
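To make the calling pattern concrete: the header comment quoted above says the function handles one instruction per call and returns that instruction's size in bytes, or zero if nothing valid was found. A whole buffer is therefore walked by calling it in a loop and advancing by the returned size. The sketch below shows only that loop. The names decode_one and disassemble_buffer are hypothetical, and decode_one is a stand-in that does not decode ARM at all: it treats each 4-byte little-endian word as one instruction so the example builds without LLVM. With the real API you would call LLVMDisasmInstruction in its place and pass a disassembler context as the first argument (as far as I know the same header declares LLVMCreateDisasm for creating one, but check this against your LLVM version).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in with the same contract as LLVMDisasmInstruction
 * (minus the context argument): decode one instruction at `bytes`, write
 * its text to `out`, and return the number of bytes consumed, or 0 if no
 * valid instruction could be decoded. It does NOT decode real ARM; each
 * 4-byte little-endian word is simply printed as a .word directive. */
static size_t decode_one(const uint8_t *bytes, uint64_t size, uint64_t pc,
                         char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);
    (void)pc; /* a real disassembler uses pc for branch targets */
    if (size < 4)
        return 0; /* not enough bytes left for one instruction */
    uint32_t word = (uint32_t)bytes[0] | ((uint32_t)bytes[1] << 8) |
                    ((uint32_t)bytes[2] << 16) | ((uint32_t)bytes[3] << 24);
    snprintf(out, out_size, ".word 0x%08x", (unsigned)word);
    return 4;
}

/* Walk a buffer one instruction at a time, printing each line.
 * Returns the number of instructions decoded before the end of the
 * buffer or the first failed decode. */
static size_t disassemble_buffer(const uint8_t *bytes, size_t size,
                                 uint64_t base_pc)
{
    size_t offset = 0;
    size_t count = 0;
    char text[64];

    while (offset < size) {
        size_t used = decode_one(bytes + offset, size - offset,
                                 base_pc + offset, text, sizeof text);
        if (used == 0)
            break; /* zero means "no valid instruction": stop here */
        printf("0x%08llx: %s\n",
               (unsigned long long)(base_pc + offset), text);
        offset += used; /* advance by the decoded size, not a constant */
        count++;
    }
    return count;
}
```

Because the loop is bounded by the remaining size, "decode the next 10 bytes" amounts to passing a size of 10: with the fixed 4-byte stand-in that yields two instructions and stops at the 2-byte tail. Advancing by the returned size rather than by a constant is what lets the same loop handle Thumb, where instructions can be 2 or 4 bytes long.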