Muhui Jiang via llvm-dev
2018-Jun-28 12:32 UTC
[llvm-dev] Distinguish between ARM and Thumb
Hi,

I am currently using LLVM to do ARM binary analysis. I was wondering whether LLVM can provide debugging information about the instruction mode (ARM or Thumb).

For example, llvm-dwarfdump can dump some instruction information for debugging. Is it able to tell the mode of each instruction? Or could we write an LLVM pass to find out the instruction mode? Any suggestions are welcome.

Many thanks,

Regards
Muhui
Peter Smith via llvm-dev
2018-Jun-28 13:07 UTC
[llvm-dev] Distinguish between ARM and Thumb
Hello Muhui,

If you are disassembling a non-stripped ELF binary you can find out the Arm/Thumb state by looking at the mapping symbols $t and $a. Alternatively, each ELF symbol of type STT_FUNC has bit 0 of its value set to 0 for Arm state and to 1 for Thumb state. Hence with the symbol table you can reconstruct the state at each address by finding the nearest symbol. More information is available in ELF for the Arm Architecture [1].

If you have a stripped binary without any symbolic information then life gets a lot more difficult. There are some encoding rules [2] that can help you find out whether a Thumb instruction is 2 or 4 bytes long, but in general you'll at least need to know whether you are starting on an Arm or a Thumb instruction, and you will need to trace control flow instructions to track state changes and to avoid interpreting literal data as instructions.

For the non-stripped case I don't think you need to do much beyond reading the symbol table. I don't think LLVM has passes to reconstruct binaries; that logic would usually live in a tool like objdump.

Hope this helps

Peter

[1] http://infocenter.arm.com/help/topic/com.arm.doc.ihi0044f/IHI0044F_aaelf.pdf (search for mapping symbols)
[2] https://developer.arm.com/products/architecture/a-profile/docs/ddi0406/latest/arm-architecture-reference-manual-armv7-a-and-armv7-r-edition (search for Thumb instruction encoding)
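The points in the reply above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the symbol table is assumed to have already been read into (address, name) pairs by some ELF reader, all addresses are assumed to share one address space (mapping symbols are really scoped per section), and every function name here is hypothetical rather than an LLVM API. The length check assumes the Thumb-2 rule from the ARMv7-A/R manual [2], where a first halfword whose bits [15:11] are 0b11101, 0b11110 or 0b11111 starts a 4-byte instruction.

```python
import bisect

def func_symbol_state(st_value):
    """Decode an STT_FUNC symbol value: bit 0 is 1 for Thumb, 0 for Arm."""
    state = "thumb" if st_value & 1 else "arm"
    return state, st_value & ~1  # clear bit 0 to get the code address

def build_state_map(symbols):
    """Keep only mapping symbols ($a, $t, $d, or $a.<any> etc.), sorted by address."""
    kinds = {"$a": "arm", "$t": "thumb", "$d": "data"}
    entries = []
    for addr, name in symbols:
        kind = kinds.get(name[:2])
        if kind is not None and (len(name) == 2 or name[2] == "."):
            entries.append((addr, kind))
    entries.sort()
    return entries

def state_at(state_map, addr):
    """Return the kind of the closest mapping symbol at or below addr, or None."""
    addrs = [a for a, _ in state_map]
    i = bisect.bisect_right(addrs, addr) - 1
    return state_map[i][1] if i >= 0 else None

def thumb_insn_length(first_halfword):
    """Return 4 if the first halfword opens a 32-bit Thumb encoding, else 2."""
    top5 = (first_halfword >> 11) & 0x1F
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2
```

For instance, with mapping symbols $a at 0x8000 and $t at 0x8010, state_at reports "arm" for 0x8004 and "thumb" for 0x8012. For a stripped binary only thumb_insn_length applies, and it still depends on knowing that decoding starts on a Thumb instruction boundary.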
Muhui Jiang via llvm-dev
2018-Jun-28 15:06 UTC
[llvm-dev] Distinguish between ARM and Thumb
Hi Peter,

Thank you so much for your detailed and quick reply. I think I now know how to do it for a non-stripped binary.

Regards
Muhui