Hi,

I am new to LLVM and not very familiar with C++. After using llvm-objdump for a while and running into broken output, I tried to debug and fix the code so that it becomes usable. Please help review the patches so that they can be merged.

There are still two major problems I have found with the ARM disassembler:

1. The ARM instruction decoder cannot recognise the bx family of instructions.
2. GCC generates binaries that mix Thumb and ARM instructions, so the disassembler has to switch between the two modes. We can tell whether a function is Thumb or ARM code by looking at its symbol table entry: for Thumb code, the lowest bit of the symbol value is set to '1'.

How can this logic be implemented while still fitting the structure of the existing code?

Songmao

Attachments:
- 0001-Implement-sectionContainsSymbol-preparing-for-the-ob.patch (text/x-patch, 1177 bytes) <http://lists.llvm.org/pipermail/llvm-dev/attachments/20111011/527db4f3/attachment.bin>
- 0002-Fix-objdump-various-problem.patch (text/x-patch, 6143 bytes) <http://lists.llvm.org/pipermail/llvm-dev/attachments/20111011/527db4f3/attachment-0001.bin>
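The Thumb-bit rule described in point 2 can be sketched in a few lines. This is only an illustration of the rule for function symbols, not code from either patch; `InstructionSet`, `CodeSymbol`, and `classifySymbol` are hypothetical names and not LLVM APIs.

```cpp
#include <cstdint>

// Hypothetical names for illustration; none of these are LLVM APIs.
enum class InstructionSet { ARM, Thumb };

struct CodeSymbol {
  InstructionSet Mode;   // which decoder to use for this function
  std::uint32_t Address; // symbol value with the Thumb bit cleared
};

// Bit 0 of the symbol value marks Thumb code; the remaining bits hold
// the address at which the function's instructions actually start.
constexpr CodeSymbol classifySymbol(std::uint32_t StValue) {
  return CodeSymbol{(StValue & 1u) != 0 ? InstructionSet::Thumb
                                        : InstructionSet::ARM,
                    StValue & ~std::uint32_t{1}};
}
```

A disassembler loop could call such a helper once per function symbol and pick the ARM or Thumb decoder for the bytes up to the next symbol.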
On Tue, Oct 11, 2011 at 12:15 AM, Neo <smtian at ingenic.cn> wrote:
> And there's still two major problem I have found about arm disassembler:
> 1. arm instruction decoder cannot recognise bx series instructions.
> 2. As gcc will generate thumb and arm instruction mixed binary, we have to
> switch from each other.

For the first patch: the code is only valid for executable files, not relocatable files. st_shndx should be used to determine whether the symbol is in the given section. Also, st_value can hold an offset into the section identified by st_shndx rather than the actual address. Finally, the code does not handle non-function symbols.

For the second patch: could you explain what exactly you are trying to fix? I see some things that I know are wrong, but it would help if I knew the intent. Here is what I do know:

* The error function already prints out the error. If you want to print additional info, add an overload of error that allows that.
* Please use spaces instead of tabs. Lots of the code doesn't line up properly for me.
* Setting the size to 4 to skip bytes is arbitrary, and won't always give decent results on different platforms.

Thank you for working on this.

As for your comments on the ARM disassembler:

1) I am not familiar with ARM, but I do know the decoder is currently being worked on.

2) We were just discussing this on IRC. The idea is to simply handle ARM disassembly as a special case and inspect the bit to decide how to disassemble the symbol.

- Michael Spencer
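The review points on the first patch can be sketched as follows. This is a minimal illustration under stated assumptions, not the patch itself: `SymbolEntry` and `SectionEntry` are hypothetical stand-ins that borrow ELF field names, the function bodies are not LLVM's implementation, and extended section indices (SHN_XINDEX) are deliberately ignored.

```cpp
#include <cstdint>

// Simplified stand-ins for ELF structures. Field names follow the ELF
// specification, but these structs are hypothetical, not LLVM types.
struct SymbolEntry {
  std::uint64_t st_value;
  std::uint16_t st_shndx;
};

struct SectionEntry {
  std::uint16_t Index;   // position in the section header table
  std::uint64_t sh_addr; // address in memory (typically 0 in relocatable files)
};

constexpr std::uint16_t ShnUndef = 0;          // SHN_UNDEF
constexpr std::uint16_t ShnLoReserve = 0xff00; // SHN_LORESERVE

// Membership is decided by section index alone, so the check works for
// relocatable and executable files and for any symbol type.
constexpr bool sectionContainsSymbol(const SectionEntry &Sec,
                                     const SymbolEntry &Sym) {
  return Sym.st_shndx != ShnUndef && Sym.st_shndx < ShnLoReserve &&
         Sym.st_shndx == Sec.Index;
}

// In a relocatable file st_value is already a section offset; in an
// executable it is a virtual address, so subtract the section's address.
constexpr std::uint64_t symbolOffsetInSection(const SectionEntry &Sec,
                                              const SymbolEntry &Sym,
                                              bool IsRelocatable) {
  return IsRelocatable ? Sym.st_value : Sym.st_value - Sec.sh_addr;
}
```

Keeping the membership test separate from the offset calculation means only the second function needs to know which kind of file is being read.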
Neo,

On Oct 11, 2011, at 12:15 AM, Neo wrote:
> 1. arm instruction decoder cannot recognise bx series instructions.

Can you provide a testcase for an instruction it fails to disassemble?

--Owen
On 2011-10-12 03:40, Michael Spencer wrote:
> 1) I am not familiar with ARM, but I do know the decoder is currently
> being worked on.

I have just found out that the ARM disassembler is generated with tblgen -gen-disassembler, not -gen-arm-decoder, so I had been looking at the wrong code. Can anyone explain what the arm-decoder is for?

llvm-objdump fails on the bx lr instruction (0xe12fff1e) because the condition (Bits & ARM::HasV4TOps) evaluates to false: Bits is 0. I have not yet found out why Bits is 0.

Songmao
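To make the failing case concrete, here is a small sketch that pulls the fields out of 0xe12fff1e and shows why a feature check of the form (Bits & HasV4TOps) rejects the instruction when Bits is 0. The BX bit pattern is the standard ARM-mode encoding; `HypotheticalHasV4TOps` is a made-up value, since the real ARM::HasV4TOps constant comes from TableGen output and is not reproduced here.

```cpp
#include <cstdint>

// Fields of an ARM-mode BX instruction.
struct BxInstruction {
  bool IsBx;
  unsigned Cond; // bits 31:28, 0xE means "always"
  unsigned Rm;   // bits 3:0, register holding the branch target
};

// ARM-mode BX is encoded as cond : 0001 0010 1111 1111 1111 0001 : Rm.
constexpr BxInstruction decodeBx(std::uint32_t Insn) {
  return BxInstruction{(Insn & 0x0ffffff0u) == 0x012fff10u,
                       (Insn >> 28) & 0xfu, Insn & 0xfu};
}

// Hypothetical feature bit, for illustration only; this is NOT the real
// value of ARM::HasV4TOps.
constexpr std::uint64_t HypotheticalHasV4TOps = 1u << 0;

// A decoder gated this way rejects BX whenever the feature mask is empty.
constexpr bool bxAllowed(std::uint64_t Bits) {
  return (Bits & HypotheticalHasV4TOps) != 0;
}
```

For 0xe12fff1e the condition field is 0xE and Rm is 14 (lr), so the encoding itself is a valid bx lr; in this sketch the rejection comes entirely from the empty feature mask.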
Michael,

I have reworked the patch according to your suggestions. I have also read the binutils objdump source code and found that it falls back to dynsym when there is no symtab. This logic is missing in llvm-objdump.

Songmao

Attachments:
- 0002-Fix-the-address-calculation-for-llvm-objdump.patch (text/x-patch, 3036 bytes) <http://lists.llvm.org/pipermail/llvm-dev/attachments/20111012/8d8263d1/attachment.bin>
- 0001-Implement-sectionContainsSymbol-preparing-for-the-ob.patch (text/x-patch, 1321 bytes) <http://lists.llvm.org/pipermail/llvm-dev/attachments/20111012/8d8263d1/attachment-0001.bin>
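The symtab-to-dynsym fallback mentioned above could look roughly like this. It is a sketch of the selection rule only, assuming the section headers have already been read; `SectionHeader` and `pickSymbolTable` are hypothetical names, and neither binutils nor LLVM code is reproduced.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, minimal view of a section header; only sh_type is needed.
struct SectionHeader {
  std::uint32_t sh_type;
};

constexpr std::uint32_t ShtSymtab = 2;  // SHT_SYMTAB
constexpr std::uint32_t ShtDynsym = 11; // SHT_DYNSYM

// Returns the index of the section to read symbols from: the static
// symbol table if present, otherwise the dynamic one, otherwise -1.
int pickSymbolTable(const std::vector<SectionHeader> &Sections) {
  int Dynsym = -1;
  for (std::size_t I = 0; I < Sections.size(); ++I) {
    if (Sections[I].sh_type == ShtSymtab)
      return static_cast<int>(I);
    if (Sections[I].sh_type == ShtDynsym && Dynsym < 0)
      Dynsym = static_cast<int>(I);
  }
  return Dynsym;
}
```

With this rule a stripped shared library, which usually keeps only its dynamic symbols, would still yield some symbols for labelling the disassembly.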
On 2011-10-12 04:12, Owen Anderson wrote:
>> 1. arm instruction decoder cannot recognise bx series instructions.
> Can you provide a testcase for an instruction it fails to disassemble?

Owen,

Adding -triple="armv7-unknown-unknown" fixes the problem.

Songmao