Hi,
I am new to LLVM and not familiar with C++. After using llvm-objdump for
a while and finding its output broken, I tried to debug and fix the code
so that it becomes usable. Please help review the patches so that they
can be merged.
There are still two major problems I have found with the ARM
disassembler:
1. The ARM instruction decoder cannot recognise the bx series of instructions.
2. Since gcc generates binaries that mix Thumb and ARM instructions, we have
to switch between the two modes.
We can tell whether a function is Thumb or ARM code by looking at its symbol
table entry: for Thumb code, the lowest bit of the symbol value is
set to 1.
So how can this logic be implemented while still fitting the structure
of the code?
Songmao
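
A minimal sketch of the symbol-value check described above, assuming 32-bit ARM ELF function symbols. The helper names (isThumbSymbol, codeAddress, instructionSetFor) are hypothetical and are not taken from llvm-objdump or from the attached patches:

```cpp
#include <cstdint>

// Hypothetical helpers, not part of llvm-objdump: they only illustrate the
// low-bit convention for ARM ELF function symbols described in the message.

// True when bit 0 of the symbol value is set, i.e. the function is Thumb code.
constexpr bool isThumbSymbol(std::uint32_t symbolValue) {
  return (symbolValue & 1u) != 0;
}

// Address of the first instruction: the symbol value with bit 0 cleared.
constexpr std::uint32_t codeAddress(std::uint32_t symbolValue) {
  return symbolValue & ~std::uint32_t{1};
}

// Instruction set a disassembler would select for the symbol.
enum class InstructionSet { ARM, Thumb };

constexpr InstructionSet instructionSetFor(std::uint32_t symbolValue) {
  return isThumbSymbol(symbolValue) ? InstructionSet::Thumb
                                    : InstructionSet::ARM;
}
```

With these helpers a disassembly loop would call instructionSetFor once per function symbol and start decoding at codeAddress, so the mode bit never leaks into the printed address.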
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Implement-sectionContainsSymbol-preparing-for-the-ob.patch
Type: text/x-patch
Size: 1177 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20111011/527db4f3/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Fix-objdump-various-problem.patch
Type: text/x-patch
Size: 6143 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20111011/527db4f3/attachment-0001.bin>
On Tue, Oct 11, 2011 at 12:15 AM, Neo <smtian at ingenic.cn> wrote:
> Hi,
> I am new to llvm, not familiar with c++, after some use with
> llvm-objdump, and finding the broken output, I try to debug and fix the code
> so it can become usable. Please help review the patch, so that they can be
> merged.
> And there's still two major problem I have found about arm disassembler:
> 1. arm instruction decoder cannot recognise bx series instructions.
> 2. As gcc will generate thumb and arm instruction mixed binary, we have to
> switch from each other.
> we can tell if a function is thumb or arm code by looking at the symbol
> table entry, when in thumb code, the lowest bit of the symbol value will be
> set to '1'.
> so how these logic can be implemented while still adapt to the structure of
> the code?
>
> Songmao

For the first patch. The code is only valid for executable files, not
relocatable files. st_shndx should be used to determine if the symbol
is in the given section. Also st_value can hold the offset into
st_shndx, not the actual address. Also it doesn't handle non-function
symbols.

For the second patch. Could you explain what exactly you are trying to
fix? I see some stuff that I know is wrong, but it would help if I
knew the intent. As for what I do know.
* The error function already prints out the error. If you want to
  print additional info, add an overload of error that allows that.
* Please use spaces instead of tabs. Lots of the code doesn't line up
  properly for me.
* Setting the size to 4 to skip bytes is arbitrary, and won't always
  give decent results on different platforms.

Thank you for working on this.

As for your comments on the arm disassembler.

1) I am not familiar with ARM, but I do know the decoder is currently
being worked on.

2) We were just discussing this in IRC. The idea is to simply handle
ARM disassembly as a special case and inspect the bit to decide how to
disassemble the symbol.

- Michael Spencer
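
The point about st_shndx and st_value in the review of the first patch can be sketched as follows. This is only an illustration under stated assumptions: Elf32Sym is a stand-in that mirrors the ELF32 symbol layout, the constants carry an _IDX suffix to mark them as local definitions, and sectionContainsSymbol here is not the function from the patch but a hypothetical reimplementation of the idea:

```cpp
#include <cstdint>

// Stand-in for the ELF32 symbol table entry. Field names follow the ELF
// specification; the struct itself is illustrative, not LLVM's definition.
struct Elf32Sym {
  std::uint32_t st_name;
  std::uint32_t st_value;
  std::uint32_t st_size;
  unsigned char st_info;
  unsigned char st_other;
  std::uint16_t st_shndx;
};

// Reserved section index values from the ELF specification.
constexpr std::uint16_t SHN_UNDEF_IDX = 0;
constexpr std::uint16_t SHN_LORESERVE_IDX = 0xff00;

// A symbol belongs to a section when st_shndx names that section. Reserved
// indices (undefined, absolute, common, ...) never match a real section.
constexpr bool sectionContainsSymbol(const Elf32Sym &sym,
                                     std::uint16_t sectionIndex) {
  return sym.st_shndx != SHN_UNDEF_IDX &&
         sym.st_shndx < SHN_LORESERVE_IDX &&
         sym.st_shndx == sectionIndex;
}

// Offset of the symbol inside its section. In a relocatable file st_value is
// already that offset; in an executable it is a virtual address, so the
// section's load address (sh_addr) is subtracted.
constexpr std::uint32_t symbolOffsetInSection(const Elf32Sym &sym,
                                              std::uint32_t sectionAddr,
                                              bool isRelocatable) {
  return isRelocatable ? sym.st_value : sym.st_value - sectionAddr;
}
```

Because the membership test never looks at st_value or at the symbol type, it gives the same answer for executables and relocatable files, and for function and non-function symbols alike.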
Neo,

On Oct 11, 2011, at 12:15 AM, Neo wrote:
> 1. arm instruction decoder cannot recognise bx series instructions.

Can you provide a testcase for an instruction it fails to disassemble?

--Owen
On 2011-10-12 03:40, Michael Spencer wrote:
> As for your comments on the arm disassembler.
>
> 1) I am not familiar with ARM, but I do know the decoder is currently
> being worked on.
>
> 2) We were just discussing this in IRC. The idea is to simply handle
> ARM disassembly as a special case and inspect the bit to decide how to
> disassemble the symbol.
>
> - Michael Spencer

I have just found out that the ARM disassembler is generated with
tblgen -gen-disassembler, not -gen-arm-decoder, so I had been looking at
the wrong code. Can anyone explain what the arm-decoder is for?

llvm-objdump fails on the bx lr instruction (0xe12fff1e) because the
check (Bits & ARM::HasV4TOps) fails: Bits is 0. I have not yet found out
why Bits is 0.

Songmao
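
For reference, the word 0xe12fff1e quoted above can be taken apart by hand. The sketch below assumes the standard ARM-mode encoding of BX (condition field, the fixed pattern 0x12fff1, then the register number); the names BX_MASK, BX_PATTERN, isBX, conditionField and bxTargetRegister are hypothetical and do not come from the LLVM decoder tables:

```cpp
#include <cstdint>

// ARM-mode BX <Rm> encoding: cond 0001 0010 1111 1111 1111 0001 Rm.
// The mask keeps bits 27..4, which are fixed for BX.
constexpr std::uint32_t BX_MASK = 0x0ffffff0u;
constexpr std::uint32_t BX_PATTERN = 0x012fff10u;

// True when bits 27..4 match the BX pattern. The condition field is not
// validated here (0b1111 is a separate, unconditional encoding space).
constexpr bool isBX(std::uint32_t insn) {
  return (insn & BX_MASK) == BX_PATTERN;
}

// Condition code in bits 31..28 (0xe means "always").
constexpr unsigned conditionField(std::uint32_t insn) {
  return insn >> 28;
}

// Register number in bits 3..0 (14 is lr).
constexpr unsigned bxTargetRegister(std::uint32_t insn) {
  return insn & 0xfu;
}
```

For 0xe12fff1e this yields condition 0xe and register 14, i.e. an unconditional bx lr, so the bit pattern itself is a well-formed BX.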
Michael,
I have reworked the patch according to your suggestions. I have also
read the binutils objdump source code and found that it falls back to
dynsym when there is no symtab; this logic is missing in llvm-objdump.
Songmao
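
The symtab-then-dynsym fallback mentioned above could look roughly like this. It is a sketch, not the binutils code and not the attached patch: the section table is reduced to an array of sh_type values, the constants carry a _TYPE suffix to mark them as local definitions, and findSectionOfType and pickSymbolTable are hypothetical names. It needs C++14 for the loop inside a constexpr function:

```cpp
#include <cstddef>
#include <cstdint>

// Section type values from the ELF specification.
constexpr std::uint32_t SHT_SYMTAB_TYPE = 2;
constexpr std::uint32_t SHT_DYNSYM_TYPE = 11;

// Returned when no section of the requested type exists.
constexpr std::ptrdiff_t NO_SECTION = -1;

// Index of the first section whose type equals `wanted`, or NO_SECTION.
constexpr std::ptrdiff_t findSectionOfType(const std::uint32_t *types,
                                           std::size_t count,
                                           std::uint32_t wanted) {
  for (std::size_t i = 0; i < count; ++i)
    if (types[i] == wanted)
      return static_cast<std::ptrdiff_t>(i);
  return NO_SECTION;
}

// Prefer the full symbol table; fall back to the dynamic symbol table when
// the file has been stripped of its symtab.
constexpr std::ptrdiff_t pickSymbolTable(const std::uint32_t *types,
                                         std::size_t count) {
  return findSectionOfType(types, count, SHT_SYMTAB_TYPE) != NO_SECTION
             ? findSectionOfType(types, count, SHT_SYMTAB_TYPE)
             : findSectionOfType(types, count, SHT_DYNSYM_TYPE);
}
```

A caller would still have to handle NO_SECTION, which covers files that carry neither table.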
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Fix-the-address-calculation-for-llvm-objdump.patch
Type: text/x-patch
Size: 3036 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20111012/8d8263d1/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Implement-sectionContainsSymbol-preparing-for-the-ob.patch
Type: text/x-patch
Size: 1321 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20111012/8d8263d1/attachment-0001.bin>
On 2011-10-12 04:12, Owen Anderson wrote:
> Neo,
>
> On Oct 11, 2011, at 12:15 AM, Neo wrote:
>
>> 1. arm instruction decoder cannot recognise bx series instructions.
> Can you provide a testcase for an instruction it fails to disassemble?
>
> --Owen

Owen,
Adding -triple="armv7-unknown-unknown" fixes the problem.
Songmao