Will Dietz
2012-Jan-23 05:32 UTC
[LLVMdev] ELFObjectFile changes, llvm-objdump showing 'wrong' values?
Hi all,

I'm using the MC framework for a project, and while updating to latest trunk (r148672) encountered the following issue:

It seems that SymbolRef::getAddress and SymbolRef::getFileOffset have been changed to add the symbol's offset to the offset of the containing section?

This has the following implications:

To get the /actual/ fileoffset, I now need to do:
Symbol.getFileOffset() - ContainingSection.getFileOffset()

And to get the address relative to the section, I do:
Symbol.getFileOffset() - 2*ContainingSection.getFileOffset()

I suspect this isn't the desired functionality (what use is the original value?)?

You can also see the impact of this on the tool llvm-objdump (as well as llvm-nm), as shown below:

Normal objdump: http://pastebin.com/Fsv3Vvye
vs
llvm-objdump: http://pastebin.com/MRryQe4D

I believe r148653 caused this, but haven't verified directly. This didn't happen as of r148100.

Am I missing something (my code borrows a good deal from llvm-objdump and llvm-nm, so if they are doing something wrong with respect to these new changes, so am I), or is this something that should be fixed?

Thanks for your time!

~Will
Bendersky, Eli
2012-Jan-23 05:42 UTC
[LLVMdev] ELFObjectFile changes, llvm-objdump showing 'wrong' values?
Hi Will,

I've committed the recent change to ELFObjectFile (r148653). It was supposed to add new functionality, not break existing one. I'll take a look at this and will keep you updated.

Eli
Bendersky, Eli
2012-Jan-23 06:26 UTC
[LLVMdev] ELFObjectFile changes, llvm-objdump showing 'wrong' values?
Hi,

I would like to examine the implications you mention in more detail.

(1) Symbol address

According to the ELF standard, in a symbol table entry st_value means: "In relocatable files, st_value holds a section offset for a defined symbol. That is, st_value is an offset from the beginning of the section that st_shndx identifies." (*)

Therefore, when queried about a symbol's address, what would the right answer be? In ELFObjectFile::getSymbolAddress, previously, it was simply symb->st_value (which is the offset relative to the section). Now, Section->sh_addr is added to reflect the actual address of the symbol.

Ignoring for the moment the change this imposes on objdump & nm (which can be amended), what would the expected address be for clients of getSymbolAddress?

(2) Symbol offset

Again, referring to the definition of the "st_value" field above, the file offset of the symbol is the section offset plus the symbol's offset in the section, which is reflected in the new code:

    Result = symb->st_value + (Section ? Section->sh_offset : 0);

The old code subtracted Section->sh_addr from that for reasons that are not entirely clear to me.

I'm not sure where this creates a problem for you? AFAICS, neither llvm-objdump nor llvm-nm uses the symbol's file offset. It's also not clear from your pastes of llvm-objdump and objdump what the significant differences are.

Eli

(*) ELFObjectFile represents a relocatable file
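The two computations described in this message can be sketched in isolation. This is only an illustration for the relocatable-file case: SectionHeader and SymbolEntry are hypothetical, simplified stand-ins rather than LLVM's actual types, and the null check in symbolAddress is an assumption that mirrors the quoted file-offset expression.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the ELF fields named in the message above.
struct SectionHeader {
  std::uint64_t sh_addr;   // address of the section in memory
  std::uint64_t sh_offset; // offset of the section's data within the file
};

struct SymbolEntry {
  std::uint64_t st_value; // relocatable file: offset from the section start
};

// Address as described for the changed getSymbolAddress:
// the section's address plus the symbol's offset within the section.
constexpr std::uint64_t symbolAddress(const SymbolEntry &Sym,
                                      const SectionHeader *Sec) {
  return Sym.st_value + (Sec ? Sec->sh_addr : 0);
}

// File offset as in the quoted expression:
// the section's file offset plus the symbol's offset within the section.
constexpr std::uint64_t symbolFileOffset(const SymbolEntry &Sym,
                                         const SectionHeader *Sec) {
  return Sym.st_value + (Sec ? Sec->sh_offset : 0);
}
```

Both helpers treat st_value as a section-relative offset, which is what the quoted passage of the ELF standard says for relocatable files only.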
Will Dietz
2012-Jan-23 07:17 UTC
[LLVMdev] ELFObjectFile changes, llvm-objdump showing 'wrong' values?
2012/1/23 Bendersky, Eli <eli.bendersky at intel.com>:
> Hi,
>
> I would like to examine the implications you mention in more detail.
>

Thank you!

> (1) Symbol address
> According to the ELF standard, in a symbol table entry st_value means: "In relocatable files, st_value holds a section offset for a defined symbol. That is,
> st_value is an offset from the beginning of the section that st_shndx identifies." (*)
>
> Therefore, when queried about a symbol's address what would the right answer be? In ELFObjectFile::getSymbolAddress, previously, it was simply symb->st_value (which is the relative offset to the section). Now, Section->sh_addr is added to reflect the actual address of the symbol.
>
> Ignoring for the moment the change this imposes on objdump & nm (which can be amended), what would the expected address be for clients of getSymbolAddress?

I trust your interpretation and implementation of the relevant specs, and don't mean to suggest a mistake there. I apologize if I did so previously.

What I do know is that ELFObjectFile no longer seems to work on executables, as it did before. Accordingly, the tools that use ELFObjectFile (llvm-objdump, llvm-nm) no longer accurately display symbol information for such files (and my project, using code from these tools, doesn't either).

Since these tools used to do this "correctly", as do their non-llvm counterparts, and because they made use of ELFObjectFile for this purpose, I assumed that was a supported use case. It appears that's incorrect, and the output working for executables was always a coincidence. I wish this weren't the case, but I understand things change and will update my project accordingly (or move away from MC if that's not possible, I suppose).

I assume there's no somewhat-equivalent class/etc. that will enable a client to reason about non-relocatable ELF files now that ELFObjectFile doesn't support them?

>
> (2) Symbol offset
> Again, referring to the definition of the "st_value" field above, the file offset of the symbol is the section offset plus the symbol's offset in the section, which is reflected in the new code:
>
> Result = symb->st_value +
> (Section ? Section->sh_offset : 0);
>
> The old code subtracted Section->sh_addr from that for reasons that are not entirely clear to me.
>
> I'm not sure where this creates a problem for you? AFAICS, neither llvm-objdump nor llvm-nm use the symbol's file offset. It's also not clear from your pastes of llvm-objdump and objdump what the significant difference are.
>

The difference in the pastes, and my apologies for not explicitly pointing this out originally, is that the symbol addresses (see 'main') now seem to double-include the section address in their value. Notice how llvm-objdump gives an address of 00800850 for main while objdump shows 004004a0. Note that before your changes llvm-objdump's output was aligned with that of normal objdump in this regard.

> Eli
>
> (*) ELFObjectFile represents a relocatable file
>

It appears 100% of the/my problem is thinking ELFObjectFile was suitable for use on non-relocatable files such as executables.

Since this appears to be wrong (it gives the wrong results for such files as detailed above, and probably others), and because this is by design, not by mistake, might I suggest something like updating Binary::createBinary (in lib/Object/Binary.cpp) to reflect this and avoid future confusion (as it presently uses ELFObjectFile for all ELF file types, not just relocatables).

I don't know who the correct person to bug about this is; hopefully addressing llvmdev@ is sufficient here.

Thank you for your time, Eli, your detailed explanation, and your continued work.

Have a good one :)

~Will
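The double-counting of the section address can be reproduced with plain arithmetic. The sketch below is an illustration, not LLVM code: FileKind and both function names are hypothetical, and the section address 0x4003b0 is not stated anywhere in the thread; it is simply the difference between the two reported values for main (0x800850 and 0x4004a0). The sketch relies on the ELF rule that st_value is a section offset in relocatable files but already a virtual address in executables and shared objects.

```cpp
#include <cstdint>

// Hypothetical classification of the ELF file being read.
enum class FileKind { Relocatable, Executable };

// Adds the section address regardless of file type. For an executable,
// st_value is already a virtual address, so the section address is
// counted twice.
constexpr std::uint64_t addSectionAddressUnconditionally(
    std::uint64_t StValue, std::uint64_t SectionAddr) {
  return SectionAddr + StValue;
}

// File-type-aware variant: add the section address only when st_value
// is a section-relative offset, i.e. in a relocatable file.
constexpr std::uint64_t resolveSymbolAddress(FileKind Kind,
                                             std::uint64_t StValue,
                                             std::uint64_t SectionAddr) {
  return Kind == FileKind::Relocatable ? SectionAddr + StValue : StValue;
}
```

With the assumed section address, addSectionAddressUnconditionally(0x4004a0, 0x4003b0) yields 0x800850, the value llvm-objdump reported, while resolveSymbolAddress(FileKind::Executable, 0x4004a0, 0x4003b0) yields 0x4004a0, matching normal objdump.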