Greg Fitzgerald
2012-Oct-16 22:07 UTC
[LLVMdev] R_ARM_ABS32 disassembly with integrated-as
Attached is an example of how to reproduce the issue. It uses a C file that happens to have a bunch of switch statements, which are encoded as jump tables, giving us data-in-code.

Usage:

To build object files with clang via -integrated-as versus via GCC:

  $ export NDK_DIR=<my_ndk_dir>
  $ export LLVM_DIR=<my_llvm_bin_dir>
  $ make

To test that the generated objects contain the same mapping symbols:

  $ make test

If "make test" fails, a diff is printed showing what GCC generates versus what LLVM generates.

To bypass clang and gcc (say you don't want to install the NDK), you can build the same LLVM object file with just:

  $ make ll

To bypass llc, you can try "make asm" to first generate a .s and then compile that. But if you do this, you run into two more bugs. First, the MC layer fails to parse ARM ELF; it only handles MachO. Second, clang fails to care, bypassing the integrated-as and instead generating the .o via GCC. If you happen to have -ccc-gcc-name set, you will think your test passes when what actually happened is that both objects were compiled with GCC!

Thanks,
Greg

On Tue, Oct 16, 2012 at 1:03 PM, Renato Golin <rengolin at systemcall.org> wrote:
> On 16 October 2012 03:16, Greg Fitzgerald <garious at gmail.com> wrote:
>> Lastly, from MCELFStreamer, how do I determine if we are generating an ARM or
>> Thumb ELF?
>
> That was the only part I didn't know how to get. Jim should know.
>
>
>> I can catch Thumb from the EmitThumbFunc, but that seems a
>> little odd.
>
> Ignore EmitThumbFunc, it has nothing to do with your change.
>
>
>> $ readelf -s via-llvm-as.o | grep "\$."
>>      2: 00000000     0 NOTYPE  LOCAL  DEFAULT    4 $d
>>      3: 00000000     0 NOTYPE  LOCAL  DEFAULT    4 $t
>
> Clearly, you're not detecting all code/data changes, or the direct ELF
> emission is not creating too many constant pools.
>
> Can you attach the assembly generated and the ELF object created from
> both the inline asm and the gcc asm?
>
>
> --
> cheers,
> --renato
>
> http://systemcall.org/

-------------- next part --------------
A non-text attachment was scrubbed...
Name: scaffold.C
Type: text/x-csrc
Size: 20949 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20121016/b4d2ef22/attachment.c>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tsthd.h
Type: text/x-chdr
Size: 3291 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20121016/b4d2ef22/attachment.h>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: scaffold-arm.ll
Type: application/octet-stream
Size: 46328 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20121016/b4d2ef22/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Makefile
Type: application/octet-stream
Size: 1802 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20121016/b4d2ef22/attachment-0001.obj>
Greg Fitzgerald
2012-Oct-17 14:05 UTC
[LLVMdev] R_ARM_ABS32 disassembly with integrated-as
Hi Jim,

The diff below is not intended to be a patch, but a starting point. It is the shortest path (I hope) to getting LLVM to emit ARM mapping symbols to the ELF without changing any shared interfaces. Could you have a look at the FIXME comments and offer some pointers on how to get this code out of MCELFStreamer?

Thanks,
Greg

diff --git a/lib/MC/MCELFStreamer.cpp b/lib/MC/MCELFStreamer.cpp
index 8107005..153ca78 100644
--- a/lib/MC/MCELFStreamer.cpp
+++ b/lib/MC/MCELFStreamer.cpp
@@ -40,12 +40,14 @@ class MCELFStreamer : public MCObjectStreamer {
 public:
   MCELFStreamer(MCContext &Context, MCAsmBackend &TAB,
                   raw_ostream &OS, MCCodeEmitter *Emitter)
-    : MCObjectStreamer(Context, TAB, OS, Emitter) {}
+    : MCObjectStreamer(Context, TAB, OS, Emitter),
+      IsThumb(false), MappingSymbolCounter(0) {}
 
   MCELFStreamer(MCContext &Context, MCAsmBackend &TAB,
                 raw_ostream &OS, MCCodeEmitter *Emitter,
                 MCAssembler *Assembler)
-    : MCObjectStreamer(Context, TAB, OS, Emitter, Assembler) {}
+    : MCObjectStreamer(Context, TAB, OS, Emitter, Assembler),
+      IsThumb(false), MappingSymbolCounter(0) {}
 
 
   ~MCELFStreamer() {}
@@ -58,6 +60,7 @@ public:
   virtual void EmitLabel(MCSymbol *Symbol);
   virtual void EmitAssemblerFlag(MCAssemblerFlag Flag);
   virtual void EmitThumbFunc(MCSymbol *Func);
+  virtual void EmitDataRegion(MCDataRegionType Kind);
   virtual void EmitAssignment(MCSymbol *Symbol, const MCExpr *Value);
   virtual void EmitWeakReference(MCSymbol *Alias, const MCSymbol *Symbol);
   virtual void EmitSymbolAttribute(MCSymbol *Symbol, MCSymbolAttr Attribute);
@@ -108,6 +111,7 @@ public:
 private:
   virtual void EmitInstToFragment(const MCInst &Inst);
   virtual void EmitInstToData(const MCInst &Inst);
+  virtual void EmitMappingSymbol(bool IsData);
 
   void fixSymbolsInTLSFixups(const MCExpr *expr);
 
@@ -119,6 +123,11 @@ private:
   std::vector<LocalCommon> LocalCommons;
 
   SmallPtrSet<MCSymbol *, 16> BindingExplicitlySet;
+
+  // FIXME: This information is in ARMAsmBackend, but we currently
+  //        have no way to reach it.
+  bool IsThumb;
+  int64_t MappingSymbolCounter;
   /// @}
   void SetSection(StringRef Section, unsigned Type, unsigned Flags,
                   SectionKind Kind) {
@@ -130,18 +139,21 @@ private:
                ELF::SHF_WRITE |ELF::SHF_ALLOC,
                SectionKind::getDataRel());
     EmitCodeAlignment(4, 0);
+    EmitMappingSymbol(/*IsData*/true);
   }
   void SetSectionText() {
     SetSection(".text", ELF::SHT_PROGBITS,
                ELF::SHF_EXECINSTR |
                ELF::SHF_ALLOC, SectionKind::getText());
     EmitCodeAlignment(4, 0);
+    EmitMappingSymbol(/*IsData*/false);
   }
   void SetSectionBss() {
     SetSection(".bss", ELF::SHT_NOBITS,
                ELF::SHF_WRITE |
                ELF::SHF_ALLOC, SectionKind::getBSS());
     EmitCodeAlignment(4, 0);
+    EmitMappingSymbol(/*IsData*/true);
   }
 };
 }
@@ -188,6 +200,55 @@ void MCELFStreamer::EmitThumbFunc(MCSymbol *Func) {
 
   MCSymbolData &SD = getAssembler().getOrCreateSymbolData(*Func);
   SD.setFlags(SD.getFlags() | ELF_Other_ThumbFunc);
+
+  // Use this flag to output Thumb symbols after data sections.
+  IsThumb = true;
+
+  // FIXME: Instead, emit the correct mapping symbol at .text
+  EmitMappingSymbol(/*IsData*/false);
+}
+
+void MCELFStreamer::EmitMappingSymbol(bool IsData) {
+  // FIXME: The following is specific to the ARM.  This should be moved
+  //        to ARMAsmBackend.
+
+  if (!getAssembler().getBackend().hasDataInCodeSupport())
+    return;
+
+  // Create a temporary label to mark the start of the data region.
+  MCSymbol *Start = getContext().CreateTempSymbol();
+  EmitLabel(Start);
+
+  // FIXME: We want to generate symbols with the same name, but
+  //        CreateSymbol() is not a public function.  So instead
+  //        we generate a unique name as we go.  Luckily, the ARM
+  //        ELF spec says that everything after the '.' is ignored.
+  StringRef Name = IsData ? "$d" : IsThumb ? "$t" : "$a";
+  StringRef UniqueName = Name.str() + "." + itostr(MappingSymbolCounter++);
+  MCSymbol *Symbol = getContext().GetOrCreateSymbol(UniqueName);
+
+  MCSymbolData &SD = getAssembler().getOrCreateSymbolData(*Symbol);
+  MCELF::SetType(SD, ELF::STT_NOTYPE);
+  MCELF::SetBinding(SD, ELF::STB_LOCAL);
+  SD.setExternal(false);
+  Symbol->setSection(*getCurrentSection());
+
+  const MCExpr *Value = MCSymbolRefExpr::Create(Start, getContext());
+  Symbol->setVariableValue(Value);
+}
+
+void MCELFStreamer::EmitDataRegion(MCDataRegionType Kind) {
+  switch (Kind) {
+  case MCDR_DataRegion:
+  case MCDR_DataRegionJT8:
+  case MCDR_DataRegionJT16:
+  case MCDR_DataRegionJT32:
+    EmitMappingSymbol(/*IsData*/true);
+    break;
+  case MCDR_DataRegionEnd:
+    EmitMappingSymbol(/*IsData*/false);
+    break;
+  }
 }
 
 void MCELFStreamer::EmitAssignment(MCSymbol *Symbol, const MCExpr *Value) {
diff --git a/lib/Target/ARM/MCTargetDesc/ARMAsmBackend.cpp b/lib/Target/ARM/MCTargetDesc/ARMAsmBackend.cpp
index 1ba6ab0..a2c6f7d 100644
--- a/lib/Target/ARM/MCTargetDesc/ARMAsmBackend.cpp
+++ b/lib/Target/ARM/MCTargetDesc/ARMAsmBackend.cpp
@@ -561,7 +561,10 @@ public:
   uint8_t OSABI;
   ELFARMAsmBackend(const Target &T, const StringRef TT,
                    uint8_t _OSABI)
-    : ARMAsmBackend(T, TT), OSABI(_OSABI) { }
+    : ARMAsmBackend(T, TT), OSABI(_OSABI) {
+    // FIXME: Are there versions of ELF-ARM that do not support data-in-code?
+    HasDataInCodeSupport = true;
+  }
 
   void applyFixup(const MCFixup &Fixup, char *Data, unsigned DataSize,
                   uint64_t Value) const;
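
A note on the naming step in EmitMappingSymbol: the expression Name.str() + "." + itostr(...) produces a temporary std::string, and StringRef does not own its characters, so holding the result in a StringRef looks like it would leave UniqueName dangling by the time GetOrCreateSymbol reads it. Below is a minimal standalone sketch of the same naming scheme that keeps the name in an owning std::string. It does not use LLVM at all: MappingKind and mappingSymbolName are hypothetical names introduced only for illustration, and std::to_string stands in for itostr.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for the three ARM ELF mapping symbol classes
// used in the diff: "$d" (data), "$a" (ARM code), "$t" (Thumb code).
enum class MappingKind { Data, ARM, Thumb };

// Builds a unique mapping symbol name such as "$d.0" or "$t.7".
// The result is an owning std::string, so it stays valid after the
// concatenation temporaries are destroyed.
std::string mappingSymbolName(MappingKind Kind, std::int64_t Counter) {
  const char *Prefix = "$d";
  switch (Kind) {
  case MappingKind::Data:
    Prefix = "$d";
    break;
  case MappingKind::ARM:
    Prefix = "$a";
    break;
  case MappingKind::Thumb:
    Prefix = "$t";
    break;
  }
  // Per the FIXME in the diff, everything after the '.' is ignored,
  // so the counter only serves to make each name unique.
  return std::string(Prefix) + "." + std::to_string(Counter);
}
```

In the streamer itself, the equivalent change would presumably be to declare UniqueName as std::string and pass it to GetOrCreateSymbol, which accepts a StringRef view of it for the duration of the call.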
On 17 October 2012 15:05, Greg Fitzgerald <garious at gmail.com> wrote:
> +  virtual void EmitMappingSymbol(bool IsData);

I'd use an enum, or have multiple internal implementations...

EmitDataMappingSymbol -> { nop on base class, on ARM, prints "$d" }
EmitCodeMappingSymbol -> { nop on base class, calling either EmitThumbMappingSymbol or EmitARMMappingSymbol (private) on ARM }

> +void MCELFStreamer::EmitMappingSymbol(bool IsData) {
> +  // FIXME: The following is specific to the ARM.  This should be moved
> +  //        to ARMAsmBackend.

Maybe MCARMELFStreamer (or whatever sounds nicer than that). ARMAsm is a big bag of code and nowadays, most of it is format agnostic, I think (asm, elf).

--
cheers,
--renato

http://systemcall.org/
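
The split suggested in this reply can be sketched without LLVM as a pair of virtual hooks that do nothing in the base class and are overridden for ARM. This is only an illustration under stated assumptions: ELFStreamerSketch, ARMELFStreamerSketch, setThumb, record and symbols are hypothetical names, and recording names in a vector stands in for actually emitting symbols into an object file.

```cpp
#include <string>
#include <vector>

// Hypothetical target-independent streamer: both mapping-symbol hooks
// are no-ops, so targets without mapping symbols pay nothing.
class ELFStreamerSketch {
public:
  virtual ~ELFStreamerSketch() = default;
  virtual void emitDataMappingSymbol() {}
  virtual void emitCodeMappingSymbol() {}

  // Names recorded so far, in emission order.
  const std::vector<std::string> &symbols() const { return Symbols; }

protected:
  // Stand-in for creating a local symbol at the current position.
  void record(const std::string &Name) { Symbols.push_back(Name); }

private:
  std::vector<std::string> Symbols;
};

// Hypothetical ARM streamer: the ARM/Thumb state lives here instead of
// in the shared ELF streamer.
class ARMELFStreamerSketch : public ELFStreamerSketch {
public:
  void setThumb(bool Thumb) { IsThumb = Thumb; }

  void emitDataMappingSymbol() override { record("$d"); }

  void emitCodeMappingSymbol() override {
    if (IsThumb)
      emitThumbMappingSymbol();
    else
      emitARMMappingSymbol();
  }

private:
  void emitThumbMappingSymbol() { record("$t"); }
  void emitARMMappingSymbol() { record("$a"); }

  bool IsThumb = false;
};
```

With this shape, the shared code only ever calls the two public hooks, and the decision between "$a" and "$t" stays private to the ARM class, which is one way to get the ARM-specific logic out of MCELFStreamer.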