On Thu, Jan 10, 2013 at 1:55 PM, Nick Kledzik <kledzik at apple.com> wrote:
> The reality is that some content is tied to the format. That is, Writers do not
> just lay down the content of the atoms supplied and then put some file format
> wrapper around that content. Some of the content exists because of
> the file format, for instance GOT and PLT entries.
>
> Now in the case of PLT entries the overall algorithm is similar to what
> darwin needs for "stubs". So the abstract algorithm is in a Pass and
> each Writer supplies additions to that Pass which generate platform/Writer
> specific atoms as needed.
>
> Perhaps the name "Writer" is causing confusion. In a previous incarnation
> we called it "Platform". But that caused confusion because does platform
> mean the OS or the file format? And LLVM uses "target" to mean many
> things too.
>
> My pragmatic approach is that any (non-driver) code that only mach-o/darwin
> will need should go in lib/ReaderWriter/MachO. Similarly, any code that
> is only needed by some platform that uses ELF should go in
> lib/ReaderWriter/ELF. Any common processing of atoms should be done
> in a Pass which has hooks allowing specialization by Writers (aka platforms).
>
> Another way to look at the current design is that WriterELF needs to support
> every processor/platform ELF output. Exactly what it does is controlled
> (configured) by WriterOptionsELF. The driver's job is to produce the right
> WriterOptionsELF settings. For library based linking (no command line)
> you just write code to instantiate an appropriate WriterOptionsELF.

Interesting. As you have pointed out, currently the format and the OS
are conflated. Would it make sense to have Reader/Writer literally be
concerned *only* with turning an atom graph into an output object file
in a specific format? (Each format may have format-specific atoms.)
Then the OS layer (+ architecture?) is implemented in a separate
library that is a client of Reader/Writer.

How do you expect WriterOptionsELF to scale with respect to the number
of OSes and architectures supported? E.g. for each architecture, will
we need to add to WriterOptionsELF? For each OS, will we need to add
to it?

-- Sean Silva
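
[Illustration] The "Pass with hooks" arrangement described in the quoted text can be sketched roughly as follows. This is only a minimal sketch and not lld's actual API: the names Atom, StubsHooks, ELFStubsHooks, MachOStubsHooks and runStubsPass are all hypothetical, and a real pass would also rewrite references to point at the new atoms.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an atom; the real lld atom classes are richer.
struct Atom {
  std::string name;
  bool needsStub; // true if calls to this atom must go through a stub/PLT entry
};

// Hooks that a Writer (aka platform) supplies to the shared pass.
class StubsHooks {
public:
  virtual ~StubsHooks() = default;
  // Create the format-specific atom that stands in for `target`.
  virtual Atom makeStub(const Atom &target) const = 0;
};

// ELF flavor (hypothetical): the extra atom plays the role of a PLT entry.
class ELFStubsHooks : public StubsHooks {
public:
  Atom makeStub(const Atom &target) const override {
    return Atom{"plt:" + target.name, false};
  }
};

// Mach-O flavor (hypothetical): the extra atom plays the role of a stub.
class MachOStubsHooks : public StubsHooks {
public:
  Atom makeStub(const Atom &target) const override {
    return Atom{"stub:" + target.name, false};
  }
};

// The format-independent algorithm: walk the atoms and ask the hooks for one
// extra atom per atom that needs indirection.
std::vector<Atom> runStubsPass(const std::vector<Atom> &atoms,
                               const StubsHooks &hooks) {
  std::vector<Atom> result = atoms;
  for (const Atom &atom : atoms) {
    assert(!atom.name.empty() && "atoms in this sketch are always named");
    if (atom.needsStub)
      result.push_back(hooks.makeStub(atom));
  }
  return result;
}
```

The loop in runStubsPass is the shared part; only makeStub differs per Writer, which is the split between common algorithm and format-specific atoms that the quoted text argues for.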
On Jan 10, 2013, at 4:09 PM, Sean Silva wrote:
> On Thu, Jan 10, 2013 at 1:55 PM, Nick Kledzik <kledzik at apple.com> wrote:
>> The reality is that some content is tied to the format. That is, Writers do not
>> just lay down the content of the atoms supplied and then put some file format
>> wrapper around that content. Some of the content exists because of
>> the file format, for instance GOT and PLT entries.
>>
>> Now in the case of PLT entries the overall algorithm is similar to what
>> darwin needs for "stubs". So the abstract algorithm is in a Pass and
>> each Writer supplies additions to that Pass which generate platform/Writer
>> specific atoms as needed.
>>
>> Perhaps the name "Writer" is causing confusion. In a previous incarnation
>> we called it "Platform". But that caused confusion because does platform
>> mean the OS or the file format? And LLVM uses "target" to mean many
>> things too.
>>
>> My pragmatic approach is that any (non-driver) code that only mach-o/darwin
>> will need should go in lib/ReaderWriter/MachO. Similarly, any code that
>> is only needed by some platform that uses ELF should go in
>> lib/ReaderWriter/ELF. Any common processing of atoms should be done
>> in a Pass which has hooks allowing specialization by Writers (aka platforms).
>>
>> Another way to look at the current design is that WriterELF needs to support
>> every processor/platform ELF output. Exactly what it does is controlled
>> (configured) by WriterOptionsELF. The driver's job is to produce the right
>> WriterOptionsELF settings. For library based linking (no command line)
>> you just write code to instantiate an appropriate WriterOptionsELF.
>
> Interesting. As you have pointed out, currently the format and the os
> are conflated. Would it make sense to have Reader/Writer literally be
> concerned *only* with turning an atom graph into an output object file
> in a specific format? (each format may have format-specific atoms).
> Then the OS layer (+ architecture?) is implemented in a separate
> library that is a client of Reader/Writer.

This conflation problem is really only an issue with ELF (because it is
so widely used and because ELF was designed as a container format). In
fact, the "conflation" actually makes the MachO and PE/COFF code structure
in lld easier because everything is in one place. In other words, for
mach-o I would not know what to put in the "OS" layer vs the "format" layer.

> How do you expect WriterOptionsELF to scale with respect to the number
> of OS's and architectures supported? E.g. for each architecture, will
> we need to add to WriterOptionsELF? For each OS, will we need to add
> to it?

I've imagined that WriterELF would need (at least internally) a big set
of fine-grain options (e.g. which symbol table hash function, add special
section foo, etc). Then the question is whether to express all the
fine-grain options in WriterOptionsELF, or instead have high-level
settings in WriterOptionsELF and then have WriterELF translate the
high-level settings into the fine-grain settings.

I'd like to hear some examples of OS vs architecture differences needed in ELF.
I (naïvely) think of architectures as just being a "machine" field in the
WriterOptionsELF. And OS options as just being "add this magic section", or
"put this section here".

-Nick
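
[Illustration] The second alternative above (high-level settings that the writer translates into fine-grain ones) might look something like the sketch below. Everything here is an assumption made for illustration: HighLevelELFOptions, FineGrainELFSettings, translate, the ExampleOS value and the ".note.exampleos" section are invented, and the rule that ExampleOS selects the GNU hash style is arbitrary. Only the e_machine numbers (62 for EM_X86_64, 40 for EM_ARM) come from the ELF specification.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// High-level settings, as a driver or library client might fill them in.
enum class Arch { X86_64, ARM };
enum class OS { Generic, ExampleOS }; // ExampleOS is made up for this sketch

struct HighLevelELFOptions {
  Arch arch;
  OS os;
};

// Fine-grain settings the writer would consult while emitting the file.
enum class HashStyle { SysV, GNU };

struct FineGrainELFSettings {
  std::uint16_t machine;                  // value for the ELF header e_machine field
  HashStyle hashStyle;                    // which symbol table hash function
  std::vector<std::string> extraSections; // "add special section foo"
};

// Translate the high-level request into fine-grain settings.
FineGrainELFSettings translate(const HighLevelELFOptions &opts) {
  FineGrainELFSettings settings{};
  switch (opts.arch) {
  case Arch::X86_64:
    settings.machine = 62; // EM_X86_64
    break;
  case Arch::ARM:
    settings.machine = 40; // EM_ARM
    break;
  }
  assert(settings.machine != 0 && "unhandled architecture");

  settings.hashStyle = HashStyle::SysV;
  if (opts.os == OS::ExampleOS) {
    // Arbitrary illustrative policy for the made-up OS.
    settings.hashStyle = HashStyle::GNU;
    settings.extraSections.push_back(".note.exampleos");
  }
  return settings;
}
```

With this shape, supporting a new OS or architecture means adding an enumerator and a case in translate, while the fine-grain struct only grows when the writer gains a genuinely new capability.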
On Thu, Jan 10, 2013 at 9:59 PM, Nick Kledzik <kledzik at apple.com> wrote:
> I'd like to hear some examples of OS vs architecture differences needed in ELF.
> I (naïvely) think of architectures as just being a "machine" field in the
> WriterOptionsELF. And OS options as just being "add this magic section", or
> "put this section here".

I did a little investigation, and it looks like a nontrivial amount of
code is needed for platform support:

- From perusing the gold source code, there doesn't appear to be very
  much OS-wise (although I think it piggybacks on GNU ld's gigantic
  OS/format-specific configuration; see below).

- Gold appears to have all the architecture-specific stuff isolated
  into separate files. Each one is multi-thousand lines. Just arm.cc is
  >12000 lines of code (by way of comparison, currently all the source
  code of lib/ReaderWriter/ is a combined total of <10000 lines).

- GNU ld has >20000 lines of templates of C source code which it
  instantiates in order to support different OSes. It has >14000 lines
  of linker script templates. It has >5000 lines of parameter-setting
  files which are used to instantiate those templates.

-- Sean Silva