Shankar Easwaram
2015-Feb-07 15:52 UTC
[LLVMdev] [lld] Representation of lld::Reference with a fake target
We are modeling target-specific functionality using references. Doesn't your idea defeat the purpose of the atom model? Atoms are mostly target neutral, and the YAML/Native format represents just an atom. Having a derived class for atoms will have an impact on how we test lld, IMO.

We could continue to model using references, in my opinion, and add some metadata to the atom where references are not able to model something.

> On Feb 7, 2015, at 02:36, Simon Atanasyan <simon at atanasyan.com> wrote:
>
> My 2c: maybe we should not try to put all target-specific object file
> formats into a single YAML/Native representation. Let's define a
> universal format for the file "header" of the YAML/Native representation,
> and probably some top-level structures common to all targets, and allow
> target-specific code to arbitrarily extend these formats. For example,
> code in ReaderWriter/ELF will know how to convert ELF object files
> into the YAML/Native form. In that case we in fact get some
> incompatible YAML/Native formats for ELF, PECOFF, MachO etc. But I
> think that is not a problem.
>
>> On Sat, Feb 7, 2015 at 6:28 AM, Rui Ueyama <ruiu at google.com> wrote:
>> Not all input files have to be representable in YAML/Native format.
>> There are many unrealistic use cases there. No one wants to write an
>> executable file in Native because there's no operating system that can run
>> that file. The same goes for YAML. The same goes for the combination of a
>> .so file and Native/YAML, unless we have an operating system whose loader
>> is able to load a YAML .so file.
>>
>> We might want to write a Native/YAML file as a re-linkable object file (in
>> GNU it's the -r option), but that's an object file.
>>
>> So it's totally okay if some input file type is not representable in
>> YAML/Native. Some use cases are not real. We can't force all developers to
>> spend their time supporting unrealistic use cases.
>>
>> On Fri, Feb 6, 2015 at 7:04 PM, Shankar Easwaran <shankarke at gmail.com>
>> wrote:
>>>
>>> The intermediate result is what is really written to disk when
>>> --output-filetype=yaml or native is chosen too.
>>>
>>> Writing to YAML and reading the YAML back is not doable when you convert
>>> input files to atoms, because some of the input files are not
>>> representable in YAML format.
>>>
>>>> On Fri, Feb 6, 2015 at 8:48 PM, Rui Ueyama <ruiu at google.com> wrote:
>>>>
>>>> I think no one is opposing the idea of reading and writing YAML.
>>>>
>>>> The problem here is why we need to force all developers to write
>>>> code to serialize intermediate data in the middle of the link, which no
>>>> one except the round-trip passes needs.
>>>>
>>>> On Fri, Feb 6, 2015 at 6:41 PM, Shankar Easwaram <shankarke at gmail.com>
>>>> wrote:
>>>>>
>>>>> Doing it for every input file is not useful, as some of the input files
>>>>> are not representable in YAML form. Shared libraries are one example.
>>>>>
>>>>> The reason I made the YAML pass run before the writer was that the
>>>>> intermediate result is more complete, since all atoms have been
>>>>> resolved at that point and the state of all atoms is much saner.
>>>>>
>>>>> It was also easy to use the pass manager. The code needed was very
>>>>> small to achieve what we are trying to do, which is that all the
>>>>> information is passed to the writer through references or atom
>>>>> properties.
>>>>>
>>>>> Shankar Easwaran
>>>>>
>>>>> On Feb 6, 2015, at 19:54, Rui Ueyama <ruiu at google.com> wrote:
>>>>>
>>>>> On Fri, Feb 6, 2015 at 5:42 PM, Michael Spencer <bigcheesegs at gmail.com>
>>>>> wrote:
>>>>>>
>>>>>>> On Fri, Feb 6, 2015 at 5:31 PM, Rui Ueyama <ruiu at google.com> wrote:
>>>>>>> There are two questions.
>>>>>>>
>>>>>>> Firstly, do you think the on-disk format needs to be compatible with
>>>>>>> a C++ struct so that we can cast that memory buffer to the struct?
>>>>>>> That may be super-fast, but it also comes with many limitations. It's
>>>>>>> hard to extend, for example. Every time we want to store
>>>>>>> variable-length objects we need to define a string-table-like data
>>>>>>> structure. And I'm not very sure that it's fastest: because mmap'able
>>>>>>> objects are not very compact on disk, slow disk IO could be a
>>>>>>> bottleneck if we compare that with a more compact file format. I
>>>>>>> believe Protobufs or Thrift are fast enough or might even be faster.
>>>>>>
>>>>>> I'm not sure here. Although I do question whether the object files will
>>>>>> even need to be read from disk in your standard edit/compile/debug
>>>>>> loop or on a build server. I believe we'll need real data to determine
>>>>>> this.
>>>>>>
>>>>>>>
>>>>>>> Secondly, do you know why we are dumping the post-linked object file
>>>>>>> to Native format? If we want to have a different kind of *object*
>>>>>>> file format, we would want to have a tool to convert an object file
>>>>>>> in an existing file format (say, ELF) to "native", and teach LLD how
>>>>>>> to read from that file. Currently we are writing a file in the middle
>>>>>>> of the linking process, which doesn't make sense to me.
>>>>>>
>>>>>> This is an artifact of having the native format before we had any
>>>>>> readers. I agree that it's weird and not terribly useful to write to
>>>>>> native format in the middle of the link, although I have found it
>>>>>> helpful to output yaml. There's no need to be able to read it back in
>>>>>> and resume though.
>>>>>
>>>>> Even for YAML it doesn't make much sense to write it to a file and read
>>>>> it back from the file in the middle of the link, does it? I found that
>>>>> being able to output YAML is useful too, but round-trip is a different
>>>>> thing. In the middle of the process, we have a bunch of additional
>>>>> information that doesn't exist in the input files and doesn't have to
>>>>> be output to the link result. The ability to serialize that
>>>>> intermediate result is not useful.
>>>>>
>>>>> Shankar, you added these round-trip tests. Do you have any opinion?
>>>>>
>>>>>> Ideally lld -r would be the tool we use to convert COFF/ELF/MachO to
>>>>>> the native format.
>
> --
> Simon Atanasyan
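
Rui's remark in the quoted thread, that a cast-able on-disk layout needs a string-table-like structure for every variable-length object, can be illustrated with a rough sketch. This is not lld's Native format; the record layout, the field names and the `pack_atoms`/`read_atom` helpers are all assumptions made up for illustration. Fixed-size records stay directly indexable, and names live in a separate byte table addressed by offset and length.

```python
import struct

# Hypothetical fixed-size record: name offset, name length, content size.
# "<III" means three little-endian 32-bit unsigned ints with no padding.
RECORD = struct.Struct("<III")
HEADER = struct.Struct("<I")  # number of records


def pack_atoms(atoms):
    """Pack (name, size) pairs into a header, fixed-size records, and a string table."""
    strtab = bytearray()
    records = bytearray()
    for name, size in atoms:
        encoded = name.encode("utf-8")
        # The record stores only where the name lives, never the name itself.
        records += RECORD.pack(len(strtab), len(encoded), size)
        strtab += encoded
    return bytes(HEADER.pack(len(atoms)) + records + strtab)


def read_atom(buf, index):
    """Read one record by index without parsing the records before it."""
    (count,) = HEADER.unpack_from(buf, 0)
    if not 0 <= index < count:
        raise IndexError(index)
    name_off, name_len, size = RECORD.unpack_from(
        buf, HEADER.size + index * RECORD.size
    )
    strtab_start = HEADER.size + count * RECORD.size
    start = strtab_start + name_off
    name = bytes(buf[start:start + name_len]).decode("utf-8")
    return name, size
```

The records are cheap to index, but every new variable-length field needs its own offset/length pair and table, which is the extensibility cost Rui describes.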
Simon Atanasyan
2015-Feb-08 11:08 UTC
[LLVMdev] [lld] Representation of lld::Reference with a fake target
I am definitely not suggesting that we abandon the atom/reference model. My idea is to stop treating the YAML/Native formats as a *union* of all target-specific properties, and instead allow target-specific code to arbitrarily extend the YAML/Native formats, the way MIPS, ARM, X86 etc. can extend the ELF format. That approach can be implemented in two ways. The first is to move the real YAML/Native reading/writing into the target-specific code, so that lldNative and lldYAML define only the common upper-level structures for both formats and provide some common routines for reading/writing. The second is to allow atoms and references to keep sets of target-specific attributes in some general form, like key->value dictionaries.

On Sat, Feb 7, 2015 at 6:52 PM, Shankar Easwaram <shankarke at gmail.com> wrote:
> We are modeling target-specific functionality using references. Doesn't
> your idea defeat the purpose of the atom model? Atoms are mostly target
> neutral, and the YAML/Native format represents just an atom. Having a
> derived class for atoms will have an impact on how we test lld, IMO.
>
> We could continue to model using references, in my opinion, and add some
> metadata to the atom where references are not able to model something.
>
>> On Feb 7, 2015, at 02:36, Simon Atanasyan <simon at atanasyan.com> wrote:
>>
>> My 2c: maybe we should not try to put all target-specific object file
>> formats into a single YAML/Native representation. Let's define a
>> universal format for the file "header" of the YAML/Native representation,
>> and probably some top-level structures common to all targets, and allow
>> target-specific code to arbitrarily extend these formats. For example,
>> code in ReaderWriter/ELF will know how to convert ELF object files
>> into the YAML/Native form. In that case we in fact get some
>> incompatible YAML/Native formats for ELF, PECOFF, MachO etc. But I
>> think that is not a problem.

--
Simon Atanasyan
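
As a rough illustration of the second option Simon describes, here is a minimal sketch of atoms and references that carry generic key->value attribute sets. It is not lld code: the `Atom` and `Reference` classes, the `dump_atom` helper, the `target-attrs` key and the attribute names are all hypothetical. The point is only that a common writer can emit target-specific attributes without understanding them.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Reference:
    kind: str
    offset: int
    target: str
    # Opaque target-specific attributes; the common code never interprets them.
    attrs: Dict[str, str] = field(default_factory=dict)


@dataclass
class Atom:
    name: str
    content_type: str
    references: List[Reference] = field(default_factory=list)
    attrs: Dict[str, str] = field(default_factory=dict)


def _dump_attrs(attrs: Dict[str, str], indent: str) -> List[str]:
    """Emit attributes in sorted key order so the output is deterministic."""
    if not attrs:
        return []
    lines = [f"{indent}target-attrs:"]
    lines += [f"{indent}  {key}: {attrs[key]}" for key in sorted(attrs)]
    return lines


def dump_atom(atom: Atom) -> str:
    """Write one atom in a YAML-like text form using only target-neutral code."""
    lines = [f"- name: {atom.name}", f"  type: {atom.content_type}"]
    lines += _dump_attrs(atom.attrs, "  ")
    if atom.references:
        lines.append("  references:")
        for ref in atom.references:
            lines.append(f"    - kind: {ref.kind}")
            lines.append(f"      offset: {ref.offset}")
            lines.append(f"      target: {ref.target}")
            lines += _dump_attrs(ref.attrs, "      ")
    return "\n".join(lines)
```

A target would fill in `attrs` when it creates atoms, for example `Atom("main", "code", attrs={"mips.abi": "o32"})`, and read the same keys back in its own writer. The trade-off is that the keys are stringly typed, so nothing in the common layer can validate them.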
Shankar Easwaram
2015-Feb-08 13:29 UTC
[LLVMdev] [lld] Representation of lld::Reference with a fake target
I agree with you that we should consider one of the approaches below.

> On Feb 8, 2015, at 05:08, Simon Atanasyan <simon at atanasyan.com> wrote:
>
> I am definitely not suggesting that we abandon the atom/reference model.
> My idea is to stop treating the YAML/Native formats as a *union* of all
> target-specific properties, and instead allow target-specific code to
> arbitrarily extend the YAML/Native formats, the way MIPS, ARM, X86 etc.
> can extend the ELF format. That approach can be implemented in two ways.
> The first is to move the real YAML/Native reading/writing into the
> target-specific code, so that lldNative and lldYAML define only the common
> upper-level structures for both formats and provide some common routines
> for reading/writing. The second is to allow atoms and references to keep
> sets of target-specific attributes in some general form, like key->value
> dictionaries.
>
>> On Sat, Feb 7, 2015 at 6:52 PM, Shankar Easwaram <shankarke at gmail.com> wrote:
>> We are modeling target-specific functionality using references. Doesn't
>> your idea defeat the purpose of the atom model? Atoms are mostly target
>> neutral, and the YAML/Native format represents just an atom. Having a
>> derived class for atoms will have an impact on how we test lld, IMO.
>>
>> We could continue to model using references, in my opinion, and add some
>> metadata to the atom where references are not able to model something.
>>
>>> On Feb 7, 2015, at 02:36, Simon Atanasyan <simon at atanasyan.com> wrote:
>>>
>>> My 2c: maybe we should not try to put all target-specific object file
>>> formats into a single YAML/Native representation. Let's define a
>>> universal format for the file "header" of the YAML/Native representation,
>>> and probably some top-level structures common to all targets, and allow
>>> target-specific code to arbitrarily extend these formats. For example,
>>> code in ReaderWriter/ELF will know how to convert ELF object files
>>> into the YAML/Native form. In that case we in fact get some
>>> incompatible YAML/Native formats for ELF, PECOFF, MachO etc. But I
>>> think that is not a problem.
>
> --
> Simon Atanasyan