kyra via llvm-dev
2017-Oct-10 19:41 UTC
[llvm-dev] Make LLD output COFF relocatable object file (like ELF's -r does). How much work is required to implement this?
On 10/10/2017 9:00 PM, Rui Ueyama wrote:
> I'm not sure if I understand correctly. If my understanding is correct,
> you are saying that GHC can link either .o or .so at runtime, which
> sounds a bit odd because .o is not designed for dynamic linking. Am I
> missing something?

Yes, the GHC runtime linker *does* link .o files, not only performing all necessary relocations but also creating trampolines for "far" code to satisfy the "small" memory model.

> I also do not understand why only static libraries need a "compile/link
> pass" -- they at least don't need a compile pass, as they contain
> compiled .o files, and they indeed need a link pass, but that's also
> true for a single big .o file generated by -r, no? After all, in order
> to link against a .a file, I think you need to pull out a .o file from
> the .a and do whatever you need to do to link a single big .o file.

I don't quite understand this. The idea is that when creating a package you should *at the very least* provide a static library a client can statically link against. You may optionally create a shared library for a client to link against, but to do so you should *recompile* the whole package because things differ now (this is how GHC works); you can't simply link all your existing object code (what you produced the static library from) into this shared library. But if you want to provide a single prelinked *.o file (for GHC runtime linker consumption), you don't need to perform any extra compile step; you simply link all your object files (exactly those that went into the package's static library) into this *.o file with 'ld -r'.

> IIUC, GHC is faster when handling .a files compared to a prelinked big
> .o file, even if they contain the same binary code/data. But it sounds
> like an artifact of the current implementation of GHC, because, in
> theory, there's no reason the former is much more inefficient than the
> latter. If that's the case, doesn't it make more sense to improve GHC?

No. The GHC **runtime** linker is much slower when handling *.a files than when linking an already prelinked big *.o file (and this is exactly the culprit of this whole story), since it goes through the whole archive and links each object module separately, doing all resolutions, relocations, and trampolines.

There is, perhaps, some confusion about what the GHC *runtime* linker is. The GHC runtime linker comes into play either when GHC is used interactively, or when GHC encounters code it has to execute at compile time (Template Haskell/quasiquotations). Thus the GHC compiler must link some external code during its own run time.

HTH.

Cheers,
Kyra
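[Editor's note: to make the "trampolines for far code" point concrete, below is a minimal, hypothetical C++ sketch of the jump-island technique a runtime linker can use under the x86-64 "small" code model. It is not GHC's actual implementation; the function names are invented, and it assumes a trampoline slot has already been allocated within 2GB of the loaded code and made executable.]

#include <cstdint>
#include <cstring>

// When object code built for the "small" code model calls a symbol that
// ends up farther than +/-2GB away, the 32-bit PC-relative displacement
// cannot reach it.  A runtime linker can emit a small trampoline ("jump
// island") near the loaded code and point the relocation at it instead.
//
// The trampoline encodes:  movabs r11, <target> ; jmp r11
static uint8_t *emitTrampoline(uint8_t *islandSlot, uint64_t target) {
  static const uint8_t stub[] = {
      0x49, 0xBB, 0, 0, 0, 0, 0, 0, 0, 0, // movabs r11, imm64
      0x41, 0xFF, 0xE3                    // jmp    r11
  };
  std::memcpy(islandSlot, stub, sizeof(stub));
  std::memcpy(islandSlot + 2, &target, sizeof(target)); // patch the imm64
  return islandSlot;
}

// Resolve a 32-bit PC-relative fixup, routing through a trampoline when the
// real target is out of rel32 range.
static void applyPCRel32(uint8_t *fixupAddr, uint64_t target, int64_t addend,
                         uint8_t *islandSlot) {
  int64_t delta =
      (int64_t)target + addend - (int64_t)(uintptr_t)fixupAddr;
  if (delta < INT32_MIN || delta > INT32_MAX) {
    uint8_t *island = emitTrampoline(islandSlot, target);
    delta = (int64_t)(uintptr_t)island + addend - (int64_t)(uintptr_t)fixupAddr;
  }
  int32_t disp = (int32_t)delta;
  std::memcpy(fixupAddr, &disp, sizeof(disp));
}

[This is the kind of per-call-site work the GHC runtime linker has to repeat for every object module it pulls out of an archive.]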
Rui Ueyama via llvm-dev
2017-Oct-10 20:01 UTC
[llvm-dev] Make LLD output COFF relocatable object file (like ELF's -r does). How much work is required to implement this?
On Tue, Oct 10, 2017 at 12:41 PM, kyra <kyrab at mail.ru> wrote:
> On 10/10/2017 9:00 PM, Rui Ueyama wrote:
>> I'm not sure if I understand correctly. If my understanding is correct,
>> you are saying that GHC can link either .o or .so at runtime, which
>> sounds a bit odd because .o is not designed for dynamic linking. Am I
>> missing something?
>
> Yes, the GHC runtime linker *does* link .o files, not only performing
> all necessary relocations but also creating trampolines for "far" code
> to satisfy the "small" memory model.
>
>> I also do not understand why only static libraries need a "compile/link
>> pass" -- they at least don't need a compile pass, as they contain
>> compiled .o files, and they indeed need a link pass, but that's also
>> true for a single big .o file generated by -r, no? After all, in order
>> to link against a .a file, I think you need to pull out a .o file from
>> the .a and do whatever you need to do to link a single big .o file.
>
> I don't quite understand this. The idea is that when creating a package
> you should *at the very least* provide a static library a client can
> statically link against. You may optionally create a shared library for
> a client to link against, but to do so you should *recompile* the whole
> package because things differ now (this is how GHC works); you can't
> simply link all your existing object code (what you produced the static
> library from) into this shared library. But if you want to provide a
> single prelinked *.o file (for GHC runtime linker consumption), you
> don't need to perform any extra compile step; you simply link all your
> object files (exactly those that went into the package's static library)
> into this *.o file with 'ld -r'.
>
>> IIUC, GHC is faster when handling .a files compared to a prelinked big
>> .o file, even if they contain the same binary code/data. But it sounds
>> like an artifact of the current implementation of GHC, because, in
>> theory, there's no reason the former is much more inefficient than the
>> latter. If that's the case, doesn't it make more sense to improve GHC?
>
> No. The GHC **runtime** linker is much slower when handling *.a files
> than when linking an already prelinked big *.o file (and this is exactly
> the culprit of this whole story), since it goes through the whole
> archive and links each object module separately, doing all resolutions,
> relocations, and trampolines.

Looks like I still do not understand why a .a can be much slower than a prelinked .o. As far as I understand, "ld -r" doesn't reduce the amount of data that much. It doesn't reduce the number of relocations, as relocations in input object files are basically passed through to the output. It doesn't reduce the number of symbols that much, as the combined object file contains a union of all symbols that appeared in the input files. So, I think the amount of data in a .a is essentially the same as in a prelinked .o. I wonder what can make a difference in speed.

> There is, perhaps, some confusion about what the GHC *runtime* linker
> is. The GHC runtime linker comes into play either when GHC is used
> interactively, or when GHC encounters code it has to execute at compile
> time (Template Haskell/quasiquotations). Thus the GHC compiler must link
> some external code during its own run time.
>
> HTH.
>
> Cheers,
> Kyra
Reid Kleckner via llvm-dev
2017-Oct-10 21:20 UTC
[llvm-dev] Make LLD output COFF relocatable object file (like ELF's -r does). How much work is required to implement this?
On Tue, Oct 10, 2017 at 1:01 PM, Rui Ueyama via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> No. The GHC **runtime** linker is much slower when handling *.a files
>> than when linking an already prelinked big *.o file (and this is
>> exactly the culprit of this whole story), since it goes through the
>> whole archive and links each object module separately, doing all
>> resolutions, relocations, and trampolines.
>
> Looks like I still do not understand why a .a can be much slower than a
> prelinked .o. As far as I understand, "ld -r" doesn't reduce the amount
> of data that much. It doesn't reduce the number of relocations, as
> relocations in input object files are basically passed through to the
> output. It doesn't reduce the number of symbols that much, as the
> combined object file contains a union of all symbols that appeared in
> the input files. So, I think the amount of data in a .a is essentially
> the same as in a prelinked .o. I wonder what can make a difference in
> speed.

I can't speak for Haskell, but ld -r can be useful for speeding up C++ links, because it acts as a pre-merging step for duplicate comdats. Consider a library that uses many instantiations of the same template with the same type. An archive will contain many copies of the template, but the single relocatable object file produced by ld -r will only contain one.
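[Editor's note: a small, hypothetical C++ illustration of that pre-merging effect; the file and function names are made up. Each translation unit that uses std::vector<int> emits its own comdat copies of the instantiated member functions, so each member .o of an archive carries a duplicate, whereas the single object produced by 'ld -r' keeps only one copy of each.]

// widget.cpp -- one of many translation units in the library
#include <vector>

// Instantiates std::vector<int>: push_back, the destructor, and the
// reallocation helpers are emitted here as comdat (linkonce) definitions.
std::vector<int> makeWidgetIds() {
  std::vector<int> ids;
  ids.push_back(42);
  return ids;
}

// gadget.cpp -- another translation unit, producing the same
// std::vector<int> comdats all over again in its own .o file.
#include <vector>

std::vector<int> makeGadgetIds() {
  std::vector<int> ids;
  ids.push_back(7);
  return ids;
}

[An archive built from these two files contains two copies of every std::vector<int> comdat; prelinking them with 'ld -r' folds the duplicates to one, so the final link, or a runtime linker, has correspondingly less to resolve.]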
kyra via llvm-dev
2017-Oct-10 21:21 UTC
[llvm-dev] Make LLD output COFF relocatable object file (like ELF's -r does). How much work is required to implement this?
On 10/10/2017 11:01 PM, Rui Ueyama wrote:
> Looks like I still do not understand why a .a can be much slower than a
> prelinked .o. As far as I understand, "ld -r" doesn't reduce the amount
> of data that much. It doesn't reduce the number of relocations, as
> relocations in input object files are basically passed through to the
> output. It doesn't reduce the number of symbols that much, as the
> combined object file contains a union of all symbols that appeared in
> the input files. So, I think the amount of data in a .a is essentially
> the same as in a prelinked .o. I wonder what can make a difference in
> speed.

Ah, good point. Only now have I realized that my perception of link times was formed when no '-split-sections' option existed. The corresponding option was '-split-objs', and a typical package's static library contained thousands of object modules. For example:

The latest official GHC 8.2.1 release "base" package's static library, built with '-split-objs', contains 25631 object modules. The static library size is 28MB; the prelinked object file size is 15MB.

My own custom-built GHC ghc-8.3.20170619 "base" package's static library, built with '-split-sections' (instead of '-split-objs'), contains only 228 object modules. The static library size is 22MB; the prelinked object file size is 15MB.

Thus, when working with '-split-sections' libraries we won't, perhaps, see such big differences in link times (remember we mean the GHC runtime linker here) between these libraries and their prelinked object counterparts. Thus, perhaps, having a '-r' option in COFF LLD is becoming much less important than I thought before.

Cheers,
Kyra