thr3ads.net - llvm dev - [llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler) [Jan 2018]

Zachary Turner via llvm-dev

2018-Jan-30 04:54 UTC

[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)

You can make a PDB per lib (consider msvcrtd.pdb which ships with MSVC),
but all these per-lib PDBs would have to be merged into a single master PDB
at the end, so you still can't avoid that final merge.  In a way, that's
similar to the idea behind /DEBUG:FASTLINK (keep the debug info in object
files to eliminate the cost of merging types and symbol records) and we know
what the problems with /DEBUG:FASTLINK are.

The PDB generation code in LLD is still completely single threaded, so
that's one area for huge potential gains, but only some parts of the
algorithm are parallelizable.  We're trying to squeeze every last bit of
performance out of the single-threaded case first before we parallelize,
but that option is definitely still there for us.

On Mon, Jan 29, 2018 at 4:35 PM Leonardo Santagada <santagada at gmail.com>
wrote:
> Does packing obj files in .lib help linking in any way? My understanding
> is that there would be no difference. It could help if I could make a pdb
> per lib, but there is no way to do so... Maybe we could implement this in
> lld?
>
> On 29 Jan 2018 22:14, "Zachary Turner" <zturner at google.com> wrote:
>
>> Yes we've discussed many different ideas for incremental linking, but our
>> conclusion is that you can only get one of Fast|Simple.  If you want it to
>> be fast it has to be complicated and if you want it to be simple then it's
>> going to be slow.
>>
>> Consider the case where you edit one .cpp file and change this:
>>
>> int x = 0, y = 7;
>>
>> to this:
>>
>> int x = 0;
>> short y = 7;
>>
>> Because different instructions operate on shorts vs ints, some of the
>> instruction encodings will be different and potentially of a different
>> size.
>>
>> Because of this, the contribution to the .text section from this object
>> file is going to be a different size.
>>
>> Because of that, all subsequent object files will start at a different
>> absolute file address in the final executable.
>>
>> Because of that, every single symbol in every single object file will
>> need to be updated in the final PDB.
>>
>> There are many other things that need to happen as well, but the point is
>> that a trivial change to a cpp file can explode into many changes in the
>> final PDB.
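
The cascade described above can be illustrated with a small Python sketch. It assumes a deliberately simplified layout in which each object's .text contribution is packed back to back with no alignment or padding; the sizes are made up for illustration and `layout` is a hypothetical helper, not anything from LLD.

```python
from itertools import accumulate

def layout(sizes):
    """Return the start offset of each object's .text contribution,
    assuming contributions are simply packed back to back."""
    return [0] + list(accumulate(sizes))[:-1]

# Hypothetical .text contribution sizes (bytes) for five object files.
before = [120, 64, 300, 48, 96]
after = before.copy()
after[1] = 66  # the edited object's code grows by two bytes

old, new = layout(before), layout(after)
# Indices of objects whose start offset changed after the edit.
moved = [i for i, (a, b) in enumerate(zip(old, new)) if a != b]
print(moved)
```

In this toy model every object laid out after the edited one moves, so every symbol address recorded for those objects would need to be rewritten, even though only one source file changed.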
>>
>> There are ways to handle this, but they're not simple.  We have some
>> ideas, but for the moment we are focused on making full linking as fast as
>> possible because it's much easier and still provides benefits.  We think we
>> can get it fast enough that it will be acceptable, and that should give us
>> some extra time to do incremental linking properly.
>>
>> On Mon, Jan 29, 2018 at 1:07 PM Leonardo Santagada <santagada at gmail.com>
>> wrote:
>>
>>> About incremental linking, the only thing from my benchmark that needs
>>> to be incremental is the pdb patching, as generating the binary seems
>>> faster than incremental linking on link.exe. So did anyone propose
>>> renaming the current binary, writing a new one, and then diffing the coff
>>> obj and using that info to rewrite just that part of the pdb? Or another
>>> idea is making the build system feed into the linker which files changed,
>>> so that only their types/debug information needs to be compared instead
>>> of all of them?
>>>
>>> On Mon, Jan 29, 2018 at 7:55 PM, Zachary Turner <zturner at google.com>
>>> wrote:
>>>
>>>> Not a lot.
>>>>
>>>> /TIME will show high level timing of the various phases (this is the
>>>> same option MSVC uses).
>>>>
>>>> If you want anything more detailed than that, vTune or ETW+WPA (
>>>> https://github.com/google/UIforETW/releases) are probably what you'll
>>>> need to do.
>>>>
>>>> (We'd definitely love patches to improve performance, or even just
>>>> ideas about how to make things faster.  Improving link speed is one of
>>>> our biggest priorities.)
>>>>
>>>> On Mon, Jan 29, 2018 at 10:47 AM Leonardo Santagada <
>>>> santagada at gmail.com> wrote:
>>>>
>>>>> Yeah true, are there any switches to profile the linker?
>>>>>
>>>>> On 29 Jan 2018 18:43, "Zachary Turner" <zturner at google.com> wrote:
>>>>>
>>>>>> Part of the reason why lld is so fast is because we map every input
>>>>>> file into memory up front and rely on the virtual memory manager in
>>>>>> the kernel to make this fast.  Generally speaking, this is a lot
>>>>>> faster than opening a file, reading and processing it, and closing the
>>>>>> file.  The downside, as you note, is that it uses a lot of memory.
>>>>>>
>>>>>> But there's a catch.  The kernel is smart enough to share the
>>>>>> physical memory pages when you map the same file multiple times from
>>>>>> multiple processes.  So it only looks like the memory usage is high
>>>>>> because it reserves a large amount of address space in each process.
>>>>>> But the total amount of physical memory used will not increase when
>>>>>> additional instances of the same file are mapped.
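
As a rough illustration of the mapping approach described above, here is a Python sketch using the standard `mmap` module. It is not how LLD is implemented; the temporary file is a hypothetical stand-in for an input .obj, and `map_readonly` is a made-up helper name.

```python
import mmap
import os
import tempfile

def map_readonly(path):
    """Map a whole file read-only; the kernel faults pages in lazily."""
    with open(path, "rb") as f:
        # Length 0 maps the entire file. The mapping stays valid after close.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

# Hypothetical stand-in for an input .obj file: the first two bytes of a
# COFF header hold the machine type (0x8664 for x64, little-endian).
with tempfile.NamedTemporaryFile(delete=False, suffix=".obj") as tmp:
    tmp.write(b"\x64\x86" + b"\x00" * 4094)
    path = tmp.name

view_a = map_readonly(path)
view_b = map_readonly(path)    # second mapping of the same file
machine = view_a[:2]           # slice without an explicit read() of the file
same = view_a[:2] == view_b[:2]

view_a.close()
view_b.close()
os.remove(path)
```

Both views are backed by the same cached file pages, so the second mapping adds address space but, for read-only file-backed mappings, no second copy of the data in physical memory.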
>>>>>>
>>>>>> On Mon, Jan 29, 2018 at 9:24 AM Leonardo Santagada <
>>>>>> santagada at gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>> I cleaned up my tests and figured out that the obj file generated with
>>>>>>> problems was only with msvc 2015, so trying again with msvc 2017 I get:
>>>>>>>
>>>>>>> lld-link: 4s
>>>>>>> lld-link /debug: 1m30s and ~20gb of ram
>>>>>>> lld-link /debug:ghash: 59s and ~20gb of ram
>>>>>>> link: 13s
>>>>>>> link /debug:fastlink: 43s and 1gb of ram
>>>>>>> link specialpdb: 1m10s and 4gb of ram
>>>>>>> link /debug: 9m16s min and >14gb of ram
>>>>>>>
>>>>>>> link incremental: 8s when it works.
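
To put the timings above side by side, this small Python sketch converts them to seconds and expresses each debug-info configuration as a speedup relative to `link /debug`. The numbers are copied from the list above; `seconds` is a hypothetical helper.

```python
import re

def seconds(text):
    """Convert strings like '9m16s' or '59s' to seconds."""
    m = re.fullmatch(r"(?:(\d+)m)?(\d+)s", text)
    return int(m.group(1) or 0) * 60 + int(m.group(2))

# Timings as reported in the list above.
timings = {
    "lld-link /debug": "1m30s",
    "lld-link /debug:ghash": "59s",
    "link /debug:fastlink": "43s",
    "link specialpdb": "1m10s",
    "link /debug": "9m16s",
}

baseline = seconds(timings["link /debug"])
# How many times faster each configuration is than a full `link /debug`.
speedup = {name: round(baseline / seconds(t), 1) for name, t in timings.items()}
for name, factor in speedup.items():
    print(f"{name}: {factor}x")
```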
>>>>>>>
>>>>>>>
>>>>>>> *specialpdb is created by passing the same pdb output file to a set
>>>>>>> of compilation units (e.g. a folder), so it dedups the symbols
>>>>>>> before the final linking, but that does decrease the concurrency as
>>>>>>> this step can't be done after linking.
>>>>>>>
>>>>>>>
>>>>>>> My question is: in the set of patches you guys haven't upstreamed, is
>>>>>>> there anything that makes compilation use less memory? Or, asking more
>>>>>>> directly, when will those patches make it upstream, or can I try them?
>>>>>>> The memory usage of lld-link is a little worrying, as we have around
>>>>>>> 6-8 binaries that we link for windows and they mostly use the same
>>>>>>> libraries, so 20gb of ram each means we probably can't link them all
>>>>>>> together anymore.
>>>>>>>
>>>>>>>
>>>>>>> Tomorrow I will send my tool and changes to lld so more people can
>>>>>>> try this out and tell if it helps with their msvc only code.
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Jan 28, 2018 at 11:22 PM, Zachary Turner <zturner at google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I don’t have pgo numbers. When I build using -flto=thin the link
>>>>>>>> time is significantly faster than msvc /ltcg and runtime is slightly
>>>>>>>> faster, but I haven’t tested on a large variety of different
>>>>>>>> workloads, so YMMV. Link time will definitely be faster though.
>>>>>>>>
>>>>>>>> On Sun, Jan 28, 2018 at 2:20 PM Leonardo Santagada <
>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> This part is only for objects with /Z7 debug information in them,
>>>>>>>>> right? I think most of the third parties are either .lib/obj without
>>>>>>>>> debug information, or the same with information in pdb files.
>>>>>>>>> Rewriting all .lib/.obj with /Z7 information seems doable with a
>>>>>>>>> small python script; the pdb one is going to be more work, but I
>>>>>>>>> always wanted to know how a pdb file is structured so "fun" times
>>>>>>>>> ahead. But yeah, printing it out and timing it might be very useful
>>>>>>>>> indeed.
>>>>>>>>>
>>>>>>>>> Did anyone try to compile/link lld-link.exe with LTO+PGO to see
>>>>>>>>> how much faster it can get? I might try that as well, as a 10% speed
>>>>>>>>> improvement might be handy.
>>>>>>>>>
>>>>>>>>> On Sun, Jan 28, 2018 at 11:14 PM, Zachary Turner <
>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Look for this code in lld/coff/pdb.cpp
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> if (Config->DebugGHashes) {
>>>>>>>>>>   ArrayRef<GloballyHashedType> Hashes;
>>>>>>>>>>   std::vector<GloballyHashedType> OwnedHashes;
>>>>>>>>>>   if (Optional<ArrayRef<uint8_t>> DebugH = getDebugH(File))
>>>>>>>>>>     Hashes = getHashesFromDebugH(*DebugH);
>>>>>>>>>>   else {
>>>>>>>>>>     OwnedHashes = GloballyHashedType::hashTypes(Types);
>>>>>>>>>>     Hashes = OwnedHashes;
>>>>>>>>>>   }
>>>>>>>>>>
>>>>>>>>>> In the else block there, add a log message that says
>>>>>>>>>> "synthesizing .debug$h section for " + Obj->Name
>>>>>>>>>>
>>>>>>>>>> See how many of these you get. When I build chrome + all third
>>>>>>>>>> party libraries this way I get about 100, which is small enough to
>>>>>>>>>> still see large performance gains.
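
The fallback being discussed (use the stored hashes when an object carries them, otherwise compute them at link time and log it) can be sketched in Python as follows. This is only an analogy for the C++ excerpt above: the dictionary layout and the `hashes_for` helper are hypothetical, and hashing raw type bytes with SHA-1 is a simplification of what a real global type hash has to do.

```python
import hashlib

def hashes_for(obj, log):
    """Return per-type hashes, preferring ones already stored with the object."""
    if obj.get("debug_h") is not None:
        return obj["debug_h"]            # precomputed: nothing to hash at link time
    # Fallback: hash every type record now, and note that we had to.
    log.append(f"synthesizing .debug$h section for {obj['name']}")
    return [hashlib.sha1(t).digest() for t in obj["types"]]

# Hypothetical inputs: one object without stored hashes, one with them.
objects = [
    {"name": "a.obj", "types": [b"int", b"short"], "debug_h": None},
    {"name": "b.obj", "types": [b"int"],
     "debug_h": [hashlib.sha1(b"int").digest()]},
]

log = []
all_hashes = [hashes_for(o, log) for o in objects]
print(len(log), "object(s) needed hashes synthesized")
```

Counting the log lines gives the number of inputs that fall into the slow path, which is the figure the message above suggests checking.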
>>>>>>>>>>
>>>>>>>>>> If you have many 3rd party libraries, it may be necessary to
>>>>>>>>>> rewrite the .lib files too, not just the .obj files. Eventually I’ll
>>>>>>>>>> get around to implementing all of this, as well as better heuristics
>>>>>>>>>> in lld-link to disable ghash if it’s going to be slow.
>>>>>>>>>>
>>>>>>>>>> On Sun, Jan 28, 2018 at 1:51 PM Leonardo Santagada <
>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Ok, I went for a kind of middle-ground solution: I patch the obj
>>>>>>>>>>> files, but as adding a new section didn't seem to work, I add a
>>>>>>>>>>> "shadow" section by editing the pointer to line numbers and the
>>>>>>>>>>> virtual size on the .debug$T section. Although technically broken,
>>>>>>>>>>> both link.exe and lld-link.exe don't seem to mind the alterations,
>>>>>>>>>>> and as the shadow .debug$H is not really a section anymore (it's
>>>>>>>>>>> just some bytes at the end of the file) it doesn't change anything
>>>>>>>>>>> else that matters. With that I could do my first test with a subset
>>>>>>>>>>> of our code base, and the results are not good. I found one of our
>>>>>>>>>>> sources that breaks the ghash computation; I will get more info on
>>>>>>>>>>> this and post a proper bug report, but I guess it's type information
>>>>>>>>>>> that is generated only by msvc. The other, more alarming problem is
>>>>>>>>>>> that linking is way slower with the ghashes... my guess is that we
>>>>>>>>>>> have a bunch of pdb files for some third party libraries and
>>>>>>>>>>> calculating those ghashes takes more time than the actual linking of
>>>>>>>>>>> this small part of the source (it links in 4s in both link.exe and
>>>>>>>>>>> lld-link.exe without ghashes).
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jan 26, 2018 at 8:52 PM, Leonardo Santagada <
>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> We don't generate any .lib as those don't work well with
>>>>>>>>>>>> incremental linking (and give zero advantages when linking AFAIK),
>>>>>>>>>>>> and it would be pretty easy to have a modern format for having a
>>>>>>>>>>>> .ghash for multiple files, something simple like size prefixed name
>>>>>>>>>>>> and then size prefixed ghash blobs.
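
One possible reading of the "size prefixed name and then size prefixed ghash blobs" idea, sketched in Python. No such format is specified in this thread, so everything here is an assumption: the 32-bit little-endian length prefixes, the UTF-8 names, and the `pack_ghash` / `unpack_ghash` helper names.

```python
import struct

def pack_ghash(entries):
    """Serialize (object name, ghash blob) pairs as length-prefixed fields."""
    out = bytearray()
    for name, blob in entries:
        raw = name.encode("utf-8")
        out += struct.pack("<I", len(raw)) + raw    # size-prefixed name
        out += struct.pack("<I", len(blob)) + blob  # size-prefixed hash blob
    return bytes(out)

def unpack_ghash(data):
    """Inverse of pack_ghash: walk the buffer one prefixed field at a time."""
    entries, pos = [], 0
    while pos < len(data):
        fields = []
        for _ in range(2):                          # name, then blob
            (size,) = struct.unpack_from("<I", data, pos)
            pos += 4
            fields.append(data[pos:pos + size])
            pos += size
        entries.append((fields[0].decode("utf-8"), fields[1]))
    return entries

# Hypothetical content: two objects with placeholder 20-byte hash blobs.
sample = [("a.obj", b"\x01" * 20), ("b.obj", b"\x02" * 40)]
packed = pack_ghash(sample)
restored = unpack_ghash(packed)
```

A flat layout like this lets a single file carry hashes for many objects, which is the case raised in the message above for inputs that are not packed into a .lib.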
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jan 26, 2018 at 8:44 PM, Zachary Turner <
>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> We considered that early on, but most object files actually
>>>>>>>>>>>>> end up in .lib files so unless there were a way to connect the
>>>>>>>>>>>>> objects in the .lib to the corresponding .ghash files, this would
>>>>>>>>>>>>> disable ghash usage for a large number of inputs. Supporting both
>>>>>>>>>>>>> is an option, but it adds a bit of complexity and I’m not totally
>>>>>>>>>>>>> convinced it’s worth it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Jan 26, 2018 at 11:38 AM Leonardo Santagada <
>>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> it does.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I just had an epiphany: why not just write a .ghash file and
>>>>>>>>>>>>>> have lld read those if they exist for an .obj file?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Seems much simpler than trying to wire up a 20 year old file
>>>>>>>>>>>>>> format. I will try to do this; is something like this acceptable
>>>>>>>>>>>>>> for LLD? The cool thing is that I can generate .ghash for .lib or
>>>>>>>>>>>>>> any obj lying around (maybe even for pdb in the future).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Jan 26, 2018 at 8:32 PM, Zachary Turner <
>>>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In general, we should be able to accept any MSVC .obj file
>>>>>>>>>>>>>>> to LLD.  At the very least, we're not aware of any cases that
>>>>>>>>>>>>>>> don't work.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Does your MSVC .obj file link fine before you add the
>>>>>>>>>>>>>>> .debug$H?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Jan 26, 2018 at 11:23 AM Leonardo Santagada <
>>>>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Okay, apparently coff2yaml and yaml2coff are not in a great
>>>>>>>>>>>>>>>> place as they both don't deal well with the fact that you can
>>>>>>>>>>>>>>>> have overlapping sections, which seems to be what clang-cl
>>>>>>>>>>>>>>>> produces (the .data section points to the same place as a later
>>>>>>>>>>>>>>>> section). Which is not a big problem for me particularly,
>>>>>>>>>>>>>>>> because msvc doesn't even generate .data sections in .obj.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm trying to put support for .bss sections in both
>>>>>>>>>>>>>>>> coff2yaml and yaml2coff... but I can still link clang-cl
>>>>>>>>>>>>>>>> generated files just fine with my transformations... what does
>>>>>>>>>>>>>>>> give me problems is msvc .obj files. Have you tried to link one
>>>>>>>>>>>>>>>> of these?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Jan 26, 2018 at 8:05 PM, Leonardo Santagada <
>>>>>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> yeah, apparently .bss has a flag of uninitialized data that
>>>>>>>>>>>>>>>>> is not being respected on the layout of the coff files (it
>>>>>>>>>>>>>>>>> should skip those sections) but I dunno what to do with .data
>>>>>>>>>>>>>>>>> as it doesn't have a size.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> (resending as apparently my pastes generated a ton of
>>>>>>>>>>>>>>>>> hidden html data and this message hit the mailing list limit
>>>>>>>>>>>>>>>>> of 100k)

Leonardo Santagada via llvm-dev

2018-Jan-30 17:21 UTC

Today I played around with replacing the sha1 with xxHash64 and the results so
far are bad. Linking times almost doubled and I can't really explain why;
the only thing that comes to mind is hash collisions, but on type names there
should be very few with 64-bit hashes.
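
As a back-of-the-envelope check on the collision intuition, this Python sketch uses the usual birthday approximation: among n values drawn uniformly from a space of 2^bits, the expected number of colliding pairs is about n(n-1)/2 divided by 2^bits. It assumes the hash output behaves like uniform random values, and the record counts are hypothetical, not measurements from this build.

```python
def expected_collisions(n, bits):
    """Approximate expected number of colliding pairs among n uniform hashes."""
    return n * (n - 1) / 2 / 2**bits

# Hypothetical numbers of hashed type records.
for n in (10**6, 10**8):
    print(f"{n} records, 64-bit hash: {expected_collisions(n, 64):.3g}")
```

Under those assumptions the expected count stays far below one even for a hundred million records, which suggests the slowdown would have to come from somewhere other than collisions.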

Any reason why you are trying blake2 and not murmurhash3 or xxHash64?

About creating a pdb per lib: you can tell msvc to write the pdb of every
.obj compilation to the same file, but AFAIK you can't take 20 files already
compiled to .obj (with /Z7 or /Zi) and then merge all their debug information
into one .pdb file. That would make our links much faster, I think, as people
are either changing headers (and then they know they have to wait) or changing
a single/few .cpp files. It would be great to split our 3k obj debug
information into groups so that this linking step can be parallelized. Is
there any support maybe for merging pdbs with a pdb util and then feeding that
to lld-link instead of .obj debug info?

I also re-read the post about ghash and it says blink links in 88s; is the 28s
you talk about only with unreleased optimizations?

-- 

Leonardo Santagada

Zachary Turner via llvm-dev

2018-Jan-30 17:39 UTC

It turns out there were some problems with the measurements in that blog
post.  I built LLD with the RelWithDebInfo configuration, which we later
found out uses /Ob1 instead of /Ob2, so it inlines less than a Release
build.  That was worth some cycles.  Then some more optimizations went in
after that.  And to get down to 28s I also used an LTO'ed build of lld.

If you're building LLD at ToT (top of tree) you should have everything
needed to reproduce those numbers, but they will vary with the speed of
your CPU, obviously.

On Tue, Jan 30, 2018 at 9:21 AM Leonardo Santagada <santagada at gmail.com>
wrote:
> Today I played around with replacing SHA1 with xxHash64, and the results
> so far are bad. Linking times almost doubled and I can't really explain
> why. The only thing that comes to mind is hash collisions, but on type
> names there should be very few with 64-bit hashes.
>
> Any reason why you are trying blake2 and not murmurhash3 or xxHash64?
>
> About creating a pdb per lib: you can tell msvc to write the pdb of every
> .obj compilation to the same file, but AFAIK you can't take 20 files
> already compiled to .obj (with /Z7 or /Zi) and then merge all their debug
> information into one .pdb file. That would make our links much faster, I
> think, as people are either changing headers (and then they know they
> have to wait) or changing a single/few .cpp files. It would be great to
> split the debug information of our 3k obj files into groups so that
> these linking steps can be parallelized. Is there any support, maybe, for
> merging pdbs with pdb util and then feeding that to lld-link instead of
> .obj debug info?
>
> I also re-read the post about ghash and it says blink links in 88s. Is
> the 28s you talk about with unreleased optimizations only?

Leonardo Santagada via llvm-dev

2018-Jan-30 20:28 UTC

Nice. And why are you trying blake2 instead of a faster hash algorithm? And
do you have any guess as to why xxHash64 wasn't faster than SHA1? I still
have to see how many collisions I get with it, but it seems very improbable
that collisions on 64-bit hashes would be the problem.
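
As a rough sanity check on the collision question, the birthday
approximation can be computed directly. This is only a sketch and not code
from LLD: the function name is made up for illustration, it assumes the
hash output is uniformly distributed, and any record count you feed it is
a guess rather than a measurement from a real link.

```cpp
#include <cmath>

// Birthday approximation: expected number of colliding pairs when n
// distinct records are hashed uniformly into 2^bits possible values.
//   E[pairs] ~= n * (n - 1) / 2 / 2^bits
// Hypothetical helper for illustration only; it is not part of LLD.
double expectedCollidingPairs(double n, int bits) {
  const double pairs = n * (n - 1.0) / 2.0;   // number of record pairs
  const double space = std::ldexp(1.0, bits); // 2^bits as a double
  return pairs / space;
}
```

With a hypothetical ten million distinct type records,
expectedCollidingPairs(1e7, 64) comes out far below one, so under the
uniformity assumption collisions alone are unlikely to explain a doubled
link time. It says nothing about other possible causes, such as how the
hash is computed or how the hash table behaves.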

>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
(resending as apparently my pastes generated a ton of
>>>>>>>>>>>>>>>>>>>
hidden html data and this message hit the mailinglist limit of 100k)
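For readers following the sidecar idea quoted above (a .ghash file holding a size-prefixed name followed by a size-prefixed ghash blob per object), here is a minimal sketch of what such a file could look like. Nothing in it comes from LLD: the 32-bit little-endian length prefix, the `GHashRecord` struct and the `encodeRecords` / `decodeRecords` names are all assumptions made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record: one object file name plus its raw ghash bytes.
struct GHashRecord {
  std::string name;
  std::vector<std::uint8_t> hashes;
};

// Append a 32-bit little-endian length (an assumed encoding), then the bytes.
static void appendSized(std::vector<std::uint8_t> &out,
                        const std::uint8_t *data, std::size_t size) {
  assert(size <= 0xFFFFFFFFu); // the prefix is only 32 bits wide
  const auto prefix = static_cast<std::uint32_t>(size);
  for (int shift = 0; shift < 32; shift += 8)
    out.push_back(static_cast<std::uint8_t>(prefix >> shift));
  out.insert(out.end(), data, data + size);
}

// Serialize every record as: sized name, then sized hash blob.
std::vector<std::uint8_t> encodeRecords(const std::vector<GHashRecord> &records) {
  std::vector<std::uint8_t> out;
  for (const GHashRecord &r : records) {
    appendSized(out, reinterpret_cast<const std::uint8_t *>(r.name.data()),
                r.name.size());
    appendSized(out, r.hashes.data(), r.hashes.size());
  }
  return out;
}

// Read one size-prefixed field starting at pos; returns false on truncation.
static bool readSized(const std::vector<std::uint8_t> &in, std::size_t &pos,
                      std::vector<std::uint8_t> &field) {
  if (in.size() - pos < 4)
    return false;
  std::uint32_t size = 0;
  for (int i = 0; i < 4; ++i)
    size |= static_cast<std::uint32_t>(in[pos + i]) << (8 * i);
  pos += 4;
  if (in.size() - pos < size)
    return false;
  const auto first = in.begin() + static_cast<std::ptrdiff_t>(pos);
  field.assign(first, first + static_cast<std::ptrdiff_t>(size));
  pos += size;
  return true;
}

// Parse the whole buffer; returns false if any record is cut short.
bool decodeRecords(const std::vector<std::uint8_t> &in,
                   std::vector<GHashRecord> &records) {
  records.clear();
  std::size_t pos = 0;
  while (pos < in.size()) {
    std::vector<std::uint8_t> name, hashes;
    if (!readSized(in, pos, name) || !readSized(in, pos, hashes))
      return false;
    records.push_back({std::string(name.begin(), name.end()), std::move(hashes)});
  }
  return true;
}
```

Keying each record by name is what would let one file cover several objects, but as noted in the quoted reply, a linker would still need some way to match those names to the members of a .lib.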

Zachary Turner via llvm-dev

2018-Jan-30 20:32 UTC


[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)

Did you change both the compiler and linker (or make sure that your objcopy
was updated to write your 64 bit hashes)?

The linker is hardcoded to expect 20-byte SHA-1 hashes; given anything else,
it will recompute them serially.
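
A minimal sketch of the kind of size check described above. This is not LLD's actual code: it assumes the precomputed hashes arrive as one flat byte blob with no header, and `kSha1Size`, `canReuseHashes` and `digestAt` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// SHA-1 digests are 20 bytes long.
constexpr std::size_t kSha1Size = 20;

// Hypothetical helper: true when `blob` holds exactly one 20-byte digest per
// type record, so the precomputed hashes can be used as-is. Any other size
// (for example 8-byte xxHash64 values) fails the check, and the caller would
// have to recompute the hashes itself.
bool canReuseHashes(const std::vector<std::uint8_t> &blob,
                    std::size_t typeRecordCount) {
  return blob.size() == typeRecordCount * kSha1Size;
}

// Pointer to the i-th digest; only meaningful after canReuseHashes() passed.
const std::uint8_t *digestAt(const std::vector<std::uint8_t> &blob,
                             std::size_t i) {
  assert((i + 1) * kSha1Size <= blob.size());
  return blob.data() + i * kSha1Size;
}
```

Under these assumptions, a blob of 64-bit hashes for N type records is 8*N bytes long, so it fails the check for any N > 0 and would fall through to the slow recompute path.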

On Tue, Jan 30, 2018 at 12:28 PM Leonardo Santagada <santagada at gmail.com>
wrote:
> Nice, and why are you trying blake2 instead of a faster hash algorithm? And
> do you have any guess as to why xxHash64 wasn't faster than SHA1? I still
> have to see how many collisions I get with it, but it seems so improbable
> that collisions on 64 bit hashes would be the problem.
>
> On 30 Jan 2018 18:39, "Zachary Turner" <zturner at google.com> wrote:
>
> It turns out there were some problems with the measurements in that blog
> post.  I built LLD with the RelWithDebInfo configuration which we later
> found out uses /Ob1 instead of /Ob2.  That was worth some cycles.  Then
> there were some more optimizations that went in after that.  And to get
> down to 28s I also used an LTO'ed build of lld.
>
> If you're building LLD at ToT you should have everything needed to
> reproduce those numbers, but it will vary depending on the speed of your
> CPU obviously.
>
> On Tue, Jan 30, 2018 at 9:21 AM Leonardo Santagada <santagada at gmail.com>
> wrote:
>
>> Today I played around replacing the sha1 with xxHash64 and the results so
>> far are bad. Linking times almost doubled and I can't really explain why;
>> the only thing that comes to mind is hash collisions, but on type names
>> they should be very few in 64bit hashes.
>>
>> Any reason why you are trying blake2 and not murmurhash3 or xxHash64?
>>
>> About creating a pdb per lib, you can tell msvc to put the pdb of every
>> .obj compilation into the same file, but AFAIK you can't, after 20 files
>> have been compiled to .obj (with /Z7 or /Zi), then merge all the debug
>> information into one .pdb file. That would make our links much faster I
>> think, as people are either changing headers (and then they know they
>> have to wait) or changing a single/few .cpp files. It would be great to
>> group our 3k obj debug information in groups so that these linking steps
>> can be parallelized. Is there any support maybe for merging pdbs with
>> pdbutil and then feeding that to lld-link instead of .obj debug info?
>>
>> I also re-read the post about ghash and it says blink links in 88s; the
>> 28s you talk about is with unreleased optimizations only?
>>
>> On Tue, Jan 30, 2018 at 5:54 AM, Zachary Turner <zturner at google.com>
>> wrote:
>>
>>> You can make a PDB per lib (consider msvcrtd.pdb which ships with MSVC),
>>> but all these per-lib PDBs would have to be merged into a single master
>>> PDB at the end, so you still can't avoid that final .  In a way, that's
>>> similar to the idea behind /DEBUG:FASTLINK (keep the debug info in object
>>> files to eliminate the cost of merging types and symbol records) and we
>>> know what the problems with /DEBUG:FASTLINK are.
>>>
>>> The PDB generation code in LLD is still completely single threaded, so
>>> that's one area for huge potential gains, but only some parts of the
>>> algorithm are parallelizable.  We're trying to squeeze every last bit of
>>> performance out of the single-threaded case first before we parallelize,
>>> but that option is definitely still there for us.
>>>
>>> On Mon, Jan 29, 2018 at 4:35 PM Leonardo Santagada <santagada at gmail.com>
>>> wrote:
>>>
>>>> Does packing obj files in .lib help linking in any way? My
>>>> understanding is that there would be no difference. It could help if I
>>>> could make a pdb per lib, but there is no way to do so... Maybe we could
>>>> implement this in lld?
>>>>
>>>> On 29 Jan 2018 22:14, "Zachary Turner" <zturner at google.com> wrote:
>>>>
>>>>> Yes we've discussed many different ideas for incremental linking, but
>>>>> our conclusion is that you can only get one of Fast|Simple.  If you want
>>>>> it to be fast it has to be complicated and if you want it to be simple
>>>>> then it's going to be slow.
>>>>>
>>>>> Consider the case where you edit one .cpp file and change this:
>>>>>
>>>>> int x = 0, y = 7;
>>>>>
>>>>> to this:
>>>>>
>>>>> int x = 0;
>>>>> short y = 7;
>>>>>
>>>>> Because different instructions operate on shorts vs ints, some of the
>>>>> instruction encodings will be different and potentially of a different
>>>>> size.
>>>>>
>>>>> Because of this, the contribution to the .text section from this
>>>>> object file is going to be a different size.
>>>>>
>>>>> Because of that, all subsequent object files will start at a different
>>>>> absolute file address in the final executable.
>>>>>
>>>>> Because of that, every single symbol in every single object file will
>>>>> need to be updated in the final PDB.
>>>>>
>>>>> There are many other things that need to happen as well, but the point
>>>>> is that a trivial change to a cpp file can explode into many changes in
>>>>> the final PDB.
>>>>>
>>>>> There are ways to handle this, but they're not simple.  We have some
>>>>> ideas, but for the moment we are focused on making full linking as fast
>>>>> as possible because it's much easier and still provides benefits.  We
>>>>> think we can get it fast enough that it will be acceptable, and that
>>>>> should give us some extra time to do incremental linking properly.
>>>>>
>>>>> On Mon, Jan 29, 2018 at 1:07 PM Leonardo Santagada <
>>>>> santagada at gmail.com> wrote:
>>>>>
>>>>>> About incremental linking, the only thing from my benchmark that
>>>>>> needs to be incremental is the pdb patching, as generating the binary
>>>>>> seems faster than incremental linking on link.exe. So did anyone
>>>>>> propose renaming the current binary, writing a new one and then
>>>>>> diffing the coff obj and using that info to just rewrite that part of
>>>>>> the pdb? Or another idea is making the build system feed into the
>>>>>> linker which files changed, so that the types/debug information of
>>>>>> just those files can be compared instead of all of them?
>>>>>>
>>>>>> On Mon, Jan 29, 2018 at 7:55 PM, Zachary Turner <zturner at google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Not a lot.
>>>>>>>
>>>>>>> /TIME will show high level timing of the various phases (this is the
>>>>>>> same option MSVC uses).
>>>>>>>
>>>>>>> If you want anything more detailed than that, vTune or ETW+WPA (
>>>>>>> https://github.com/google/UIforETW/releases) are probably what
>>>>>>> you'll need to use.
>>>>>>>
>>>>>>> (We'd definitely love patches to improve performance, or even just
>>>>>>> ideas about how to make things faster.  Improving link speed is one of
>>>>>>> our biggest priorities.)
>>>>>>>
>>>>>>> On Mon, Jan 29, 2018 at 10:47 AM Leonardo Santagada <
>>>>>>> santagada at gmail.com> wrote:
>>>>>>>
>>>>>>>> Yeah true, are there any switches to profile the linker?
>>>>>>>>
>>>>>>>> On 29 Jan 2018 18:43, "Zachary Turner" <zturner at google.com> wrote:
>>>>>>>>
>>>>>>>>> Part of the reason why lld is so fast is because we map every
>>>>>>>>> input file into memory up front and rely on the virtual memory
>>>>>>>>> manager in the kernel to make this fast.  Generally speaking, this
>>>>>>>>> is a lot faster than opening a file, reading and processing it, and
>>>>>>>>> closing the file.  The downside, as you note, is that it uses a lot
>>>>>>>>> of memory.
>>>>>>>>>
>>>>>>>>> But there's a catch.  The kernel is smart enough to share the
>>>>>>>>> physical memory pages when you map the same file multiple times from
>>>>>>>>> multiple processes.  So it only looks like the memory usage is high
>>>>>>>>> because it reserves a large amount of address space in each process.
>>>>>>>>> But the total amount of physical memory used will not increase when
>>>>>>>>> additional instances of the same file are mapped.
>>>>>>>>>
>>>>>>>>> On Mon, Jan 29, 2018 at 9:24 AM Leonardo Santagada <
>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I cleaned up my tests and figured out that the obj file generated
>>>>>>>>>> with problems was only with msvc 2015, so trying again with msvc
>>>>>>>>>> 2017 I get:
>>>>>>>>>>
>>>>>>>>>> lld-link: 4s
>>>>>>>>>> lld-link /debug: 1m30s and ~20gb of ram
>>>>>>>>>> lld-link /debug:ghash: 59s and ~20gb of ram
>>>>>>>>>> link: 13s
>>>>>>>>>> link /debug:fastlink: 43s and 1gb of ram
>>>>>>>>>> link specialpdb: 1m10s and 4gb of ram
>>>>>>>>>> link /debug: 9m16s and >14gb of ram
>>>>>>>>>>
>>>>>>>>>> link incremental: 8s when it works.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *specialpdb is created by passing the same pdb to be written to for
>>>>>>>>>> a set of compilation units (e.g. a folder), so it dedups the symbols
>>>>>>>>>> before the final linking, but that does decrease the concurrency as
>>>>>>>>>> this step can't be done after linking.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> My question is, in the set of patches you guys haven't upstreamed
>>>>>>>>>> is there anything that makes compilation use less memory? Or just
>>>>>>>>>> asking more directly, when will those patches make it upstream, or
>>>>>>>>>> can I try them? The memory usage of lld-link is a little worrying
>>>>>>>>>> as we have around 6-8 binaries that we link for windows and they
>>>>>>>>>> mostly use the same libraries, so 20gb of ram each means we probably
>>>>>>>>>> can't link them all together anymore.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Tomorrow I will send my tool and changes to lld so more people
>>>>>>>>>> can try this out and tell if it helps with their msvc only code.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sun, Jan 28, 2018 at 11:22 PM, Zachary Turner <
>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I don’t have pgo numbers. When I build using -flto=thin the link
>>>>>>>>>>> time is significantly faster than msvc /ltcg and runtime is
>>>>>>>>>>> slightly faster, but I haven’t tested on a large variety of
>>>>>>>>>>> different workloads, so YMMV. Link time will definitely be faster
>>>>>>>>>>> though.
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Jan 28, 2018 at 2:20 PM Leonardo Santagada <
>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> This part is only for objects with /Z7 debug information in them,
>>>>>>>>>>>> right? I think most of the third parties are either .lib/obj
>>>>>>>>>>>> without debug information, or the same with information in pdb
>>>>>>>>>>>> files. Rewriting all .lib/.obj with /Z7 information seems doable
>>>>>>>>>>>> with a small python script; the pdb one is going to be more work,
>>>>>>>>>>>> but I always wanted to know how a pdb file is structured so "fun"
>>>>>>>>>>>> times ahead. But yeah, printing it out and timing it might be
>>>>>>>>>>>> very useful indeed.
>>>>>>>>>>>>
>>>>>>>>>>>> Did anyone try to compile/link lld-link.exe with LTO+PGO to see
>>>>>>>>>>>> how much faster it can get? I might try that as well, as a 10%
>>>>>>>>>>>> speed improvement might be handy.
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Jan 28, 2018 at 11:14 PM, Zachary Turner <
>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Look for this code in lld/COFF/PDB.cpp
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>   if (Config->DebugGHashes) {
>>>>>>>>>>>>>     ArrayRef<GloballyHashedType> Hashes;
>>>>>>>>>>>>>     std::vector<GloballyHashedType> OwnedHashes;
>>>>>>>>>>>>>     if (Optional<ArrayRef<uint8_t>> DebugH = getDebugH(File))
>>>>>>>>>>>>>       Hashes = getHashesFromDebugH(*DebugH);
>>>>>>>>>>>>>     else {
>>>>>>>>>>>>>       OwnedHashes = GloballyHashedType::hashTypes(Types);
>>>>>>>>>>>>>       Hashes = OwnedHashes;
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the else block there, add a log message that says
>>>>>>>>>>>>> "synthesizing .debug$h section for " + Obj->Name
>>>>>>>>>>>>>
>>>>>>>>>>>>> See how many of these you get. When I build chrome + all third
>>>>>>>>>>>>> party libraries this way I get about 100, which is small enough
>>>>>>>>>>>>> to still see large performance gains.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If you have many 3rd party libraries, it may be necessary to
>>>>>>>>>>>>> rewrite the .lib files too, not just the .obj files. Eventually
>>>>>>>>>>>>> I’ll get around to implementing all of this as well, as well as
>>>>>>>>>>>>> better heuristics in lld-link to disable ghash if it’s going to
>>>>>>>>>>>>> be slow.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Jan 28, 2018 at 1:51 PM Leonardo Santagada <
>>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ok I went for kind of a middle ground solution: I patch in the
>>>>>>>>>>>>>> obj files, but as adding a new section didn't seem to work, I add
>>>>>>>>>>>>>> a "shadow" section, by editing the pointer to line number and the
>>>>>>>>>>>>>> virtual size on the .debug$T section. Although technically
>>>>>>>>>>>>>> broken, both link.exe and lld-link.exe don't seem to mind the
>>>>>>>>>>>>>> alterations, and as the shadow .debug$H is not really a section
>>>>>>>>>>>>>> anymore (it's just some bytes at the end of the file) it doesn't
>>>>>>>>>>>>>> change anything else that does matter. With that I could do my
>>>>>>>>>>>>>> first test with a subset of our code base, and the results are
>>>>>>>>>>>>>> not good. I found one of our sources that breaks the ghash
>>>>>>>>>>>>>> computation; I will get more info on this and post a proper bug
>>>>>>>>>>>>>> report, but I guess it's type information that is generated only
>>>>>>>>>>>>>> by msvc. The other, more alarming problem is that linking is way
>>>>>>>>>>>>>> slower with the ghashes... my guess is that we have a bunch of
>>>>>>>>>>>>>> pdb files for some third party libraries and calculating those
>>>>>>>>>>>>>> ghashes takes more time than the actual linking of this small
>>>>>>>>>>>>>> part of the source (it links in 4s in both link.exe and
>>>>>>>>>>>>>> lld-link.exe without ghashes).
>>>>>>>>>>>>>>
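The cascade in the quoted incremental-linking explanation (one object's .text contribution changes size, so every later contribution starts at a different address) can be sketched as below. This is a deliberately simplified model and not how LLD lays out sections: contributions are placed back to back with no alignment padding, and `layoutOffsets` and `countMoved` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Lay out each object's .text contribution back to back and return the
// start offset of every contribution (alignment padding is ignored here).
std::vector<std::uint64_t> layoutOffsets(const std::vector<std::uint64_t> &sizes) {
  std::vector<std::uint64_t> offsets;
  offsets.reserve(sizes.size());
  std::uint64_t next = 0;
  for (std::uint64_t size : sizes) {
    offsets.push_back(next);
    next += size;
  }
  return offsets;
}

// Count how many contributions start at a different offset in `after`.
std::size_t countMoved(const std::vector<std::uint64_t> &before,
                       const std::vector<std::uint64_t> &after) {
  assert(before.size() == after.size());
  std::size_t moved = 0;
  for (std::size_t i = 0; i < before.size(); ++i)
    if (before[i] != after[i])
      ++moved;
  return moved;
}
```

With four contributions of sizes 100, 200, 300 and 400 bytes, growing only the second one to 204 bytes shifts the start of both contributions after it, and in this model every symbol record pointing into those two would need a new address in the PDB.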
