Zachary Turner via llvm-dev
2018-Jan-20 20:44 UTC
[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)
Chrome is actually one of my exact benchmark cases. When building blink_core.dll and browser_tests.exe, I get anywhere from a 20-40% reduction in link time. We have some other optimizations in the pipeline but not upstream yet. My best time so far (including other optimizations not yet upstream) is 28s on blink_core.dll, compared to 110s with /debug.

On Sat, Jan 20, 2018 at 12:28 PM Leonardo Santagada <santagada at gmail.com> wrote:

> On Sat, Jan 20, 2018 at 9:05 PM, Zachary Turner <zturner at google.com> wrote:
>
>> You probably don't want to go down the same route that clang goes through to write the object file. If you think yaml2coff is convoluted, the way clang does it will just give you a headache. There are multiple abstractions involved to account for different object file formats (ELF, COFF, MachO) and output formats (Assembly, binary file). At least with yaml2coff
>
> I think your phrase got cut there, but yeah, I just found AsmPrinter.cpp and it is convoluted.
>
>> It's true that yaml2coff is using the COFFParser structure, but if you look at the writeCOFF function in yaml2coff it's pretty bare-metal. The logic you need will be almost identical, except that instead of checking the COFFParser for the various fields, you'll check the existing COFFObjectFile, which should have similar fields.
>>
>> The only thing you need to do differently is when writing the section table and section contents, to insert a new entry. Since you're injecting a section into the middle, you'll also probably need to push back the file pointer of all subsequent sections so that they don't overlap. (E.g. if the original sections are 1, 2, 3, 4, 5 and you insert between 2 and 3, then the original sections 3, 4, and 5 would need to have their FilePointerToRawData offset by the size of the new section.)
>
> I have the PE/COFF spec open here and I'm happy that I read a bit of it, so I actually know what you are talking about... yeah, it doesn't seem too complicated.
>
>> If you need to know what values to put for the other fields in a section header, run `dumpbin /headers foo.obj` on a clang-generated object file that has a .debug$H section already (e.g. run clang with -emit-codeview-ghash-section, and look at the properties of the .debug$H section and use the same values).
>
> Thanks, I will do that, and then also look at how the CodeView part of the code does it if I can't understand some of it.
>
>> The only invariant that needs to be maintained is that Section[N]->FilePointerOfRawData == Section[N-1]->FilePointerOfRawData + Section[N-1]->SizeOfRawData
>
> Well, that and all the sections need to be in the final file... But I'm hopeful.
>
> Does anyone have times on linking a big project like Chrome with this, so that at least I know what kind of performance to expect?
>
> My numbers are something like:
>
> 1 pdb per obj file: link.exe takes ~15 minutes and 16GB of ram; lld-link.exe takes 2:30 minutes and ~8GB of ram
> around 10 pdbs per folder: link.exe takes 1 minute and 2-3GB of ram; lld-link.exe takes 1:30 minutes and ~6GB of ram
> fastlink: link.exe takes 40 seconds, but then 20 seconds of loading at the first breakpoint in the debugger, and we lost DIA support for listing symbols.
> incremental: link.exe takes 8 seconds, but it only happens when very minor changes happen.
>
> We have a non-negligible number of symbols used on some runtime systems.
>
>> On Sat, Jan 20, 2018 at 11:52 AM Leonardo Santagada <santagada at gmail.com> wrote:
>>
>>> Thanks for the tips, I now have something that reads the obj file, finds .debug$T sections, and global-hashes them (proof-of-concept kind of code). What I can't find is: how does clang itself write the coff files with global hashes? That might help me understand how to create the .debug$H section, how to update the file section count, and how to properly write this back.
>>>
>>> The code in yaml2coff is expecting to be working on the yaml COFFParser struct, and I'm having quite a bit of a headache turning the COFFObjectFile into a COFFParser object or compatible... Tomorrow I might try the very inefficient path of coff2yaml and then yaml2coff with the hashes header... but it seems way too inefficient and convoluted.
>>>
>>> On Fri, Jan 19, 2018 at 10:38 PM, Zachary Turner <zturner at google.com> wrote:
>>>
>>>> On Fri, Jan 19, 2018 at 1:02 PM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>
>>>>> On Fri, Jan 19, 2018 at 9:44 PM, Zachary Turner <zturner at google.com> wrote:
>>>>>
>>>>>> On Fri, Jan 19, 2018 at 12:29 PM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> No, I didn't; I used cl.exe from the Visual Studio toolchain. What I'm proposing is a tool for processing .obj files in COFF format, reading them and generating the GHASH part.
>>>>>>>
>>>>>>> To make our build faster we use hundreds of unity build files (.cpp's with a lot of other .cpp's in them, aka munch files) but still have a lot of single .cpp's as well (in total something like 3.4k .obj files).
>>>>>>>
>>>>>>> ps: sorry for sending to the wrong list; I was reading about llvm mailing lists and jumped when I saw what I thought was a lld-exclusive list.
>>>>>>
>>>>>> A tool like this would be useful, yes. We've talked about it internally as well and agreed it would be useful, we just haven't prioritized it. If you're interested in submitting a patch along those lines though, I think it would be a good addition.
>>>>>>
>>>>>> I'm not sure what the best place for it would be. llvm-readobj and llvm-objdump seem like obvious choices, but they are intended to be read-only, so perhaps they wouldn't be a good fit.
>>>>>>
>>>>>> llvm-pdbutil is kind of a hodgepodge of everything else related to PDBs and symbols, so I wouldn't be opposed to making a new subcommand there called "ghash" or something that could process an object file and output a new object file with a .debug$H section.
>>>>>>
>>>>>> A third option would be to make a new tool for it.
>>>>>>
>>>>>> I don't think it would be that hard to write. If you're interested in trying to make a patch for this, I can offer some guidance on where to look in the code. Otherwise it's something that we'll probably get to, I'm just not sure when.
>>>>>
>>>>> I would love to write it and contribute it back, please do tell. I did find some of the code for ghash in lld, but I'm fuzzy on the llvm codeview part of it and have never seen llvm-readobj/objdump or llvm-pdbutil, but I'm not afraid to look :)
>>>>
>>>> Luckily all of the important code is hidden behind library calls, and it should already just do the right thing, so I suspect you won't need to know much about CodeView to do this.
>>>>
>>>> I think Peter has the right idea about putting this in llvm-objcopy.
>>>>
>>>> You can look at one of the existing CopyBinary functions there, which currently only work for ELF, but you can just make a new overload that accepts a COFFObjectFile.
>>>>
>>>> I would probably start by iterating over each of the sections (getNumberOfSections / getSectionName) looking for .debug$T and .debug$H sections.
>>>>
>>>> If you find a .debug$H section then you can just skip that object file.
>>>>
>>>> If you find a .debug$T but not a .debug$H, then basically do the same thing that LLD does in PDBLinker::mergeDebugT (create a CVTypeArray and pass it to GloballyHashedType::hashTypes). That will return an array of hash values. (The format of .debug$H is the header, followed by the hash values.) Then when you're writing the list of sections, just add in the .debug$H section right after the .debug$T section.
>>>>
>>>> Currently llvm-objcopy only writes ELF files, so it would need to be taught to write COFF files. We have code to do this in the yaml2obj utility (specifically, in yaml2coff.cpp in the function writeCOFF). There may be a way to move this code somewhere else (llvm/Object/COFF.h?) so that it can be re-used by both yaml2coff and llvm-objcopy, but in the worst-case scenario you could copy the code and re-write it to work with these new structures.
>>>>
>>>> Lastly, you'll probably want to put all of this behind an option in llvm-objcopy such as -add-codeview-ghash-section
>>>
>>> --
>>> Leonardo Santagada
>
> --
> Leonardo Santagada

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180120/983ee363/attachment.html>
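The layout bookkeeping discussed above (shift the file pointer of every section after the insertion point, and keep Section[N]->FilePointerOfRawData == Section[N-1]->FilePointerOfRawData + Section[N-1]->SizeOfRawData) can be sketched in a few lines. The SectionHeader struct below is a simplified stand-in for the real COFF section header, not LLVM API, and the sketch ignores file-alignment padding and the growth of the section table itself:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Simplified stand-in for a COFF section header; only the fields
// relevant to the file-layout invariant are modeled.
struct SectionHeader {
  std::string Name;
  uint32_t SizeOfRawData = 0;
  uint32_t FilePointerToRawData = 0;
};

// Insert NewSec immediately after position Idx, pushing back the file
// pointers of every later section so the raw data does not overlap.
void insertSectionAfter(std::vector<SectionHeader> &Sections, size_t Idx,
                        SectionHeader NewSec) {
  const SectionHeader &Prev = Sections[Idx];
  NewSec.FilePointerToRawData = Prev.FilePointerToRawData + Prev.SizeOfRawData;
  for (size_t I = Idx + 1; I < Sections.size(); ++I)
    Sections[I].FilePointerToRawData += NewSec.SizeOfRawData;
  Sections.insert(Sections.begin() + Idx + 1, NewSec);
}

// Verify the invariant from the thread: each section's raw data starts
// exactly where the previous section's raw data ends.
bool layoutIsContiguous(const std::vector<SectionHeader> &Sections) {
  for (size_t I = 1; I < Sections.size(); ++I)
    if (Sections[I].FilePointerToRawData !=
        Sections[I - 1].FilePointerToRawData + Sections[I - 1].SizeOfRawData)
      return false;
  return true;
}
```

A real tool would also have to apply the same shift to PointerToRelocations and PointerToLinenumbers in each header, and bump NumberOfSections in the COFF file header.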
Zachary Turner via llvm-dev
2018-Jan-20 20:50 UTC
[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)
Generally speaking, a good rule of thumb is that /debug:ghash will be close to or faster than /debug:fastlink, but with none of the penalties like slow debug time.

On Sat, Jan 20, 2018 at 12:44 PM Zachary Turner <zturner at google.com> wrote:

> Chrome is actually one of my exact benchmark cases. When building blink_core.dll and browser_tests.exe, I get anywhere from a 20-40% reduction in link time. We have some other optimizations in the pipeline but not upstream yet.
>
> My best time so far (including other optimizations not yet upstream) is 28s on blink_core.dll, compared to 110s with /debug.
>
> [...]
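The thread describes .debug$H as simply a header followed by the hash values. A minimal sketch of assembling such a section's contents is below; the header field values (magic, version, hash-algorithm id) are illustrative assumptions here, not a normative layout:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append a little-endian 32-bit value to a byte buffer.
void putU32(std::vector<uint8_t> &Out, uint32_t V) {
  for (int I = 0; I < 4; ++I)
    Out.push_back(uint8_t(V >> (8 * I)));
}

// Append a little-endian 16-bit value to a byte buffer.
void putU16(std::vector<uint8_t> &Out, uint16_t V) {
  Out.push_back(uint8_t(V));
  Out.push_back(uint8_t(V >> 8));
}

// Build the contents of a .debug$H section: a small fixed header
// followed by one fixed-size hash per type record. Field values are
// assumptions for illustration; a real tool would copy them from a
// clang-generated .debug$H section, as suggested in the thread.
std::vector<uint8_t>
buildDebugH(const std::vector<std::vector<uint8_t>> &Hashes) {
  std::vector<uint8_t> Out;
  putU32(Out, 0x133C9C5); // magic (assumed value)
  putU16(Out, 0);         // version (assumed)
  putU16(Out, 1);         // hash algorithm id (assumed)
  for (const auto &H : Hashes)
    Out.insert(Out.end(), H.begin(), H.end());
  return Out;
}
```

Comparing the output byte-for-byte against a `dumpbin /headers` (or llvm-readobj) dump of a clang-produced .debug$H section is the sanity check the thread itself recommends.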
Leonardo Santagada via llvm-dev
2018-Jan-20 21:34 UTC
[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)
If we get to < 30s I think most users would prefer it to link.exe; just hoping there are still some more optimizations to be had, to get closer to ELF linking times (around 10-15s here).

On Sat, Jan 20, 2018 at 9:50 PM, Zachary Turner <zturner at google.com> wrote:

> Generally speaking a good rule of thumb is that /debug:ghash will be close to or faster than /debug:fastlink, but with none of the penalties like slow debug time.
>
> [...]

--

Leonardo Santagada
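The per-object decision described in the thread (skip objects that already carry a .debug$H, hash the ones that only have a .debug$T, pass everything else through) is small enough to sketch. Section names are fed in as plain strings here rather than through the real COFFObjectFile API:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

enum class Action { Skip, AddGhash, Copy };

// Decide what an llvm-objcopy-style ghash tool should do with one
// object file, given the names of its sections.
Action classifyObject(const std::vector<std::string> &SectionNames) {
  auto Has = [&](const char *N) {
    return std::find(SectionNames.begin(), SectionNames.end(), N) !=
           SectionNames.end();
  };
  if (Has(".debug$H"))
    return Action::Skip;     // already hashed: nothing to do
  if (Has(".debug$T"))
    return Action::AddGhash; // hash the type records, emit .debug$H
  return Action::Copy;       // no type info: copy through unchanged
}
```

In the AddGhash case the actual hashing would be done through the library calls named in the thread (build a CVTypeArray from the .debug$T contents and pass it to GloballyHashedType::hashTypes), so the tool itself stays thin.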