thr3ads.net - llvm dev - [llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler) [Jan 2018]

Zachary Turner via llvm-dev

2018-Jan-19 21:38 UTC

[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)

On Fri, Jan 19, 2018 at 1:02 PM Leonardo Santagada <santagada at gmail.com> wrote:
> On Fri, Jan 19, 2018 at 9:44 PM, Zachary Turner <zturner at google.com> wrote:
>
>> On Fri, Jan 19, 2018 at 12:29 PM Leonardo Santagada <santagada at gmail.com> wrote:
>>
>>> Hi,
>>>
>>> No I didn't, I used cl.exe from the Visual Studio toolchain. What I'm
>>> proposing is a tool for processing .obj files in COFF format, reading them
>>> and generating the GHASH part.
>>>
>>> To make our build faster we use hundreds of unity build files (.cpp's
>>> with a lot of other .cpp's in them, aka munch files) but still have a lot of
>>> single .cpp's as well (in total something like 3.4k .obj files).
>>>
>>> ps: sorry for sending to the wrong list, I was reading about llvm
>>> mailing lists and jumped when I saw what I thought was an lld-exclusive list.
>>
>> A tool like this would be useful, yes.  We've talked about it internally
>> as well and agreed it would be useful, we just haven't prioritized it.  If
>> you're interested in submitting a patch along those lines though, I think
>> it would be a good addition.
>>
>> I'm not sure what the best place for it would be.  llvm-readobj and
>> llvm-objdump seem like obvious choices, but they are intended to be
>> read-only, so perhaps they wouldn't be a good fit.
>>
>> llvm-pdbutil is kind of a hodgepodge of everything else related to PDBs
>> and symbols, so I wouldn't be opposed to making a new subcommand there
>> called "ghash" or something that could process an object file and output a
>> new object file with a .debug$H section.
>>
>> A third option would be to make a new tool for it.
>>
>> I don't think it would be that hard to write.  If you're interested in
>> trying to make a patch for this, I can offer some guidance on where to look
>> in the code.  Otherwise it's something that we'll probably get to, I'm just
>> not sure when.
>
> I would love to write it and contribute it back, please do tell. I did
> find some of the code of ghash in lld, but I'm fuzzy on the llvm codeview
> part of it and have never seen llvm-readobj/objdump or llvm-pdbutil, but I'm
> not afraid to look :)

Luckily all of the important code is hidden behind library calls, and it
should already just do the right thing, so I suspect you won't need to know
much about CodeView to do this.

I think Peter has the right idea about putting this in llvm-objcopy.

You can look at one of the existing CopyBinary functions there, which
currently only work for ELF, but you can just make a new overload that
accepts a COFFObjectFile.

I would probably start by iterating over each of the sections
(getNumberOfSections / getSectionName) looking for .debug$T and .debug$H
sections.

If you find a .debug$H section then you can just skip that object file.
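
To make the scan concrete, here is a minimal standalone sketch in Python rather than against the LLVM API (where getNumberOfSections / getSectionName would do this work). It reads the standard 20-byte COFF file header and the 40-byte section headers, then reports whether an object has a .debug$T but no .debug$H. The function names and the `make_fake_obj` helper are made up for illustration, and it assumes a regular (non-/bigobj) object whose section names fit in the 8-byte name field.

```python
import struct

# Standard COFF layouts: 20-byte file header, 40-byte section header.
COFF_HEADER = struct.Struct("<HHIIIHH")
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")


def section_names(obj):
    """Return the short section names of a regular (non-bigobj) COFF object."""
    fields = COFF_HEADER.unpack_from(obj, 0)
    num_sections, optional_header_size = fields[1], fields[5]
    table_offset = COFF_HEADER.size + optional_header_size
    names = []
    for i in range(num_sections):
        offset = table_offset + i * SECTION_HEADER.size
        raw_name = SECTION_HEADER.unpack_from(obj, offset)[0]
        # Names shorter than 8 bytes are NUL padded; ".debug$T" fills all 8.
        names.append(raw_name.rstrip(b"\0").decode("ascii", "replace"))
    return names


def needs_ghash(obj):
    """True if the object has type records (.debug$T) but no hashes (.debug$H)."""
    names = section_names(obj)
    return ".debug$T" in names and ".debug$H" not in names


def make_fake_obj(names):
    """Hypothetical helper: headers only, no section contents, for trying it out."""
    header = COFF_HEADER.pack(0x8664, len(names), 0, 0, 0, 0, 0)
    table = b"".join(
        SECTION_HEADER.pack(n.encode("ascii"), 0, 0, 0, 0, 0, 0, 0, 0, 0)
        for n in names
    )
    return header + table


if __name__ == "__main__":
    print(needs_ghash(make_fake_obj([".text", ".debug$S", ".debug$T"])))
```

On a real file you would pass the bytes read from the .obj; objects that already carry a .debug$H simply come back as False and can be skipped.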

If you find a .debug$T but not a .debug$H, then basically do the same thing
that LLD does in PDBLinker::mergeDebugT: create a CVTypeArray and pass it
to GloballyHashedType::hashTypes, which returns an array of hash values.
The format of .debug$H is the header, followed by the hash values.  Then,
when you're writing the list of sections, just add in the .debug$H section
right after the .debug$T section.
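
As a rough illustration of the "header, followed by the hash values" layout, the Python sketch below assembles a .debug$H payload from a list of already computed hashes. Several things here are assumptions to check against LLVM (llvm/BinaryFormat/COFF.h and the GloballyHashedType code) before relying on them: the header shape (32-bit magic, 16-bit version, 16-bit hash algorithm), the magic value, and the algorithm id. `placeholder_hashes` is only a stand-in; it is not what GloballyHashedType::hashTypes computes, since the real hash also folds in the hashes of the types a record refers to.

```python
import hashlib
import struct

# ASSUMPTION: values recalled from LLVM's COFF definitions; verify before use.
DEBUG_HASHES_SECTION_MAGIC = 0x133C9C5
HASH_ALGORITHM_SHA1 = 0


def build_debug_h(hashes, version=0, algorithm=HASH_ALGORITHM_SHA1):
    """Concatenate an (assumed) 8-byte header and the per-record hash values."""
    header = struct.pack("<IHH", DEBUG_HASHES_SECTION_MAGIC, version, algorithm)
    return header + b"".join(hashes)


def placeholder_hashes(records):
    """Stand-in only: plain SHA-1 of each record's bytes, NOT the real ghash."""
    return [hashlib.sha1(record).digest() for record in records]


if __name__ == "__main__":
    payload = build_debug_h(placeholder_hashes([b"record-0", b"record-1"]))
    print(len(payload))  # 8-byte header plus two 20-byte hashes
```

In the real tool the hash list would come straight from the library call, so only the concatenation step carries over.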

Currently llvm-objcopy only writes ELF files, so it would need to be taught
to write COFF files.  We have code to do this in the yaml2obj utility
(specifically, in yaml2coff.cpp in the function writeCOFF).  There may be a
way to move this code to somewhere else (llvm/Object/COFF.h?) so that it
can be re-used by both yaml2coff and llvm-objcopy, but in the worst case
scenario you could copy the code and re-write it to work with these new
structures.

Lastly, you'll probably want to put all of this behind an option in
llvm-objcopy such as -add-codeview-ghash-section

Leonardo Santagada via llvm-dev

2018-Jan-20 19:52 UTC


Thanks for the tips. I now have something that reads the obj file, finds
.debug$T sections and global-hashes them (proof-of-concept kind of code).
What I can't find is how clang itself writes the COFF files with
global hashes, as that might help me understand how to create the .debug$H
section, how to update the file section count and how to properly write
this back.

The code in yaml2coff expects to work on the YAML COFFParser
struct, and I'm having quite a bit of a headache turning the COFFObjectFile
into a COFFParser object or something compatible... Tomorrow I might try the
path of coff2yaml and then yaml2coff with the hashes header,
but it seems way too inefficient and convoluted.

-- 

Leonardo Santagada

Zachary Turner via llvm-dev

2018-Jan-20 20:05 UTC


You probably don't want to go down the same route that clang goes through
to write the object file.  If you think yaml2coff is convoluted, the way
clang does it will just give you a headache.  There are multiple
abstractions involved to account for different object file formats (ELF,
COFF, MachO) and output formats (Assembly, binary file).  At least with
yaml2coff

It's true that yaml2coff is using the COFFParser structure, but if you look
at the writeCOFF function in yaml2coff it's pretty bare-metal.  The logic
you need will be almost identical, except that instead of checking the
COFFParser for the various fields, you'll check the existing
COFFObjectFile, which should have similar fields.

The only thing you need to do differently is to insert a new entry when
writing the section table and section contents.  Since you're injecting a
section into the middle, you'll also probably need to push back the file
pointer of all subsequent sections so that they don't overlap (e.g. if the
original sections are 1, 2, 3, 4, 5 and you insert between 2 and 3, then the
original sections 3, 4, and 5 would need to have their PointerToRawData
offset by the size of the new section).
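
The pointer adjustment can be sketched in a few lines of Python. This is a simplified model, not llvm-objcopy or yaml2coff code: the `Section` class and `insert_section_after` are hypothetical names, and each section is reduced to a name, a raw-data pointer, and a raw-data size.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Section:
    name: str
    pointer_to_raw_data: int
    size_of_raw_data: int


def insert_section_after(sections, index, name, size):
    """Place a new section right after sections[index] and shift the rest."""
    prev = sections[index]
    new = Section(name, prev.pointer_to_raw_data + prev.size_of_raw_data, size)
    # Every later section keeps its size but moves back by the new section's size.
    shifted = [
        replace(s, pointer_to_raw_data=s.pointer_to_raw_data + size)
        for s in sections[index + 1:]
    ]
    return sections[: index + 1] + [new] + shifted


if __name__ == "__main__":
    original = [
        Section("1", 100, 10),
        Section("2", 110, 20),
        Section("3", 130, 30),
        Section("4", 160, 5),
        Section("5", 165, 7),
    ]
    for s in insert_section_after(original, 1, ".debug$H", 48):
        print(s.name, s.pointer_to_raw_data, s.size_of_raw_data)
```

A real writer has more to move than this model shows: the section table itself grows by one 40-byte header, and offsets such as PointerToRelocations and the file header's PointerToSymbolTable would presumably need the same kind of adjustment.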

If you need to know what values to put for the other fields in a section
header, run `dumpbin /headers foo.obj` on a clang-generated object file
that already has a .debug$H section (e.g. run clang with
-emit-codeview-ghash-section), look at the properties of the .debug$H
section, and use the same values.

The only invariant that needs to be maintained is that
Section[N]->PointerToRawData == Section[N-1]->PointerToRawData +
Section[N-1]->SizeOfRawData
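
A small Python check of that invariant, written as a sketch over plain (pointer, size) pairs rather than real section headers, can be handy as a sanity test after rewriting a file. The function name is made up, and the check takes the invariant exactly as stated, without accounting for any relocation data or padding that might sit between sections in a real object.

```python
def first_layout_violation(sections):
    """Return the index N of the first section breaking the invariant, or None.

    `sections` is a list of (pointer_to_raw_data, size_of_raw_data) pairs in
    section-table order.
    """
    for n in range(1, len(sections)):
        prev_pointer, prev_size = sections[n - 1]
        pointer, _size = sections[n]
        if pointer != prev_pointer + prev_size:
            return n
    return None


if __name__ == "__main__":
    print(first_layout_violation([(100, 10), (110, 20), (130, 30)]))
    print(first_layout_violation([(100, 10), (110, 20), (120, 30)]))
```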

