thr3ads.net - llvm dev - [llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler) [Jan 2018]

Leonardo Santagada via llvm-dev

2018-Jan-25 17:57 UTC

[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)

Any idea on how to create this new symbol there? I saw that there is a
symbol pointing to each section, but didn't understand the format, and
yaml2obj doesn't check it or do anything with the list.
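
For reference, a minimal sketch of what a per-section symbol looks like, based on the record layouts in the PE/COFF spec: an 18-byte symbol record with storage class IMAGE_SYM_CLASS_STATIC, followed by one 18-byte auxiliary "section definition" record. The struct names and the makeSectionSymbol helper are hypothetical (they are not llvm-objcopy or yaml2obj API), and the fields left at zero are an assumption for a non-COMDAT section; the values are best checked against obj2yaml output from a clang-cl object that already has .debug$H.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

#pragma pack(push, 1)
// Standard COFF symbol table record (18 bytes).
struct CoffSymbol {
    char     Name[8];             // short name, NUL-padded; 8-char names have no terminator
    uint32_t Value;               // 0 for a section symbol
    int16_t  SectionNumber;       // 1-based index into the section table
    uint16_t Type;
    uint8_t  StorageClass;
    uint8_t  NumberOfAuxSymbols;
};
// Auxiliary record, format 5 (section definition), also 18 bytes.
struct CoffAuxSectionDef {
    uint32_t Length;              // size of the section's raw data
    uint16_t NumberOfRelocations;
    uint16_t NumberOfLinenumbers;
    uint32_t CheckSum;            // left at 0 in this sketch
    uint16_t Number;              // associated section, only used for COMDAT selection 5
    uint8_t  Selection;           // COMDAT selection, 0 if not COMDAT
    uint8_t  Unused[3];
};
#pragma pack(pop)

static_assert(sizeof(CoffSymbol) == 18, "COFF symbol records are 18 bytes");
static_assert(sizeof(CoffAuxSectionDef) == 18, "aux records are 18 bytes");

constexpr uint8_t kImageSymClassStatic = 3;  // IMAGE_SYM_CLASS_STATIC

struct SectionSymbol {
    CoffSymbol        Sym;
    CoffAuxSectionDef Aux;
};

// Hypothetical helper: builds the symbol + aux pair for a section whose name
// fits in 8 bytes (".debug$H" is exactly 8). Longer names would need a
// string-table offset instead, which this sketch does not handle.
inline SectionSymbol makeSectionSymbol(const std::string &name,
                                       int16_t sectionNumber,
                                       uint32_t rawSize) {
    assert(name.size() <= 8 && "long names need the string table");
    SectionSymbol s{};  // zero-initialise every field, including padding bytes of Name
    std::memcpy(s.Sym.Name, name.data(), name.size());
    s.Sym.SectionNumber = sectionNumber;
    s.Sym.StorageClass = kImageSymClassStatic;
    s.Sym.NumberOfAuxSymbols = 1;
    s.Aux.Length = rawSize;
    return s;
}
```

The pair occupies two consecutive slots in the symbol table, so NumberOfSymbols in the file header grows by two when it is added.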

On Thu, Jan 25, 2018 at 6:56 PM, Leonardo Santagada <santagada at gmail.com>
wrote:
> YES, THANK YOU... I WAS THINKING THIS BUT COMPLETELY FORGOT.
>
> sorry for the caps... long day of working on this, and using vs 2017,
> which adds a new section type .chks64 that I couldn't find documentation
> for anywhere, was difficult. I highly recommend everyone just not use vs
> 2017 until 15.8 or something, our internal bug list is gigantic.
>
> On Thu, Jan 25, 2018 at 6:52 PM, Zachary Turner <zturner at google.com>
> wrote:
>
>> Actually I already have a theory that even though you are adding the
>> section to the section table, you might not be adding a *symbol* for the
>> section to the symbol table.  So the existing symbols (which reference
>> sections by index) will all be wrong because you've inserted a new
>> section.  Still though, obj2yaml would expose that.
>>
>> On Thu, Jan 25, 2018 at 9:50 AM Zachary Turner <zturner at google.com>
>> wrote:
>>
>>> Yea as long as you compare clang-cl object file with automatically
>>> generated .debug$H section against clang-cl object file without .debug$H
>>> but added after the fact with llvm-objcopy, that should expose the problem
>>> I think when you run obj2yaml on them.
>>>
>>> On Thu, Jan 25, 2018 at 9:49 AM Leonardo Santagada <santagada at gmail.com>
>>> wrote:
>>>
>>>> I did reorder my sections, so that .debug$H is in the correct place,
>>>> but now I get some errors on duplicate symbols. I created a folder with
>>>> examples:
>>>>
>>>> https://www.dropbox.com/sh/nmvzi44pi0boe76/AAA0f47O5PCJ9JiUc6wVuwBra?dl=0
>>>>
>>>> t.obj is generated by vs 2015 and it links fine with lld-link.exe, but
>>>> tout.obj gives these errors:
>>>>
>>>> lld-link.exe /DEBUG:GHASH tout.obj
>>>> LLD-LINK.EXE: error: duplicate symbol: __local_stdio_printf_options in
>>>> tout.obj and in LIBCMT.lib(default_local_stdio_options.obj)
>>>> LLD-LINK.EXE: error: duplicate symbol: __local_stdio_printf_options in
>>>> tout.obj and in libvcruntime.lib(undname.obj)
>>>>
>>>> I'm using PEView from http://wjradburn.com/software/ to look at the
>>>> files and can't see anything wrong, except some valid differences in the
>>>> offsets being used for the data (so pointer to data is different between
>>>> them).
>>>>
>>>> I will look into yaml2obj now to see if I see anything else weird going
>>>> on.
>>>>
>>>>
>>>> On Thu, Jan 25, 2018 at 6:41 PM, Zachary Turner <zturner at google.com>
>>>> wrote:
>>>>
>>>>> I'm pretty confident that cl is not putting anything strange in the
>>>>> .debug$T sections.  We've done a lot of testing and never seen anything
>>>>> except CodeView type records in a .debug$T.  My hunch is that your objcopy
>>>>> patch is probably not doing the right thing in one or more of the section
>>>>> headers, and this is confusing the linker.
>>>>>
>>>>> One idea might be to build a simple object file with clang-cl but
>>>>> without the magic -mllvm -emit-codeview-ghash-section, then run your
>>>>> llvm-objcopy on it.  Then build the same object file passing -mllvm
>>>>> -emit-codeview-ghash-section.  Then run obj2yaml on both and diff the
>>>>> results.  They should be byte-for-byte identical.  That should give you a
>>>>> clue about if objcopy is doing something wrong.
>>>>>
>>>>> On Thu, Jan 25, 2018 at 2:21 AM Leonardo Santagada <
>>>>> santagada at gmail.com> wrote:
>>>>>
>>>>>> Don't worry, I definitely want to perfect this to generate legal obj
>>>>>> files, this is just to speed up testing.
>>>>>>
>>>>>> Now after patching all the obj files I get these errors when linking a
>>>>>> small part of our code base (msvc 2017 15.5.3, lld and llvm-objcopy 7.0.0):
>>>>>> lld-link.exe : error : relocation against symbol in discarded
>>>>>> section: $LN8
>>>>>> lld-link.exe : error : relocation against symbol in discarded
>>>>>> section: $LN43
>>>>>> lld-link.exe : error : relocation against symbol in discarded
>>>>>> section: $LN37
>>>>>>
>>>>>> I'm starting to guess that cl.exe might be putting some random comdat
>>>>>> or other discardable symbols in the .debug$T and clang doesn't? I will try
>>>>>> to debug this and see what more I can uncover.
>>>>>>
>>>>>> Linking works perfectly without my llvm-objcopy pass to add .debug$H.
>>>>>>
>>>>>>
>>>>>> On Thu, Jan 25, 2018 at 1:53 AM, Zachary Turner <zturner at google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> It might not influence LLD, but at the same time we don't want to
>>>>>>> upstream something that is producing technically illegal COFF files.  Also
>>>>>>> good to hear about the planned changes to your header files.  Looking
>>>>>>> forward to hearing about your experiences with clang-cl.
>>>>>>>
>>>>>>> On Wed, Jan 24, 2018 at 10:41 AM Leonardo Santagada <
>>>>>>> santagada at gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I finally got my first .obj file patched with .debug$H to look
>>>>>>>> somewhat right. I added the new section at the end of the file so I don't
>>>>>>>> have to recalculate all sections (although now I probably could position it
>>>>>>>> in the middle, knowing that each section is: SizeOfRawData +
>>>>>>>> (last.Header.NumberOfRelocations * (4+4+2)) and the $H needs to
>>>>>>>> come right after $T in the file). That, although illegal based on the coff
>>>>>>>> specs, doesn't seem like it's going to influence lld.
>>>>>>>>
>>>>>>>> Also we talked and we are probably going to do something similar to
>>>>>>>> a bunch of windows defines and a check for our own define (to guarantee
>>>>>>>> that no one imported windows.h before win32.h) and drop the namespace and
>>>>>>>> the conflicting names.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jan 23, 2018 at 12:46 AM, Zachary Turner <
>>>>>>>> zturner at google.com> wrote:
>>>>>>>>
>>>>>>>>> That's very possible that a 3rd party indirect header include is
>>>>>>>>> involved.  One idea might be like I suggested where you #define _WINDOWS_
>>>>>>>>> in win32.h and guarantee that it's always included first.  Then those other
>>>>>>>>> headers won't be able to #include <windows.h>.  But it will probably
>>>>>>>>> greatly expand the amount of stuff you have to add to win32.h, as you will
>>>>>>>>> probably find some callers of functions that aren't yet in your win32.h
>>>>>>>>> that you'd have to add.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Jan 22, 2018 at 3:28 PM Leonardo Santagada <
>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Ok some information was lost on getting this example to you, I'm
>>>>>>>>>> sorry for not being clear.
>>>>>>>>>>
>>>>>>>>>> We have a huge code base, let's say 90% of it doesn't include
>>>>>>>>>> either header, 9% include win32.h and 1% includes both, I will try to
>>>>>>>>>> discover why, but my guess is they include both a third party that includes
>>>>>>>>>> windows.h and some of our libs that use win32.h.
>>>>>>>>>>
>>>>>>>>>> I will try to fully understand this tomorrow.
>>>>>>>>>>
>>>>>>>>>> I guess clang will not implement this ever so finishing the
>>>>>>>>>> object copier is the best solution until all code is ported to clang.
>>>>>>>>>>
>>>>>>>>>> On 23 Jan 2018 00:02, "Zachary Turner" <zturner at google.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> You said win32.h doesn't include windows.h, but main.cpp does.
>>>>>>>>>>> So what's the disadvantage of just including it in win32.h anyway, since
>>>>>>>>>>> it's already going to be in every translation unit?  (Unless you didn't
>>>>>>>>>>> mean to #include it in main.cpp)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I guess all I can do is warn you how bad of an idea this is.
>>>>>>>>>>> For starters, I already found a bug in your code ;-)
>>>>>>>>>>>
>>>>>>>>>>> // stdint.h
>>>>>>>>>>> typedef int                int32_t;
>>>>>>>>>>>
>>>>>>>>>>> // winnt.h
>>>>>>>>>>> typedef long LONG;
>>>>>>>>>>>
>>>>>>>>>>> // windef.h
>>>>>>>>>>> typedef struct tagPOINT
>>>>>>>>>>> {
>>>>>>>>>>>     LONG  x;   // long x
>>>>>>>>>>>     LONG  y;   // long y
>>>>>>>>>>> } POINT, *PPOINT, NEAR *NPPOINT, FAR *LPPOINT;
>>>>>>>>>>>
>>>>>>>>>>> // win32.h
>>>>>>>>>>> typedef int32_t LONG;
>>>>>>>>>>>
>>>>>>>>>>> struct POINT
>>>>>>>>>>> {
>>>>>>>>>>> LONG x;   // int x
>>>>>>>>>>> LONG y;   // int y
>>>>>>>>>>> };
>>>>>>>>>>>
>>>>>>>>>>> So POINT is defined two different ways.  In your minimal
>>>>>>>>>>> interface, it's declared as 2 int32's, which are int.  In the actual
>>>>>>>>>>> Windows header files, it's declared as 2 longs.
>>>>>>>>>>>
>>>>>>>>>>> This might seem like an unimportant bug since int and long are
>>>>>>>>>>> the same size, but int and long also mangle differently and affect overload
>>>>>>>>>>> resolution, so you could have weird linker errors or call the wrong
>>>>>>>>>>> function overload.
>>>>>>>>>>>
>>>>>>>>>>> Plus, it illustrates the fact that this struct *actually is* a
>>>>>>>>>>> different type from the one in the windows header.
>>>>>>>>>>>
>>>>>>>>>>> You said at the end that you never intentionally import win32.h
>>>>>>>>>>> and windows.h from the same translation unit.  But then in this example you
>>>>>>>>>>> did.  I wonder if you could enforce that by doing this:
>>>>>>>>>>>
>>>>>>>>>>> // win32.h
>>>>>>>>>>> #pragma once
>>>>>>>>>>>
>>>>>>>>>>> // Error if windows.h was included before us.
>>>>>>>>>>> #if defined(_WINDOWS_)
>>>>>>>>>>> #error "You're including win32.h after having already included windows.h.  Don't do this!"
>>>>>>>>>>> #endif
>>>>>>>>>>>
>>>>>>>>>>> // And also make sure windows.h can't get included after us
>>>>>>>>>>> #define _WINDOWS_
>>>>>>>>>>>
>>>>>>>>>>> For the record, I tried the test case you linked when windows.h
>>>>>>>>>>> is not included in main.cpp and it works (but still has the bug about int
>>>>>>>>>>> and long).
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jan 22, 2018 at 2:23 PM Leonardo Santagada <
>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> It is super gross, but we copy parts of windows.h because
>>>>>>>>>>>> having all of it is both gigantic and very very messy. So our win32.h has a
>>>>>>>>>>>> couple thousand lines and not 30k+ for windows.h and we try to have
>>>>>>>>>>>> zero macros. Win32.h doesn't include windows.h so using ::BOOL wouldn't
>>>>>>>>>>>> work. We don't want to create a namespace, we just want a cleaner interface
>>>>>>>>>>>> to windows api. The namespace with c linkage is the way to trick cl into
>>>>>>>>>>>> allowing us to in some files have both windows.h and Win32.h. I really
>>>>>>>>>>>> don't see any way for us to have this Win32.h without this cl support, so
>>>>>>>>>>>> maybe we should either put windows.h in a compiled header somewhere and not
>>>>>>>>>>>> care that it is infecting everything or just have one place we can call to
>>>>>>>>>>>> clean up after including windows.h (a massive set of undefs).
>>>>>>>>>>>>
>>>>>>>>>>>> So using can't work, because we never intentionally import
>>>>>>>>>>>> windows.h and win32.h on the same translation unit.
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jan 22, 2018 at 7:08 PM, Zachary Turner <
>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> This is pretty gross, honestly :)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can't you just use using declarations?
>>>>>>>>>>>>>
>>>>>>>>>>>>> namespace Win32 {
>>>>>>>>>>>>> extern "C" {
>>>>>>>>>>>>>
>>>>>>>>>>>>> using ::BOOL;
>>>>>>>>>>>>> using ::LONG;
>>>>>>>>>>>>> using ::POINT;
>>>>>>>>>>>>> using ::LPPOINT;
>>>>>>>>>>>>>
>>>>>>>>>>>>> using ::GetCursorPos;
>>>>>>>>>>>>> }
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>> This works with clang-cl.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 5:39 AM Leonardo Santagada <
>>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Here is a minimal example, we do this so we don't have to
>>>>>>>>>>>>>> import the whole windows api everywhere.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://gist.github.com/santagada/7977e929d31c629c4bf18ebb987f6be3
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Jan 21, 2018 at 2:31 AM, Zachary Turner <
>>>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Clang-cl maintains compatibility with msvc even in cases
>>>>>>>>>>>>>>> where it’s non standards compliant (eg 2 phase name lookup), but we try to
>>>>>>>>>>>>>>> keep these cases few and far between.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> To help me understand your case, do you mean you copy
>>>>>>>>>>>>>>> windows.h and modify it? How does this lead to the same struct being
>>>>>>>>>>>>>>> defined twice? If i were to write this:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> struct Foo {};
>>>>>>>>>>>>>>> struct Foo {};
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Is this a small repro of the issue you’re talking about?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 3:44 PM Leonardo Santagada <
>>>>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I can totally see something like incremental linking with a
>>>>>>>>>>>>>>>> simple padding between obj and a mapping file (which can also help with
>>>>>>>>>>>>>>>> edit and continue, something we also would love to have).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> We have another developer doing the port to support
>>>>>>>>>>>>>>>> clang-cl, but although most of our code also goes through a version of
>>>>>>>>>>>>>>>> clang, migrating the rest to clang-cl has been a fight. From what I heard
>>>>>>>>>>>>>>>> the main problem is that we have a copy of parts of windows.h (so as not to
>>>>>>>>>>>>>>>> bring the awful parts of it like lower case macros) and that totally works
>>>>>>>>>>>>>>>> on cl, but clang (at least 6.0) complains about two struct/vars with the
>>>>>>>>>>>>>>>> same name, even though they are exactly the same. Making clang-cl as broken
>>>>>>>>>>>>>>>> as cl.exe is not an option I suppose? I would love to turn on a flag
>>>>>>>>>>>>>>>> --accept-that-cl-made-bad-decisions-and-live-with-it and
>>>>>>>>>>>>>>>> have this at least until this is completely fixed in our code base.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> the biggest win with moving to clang-cl would be a better more
>>>>>>>>>>>>>>>> standards compliant compiler, no 1 minute compiles on heavily templated
>>>>>>>>>>>>>>>> files and maybe the holy grail of ThinLTO.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 10:56 PM, Zachary Turner <
>>>>>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 10-15s will be hard without true incremental linking.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> At some point that's going to be the only way to get any
>>>>>>>>>>>>>>>>> faster, but incremental linking is hard (putting it lightly), and since our
>>>>>>>>>>>>>>>>> full links are already really fast we think we can get reasonably close to
>>>>>>>>>>>>>>>>> link.exe incremental speeds with full links.  But it's never enough and I
>>>>>>>>>>>>>>>>> will always want it to be faster, so you may see incremental linking in the
>>>>>>>>>>>>>>>>> future after we hit a performance wall with full link speed :)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In any case, I'm definitely interested in seeing what kind
>>>>>>>>>>>>>>>>> of numbers you get with /debug:ghash after you get this llvm-objcopy
>>>>>>>>>>>>>>>>> feature implemented.  So keep me updated :)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> As an aside, have you tried building with clang instead of
>>>>>>>>>>>>>>>>> cl?  If you build with clang you wouldn't even have to do this llvm-objcopy
>>>>>>>>>>>>>>>>> work, because it would "just work".  If you've tried but ran into issues
>>>>>>>>>>>>>>>>> I'm interested in hearing about those too.  On the other hand, it's also
>>>>>>>>>>>>>>>>> reasonable to only switch one thing at a time.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 1:34 PM Leonardo Santagada <
>>>>>>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> if we get to < 30s I think most users would prefer it to
>>>>>>>>>>>>>>>>>> link.exe, just hoping there are still some more optimizations to get closer
>>>>>>>>>>>>>>>>>> to ELF linking times (around 10-15s here).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 9:50 PM, Zachary Turner <
>>>>>>>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Generally speaking a good rule of thumb is that
>>>>>>>>>>>>>>>>>>> /debug:ghash will be close to or faster than /debug:fastlink, but with none
>>>>>>>>>>>>>>>>>>> of the penalties like slow debug time
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 12:44 PM Zachary Turner <
>>>>>>>>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Chrome is actually one of my exact benchmark cases.
>>>>>>>>>>>>>>>>>>>> When building blink_core.dll and browser_tests.exe, i get anywhere from a
>>>>>>>>>>>>>>>>>>>> 20-40% reduction in link time. We have some other optimizations in the
>>>>>>>>>>>>>>>>>>>> pipeline but not upstream yet.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> My best time so far (including other optimizations not
>>>>>>>>>>>>>>>>>>>> yet upstream) is 28s on blink_core.dll, compared to 110s with /debug
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 12:28 PM Leonardo Santagada <
>>>>>>>>>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 9:05 PM, Zachary Turner <
>>>>>>>>>>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> You probably don't want to go down the same route
>>>>>>>>>>>>>>>>>>>>>> that clang goes through to write the object file.  If you think yaml2coff
>>>>>>>>>>>>>>>>>>>>>> is convoluted, the way clang does it will just give you a headache.  There
>>>>>>>>>>>>>>>>>>>>>> are multiple abstractions involved to account for different object file
>>>>>>>>>>>>>>>>>>>>>> formats (ELF, COFF, MachO) and output formats (Assembly, binary file).  At
>>>>>>>>>>>>>>>>>>>>>> least with yaml2coff
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I think your phrase got cut there, but yeah I just
>>>>>>>>>>>>>>>>>>>>> found AsmPrinter.cpp and it is convoluted.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> It's true that yaml2coff is using the COFFParser
>>>>>>>>>>>>>>>>>>>>>> structure, but if you look at the writeCOFF function
>>>>>>>>>>>>>>>>>>>>>> in yaml2coff it's pretty bare-metal.  The logic you need will be almost
>>>>>>>>>>>>>>>>>>>>>> identical, except that instead of checking the COFFParser for the various
>>>>>>>>>>>>>>>>>>>>>> fields, you'll check the existing COFFObjectFile, which should have similar
>>>>>>>>>>>>>>>>>>>>>> fields.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> The only thing you need to do differently is when writing
>>>>>>>>>>>>>>>>>>>>>> the section table and section contents, to insert a new entry.  Since
>>>>>>>>>>>>>>>>>>>>>> you're injecting a section into the middle, you'll also probably need to
>>>>>>>>>>>>>>>>>>>>>> push back the file pointer of all subsequent sections so that they don't
>>>>>>>>>>>>>>>>>>>>>> overlap.  (e.g. if the original sections are 1, 2, 3, 4, 5 and you insert
>>>>>>>>>>>>>>>>>>>>>> between 2 and 3, then the original sections 3, 4, and 5 would need to have
>>>>>>>>>>>>>>>>>>>>>> their FilePointerToRawData offset by the size of the new section).
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I have the PE/COFF spec open here and I'm happy that I
>>>>>>>>>>>>>>>>>>>>> read a bit of it so I actually know what you are talking about... yeah it
>>>>>>>>>>>>>>>>>>>>> doesn't seem too complicated.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> If you need to know what values to put for the other
>>>>>>>>>>>>>>>>>>>>>> fields in a section header, run `dumpbin /headers foo.obj` on a
>>>>>>>>>>>>>>>>>>>>>> clang-generated object file that has a .debug$H section already (e.g. run
>>>>>>>>>>>>>>>>>>>>>> clang with -emit-codeview-ghash-section, and look at the properties of the
>>>>>>>>>>>>>>>>>>>>>> .debug$H section and use the same values).
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks I will do that and then also look at how the
>>>>>>>>>>>>>>>>>>>>> CodeView part of the code does it if I can't understand some of it.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> The only invariant that needs to be maintained is
>>>>>>>>>>>>>>>>>>>>>> that Section[N]->FilePointerOfRawData =
>>>>>>>>>>>>>>>>>>>>>> Section[N-1]->FilePointerOfRawData + Section[N-1]->SizeOfRawData
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Well, that and all the sections need to be on the
>>>>>>>>>>>>>>>>>>>>> final file... But I'm hopeful.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Does anyone have times on linking a big project like chrome
>>>>>>>>>>>>>>>>>>>>> with this so that at least I know what kind of performance to expect?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> My numbers are something like:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> 1 pdb per obj file: link.exe takes ~15 minutes and
>>>>>>>>>>>>>>>>>>>>> 16GB of ram, lld-link.exe takes 2:30 minutes and ~8GB of ram
>>>>>>>>>>>>>>>>>>>>> around 10 pdbs per folder: link.exe takes 1 minute and
>>>>>>>>>>>>>>>>>>>>> 2-3GB of ram, lld-link.exe takes 1:30 minutes and ~6GB of ram
>>>>>>>>>>>>>>>>>>>>> fastlink: link.exe takes 40 seconds, but then 20
>>>>>>>>>>>>>>>>>>>>> seconds of loading at the first break point in the debugger and we lost DIA
>>>>>>>>>>>>>>>>>>>>> support for listing symbols.
>>>>>>>>>>>>>>>>>>>>> incremental: link.exe takes 8 seconds, but it only
>>>>>>>>>>>>>>>>>>>>> happens when very minor changes happen.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> We have a non-negligible number of symbols used on
>>>>>>>>>>>>>>>>>>>>> some runtime systems.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 11:52 AM Leonardo Santagada <
>>>>>>>>>>>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks for the tips, I now have something that reads
>>>>>>>>>>>>>>>>>>>>>>> the obj file, finds .debug$T sections and global hashes it (proof of
>>>>>>>>>>>>>>>>>>>>>>> concept kind of code). What I can't find is: how does clang itself write
>>>>>>>>>>>>>>>>>>>>>>> the coff files with global hashes, as that might help me understand how to
>>>>>>>>>>>>>>>>>>>>>>> create the .debug$H section, how to update the file section count and how
>>>>>>>>>>>>>>>>>>>>>>> to properly write this back.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> The code on yaml2coff is expecting to be working on
>>>>>>>>>>>>>>>>>>>>>>> the yaml COFFParser struct and I'm having quite a bit of a headache turning
>>>>>>>>>>>>>>>>>>>>>>> the COFFObjectFile into a COFFParser object or compatible... Tomorrow I
>>>>>>>>>>>>>>>>>>>>>>> might try the very inefficient path of coff2yaml and then yaml2coff with
>>>>>>>>>>>>>>>>>>>>>>> the hashes header... but it seems way too inefficient and convoluted.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 10:38 PM, Zachary Turner <
>>>>>>>>>>>>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 1:02 PM Leonardo Santagada <
>>>>>>>>>>>>>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 9:44 PM, Zachary Turner <
>>>>>>>>>>>>>>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 12:29 PM Leonardo
>>>>>>>>>>>>>>>>>>>>>>>>>> Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> No I didn't, I used cl.exe from the visual
>>>>>>>>>>>>>>>>>>>>>>>>>>> studio toolchain. What I'm proposing is a tool for processing .obj files in
>>>>>>>>>>>>>>>>>>>>>>>>>>> COFF format, reading them and generating the GHASH part.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> To make our build faster we use hundreds of
>>>>>>>>>>>>>>>>>>>>>>>>>>> unity build files (.cpp's with a lot of other .cpp's in them aka munch
>>>>>>>>>>>>>>>>>>>>>>>>>>> files) but still have a lot of single .cpp's as well (in total something
>>>>>>>>>>>>>>>>>>>>>>>>>>> like 3.4k .obj files).
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> ps: sorry for sending to the wrong list, I was
>>>>>>>>>>>>>>>>>>>>>>>>>>> reading about llvm mailing lists and jumped when I saw what I thought was a
>>>>>>>>>>>>>>>>>>>>>>>>>>> lld exclusive list.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> A tool like this would be useful, yes.  We've
>>>>>>>>>>>>>>>>>>>>>>>>>> talked about it internally as well and agreed it would be useful, we just
>>>>>>>>>>>>>>>>>>>>>>>>>> haven't prioritized it.  If you're interested in submitting a patch along
>>>>>>>>>>>>>>>>>>>>>>>>>> those lines though, I think it would be a good addition.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I'm not sure what the best place for it would
>>>>>>>>>>>>>>>>>>>>>>>>>> be.  llvm-readobj and llvm-objdump seem like obvious choices, but they are
>>>>>>>>>>>>>>>>>>>>>>>>>> intended to be read-only, so perhaps they wouldn't be a good fit.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> llvm-pdbutil is kind of a hodgepodge of
>>>>>>>>>>>>>>>>>>>>>>>>>> everything else related to PDBs and symbols, so I wouldn't be opposed to
>>>>>>>>>>>>>>>>>>>>>>>>>> making a new subcommand there called "ghash" or something that could
>>>>>>>>>>>>>>>>>>>>>>>>>> process an object file and output a new object file with a .debug$H section.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> A third option would be to make a new tool for it.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I don't think it would be that hard to write.  If
>>>>>>>>>>>>>>>>>>>>>>>>>> you're interested in trying to make a patch for this, I can offer some
>>>>>>>>>>>>>>>>>>>>>>>>>> guidance on where to look in the code.  Otherwise it's something that we'll
>>>>>>>>>>>>>>>>>>>>>>>>>> probably get to, I'm just not sure when.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I would love to write it and contribute it back,
>>>>>>>>>>>>>>>>>>>>>>>>> please do tell, I did find some of the code of ghash in lld, but I'm fuzzy
>>>>>>>>>>>>>>>>>>>>>>>>> on the llvm codeview part of it and have never seen llvm-readobj/objdump or
>>>>>>>>>>>>>>>>>>>>>>>>> llvm-pdbutil, but I'm not afraid to look :)
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Luckily all of the important code is hidden behind
>>>>>>>>>>>>>>>>>>>>>>>> library calls, and it should already just do the right thing, so I suspect
>>>>>>>>>>>>>>>>>>>>>>>> you won't need to know much about CodeView to do this.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I think Peter has the right idea about putting this
>>>>>>>>>>>>>>>>>>>>>>>> in llvm-objcopy.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> You can look at one of the existing CopyBinary
>>>>>>>>>>>>>>>>>>>>>>>> functions there, which currently only work for ELF, but you can just make a
>>>>>>>>>>>>>>>>>>>>>>>> new overload that accepts a COFFObjectFile.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I would probably start by iterating over each of
>>>>>>>>>>>>>>>>>>>>>>>> the sections (getNumberOfSections / getSectionName) looking for .debug$T
>>>>>>>>>>>>>>>>>>>>>>>> and .debug$H sections.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> If you find a .debug$H section then you can just
>>>>>>>>>>>>>>>>>>>>>>>> skip that object file.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> If you find a .debug$T but not a .debug$H, then
>>>>>>>>>>>>>>>>>>>>>>>> basically do the same thing that LLD does in PDBLinker::mergeDebugT
>>>>>>>>>>>>>>>>>>>>>>>> (create a CVTypeArray, and pass it to GloballyHashedType::hashTypes).
>>>>>>>>>>>>>>>>>>>>>>>> That will return an array of hash values.  (the format of .debug$H is the
>>>>>>>>>>>>>>>>>>>>>>>> header, followed by the hash values).  Then when you're writing the list of
>>>>>>>>>>>>>>>>>>>>>>>> sections, just add in the .debug$H section right after the .debug$T section.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Currently llvm-objcopy only writes ELF files, so it
>>>>>>>>>>>>>>>>>>>>>>>> would need to be taught to write COFF files.  We have code to do this in
>>>>>>>>>>>>>>>>>>>>>>>> the yaml2obj utility (specifically, in yaml2coff.cpp in the function
>>>>>>>>>>>>>>>>>>>>>>>> writeCOFF).  There may be a way to move this code to somewhere else
>>>>>>>>>>>>>>>>>>>>>>>> (llvm/Object/COFF.h?) so that it can be re-used by both yaml2coff and
>>>>>>>>>>>>>>>>>>>>>>>> llvm-objcopy, but in the worst case scenario you could copy the code and
>>>>>>>>>>>>>>>>>>>>>>>> re-write it to work with these new structures.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Lastly, you'll probably want to put all of this
>>>>>>>>>>>>>>>>>>>>>>>> behind an option in llvm-objcopy such as -add-codeview-ghash-section
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
--
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
Leonardo Santagada
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
--
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
Leonardo Santagada
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
Leonardo Santagada
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Leonardo
Santagada
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Leonardo Santagada
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>>
>>>>>>>>>>>> Leonardo Santagada
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> Leonardo Santagada
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Leonardo Santagada
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Leonardo Santagada
>>>>
>>>
>
>
> --
>
> Leonardo Santagada
>


-- 

Leonardo Santagada

Leonardo Santagada via llvm-dev

2018-Jan-25 18:07 UTC

[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)

just did what you said and yes, I need to add a symbol to the symbol table
and fix all the other indexes
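
A minimal sketch of the "fix all the other indexes" step, assuming the symbol table has already been read into memory. The Sym struct is a simplified stand-in for the 18-byte COFF symbol record (not an LLVM type), and shiftSectionNumbers is a hypothetical helper name. It only adjusts SectionNumber in primary symbol records; anything else that stores a section index (for example the Number field of the aux section definition record for associative COMDATs) would need the same treatment and is not handled here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for one symbol table slot. Aux records occupy slots
// too; for those, the fields below are just raw bytes and must be skipped.
struct Sym {
    int16_t SectionNumber;        // >0: 1-based section index; 0, -1, -2 are special values
    uint8_t NumberOfAuxSymbols;   // number of aux slots that follow this record
};

// After inserting a new section at 1-based position `insertedAt`, every
// symbol that referred to a section at or after that position must be
// bumped by one so it keeps pointing at the same section as before.
inline void shiftSectionNumbers(std::vector<Sym> &table, int16_t insertedAt) {
    assert(insertedAt >= 1);
    for (std::size_t i = 0; i < table.size(); ++i) {
        Sym &s = table[i];
        if (s.SectionNumber >= insertedAt)
            ++s.SectionNumber;
        // Skip the aux slots that belong to this symbol.
        i += s.NumberOfAuxSymbols;
    }
}
```

Special values (0 for undefined, -1 for absolute, -2 for debug) fall below any valid insertion index, so they are left alone by the comparison.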

On Thu, Jan 25, 2018 at 6:57 PM, Leonardo Santagada <santagada at gmail.com>
wrote:
> Any idea on how to create this new symbol there? I saw that there is a
> symbol pointing to each section, but didn't understand the format, and
> yaml2obj doesn't check it or do anything with the list.
>
> On Thu, Jan 25, 2018 at 6:56 PM, Leonardo Santagada <santagada at gmail.com>
> wrote:
>
>> YES, THANK YOU... I WAS THINKING THIS BUT COMPLETELY FORGOT.
>>
>> sorry for the caps... long day of working on this, and using vs 2017,
>> which adds a new section type .chks64 that I couldn't find documentation
>> for anywhere, was difficult. I highly recommend everyone just not use vs
>> 2017 until 15.8 or something, our internal bug list is gigantic.
>>
>> On Thu, Jan 25, 2018 at 6:52 PM, Zachary Turner <zturner at google.com>
>> wrote:
>>
>>> Actually I already have a theory that even though you are adding the
>>> section to the section table, you might not be adding a *symbol* for the
>>> section to the symbol table.  So the existing symbols (which reference
>>> sections by index) will all be wrong because you've inserted a new
>>> section.  Still though, obj2yaml would expose that.
>>>
>>> On Thu, Jan 25, 2018 at 9:50 AM Zachary Turner <zturner at google.com>
>>> wrote:
>>>
>>>> Yea as long as you compare clang-cl object file with automatically
>>>> generated .debug$H section against clang-cl object file without .debug$H
>>>> but added after the fact with llvm-objcopy, that should expose the problem
>>>> I think when you run obj2yaml on them.
>>>>
>>>> On Thu, Jan 25, 2018 at 9:49 AM Leonardo Santagada <santagada at gmail.com>
>>>> wrote:
>>>>
>>>>> I did reorder my sections, so that .debug$H is in the correct place,
>>>>> but now I get some errors on duplicate symbols. I created a folder with
>>>>> examples:
>>>>>
>>>>> https://www.dropbox.com/sh/nmvzi44pi0boe76/AAA0f47O5PCJ9JiUc6wVuwBra?dl=0
>>>>>
>>>>> t.obj is generated by vs 2015 and it links fine with lld-link.exe, but
>>>>> tout.obj gives these errors:
>>>>>
>>>>> lld-link.exe /DEBUG:GHASH tout.obj
>>>>> LLD-LINK.EXE: error: duplicate symbol: __local_stdio_printf_options in
>>>>> tout.obj and in LIBCMT.lib(default_local_stdio_options.obj)
>>>>> LLD-LINK.EXE: error: duplicate symbol: __local_stdio_printf_options in
>>>>> tout.obj and in libvcruntime.lib(undname.obj)
>>>>>
>>>>> I'm using PEView from http://wjradburn.com/software/ to look at the
>>>>> files and can't see anything wrong, except some valid differences in the
>>>>> offsets being used for the data (so pointer to data is different between
>>>>> them).
>>>>>
>>>>> I will look into yaml2obj now to see if I see anything else
weird
>>>>> going on.
>>>>>
>>>>>
>>>>> On Thu, Jan 25, 2018 at 6:41 PM, Zachary Turner <zturner
at google.com>
>>>>> wrote:
>>>>>
>>>>>> I'm pretty confident that cl is not putting
anything strange in the
>>>>>> .debug$T sections.  We've done a lot of testing and
never seen anything
>>>>>> except CodeView type records in a .debug$T.  My hunch
is that your objcopy
>>>>>> patch is probably not doing the right thing in one or
more of the section
>>>>>> headers, and this is confusing the linker.
>>>>>>
>>>>>> One idea might be to build a simple object file with
clang-cl but
>>>>>> without the magic -mllvm -emit-codeview-ghash-section,
then run your
>>>>>> llvm-objcopy on it.  Then build the same object file
passing -mllvm
>>>>>> -emit-codeview-ghash-section.  Then run obj2yaml on
both and diff the
>>>>>> results.  They should be byte-for-byte identical.  That
should give you a
>>>>>> clue about if objcopy is doing something wrong.
>>>>>>
>>>>>> On Thu, Jan 25, 2018 at 2:21 AM Leonardo Santagada <
>>>>>> santagada at gmail.com> wrote:
>>>>>>
>>>>>>> Don't worry, I definetly want to perfect this
to generate legal obj
>>>>>>> files, this is just to speed up testing.
>>>>>>>
>>>>>>> Now after patching all the obj files I get this
errors when linking
>>>>>>> a small part of our code base (msvc 2017 15.5.3,
lld and llvm-objcopy
>>>>>>> 7.0.0):
>>>>>>> lld-link.exe : error : relocation against symbol in
discarded
>>>>>>> section: $LN8
>>>>>>> lld-link.exe : error : relocation against symbol in
discarded
>>>>>>> section: $LN43
>>>>>>> lld-link.exe : error : relocation against symbol in
discarded
>>>>>>> section: $LN37
>>>>>>>
>>>>>>> I'm starting to guess that cl.exe might be
putting some random
>>>>>>> comdat or other discardable symbols in the .debug$T
and clang doesn't? I
>>>>>>> will try to debug this and see what more I can
uncover.
>>>>>>>
>>>>>>> Linking works perfectly without my llvm-objcopy
pass to add .debug$H?
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jan 25, 2018 at 1:53 AM, Zachary Turner
<zturner at google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> It might not influence LLD, but at the same
time we don't want to
>>>>>>>> upstream something that is producing
technically illegal COFF files.  Also
>>>>>>>> good to hear about the planned changes to your
header files.  Looking
>>>>>>>> forward to hearing about your experiences with
clang-cl.
>>>>>>>>
>>>>>>>> On Wed, Jan 24, 2018 at 10:41 AM Leonardo
Santagada <
>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I finally got my first .obj file patched
with .debug$H to look
>>>>>>>>> somewhat right. I added the new section at
the end of the file so I don't
>>>>>>>>> have to recalculate all sections (although
now I probably could position it
>>>>>>>>> in the middle, knowing that each section
is: SizeOfRawData +
>>>>>>>>> (last.Header.NumberOfRelocations * (4+4+2))
and the $H needs to
>>>>>>>>> come right after $T in the file). That
although illegal based on the coff
>>>>>>>>> specs doesn't seem its going to
influence lld.
>>>>>>>>>
>>>>>>>>> Also we talked and we are probably going to
do something similar
>>>>>>>>> to a bunch of windows defines and a check
for our own define (to guarantee
>>>>>>>>> that no one imported windows.h before
win32.h) and drop the namespace and
>>>>>>>>> the conflicting names.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Jan 23, 2018 at 12:46 AM, Zachary
Turner <
>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>
>>>>>>>>>> That's very possible that a 3rd
party indirect header include is
>>>>>>>>>> involved.  One idea might be like I
suggested where you #define _WINDOWS_
>>>>>>>>>> in win32.h and guarantee that it's
always included first.  Then those other
>>>>>>>>>> headers won't be able to #include
<windows.h>.  but it will probably
>>>>>>>>>> greatly expand the amount of stuff you
have to add to win32.h, as you will
>>>>>>>>>> probably find some callers of functions
that aren't yet in your win32.h
>>>>>>>>>> that you'd have to add.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Jan 22, 2018 at 3:28 PM
Leonardo Santagada <
>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Ok some information was lost on
getting this example to you, I'm
>>>>>>>>>>> sorry for not being clear.
>>>>>>>>>>>
>>>>>>>>>>> We have a huge code base, let's
say 90% of it doesn't include
>>>>>>>>>>> either header, 9% include win32.h
and 1% includes both, I will try to
>>>>>>>>>>> discover why, but my guess is they
include both a third party that includes
>>>>>>>>>>> windows.h and some of our libs that
use win32.h.
>>>>>>>>>>>
>>>>>>>>>>> I will try to fully understand this
tomorrow.
>>>>>>>>>>>
>>>>>>>>>>> I guess clang will not implement
this ever so finishing the
>>>>>>>>>>> object copier is the best solution
until all code is ported to clang.
>>>>>>>>>>>
>>>>>>>>>>> On 23 Jan 2018 00:02, "Zachary
Turner" <zturner at google.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> You said win32.h doesn't
include windows.h, but main.cpp does.
>>>>>>>>>>>> So what's the disadvantage
of just including it in win32.h anyway, since
>>>>>>>>>>>> it's already going to be in
every translation unit?  (Unless you didn't
>>>>>>>>>>>> mean to #include it in
main.cpp)
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I guess all I can do is warn
you how bad of an idea this is.
>>>>>>>>>>>> For starters, I already found a
bug in your code ;-)
>>>>>>>>>>>>
>>>>>>>>>>>> // stdint.h
>>>>>>>>>>>> typedef int               
int32_t;
>>>>>>>>>>>>
>>>>>>>>>>>> // winnt.h
>>>>>>>>>>>> typedef long LONG;
>>>>>>>>>>>>
>>>>>>>>>>>> // windef.h
>>>>>>>>>>>> typedef struct tagPOINT
>>>>>>>>>>>> {
>>>>>>>>>>>>     LONG  x;   // long x
>>>>>>>>>>>>     LONG  y;   // long y
>>>>>>>>>>>> } POINT, *PPOINT, NEAR
*NPPOINT, FAR *LPPOINT;
>>>>>>>>>>>>
>>>>>>>>>>>> // win32.h
>>>>>>>>>>>> typedef int32_t LONG;
>>>>>>>>>>>>
>>>>>>>>>>>> struct POINT
>>>>>>>>>>>> {
>>>>>>>>>>>> LONG x;   // int x
>>>>>>>>>>>> LONG y;   // int y
>>>>>>>>>>>> };
>>>>>>>>>>>>
>>>>>>>>>>>> So POINT is defined two
different ways.  In your minimal
>>>>>>>>>>>> interface, it's declared as
2 int32's, which are int.  In the actual
>>>>>>>>>>>> Windows header files, it's
declared as 2 longs.
>>>>>>>>>>>>
>>>>>>>>>>>> This might seem like a
unimportant bug since int and long are
>>>>>>>>>>>> the same size, but int and long
also mangle differently and affect overload
>>>>>>>>>>>> resolution, so you could have
weird linker errors or call the wrong
>>>>>>>>>>>> function overload.
>>>>>>>>>>>>
>>>>>>>>>>>> Plus, it illustrates the fact
that this struct *actually is* a
>>>>>>>>>>>> different type from the one in
the windows header.
>>>>>>>>>>>>
>>>>>>>>>>>> You said at the end that you
never intentionally import win32.h
>>>>>>>>>>>> and windows.h from the same
translation unit.  But then in this example you
>>>>>>>>>>>> did.  I wonder if you could
enforce that by doing this:
>>>>>>>>>>>>
>>>>>>>>>>>> // win32.h
>>>>>>>>>>>> #pragma once
>>>>>>>>>>>>
>>>>>>>>>>>> // Error if windows.h was
included before us.
>>>>>>>>>>>> #if defined(_WINDOWS_)
>>>>>>>>>>>> #error "You're
including win32.h after having already included
>>>>>>>>>>>> windows.h.  Don't do
this!"
>>>>>>>>>>>> #endif
>>>>>>>>>>>>
>>>>>>>>>>>> // And also make sure windows.h
can't get included after us
>>>>>>>>>>>> #define _WINDOWS_
>>>>>>>>>>>>
>>>>>>>>>>>> For the record, I tried the
test case you linked when windows.h
>>>>>>>>>>>> is not included in main.cpp and
it works (but still has the bug about int
>>>>>>>>>>>> and long).
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jan 22, 2018 at 2:23 PM
Leonardo Santagada <
>>>>>>>>>>>> santagada at gmail.com>
wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> It is super gross, but we
copy parts of windows.h because
>>>>>>>>>>>>> having all of it if both
gigantic and very very messy. So our win32.h has a
>>>>>>>>>>>>> couple thousands of lines
and not 30k+ for windows.h and we try to have
>>>>>>>>>>>>> zero macros. Win32.h
doesn't include windows.h so using ::BOOL wouldn't
>>>>>>>>>>>>> work. We don't want to
create a namespace, we just want a cleaner interface
>>>>>>>>>>>>> to windows api. The
namespace with c linkage is the way to trick cl into
>>>>>>>>>>>>> allowing us to in some
files have both windows.h and Win32.h. I really
>>>>>>>>>>>>> don't see any way for
us to have this Win32.h without this cl support, so
>>>>>>>>>>>>> maybe we should either put
windows.h in a compiled header somewhere and not
>>>>>>>>>>>>> care that it is infecting
everything or just have one place we can call to
>>>>>>>>>>>>> clean up after including
windows.h (a massive set of undefs).
>>>>>>>>>>>>>
>>>>>>>>>>>>> So using can't work,
because we never intentionally import
>>>>>>>>>>>>> windows.h and win32.h on
the same translation unit.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jan 22, 2018 at
7:08 PM, Zachary Turner <
>>>>>>>>>>>>> zturner at google.com>
wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is pretty gross,
honestly :)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can't you just use
using declarations?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> namespace Win32 {
>>>>>>>>>>>>>> extern "C" {
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> using ::BOOL;
>>>>>>>>>>>>>> using ::LONG;
>>>>>>>>>>>>>> using ::POINT;
>>>>>>>>>>>>>> using ::LPPOINT;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> using ::GetCursorPos;
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This works with
clang-cl.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at
5:39 AM Leonardo Santagada <
>>>>>>>>>>>>>> santagada at
gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Here it is a
minimal example, we do this so we don't have to
>>>>>>>>>>>>>>> import the whole
windows api everywhere.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
https://gist.github.com/santagada/7977e929d31c629c4bf18ebb98
>>>>>>>>>>>>>>> 7f6be3
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Jan 21,
2018 at 2:31 AM, Zachary Turner <
>>>>>>>>>>>>>>> zturner at
google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Clang-cl
maintains compatibility with msvc even in cases
>>>>>>>>>>>>>>>> where it’s non
standards compliant (eg 2 phase name lookup), but we try to
>>>>>>>>>>>>>>>> keep these
cases few and far between.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> To help me
understand your case, do you mean you copy
>>>>>>>>>>>>>>>> windows.h and
modify it? How does this lead to the same struct being
>>>>>>>>>>>>>>>> defined twice?
If i were to write this:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> struct Foo {};
>>>>>>>>>>>>>>>> struct Foo {};
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Is this a small
repro of the issue you’re talking about?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sat, Jan 20,
2018 at 3:44 PM Leonardo Santagada <
>>>>>>>>>>>>>>>> santagada at
gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I can
totally see something like incremental linking with
>>>>>>>>>>>>>>>>> a simple
padding between obj and a mapping file (which can also help with
>>>>>>>>>>>>>>>>> edit and
continue, something we also would love to have).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> We have
another developer doing the port to support
>>>>>>>>>>>>>>>>> clang-cl,
but although most of our code also goes trough a version of
>>>>>>>>>>>>>>>>> clang,
migrating the rest to clang-cl has been a fight. From what I heard
>>>>>>>>>>>>>>>>> the main
problem is that we have a copy of parts of windows.h (so not to
>>>>>>>>>>>>>>>>> bring the
awful parts of it like lower case macros) and that totally works
>>>>>>>>>>>>>>>>> on cl, but
clang (at least 6.0) complains about two struct/vars with the
>>>>>>>>>>>>>>>>> same name,
even though they are exactly the same. Making clang-cl as broken
>>>>>>>>>>>>>>>>> as cl.exe
is not an option I suppose? I would love to turn on a flag
>>>>>>>>>>>>>>>>>
--accept-that-cl-made-bad-decisions-and-live-with-it and
>>>>>>>>>>>>>>>>> have this
at least until this is completely fixed in our code base.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> the biggest
win with moving to cl would be a better more
>>>>>>>>>>>>>>>>> standards
compliant compiler, no 1 minute compiles on heavily templated
>>>>>>>>>>>>>>>>> files and
maybe the holy grail of ThinLTO.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sat, Jan
20, 2018 at 10:56 PM, Zachary Turner <
>>>>>>>>>>>>>>>>> zturner at
google.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 10-15s
will be hard without true incremental linking.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> At some
point that's going to be the only way to get any
>>>>>>>>>>>>>>>>>> faster,
but incremental linking is hard (putting it lightly), and since our
>>>>>>>>>>>>>>>>>> full
links are already really fast we think we can get reasonably close to
>>>>>>>>>>>>>>>>>>
link.exe incremental speeds with full links.  But it's never enough and I
>>>>>>>>>>>>>>>>>> will
always want it to be faster, so you may see incremental linking in the
>>>>>>>>>>>>>>>>>> future
after we hit a performance wall with full link speed :)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> In any
case, I'm definitely interested in seeing what
>>>>>>>>>>>>>>>>>> kind of
numbers you get with /debug:ghash after you get this llvm-objcopy
>>>>>>>>>>>>>>>>>> feature
implemented.  So keep me updated :)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> As an
aside, have you tried building with clang instead
>>>>>>>>>>>>>>>>>> of cl? 
If you build with clang you wouldn't even have to do this
>>>>>>>>>>>>>>>>>>
llvm-objcopy work, because it would "just work".  If you've tried
but ran
>>>>>>>>>>>>>>>>>> into
issues I'm interested in hearing about those too.  On the other hand,
>>>>>>>>>>>>>>>>>>
it's also reasonable to only switch one thing at a time.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sat,
Jan 20, 2018 at 1:34 PM Leonardo Santagada <
>>>>>>>>>>>>>>>>>>
santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> if
we get to < 30s I think most users would prefer it to
>>>>>>>>>>>>>>>>>>>
link.exe, just hopping there is still some more optimizations to get closer
>>>>>>>>>>>>>>>>>>> to
ELF linking times (around 10-15s here).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On
Sat, Jan 20, 2018 at 9:50 PM, Zachary Turner <
>>>>>>>>>>>>>>>>>>>
zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
Generally speaking a good rule of thumb is that
>>>>>>>>>>>>>>>>>>>>
/debug:ghash will be close to or faster than /debug:fastlink, but with none
>>>>>>>>>>>>>>>>>>>>
of the penalties like slow debug time
>>>>>>>>>>>>>>>>>>>>
On Sat, Jan 20, 2018 at 12:44 PM Zachary Turner <
>>>>>>>>>>>>>>>>>>>>
zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
Chrome is actually one of my exact benchmark cases.
>>>>>>>>>>>>>>>>>>>>>
When building blink_core.dll and browser_tests.exe, i get anywhere from a
>>>>>>>>>>>>>>>>>>>>>
20-40% reduction in link time. We have some other optimizations in the
>>>>>>>>>>>>>>>>>>>>>
pipeline but not upstream yet.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
My best time so far (including other optimizations not
>>>>>>>>>>>>>>>>>>>>>
yet upstream) is 28s on blink_core.dll, compared to 110s with /debug
>>>>>>>>>>>>>>>>>>>>>
On Sat, Jan 20, 2018 at 12:28 PM Leonardo Santagada <
>>>>>>>>>>>>>>>>>>>>>
santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
On Sat, Jan 20, 2018 at 9:05 PM, Zachary Turner <
>>>>>>>>>>>>>>>>>>>>>>
zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
You probably don't want to go down the same route
>>>>>>>>>>>>>>>>>>>>>>>
that clang goes through to write the object file.  If you think yaml2coff
>>>>>>>>>>>>>>>>>>>>>>>
is convoluted, the way clang does it will just give you a headache.  There
>>>>>>>>>>>>>>>>>>>>>>>
are multiple abstractions involved to account for different object file
>>>>>>>>>>>>>>>>>>>>>>>
formats (ELF, COFF, MachO) and output formats (Assembly, binary file).  At
>>>>>>>>>>>>>>>>>>>>>>>
least with yaml2coff
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
I think your phrase got cut there, but yeah I just
>>>>>>>>>>>>>>>>>>>>>>
found AsmPrinter.cpp and it is convoluted.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
It's true that yaml2coff is using the COFFParser
>>>>>>>>>>>>>>>>>>>>>>>
structure, but if you look at the writeCOFF
>>>>>>>>>>>>>>>>>>>>>>>
function in yaml2coff it's pretty bare-metal.  The logic you need will be
>>>>>>>>>>>>>>>>>>>>>>>
almost identical, except that instead of checking the COFFParser for the
>>>>>>>>>>>>>>>>>>>>>>>
various fields, you'll check the existing COFFObjectFile, which should have
>>>>>>>>>>>>>>>>>>>>>>>
similar fields.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
The only thing you need to different is when writing
>>>>>>>>>>>>>>>>>>>>>>>
the section table and section contents, to insert a new entry.  Since
>>>>>>>>>>>>>>>>>>>>>>>
you're injecting a section into the middle, you'll also probably need to
>>>>>>>>>>>>>>>>>>>>>>>
push back the file pointer of all subsequent sections so that they don't
>>>>>>>>>>>>>>>>>>>>>>>
overlap.  (e.g. if the original sections are 1, 2, 3, 4, 5 and you insert
>>>>>>>>>>>>>>>>>>>>>>>
between 2 and 3, then the original sections 3, 4, and 5 would need to have
>>>>>>>>>>>>>>>>>>>>>>>
their FilePointerToRawData offset by the size of the new section).
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
I have the PE/COFF spec open here and I'm happy that
>>>>>>>>>>>>>>>>>>>>>>
I read a bit of it so I actually know what you are talking about... yeah it
>>>>>>>>>>>>>>>>>>>>>>
doesn't seem too complicated.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
If you need to know what values to put for the other
>>>>>>>>>>>>>>>>>>>>>>>
fields in a section header, run `dumpbin /headers foo.obj` on a
>>>>>>>>>>>>>>>>>>>>>>>
clang-generated object file that has a .debug$H section already (e.g. run
>>>>>>>>>>>>>>>>>>>>>>>
clang with -emit-codeview-ghash-section, and look at the properties of the
>>>>>>>>>>>>>>>>>>>>>>>
.debug$H section and use the same values).
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
Thanks I will do that and then also look at how the
>>>>>>>>>>>>>>>>>>>>>>
CodeView part of the code does it if I can't understand some of it.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
The only invariant that needs to be maintained is
>>>>>>>>>>>>>>>>>>>>>>>
that Section[N]->FilePointerOfRawData
=>>>>>>>>>>>>>>>>>>>>>>>
Section[N-1]->FilePointerOfRawData +
>>>>>>>>>>>>>>>>>>>>>>>
Section[N-1]->SizeOfRawData
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
Well, that and all the sections need to be on the
>>>>>>>>>>>>>>>>>>>>>>
final file... But I'm hopeful.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
Anyone has times on linking a big project like chrome
>>>>>>>>>>>>>>>>>>>>>>
with this so that at least I know what kind of performance to expect?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
My numbers are something like:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
1 pdb per obj file: link.exe takes ~15 minutes and
>>>>>>>>>>>>>>>>>>>>>>
16GB of ram, lld-link.exe takes 2:30 minutes and ~8GB of ram
>>>>>>>>>>>>>>>>>>>>>>
around 10 pdbs per folder: link.exe takes 1 minute
>>>>>>>>>>>>>>>>>>>>>>
and 2-3GB of ram, lld-link.exe takes 1:30 minutes and ~6GB of ram
>>>>>>>>>>>>>>>>>>>>>>
faslink: link.exe takes 40 seconds, but then 20
>>>>>>>>>>>>>>>>>>>>>>
seconds of loading at the first break point in the debugger and we lost DIA
>>>>>>>>>>>>>>>>>>>>>>
support for listing symbols.
>>>>>>>>>>>>>>>>>>>>>>
incremental: link.exe takes 8 seconds, but it only
>>>>>>>>>>>>>>>>>>>>>>
happens when very minor changes happen.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
We have an non negligible number of symbols used on
>>>>>>>>>>>>>>>>>>>>>>
some runtime systems.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
On Sat, Jan 20, 2018 at 11:52 AM Leonardo Santagada <
>>>>>>>>>>>>>>>>>>>>>>>
santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
Thanks for the tips, I now have something that
>>>>>>>>>>>>>>>>>>>>>>>>
reads the obj file, finds .debug$T sections and global hashes it (proof of
>>>>>>>>>>>>>>>>>>>>>>>>
concept kind of code). What I can't find is: how does clang itself writes
>>>>>>>>>>>>>>>>>>>>>>>>
the coff files with global hashes, as that might help me understand how to
>>>>>>>>>>>>>>>>>>>>>>>>
create the .debug$H section, how to update the file section count and how
>>>>>>>>>>>>>>>>>>>>>>>>
to properly write this back.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
The code on yaml2coff is expecting to be working on
>>>>>>>>>>>>>>>>>>>>>>>>
the yaml COFFParser struct and I'm having quite a bit of a headache turning
>>>>>>>>>>>>>>>>>>>>>>>>
the COFFObjectFile into a COFFParser object or compatible... Tomorrow I
>>>>>>>>>>>>>>>>>>>>>>>>
might try the very non efficient path of coff2yaml and then yaml2coff with
>>>>>>>>>>>>>>>>>>>>>>>>
the hashes header... but it seems way too inefficient and convoluted.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
On Fri, Jan 19, 2018 at 10:38 PM, Zachary Turner <
>>>>>>>>>>>>>>>>>>>>>>>>
zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
On Fri, Jan 19, 2018 at 1:02 PM Leonardo Santagada
>>>>>>>>>>>>>>>>>>>>>>>>>
<santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
On Fri, Jan 19, 2018 at 9:44 PM, Zachary Turner <
>>>>>>>>>>>>>>>>>>>>>>>>>>
zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
On Fri, Jan 19, 2018 at 12:29 PM Leonardo
>>>>>>>>>>>>>>>>>>>>>>>>>>>
Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Hi,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
No I didn't, I used cl.exe from the visual
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
studio toolchain. What I'm proposing is a tool for processing .obj files in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
COFF format, reading them and generating the GHASH part.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
To make our build faster we use hundreds of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
unity build files (.cpp's with a lot of other .cpp's in them aka munch
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
files) but still have a lot of single .cpp's as well (in total something
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
like 3.4k .obj files).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
ps: sorry for sending to the wrong list, I was
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
reading about llvm mailing lists and jumped when I saw what I thought was a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
lld exclusive list.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
A tool like this would be useful, yes.  We've
>>>>>>>>>>>>>>>>>>>>>>>>>>>
talked about it internally as well and agreed it would be useful, we just
>>>>>>>>>>>>>>>>>>>>>>>>>>>
haven't prioritized it.  If you're interested in submitting a patch
along
>>>>>>>>>>>>>>>>>>>>>>>>>>>
those lines though, I think it would be a good addition.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
I'm not sure what the best place for it would
>>>>>>>>>>>>>>>>>>>>>>>>>>>
be.  llvm-readobj and llvm-objdump seem like obvious choices, but they are
>>>>>>>>>>>>>>>>>>>>>>>>>>>
intended to be read-only, so perhaps they wouldn't be a good fit.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
llvm-pdbutil is kind of a hodgepodge of
>>>>>>>>>>>>>>>>>>>>>>>>>>>
everything else related to PDBs and symbols, so I wouldn't be opposed to
>>>>>>>>>>>>>>>>>>>>>>>>>>>
making a new subcommand there called "ghash" or something that could
>>>>>>>>>>>>>>>>>>>>>>>>>>>
process an object file and output a new object file with a .debug$H section.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
A third option would be to make a new tool for
>>>>>>>>>>>>>>>>>>>>>>>>>>>
it.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
I don't htink it would be that hard to write.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
If you're interested in trying to make a patch for this, I can offer some
>>>>>>>>>>>>>>>>>>>>>>>>>>>
guidance on where to look in the code.  Otherwise it's something that
we'll
>>>>>>>>>>>>>>>>>>>>>>>>>>>
probably get to, I'm just not sure when.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
I would love to write it and contribute it back,
>>>>>>>>>>>>>>>>>>>>>>>>>>
please do tell, I did find some of the code of ghash in lld, but in fuzzy
>>>>>>>>>>>>>>>>>>>>>>>>>>
on the llvm codeview part of it and never seen llvm-readobj/objdump or
>>>>>>>>>>>>>>>>>>>>>>>>>>
llvm-pdbutil, but I'm not afraid to look :)
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
Luckily all of the important code is hidden
>>>>>>>>>>>>>>>>>>>>>>>>>
behind library calls, and it should already just do the right thing, so I
>>>>>>>>>>>>>>>>>>>>>>>>>
suspect you won't need to know much about CodeView to do this.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
I think Peter has the right idea about putting
>>>>>>>>>>>>>>>>>>>>>>>>>
this in llvm-objcopy.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
You can look at one of the existing CopyBinary
>>>>>>>>>>>>>>>>>>>>>>>>>
functions there, which currently only work for ELF, but you can just make a
>>>>>>>>>>>>>>>>>>>>>>>>>
new overload that accepts a COFFObjectFile.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
I would probably start by iterating over each of
>>>>>>>>>>>>>>>>>>>>>>>>>
the sections (getNumberOfSections / getSectionName) looking for .debug$T
>>>>>>>>>>>>>>>>>>>>>>>>>
and .debug$H sections.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
If you find a .debug$H section then you can just
>>>>>>>>>>>>>>>>>>>>>>>>>
skip that object file.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
If you find a .debug$T but not a .debug$H, then
>>>>>>>>>>>>>>>>>>>>>>>>>
basically do the same thing that LLD does in PDBLinker::mergeDebugT
>>>>>>>>>>>>>>>>>>>>>>>>>
(create a CVTypeArray, and pass it to GloballyHashedType::hashTypes.
>>>>>>>>>>>>>>>>>>>>>>>>>
That will return an array of hash values.  (the format of .debug$H is the
>>>>>>>>>>>>>>>>>>>>>>>>>
header, followed by the hash values).  Then when you're writing the list of
>>>>>>>>>>>>>>>>>>>>>>>>>
sections, just add in the .debug$H section right after the .debug$T section.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
Currently llvm-objcopy only writes ELF files, so
>>>>>>>>>>>>>>>>>>>>>>>>>
it would need to be taught to write COFF files.  We have code to do this in
>>>>>>>>>>>>>>>>>>>>>>>>>
the yaml2obj utility (specifically, in yaml2coff.cpp in the function
>>>>>>>>>>>>>>>>>>>>>>>>>
writeCOFF).  There may be a way to move this code to somewhere else
>>>>>>>>>>>>>>>>>>>>>>>>>
(llvm/Object/COFF.h?) so that it can be re-used by both yaml2coff and
>>>>>>>>>>>>>>>>>>>>>>>>>
llvm-objcopy, but in the worst case scenario you could copy the code and
>>>>>>>>>>>>>>>>>>>>>>>>>
re-write it to work with these new structures.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
Lastly, you'll probably want to put all of this
>>>>>>>>>>>>>>>>>>>>>>>>>
behind an option in llvm-objcopy such as -add-codeview-ghash-section
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>


-- 

Leonardo Santagada

Zachary Turner via llvm-dev

2018-Jan-25 18:15 UTC


[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)

If you run obj2yaml against a very simple object file, you'll see something
like this at the end:
```
symbols:
  - Name:            '@comp.id'
    Value:           17130443
    SectionNumber:   -1
    SimpleType:      IMAGE_SYM_TYPE_NULL
    ComplexType:     IMAGE_SYM_DTYPE_NULL
    StorageClass:    IMAGE_SYM_CLASS_STATIC
  - Name:            '@feat.00'
    Value:           2147484048
    SectionNumber:   -1
    SimpleType:      IMAGE_SYM_TYPE_NULL
    ComplexType:     IMAGE_SYM_DTYPE_NULL
    StorageClass:    IMAGE_SYM_CLASS_STATIC
  - Name:            .drectve
    Value:           0
    SectionNumber:   1
    SimpleType:      IMAGE_SYM_TYPE_NULL
    ComplexType:     IMAGE_SYM_DTYPE_NULL
    StorageClass:    IMAGE_SYM_CLASS_STATIC
    SectionDefinition:
      Length:          47
      NumberOfRelocations: 0
      NumberOfLinenumbers: 0
      CheckSum:        0
      Number:          0
...
```

There's a structure called COFF::symbol which basically represents each one
of these records.  It looks like this:

```
struct symbol {
  char Name[NameSize];
  uint32_t Value;
  int32_t SectionNumber;
  uint16_t Type;
  uint8_t StorageClass;
  uint8_t NumberOfAuxSymbols;
};
```

So you'll need to create one for the .debug$H section and add it to the
list.  This particular list doesn't have to be in any special order, so you
can just put it at the end.  (Inserting it into the middle is probably not
much harder, and it makes for a good test that you've done it right, because
the output can then be diffed against a clang-cl object file and should be
identical.)  So write all the normal symbols as you probably already are,
then write one for the .debug$H section.  Initialize its fields to the
values obj2yaml shows for the .debug$H section symbol in an object file
generated by clang-cl.
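
As a rough illustration of what "create one for the .debug$H section" could
look like, here is a minimal, self-contained sketch.  The Symbol struct
repeats the fields quoted above; SectionDefinition, SectionSymbol and
makeSectionSymbol are hypothetical names made up for this example, not LLVM
APIs, and SectionDefinition only mirrors the fields obj2yaml prints rather
than the exact on-disk record.  The real field values should still be copied
from obj2yaml output of a clang-cl object file.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

constexpr std::size_t NameSize = 8;
constexpr uint8_t IMAGE_SYM_CLASS_STATIC = 3; // storage class value from the PE/COFF spec

// Same fields as the symbol struct quoted above.
struct Symbol {
  char Name[NameSize];
  uint32_t Value;
  int32_t SectionNumber;
  uint16_t Type;
  uint8_t StorageClass;
  uint8_t NumberOfAuxSymbols;
};

// The fields obj2yaml prints under SectionDefinition (in-memory mirror only,
// not the exact on-disk layout).
struct SectionDefinition {
  uint32_t Length;
  uint16_t NumberOfRelocations;
  uint16_t NumberOfLinenumbers;
  uint32_t CheckSum;
  uint32_t Number;
};

// Hypothetical pairing of a section symbol with its aux record.
struct SectionSymbol {
  Symbol Sym;
  SectionDefinition Aux;
};

// Hypothetical helper: build the symbol-table entry for a section.
SectionSymbol makeSectionSymbol(const char *Name, int32_t SectionNumber,
                                uint32_t RawDataSize) {
  std::size_t Len = std::strlen(Name);
  assert(Len <= NameSize && "longer names would need the string table");
  SectionSymbol S{};                   // zero-fills Value, Type and the aux fields
  std::memcpy(S.Sym.Name, Name, Len);  // short names are NUL-padded, not NUL-terminated
  S.Sym.SectionNumber = SectionNumber; // 1-based index into the section table
  S.Sym.StorageClass = IMAGE_SYM_CLASS_STATIC;
  S.Sym.NumberOfAuxSymbols = 1;        // one section definition record follows
  S.Aux.Length = RawDataSize;          // assumed: size of the section's raw data
  return S;
}
```

Note that ".debug$H" is exactly 8 characters, so it fits in the Name field
without going through the string table.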

This structure doesn't contain any kind of file pointers or offsets, so all
you really need to fix up are the "SectionNumber" fields.  Basically, as you
are writing the existing symbols, you would do something like:

for (auto Sym : ObjFile.symbols()) {  // copy, so the original stays untouched
  // Sections at or after the insertion point have moved up by one.
  if (Sym.SectionNumber >= DebugHInsertionIndex)
    ++Sym.SectionNumber;
  writeSymbol(Sym);
}
writeSymbol(DebugHSym);
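
The renumbering loop above can be tried out in isolation with a small
stand-alone sketch.  SymEntry and insertSectionSymbol are hypothetical names
invented for this example (they are not part of LLVM), and the symbol table
is reduced to just a name and a section number.  Appending the new symbol at
the end, as suggested, also keeps the indices of the existing symbols
unchanged, which should matter for anything that refers to symbols by index,
such as relocations.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Reduced stand-in for a symbol table entry (hypothetical, for illustration).
struct SymEntry {
  std::string Name;
  int32_t SectionNumber; // 1-based; values <= 0 are special (undefined, absolute, debug)
};

// Returns a copy of Old in which every symbol that referred to a section at
// or after InsertAt is bumped by one, followed by a symbol for the new section.
std::vector<SymEntry> insertSectionSymbol(const std::vector<SymEntry> &Old,
                                          int32_t InsertAt,
                                          const std::string &NewName) {
  assert(InsertAt >= 1 && "section numbers are 1-based");
  std::vector<SymEntry> Out;
  Out.reserve(Old.size() + 1);
  for (SymEntry S : Old) {           // by value: adjust the copy only
    if (S.SectionNumber >= InsertAt) // special values (<= 0) never match
      ++S.SectionNumber;
    Out.push_back(S);
  }
  Out.push_back({NewName, InsertAt}); // the new section's own symbol goes last
  return Out;
}
```

For example, with symbols for sections 1 (.drectve), 2 (.debug$T) and
3 (.text), inserting .debug$H as section 3 leaves the first two untouched,
moves .text to section 4, and appends a .debug$H symbol with section
number 3.  The '@comp.id' style symbols with SectionNumber -1 are not
affected.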


On Thu, Jan 25, 2018 at 9:57 AM Leonardo Santagada <santagada at gmail.com>
wrote:
> Any idea on how to create this new symbol there? I saw that there is a
> symbol pointing to each section, but didn't understand the format, and
> yaml2obj doesn't check it or do anything with the list.
>
> On Thu, Jan 25, 2018 at 6:56 PM, Leonardo Santagada <santagada at
gmail.com>
> wrote:
>
>> YES, THANK YOU... I WAS THINKING THIS BUT COMPLETELY FORGOT.
>>
>> sorry for the caps... long day of working on this, and using vs 2017,
>> which adds a new section type .chks64 that I couldn't find
documentation
>> anywhere was difficult. I highly recommend everyone to just not using
vs
>> 2017 until 15.8 or something, our internal bug list is gigantic.
>>
>> On Thu, Jan 25, 2018 at 6:52 PM, Zachary Turner <zturner at
google.com>
>> wrote:
>>
>>> Actually I already have a theory that even though you are adding
the
>>> section to the section table, you might not be adding a *symbol*
for the
>>> section to the symbol table.  So the existing symbols (which
reference
>>> sections by index) will all be wrong because you've inserted a
new
>>> section.  Still though, obj2yaml would expose that.
>>>
>>> On Thu, Jan 25, 2018 at 9:50 AM Zachary Turner <zturner at
google.com>
>>> wrote:
>>>
>>>> Yea as long as you compare clang-cl object file with
automatically
>>>> generated .debug$H section against clang-cl object file without
.debug$H
>>>> but added after the fact with llvm-objcopy, that should expose
the problem
>>>> I think when you run obj2yaml on them.
>>>>
>>>> On Thu, Jan 25, 2018 at 9:49 AM Leonardo Santagada
<santagada at gmail.com>
>>>> wrote:
>>>>
>>>>> I did reorder my sections, so that .debug$H is in the
correct place,
>>>>> but now I get some errors on dubplicate symbols, I created
a folder with
>>>>> examples:
>>>>>
>>>>>
>>>>>
https://www.dropbox.com/sh/nmvzi44pi0boe76/AAA0f47O5PCJ9JiUc6wVuwBra?dl=0
>>>>>
>>>>> t.obj is generated by vs 2015 and it links fine with
lld-link.exe, but
>>>>> tout.obj gives this errors:
>>>>>
>>>>> lld-link.exe /DEBUG:GHASH tout.obj
>>>>> LLD-LINK.EXE: error: duplicate symbol:
__local_stdio_printf_options in
>>>>> tout.obj and in LIBCMT.lib(default_local_stdio_options.obj)
>>>>> LLD-LINK.EXE: error: duplicate symbol:
__local_stdio_printf_options in
>>>>> tout.obj and in libvcruntime.lib(undname.obj)
>>>>>
>>>>> I'm using PEView from http://wjradburn.com/software/ to
look at the
>>>>> files and can't see anything wrong, except some valid
differences in the
>>>>> offsets being used for the data (so pointer to data is
different between
>>>>> them).
>>>>>
>>>>> I will look into yaml2obj now to see if I see anything else
weird
>>>>> going on.
>>>>>
>>>>>
>>>>> On Thu, Jan 25, 2018 at 6:41 PM, Zachary Turner <zturner
at google.com>
>>>>> wrote:
>>>>>
>>>>>> I'm pretty confident that cl is not putting
anything strange in the
>>>>>> .debug$T sections.  We've done a lot of testing and
never seen anything
>>>>>> except CodeView type records in a .debug$T.  My hunch
is that your objcopy
>>>>>> patch is probably not doing the right thing in one or
more of the section
>>>>>> headers, and this is confusing the linker.
>>>>>>
>>>>>> One idea might be to build a simple object file with
clang-cl but
>>>>>> without the magic -mllvm -emit-codeview-ghash-section,
then run your
>>>>>> llvm-objcopy on it.  Then build the same object file
passing -mllvm
>>>>>> -emit-codeview-ghash-section.  Then run obj2yaml on
both and diff the
>>>>>> results.  They should be byte-for-byte identical.  That
should give you a
>>>>>> clue about if objcopy is doing something wrong.
>>>>>>
>>>>>> On Thu, Jan 25, 2018 at 2:21 AM Leonardo Santagada <
>>>>>> santagada at gmail.com> wrote:
>>>>>>
>>>>>>> Don't worry, I definetly want to perfect this
to generate legal obj
>>>>>>> files, this is just to speed up testing.
>>>>>>>
>>>>>>> Now after patching all the obj files I get this
errors when linking
>>>>>>> a small part of our code base (msvc 2017 15.5.3,
lld and llvm-objcopy
>>>>>>> 7.0.0):
>>>>>>> lld-link.exe : error : relocation against symbol in
discarded
>>>>>>> section: $LN8
>>>>>>> lld-link.exe : error : relocation against symbol in
discarded
>>>>>>> section: $LN43
>>>>>>> lld-link.exe : error : relocation against symbol in
discarded
>>>>>>> section: $LN37
>>>>>>>
>>>>>>> I'm starting to guess that cl.exe might be
putting some random
>>>>>>> comdat or other discardable symbols in the .debug$T
and clang doesn't? I
>>>>>>> will try to debug this and see what more I can
uncover.
>>>>>>>
>>>>>>> Linking works perfectly without my llvm-objcopy
pass to add .debug$H?
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jan 25, 2018 at 1:53 AM, Zachary Turner
<zturner at google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> It might not influence LLD, but at the same
time we don't want to
>>>>>>>> upstream something that is producing
technically illegal COFF files.  Also
>>>>>>>> good to hear about the planned changes to your
header files.  Looking
>>>>>>>> forward to hearing about your experiences with
clang-cl.
>>>>>>>>
>>>>>>>> On Wed, Jan 24, 2018 at 10:41 AM Leonardo
Santagada <
>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I finally got my first .obj file patched
with .debug$H to look
>>>>>>>>> somewhat right. I added the new section at
the end of the file so I don't
>>>>>>>>> have to recalculate all sections (although
now I probably could position it
>>>>>>>>> in the middle, knowing that each section
is: SizeOfRawData +
>>>>>>>>> (last.Header.NumberOfRelocations * (4+4+2))
and the $H needs to come right
>>>>>>>>> after $T in the file). That although
illegal based on the coff specs
>>>>>>>>> doesn't seem its going to influence
lld.
>>>>>>>>>
>>>>>>>>> Also we talked and we are probably going to
do something similar
>>>>>>>>> to a bunch of windows defines and a check
for our own define (to guarantee
>>>>>>>>> that no one imported windows.h before
win32.h) and drop the namespace and
>>>>>>>>> the conflicting names.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Jan 23, 2018 at 12:46 AM, Zachary
Turner <
>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>
>>>>>>>>>> That's very possible that a 3rd
party indirect header include is
>>>>>>>>>> involved.  One idea might be like I
suggested where you #define _WINDOWS_
>>>>>>>>>> in win32.h and guarantee that it's
always included first.  Then those other
>>>>>>>>>> headers won't be able to #include
<windows.h>.  but it will probably
>>>>>>>>>> greatly expand the amount of stuff you
have to add to win32.h, as you will
>>>>>>>>>> probably find some callers of functions
that aren't yet in your win32.h
>>>>>>>>>> that you'd have to add.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Jan 22, 2018 at 3:28 PM
Leonardo Santagada <
>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Ok some information was lost on
getting this example to you, I'm
>>>>>>>>>>> sorry for not being clear.
>>>>>>>>>>>
>>>>>>>>>>> We have a huge code base, let's
say 90% of it doesn't include
>>>>>>>>>>> either header, 9% include win32.h
and 1% includes both, I will try to
>>>>>>>>>>> discover why, but my guess is they
include both a third party that includes
>>>>>>>>>>> windows.h and some of our libs that
use win32.h.
>>>>>>>>>>>
>>>>>>>>>>> I will try to fully understand this
tomorrow.
>>>>>>>>>>>
>>>>>>>>>>> I guess clang will not implement
this ever so finishing the
>>>>>>>>>>> object copier is the best solution
until all code is ported to clang.
>>>>>>>>>>>
>>>>>>>>>>> On 23 Jan 2018 00:02, "Zachary
Turner" <zturner at google.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> You said win32.h doesn't
include windows.h, but main.cpp does.
>>>>>>>>>>>> So what's the disadvantage
of just including it in win32.h anyway, since
>>>>>>>>>>>> it's already going to be in
every translation unit?  (Unless you didn't
>>>>>>>>>>>> mean to #include it in
main.cpp)
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I guess all I can do is warn
you how bad of an idea this is.
>>>>>>>>>>>> For starters, I already found a
bug in your code ;-)
>>>>>>>>>>>>
>>>>>>>>>>>> // stdint.h
>>>>>>>>>>>> typedef int               
int32_t;
>>>>>>>>>>>>
>>>>>>>>>>>> // winnt.h
>>>>>>>>>>>> typedef long LONG;
>>>>>>>>>>>>
>>>>>>>>>>>> // windef.h
>>>>>>>>>>>> typedef struct tagPOINT
>>>>>>>>>>>> {
>>>>>>>>>>>>     LONG  x;   // long x
>>>>>>>>>>>>     LONG  y;   // long y
>>>>>>>>>>>> } POINT, *PPOINT, NEAR
*NPPOINT, FAR *LPPOINT;
>>>>>>>>>>>>
>>>>>>>>>>>> // win32.h
>>>>>>>>>>>> typedef int32_t LONG;
>>>>>>>>>>>>
>>>>>>>>>>>> struct POINT
>>>>>>>>>>>> {
>>>>>>>>>>>> LONG x;   // int x
>>>>>>>>>>>> LONG y;   // int y
>>>>>>>>>>>> };
>>>>>>>>>>>>
>>>>>>>>>>>> So POINT is defined two
different ways.  In your minimal
>>>>>>>>>>>> interface, it's declared as
2 int32's, which are int.  In the actual
>>>>>>>>>>>> Windows header files, it's
declared as 2 longs.
>>>>>>>>>>>>
>>>>>>>>>>>> This might seem like a
unimportant bug since int and long are
>>>>>>>>>>>> the same size, but int and long
also mangle differently and affect overload
>>>>>>>>>>>> resolution, so you could have
weird linker errors or call the wrong
>>>>>>>>>>>> function overload.
>>>>>>>>>>>>
>>>>>>>>>>>> Plus, it illustrates the fact
that this struct *actually is* a
>>>>>>>>>>>> different type from the one in
the windows header.
>>>>>>>>>>>>
>>>>>>>>>>>> You said at the end that you
never intentionally import win32.h
>>>>>>>>>>>> and windows.h from the same
translation unit.  But then in this example you
>>>>>>>>>>>> did.  I wonder if you could
enforce that by doing this:
>>>>>>>>>>>>
>>>>>>>>>>>> // win32.h
>>>>>>>>>>>> #pragma once
>>>>>>>>>>>>
>>>>>>>>>>>> // Error if windows.h was
included before us.
>>>>>>>>>>>> #if defined(_WINDOWS_)
>>>>>>>>>>>> #error "You're
including win32.h after having already included
>>>>>>>>>>>> windows.h.  Don't do
this!"
>>>>>>>>>>>> #endif
>>>>>>>>>>>>
>>>>>>>>>>>> // And also make sure windows.h
can't get included after us
>>>>>>>>>>>> #define _WINDOWS_
>>>>>>>>>>>>
>>>>>>>>>>>> For the record, I tried the
test case you linked when windows.h
>>>>>>>>>>>> is not included in main.cpp and
it works (but still has the bug about int
>>>>>>>>>>>> and long).
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jan 22, 2018 at 2:23 PM
Leonardo Santagada <
>>>>>>>>>>>> santagada at gmail.com>
wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> It is super gross, but we
copy parts of windows.h because
>>>>>>>>>>>>> having all of it if both
gigantic and very very messy. So our win32.h has a
>>>>>>>>>>>>> couple thousands of lines
and not 30k+ for windows.h and we try to have
>>>>>>>>>>>>> zero macros. Win32.h
doesn't include windows.h so using ::BOOL wouldn't
>>>>>>>>>>>>> work. We don't want to
create a namespace, we just want a cleaner interface
>>>>>>>>>>>>> to windows api. The
namespace with c linkage is the way to trick cl into
>>>>>>>>>>>>> allowing us to in some
files have both windows.h and Win32.h. I really
>>>>>>>>>>>>> don't see any way for
us to have this Win32.h without this cl support, so
>>>>>>>>>>>>> maybe we should either put
windows.h in a compiled header somewhere and not
>>>>>>>>>>>>> care that it is infecting
everything or just have one place we can call to
>>>>>>>>>>>>> clean up after including
windows.h (a massive set of undefs).
>>>>>>>>>>>>>
>>>>>>>>>>>>> So using can't work,
because we never intentionally import
>>>>>>>>>>>>> windows.h and win32.h on
the same translation unit.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jan 22, 2018 at
7:08 PM, Zachary Turner <
>>>>>>>>>>>>> zturner at google.com>
wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is pretty gross,
honestly :)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can't you just use
using declarations?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> namespace Win32 {
>>>>>>>>>>>>>> extern "C" {
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> using ::BOOL;
>>>>>>>>>>>>>> using ::LONG;
>>>>>>>>>>>>>> using ::POINT;
>>>>>>>>>>>>>> using ::LPPOINT;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> using ::GetCursorPos;
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This works with
clang-cl.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at
5:39 AM Leonardo Santagada <
>>>>>>>>>>>>>> santagada at
gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Here it is a
minimal example, we do this so we don't have to
>>>>>>>>>>>>>>> import the whole
windows api everywhere.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
https://gist.github.com/santagada/7977e929d31c629c4bf18ebb987f6be3
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Jan 21,
2018 at 2:31 AM, Zachary Turner <
>>>>>>>>>>>>>>> zturner at
google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Clang-cl
maintains compatibility with msvc even in cases
>>>>>>>>>>>>>>>> where it’s non
standards compliant (eg 2 phase name lookup), but we try to
>>>>>>>>>>>>>>>> keep these
cases few and far between.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> To help me
understand your case, do you mean you copy
>>>>>>>>>>>>>>>> windows.h and
modify it? How does this lead to the same struct being
>>>>>>>>>>>>>>>> defined twice?
If i were to write this:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> struct Foo {};
>>>>>>>>>>>>>>>> struct Foo {};
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Is this a small
repro of the issue you’re talking about?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sat, Jan 20,
2018 at 3:44 PM Leonardo Santagada <
>>>>>>>>>>>>>>>> santagada at
gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I can
totally see something like incremental linking with
>>>>>>>>>>>>>>>>> a simple
padding between obj and a mapping file (which can also help with
>>>>>>>>>>>>>>>>> edit and
continue, something we also would love to have).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> We have
another developer doing the port to support
>>>>>>>>>>>>>>>>> clang-cl,
but although most of our code also goes trough a version of
>>>>>>>>>>>>>>>>> clang,
migrating the rest to clang-cl has been a fight. From what I heard
>>>>>>>>>>>>>>>>> the main
problem is that we have a copy of parts of windows.h (so not to
>>>>>>>>>>>>>>>>> bring the
awful parts of it like lower case macros) and that totally works
>>>>>>>>>>>>>>>>> on cl, but
clang (at least 6.0) complains about two struct/vars with the
>>>>>>>>>>>>>>>>> same name,
even though they are exactly the same. Making clang-cl as broken
>>>>>>>>>>>>>>>>> as cl.exe
is not an option I suppose? I would love to turn on a flag
>>>>>>>>>>>>>>>>>
--accept-that-cl-made-bad-decisions-and-live-with-it and have this at least
>>>>>>>>>>>>>>>>> until this
is completely fixed in our code base.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> the biggest
win with moving to cl would be a better more
>>>>>>>>>>>>>>>>> standards
compliant compiler, no 1 minute compiles on heavily templated
>>>>>>>>>>>>>>>>> files and
maybe the holy grail of ThinLTO.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sat, Jan
20, 2018 at 10:56 PM, Zachary Turner <
>>>>>>>>>>>>>>>>> zturner at
google.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 10-15s
will be hard without true incremental linking.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> At some
point that's going to be the only way to get any
>>>>>>>>>>>>>>>>>> faster,
but incremental linking is hard (putting it lightly), and since our
>>>>>>>>>>>>>>>>>> full
links are already really fast we think we can get reasonably close to
>>>>>>>>>>>>>>>>>>
link.exe incremental speeds with full links.  But it's never enough and I
>>>>>>>>>>>>>>>>>> will
always want it to be faster, so you may see incremental linking in the
>>>>>>>>>>>>>>>>>> future
after we hit a performance wall with full link speed :)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> In any
case, I'm definitely interested in seeing what
>>>>>>>>>>>>>>>>>> kind of
numbers you get with /debug:ghash after you get this llvm-objcopy
>>>>>>>>>>>>>>>>>> feature
implemented.  So keep me updated :)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> As an
aside, have you tried building with clang instead
>>>>>>>>>>>>>>>>>> of cl? 
If you build with clang you wouldn't even have to do this
>>>>>>>>>>>>>>>>>>
llvm-objcopy work, because it would "just work".  If you've tried
but ran
>>>>>>>>>>>>>>>>>> into
issues I'm interested in hearing about those too.  On the other hand,
>>>>>>>>>>>>>>>>>>
it's also reasonable to only switch one thing at a time.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sat,
Jan 20, 2018 at 1:34 PM Leonardo Santagada <
>>>>>>>>>>>>>>>>>>
santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> if
we get to < 30s I think most users would prefer it to
>>>>>>>>>>>>>>>>>>>
link.exe, just hopping there is still some more optimizations to get closer
>>>>>>>>>>>>>>>>>>> to
ELF linking times (around 10-15s here).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On
Sat, Jan 20, 2018 at 9:50 PM, Zachary Turner <
>>>>>>>>>>>>>>>>>>>
zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
Generally speaking a good rule of thumb is that
>>>>>>>>>>>>>>>>>>>>
/debug:ghash will be close to or faster than /debug:fastlink, but with none
>>>>>>>>>>>>>>>>>>>>
of the penalties like slow debug time
>>>>>>>>>>>>>>>>>>>>
On Sat, Jan 20, 2018 at 12:44 PM Zachary Turner <
>>>>>>>>>>>>>>>>>>>>
zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
Chrome is actually one of my exact benchmark cases.
>>>>>>>>>>>>>>>>>>>>>
When building blink_core.dll and browser_tests.exe, i get anywhere from a
>>>>>>>>>>>>>>>>>>>>>
20-40% reduction in link time. We have some other optimizations in the
>>>>>>>>>>>>>>>>>>>>>
pipeline but not upstream yet.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
My best time so far (including other optimizations not
>>>>>>>>>>>>>>>>>>>>>
yet upstream) is 28s on blink_core.dll, compared to 110s with /debug
>>>>>>>>>>>>>>>>>>>>>
On Sat, Jan 20, 2018 at 12:28 PM Leonardo Santagada <
>>>>>>>>>>>>>>>>>>>>>
santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
On Sat, Jan 20, 2018 at 9:05 PM, Zachary Turner <
>>>>>>>>>>>>>>>>>>>>>>
zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
You probably don't want to go down the same route
>>>>>>>>>>>>>>>>>>>>>>>
that clang goes through to write the object file.  If you think yaml2coff
>>>>>>>>>>>>>>>>>>>>>>>
is convoluted, the way clang does it will just give you a headache.  There
>>>>>>>>>>>>>>>>>>>>>>>
are multiple abstractions involved to account for different object file
>>>>>>>>>>>>>>>>>>>>>>>
formats (ELF, COFF, MachO) and output formats (Assembly, binary file).  At
>>>>>>>>>>>>>>>>>>>>>>>
least with yaml2coff
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
I think your phrase got cut there, but yeah I just
>>>>>>>>>>>>>>>>>>>>>>
found AsmPrinter.cpp and it is convoluted.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
It's true that yaml2coff is using the COFFParser
>>>>>>>>>>>>>>>>>>>>>>>
structure, but if you look at the writeCOFF
>>>>>>>>>>>>>>>>>>>>>>>
function in yaml2coff it's pretty bare-metal.  The logic you need will be
>>>>>>>>>>>>>>>>>>>>>>>
almost identical, except that instead of checking the COFFParser for the
>>>>>>>>>>>>>>>>>>>>>>>
various fields, you'll check the existing COFFObjectFile, which should have
>>>>>>>>>>>>>>>>>>>>>>>
similar fields.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
The only thing you need to different is when writing
>>>>>>>>>>>>>>>>>>>>>>>
the section table and section contents, to insert a new entry.  Since
>>>>>>>>>>>>>>>>>>>>>>>
you're injecting a section into the middle, you'll also probably need to
>>>>>>>>>>>>>>>>>>>>>>>
push back the file pointer of all subsequent sections so that they don't
>>>>>>>>>>>>>>>>>>>>>>>
overlap.  (e.g. if the original sections are 1, 2, 3, 4, 5 and you insert
>>>>>>>>>>>>>>>>>>>>>>>
between 2 and 3, then the original sections 3, 4, and 5 would need to have
>>>>>>>>>>>>>>>>>>>>>>>
their FilePointerToRawData offset by the size of the new section).
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
I have the PE/COFF spec open here and I'm happy that
>>>>>>>>>>>>>>>>>>>>>>
I read a bit of it so I actually know what you are talking about... yeah it
>>>>>>>>>>>>>>>>>>>>>>
doesn't seem too complicated.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
If you need to know what values to put for the other
>>>>>>>>>>>>>>>>>>>>>>>
fields in a section header, run `dumpbin /headers foo.obj` on a
>>>>>>>>>>>>>>>>>>>>>>>
clang-generated object file that has a .debug$H section already (e.g. run
>>>>>>>>>>>>>>>>>>>>>>>
clang with -emit-codeview-ghash-section, and look at the properties of the
>>>>>>>>>>>>>>>>>>>>>>>
.debug$H section and use the same values).
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
Thanks I will do that and then also look at how the
>>>>>>>>>>>>>>>>>>>>>>
CodeView part of the code does it if I can't understand some of it.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
The only invariant that needs to be maintained is
>>>>>>>>>>>>>>>>>>>>>>>
that Section[N]->FilePointerOfRawData
=>>>>>>>>>>>>>>>>>>>>>>>
Section[N-1]->FilePointerOfRawData + Section[N-1]->SizeOfRawData
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
Well, that and all the sections need to be on the
>>>>>>>>>>>>>>>>>>>>>>
final file... But I'm hopeful.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
Anyone has times on linking a big project like chrome
>>>>>>>>>>>>>>>>>>>>>>
with this so that at least I know what kind of performance to expect?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
My numbers are something like:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
1 pdb per obj file: link.exe takes ~15 minutes and
>>>>>>>>>>>>>>>>>>>>>>
16GB of ram, lld-link.exe takes 2:30 minutes and ~8GB of ram
>>>>>>>>>>>>>>>>>>>>>>
around 10 pdbs per folder: link.exe takes 1 minute
>>>>>>>>>>>>>>>>>>>>>>
and 2-3GB of ram, lld-link.exe takes 1:30 minutes and ~6GB of ram
>>>>>>>>>>>>>>>>>>>>>>
faslink: link.exe takes 40 seconds, but then 20
>>>>>>>>>>>>>>>>>>>>>>
seconds of loading at the first break point in the debugger and we lost DIA
>>>>>>>>>>>>>>>>>>>>>>
support for listing symbols.
>>>>>>>>>>>>>>>>>>>>>>
incremental: link.exe takes 8 seconds, but it only
>>>>>>>>>>>>>>>>>>>>>>
happens when very minor changes happen.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
We have an non negligible number of symbols used on
>>>>>>>>>>>>>>>>>>>>>>
some runtime systems.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
On Sat, Jan 20, 2018 at 11:52 AM Leonardo Santagada <
>>>>>>>>>>>>>>>>>>>>>>>
santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
Thanks for the tips, I now have something that
>>>>>>>>>>>>>>>>>>>>>>>>
reads the obj file, finds .debug$T sections and global hashes it (proof of
>>>>>>>>>>>>>>>>>>>>>>>>
concept kind of code). What I can't find is: how does clang itself writes
>>>>>>>>>>>>>>>>>>>>>>>>
the coff files with global hashes, as that might help me understand how to
>>>>>>>>>>>>>>>>>>>>>>>>
create the .debug$H section, how to update the file section count and how
>>>>>>>>>>>>>>>>>>>>>>>>
to properly write this back.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
The code on yaml2coff is expecting to be working on
>>>>>>>>>>>>>>>>>>>>>>>>
the yaml COFFParser struct and I'm having quite a bit of a headache turning
>>>>>>>>>>>>>>>>>>>>>>>>
the COFFObjectFile into a COFFParser object or compatible... Tomorrow I
>>>>>>>>>>>>>>>>>>>>>>>>
might try the very non efficient path of coff2yaml and then yaml2coff with
>>>>>>>>>>>>>>>>>>>>>>>>
the hashes header... but it seems way too inefficient and convoluted.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
On Fri, Jan 19, 2018 at 10:38 PM, Zachary Turner <
>>>>>>>>>>>>>>>>>>>>>>>>
zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
On Fri, Jan 19, 2018 at 1:02 PM Leonardo Santagada
>>>>>>>>>>>>>>>>>>>>>>>>>
<santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
On Fri, Jan 19, 2018 at 9:44 PM, Zachary Turner <
>>>>>>>>>>>>>>>>>>>>>>>>>>
zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
On Fri, Jan 19, 2018 at 12:29 PM Leonardo
>>>>>>>>>>>>>>>>>>>>>>>>>>>
Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Hi,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
No I didn't, I used cl.exe from the visual
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
studio toolchain. What I'm proposing is a tool for processing .obj files in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
COFF format, reading them and generating the GHASH part.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
To make our build faster we use hundreds of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
unity build files (.cpp's with a lot of other .cpp's in them aka munch
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
files) but still have a lot of single .cpp's as well (in total something
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
like 3.4k .obj files).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
ps: sorry for sending to the wrong list, I was
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
reading about llvm mailing lists and jumped when I saw what I thought was a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
lld exclusive list.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
A tool like this would be useful, yes.  We've
>>>>>>>>>>>>>>>>>>>>>>>>>>>
talked about it internally as well and agreed it would be useful, we just
>>>>>>>>>>>>>>>>>>>>>>>>>>>
haven't prioritized it.  If you're interested in submitting a patch
along
>>>>>>>>>>>>>>>>>>>>>>>>>>>
those lines though, I think it would be a good addition.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
I'm not sure what the best place for it would
>>>>>>>>>>>>>>>>>>>>>>>>>>>
be.  llvm-readobj and llvm-objdump seem like obvious choices, but they are
>>>>>>>>>>>>>>>>>>>>>>>>>>>
intended to be read-only, so perhaps they wouldn't be a good fit.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
llvm-pdbutil is kind of a hodgepodge of
>>>>>>>>>>>>>>>>>>>>>>>>>>>
everything else related to PDBs and symbols, so I wouldn't be opposed to
>>>>>>>>>>>>>>>>>>>>>>>>>>>
making a new subcommand there called "ghash" or something that could
>>>>>>>>>>>>>>>>>>>>>>>>>>>
process an object file and output a new object file with a .debug$H section.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
A third option would be to make a new tool for
>>>>>>>>>>>>>>>>>>>>>>>>>>>
it.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
I don't htink it would be that hard to write.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
If you're interested in trying to make a patch for this, I can offer some
>>>>>>>>>>>>>>>>>>>>>>>>>>>
guidance on where to look in the code.  Otherwise it's something that
we'll
>>>>>>>>>>>>>>>>>>>>>>>>>>>
probably get to, I'm just not sure when.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
I would love to write it and contribute it back,
>>>>>>>>>>>>>>>>>>>>>>>>>>
please do tell, I did find some of the code of ghash in lld, but in fuzzy
>>>>>>>>>>>>>>>>>>>>>>>>>>
on the llvm codeview part of it and never seen llvm-readobj/objdump or
>>>>>>>>>>>>>>>>>>>>>>>>>>
llvm-pdbutil, but I'm not afraid to look :)
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
Luckily all of the important code is hidden
>>>>>>>>>>>>>>>>>>>>>>>>>
behind library calls, and it should already just do the right thing, so I
>>>>>>>>>>>>>>>>>>>>>>>>>
suspect you won't need to know much about CodeView to do this.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
I think Peter has the right idea about putting
>>>>>>>>>>>>>>>>>>>>>>>>>
this in llvm-objcopy.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
You can look at one of the existing CopyBinary
>>>>>>>>>>>>>>>>>>>>>>>>>
functions there, which currently only work for ELF, but you can just make a
>>>>>>>>>>>>>>>>>>>>>>>>>
new overload that accepts a COFFObjectFile.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
I would probably start by iterating over each of
>>>>>>>>>>>>>>>>>>>>>>>>>
the sections (getNumberOfSections / getSectionName) looking for .debug$T
>>>>>>>>>>>>>>>>>>>>>>>>>
and .debug$H sections.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
If you find a .debug$H section then you can just
>>>>>>>>>>>>>>>>>>>>>>>>>
skip that object file.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
If you find a .debug$T but not a .debug$H, then
>>>>>>>>>>>>>>>>>>>>>>>>>
basically do the same thing that LLD does in PDBLinker::mergeDebugT
>>>>>>>>>>>>>>>>>>>>>>>>>
(create a CVTypeArray, and pass it to GloballyHashedType::hashTypes.  That
>>>>>>>>>>>>>>>>>>>>>>>>>
will return an array of hash values.  (the format of .debug$H is the
>>>>>>>>>>>>>>>>>>>>>>>>>
header, followed by the hash values).  Then when you're writing the list of
>>>>>>>>>>>>>>>>>>>>>>>>>
sections, just add in the .debug$H section right after the .debug$T section.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
Currently llvm-objcopy only writes ELF files, so
>>>>>>>>>>>>>>>>>>>>>>>>>
it would need to be taught to write COFF files.  We have code to do this in
>>>>>>>>>>>>>>>>>>>>>>>>>
the yaml2obj utility (specifically, in yaml2coff.cpp in the function
>>>>>>>>>>>>>>>>>>>>>>>>>
writeCOFF).  There may be a way to move this code to somewhere else
>>>>>>>>>>>>>>>>>>>>>>>>>
(llvm/Object/COFF.h?) so that it can be re-used by both yaml2coff and
>>>>>>>>>>>>>>>>>>>>>>>>>
llvm-objcopy, but in the worst case scenario you could copy the code and
>>>>>>>>>>>>>>>>>>>>>>>>>
re-write it to work with these new structures.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
Lastly, you'll probably want to put all of this
>>>>>>>>>>>>>>>>>>>>>>>>>
behind an option in llvm-objcopy such as -add-codeview-ghash-section
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>

Leonardo Santagada via llvm-dev

2018-Jan-25 18:46 UTC

[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)

I see that there is an aux symbol for each section symbol, and in the yaml
representation it has checksum, selection and unused fields, none of which
I know how to fill in. This aux symbol might also hold information I need
in order to patch the other symbols. Can you point me to the part of llvm
that writes those? At least for aux symbols, the yaml part of the code
treats them as a binary blob, so there is no info on what they should be.
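
For reference, the PE/COFF spec describes the aux record that follows a
section symbol as an 18-byte "section definition". Below is a minimal sketch
of that layout. The struct and helper names are made up for illustration and
are not LLVM's own declarations, and zeroing the COMDAT fields for a plain
section is an assumption; the safest source for real values is still the
obj2yaml output of a clang-cl object file.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical mirror of the PE/COFF "section definition" aux record.
// Packed because the on-disk record is exactly 18 bytes (non-bigobj files).
#pragma pack(push, 1)
struct AuxSectionDefinition {
  uint32_t Length;               // size of the section's raw data
  uint16_t NumberOfRelocations;  // relocation entries for the section
  uint16_t NumberOfLinenumbers;  // line-number entries for the section
  uint32_t CheckSum;             // checksum, applies to COMDAT sections
  uint16_t Number;               // associated section index (associative COMDAT)
  uint8_t Selection;             // COMDAT selection type, 0 if not COMDAT
  uint8_t Unused[3];             // padding up to the symbol record size
};
#pragma pack(pop)

static_assert(sizeof(AuxSectionDefinition) == 18,
              "aux records have the same size as a symbol record");

// Builds the aux record for a plain, non-COMDAT section (assumption:
// every COMDAT-related field can stay zero in that case).
inline AuxSectionDefinition makePlainSectionAux(uint32_t RawDataSize,
                                                uint16_t RelocCount) {
  AuxSectionDefinition Aux;
  std::memset(&Aux, 0, sizeof(Aux));
  Aux.Length = RawDataSize;
  Aux.NumberOfRelocations = RelocCount;
  return Aux;
}
```

The section symbol that owns this record would then carry
NumberOfAuxSymbols = 1.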

On Thu, Jan 25, 2018 at 7:15 PM, Zachary Turner <zturner at google.com>
wrote:
> If you run obj2yaml against a very simple object file, you'll see
> something like this at the end:
> ```
> symbols:
>   - Name:            '@comp.id'
>     Value:           17130443
>     SectionNumber:   -1
>     SimpleType:      IMAGE_SYM_TYPE_NULL
>     ComplexType:     IMAGE_SYM_DTYPE_NULL
>     StorageClass:    IMAGE_SYM_CLASS_STATIC
>   - Name:            '@feat.00'
>     Value:           2147484048
>     SectionNumber:   -1
>     SimpleType:      IMAGE_SYM_TYPE_NULL
>     ComplexType:     IMAGE_SYM_DTYPE_NULL
>     StorageClass:    IMAGE_SYM_CLASS_STATIC
>   - Name:            .drectve
>     Value:           0
>     SectionNumber:   1
>     SimpleType:      IMAGE_SYM_TYPE_NULL
>     ComplexType:     IMAGE_SYM_DTYPE_NULL
>     StorageClass:    IMAGE_SYM_CLASS_STATIC
>     SectionDefinition:
>       Length:          47
>       NumberOfRelocations: 0
>       NumberOfLinenumbers: 0
>       CheckSum:        0
>       Number:          0
> ...
> ```
>
> There's a structure called coff::symbol which basically represents each
> one of these records.  It looks like this:
>
> ```
> struct symbol {
>   char Name[NameSize];
>   uint32_t Value;
>   int32_t SectionNumber;
>   uint16_t Type;
>   uint8_t StorageClass;
>   uint8_t NumberOfAuxSymbols;
> };
> ```
>
> So you'll need to create one for the .debug$H section and stick it into the
> list.  This particular list doesn't have to be in any special order, so you
> can just put it at the end (although it's probably not that much harder to
> insert into the middle, and it will make for a good test that you've done
> it right.  The output can be diffed against clang-cl object file and be
> identical this way).  So write all the normal symbols as you probably
> already are, then write one for the .debug$H section.  Initialize the
> fields to the same thing that you see when you run obj2yaml against an
> object file generated by clang-cl for the .debug$H section.
>
> This structure doesn't contain any kind of file pointers or offsets, so
> all you really need to fix up are the "SectionNumber" fields.  Basically as
> you are writing the existing symbols, you would do something like:
>
> for (auto Sym : ObjFile.symbols()) {  // work on a copy of each symbol
>   if (Sym.SectionNumber >= DebugHInsertionIndex)
>     ++Sym.SectionNumber;
>   writeSymbol(Sym);
> }
> writeSymbol(DebugHSym);
>
>
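
A self-contained sketch of the renumbering loop quoted above, assuming a
simplified symbol type. SketchSymbol and renumberForInsertedSection are
hypothetical names, not LLVM API. Section numbers are 1-based, so the special
values 0 (undefined), -1 (absolute) and -2 (debug) always compare below the
insertion index and are left alone.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a COFF symbol record; only the fields the
// renumbering needs are modelled.
struct SketchSymbol {
  std::string Name;
  int32_t SectionNumber;
};

// Returns a copy of Symbols in which every symbol that referred to a section
// at or after InsertedSectionNumber now points one section further along,
// with the new section's own symbol appended at the end.
std::vector<SketchSymbol>
renumberForInsertedSection(const std::vector<SketchSymbol> &Symbols,
                           int32_t InsertedSectionNumber,
                           const SketchSymbol &NewSectionSymbol) {
  std::vector<SketchSymbol> Out;
  Out.reserve(Symbols.size() + 1);
  for (SketchSymbol Sym : Symbols) {  // copy, the input stays untouched
    if (Sym.SectionNumber >= InsertedSectionNumber)
      ++Sym.SectionNumber;
    Out.push_back(Sym);
  }
  Out.push_back(NewSectionSymbol);
  return Out;
}
```

Appending keeps every existing symbol-table index stable. Inserting the new
symbol in the middle would additionally require patching the symbol indices
stored in relocations.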
> On Thu, Jan 25, 2018 at 9:57 AM Leonardo Santagada <santagada at gmail.com>
> wrote:
>
>> Any idea on how to create this new symbol there? I saw that there is a
>> symbol pointing to each section, but didn't understand the format, and
>> yaml2obj doesn't check it or do anything with the list.
>>
>> On Thu, Jan 25, 2018 at 6:56 PM, Leonardo Santagada <santagada at gmail.com>
>> wrote:
>>
>>> YES, THANK YOU... I WAS THINKING THIS BUT COMPLETELY FORGOT.
>>>
>>> sorry for the caps... long day of working on this, and using vs 2017,
>>> which adds a new section type .chks64 that I couldn't find documentation
>>> for anywhere, was difficult. I highly recommend that everyone just not use
>>> vs 2017 until 15.8 or something, our internal bug list is gigantic.
>>>
>>> On Thu, Jan 25, 2018 at 6:52 PM, Zachary Turner <zturner at google.com>
>>> wrote:
>>>
>>>> Actually I already have a theory that even though you are adding the
>>>> section to the section table, you might not be adding a *symbol* for the
>>>> section to the symbol table.  So the existing symbols (which reference
>>>> sections by index) will all be wrong because you've inserted a new
>>>> section.  Still though, obj2yaml would expose that.
>>>>
>>>> On Thu, Jan 25, 2018 at 9:50 AM Zachary Turner <zturner at google.com>
>>>> wrote:
>>>>
>>>>> Yea as long as you compare clang-cl object file with automatically
>>>>> generated .debug$H section against clang-cl object file without .debug$H
>>>>> but added after the fact with llvm-objcopy, that should expose the problem
>>>>> I think when you run obj2yaml on them.
>>>>>
>>>>> On Thu, Jan 25, 2018 at 9:49 AM Leonardo Santagada <santagada at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I did reorder my sections, so that .debug$H is in the correct place,
>>>>>> but now I get some errors on duplicate symbols. I created a folder with
>>>>>> examples:
>>>>>>
>>>>>> https://www.dropbox.com/sh/nmvzi44pi0boe76/AAA0f47O5PCJ9JiUc6wVuwBra?dl=0
>>>>>>
>>>>>> t.obj is generated by vs 2015 and it links fine with lld-link.exe,
>>>>>> but tout.obj gives these errors:
>>>>>>
>>>>>> lld-link.exe /DEBUG:GHASH tout.obj
>>>>>> LLD-LINK.EXE: error: duplicate symbol: __local_stdio_printf_options
>>>>>> in tout.obj and in LIBCMT.lib(default_local_stdio_options.obj)
>>>>>> LLD-LINK.EXE: error: duplicate symbol: __local_stdio_printf_options
>>>>>> in tout.obj and in libvcruntime.lib(undname.obj)
>>>>>>
>>>>>> I'm using PEView from http://wjradburn.com/software/ to look at the
>>>>>> files and can't see anything wrong, except some valid differences in the
>>>>>> offsets being used for the data (so pointer to data is different between
>>>>>> them).
>>>>>>
>>>>>> I will look into yaml2obj now to see if I see anything else weird
>>>>>> going on.
>>>>>>
>>>>>>
>>>>>> On Thu, Jan 25, 2018 at 6:41 PM, Zachary Turner <zturner at google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I'm pretty confident that cl is not putting anything strange in the
>>>>>>> .debug$T sections.  We've done a lot of testing and never seen anything
>>>>>>> except CodeView type records in a .debug$T.  My hunch is that your objcopy
>>>>>>> patch is probably not doing the right thing in one or more of the section
>>>>>>> headers, and this is confusing the linker.
>>>>>>>
>>>>>>> One idea might be to build a simple object file with clang-cl but
>>>>>>> without the magic -mllvm -emit-codeview-ghash-section, then run your
>>>>>>> llvm-objcopy on it.  Then build the same object file passing -mllvm
>>>>>>> -emit-codeview-ghash-section.  Then run obj2yaml on both and diff the
>>>>>>> results.  They should be byte-for-byte identical.  That should give you a
>>>>>>> clue about if objcopy is doing something wrong.
>>>>>>>
>>>>>>> On Thu, Jan 25, 2018 at 2:21 AM Leonardo Santagada <santagada at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Don't worry, I definitely want to perfect this to generate legal obj
>>>>>>>> files, this is just to speed up testing.
>>>>>>>>
>>>>>>>> Now after patching all the obj files I get these errors when linking
>>>>>>>> a small part of our code base (msvc 2017 15.5.3, lld and llvm-objcopy
>>>>>>>> 7.0.0):
>>>>>>>> lld-link.exe : error : relocation against symbol in discarded
>>>>>>>> section: $LN8
>>>>>>>> lld-link.exe : error : relocation against symbol in discarded
>>>>>>>> section: $LN43
>>>>>>>> lld-link.exe : error : relocation against symbol in discarded
>>>>>>>> section: $LN37
>>>>>>>>
>>>>>>>> I'm starting to guess that cl.exe might be putting some random
>>>>>>>> comdat or other discardable symbols in the .debug$T and clang doesn't? I
>>>>>>>> will try to debug this and see what more I can uncover.
>>>>>>>>
>>>>>>>> Linking works perfectly without my llvm-objcopy pass to add
>>>>>>>> .debug$H?
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Jan 25, 2018 at 1:53 AM, Zachary Turner <zturner at google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> It might not influence LLD, but at the same time we don't want to
>>>>>>>>> upstream something that is producing technically illegal COFF files.  Also
>>>>>>>>> good to hear about the planned changes to your header files.  Looking
>>>>>>>>> forward to hearing about your experiences with clang-cl.
>>>>>>>>>
>>>>>>>>> On Wed, Jan 24, 2018 at 10:41 AM Leonardo Santagada <santagada at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I finally got my first .obj file patched with .debug$H to look
>>>>>>>>>> somewhat right. I added the new section at the end of the file so I don't
>>>>>>>>>> have to recalculate all sections (although now I probably could position it
>>>>>>>>>> in the middle, knowing that each section is: SizeOfRawData +
>>>>>>>>>> (last.Header.NumberOfRelocations * (4+4+2)) and the $H needs to come
>>>>>>>>>> right after $T in the file). That, although illegal based on the coff
>>>>>>>>>> specs, doesn't seem like it's going to influence lld.
>>>>>>>>>>
>>>>>>>>>> Also we talked and we are probably going to do something similar
>>>>>>>>>> to a bunch of windows defines and a check for our own define (to guarantee
>>>>>>>>>> that no one imported windows.h before win32.h) and drop the namespace and
>>>>>>>>>> the conflicting names.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Jan 23, 2018 at 12:46 AM, Zachary Turner <zturner at google.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> That's very possible that a 3rd party indirect header include is
>>>>>>>>>>> involved.  One idea might be like I suggested where you #define _WINDOWS_
>>>>>>>>>>> in win32.h and guarantee that it's always included first.  Then those other
>>>>>>>>>>> headers won't be able to #include <windows.h>.  But it will probably
>>>>>>>>>>> greatly expand the amount of stuff you have to add to win32.h, as you will
>>>>>>>>>>> probably find some callers of functions that aren't yet in your win32.h
>>>>>>>>>>> that you'd have to add.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jan 22, 2018 at 3:28 PM Leonardo Santagada <santagada at gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Ok some information was lost on getting this example to you,
>>>>>>>>>>>> I'm sorry for not being clear.
>>>>>>>>>>>>
>>>>>>>>>>>> We have a huge code base, let's say 90% of it doesn't include
>>>>>>>>>>>> either header, 9% include win32.h and 1% includes both, I will try to
>>>>>>>>>>>> discover why, but my guess is they include both a third party that includes
>>>>>>>>>>>> windows.h and some of our libs that use win32.h.
>>>>>>>>>>>>
>>>>>>>>>>>> I will try to fully understand this tomorrow.
>>>>>>>>>>>>
>>>>>>>>>>>> I guess clang will not implement this ever so finishing the
>>>>>>>>>>>> object copier is the best solution until all code is ported to clang.
>>>>>>>>>>>>
>>>>>>>>>>>> On 23 Jan 2018 00:02, "Zachary Turner" <zturner at google.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> You said win32.h doesn't include windows.h, but main.cpp
>>>>>>>>>>>>> does.  So what's the disadvantage of just including it in win32.h anyway,
>>>>>>>>>>>>> since it's already going to be in every translation unit?  (Unless you
>>>>>>>>>>>>> didn't mean to #include it in main.cpp)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I guess all I can do is warn you how bad of an idea this is.
>>>>>>>>>>>>> For starters, I already found a bug in your code ;-)
>>>>>>>>>>>>>
>>>>>>>>>>>>> // stdint.h
>>>>>>>>>>>>> typedef int                int32_t;
>>>>>>>>>>>>>
>>>>>>>>>>>>> // winnt.h
>>>>>>>>>>>>> typedef long LONG;
>>>>>>>>>>>>>
>>>>>>>>>>>>> // windef.h
>>>>>>>>>>>>> typedef struct tagPOINT
>>>>>>>>>>>>> {
>>>>>>>>>>>>>     LONG  x;   // long x
>>>>>>>>>>>>>     LONG  y;   // long y
>>>>>>>>>>>>> } POINT, *PPOINT, NEAR *NPPOINT, FAR *LPPOINT;
>>>>>>>>>>>>>
>>>>>>>>>>>>> // win32.h
>>>>>>>>>>>>> typedef int32_t LONG;
>>>>>>>>>>>>>
>>>>>>>>>>>>> struct POINT
>>>>>>>>>>>>> {
>>>>>>>>>>>>> LONG x;   // int x
>>>>>>>>>>>>> LONG y;   // int y
>>>>>>>>>>>>> };
>>>>>>>>>>>>>
>>>>>>>>>>>>> So POINT is defined two different ways.  In your minimal
>>>>>>>>>>>>> interface, it's declared as 2 int32's, which are int.  In the actual
>>>>>>>>>>>>> Windows header files, it's declared as 2 longs.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This might seem like an unimportant bug since int and long are
>>>>>>>>>>>>> the same size, but int and long also mangle differently and affect overload
>>>>>>>>>>>>> resolution, so you could have weird linker errors or call the wrong
>>>>>>>>>>>>> function overload.
>>>>>>>>>>>>>
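
To make the int vs. long point concrete, here is a small sketch showing that
the two are distinct types for overload resolution and for struct identity,
even on targets where both are 32 bits wide. The names describe, PointFromInt
and PointFromLong are invented for this illustration and do not come from
the Windows headers.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// int and long are different types regardless of their sizes.
static_assert(!std::is_same<int, long>::value,
              "int and long are distinct types");

// Two overloads that differ only in int vs. long are both legal, and the
// argument's type decides which one gets called.
std::string describe(int) { return "int"; }
std::string describe(long) { return "long"; }

// Structs built from those members are distinct types too, which is why the
// two POINT definitions quoted above do not describe the same type.
struct PointFromInt { int x; int y; };
struct PointFromLong { long x; long y; };
static_assert(!std::is_same<PointFromInt, PointFromLong>::value,
              "same shape, different types");
```

A function declared against one definition and defined against the other
would therefore mangle to two different symbol names.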
>>>>>>>>>>>>> Plus, it illustrates the fact that this struct *actually is* a
>>>>>>>>>>>>> different type from the one in the windows header.
>>>>>>>>>>>>>
>>>>>>>>>>>>> You said at the end that you never intentionally import
>>>>>>>>>>>>> win32.h and windows.h from the same translation unit.  But then in this
>>>>>>>>>>>>> example you did.  I wonder if you could enforce that by doing this:
>>>>>>>>>>>>>
>>>>>>>>>>>>> // win32.h
>>>>>>>>>>>>> #pragma once
>>>>>>>>>>>>>
>>>>>>>>>>>>> // Error if windows.h was included before us.
>>>>>>>>>>>>> #if defined(_WINDOWS_)
>>>>>>>>>>>>> #error "You're including win32.h after having already included windows.h.  Don't do this!"
>>>>>>>>>>>>> #endif
>>>>>>>>>>>>>
>>>>>>>>>>>>> // And also make sure windows.h can't get included after us
>>>>>>>>>>>>> #define _WINDOWS_
>>>>>>>>>>>>>
>>>>>>>>>>>>> For the record, I tried the test case you linked when
>>>>>>>>>>>>> windows.h is not included in main.cpp and it works (but still has the bug
>>>>>>>>>>>>> about int and long).
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 2:23 PM Leonardo Santagada <santagada at gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> It is super gross, but we copy parts of windows.h because
>>>>>>>>>>>>>> having all of it is both gigantic and very very messy. So our win32.h has a
>>>>>>>>>>>>>> couple thousand lines and not 30k+ for windows.h and we try to have
>>>>>>>>>>>>>> zero macros. Win32.h doesn't include windows.h so using ::BOOL wouldn't
>>>>>>>>>>>>>> work. We don't want to create a namespace, we just want a cleaner interface
>>>>>>>>>>>>>> to windows api. The namespace with c linkage is the way to trick cl into
>>>>>>>>>>>>>> allowing us to in some files have both windows.h and Win32.h. I really
>>>>>>>>>>>>>> don't see any way for us to have this Win32.h without this cl support, so
>>>>>>>>>>>>>> maybe we should either put windows.h in a compiled header somewhere and not
>>>>>>>>>>>>>> care that it is infecting everything or just have one place we can call to
>>>>>>>>>>>>>> clean up after including windows.h (a massive set of undefs).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So using can't work, because we never intentionally import
>>>>>>>>>>>>>> windows.h and win32.h on the same translation unit.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 7:08 PM, Zachary Turner <zturner at google.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This is pretty gross, honestly :)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Can't you just use using declarations?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> namespace Win32 {
>>>>>>>>>>>>>>> extern "C" {
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> using ::BOOL;
>>>>>>>>>>>>>>> using ::LONG;
>>>>>>>>>>>>>>> using ::POINT;
>>>>>>>>>>>>>>> using ::LPPOINT;
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> using ::GetCursorPos;
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This works with clang-cl.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 5:39 AM Leonardo Santagada <santagada at gmail.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Here is a minimal example, we do this so we don't have
>>>>>>>>>>>>>>>> to import the whole windows api everywhere.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> https://gist.github.com/santagada/7977e929d31c629c4bf18ebb987f6be3
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sun, Jan 21, 2018 at 2:31 AM, Zachary Turner <zturner at google.com>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Clang-cl maintains compatibility with msvc even in cases
>>>>>>>>>>>>>>>>> where it’s non standards compliant (eg 2 phase name lookup), but we try to
>>>>>>>>>>>>>>>>> keep these cases few and far between.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> To help me understand your case, do you mean you copy
>>>>>>>>>>>>>>>>> windows.h and modify it? How does this lead to the same struct being
>>>>>>>>>>>>>>>>> defined twice? If i were to write this:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> struct Foo {};
>>>>>>>>>>>>>>>>> struct Foo {};
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Is this a small repro of the issue you’re talking about?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 3:44 PM Leonardo Santagada <santagada at gmail.com>
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I can totally see something like incremental linking with
>>>>>>>>>>>>>>>>>> a simple padding between obj and a mapping file (which can also help with
>>>>>>>>>>>>>>>>>> edit and continue, something we also would love to have).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> We have another developer doing the port to support
>>>>>>>>>>>>>>>>>> clang-cl, but although most of our code also goes through a version of
>>>>>>>>>>>>>>>>>> clang, migrating the rest to clang-cl has been a fight. From what I heard
>>>>>>>>>>>>>>>>>> the main problem is that we have a copy of parts of windows.h (so not to
>>>>>>>>>>>>>>>>>> bring the awful parts of it like lower case macros) and that totally works
>>>>>>>>>>>>>>>>>> on cl, but clang (at least 6.0) complains about two struct/vars with the
>>>>>>>>>>>>>>>>>> same name, even though they are exactly the same. Making clang-cl as broken
>>>>>>>>>>>>>>>>>> as cl.exe is not an option I suppose? I would love to turn on a flag
>>>>>>>>>>>>>>>>>> --accept-that-cl-made-bad-decisions-and-live-with-it and
>>>>>>>>>>>>>>>>>> have this at least until this is completely fixed in our code base.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> the biggest win with moving to cl would be a better more
>>>>>>>>>>>>>>>>>> standards compliant compiler, no 1 minute compiles on heavily templated
>>>>>>>>>>>>>>>>>> files and maybe the holy grail of ThinLTO.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 10:56 PM, Zachary Turner <zturner at google.com>
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 10-15s will be hard without true incremental linking.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> At some point that's going to be the only way to get any
>>>>>>>>>>>>>>>>>>> faster, but incremental linking is hard (putting it lightly), and since our
>>>>>>>>>>>>>>>>>>> full links are already really fast we think we can get reasonably close to
>>>>>>>>>>>>>>>>>>> link.exe incremental speeds with full links.  But it's never enough and I
>>>>>>>>>>>>>>>>>>> will always want it to be faster, so you may see incremental linking in the
>>>>>>>>>>>>>>>>>>> future after we hit a performance wall with full link speed :)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> In any case, I'm definitely interested in seeing what
>>>>>>>>>>>>>>>>>>> kind of numbers you get with /debug:ghash after you get this llvm-objcopy
>>>>>>>>>>>>>>>>>>> feature implemented.  So keep me updated :)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> As an aside, have you tried building with clang instead
>>>>>>>>>>>>>>>>>>> of cl?  If you build with clang you wouldn't even have to do this
>>>>>>>>>>>>>>>>>>> llvm-objcopy work, because it would "just work".  If you've tried but ran
>>>>>>>>>>>>>>>>>>> into issues I'm interested in hearing about those too.  On the other hand,
>>>>>>>>>>>>>>>>>>> it's also reasonable to only switch one thing at a time.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 1:34 PM Leonardo Santagada <santagada at gmail.com>
>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> if we get to < 30s I think most users would prefer it
>>>>>>>>>>>>>>>>>>>> to link.exe, just hoping there are still some more optimizations to get
>>>>>>>>>>>>>>>>>>>> closer to ELF linking times (around 10-15s here).
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 9:50 PM, Zachary Turner <zturner at google.com>
>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Generally speaking a good rule of thumb is that
>>>>>>>>>>>>>>>>>>>>> /debug:ghash will be close to or faster than /debug:fastlink, but with none
>>>>>>>>>>>>>>>>>>>>> of the penalties like slow debug time
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 12:44 PM Zachary Turner <zturner at google.com>
>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Chrome is actually one of my exact benchmark cases.
>>>>>>>>>>>>>>>>>>>>>> When building blink_core.dll and browser_tests.exe, i get anywhere from a
>>>>>>>>>>>>>>>>>>>>>> 20-40% reduction in link time. We have some other optimizations in the
>>>>>>>>>>>>>>>>>>>>>> pipeline but not upstream yet.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> My best time so far (including other optimizations
>>>>>>>>>>>>>>>>>>>>>> not yet upstream) is 28s on blink_core.dll, compared to 110s with /debug
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 12:28 PM Leonardo Santagada <santagada at gmail.com>
>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 9:05 PM, Zachary Turner <zturner at google.com>
>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> You probably don't want to go down the same route
>>>>>>>>>>>>>>>>>>>>>>>> that clang goes through to write the object file.  If you think yaml2coff
>>>>>>>>>>>>>>>>>>>>>>>> is convoluted, the way clang does it will just give you a headache.  There
>>>>>>>>>>>>>>>>>>>>>>>> are multiple abstractions involved to account for different object file
>>>>>>>>>>>>>>>>>>>>>>>> formats (ELF, COFF, MachO) and output formats (Assembly, binary file).  At
>>>>>>>>>>>>>>>>>>>>>>>> least with yaml2coff
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I think your phrase got cut there, but yeah I just
>>>>>>>>>>>>>>>>>>>>>>> found AsmPrinter.cpp and it is convoluted.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> It's true that yaml2coff is using the COFFParser
>>>>>>>>>>>>>>>>>>>>>>>> structure, but if you look at the writeCOFF
>>>>>>>>>>>>>>>>>>>>>>>> function in yaml2coff it's pretty bare-metal.  The logic you need will be
>>>>>>>>>>>>>>>>>>>>>>>> almost identical, except that instead of checking the COFFParser for the
>>>>>>>>>>>>>>>>>>>>>>>> various fields, you'll check the existing COFFObjectFile, which should have
>>>>>>>>>>>>>>>>>>>>>>>> similar fields.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> The only thing you need to do differently is when
>>>>>>>>>>>>>>>>>>>>>>>> writing the section table and section contents, to insert a new entry.  Since
>>>>>>>>>>>>>>>>>>>>>>>> you're injecting a section into the middle, you'll also probably need to
>>>>>>>>>>>>>>>>>>>>>>>> push back the file pointer of all subsequent sections so that they don't
>>>>>>>>>>>>>>>>>>>>>>>> overlap.  (e.g. if the original sections are 1, 2, 3, 4, 5 and you insert
>>>>>>>>>>>>>>>>>>>>>>>> between 2 and 3, then the original sections 3, 4, and 5 would need to have
>>>>>>>>>>>>>>>>>>>>>>>> their FilePointerToRawData offset by the size of the new section).
>>>>>>>>>>>>>>>>>>>>>>>>
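
The "1, 2, 3, 4, 5, insert between 2 and 3" example above can be sketched as
follows. SketchSection and insertSection are hypothetical names. The sketch
only models raw-data offsets: it ignores relocation tables, alignment, and
the fact that the extra 40-byte section header also moves everything after
the section table, all of which a real writer would need to account for.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified section header: just a name, the file offset of
// the raw data and its size.
struct SketchSection {
  std::string Name;
  uint32_t PointerToRawData;
  uint32_t SizeOfRawData;
};

// Inserts New at position Index (0-based). The new section takes over the
// file offset of the section it displaces, and every later section is pushed
// back by the size of the new section's raw data.
void insertSection(std::vector<SketchSection> &Sections, std::size_t Index,
                   SketchSection New) {
  assert(Index <= Sections.size());
  if (Index < Sections.size())
    New.PointerToRawData = Sections[Index].PointerToRawData;
  else if (!Sections.empty())
    New.PointerToRawData =
        Sections.back().PointerToRawData + Sections.back().SizeOfRawData;
  const uint32_t Shift = New.SizeOfRawData;
  Sections.insert(Sections.begin() + static_cast<std::ptrdiff_t>(Index), New);
  for (std::size_t I = Index + 1; I < Sections.size(); ++I)
    Sections[I].PointerToRawData += Shift;
}
```

With contiguous input sections, the result still satisfies "each section
starts where the previous one ends".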
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I have the PE/COFF spec open here and I'm happy that
>>>>>>>>>>>>>>>>>>>>>>> I read a bit of it so I actually know what you are talking about... yeah it
>>>>>>>>>>>>>>>>>>>>>>> doesn't seem too complicated.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> If you need to know what values to put for the
>>>>>>>>>>>>>>>>>>>>>>>> other fields in a section header, run `dumpbin /headers foo.obj` on a
>>>>>>>>>>>>>>>>>>>>>>>> clang-generated object file that has a .debug$H section already (e.g. run
>>>>>>>>>>>>>>>>>>>>>>>> clang with -emit-codeview-ghash-section, and look at the properties of the
>>>>>>>>>>>>>>>>>>>>>>>> .debug$H section and use the same values).
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks I will do that and then also look at how the
>>>>>>>>>>>>>>>>>>>>>>> CodeView part of the code does it if I can't understand some of it.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> The only invariant that needs to be maintained is
>>>>>>>>>>>>>>>>>>>>>>>> that Section[N]->FilePointerOfRawData =
>>>>>>>>>>>>>>>>>>>>>>>> Section[N-1]->FilePointerOfRawData +
>>>>>>>>>>>>>>>>>>>>>>>> Section[N-1]->SizeOfRawData
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Well, that and all the sections need to be on the
>>>>>>>>>>>>>>>>>>>>>>> final file... But I'm hopeful.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Does anyone have times on linking a big project like
>>>>>>>>>>>>>>>>>>>>>>> chrome with this so that at least I know what kind of performance to expect?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> My numbers are something like:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> 1 pdb per obj file: link.exe takes ~15 minutes and
>>>>>>>>>>>>>>>>>>>>>>> 16GB of ram, lld-link.exe takes 2:30 minutes and ~8GB of ram
>>>>>>>>>>>>>>>>>>>>>>> around 10 pdbs per folder: link.exe takes 1 minute
>>>>>>>>>>>>>>>>>>>>>>> and 2-3GB of ram, lld-link.exe takes 1:30 minutes and ~6GB of ram
>>>>>>>>>>>>>>>>>>>>>>> fastlink: link.exe takes 40 seconds, but then 20
>>>>>>>>>>>>>>>>>>>>>>> seconds of loading at the first break point in the debugger and we lost DIA
>>>>>>>>>>>>>>>>>>>>>>> support for listing symbols.
>>>>>>>>>>>>>>>>>>>>>>> incremental: link.exe takes 8 seconds, but it only
>>>>>>>>>>>>>>>>>>>>>>> happens when very minor changes happen.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> We have a non-negligible number of symbols used on
>>>>>>>>>>>>>>>>>>>>>>> some runtime systems.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 11:52 AM Leonardo Santagada
>>>>>>>>>>>>>>>>>>>>>>>> <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the tips, I now have something that
>>>>>>>>>>>>>>>>>>>>>>>>> reads the obj file, finds .debug$T sections and global hashes it (proof of
>>>>>>>>>>>>>>>>>>>>>>>>> concept kind of code). What I can't find is: how does clang itself write
>>>>>>>>>>>>>>>>>>>>>>>>> the coff files with global hashes, as that might help me understand how to
>>>>>>>>>>>>>>>>>>>>>>>>> create the .debug$H section, how to update the file section count and how
>>>>>>>>>>>>>>>>>>>>>>>>> to properly write this back.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> The code on yaml2coff is expecting to be working
>>>>>>>>>>>>>>>>>>>>>>>>> on the yaml COFFParser struct and I'm having quite a bit of a headache
>>>>>>>>>>>>>>>>>>>>>>>>> turning the COFFObjectFile into a COFFParser object or compatible...
>>>>>>>>>>>>>>>>>>>>>>>>> Tomorrow I might try the very inefficient path of coff2yaml and then
>>>>>>>>>>>>>>>>>>>>>>>>> yaml2coff with the hashes header... but it seems way too inefficient and
>>>>>>>>>>>>>>>>>>>>>>>>> convoluted.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 10:38 PM, Zachary Turner <zturner at google.com>
>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 1:02 PM Leonardo
>>>>>>>>>>>>>>>>>>>>>>>>>> Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 9:44 PM, Zachary Turner
>>>>>>>>>>>>>>>>>>>>>>>>>>> <zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 12:29 PM Leonardo
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> No I didn't, I used cl.exe from the visual
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> studio toolchain. What I'm proposing is a tool for processing .obj files in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> COFF format, reading them and generating the GHASH part.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> To make our build faster we use hundreds of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> unity build files (.cpp's with a lot of other .cpp's in them aka munch
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> files) but still have a lot of single .cpp's as well (in total something
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> like 3.4k .obj files).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ps: sorry for sending to the wrong list, I was
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> reading about llvm mailing lists and jumped when I saw what I thought was a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> lld exclusive list.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> A tool like this would be useful, yes.  We've
>>>>>>>>>>>>>>>>>>>>>>>>>>>> talked about it internally as well and agreed it would be useful, we just
>>>>>>>>>>>>>>>>>>>>>>>>>>>> haven't prioritized it.  If you're interested in submitting a patch along
>>>>>>>>>>>>>>>>>>>>>>>>>>>> those lines though, I think it would be a good addition.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm not sure what the best place for it would
>>>>>>>>>>>>>>>>>>>>>>>>>>>> be.  llvm-readobj and llvm-objdump seem like obvious choices, but they are
>>>>>>>>>>>>>>>>>>>>>>>>>>>> intended to be read-only, so perhaps they wouldn't be a good fit.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> llvm-pdbutil is kind of a hodgepodge of
>>>>>>>>>>>>>>>>>>>>>>>>>>>> everything else related to PDBs and symbols, so I wouldn't be opposed to
>>>>>>>>>>>>>>>>>>>>>>>>>>>> making a new subcommand there called "ghash" or something that could
>>>>>>>>>>>>>>>>>>>>>>>>>>>> process an object file and output a new object file with a .debug$H section.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> A third option would be to make a new tool for
>>>>>>>>>>>>>>>>>>>>>>>>>>>> it.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> I don't think it would be that hard to write.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> If you're interested in trying to make a patch for this, I can offer some
>>>>>>>>>>>>>>>>>>>>>>>>>>>> guidance on where to look in the code.  Otherwise it's something that we'll
>>>>>>>>>>>>>>>>>>>>>>>>>>>> probably get to, I'm just not sure when.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> I would love to write it and contribute it back,
>>>>>>>>>>>>>>>>>>>>>>>>>>> please do tell, I did find some of the code of ghash in lld, but I'm fuzzy
>>>>>>>>>>>>>>>>>>>>>>>>>>> on the llvm codeview part of it and have never seen llvm-readobj/objdump or
>>>>>>>>>>>>>>>>>>>>>>>>>>> llvm-pdbutil, but I'm not afraid to look :)
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Luckily all of the important code is hidden behind library calls, and it should already just do the right thing, so I suspect you won't need to know much about CodeView to do this.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I think Peter has the right idea about putting this in llvm-objcopy.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> You can look at one of the existing CopyBinary functions there, which currently only work for ELF, but you can just make a new overload that accepts a COFFObjectFile.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I would probably start by iterating over each of the sections (getNumberOfSections / getSectionName) looking for .debug$T and .debug$H sections.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> If you find a .debug$H section then you can just skip that object file.
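
To make the section scan described above concrete, here is a minimal standalone sketch in Python. It is not the llvm-objcopy implementation and does not use the LLVM library. It reads the standard 20-byte COFF file header and the 40-byte section headers directly. The function names (section_names, needs_debug_h) are hypothetical, and the sketch assumes a regular (non-/bigobj) object whose section names fit in the 8-byte name field, which is the case for .debug$T and .debug$H.

```python
import struct

# Layouts from the PE/COFF format: 20-byte file header, 40-byte section header.
COFF_HEADER = struct.Struct("<HHIIIHH")
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")


def section_names(obj: bytes) -> list[str]:
    """Return the short section names of a COFF object, in table order."""
    fields = COFF_HEADER.unpack_from(obj, 0)
    number_of_sections = fields[1]
    size_of_optional_header = fields[5]
    # The section table follows the file header and any optional header.
    offset = COFF_HEADER.size + size_of_optional_header
    names = []
    for _ in range(number_of_sections):
        raw_name = SECTION_HEADER.unpack_from(obj, offset)[0]
        # Names shorter than 8 bytes are NUL-padded; 8-byte names are not.
        names.append(raw_name.rstrip(b"\0").decode("ascii", "replace"))
        offset += SECTION_HEADER.size
    return names


def needs_debug_h(obj: bytes) -> bool:
    """True when the object has type records but no hash section yet."""
    names = section_names(obj)
    return ".debug$T" in names and ".debug$H" not in names
```

An object for which needs_debug_h returns False is either already hashed or has no type records, so it can be copied through unchanged.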
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> If you find a .debug$T but not a .debug$H, then basically do the same thing that LLD does in PDBLinker::mergeDebugT (create a CVTypeArray, and pass it to GloballyHashedType::hashTypes).  That will return an array of hash values (the format of .debug$H is the header, followed by the hash values).  Then when you're writing the list of sections, just add in the .debug$H section right after the .debug$T section.
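
As a rough illustration of "the header, followed by the hash values", the sketch below packs a .debug$H payload and places the new section right after .debug$T in a list of section names. Several things here are assumptions to verify against the LLVM sources before relying on them: the header layout (a 32-bit magic, a 16-bit version, a 16-bit hash algorithm), the constants 0x133C9C5, 0 and 0, and the 20-byte SHA-1 hash size. The helper names are hypothetical, and the hashes are treated as opaque bytes because the real values have to come from GloballyHashedType::hashTypes.

```python
import hashlib
import struct

# Assumed values, recalled from LLVM's COFF/CodeView definitions; verify them.
DEBUG_H_MAGIC = 0x133C9C5
DEBUG_H_VERSION = 0
HASH_ALGORITHM_SHA1 = 0
SHA1_SIZE = 20


def build_debug_h(hashes: list[bytes]) -> bytes:
    """Serialize a .debug$H payload: fixed header, then one hash per type."""
    for h in hashes:
        if len(h) != SHA1_SIZE:
            raise ValueError("each hash must be %d bytes" % SHA1_SIZE)
    header = struct.pack(
        "<IHH", DEBUG_H_MAGIC, DEBUG_H_VERSION, HASH_ALGORITHM_SHA1
    )
    return header + b"".join(hashes)


def insert_after_debug_t(names: list[str]) -> list[str]:
    """Return a new section order with .debug$H right after .debug$T."""
    i = names.index(".debug$T")
    return names[: i + 1] + [".debug$H"] + names[i + 1 :]


# Placeholder digests only; these are NOT real global type hashes.
placeholder = [hashlib.sha1(b"record-%d" % n).digest() for n in range(3)]
payload = build_debug_h(placeholder)
order = insert_after_debug_t([".text", ".debug$S", ".debug$T", ".data"])
```

Inserting a section in the middle of the table shifts the index of every section after it, so, as noted elsewhere in this thread, any symbols that refer to sections by index have to be renumbered as well.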
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Currently llvm-objcopy only writes ELF files, so it would need to be taught to write COFF files.  We have code to do this in the yaml2obj utility (specifically, in yaml2coff.cpp in the function writeCOFF).  There may be a way to move this code to somewhere else (llvm/Object/COFF.h?) so that it can be re-used by both yaml2coff and llvm-objcopy, but in the worst case scenario you could copy the code and re-write it to work with these new structures.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Lastly, you'll probably want to put all of this behind an option in llvm-objcopy such as -add-codeview-ghash-section
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>>> Leonardo Santagada
>>>>>>>>>>>>>>>>>>>>>>>>>

-- 

Leonardo Santagada
