thr3ads.net - llvm dev - [llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler) [Jan 2018]

Zachary Turner via llvm-dev

2018-Jan-26 17:49 UTC

[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)

(Ignore the fact that my hashes are 8 bytes in the "good" file; this is due
to some local changes I've been experimenting with.)

On Fri, Jan 26, 2018 at 9:48 AM Zachary Turner <zturner at google.com>
wrote:
> I did this:
>
> // a.cpp
> static int x = 0;
> void b(int);
> void a(int) {
>   if (x)
>     b(x);
> }
> int main(int argc, char **argv) {
>   a(argc);
>   return x;
> }
>
>
> clang-cl /Z7 /c a.cpp /Foa.noghash.obj
> clang-cl /Z7 /c a.cpp -mllvm -emit-codeview-ghash-section /Foa.ghash.good.obj
> llvm-objcopy a.noghash.obj a.ghash.bad.obj
> obj2yaml a.ghash.good.obj > a.ghash.good.yaml
> obj2yaml a.ghash.bad.obj > a.ghash.bad.yaml
>
> Then open these 2 yaml files up in a diff viewer.  It looks like the
> hashes aren't getting emitted at all.  For example, in the good yaml file I
> see this:
>
>   - Name:            '.debug$H'
>     Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_DISCARDABLE, IMAGE_SCN_MEM_READ ]
>     Alignment:       4
>     SectionData:     C5C93301000001005549419E78044E3896D45CD7009428758BE4A1E2B3E022BA267DEE221F5C42B17BCA182AF84584814A8B5E7E3FB17B397A9E3DEA75CD5627
>     GlobalHashes:
>       Version:         0
>       HashAlgorithm:   1
>       HashValues:
>         - 5549419E78044E38
>         - 96D45CD700942875
>         - 8BE4A1E2B3E022BA
>         - 267DEE221F5C42B1
>         - 7BCA182AF8458481
>         - 4A8B5E7E3FB17B39
>         - 7A9E3DEA75CD5627
>   - Name:            .pdata
>
> And in the bad yaml file I see this:
>   - Name:            '.debug$H'
>     Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_DISCARDABLE, IMAGE_SCN_MEM_READ ]
>     Alignment:       4
>     SectionData:     C5C9330100000000
>     GlobalHashes:
>       Version:         0
>       HashAlgorithm:   0
>   - Name:            .pdata
>
> Don't focus too much on trying to figure out weird linker errors.  Just
> get the output of obj2yaml to be identical when run under a diff utility,
> then everything should work fine.
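For readers comparing the two dumps above, the difference can be read straight off the SectionData bytes. The following is only a sketch: it assumes the layout suggested by the dumps (a 4-byte magic, a 2-byte version, a 2-byte hash algorithm, then the hash values), and the names DebugHHeader, DebugHInfo and parseDebugH are made up for illustration, not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical types; the field layout is inferred from the obj2yaml dumps.
struct DebugHHeader {
  uint32_t Magic;         // bytes C5 C9 33 01, read little-endian
  uint16_t Version;
  uint16_t HashAlgorithm;
};

struct DebugHInfo {
  DebugHHeader Header;
  std::size_t HashBytes;  // bytes of hash data after the 8-byte header
};

// Convert one hex digit to its numeric value.
static unsigned hexDigit(char C) {
  if (C >= '0' && C <= '9') return C - '0';
  if (C >= 'A' && C <= 'F') return C - 'A' + 10;
  if (C >= 'a' && C <= 'f') return C - 'a' + 10;
  throw std::invalid_argument("not a hex digit");
}

// Decode an obj2yaml SectionData hex string into raw bytes.
static std::vector<uint8_t> fromHex(const std::string &Hex) {
  if (Hex.size() % 2 != 0)
    throw std::invalid_argument("odd number of hex digits");
  std::vector<uint8_t> Bytes;
  for (std::size_t I = 0; I < Hex.size(); I += 2)
    Bytes.push_back(
        static_cast<uint8_t>(hexDigit(Hex[I]) * 16 + hexDigit(Hex[I + 1])));
  return Bytes;
}

// Split the section into its header fields and the size of the hash payload.
DebugHInfo parseDebugH(const std::string &SectionDataHex) {
  std::vector<uint8_t> B = fromHex(SectionDataHex);
  assert(B.size() >= 8 && "section shorter than the 8-byte header");
  DebugHInfo Info;
  Info.Header.Magic =
      B[0] | (B[1] << 8) | (B[2] << 16) | (static_cast<uint32_t>(B[3]) << 24);
  Info.Header.Version = static_cast<uint16_t>(B[4] | (B[5] << 8));
  Info.Header.HashAlgorithm = static_cast<uint16_t>(B[6] | (B[7] << 8));
  Info.HashBytes = B.size() - 8;
  return Info;
}
```

Feeding it the "bad" SectionData (C5C9330100000000) gives a valid magic but algorithm 0 and no hash bytes, which matches what the diff shows.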
>
> On Fri, Jan 26, 2018 at 7:27 AM Leonardo Santagada <santagada at gmail.com>
> wrote:
>
>> I'm so close I can almost smell it :)
>>
>> I know how bad the code looks, and I don't intend to submit this, but if you
>> want to try it out, it's at:
>> https://gist.github.com/santagada/544136b1ee143bf31653b1158ac6829e
>>
>> I'm seeing: lld-link.exe: error: duplicate symbol: "<redacted_unmangled>"
>> (<redacted>) in <internal> and in <redacted_filename>.obj. Looking at the
>> .yaml dump, the symbols are all similar to this:
>>
>> - Name: <redacted>
>> Value: 0
>> SectionNumber: 0
>> SimpleType: IMAGE_SYM_TYPE_NULL
>> ComplexType: IMAGE_SYM_DTYPE_FUNCTION
>> StorageClass: IMAGE_SYM_CLASS_WEAK_EXTERNAL
>> WeakExternal:
>> TagIndex: 134
>> Characteristics: IMAGE_WEAK_EXTERN_SEARCH_LIBRARY
>>
>> On Thu, Jan 25, 2018 at 8:01 PM, Zachary Turner <zturner at google.com>
>> wrote:
>>
>>> I haven't really dabbled in this part of the COFF format personally, so
>>> hopefully I'm not leading you astray :)
>>>
>>> But I checked the code for coff2yaml, and I see this:
>>>
>>>       } else if (Symbol.isSectionDefinition()) {
>>>         // This symbol represents a section definition.
>>>         assert(Symbol.getNumberOfAuxSymbols() == 1 &&
>>>                "Expected a single aux symbol to describe this section!");
>>>         const object::coff_aux_section_definition *ObjSD =
>>>             reinterpret_cast<const object::coff_aux_section_definition *>(
>>>                 AuxData.data());
>>>
>>> So it looks like you need exactly 1 aux symbol for each section symbol.
>>>
>>> I then scrolled up in this function to figure out where AuxData comes
>>> from, and it comes from COFFObjectFile::getSymbolAuxData.  I think that
>>> function holds the clue to what you need to do.  It looks like you need to
>>> set coff::symbol::NumberOfAuxSymbols to 1, and then there is a comment in
>>> getSymbolAuxData which says:
>>>
>>>     // AUX data comes immediately after the symbol in COFF
>>>     Aux = reinterpret_cast<const uint8_t *>(Symbol.getRawPtr()) + SymbolSize;
>>>
>>> So I think you just need to write the bytes immediately after the
>>> coff::symbol.  The thing you need to write looks like a
>>> coff::coff_aux_section_definition structure.
>>>
>>> For the CheckSum, look at WinCOFFObjectWriter::writeSection.  It looks
>>> like it's a CRC32 of the actual section contents, which you can generate
>>> with a couple of lines of code:
>>>
>>>   JamCRC JC(/*Init=*/0);
>>>   JC.update(DebugHContents);
>>>   AuxSymbol.CheckSum = JC.getCRC();
>>>
>>> Hope this helps
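As a standalone illustration of the checksum step above, here is a sketch with no LLVM dependency. It assumes that llvm::JamCRC behaves like the standard reflected CRC-32 (polynomial 0xEDB88320) without the final inversion; jamCrcUpdate, AuxSectionDefinition and makeAux are hypothetical names and the struct is a simplified stand-in, not the real coff_aux_section_definition.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bitwise CRC-32 update (reflected polynomial 0xEDB88320) with no final XOR.
// Hypothetical stand-in for llvm::JamCRC, written out for illustration.
uint32_t jamCrcUpdate(uint32_t Crc, const uint8_t *Data, std::size_t Size) {
  assert(Data != nullptr || Size == 0);
  for (std::size_t I = 0; I < Size; ++I) {
    Crc ^= Data[I];
    for (int Bit = 0; Bit < 8; ++Bit)
      Crc = (Crc >> 1) ^ (0xEDB88320u & (0u - (Crc & 1u)));
  }
  return Crc;
}

// Simplified, hypothetical aux record holding only the fields discussed here.
struct AuxSectionDefinition {
  uint32_t Length;
  uint16_t NumberOfRelocations;
  uint16_t NumberOfLinenumbers;
  uint32_t CheckSum;
};

// Fill in Length and CheckSum for a section's raw contents, starting the CRC
// from 0 as in the snippet quoted above.
AuxSectionDefinition makeAux(const uint8_t *Contents, uint32_t Size) {
  AuxSectionDefinition Aux{};
  Aux.Length = Size;
  Aux.CheckSum = jamCrcUpdate(/*Init=*/0, Contents, Size);
  return Aux;
}
```

In a real patch you would of course call llvm::JamCRC directly, as in the quoted snippet; the loop is spelled out here only to make the "CRC without the final XOR" point concrete.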
>>>
>>> On Thu, Jan 25, 2018 at 10:46 AM Leonardo Santagada <santagada at gmail.com>
>>> wrote:
>>>
>>>>
>>>> I see that there is an aux symbol per section symbol, and in the yaml
>>>> representation there are also checksum, selection and unused fields, none
>>>> of which I know how to fill in. This aux symbol might also have some
>>>> important information I need in order to patch the other symbols. Can you
>>>> find the part of llvm that writes those? At least for the aux symbol, the
>>>> yaml part of the code treats it as a binary blob, so there is no info on
>>>> what they should be.
>>>>
>>>> On Thu, Jan 25, 2018 at 7:15 PM, Zachary Turner <zturner at google.com>
>>>> wrote:
>>>>
>>>>> If you run obj2yaml against a very simple object file, you'll see
>>>>> something like this at the end:
>>>>> ```
>>>>> symbols:
>>>>>   - Name:            '@comp.id'
>>>>>     Value:           17130443
>>>>>     SectionNumber:   -1
>>>>>     SimpleType:      IMAGE_SYM_TYPE_NULL
>>>>>     ComplexType:     IMAGE_SYM_DTYPE_NULL
>>>>>     StorageClass:    IMAGE_SYM_CLASS_STATIC
>>>>>   - Name:            '@feat.00'
>>>>>     Value:           2147484048
>>>>>     SectionNumber:   -1
>>>>>     SimpleType:      IMAGE_SYM_TYPE_NULL
>>>>>     ComplexType:     IMAGE_SYM_DTYPE_NULL
>>>>>     StorageClass:    IMAGE_SYM_CLASS_STATIC
>>>>>   - Name:            .drectve
>>>>>     Value:           0
>>>>>     SectionNumber:   1
>>>>>     SimpleType:      IMAGE_SYM_TYPE_NULL
>>>>>     ComplexType:     IMAGE_SYM_DTYPE_NULL
>>>>>     StorageClass:    IMAGE_SYM_CLASS_STATIC
>>>>>     SectionDefinition:
>>>>>       Length:          47
>>>>>       NumberOfRelocations: 0
>>>>>       NumberOfLinenumbers: 0
>>>>>       CheckSum:        0
>>>>>       Number:          0
>>>>> ...
>>>>> ```
>>>>>
>>>>> There's a structure called coff::symbol which basically represents
>>>>> each one of these records.  It looks like this:
>>>>>
>>>>> ```
>>>>> struct symbol {
>>>>>   char Name[NameSize];
>>>>>   uint32_t Value;
>>>>>   int32_t SectionNumber;
>>>>>   uint16_t Type;
>>>>>   uint8_t StorageClass;
>>>>>   uint8_t NumberOfAuxSymbols;
>>>>> };
>>>>> ```
>>>>>
>>>>> So you'll need to create one for the .debug$H section and stick it into
>>>>> the list.  This particular list doesn't have to be in any special order, so
>>>>> you can just put it at the end (although it's probably not that much harder
>>>>> to insert into the middle, and it will make for a good test that you've
>>>>> done it right: the output can be diffed against a clang-cl object file and
>>>>> be identical this way).  So write all the normal symbols as you probably
>>>>> already are, then write one for the .debug$H section.  Initialize the
>>>>> fields to the same thing that you see when you run obj2yaml against an
>>>>> object file generated by clang-cl for the .debug$H section.
>>>>>
>>>>> This structure doesn't contain any kind of file pointers or offsets,
>>>>> so all you really need to fix up are the "SectionNumber" fields.  Basically
>>>>> as you are writing the existing symbols, you would do something like:
>>>>>
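>>>>> for (auto &Sym : ObjFile.symbols()) {
>>>>>   // Symbols that refer to a section at or after the insertion point
>>>>>   // must be bumped by one so they still name the same section.
>>>>>   if (Sym.SectionNumber >= DebugHInsertionIndex)
>>>>>     ++Sym.SectionNumber;
>>>>>   writeSymbol(Sym);
>>>>> }
>>>>> writeSymbol(DebugHSym);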
>>>>>
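To make the renumbering above concrete, here is a small self-contained sketch. The Symbol struct is a simplified stand-in that keeps only SectionNumber, and renumberSymbols is a hypothetical helper, not an LLVM function. It assumes 1-based section numbers, with 0 and negative values reserved for special symbols such as '@comp.id'.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for the symbol record; not the real LLVM type.
struct Symbol {
  int32_t SectionNumber; // 1-based section index; 0 and negatives are special
};

// Shift every symbol that refers to a section at or after InsertionIndex,
// then append the symbol describing the newly inserted section.
std::vector<Symbol> renumberSymbols(std::vector<Symbol> Symbols,
                                    int32_t InsertionIndex) {
  assert(InsertionIndex >= 1);
  for (Symbol &Sym : Symbols)
    if (Sym.SectionNumber >= InsertionIndex)
      ++Sym.SectionNumber;
  Symbols.push_back(Symbol{InsertionIndex});
  return Symbols;
}
```

Because the comparison is against a positive index, symbols with SectionNumber 0 or -1 are left alone, which is what you want for absolute and undefined symbols.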
>>>>>
>>>>> On Thu, Jan 25, 2018 at 9:57 AM Leonardo Santagada <
>>>>> santagada at gmail.com> wrote:
>>>>>
>>>>>> Any idea on how to create this new symbol there? I saw that there is
>>>>>> a symbol pointing to each section, but didn't understand the format, and
>>>>>> yaml2obj doesn't check it or do anything with the list.
>>>>>>
>>>>>> On Thu, Jan 25, 2018 at 6:56 PM, Leonardo Santagada <
>>>>>> santagada at gmail.com> wrote:
>>>>>>
>>>>>>> YES, THANK YOU... I WAS THINKING THIS BUT COMPLETELY FORGOT.
>>>>>>>
>>>>>>> Sorry for the caps... it has been a long day of working on this, and
>>>>>>> using vs 2017, which adds a new section type .chks64 that I couldn't
>>>>>>> find documented anywhere, was difficult. I highly recommend that everyone
>>>>>>> avoid vs 2017 until 15.8 or something; our internal bug list is
>>>>>>> gigantic.
>>>>>>>
>>>>>>> On Thu, Jan 25, 2018 at 6:52 PM, Zachary Turner <zturner at google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Actually I already have a theory that even though you are adding
>>>>>>>> the section to the section table, you might not be adding a *symbol* for
>>>>>>>> the section to the symbol table.  So the existing symbols (which reference
>>>>>>>> sections by index) will all be wrong because you've inserted a new
>>>>>>>> section.  Still though, obj2yaml would expose that.
>>>>>>>>
>>>>>>>> On Thu, Jan 25, 2018 at 9:50 AM Zachary Turner <zturner at google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Yea, as long as you compare a clang-cl object file with an automatically
>>>>>>>>> generated .debug$H section against a clang-cl object file without .debug$H
>>>>>>>>> but with it added after the fact by llvm-objcopy, that should expose the
>>>>>>>>> problem, I think, when you run obj2yaml on them.
>>>>>>>>>
>>>>>>>>> On Thu, Jan 25, 2018 at 9:49 AM Leonardo Santagada <
>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> I did reorder my sections so that .debug$H is in the correct
>>>>>>>>>> place, but now I get some errors about duplicate symbols. I created a folder
>>>>>>>>>> with examples:
>>>>>>>>>>
>>>>>>>>>> https://www.dropbox.com/sh/nmvzi44pi0boe76/AAA0f47O5PCJ9JiUc6wVuwBra?dl=0
>>>>>>>>>>
>>>>>>>>>> t.obj is generated by vs 2015 and it links fine with
>>>>>>>>>> lld-link.exe, but tout.obj gives these errors:
>>>>>>>>>>
>>>>>>>>>> lld-link.exe /DEBUG:GHASH tout.obj
>>>>>>>>>> LLD-LINK.EXE: error: duplicate symbol: __local_stdio_printf_options in tout.obj and in LIBCMT.lib(default_local_stdio_options.obj)
>>>>>>>>>> LLD-LINK.EXE: error: duplicate symbol: __local_stdio_printf_options in tout.obj and in libvcruntime.lib(undname.obj)
>>>>>>>>>>
>>>>>>>>>> I'm using PEView from http://wjradburn.com/software/ to look at
>>>>>>>>>> the files and can't see anything wrong, except some valid differences in
>>>>>>>>>> the offsets being used for the data (so the pointer to data is different
>>>>>>>>>> between them).
>>>>>>>>>>
>>>>>>>>>> I will look into yaml2obj now to see if I see anything else weird
>>>>>>>>>> going on.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Jan 25, 2018 at 6:41 PM, Zachary Turner <
>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I'm pretty confident that cl is not putting anything strange in
>>>>>>>>>>> the .debug$T sections.  We've done a lot of testing and never seen anything
>>>>>>>>>>> except CodeView type records in a .debug$T.  My hunch is that your objcopy
>>>>>>>>>>> patch is probably not doing the right thing in one or more of the section
>>>>>>>>>>> headers, and this is confusing the linker.
>>>>>>>>>>>
>>>>>>>>>>> One idea might be to build a simple object file with clang-cl
>>>>>>>>>>> but without the magic -mllvm -emit-codeview-ghash-section, then run your
>>>>>>>>>>> llvm-objcopy on it.  Then build the same object file passing -mllvm
>>>>>>>>>>> -emit-codeview-ghash-section.  Then run obj2yaml on both and diff the
>>>>>>>>>>> results.  They should be byte-for-byte identical.  That should give you a
>>>>>>>>>>> clue about whether objcopy is doing something wrong.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jan 25, 2018 at 2:21 AM Leonardo Santagada <
>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Don't worry, I definitely want to perfect this to generate legal
>>>>>>>>>>>> obj files; this is just to speed up testing.
>>>>>>>>>>>>
>>>>>>>>>>>> Now after patching all the obj files I get these errors when
>>>>>>>>>>>> linking a small part of our code base (msvc 2017 15.5.3, lld and
>>>>>>>>>>>> llvm-objcopy 7.0.0):
>>>>>>>>>>>> lld-link.exe : error : relocation against symbol in discarded section: $LN8
>>>>>>>>>>>> lld-link.exe : error : relocation against symbol in discarded section: $LN43
>>>>>>>>>>>> lld-link.exe : error : relocation against symbol in discarded section: $LN37
>>>>>>>>>>>>
>>>>>>>>>>>> I'm starting to guess that cl.exe might be putting some random
>>>>>>>>>>>> comdat or other discardable symbols in the .debug$T and clang doesn't? I
>>>>>>>>>>>> will try to debug this and see what more I can uncover.
>>>>>>>>>>>>
>>>>>>>>>>>> Linking works perfectly without my llvm-objcopy pass to add
>>>>>>>>>>>> .debug$H.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jan 25, 2018 at 1:53 AM, Zachary Turner <
>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> It might not influence LLD, but at the same time we don't want
>>>>>>>>>>>>> to upstream something that is producing technically illegal COFF files.
>>>>>>>>>>>>> Also good to hear about the planned changes to your header files.  Looking
>>>>>>>>>>>>> forward to hearing about your experiences with clang-cl.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Jan 24, 2018 at 10:41 AM Leonardo Santagada <
>>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I finally got my first .obj file patched with .debug$H to
>>>>>>>>>>>>>> look somewhat right. I added the new section at the end of the file so I
>>>>>>>>>>>>>> don't have to recalculate all sections (although now I probably could
>>>>>>>>>>>>>> position it in the middle, knowing that each section occupies SizeOfRawData +
>>>>>>>>>>>>>> (last.Header.NumberOfRelocations * (4+4+2)) bytes and that $H needs to come
>>>>>>>>>>>>>> right after $T in the file). Although that is illegal according to the COFF
>>>>>>>>>>>>>> spec, it doesn't seem like it's going to influence lld.
>>>>>>>>>>>>>>
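The size formula mentioned above can be written down as a tiny helper. This is only a sketch: SectionHeader is a hypothetical two-field subset of the real COFF section header, sectionFootprint is a made-up name, and the 4+4+2 breakdown assumes the standard COFF relocation record (VirtualAddress, SymbolTableIndex, Type). Line numbers and alignment padding are ignored.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical subset of a COFF section header; only the fields used here.
struct SectionHeader {
  uint32_t SizeOfRawData;
  uint16_t NumberOfRelocations;
};

// One relocation record: 4 (VirtualAddress) + 4 (SymbolTableIndex) + 2 (Type).
constexpr uint32_t RelocationSize = 4 + 4 + 2;

// Bytes a section occupies in the file: raw data plus its relocation table.
constexpr uint32_t sectionFootprint(const SectionHeader &S) {
  return S.SizeOfRawData + S.NumberOfRelocations * RelocationSize;
}
```

With this, the file offset where a section inserted after $T would start is the $T section's data offset plus sectionFootprint of $T, assuming the relocations are stored directly after the raw data as described in the message.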
>>>>>>>>>>>>>> Also, we talked and we are probably going to do something
>>>>>>>>>>>>>> similar, with a bunch of windows defines and a check for our own define (to
>>>>>>>>>>>>>> guarantee that no one imported windows.h before win32.h), and drop the
>>>>>>>>>>>>>> namespace and the conflicting names.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Jan 23, 2018 at 12:46 AM, Zachary Turner <
>>>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It's very possible that a 3rd party indirect header
>>>>>>>>>>>>>>> include is involved.  One idea might be like I suggested, where you #define
>>>>>>>>>>>>>>> _WINDOWS_ in win32.h and guarantee that it's always included first.  Then
>>>>>>>>>>>>>>> those other headers won't be able to #include <windows.h>.  But it will
>>>>>>>>>>>>>>> probably greatly expand the amount of stuff you have to add to win32.h, as
>>>>>>>>>>>>>>> you will probably find some callers of functions that aren't yet in your
>>>>>>>>>>>>>>> win32.h that you'd have to add.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 3:28 PM Leonardo Santagada <
>>>>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> OK, some information was lost in getting this example to
>>>>>>>>>>>>>>>> you. I'm sorry for not being clear.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> We have a huge code base. Let's say 90% of it doesn't
>>>>>>>>>>>>>>>> include either header, 9% includes win32.h and 1% includes both. I will try
>>>>>>>>>>>>>>>> to discover why, but my guess is that those files include both a third-party
>>>>>>>>>>>>>>>> header that includes windows.h and some of our libs that use win32.h.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I will try to fully understand this tomorrow.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I guess clang will never implement this, so finishing the
>>>>>>>>>>>>>>>> object copier is the best solution until all code is ported to clang.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 23 Jan 2018 00:02, "Zachary Turner" <zturner at google.com>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> You said win32.h doesn't include windows.h, but main.cpp
>>>>>>>>>>>>>>>>> does.  So what's the disadvantage of just including it in win32.h anyway,
>>>>>>>>>>>>>>>>> since it's already going to be in every translation unit?  (Unless you
>>>>>>>>>>>>>>>>> didn't mean to #include it in main.cpp)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I guess all I can do is warn you how bad of an idea this
>>>>>>>>>>>>>>>>> is.  For starters, I already found a bug in your code ;-)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> // stdint.h
>>>>>>>>>>>>>>>>> typedef int int32_t;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> // winnt.h
>>>>>>>>>>>>>>>>> typedef long LONG;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> // windef.h
>>>>>>>>>>>>>>>>> typedef struct tagPOINT
>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>     LONG  x;   // long x
>>>>>>>>>>>>>>>>>     LONG  y;   // long y
>>>>>>>>>>>>>>>>> } POINT, *PPOINT, NEAR *NPPOINT, FAR *LPPOINT;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> // win32.h
>>>>>>>>>>>>>>>>> typedef int32_t LONG;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> struct POINT
>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>     LONG x;   // int x
>>>>>>>>>>>>>>>>>     LONG y;   // int y
>>>>>>>>>>>>>>>>> };
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> So POINT is defined two different ways.  In your minimal
>>>>>>>>>>>>>>>>> interface, it's declared as 2 int32's, which are int.  In the actual
>>>>>>>>>>>>>>>>> Windows header files, it's declared as 2 longs.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This might seem like an unimportant bug since int and long
>>>>>>>>>>>>>>>>> are the same size, but int and long also mangle differently and affect
>>>>>>>>>>>>>>>>> overload resolution, so you could have weird linker errors or call the
>>>>>>>>>>>>>>>>> wrong function overload.
>>>>>>>>>>>>>>>>>
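A minimal illustration of the int versus long point, independent of any Windows headers. The function name which is made up for the example; the only claim relied on is that int and long are distinct types in C++ regardless of their size on a given platform.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Two distinct overloads: int and long are different types even on targets
// where both happen to be 32 bits wide.
const char *which(int) { return "int"; }
const char *which(long) { return "long"; }

// The type system never treats them as the same type, whatever their size.
static_assert(!std::is_same<int, long>::value,
              "int and long are distinct types");
```

So a POINT whose members are int and a POINT whose members are long pick different overloads and, by the same reasoning, produce different mangled names for any function that takes them.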
>>>>>>>>>>>>>>>>> Plus, it illustrates the fact that this struct *actually
>>>>>>>>>>>>>>>>> is* a different type from the one in the windows header.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> You said at the end that you never intentionally import
>>>>>>>>>>>>>>>>> win32.h and windows.h from the same translation unit.  But then in this
>>>>>>>>>>>>>>>>> example you did.  I wonder if you could enforce that by doing this:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> // win32.h
>>>>>>>>>>>>>>>>> #pragma once
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> // Error if windows.h was included before us.
>>>>>>>>>>>>>>>>> #if defined(_WINDOWS_)
>>>>>>>>>>>>>>>>> #error "You're including win32.h after having already included windows.h.  Don't do this!"
>>>>>>>>>>>>>>>>> #endif
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> // And also make sure windows.h can't get included after us
>>>>>>>>>>>>>>>>> #define _WINDOWS_
>>>>>>>>>>>>>>>>> For the record, I tried the test case you linked when
>>>>>>>>>>>>>>>>> windows.h is not included in main.cpp and it works (but still has the bug
>>>>>>>>>>>>>>>>> about int and long).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 2:23 PM Leonardo Santagada <
>>>>>>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> It is super gross, but we copy parts of windows.h because
>>>>>>>>>>>>>>>>>> having all of it is both gigantic and very, very messy. So our win32.h has a
>>>>>>>>>>>>>>>>>> couple thousand lines rather than the 30k+ of windows.h, and we try to have
>>>>>>>>>>>>>>>>>> zero macros. Win32.h doesn't include windows.h, so using ::BOOL wouldn't
>>>>>>>>>>>>>>>>>> work. We don't want to create a namespace, we just want a cleaner interface
>>>>>>>>>>>>>>>>>> to the windows api. The namespace with C linkage is the way to trick cl into
>>>>>>>>>>>>>>>>>> allowing us to have both windows.h and Win32.h in some files. I really
>>>>>>>>>>>>>>>>>> don't see any way for us to have this Win32.h without this cl support, so
>>>>>>>>>>>>>>>>>> maybe we should either put windows.h in a compiled header somewhere and not
>>>>>>>>>>>>>>>>>> care that it is infecting everything, or just have one place we can call to
>>>>>>>>>>>>>>>>>> clean up after including windows.h (a massive set of undefs).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> So using can't work, because we never intentionally
>>>>>>>>>>>>>>>>>> import windows.h and win32.h in the same translation unit.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 7:08 PM, Zachary Turner <
>>>>>>>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> This is pretty gross, honestly :)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Can't you just use using declarations?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> namespace Win32 {
>>>>>>>>>>>>>>>>>>> extern "C" {
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> using ::BOOL;
>>>>>>>>>>>>>>>>>>> using ::LONG;
>>>>>>>>>>>>>>>>>>> using ::POINT;
>>>>>>>>>>>>>>>>>>> using ::LPPOINT;
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> using ::GetCursorPos;
>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> This works with clang-cl.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 5:39 AM Leonardo Santagada <
>>>>>>>>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Here is a minimal example. We do this so we don't
>>>>>>>>>>>>>>>>>>>> have to import the whole windows api everywhere.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> https://gist.github.com/santagada/7977e929d31c629c4bf18ebb987f6be3
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sun, Jan 21, 2018 at 2:31 AM, Zachary Turner <
>>>>>>>>>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Clang-cl maintains compatibility with msvc even in
>>>>>>>>>>>>>>>>>>>>> cases where it’s non standards compliant (e.g. 2 phase name lookup), but we
>>>>>>>>>>>>>>>>>>>>> try to keep these cases few and far between.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> To help me understand your case, do you mean you copy
>>>>>>>>>>>>>>>>>>>>> windows.h and modify it? How does this lead to the same struct being
>>>>>>>>>>>>>>>>>>>>> defined twice? If I were to write this:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> struct Foo {};
>>>>>>>>>>>>>>>>>>>>> struct Foo {};
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Is this a small repro of the issue you’re talking
>>>>>>>>>>>>>>>>>>>>> about?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 3:44 PM Leonardo Santagada <
>>>>>>>>>>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I can totally see something like incremental linking
>>>>>>>>>>>>>>>>>>>>>> with a simple padding between objs and a mapping file (which can also help
>>>>>>>>>>>>>>>>>>>>>> with edit and continue, something we also would love to have).
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> We have another developer doing the port to support
>>>>>>>>>>>>>>>>>>>>>> clang-cl, but although most of our code also goes through a version of
>>>>>>>>>>>>>>>>>>>>>> clang, migrating the rest to clang-cl has been a fight. From what I heard,
>>>>>>>>>>>>>>>>>>>>>> the main problem is that we have a copy of parts of windows.h (so as not to
>>>>>>>>>>>>>>>>>>>>>> bring in the awful parts of it, like lower case macros) and that totally works
>>>>>>>>>>>>>>>>>>>>>> on cl, but clang (at least 6.0) complains about two structs/vars with the
>>>>>>>>>>>>>>>>>>>>>> same name, even though they are exactly the same. Making clang-cl as broken
>>>>>>>>>>>>>>>>>>>>>> as cl.exe is not an option I suppose? I would love to turn on a flag
>>>>>>>>>>>>>>>>>>>>>> --accept-that-cl-made-bad-decisions-and-live-with-it and have this at least
>>>>>>>>>>>>>>>>>>>>>> until this is completely fixed in our code base.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> The biggest win with moving to clang-cl would be a better,
>>>>>>>>>>>>>>>>>>>>>> more standards compliant compiler, no 1 minute compiles on heavily
>>>>>>>>>>>>>>>>>>>>>> templated files and maybe the holy grail of ThinLTO.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 10:56 PM, Zachary Turner <
>>>>>>>>>>>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> 10-15s will be hard without true incremental linking.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> At some point that's going to be the only way to get
>>>>>>>>>>>>>>>>>>>>>>> any faster, but incremental linking is hard (putting it lightly), and since
>>>>>>>>>>>>>>>>>>>>>>> our full links are already really fast we think we can get reasonably close
>>>>>>>>>>>>>>>>>>>>>>> to link.exe incremental speeds with full links.  But it's never enough and
>>>>>>>>>>>>>>>>>>>>>>> I will always want it to be faster, so you may see incremental linking in
>>>>>>>>>>>>>>>>>>>>>>> the future after we hit a performance wall with full link speed :)
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> In any case, I'm definitely interested in seeing
>>>>>>>>>>>>>>>>>>>>>>> what kind of numbers you get with /debug:ghash after you get this
>>>>>>>>>>>>>>>>>>>>>>> llvm-objcopy feature implemented.  So keep me updated :)
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> As an aside, have you tried building with clang
>>>>>>>>>>>>>>>>>>>>>>> instead of cl?  If you build with clang you wouldn't even have to do this
>>>>>>>>>>>>>>>>>>>>>>> llvm-objcopy work, because it would "just work".  If you've tried but ran
>>>>>>>>>>>>>>>>>>>>>>> into issues I'm interested in hearing about those too.  On the other hand,
>>>>>>>>>>>>>>>>>>>>>>> it's also reasonable to only switch one thing at a time.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 1:34 PM Leonardo Santagada <
>>>>>>>>>>>>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> If we get to < 30s I think most users would prefer
>>>>>>>>>>>>>>>>>>>>>>>> it to link.exe. I'm just hoping there are still some more optimizations to get
>>>>>>>>>>>>>>>>>>>>>>>> closer to ELF linking times (around 10-15s here).
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 9:50 PM, Zachary Turner <
>>>>>>>>>>>>>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Generally speaking, a good rule of thumb is that
>>>>>>>>>>>>>>>>>>>>>>>>> /debug:ghash will be close to or faster than /debug:fastlink, but with none
>>>>>>>>>>>>>>>>>>>>>>>>> of the penalties like slow debug time.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 12:44 PM Zachary Turner <
>>>>>>>>>>>>>>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Chrome is actually one of my exact benchmark
>>>>>>>>>>>>>>>>>>>>>>>>>> cases. When building blink_core.dll and browser_tests.exe, I get anywhere
>>>>>>>>>>>>>>>>>>>>>>>>>> from a 20-40% reduction in link time. We have some other optimizations in
>>>>>>>>>>>>>>>>>>>>>>>>>> the pipeline but not upstream yet.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> My best time so far (including other
>>>>>>>>>>>>>>>>>>>>>>>>>> optimizations not yet upstream) is 28s on blink_core.dll, compared to 110s
>>>>>>>>>>>>>>>>>>>>>>>>>> with /debug.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 12:28 PM Leonardo
>>>>>>>>>>>>>>>>>>>>>>>>>> Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 9:05 PM, Zachary Turner
>>>>>>>>>>>>>>>>>>>>>>>>>>> <zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> You probably don't want to go down the same
>>>>>>>>>>>>>>>>>>>>>>>>>>>> route that clang goes through to write the object file.  If you think
>>>>>>>>>>>>>>>>>>>>>>>>>>>> yaml2coff is convoluted, the way clang does it will just give you a
>>>>>>>>>>>>>>>>>>>>>>>>>>>> headache.  There are multiple abstractions involved to account for
>>>>>>>>>>>>>>>>>>>>>>>>>>>> different object file formats (ELF, COFF, MachO) and output formats
>>>>>>>>>>>>>>>>>>>>>>>>>>>> (Assembly, binary file).  At least with yaml2coff
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> I think your phrase got cut there, but yeah I
>>>>>>>>>>>>>>>>>>>>>>>>>>> just found AsmPrinter.cpp and it is convoluted.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> It's true that yaml2coff is using the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> COFFParser structure, but if you look at the writeCOFF
>>>>>>>>>>>>>>>>>>>>>>>>>>>> function in yaml2coff it's pretty bare-metal.  The logic you need will be
>>>>>>>>>>>>>>>>>>>>>>>>>>>> almost identical, except that instead of checking the COFFParser for the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> various fields, you'll check the existing COFFObjectFile, which should have
>>>>>>>>>>>>>>>>>>>>>>>>>>>> similar fields.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> The only thing you need to do differently is, when
>>>>>>>>>>>>>>>>>>>>>>>>>>>> writing the section table and section contents, to insert a new entry.  Since
>>>>>>>>>>>>>>>>>>>>>>>>>>>> you're injecting a section into the middle, you'll also probably need to
>>>>>>>>>>>>>>>>>>>>>>>>>>>> push back the file pointer of all subsequent sections so that they don't
>>>>>>>>>>>>>>>>>>>>>>>>>>>> overlap.  (e.g. if the original sections are 1, 2, 3, 4, 5 and you insert
>>>>>>>>>>>>>>>>>>>>>>>>>>>> between 2 and 3, then the original sections 3, 4, and 5 would need to have
>>>>>>>>>>>>>>>>>>>>>>>>>>>> their FilePointerToRawData offset by the size of the new section).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
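The "push back the file pointer of all subsequent sections" step can be sketched as follows. This is a deliberately simplified model: the Section struct and insertSection are hypothetical, the field is named PointerToRawData here (the message calls it FilePointerToRawData), and relocation tables, alignment, and the extra section-table entry that the new header itself adds are all ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified section record (relocations and alignment ignored).
struct Section {
  uint32_t PointerToRawData;
  uint32_t SizeOfRawData;
};

// Insert a section of NewSize bytes at position Index (0-based) and push back
// the raw-data pointer of every section that follows it.
void insertSection(std::vector<Section> &Sections, std::size_t Index,
                   uint32_t NewSize) {
  assert(Index >= 1 && Index <= Sections.size());
  const Section &Prev = Sections[Index - 1];
  // The new section's data starts where the previous section's data ends.
  Section New{Prev.PointerToRawData + Prev.SizeOfRawData, NewSize};
  for (std::size_t I = Index; I < Sections.size(); ++I)
    Sections[I].PointerToRawData += NewSize;
  Sections.insert(Sections.begin() + Index, New);
}
```

For the example in the message (sections 1 to 5, insert between 2 and 3), Index would be 2, and the original sections 3, 4 and 5 each move back by NewSize bytes.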
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> I have the PE/COFF spec open here and I'm happy
>>>>>>>>>>>>>>>>>>>>>>>>>>> that I read a bit of it so I actually know what you are talking about...
>>>>>>>>>>>>>>>>>>>>>>>>>>> yeah it doesn't seem too complicated.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> If you need to know what values to put for the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> other fields in a section header, run `dumpbin /headers foo.obj` on a
>>>>>>>>>>>>>>>>>>>>>>>>>>>> clang-generated object file that has a .debug$H section already (e.g. run
>>>>>>>>>>>>>>>>>>>>>>>>>>>> clang with -emit-codeview-ghash-section, and look at the properties of the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> .debug$H section and use the same values).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, I will do that and then also look at how
>>>>>>>>>>>>>>>>>>>>>>>>>>> the CodeView part of the code does it if I can't understand some of it.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> The only invariant that needs to be maintained
>>>>>>>>>>>>>>>>>>>>>>>>>>>> is that Section[N]->FilePointerOfRawData ==
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Section[N-1]->FilePointerOfRawData + Section[N-1]->SizeOfRawData
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Well, that and all the sections need to be in
>>>>>>>>>>>>>>>>>>>>>>>>>>> the final file... But I'm hopeful.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
Anyone has times on linking a big project like
>>>>>>>>>>>>>>>>>>>>>>>>>>>
chrome with this so that at least I know what kind of performance to expect?
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
My numbers are something like:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
1 pdb per obj file: link.exe takes ~15 minutes
>>>>>>>>>>>>>>>>>>>>>>>>>>>
and 16GB of ram, lld-link.exe takes 2:30 minutes and ~8GB of ram
>>>>>>>>>>>>>>>>>>>>>>>>>>>
around 10 pdbs per folder: link.exe takes 1
>>>>>>>>>>>>>>>>>>>>>>>>>>>
minute and 2-3GB of ram, lld-link.exe takes 1:30 minutes and ~6GB of ram
>>>>>>>>>>>>>>>>>>>>>>>>>>>
fastlink: link.exe takes 40 seconds, but then 20
>>>>>>>>>>>>>>>>>>>>>>>>>>>
seconds of loading at the first break point in the debugger and we lost DIA
>>>>>>>>>>>>>>>>>>>>>>>>>>>
support for listing symbols.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
incremental: link.exe takes 8 seconds, but it
>>>>>>>>>>>>>>>>>>>>>>>>>>>
only happens when very minor changes happen.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
We have a non-negligible number of symbols used
>>>>>>>>>>>>>>>>>>>>>>>>>>>
on some runtime systems.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
On Sat, Jan 20, 2018 at 11:52 AM Leonardo
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Thanks for the tips, I now have something that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
reads the obj file, finds .debug$T sections and global hashes it (proof of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
concept kind of code). What I can't find is: how does clang itself write
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
the coff files with global hashes, as that might help me understand how to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
create the .debug$H section, how to update the file section count and how
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
to properly write this back.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
The code on yaml2coff is expecting to be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
working on the yaml COFFParser struct and I'm having quite a bit of a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
headache turning the COFFObjectFile into a COFFParser object or
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
compatible... Tomorrow I might try the very non efficient path of coff2yaml
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
and then yaml2coff with the hashes header... but it seems way too
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
inefficient and convoluted.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
On Fri, Jan 19, 2018 at 10:38 PM, Zachary
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Turner <zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
On Fri, Jan 19, 2018 at 1:02 PM Leonardo
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
On Fri, Jan 19, 2018 at 9:44 PM, Zachary
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Turner <zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
On Fri, Jan 19, 2018 at 12:29 PM Leonardo
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Hi,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
No I didn't, I used cl.exe from the visual
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
studio toolchain. What I'm proposing is a tool for processing .obj files in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
COFF format, reading them and generating the GHASH part.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
To make our build faster we use hundreds
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
of unity build files (.cpp's with a lot of other .cpp's in them aka munch
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
files) but still have a lot of single .cpp's as well (in total something
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
like 3.4k .obj files).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
ps: sorry for sending to the wrong list, I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
was reading about llvm mailing lists and jumped when I saw what I thought
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
was a lld exclusive list.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
A tool like this would be useful, yes.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
We've talked about it internally as well and agreed it would be useful, we
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
just haven't prioritized it.  If you're interested in submitting a patch
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
along those lines though, I think it would be a good addition.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
I'm not sure what the best place for it
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
would be.  llvm-readobj and llvm-objdump seem like obvious choices, but
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
they are intended to be read-only, so perhaps they wouldn't be a good fit.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
llvm-pdbutil is kind of a hodgepodge of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
everything else related to PDBs and symbols, so I wouldn't be opposed to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
making a new subcommand there called "ghash" or something that could
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
process an object file and output a new object file with a .debug$H section.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
A third option would be to make a new tool
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
for it.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
I don't think it would be that hard to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
write.  If you're interested in trying to make a patch for this, I can
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
offer some guidance on where to look in the code.  Otherwise it's something
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
that we'll probably get to, I'm just not sure when.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
I would love to write it and contribute it
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
back, please do tell, I did find some of the code of ghash in lld, but I'm
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
fuzzy on the llvm codeview part of it and have never seen llvm-readobj/objdump
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
or llvm-pdbutil, but I'm not afraid to look :)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Luckily all of the important code is hidden
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
behind library calls, and it should already just do the right thing, so I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
suspect you won't need to know much about CodeView to do this.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
I think Peter has the right idea about
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
putting this in llvm-objcopy.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
You can look at one of the existing
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
CopyBinary functions there, which currently only work for ELF, but you can
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
just make a new overload that accepts a COFFObjectFile.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
I would probably start by iterating over each
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
of the sections (getNumberOfSections / getSectionName) looking for .debug$T
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
and .debug$H sections.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
If you find a .debug$H section then you can
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
just skip that object file.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
If you find a .debug$T but not a .debug$H,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
then basically do the same thing that LLD does in PDBLinker::mergeDebugT
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
(create a CVTypeArray, and pass it to GloballyHashedType::hashTypes).  That
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
will return an array of hash values.  (the format of .debug$H is the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
header, followed by the hash values).  Then when you're writing the list of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
sections, just add in the .debug$H section right after the .debug$T section.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Currently llvm-objcopy only writes ELF files,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
so it would need to be taught to write COFF files.  We have code to do this
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
in the yaml2obj utility (specifically, in yaml2coff.cpp in the function
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
writeCOFF).  There may be a way to move this code to somewhere else
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
(llvm/Object/COFF.h?) so that it can be re-used by both yaml2coff and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
llvm-objcopy, but in the worst case scenario you could copy the code and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
re-write it to work with these new structures.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Lastly, you'll probably want to put all of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
this behind an option in llvm-objcopy such as -add-codeview-ghash-section
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Leonardo Santagada via llvm-dev

2018-Jan-26 17:51 UTC

[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)

It looks identical to me... weird.

On Fri, Jan 26, 2018 at 6:49 PM, Zachary Turner <zturner at google.com> wrote:
> (Ignore the fact that my hashes are 8 byte in the "good" file, this is due
> to some local changes I've been experimenting with)
>
> On Fri, Jan 26, 2018 at 9:48 AM Zachary Turner <zturner at google.com> wrote:
>
>> I did this:
>>
>> // a.cpp
>> static int x = 0;
>> void b(int);
>> void a(int) {
>>   if (x)
>>     b(x);
>> }
>> int main(int argc, char **argv) {
>>   a(argc);
>>   return x;
>> }
>>
>>
>> clang-cl /Z7 /c a.cpp /Foa.noghash.obj
>> clang-cl /Z7 /c a.cpp -mllvm -emit-codeview-ghash-section /Foa.ghash.good.obj
>> llvm-objcopy a.noghash.obj a.ghash.bad.obj
>> obj2yaml a.ghash.good.obj > a.ghash.good.yaml
>> obj2yaml a.ghash.bad.obj > a.ghash.bad.yaml
>>
>> Then open these 2 yaml files up in a diff viewer.  It looks like the
>> hashes aren't getting emitted at all.  For example, in the good yaml file I
>> see this:
>>
>>   - Name:            '.debug$H'
>>     Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_DISCARDABLE, IMAGE_SCN_MEM_READ ]
>>     Alignment:       4
>>     SectionData:     C5C93301000001005549419E78044E3896D45CD7009428758BE4A1E2B3E022BA267DEE221F5C42B17BCA182AF84584814A8B5E7E3FB17B397A9E3DEA75CD5627
>>     GlobalHashes:
>>       Version:         0
>>       HashAlgorithm:   1
>>       HashValues:
>>         - 5549419E78044E38
>>         - 96D45CD700942875
>>         - 8BE4A1E2B3E022BA
>>         - 267DEE221F5C42B1
>>         - 7BCA182AF8458481
>>         - 4A8B5E7E3FB17B39
>>         - 7A9E3DEA75CD5627
>>   - Name:            .pdata
>>
>> And in the bad yaml file I see this:
>>   - Name:            '.debug$H'
>>     Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_DISCARDABLE, IMAGE_SCN_MEM_READ ]
>>     Alignment:       4
>>     SectionData:     C5C9330100000000
>>     GlobalHashes:
>>       Version:         0
>>       HashAlgorithm:   0
>>   - Name:            .pdata
>>
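Reading the two SectionData dumps above, the first eight bytes look like a 4-byte magic (C5 C9 33 01), a 2-byte version and a 2-byte hash algorithm, all little-endian, with the hash values following. A minimal sketch under that assumption; `appendLE`, `buildDebugH` and `toHex` are hypothetical helpers, not LLVM APIs, and the 8-byte hash width simply mirrors the "good" dump in this message:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Appends the low Bytes bytes of Value to Out in little-endian order.
void appendLE(std::vector<uint8_t> &Out, uint64_t Value, unsigned Bytes) {
  for (unsigned I = 0; I < Bytes; ++I)
    Out.push_back(static_cast<uint8_t>(Value >> (8 * I)));
}

// Builds a .debug$H payload: an 8-byte header followed by the raw hashes.
// The header layout is an assumption read off the dumps in this thread.
std::vector<uint8_t>
buildDebugH(uint16_t Version, uint16_t HashAlgorithm,
            const std::vector<std::vector<uint8_t>> &Hashes) {
  std::vector<uint8_t> Out;
  appendLE(Out, 0x0133C9C5u, 4); // magic, serialized as C5 C9 33 01
  appendLE(Out, Version, 2);
  appendLE(Out, HashAlgorithm, 2);
  for (const auto &H : Hashes)
    Out.insert(Out.end(), H.begin(), H.end());
  return Out;
}

// Renders bytes as upper-case hex, the way obj2yaml prints SectionData.
std::string toHex(const std::vector<uint8_t> &Bytes) {
  static const char Digits[] = "0123456789ABCDEF";
  std::string S;
  for (uint8_t B : Bytes) {
    S.push_back(Digits[B >> 4]);
    S.push_back(Digits[B & 0xF]);
  }
  return S;
}
```

With no hashes and algorithm 0 this yields C5C9330100000000, the payload seen in the bad file; with algorithm 1 and the first hash value it reproduces the start of the good file's SectionData.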
>> Don't focus too much on trying to figure out weird linker errors.  Just
>> get the output of obj2yaml to be identical when run under a diff utility,
>> then everything should work fine.
>>
>> On Fri, Jan 26, 2018 at 7:27 AM Leonardo Santagada <santagada at gmail.com>
>> wrote:
>>
>>> I'm so close I can almost smell it :)
>>>
>>> I know how bad the code looks, I don't intend to submit this, but if you
>>> want to try it out it's at:
>>> https://gist.github.com/santagada/544136b1ee143bf31653b1158ac6829e
>>>
>>> I'm seeing: lld-link.exe: error: duplicate symbol:
>>> "<redacted_unmangled>" (<redacted>) in <internal> and in
>>> <redacted_filename>.obj, looking at the .yaml dump the symbols are all
>>> similar to this:
>>>
>>>   - Name:            <redacted>
>>>     Value:           0
>>>     SectionNumber:   0
>>>     SimpleType:      IMAGE_SYM_TYPE_NULL
>>>     ComplexType:     IMAGE_SYM_DTYPE_FUNCTION
>>>     StorageClass:    IMAGE_SYM_CLASS_WEAK_EXTERNAL
>>>     WeakExternal:
>>>       TagIndex:        134
>>>       Characteristics: IMAGE_WEAK_EXTERN_SEARCH_LIBRARY
>>>
>>> On Thu, Jan 25, 2018 at 8:01 PM, Zachary Turner <zturner at google.com>
>>> wrote:
>>>
>>>> I haven't really dabbled in this part of the COFF format personally, so
>>>> hopefully I'm not leading you astray :)
>>>>
>>>> But I checked the code for coff2yaml, and I see this:
>>>>
>>>>       } else if (Symbol.isSectionDefinition()) {
>>>>         // This symbol represents a section definition.
>>>>         assert(Symbol.getNumberOfAuxSymbols() == 1 &&
>>>>                "Expected a single aux symbol to describe this section!");
>>>>         const object::coff_aux_section_definition *ObjSD =
>>>>             reinterpret_cast<const object::coff_aux_section_definition *>(
>>>>                 AuxData.data());
>>>>
>>>> So it looks like you need exactly 1 aux symbol for each section symbol.
>>>>
>>>> I then scrolled up in this function to figure out where AuxData comes
>>>> from, and it comes from COFFObjectFile::getSymbolAuxData.  I think
>>>> that function holds the clue to what you need to do.  It looks like you
>>>> need to set coff::symbol::NumberOfAuxSymbols to 1, and then there is a
>>>> comment in getSymbolAuxData which says:
>>>>
>>>>     // AUX data comes immediately after the symbol in COFF
>>>>     Aux = reinterpret_cast<const uint8_t *>(Symbol.getRawPtr()) + SymbolSize;
>>>>
>>>> So I think you just need to write the bytes immediately after the
>>>> coff::symbol.  The thing you need to write looks like a
>>>> coff::coff_aux_section_definition structure.
>>>>
>>>> For the CheckSum, look at WinCOFFObjectWriter::writeSection.  It looks
>>>> like it's a CRC32 of the actual section contents, which you can generate
>>>> with a couple of lines of code:
>>>>
>>>>   JamCRC JC(/*Init=*/0);
>>>>   JC.update(DebugHContents);
>>>>   AuxSymbol.CheckSum = JC.getCRC();
>>>>
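For reference, a checksum of this kind can be computed without LLVM. The sketch below assumes JamCRC is the usual reflected CRC-32 (polynomial 0xEDB88320) without the final bit inversion, seeded with whatever initial value the caller passes; `crcUpdate` is a hypothetical name, not an LLVM function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bitwise CRC-32 update over the reflected polynomial 0xEDB88320, starting
// from a caller-chosen state and applying no final XOR. Assumption: this
// matches what JamCRC computes for the same initial value.
uint32_t crcUpdate(uint32_t State, const uint8_t *Data, size_t Size) {
  for (size_t I = 0; I < Size; ++I) {
    State ^= Data[I];
    for (int Bit = 0; Bit < 8; ++Bit)
      // Shift right by one; fold in the polynomial when the low bit was set.
      State = (State >> 1) ^ (0xEDB88320u & (0u - (State & 1u)));
  }
  return State;
}
```

Seeding with 0, as in the snippet above, would be `crcUpdate(0, Data, Size)` over the .debug$H contents. Seeding with 0xFFFFFFFF over the ASCII string "123456789" gives the published CRC-32/JAMCRC check value 0x340BC6D9, which is a convenient sanity test.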
>>>> Hope this helps
>>>>
>>>> On Thu, Jan 25, 2018 at 10:46 AM Leonardo Santagada <
>>>> santagada at gmail.com> wrote:
>>>>
>>>>>
>>>>> I see that there is an aux symbol per section symbol, and also in the
>>>>> yaml representation there is a checksum, selection and unused, all of which
>>>>> I have no idea how to fill in. Also, this aux symbol might have some
>>>>> important information for me to patch in the other symbols. Can you find
>>>>> the part of llvm that writes those? At least for the aux symbol, the yaml
>>>>> part of the code treats it as a binary blob, so there is no info on what
>>>>> they should be.
>>>>>
>>>>> On Thu, Jan 25, 2018 at 7:15 PM, Zachary Turner <zturner at google.com>
>>>>> wrote:
>>>>>
>>>>>> If you run obj2yaml against a very simple object file, you'll see
>>>>>> something like this at the end:
>>>>>> ```
>>>>>> symbols:
>>>>>>   - Name:            '@comp.id'
>>>>>>     Value:           17130443
>>>>>>     SectionNumber:   -1
>>>>>>     SimpleType:      IMAGE_SYM_TYPE_NULL
>>>>>>     ComplexType:     IMAGE_SYM_DTYPE_NULL
>>>>>>     StorageClass:    IMAGE_SYM_CLASS_STATIC
>>>>>>   - Name:            '@feat.00'
>>>>>>     Value:           2147484048
>>>>>>     SectionNumber:   -1
>>>>>>     SimpleType:      IMAGE_SYM_TYPE_NULL
>>>>>>     ComplexType:     IMAGE_SYM_DTYPE_NULL
>>>>>>     StorageClass:    IMAGE_SYM_CLASS_STATIC
>>>>>>   - Name:            .drectve
>>>>>>     Value:           0
>>>>>>     SectionNumber:   1
>>>>>>     SimpleType:      IMAGE_SYM_TYPE_NULL
>>>>>>     ComplexType:     IMAGE_SYM_DTYPE_NULL
>>>>>>     StorageClass:    IMAGE_SYM_CLASS_STATIC
>>>>>>     SectionDefinition:
>>>>>>       Length:          47
>>>>>>       NumberOfRelocations: 0
>>>>>>       NumberOfLinenumbers: 0
>>>>>>       CheckSum:        0
>>>>>>       Number:          0
>>>>>> ...
>>>>>> ```
>>>>>>
>>>>>> There's a structure called coff::symbol which basically represents
>>>>>> each one of these records.  It looks like this:
>>>>>>
>>>>>> ```
>>>>>> struct symbol {
>>>>>>   char Name[NameSize];
>>>>>>   uint32_t Value;
>>>>>>   int32_t SectionNumber;
>>>>>>   uint16_t Type;
>>>>>>   uint8_t StorageClass;
>>>>>>   uint8_t NumberOfAuxSymbols;
>>>>>> };
>>>>>> ```
>>>>>>
>>>>>> So you'll need to create one for the .debug$H section and stick it
>>>>>> into the list.  This particular list doesn't have to be in any special
>>>>>> order, so you can just put it at the end (although it's probably not that
>>>>>> much harder to insert into the middle, and it will make for a good test
>>>>>> that you've done it right.  The output can be diffed against the clang-cl
>>>>>> object file and be identical this way).  So write all the normal symbols as
>>>>>> you probably already are, then write one for the .debug$H section.
>>>>>> Initialize the fields to the same thing that you see when you run obj2yaml
>>>>>> against an object file generated by clang-cl for the .debug$H section.
>>>>>>
>>>>>> This structure doesn't contain any kind of file pointers or offsets,
>>>>>> so all you really need to fix up are the "SectionNumber" fields.  Basically
>>>>>> as you are writing the existing symbols, you would do something like:
>>>>>>
>>>>>> for (auto Sym : ObjFile.symbols()) {
>>>>>>   // Sections at or after the insertion point move up by one.
>>>>>>   if (Sym.SectionNumber >= DebugHInsertionIndex)
>>>>>>     ++Sym.SectionNumber;
>>>>>>   writeSymbol(Sym);
>>>>>> }
>>>>>> writeSymbol(DebugHSym);
>>>>>>
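The renumbering loop above can be tried in isolation. This is a minimal sketch with a hypothetical cut-down `Sym` record and a hypothetical `renumber` helper; it assumes section numbers are 1-based, so the special values (0 and negative numbers, as seen on '@comp.id' and '@feat.00' in the dump) are never touched.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, cut-down symbol record: only the field that needs fixing up.
struct Sym {
  int32_t SectionNumber; // 1-based; 0 and negative values are special
};

// Returns the symbols with every reference to a section at or after
// NewSectionNumber bumped by one, plus a trailing symbol for the new section.
std::vector<Sym> renumber(std::vector<Sym> Symbols, int32_t NewSectionNumber) {
  assert(NewSectionNumber >= 1);
  for (Sym &S : Symbols)
    if (S.SectionNumber >= NewSectionNumber)
      ++S.SectionNumber;
  Symbols.push_back(Sym{NewSectionNumber}); // symbol for the inserted section
  return Symbols;
}
```

Inserting a new section 3 into symbols that reference sections -1, 0, 1, 3 and 4 leaves the first three alone, moves 3 and 4 to 4 and 5, and appends a symbol that points at the new section 3.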
>>>>>>
>>>>>> On Thu, Jan 25, 2018 at 9:57 AM Leonardo Santagada <
>>>>>> santagada at gmail.com> wrote:
>>>>>>
>>>>>>> Any idea on how to create this new symbol there? I saw that there is
>>>>>>> a symbol pointing to each section, but didn't understand the format, and
>>>>>>> yaml2obj doesn't check it or do anything with the list.
>>>>>>>
>>>>>>> On Thu, Jan 25, 2018 at 6:56 PM, Leonardo Santagada <
>>>>>>> santagada at gmail.com> wrote:
>>>>>>>
>>>>>>>> YES, THANK YOU... I WAS THINKING THIS BUT COMPLETELY FORGOT.
>>>>>>>>
>>>>>>>> sorry for the caps... long day of working on this, and using vs
>>>>>>>> 2017, which adds a new section type .chks64 that I couldn't find
>>>>>>>> documentation for anywhere, was difficult. I highly recommend everyone
>>>>>>>> just not use vs 2017 until 15.8 or something, our internal bug list is
>>>>>>>> gigantic.
>>>>>>>>
>>>>>>>> On Thu, Jan 25, 2018 at 6:52 PM, Zachary Turner <zturner at google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Actually I already have a theory that even though you are adding
>>>>>>>>> the section to the section table, you might not be adding a *symbol* for
>>>>>>>>> the section to the symbol table.  So the existing symbols (which reference
>>>>>>>>> sections by index) will all be wrong because you've inserted a new
>>>>>>>>> section.  Still though, obj2yaml would expose that.
>>>>>>>>>
>>>>>>>>> On Thu, Jan 25, 2018 at 9:50 AM Zachary Turner <zturner at google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Yea, as long as you compare a clang-cl object file with an
>>>>>>>>>> automatically generated .debug$H section against a clang-cl object file
>>>>>>>>>> without .debug$H but added after the fact with llvm-objcopy, that should
>>>>>>>>>> expose the problem I think when you run obj2yaml on them.
>>>>>>>>>>
>>>>>>>>>> On Thu, Jan 25, 2018 at 9:49 AM Leonardo Santagada <
>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I did reorder my sections, so that .debug$H is in the correct
>>>>>>>>>>> place, but now I get some errors on duplicate symbols, I created a folder
>>>>>>>>>>> with examples:
>>>>>>>>>>>
>>>>>>>>>>> https://www.dropbox.com/sh/nmvzi44pi0boe76/AAA0f47O5PCJ9JiUc6wVuwBra?dl=0
>>>>>>>>>>>
>>>>>>>>>>> t.obj is generated by vs 2015 and it links fine with
>>>>>>>>>>> lld-link.exe, but tout.obj gives these errors:
>>>>>>>>>>>
>>>>>>>>>>> lld-link.exe /DEBUG:GHASH tout.obj
>>>>>>>>>>> LLD-LINK.EXE: error: duplicate symbol:
>>>>>>>>>>> __local_stdio_printf_options in tout.obj and in
>>>>>>>>>>> LIBCMT.lib(default_local_stdio_options.obj)
>>>>>>>>>>> LLD-LINK.EXE: error: duplicate symbol:
>>>>>>>>>>> __local_stdio_printf_options in tout.obj and in
>>>>>>>>>>> libvcruntime.lib(undname.obj)
>>>>>>>>>>>
>>>>>>>>>>> I'm using PEView from http://wjradburn.com/software/ to look at
>>>>>>>>>>> the files and can't see anything wrong, except some valid differences in
>>>>>>>>>>> the offsets being used for the data (so pointer to data is different
>>>>>>>>>>> between them).
>>>>>>>>>>>
>>>>>>>>>>> I will look into yaml2obj now to see if I see anything else
>>>>>>>>>>> weird going on.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jan 25, 2018 at 6:41 PM, Zachary Turner <
>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I'm pretty confident that cl is not putting anything strange in
>>>>>>>>>>>> the .debug$T sections.  We've done a lot of testing and never seen anything
>>>>>>>>>>>> except CodeView type records in a .debug$T.  My hunch is that your objcopy
>>>>>>>>>>>> patch is probably not doing the right thing in one or more of the section
>>>>>>>>>>>> headers, and this is confusing the linker.
>>>>>>>>>>>>
>>>>>>>>>>>> One idea might be to build a simple object file with clang-cl
>>>>>>>>>>>> but without the magic -mllvm -emit-codeview-ghash-section, then run your
>>>>>>>>>>>> llvm-objcopy on it.  Then build the same object file passing -mllvm
>>>>>>>>>>>> -emit-codeview-ghash-section.  Then run obj2yaml on both and diff the
>>>>>>>>>>>> results.  They should be byte-for-byte identical.  That should give you a
>>>>>>>>>>>> clue about if objcopy is doing something wrong.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jan 25, 2018 at 2:21 AM Leonardo Santagada <
>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Don't worry, I definitely want to perfect this to generate
>>>>>>>>>>>>> legal obj files, this is just to speed up testing.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Now after patching all the obj files I get these errors when
>>>>>>>>>>>>> linking a small part of our code base (msvc 2017 15.5.3, lld and
>>>>>>>>>>>>> llvm-objcopy 7.0.0):
>>>>>>>>>>>>> lld-link.exe : error : relocation against symbol in discarded
>>>>>>>>>>>>> section: $LN8
>>>>>>>>>>>>> lld-link.exe : error : relocation against symbol in discarded
>>>>>>>>>>>>> section: $LN43
>>>>>>>>>>>>> lld-link.exe : error : relocation against symbol in discarded
>>>>>>>>>>>>> section: $LN37
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm starting to guess that cl.exe might be putting some random
>>>>>>>>>>>>> comdat or other discardable symbols in the .debug$T and clang doesn't? I
>>>>>>>>>>>>> will try to debug this and see what more I can uncover.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Linking works perfectly without my llvm-objcopy pass to add
>>>>>>>>>>>>> .debug$H.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Jan 25, 2018 at 1:53 AM, Zachary Turner <
>>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> It might not influence LLD, but at the same time we don't
>>>>>>>>>>>>>> want to upstream something that is producing technically illegal COFF
>>>>>>>>>>>>>> files.  Also good to hear about the planned changes to your header files.
>>>>>>>>>>>>>> Looking forward to hearing about your experiences with clang-cl.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Jan 24, 2018 at 10:41 AM Leonardo Santagada <
>>>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I finally got my first .obj file patched with .debug$H to
>>>>>>>>>>>>>>> look somewhat right. I added the new section at the end of the file so I
>>>>>>>>>>>>>>> don't have to recalculate all sections (although now I probably could
>>>>>>>>>>>>>>> position it in the middle, knowing that each section is: SizeOfRawData +
>>>>>>>>>>>>>>> (last.Header.NumberOfRelocations * (4+4+2)) and the $H
>>>>>>>>>>>>>>> needs to come right after $T in the file). That, although illegal based on
>>>>>>>>>>>>>>> the coff specs, doesn't seem like it's going to influence lld.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Also we talked and we are probably going to do something
>>>>>>>>>>>>>>> similar to a bunch of windows defines and a check for our own define (to
>>>>>>>>>>>>>>> guarantee that no one imported windows.h before win32.h) and drop the
>>>>>>>>>>>>>>> namespace and the conflicting names.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Jan 23, 2018 at 12:46 AM, Zachary Turner <
>>>>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It's very possible that a 3rd party indirect header
>>>>>>>>>>>>>>>> include is involved.  One idea might be like I suggested where you #define
>>>>>>>>>>>>>>>> _WINDOWS_ in win32.h and guarantee that it's always included first.  Then
>>>>>>>>>>>>>>>> those other headers won't be able to #include <windows.h>.  But it will
>>>>>>>>>>>>>>>> probably greatly expand the amount of stuff you have to add to win32.h, as
>>>>>>>>>>>>>>>> you will probably find some callers of functions that aren't yet in your
>>>>>>>>>>>>>>>> win32.h that you'd have to add.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 3:28 PM Leonardo Santagada <
>>>>>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Ok some information was lost on getting this example to
>>>>>>>>>>>>>>>>> you, I'm sorry for not being clear.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> We have a huge code base, let's say 90% of it doesn't
>>>>>>>>>>>>>>>>> include either header, 9% include win32.h and 1% includes both, I will try
>>>>>>>>>>>>>>>>> to discover why, but my guess is they include both a third party that
>>>>>>>>>>>>>>>>> includes windows.h and some of our libs that use win32.h.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I will try to fully understand this tomorrow.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I guess clang will not implement this ever so finishing
>>>>>>>>>>>>>>>>> the object copier is the best solution until all code is ported to clang.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 23 Jan 2018 00:02, "Zachary Turner" <zturner at google.com>
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> You said win32.h doesn't include windows.h, but main.cpp
>>>>>>>>>>>>>>>>>> does.  So what's the disadvantage of just including it in win32.h anyway,
>>>>>>>>>>>>>>>>>> since it's already going to be in every translation unit?  (Unless you
>>>>>>>>>>>>>>>>>> didn't mean to #include it in main.cpp)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I guess all I can do is warn you how bad of an idea this
>>>>>>>>>>>>>>>>>> is.  For starters, I already found a bug in your code ;-)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> // stdint.h
>>>>>>>>>>>>>>>>>> typedef int                int32_t;
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> // winnt.h
>>>>>>>>>>>>>>>>>> typedef long LONG;
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> // windef.h
>>>>>>>>>>>>>>>>>> typedef struct tagPOINT
>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>     LONG  x;   // long x
>>>>>>>>>>>>>>>>>>     LONG  y;   // long y
>>>>>>>>>>>>>>>>>> } POINT, *PPOINT, NEAR *NPPOINT, FAR *LPPOINT;
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> // win32.h
>>>>>>>>>>>>>>>>>> typedef int32_t LONG;
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> struct POINT
>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>> LONG x; // int x
>>>>>>>>>>>>>>>>>> LONG y; // int y
>>>>>>>>>>>>>>>>>> };
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> So POINT is defined two different ways.  In your minimal
>>>>>>>>>>>>>>>>>> interface, it's declared as 2 int32's, which are int.  In the actual
>>>>>>>>>>>>>>>>>> Windows header files, it's declared as 2 longs.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This might seem like an unimportant bug since int and long
>>>>>>>>>>>>>>>>>> are the same size, but int and long also mangle differently and affect
>>>>>>>>>>>>>>>>>> overload resolution, so you could have weird linker errors or call the
>>>>>>>>>>>>>>>>>> wrong function overload.
>>>>>>>>>>>>>>>>>>
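The int versus long point can be checked with a small self-contained example. The two structs and `overloadFor` are hypothetical names that only mirror the two POINT definitions quoted above; the sketch shows that the members select different overloads even on platforms where both types have the same size.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// int and long are distinct types even when they have the same size.
static_assert(!std::is_same<int, long>::value,
              "int and long are different types");

// Two overloads that differ only in int vs long.
std::string overloadFor(int) { return "int"; }
std::string overloadFor(long) { return "long"; }

struct WindowsPoint { long x; long y; }; // mirrors the windef.h layout
struct MinimalPoint { int x; int y; };   // mirrors the win32.h copy
```

Calling `overloadFor` on `WindowsPoint::x` picks the long overload, while the same call on `MinimalPoint::x` picks the int overload, which is the kind of silent divergence the message warns about.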
>>>>>>>>>>>>>>>>>> Plus, it illustrates the fact that this struct *actually
>>>>>>>>>>>>>>>>>> is* a different type from the one in the windows header.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> You said at the end that you never intentionally import
>>>>>>>>>>>>>>>>>> win32.h and windows.h from the same translation unit.  But then in this
>>>>>>>>>>>>>>>>>> example you did.  I wonder if you could enforce that by doing this:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> // win32.h
>>>>>>>>>>>>>>>>>> #pragma once
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> // Error if windows.h was included before us.
>>>>>>>>>>>>>>>>>> #if defined(_WINDOWS_)
>>>>>>>>>>>>>>>>>> #error "You're including win32.h after having already included windows.h.  Don't do this!"
>>>>>>>>>>>>>>>>>> #endif
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> // And also make sure windows.h can't get included after us
>>>>>>>>>>>>>>>>>> #define _WINDOWS_
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> For the record, I tried the test case you linked when
>>>>>>>>>>>>>>>>>> windows.h is not included in main.cpp and it works (but still has the bug
>>>>>>>>>>>>>>>>>> about int and long).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 2:23 PM Leonardo Santagada <
>>>>>>>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> It is super gross, but we copy parts of windows.h
>>>>>>>>>>>>>>>>>>> because having all of it is both gigantic and very very messy. So our
>>>>>>>>>>>>>>>>>>> win32.h has a couple thousand lines and not 30k+ like windows.h, and we
>>>>>>>>>>>>>>>>>>> try to have zero macros. Win32.h doesn't include windows.h so using ::BOOL
>>>>>>>>>>>>>>>>>>> wouldn't work. We don't want to create a namespace, we just want a cleaner
>>>>>>>>>>>>>>>>>>> interface to the windows api. The namespace with c linkage is the way to trick
>>>>>>>>>>>>>>>>>>> cl into allowing us to have both windows.h and Win32.h in some files. I
>>>>>>>>>>>>>>>>>>> really don't see any way for us to have this Win32.h without this cl
>>>>>>>>>>>>>>>>>>> support, so maybe we should either put windows.h in a compiled header
>>>>>>>>>>>>>>>>>>> somewhere and not care that it is infecting everything or just have one
>>>>>>>>>>>>>>>>>>> place we can call to clean up after including windows.h (a massive set of
>>>>>>>>>>>>>>>>>>> undefs).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> So using can't work, because we never intentionally
>>>>>>>>>>>>>>>>>>> import windows.h and win32.h on the same translation unit.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 7:08 PM, Zachary Turner <
>>>>>>>>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> This is pretty gross, honestly :)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Can't you just use using declarations?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> namespace Win32 {
>>>>>>>>>>>>>>>>>>>> extern "C" {
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> using ::BOOL;
>>>>>>>>>>>>>>>>>>>> using ::LONG;
>>>>>>>>>>>>>>>>>>>> using ::POINT;
>>>>>>>>>>>>>>>>>>>> using ::LPPOINT;
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> using ::GetCursorPos;
>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> This works with clang-cl.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
On Mon, Jan 22, 2018 at 5:39 AM Leonardo Santagada <
>>>>>>>>>>>>>>>>>>>>
santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
Here is a minimal example; we do this so we don't
>>>>>>>>>>>>>>>>>>>>>
have to import the whole Windows API everywhere.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
https://gist.github.com/santagada/7977e929d31c629c4bf18ebb987f6be3
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
On Sun, Jan 21, 2018 at 2:31 AM, Zachary Turner <
>>>>>>>>>>>>>>>>>>>>>
zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
Clang-cl maintains compatibility with msvc even in
>>>>>>>>>>>>>>>>>>>>>>
cases where it’s non standards compliant (eg 2 phase name lookup), but we
>>>>>>>>>>>>>>>>>>>>>>
try to keep these cases few and far between.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
To help me understand your case, do you mean you copy
>>>>>>>>>>>>>>>>>>>>>>
windows.h and modify it? How does this lead to the same struct being
>>>>>>>>>>>>>>>>>>>>>>
defined twice? If i were to write this:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
struct Foo {};
>>>>>>>>>>>>>>>>>>>>>>
struct Foo {};
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
Is this a small repro of the issue you’re talking
>>>>>>>>>>>>>>>>>>>>>>
about?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
On Sat, Jan 20, 2018 at 3:44 PM Leonardo Santagada <
>>>>>>>>>>>>>>>>>>>>>>
santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
I can totally see something like incremental linking
>>>>>>>>>>>>>>>>>>>>>>>
with a simple padding between obj and a mapping file (which can also help
>>>>>>>>>>>>>>>>>>>>>>>
with edit and continue, something we also would love to have).
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
We have another developer doing the port to support
>>>>>>>>>>>>>>>>>>>>>>>
clang-cl, but although most of our code also goes through a version of
>>>>>>>>>>>>>>>>>>>>>>>
clang, migrating the rest to clang-cl has been a fight. From what I heard
>>>>>>>>>>>>>>>>>>>>>>>
the main problem is that we have a copy of parts of windows.h (so not to
>>>>>>>>>>>>>>>>>>>>>>>
bring the awful parts of it like lower case macros) and that totally works
>>>>>>>>>>>>>>>>>>>>>>>
on cl, but clang (at least 6.0) complains about two struct/vars with the
>>>>>>>>>>>>>>>>>>>>>>>
same name, even though they are exactly the same. Making clang-cl as broken
>>>>>>>>>>>>>>>>>>>>>>>
as cl.exe is not an option I suppose? I would love to turn on a flag
>>>>>>>>>>>>>>>>>>>>>>>
--accept-that-cl-made-bad-decisions-and-live-with-it
>>>>>>>>>>>>>>>>>>>>>>>
and have this at least until this is completely fixed in our code base.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
The biggest win with moving to clang-cl would be a better,
>>>>>>>>>>>>>>>>>>>>>>>
more standards compliant compiler, no 1 minute compiles on heavily
>>>>>>>>>>>>>>>>>>>>>>>
templated files and maybe the holy grail of ThinLTO.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
On Sat, Jan 20, 2018 at 10:56 PM, Zachary Turner <
>>>>>>>>>>>>>>>>>>>>>>>
zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
10-15s will be hard without true incremental
>>>>>>>>>>>>>>>>>>>>>>>>
linking.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
At some point that's going to be the only way to
>>>>>>>>>>>>>>>>>>>>>>>>
get any faster, but incremental linking is hard (putting it lightly), and
>>>>>>>>>>>>>>>>>>>>>>>>
since our full links are already really fast we think we can get reasonably
>>>>>>>>>>>>>>>>>>>>>>>>
close to link.exe incremental speeds with full links.  But it's never
>>>>>>>>>>>>>>>>>>>>>>>>
enough and I will always want it to be faster, so you may see incremental
>>>>>>>>>>>>>>>>>>>>>>>>
linking in the future after we hit a performance wall with full link speed
>>>>>>>>>>>>>>>>>>>>>>>>
:)
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
In any case, I'm definitely interested in seeing
>>>>>>>>>>>>>>>>>>>>>>>>
what kind of numbers you get with /debug:ghash after you get this
>>>>>>>>>>>>>>>>>>>>>>>>
llvm-objcopy feature implemented.  So keep me updated :)
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
As an aside, have you tried building with clang
>>>>>>>>>>>>>>>>>>>>>>>>
instead of cl?  If you build with clang you wouldn't even have to do this
>>>>>>>>>>>>>>>>>>>>>>>>
llvm-objcopy work, because it would "just work".  If you've tried
but ran
>>>>>>>>>>>>>>>>>>>>>>>>
into issues I'm interested in hearing about those too.  On the other hand,
>>>>>>>>>>>>>>>>>>>>>>>>
it's also reasonable to only switch one thing at a time.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
On Sat, Jan 20, 2018 at 1:34 PM Leonardo Santagada <
>>>>>>>>>>>>>>>>>>>>>>>>
santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
if we get to < 30s I think most users would prefer
>>>>>>>>>>>>>>>>>>>>>>>>>
it to link.exe, just hoping there are still some more optimizations to get
>>>>>>>>>>>>>>>>>>>>>>>>>
closer to ELF linking times (around 10-15s here).
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
On Sat, Jan 20, 2018 at 9:50 PM, Zachary Turner <
>>>>>>>>>>>>>>>>>>>>>>>>>
zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
Generally speaking a good rule of thumb is that
>>>>>>>>>>>>>>>>>>>>>>>>>>
/debug:ghash will be close to or faster than /debug:fastlink, but with none
>>>>>>>>>>>>>>>>>>>>>>>>>>
of the penalties like slow debug time
>>>>>>>>>>>>>>>>>>>>>>>>>>
On Sat, Jan 20, 2018 at 12:44 PM Zachary Turner <
>>>>>>>>>>>>>>>>>>>>>>>>>>
zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
Chrome is actually one of my exact benchmark
>>>>>>>>>>>>>>>>>>>>>>>>>>>
cases. When building blink_core.dll and browser_tests.exe, i get anywhere
>>>>>>>>>>>>>>>>>>>>>>>>>>>
from a 20-40% reduction in link time. We have some other optimizations in
>>>>>>>>>>>>>>>>>>>>>>>>>>>
the pipeline but not upstream yet.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
My best time so far (including other
>>>>>>>>>>>>>>>>>>>>>>>>>>>
optimizations not yet upstream) is 28s on blink_core.dll, compared to 110s
>>>>>>>>>>>>>>>>>>>>>>>>>>>
with /debug
>>>>>>>>>>>>>>>>>>>>>>>>>>>
On Sat, Jan 20, 2018 at 12:28 PM Leonardo
>>>>>>>>>>>>>>>>>>>>>>>>>>>
Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
On Sat, Jan 20, 2018 at 9:05 PM, Zachary Turner
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
<zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
You probably don't want to go down the same
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
route that clang goes through to write the object file.  If you think
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
yaml2coff is convoluted, the way clang does it will just give you a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
headache.  There are multiple abstractions involved to account for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
different object file formats (ELF, COFF, MachO) and output formats
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
(Assembly, binary file).  At least with yaml2coff
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
I think your phrase got cut there, but yeah I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
just found AsmPrinter.cpp and it is convoluted.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
It's true that yaml2coff is using the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
COFFParser structure, but if you look at the writeCOFF
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
function in yaml2coff it's pretty bare-metal.  The logic you need will be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
almost identical, except that instead of checking the COFFParser for the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
various fields, you'll check the existing COFFObjectFile, which should have
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
similar fields.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
The only thing you need to do differently is, when
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
writing the section table and section contents, to insert a new entry.  Since
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
you're injecting a section into the middle, you'll also probably need to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
push back the file pointer of all subsequent sections so that they don't
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
overlap.  (e.g. if the original sections are 1, 2, 3, 4, 5 and you insert
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
between 2 and 3, then the original sections 3, 4, and 5 would need to have
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
their FilePointerToRawData offset by the size of the new section).
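
The file-pointer adjustment described above can be sketched roughly as follows. This is only an illustration, not LLVM code: `SectionHeader`, `insertSection` and `layoutIsValid` are hypothetical names, only the two header fields relevant to the layout are modelled, and the raw data of consecutive sections is assumed to be packed back to back. A real tool would also have to shift the other file offsets that point past the insertion point (relocation pointers, the symbol table pointer in the file header).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, trimmed-down stand-in for a COFF section header.
struct SectionHeader {
  std::string Name;
  uint32_t PointerToRawData; // file offset of the section contents
  uint32_t SizeOfRawData;    // size of the section contents in bytes
};

// Insert a new section at position Index (Index >= 1) and push back the file
// pointer of every later section by the size of the new one.
void insertSection(std::vector<SectionHeader> &Sections, size_t Index,
                   std::string Name, uint32_t Size) {
  assert(Index > 0 && Index <= Sections.size() && "need a preceding section");
  const SectionHeader &Prev = Sections[Index - 1];
  // Assumption: the new contents start right where the previous section ends.
  SectionHeader NewSec{std::move(Name),
                       Prev.PointerToRawData + Prev.SizeOfRawData, Size};
  for (size_t I = Index; I < Sections.size(); ++I)
    Sections[I].PointerToRawData += Size;
  Sections.insert(Sections.begin() + static_cast<std::ptrdiff_t>(Index),
                  NewSec);
}

// True if every section starts at or after the end of the one before it.
bool layoutIsValid(const std::vector<SectionHeader> &Sections) {
  for (size_t I = 1; I < Sections.size(); ++I)
    if (Sections[I].PointerToRawData <
        Sections[I - 1].PointerToRawData + Sections[I - 1].SizeOfRawData)
      return false;
  return true;
}
```

With sections at offsets 100, 140 and 200, inserting a 64-byte section at index 2 puts it at offset 200 and moves the former third section to 264.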
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
I have the PE/COFF spec open here and I'm happy
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
that I read a bit of it so I actually know what you are talking about...
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
yeah it doesn't seem too complicated.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
If you need to know what values to put for the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
other fields in a section header, run `dumpbin /headers foo.obj` on a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
clang-generated object file that has a .debug$H section already (e.g. run
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
clang with -emit-codeview-ghash-section, and look at the properties of the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
.debug$H section and use the same values).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Thanks I will do that and then also look at how
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
the CodeView part of the code does it if I can't understand some of it.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
The only invariant that needs to be maintained
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
is that Section[N]->FilePointerOfRawData
=>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Section[N-1]->FilePointerOfRawData +
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Section[N-1]->SizeOfRawData
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Well, that and all the sections need to be on
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
the final file... But I'm hopeful.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Does anyone have timings for linking a big project like
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
chrome with this so that at least I know what kind of performance to expect?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
My numbers are something like:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
1 pdb per obj file: link.exe takes ~15 minutes
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
and 16GB of ram, lld-link.exe takes 2:30 minutes and ~8GB of ram
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
around 10 pdbs per folder: link.exe takes 1
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
minute and 2-3GB of ram, lld-link.exe takes 1:30 minutes and ~6GB of ram
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
fastlink: link.exe takes 40 seconds, but then 20
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
seconds of loading at the first break point in the debugger and we lost DIA
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
support for listing symbols.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
incremental: link.exe takes 8 seconds, but it
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
only happens when very minor changes happen.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
We have a non-negligible number of symbols
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
used on some runtime systems.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
On Sat, Jan 20, 2018 at 11:52 AM Leonardo
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Thanks for the tips, I now have something
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
that reads the obj file, finds .debug$T sections and global hashes it
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
(proof of concept kind of code). What I can't find is: how does clang
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
itself write the COFF files with global hashes, as that might help me
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
understand how to create the .debug$H section, how to update the file
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
section count and how to properly write this back.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
The code on yaml2coff is expecting to be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
working on the yaml COFFParser struct and I'm having quite a bit of a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
headache turning the COFFObjectFile into a COFFParser object or
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
compatible... Tomorrow I might try the very non efficient path of coff2yaml
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
and then yaml2coff with the hashes header... but it seems way too
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
inefficient and convoluted.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
On Fri, Jan 19, 2018 at 10:38 PM, Zachary
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Turner <zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
On Fri, Jan 19, 2018 at 1:02 PM Leonardo
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
On Fri, Jan 19, 2018 at 9:44 PM, Zachary
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Turner <zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
On Fri, Jan 19, 2018 at 12:29 PM Leonardo
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Hi,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
No I didn't, I used cl.exe from the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
visual studio toolchain. What I'm proposing is a tool for processing .obj
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
files in COFF format, reading them and generating the GHASH part.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
To make our build faster we use hundreds
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
of unity build files (.cpp's with a lot of other .cpp's in them aka
munch
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
files) but still have a lot of single .cpp's as well (in total something
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
like 3.4k .obj files).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
ps: sorry for sending to the wrong list,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
I was reading about llvm mailing lists and jumped when I saw what I thought
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
was a lld exclusive list.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
A tool like this would be useful, yes.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
We've talked about it internally as well and agreed it would be useful, we
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
just haven't prioritized it.  If you're interested in submitting a patch
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
along those lines though, I think it would be a good addition.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
I'm not sure what the best place for it
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
would be.  llvm-readobj and llvm-objdump seem like obvious choices, but
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
they are intended to be read-only, so perhaps they wouldn't be a good fit.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
llvm-pdbutil is kind of a hodgepodge of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
everything else related to PDBs and symbols, so I wouldn't be opposed to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
making a new subcommand there called "ghash" or something that could
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
process an object file and output a new object file with a .debug$H section.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
A third option would be to make a new tool
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
for it.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
I don't think it would be that hard to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
write.  If you're interested in trying to make a patch for this, I can
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
offer some guidance on where to look in the code.  Otherwise it's something
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
that we'll probably get to, I'm just not sure when.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
I would love to write it and contribute it
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
back, please do tell. I did find some of the ghash code in lld, but I'm
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
fuzzy on the LLVM CodeView part of it and have never seen llvm-readobj/objdump
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
or llvm-pdbutil, but I'm not afraid to look :)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Luckily all of the important code is hidden
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
behind library calls, and it should already just do the right thing, so I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
suspect you won't need to know much about CodeView to do this.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
I think Peter has the right idea about
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
putting this in llvm-objcopy.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
You can look at one of the existing
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
CopyBinary functions there, which currently only work for ELF, but you can
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
just make a new overload that accepts a COFFObjectFile.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
I would probably start by iterating over
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
each of the sections (getNumberOfSections / getSectionName) looking for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
.debug$T and .debug$H sections.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
If you find a .debug$H section then you can
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
just skip that object file.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
If you find a .debug$T but not a .debug$H,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
then basically do the same thing that LLD does in PDBLinker::mergeDebugT
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
(create a CVTypeArray, and pass it to GloballyHashedType::hashTypes).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
That will return an array of hash values.  (the format of .debug$H is the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
header, followed by the hash values).  Then when you're writing the list of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
sections, just add in the .debug$H section right after the .debug$T section.
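
The per-object decision described above (skip if .debug$H exists, otherwise hash .debug$T and add the new section right after it) can be sketched without any LLVM types. The names `GhashPlan` and `planGhash` are made up for this illustration, and the input is assumed to be the section names of one object file in file order; the actual hashing step is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type: whether a .debug$H section has to be generated,
// and at which index in the section list it should be inserted.
struct GhashPlan {
  bool NeedsGhash;
  size_t InsertIndex; // only meaningful when NeedsGhash is true
};

GhashPlan planGhash(const std::vector<std::string> &SectionNames) {
  bool FoundDebugT = false;
  size_t DebugTIndex = 0;
  for (size_t I = 0; I < SectionNames.size(); ++I) {
    if (SectionNames[I] == ".debug$H")
      return {false, 0}; // already has global hashes: skip this object file
    if (SectionNames[I] == ".debug$T" && !FoundDebugT) {
      FoundDebugT = true;
      DebugTIndex = I;
    }
  }
  if (!FoundDebugT)
    return {false, 0}; // no type records, so there is nothing to hash
  assert(DebugTIndex < SectionNames.size());
  return {true, DebugTIndex + 1}; // place .debug$H right after .debug$T
}
```

For the section list .text, .debug$S, .debug$T, .pdata this returns an insert index of 3, i.e. between .debug$T and .pdata.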
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Currently llvm-objcopy only writes ELF
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
files, so it would need to be taught to write COFF files.  We have code to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
do this in the yaml2obj utility (specifically, in yaml2coff.cpp in the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
function writeCOFF).  There may be a way to move this code to somewhere
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
else (llvm/Object/COFF.h?) so that it can be re-used by both yaml2coff and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
llvm-objcopy, but in the worst case scenario you could copy the code and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
re-write it to work with these new structures.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Lastly, you'll probably want to put all of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
this behind an option in llvm-objcopy such as -add-codeview-ghash-section
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
--
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Leonardo Santagada
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

-- 

Leonardo Santagada

Zachary Turner via llvm-dev

2018-Jan-26 17:52 UTC

[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)

Hmm, ok.  In that case let me try again without my local changes.  Maybe
they are getting in the way :-/

On Fri, Jan 26, 2018 at 9:51 AM Leonardo Santagada <santagada at gmail.com>
wrote:
> it is identical to me... weird.
>
> On Fri, Jan 26, 2018 at 6:49 PM, Zachary Turner <zturner at google.com>
> wrote:
>
>> (Ignore the fact that my hashes are 8 byte in the "good" file, this is
>> due to some local changes I've been experimenting with)
>>
>> On Fri, Jan 26, 2018 at 9:48 AM Zachary Turner <zturner at google.com>
>> wrote:
>>
>>> I did this:
>>>
>>> // a.cpp
>>> static int x = 0;
>>> void b(int);
>>> void a(int) {
>>>   if (x)
>>>     b(x);
>>> }
>>> int main(int argc, char **argv) {
>>>   a(argc);
>>>   return x;
>>> }
>>>
>>>
>>> clang-cl /Z7 /c a.cpp /Foa.noghash.obj
>>> clang-cl /Z7 /c a.cpp -mllvm -emit-codeview-ghash-section
>>> /Foa.ghash.good.obj
>>> llvm-objcopy a.noghash.obj a.ghash.bad.obj
>>> obj2yaml a.ghash.good.obj > a.ghash.good.yaml
>>> obj2yaml a.ghash.bad.obj > a.ghash.bad.yaml
>>>
>>> Then open these 2 yaml files up in a diff viewer.  It looks like the
>>> hashes aren't getting emitted at all.  For example, in the good yaml
>>> file I see this:
>>>
>>>   - Name:            '.debug$H'
>>>     Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_DISCARDABLE, IMAGE_SCN_MEM_READ ]
>>>     Alignment:       4
>>>     SectionData:     C5C93301000001005549419E78044E3896D45CD7009428758BE4A1E2B3E022BA267DEE221F5C42B17BCA182AF84584814A8B5E7E3FB17B397A9E3DEA75CD5627
>>>     GlobalHashes:
>>>       Version:         0
>>>       HashAlgorithm:   1
>>>       HashValues:
>>>         - 5549419E78044E38
>>>         - 96D45CD700942875
>>>         - 8BE4A1E2B3E022BA
>>>         - 267DEE221F5C42B1
>>>         - 7BCA182AF8458481
>>>         - 4A8B5E7E3FB17B39
>>>         - 7A9E3DEA75CD5627
>>>   - Name:            .pdata
>>>
>>> And in the bad yaml file I see this:
>>>   - Name:            '.debug$H'
>>>     Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_DISCARDABLE, IMAGE_SCN_MEM_READ ]
>>>     Alignment:       4
>>>     SectionData:     C5C9330100000000
>>>     GlobalHashes:
>>>       Version:         0
>>>       HashAlgorithm:   0
>>>   - Name:            .pdata
>>>
>>> Don't focus too much on trying to figure out weird linker errors.  Just
>>> get the output of obj2yaml to be identical when run under a diff utility,
>>> then everything should work fine.
>>>
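
The SectionData bytes in the two dumps above can be decoded by hand: a 4-byte little-endian magic, a 2-byte version and a 2-byte hash algorithm, followed by the hash values. The sketch below does that for a hex string as printed by obj2yaml. `DebugHSection`, `readLE` and `parseDebugH` are hypothetical helper names, and the hash size is passed in by the caller because it depends on the algorithm (8 bytes in the "good" file here, which comes from the local changes mentioned in the thread).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical decoded view of a .debug$H section.
struct DebugHSection {
  uint32_t Magic;
  uint16_t Version;
  uint16_t HashAlgorithm;
  std::vector<std::string> Hashes; // each hash kept as a hex string
};

// Read a little-endian integer of NumBytes bytes at byte offset ByteOffset
// from a hex string (two hex digits per byte).
static uint32_t readLE(const std::string &Hex, size_t ByteOffset,
                       size_t NumBytes) {
  uint32_t Value = 0;
  for (size_t I = 0; I < NumBytes; ++I) {
    uint32_t Byte = static_cast<uint32_t>(
        std::stoul(Hex.substr((ByteOffset + I) * 2, 2), nullptr, 16));
    Value |= Byte << (8 * I);
  }
  return Value;
}

DebugHSection parseDebugH(const std::string &Hex, size_t HashSize) {
  assert(HashSize > 0 && Hex.size() % 2 == 0 && Hex.size() >= 16 &&
         "need at least the 8-byte header");
  DebugHSection S;
  S.Magic = readLE(Hex, 0, 4);
  S.Version = static_cast<uint16_t>(readLE(Hex, 4, 2));
  S.HashAlgorithm = static_cast<uint16_t>(readLE(Hex, 6, 2));
  // Everything after the 8-byte header is a flat array of hashes.
  for (size_t Pos = 16; Pos + HashSize * 2 <= Hex.size(); Pos += HashSize * 2)
    S.Hashes.push_back(Hex.substr(Pos, HashSize * 2));
  return S;
}
```

Run on the "bad" SectionData (C5C9330100000000) this yields a header with version 0, algorithm 0 and no hashes at all, which is exactly the difference the diff shows.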
>>> On Fri, Jan 26, 2018 at 7:27 AM Leonardo Santagada <santagada at gmail.com>
>>> wrote:
>>>
>>>> I'm so close I can almost smell it :)
>>>>
>>>> I know how bad the code looks, I don't intend to submit this, but if
>>>> you want to try it out it's at:
>>>>
>>>> https://gist.github.com/santagada/544136b1ee143bf31653b1158ac6829e
>>>>
>>>> I'm seeing: lld-link.exe: error: duplicate symbol:
>>>> "<redacted_unmangled>" (<redacted>) in <internal> and in
>>>> <redacted_filename>.obj, looking at the .yaml dump the symbols are all
>>>> similar to this:
>>>>
>>>> - Name: <redacted>
>>>> Value: 0
>>>> SectionNumber: 0
>>>> SimpleType: IMAGE_SYM_TYPE_NULL
>>>> ComplexType: IMAGE_SYM_DTYPE_FUNCTION
>>>> StorageClass: IMAGE_SYM_CLASS_WEAK_EXTERNAL
>>>> WeakExternal:
>>>> TagIndex: 134
>>>> Characteristics: IMAGE_WEAK_EXTERN_SEARCH_LIBRARY
>>>>
>>>> On Thu, Jan 25, 2018 at 8:01 PM, Zachary Turner <zturner at google.com>
>>>> wrote:
>>>>
>>>>> I haven't really dabbled in this part of the COFF format personally,
>>>>> so hopefully I'm not leading you astray :)
>>>>>
>>>>> But I checked the code for coff2yaml, and I see this:
>>>>>
>>>>>       } else if (Symbol.isSectionDefinition()) {
>>>>>         // This symbol represents a section definition.
>>>>>         assert(Symbol.getNumberOfAuxSymbols() == 1 &&
>>>>>                "Expected a single aux symbol to describe this section!");
>>>>>         const object::coff_aux_section_definition *ObjSD =
>>>>>             reinterpret_cast<const object::coff_aux_section_definition *>(
>>>>>                 AuxData.data());
>>>>>
>>>>> So it looks like you need exactly 1 aux symbol for each section symbol.
>>>>>
>>>>> I then scrolled up in this function to figure out where AuxData comes
>>>>> from, and it comes from COFFObjectFile::getSymbolAuxData.  I think that
>>>>> function holds the clue to what you need to do.  It looks like you need
>>>>> to set coff::symbol::NumberOfAuxSymbols to 1, and then there is a
>>>>> comment in getSymbolAuxData which says:
>>>>>
>>>>>     // AUX data comes immediately after the symbol in COFF
>>>>>     Aux = reinterpret_cast<const uint8_t *>(Symbol.getRawPtr()) + SymbolSize;
>>>>>
>>>>> So I think you just need to write the bytes immediately after the
>>>>> coff::symbol.  The thing you need to write looks like a
>>>>> coff::coff_aux_section_definition structure.
>>>>>
>>>>> For the CheckSum, look at WinCOFFObjectWriter::writeSection.  It looks
>>>>> like it's a CRC32 of the actual section contents, which you can generate
>>>>> with a couple of lines of code:
>>>>>
>>>>>   JamCRC JC(/*Init=*/0);
>>>>>   JC.update(DebugHContents);
>>>>>   AuxSymbol.CheckSum = JC.getCRC();
>>>>>
>>>>> Hope this helps
>>>>>
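
For readers without an LLVM checkout at hand, the checksum step above can be approximated in plain C++. This is a sketch under stated assumptions, not the LLVM implementation: `jamCRC` is a bitwise stand-in for what the JamCRC snippet computes (CRC-32 with the reflected polynomial 0xEDB88320, a caller-supplied initial value and no final XOR), `AuxSectionDefinition` follows the 18-byte "section definition" auxiliary record layout from the PE/COFF specification rather than LLVM's own struct, and `makeAuxForSection` is a hypothetical helper that only fills in the length and checksum.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bitwise CRC-32 (reflected polynomial 0xEDB88320) with a caller-supplied
// initial value and no final XOR.
uint32_t jamCRC(uint32_t Init, const std::vector<uint8_t> &Data) {
  uint32_t CRC = Init;
  for (uint8_t Byte : Data) {
    CRC ^= Byte;
    for (int Bit = 0; Bit < 8; ++Bit)
      CRC = (CRC >> 1) ^ ((CRC & 1u) ? 0xEDB88320u : 0u);
  }
  return CRC;
}

#pragma pack(push, 1)
// Auxiliary symbol record for a section definition, laid out as in the
// PE/COFF specification (same 18-byte size as a regular symbol record).
struct AuxSectionDefinition {
  uint32_t Length;
  uint16_t NumberOfRelocations;
  uint16_t NumberOfLinenumbers;
  uint32_t CheckSum;
  uint16_t Number;
  uint8_t Selection;
  uint8_t Unused[3];
};
#pragma pack(pop)
static_assert(sizeof(AuxSectionDefinition) == 18, "aux records are 18 bytes");

// Hypothetical helper: build the aux record for a section with the given
// contents, using an initial CRC value of 0 as in the snippet above.
AuxSectionDefinition makeAuxForSection(const std::vector<uint8_t> &Contents) {
  assert(Contents.size() <= UINT32_MAX && "section too large for COFF");
  AuxSectionDefinition Aux = {};
  Aux.Length = static_cast<uint32_t>(Contents.size());
  Aux.CheckSum = jamCRC(/*Init=*/0, Contents);
  return Aux;
}
```

The record returned here would be written immediately after the section's symbol, with that symbol's NumberOfAuxSymbols set to 1, as described in the message above.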
