Leonardo Santagada via llvm-dev
2018-Jan-26 17:51 UTC
[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)
it is identical to me... weird.

On Fri, Jan 26, 2018 at 6:49 PM, Zachary Turner <zturner at google.com> wrote:
> (Ignore the fact that my hashes are 8 bytes in the "good" file, this is due
> to some local changes I've been experimenting with)
>
> On Fri, Jan 26, 2018 at 9:48 AM Zachary Turner <zturner at google.com> wrote:
>
>> I did this:
>>
>> // a.cpp
>> static int x = 0;
>> void b(int);
>> void a(int) {
>>   if (x)
>>     b(x);
>> }
>> int main(int argc, char **argv) {
>>   a(argc);
>>   return x;
>> }
>>
>>
>> clang-cl /Z7 /c a.cpp /Foa.noghash.obj
>> clang-cl /Z7 /c a.cpp -mllvm -emit-codeview-ghash-section /Foa.ghash.good.obj
>> llvm-objcopy a.noghash.obj a.ghash.bad.obj
>> obj2yaml a.ghash.good.obj > a.ghash.good.yaml
>> obj2yaml a.ghash.bad.obj > a.ghash.bad.yaml
>>
>> Then open these 2 yaml files up in a diff viewer.  It looks like the
>> hashes aren't getting emitted at all.  For example, in the good yaml file I
>> see this:
>>
>>   - Name:            '.debug$H'
>>     Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_DISCARDABLE, IMAGE_SCN_MEM_READ ]
>>     Alignment:       4
>>     SectionData:     C5C93301000001005549419E78044E3896D45CD7009428758BE4A1E2B3E022BA267DEE221F5C42B17BCA182AF84584814A8B5E7E3FB17B397A9E3DEA75CD5627
>>     GlobalHashes:
>>       Version:         0
>>       HashAlgorithm:   1
>>       HashValues:
>>         - 5549419E78044E38
>>         - 96D45CD700942875
>>         - 8BE4A1E2B3E022BA
>>         - 267DEE221F5C42B1
>>         - 7BCA182AF8458481
>>         - 4A8B5E7E3FB17B39
>>         - 7A9E3DEA75CD5627
>>   - Name:            .pdata
>>
>> And in the bad yaml file I see this:
>>
>>   - Name:            '.debug$H'
>>     Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_DISCARDABLE, IMAGE_SCN_MEM_READ ]
>>     Alignment:       4
>>     SectionData:     C5C9330100000000
>>     GlobalHashes:
>>       Version:         0
>>       HashAlgorithm:   0
>>   - Name:            .pdata
>>
>> Don't focus too much on trying to figure out weird linker errors.  Just
>> get the output of obj2yaml to be identical when run under a diff utility,
>> then everything should work fine.
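To make the difference between the two SectionData dumps above easier to see, here is a minimal C++ sketch that decodes such a hex string into the same fields obj2yaml prints. It assumes the section starts with an 8-byte header (a 4-byte little-endian magic, a 2-byte version, a 2-byte hash algorithm) followed by fixed-size hashes, which is what the dumps in this message suggest. The names `DebugH`, `parseDebugH`, and the `HashSize` parameter are made up for illustration and are not LLVM APIs; the 8-byte hash size only matches the "good" file because of the local changes mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Parsed view of a .debug$H section, mirroring the fields obj2yaml prints.
// (Hypothetical type for illustration only.)
struct DebugH {
  uint32_t Magic = 0;
  uint16_t Version = 0;
  uint16_t HashAlgorithm = 0;
  std::vector<std::string> HashValues; // upper-case hex, one entry per hash
};

// Convert one hex digit to its numeric value.
static unsigned hexDigit(char C) {
  if (C >= '0' && C <= '9') return C - '0';
  if (C >= 'A' && C <= 'F') return C - 'A' + 10;
  if (C >= 'a' && C <= 'f') return C - 'a' + 10;
  assert(false && "not a hex digit");
  return 0;
}

// Decode a SectionData hex string (as printed by obj2yaml) into raw bytes.
static std::vector<uint8_t> fromHex(const std::string &Hex) {
  assert(Hex.size() % 2 == 0 && "hex string must have an even length");
  std::vector<uint8_t> Out;
  for (size_t I = 0; I < Hex.size(); I += 2)
    Out.push_back(
        static_cast<uint8_t>(hexDigit(Hex[I]) * 16 + hexDigit(Hex[I + 1])));
  return Out;
}

// Assumed layout: magic (4 bytes), version (2), hash algorithm (2),
// then zero or more hashes of HashSize bytes each, all little-endian.
DebugH parseDebugH(const std::string &SectionDataHex, size_t HashSize) {
  std::vector<uint8_t> D = fromHex(SectionDataHex);
  assert(D.size() >= 8 && HashSize > 0);
  assert((D.size() - 8) % HashSize == 0 && "trailing partial hash");

  DebugH R;
  R.Magic = D[0] | (D[1] << 8) | (D[2] << 16) | (uint32_t(D[3]) << 24);
  R.Version = static_cast<uint16_t>(D[4] | (D[5] << 8));
  R.HashAlgorithm = static_cast<uint16_t>(D[6] | (D[7] << 8));

  static const char Digits[] = "0123456789ABCDEF";
  for (size_t Off = 8; Off < D.size(); Off += HashSize) {
    std::string H;
    for (size_t I = 0; I < HashSize; ++I) {
      H.push_back(Digits[D[Off + I] >> 4]);
      H.push_back(Digits[D[Off + I] & 0xF]);
    }
    R.HashValues.push_back(H);
  }
  return R;
}
```

Fed the "bad" string `C5C9330100000000`, this yields an empty hash list and algorithm 0, which is the symptom described above: the header is written but no hashes follow it.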
>> >> On Fri, Jan 26, 2018 at 7:27 AM Leonardo Santagada <santagada at gmail.com> >> wrote: >> >>> I'm so close I can almost smell it :) >>> >>> I know how bad the code looks, I don't intend to submit this, but if you >>> want to try it out its at: https://gist.github.com/santagada/ >>> 544136b1ee143bf31653b1158ac6829e >>> >>> I'm seeing: lld-link.exe: error: duplicate symbol: >>> "<redacted_unmangled>" (<redacted>) in <internal> and in >>> <redacted_filename>.obj, looking at the .yaml dump the symbols are all >>> similar to this: >>> >>> - Name: <redacted> >>> Value: 0 >>> SectionNumber: 0 >>> SimpleType: IMAGE_SYM_TYPE_NULL >>> ComplexType: IMAGE_SYM_DTYPE_FUNCTION >>> StorageClass: IMAGE_SYM_CLASS_WEAK_EXTERNAL >>> WeakExternal: >>> TagIndex: 134 >>> Characteristics: IMAGE_WEAK_EXTERN_SEARCH_LIBRARY >>> >>> On Thu, Jan 25, 2018 at 8:01 PM, Zachary Turner <zturner at google.com> >>> wrote: >>> >>>> I haven't really dabbled in this part of the COFF format personally, so >>>> hopefully I'm not leading you astray :) >>>> >>>> But I checked the code for coff2yaml, and I see this: >>>> >>>> } else if (Symbol.isSectionDefinition()) { >>>> // This symbol represents a section definition. >>>> assert(Symbol.getNumberOfAuxSymbols() == 1 && >>>> "Expected a single aux symbol to describe this >>>> section!"); >>>> const object::coff_aux_section_definition *ObjSD >>>> reinterpret_cast<const object::coff_aux_section_definition >>>> *>( >>>> AuxData.data()); >>>> >>>> So it looks like you need exactly 1 aux symbol for each section symbol. >>>> >>>> I then scrolled up in this function to figure out where AuxData comes >>>> from, and it comes from COFFObjectFile::getSymbolAuxData. I think >>>> that function holds the clue to what you need to do. 
It looks like you >>>> need to set coff::symbol::NumberOfAuxSymbols to 1, and then there is a >>>> comment in getSymbolAuxData which says: >>>> >>>> // AUX data comes immediately after the symbol in COFF >>>> Aux = reinterpret_cast<const uint8_t *>(Symbol.getRawPtr()) + >>>> SymbolSize; >>>> >>>> So I think you just need to write the bytes immediately after the >>>> coff::symbol. The thing you need to write looks like a >>>> coff::coff_aux_section_definition structure. >>>> >>>> For the CheckSum, look at WinCOFFObjectWriter::writeSection. It looks >>>> like its a CRC32 of the actual section contents, which you can generate >>>> with a couple of lines of code: >>>> >>>> JamCRC JC(/*Init=*/0); >>>> JC.update(DebugHContents); >>>> AuxSymbol.CheckSum = JC.getCRC(); >>>> >>>> Hope this helps >>>> >>>> On Thu, Jan 25, 2018 at 10:46 AM Leonardo Santagada < >>>> santagada at gmail.com> wrote: >>>> >>>>> >>>>> I see that there is an auxsymbol per section symbol, and also on the >>>>> yaml representation there is a checksum, selection and unused all of them I >>>>> have no idea how to fill in, also this aux symbol might have some important >>>>> information for me to patch on the other symbols. Can you find the part in >>>>> llvm that it writes those? because at least for auxsymbol the yaml part of >>>>> the code threats as a binary blob so there is no info on what they should >>>>> be. 
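As a rough illustration of the aux record discussed above, here is a C++ sketch that fills in and serializes an 18-byte section-definition aux symbol, including a checksum over the section contents. The field order and sizes follow the PE/COFF "section definition" aux format (length, relocation count, line-number count, checksum, number, selection, three unused bytes). The CRC routine assumes that JamCRC is the usual reflected CRC-32 update without the final bit inversion, seeded with 0 as in the snippet above. The names `AuxSectionDefinition`, `encodeAux`, `makeAuxFor`, and `jamCrcUpdate` are hypothetical and are not the LLVM types.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// CRC-32 register update with the reflected polynomial 0xEDB88320 and no
// final inversion (assumed to match what JamCRC computes).
uint32_t jamCrcUpdate(uint32_t Crc, const uint8_t *Data, size_t Size) {
  for (size_t I = 0; I < Size; ++I) {
    Crc ^= Data[I];
    for (int Bit = 0; Bit < 8; ++Bit)
      Crc = (Crc >> 1) ^ (0xEDB88320u & (0u - (Crc & 1u)));
  }
  return Crc;
}

// In-memory form of the section-definition aux record (hypothetical type).
struct AuxSectionDefinition {
  uint32_t Length;              // size of the section's raw data
  uint16_t NumberOfRelocations;
  uint16_t NumberOfLinenumbers;
  uint32_t CheckSum;            // checksum of the raw data
  uint16_t Number;              // associated section, only used for COMDATs
  uint8_t Selection;            // COMDAT selection, 0 when not a COMDAT
};

// Serialize to the 18 bytes that follow the section symbol in the table.
std::array<uint8_t, 18> encodeAux(const AuxSectionDefinition &A) {
  std::array<uint8_t, 18> Out{}; // zero-filled, so unused bytes stay 0
  auto Put16 = [&Out](size_t Off, uint16_t V) {
    Out[Off] = static_cast<uint8_t>(V & 0xFF);
    Out[Off + 1] = static_cast<uint8_t>(V >> 8);
  };
  auto Put32 = [&Out](size_t Off, uint32_t V) {
    for (size_t I = 0; I < 4; ++I)
      Out[Off + I] = static_cast<uint8_t>((V >> (8 * I)) & 0xFF);
  };
  Put32(0, A.Length);
  Put16(4, A.NumberOfRelocations);
  Put16(6, A.NumberOfLinenumbers);
  Put32(8, A.CheckSum);
  Put16(12, A.Number);
  Out[14] = A.Selection;
  return Out;
}

// Build the aux record for a section with no relocations and no COMDAT.
AuxSectionDefinition makeAuxFor(const std::vector<uint8_t> &Contents) {
  AuxSectionDefinition A{};
  A.Length = static_cast<uint32_t>(Contents.size());
  A.CheckSum = jamCrcUpdate(0, Contents.data(), Contents.size());
  return A;
}
```

The section symbol itself would then set NumberOfAuxSymbols to 1 and be followed immediately by these 18 bytes. Whether "selection" and "unused" can stay zero for .debug$H is an assumption here; comparing against a clang-cl generated object with obj2yaml is the safer check.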
>>>>> >>>>> On Thu, Jan 25, 2018 at 7:15 PM, Zachary Turner <zturner at google.com> >>>>> wrote: >>>>> >>>>>> If you run obj2yaml against a very simple object file, you'll see >>>>>> something like this at the end: >>>>>> ``` >>>>>> symbols: >>>>>> - Name: '@comp.id' >>>>>> Value: 17130443 >>>>>> SectionNumber: -1 >>>>>> SimpleType: IMAGE_SYM_TYPE_NULL >>>>>> ComplexType: IMAGE_SYM_DTYPE_NULL >>>>>> StorageClass: IMAGE_SYM_CLASS_STATIC >>>>>> - Name: '@feat.00' >>>>>> Value: 2147484048 <(21)%204748-4048> >>>>>> SectionNumber: -1 >>>>>> SimpleType: IMAGE_SYM_TYPE_NULL >>>>>> ComplexType: IMAGE_SYM_DTYPE_NULL >>>>>> StorageClass: IMAGE_SYM_CLASS_STATIC >>>>>> - Name: .drectve >>>>>> Value: 0 >>>>>> SectionNumber: 1 >>>>>> SimpleType: IMAGE_SYM_TYPE_NULL >>>>>> ComplexType: IMAGE_SYM_DTYPE_NULL >>>>>> StorageClass: IMAGE_SYM_CLASS_STATIC >>>>>> SectionDefinition: >>>>>> Length: 47 >>>>>> NumberOfRelocations: 0 >>>>>> NumberOfLinenumbers: 0 >>>>>> CheckSum: 0 >>>>>> Number: 0 >>>>>> ... >>>>>> ``` >>>>>> >>>>>> There's a structure called coff::symbol which basically represents >>>>>> each one of these records. It looks like this: >>>>>> >>>>>> ``` >>>>>> struct symbol { >>>>>> char Name[NameSize]; >>>>>> uint32_t Value; >>>>>> int32_t SectionNumber; >>>>>> uint16_t Type; >>>>>> uint8_t StorageClass; >>>>>> uint8_t NumberOfAuxSymbols; >>>>>> }; >>>>>> ``` >>>>>> >>>>>> So you'll need to create one for the debug$H section and stick it >>>>>> into the list. This particular list doesn't have to be in any special >>>>>> order, so you can just put it at the end (although it's probably not that >>>>>> much harder to insert into the middle, and it will make for a good test >>>>>> that you've done it right. The output can be diffed against clang-cl >>>>>> object file and be identical this way). So write all the normal symbols as >>>>>> you probably already are, then write one for the .debug$H section. 
>>>>>> Initialize the fields to the same thing that you see when you run obj2yaml
>>>>>> against an object file generated by clang-cl for the .debug$H section.
>>>>>>
>>>>>> This structure doesn't contain any kind of file pointers or offsets,
>>>>>> so all you really need to fix up are the "SectionNumber" fields.  Basically
>>>>>> as you are writing the existing symbols, you would do something like:
>>>>>>
>>>>>> // Copy each symbol so its section number can be adjusted before writing.
>>>>>> for (auto Sym : ObjFile.symbols()) {
>>>>>>   if (Sym.SectionNumber >= DebugHInsertionIndex)
>>>>>>     ++Sym.SectionNumber;
>>>>>>   writeSymbol(Sym);
>>>>>> }
>>>>>> writeSymbol(DebugHSym);
>>>>>>
>>>>>>
>>>>>> On Thu, Jan 25, 2018 at 9:57 AM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>
>>>>>>> Any idea on how to create this new symbol there? I saw that there is
>>>>>>> a symbol pointing to each section, but didn't understand the format, and
>>>>>>> yaml2obj doesn't check it or do anything with the list.
>>>>>>>
>>>>>>> On Thu, Jan 25, 2018 at 6:56 PM, Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>>
>>>>>>>> YES, THANK YOU... I WAS THINKING THIS BUT COMPLETELY FORGOT.
>>>>>>>>
>>>>>>>> Sorry for the caps... it was a long day of working on this, and using VS
>>>>>>>> 2017, which adds a new section type, .chks64, for which I couldn't find
>>>>>>>> documentation anywhere, was difficult. I highly recommend that everyone just
>>>>>>>> avoid VS 2017 until 15.8 or something; our internal bug list is
>>>>>>>> gigantic.
>>>>>>>>
>>>>>>>> On Thu, Jan 25, 2018 at 6:52 PM, Zachary Turner <zturner at google.com> wrote:
>>>>>>>>
>>>>>>>>> Actually I already have a theory that even though you are adding
>>>>>>>>> the section to the section table, you might not be adding a *symbol* for
>>>>>>>>> the section to the symbol table.  So the existing symbols (which reference
>>>>>>>>> sections by index) will all be wrong because you've inserted a new
>>>>>>>>> section.  Still though, obj2yaml would expose that.
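The section-number fix-up described above can be sketched as a small self-contained C++ function. This is only an illustration under simplifying assumptions: `Symbol` here is a hypothetical two-field stand-in for coff::symbol, section numbers are 1-based, and the special values 0 and negative numbers (undefined, absolute, debug) are left alone simply because they compare lower than any valid insertion index. It also does not touch aux records, which for associative COMDAT sections carry a section number of their own.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, reduced stand-in for a COFF symbol table entry.
struct Symbol {
  std::string Name;
  int32_t SectionNumber; // 1-based; 0 and negative values are special
};

// Returns a new symbol list for an object file in which a section was
// inserted at (1-based) position InsertedIndex. Every symbol that pointed at
// that position or later is moved up by one, and the symbol describing the
// new section is appended at the end so existing symbol indexes stay valid.
std::vector<Symbol> renumberForInsertedSection(std::vector<Symbol> Symbols,
                                               int32_t InsertedIndex,
                                               const Symbol &NewSectionSymbol) {
  assert(InsertedIndex >= 1 && "section numbers are 1-based");
  for (Symbol &Sym : Symbols)
    if (Sym.SectionNumber >= InsertedIndex)
      ++Sym.SectionNumber;
  Symbols.push_back(NewSectionSymbol);
  return Symbols;
}
```

Appending the new symbol rather than inserting it in the middle keeps references that go by symbol index (for example the TagIndex of a weak external) unchanged, at the cost of not matching the symbol order clang-cl would produce.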
>>>>>>>>>
>>>>>>>>> On Thu, Jan 25, 2018 at 9:50 AM Zachary Turner <zturner at google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Yeah, as long as you compare a clang-cl object file with an
>>>>>>>>>> automatically generated .debug$H section against a clang-cl object file
>>>>>>>>>> without .debug$H where the section was added after the fact with
>>>>>>>>>> llvm-objcopy, that should expose the problem, I think, when you run
>>>>>>>>>> obj2yaml on them.
>>>>>>>>>>
>>>>>>>>>> On Thu, Jan 25, 2018 at 9:49 AM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I did reorder my sections, so that .debug$H is in the correct
>>>>>>>>>>> place, but now I get some errors about duplicate symbols. I created a folder
>>>>>>>>>>> with examples:
>>>>>>>>>>>
>>>>>>>>>>> https://www.dropbox.com/sh/nmvzi44pi0boe76/AAA0f47O5PCJ9JiUc6wVuwBra?dl=0
>>>>>>>>>>>
>>>>>>>>>>> t.obj is generated by vs 2015 and it links fine with
>>>>>>>>>>> lld-link.exe, but tout.obj gives these errors:
>>>>>>>>>>>
>>>>>>>>>>> lld-link.exe /DEBUG:GHASH tout.obj
>>>>>>>>>>> LLD-LINK.EXE: error: duplicate symbol: __local_stdio_printf_options in tout.obj and in LIBCMT.lib(default_local_stdio_options.obj)
>>>>>>>>>>> LLD-LINK.EXE: error: duplicate symbol: __local_stdio_printf_options in tout.obj and in libvcruntime.lib(undname.obj)
>>>>>>>>>>>
>>>>>>>>>>> I'm using PEView from http://wjradburn.com/software/ to look at
>>>>>>>>>>> the files and can't see anything wrong, except some valid differences in
>>>>>>>>>>> the offsets being used for the data (so pointer to data is different
>>>>>>>>>>> between them).
>>>>>>>>>>>
>>>>>>>>>>> I will look into yaml2obj now to see if I see anything else
>>>>>>>>>>> weird going on.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jan 25, 2018 at 6:41 PM, Zachary Turner <zturner at google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I'm pretty confident that cl is not putting anything strange in
>>>>>>>>>>>> the .debug$T sections.  We've done a lot of testing and never seen anything
>>>>>>>>>>>> except CodeView type records in a .debug$T.  My hunch is that your objcopy
>>>>>>>>>>>> patch is probably not doing the right thing in one or more of the section
>>>>>>>>>>>> headers, and this is confusing the linker.
>>>>>>>>>>>>
>>>>>>>>>>>> One idea might be to build a simple object file with clang-cl
>>>>>>>>>>>> but without the magic -mllvm -emit-codeview-ghash-section, then run your
>>>>>>>>>>>> llvm-objcopy on it.  Then build the same object file passing -mllvm
>>>>>>>>>>>> -emit-codeview-ghash-section.  Then run obj2yaml on both and diff the
>>>>>>>>>>>> results.  They should be byte-for-byte identical.  That should give you a
>>>>>>>>>>>> clue about whether objcopy is doing something wrong.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jan 25, 2018 at 2:21 AM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Don't worry, I definitely want to perfect this to generate
>>>>>>>>>>>>> legal obj files, this is just to speed up testing.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Now after patching all the obj files I get these errors when
>>>>>>>>>>>>> linking a small part of our code base (msvc 2017 15.5.3, lld and
>>>>>>>>>>>>> llvm-objcopy 7.0.0):
>>>>>>>>>>>>> lld-link.exe : error : relocation against symbol in discarded section: $LN8
>>>>>>>>>>>>> lld-link.exe : error : relocation against symbol in discarded section: $LN43
>>>>>>>>>>>>> lld-link.exe : error : relocation against symbol in discarded section: $LN37
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm starting to guess that cl.exe might be putting some random
>>>>>>>>>>>>> comdat or other discardable symbols in the .debug$T and clang doesn't?
I >>>>>>>>>>>>> will try to debug this and see what more I can uncover. >>>>>>>>>>>>> >>>>>>>>>>>>> Linking works perfectly without my llvm-objcopy pass to add >>>>>>>>>>>>> .debug$H? >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Jan 25, 2018 at 1:53 AM, Zachary Turner < >>>>>>>>>>>>> zturner at google.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> It might not influence LLD, but at the same time we don't >>>>>>>>>>>>>> want to upstream something that is producing technically illegal COFF >>>>>>>>>>>>>> files. Also good to hear about the planned changes to your header files. >>>>>>>>>>>>>> Looking forward to hearing about your experiences with clang-cl. >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Wed, Jan 24, 2018 at 10:41 AM Leonardo Santagada < >>>>>>>>>>>>>> santagada at gmail.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I finally got my first .obj file patched with .debug$H to >>>>>>>>>>>>>>> look somewhat right. I added the new section at the end of the file so I >>>>>>>>>>>>>>> don't have to recalculate all sections (although now I probably could >>>>>>>>>>>>>>> position it in the middle, knowing that each section is: SizeOfRawData + >>>>>>>>>>>>>>> (last.Header.NumberOfRelocations * (4+4+2)) and the $H >>>>>>>>>>>>>>> needs to come right after $T in the file). That although illegal based on >>>>>>>>>>>>>>> the coff specs doesn't seem its going to influence lld. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Also we talked and we are probably going to do something >>>>>>>>>>>>>>> similar to a bunch of windows defines and a check for our own define (to >>>>>>>>>>>>>>> guarantee that no one imported windows.h before win32.h) and drop the >>>>>>>>>>>>>>> namespace and the conflicting names. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Tue, Jan 23, 2018 at 12:46 AM, Zachary Turner < >>>>>>>>>>>>>>> zturner at google.com> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> That's very possible that a 3rd party indirect header >>>>>>>>>>>>>>>> include is involved. 
One idea might be like I suggested where you #define >>>>>>>>>>>>>>>> _WINDOWS_ in win32.h and guarantee that it's always included first. Then >>>>>>>>>>>>>>>> those other headers won't be able to #include <windows.h>. but it will >>>>>>>>>>>>>>>> probably greatly expand the amount of stuff you have to add to win32.h, as >>>>>>>>>>>>>>>> you will probably find some callers of functions that aren't yet in your >>>>>>>>>>>>>>>> win32.h that you'd have to add. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 3:28 PM Leonardo Santagada < >>>>>>>>>>>>>>>> santagada at gmail.com> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Ok some information was lost on getting this example to >>>>>>>>>>>>>>>>> you, I'm sorry for not being clear. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> We have a huge code base, let's say 90% of it doesn't >>>>>>>>>>>>>>>>> include either header, 9% include win32.h and 1% includes both, I will try >>>>>>>>>>>>>>>>> to discover why, but my guess is they include both a third party that >>>>>>>>>>>>>>>>> includes windows.h and some of our libs that use win32.h. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I will try to fully understand this tomorrow. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I guess clang will not implement this ever so finishing >>>>>>>>>>>>>>>>> the object copier is the best solution until all code is ported to clang. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 23 Jan 2018 00:02, "Zachary Turner" <zturner at google.com> >>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> You said win32.h doesn't include windows.h, but main.cpp >>>>>>>>>>>>>>>>>> does. So what's the disadvantage of just including it in win32.h anyway, >>>>>>>>>>>>>>>>>> since it's already going to be in every translation unit? (Unless you >>>>>>>>>>>>>>>>>> didn't mean to #include it in main.cpp) >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I guess all I can do is warn you how bad of an idea this >>>>>>>>>>>>>>>>>> is. 
For starters, I already found a bug in your code ;-) >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> // stdint.h >>>>>>>>>>>>>>>>>> typedef int int32_t; >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> // winnt.h >>>>>>>>>>>>>>>>>> typedef long LONG; >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> // windef.h >>>>>>>>>>>>>>>>>> typedef struct tagPOINT >>>>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>>>> LONG x; // long x >>>>>>>>>>>>>>>>>> LONG y; // long y >>>>>>>>>>>>>>>>>> } POINT, *PPOINT, NEAR *NPPOINT, FAR *LPPOINT; >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> // win32.h >>>>>>>>>>>>>>>>>> typedef int32_t LONG; >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> struct POINT >>>>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>>>> LONG x; // int x >>>>>>>>>>>>>>>>>> LONG y; // int y >>>>>>>>>>>>>>>>>> }; >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> So POINT is defined two different ways. In your minimal >>>>>>>>>>>>>>>>>> interface, it's declared as 2 int32's, which are int. In the actual >>>>>>>>>>>>>>>>>> Windows header files, it's declared as 2 longs. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> This might seem like a unimportant bug since int and long >>>>>>>>>>>>>>>>>> are the same size, but int and long also mangle differently and affect >>>>>>>>>>>>>>>>>> overload resolution, so you could have weird linker errors or call the >>>>>>>>>>>>>>>>>> wrong function overload. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Plus, it illustrates the fact that this struct *actually >>>>>>>>>>>>>>>>>> is* a different type from the one in the windows header. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> You said at the end that you never intentionally import >>>>>>>>>>>>>>>>>> win32.h and windows.h from the same translation unit. But then in this >>>>>>>>>>>>>>>>>> example you did. I wonder if you could enforce that by doing this: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> // win32.h >>>>>>>>>>>>>>>>>> #pragma once >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> // Error if windows.h was included before us. 
>>>>>>>>>>>>>>>>>> #if defined(_WINDOWS_) >>>>>>>>>>>>>>>>>> #error "You're including win32.h after having already >>>>>>>>>>>>>>>>>> included windows.h. Don't do this!" >>>>>>>>>>>>>>>>>> #endif >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> // And also make sure windows.h can't get included after >>>>>>>>>>>>>>>>>> us >>>>>>>>>>>>>>>>>> #define _WINDOWS_ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> For the record, I tried the test case you linked when >>>>>>>>>>>>>>>>>> windows.h is not included in main.cpp and it works (but still has the bug >>>>>>>>>>>>>>>>>> about int and long). >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 2:23 PM Leonardo Santagada < >>>>>>>>>>>>>>>>>> santagada at gmail.com> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> It is super gross, but we copy parts of windows.h >>>>>>>>>>>>>>>>>>> because having all of it if both gigantic and very very messy. So our >>>>>>>>>>>>>>>>>>> win32.h has a couple thousands of lines and not 30k+ for windows.h and we >>>>>>>>>>>>>>>>>>> try to have zero macros. Win32.h doesn't include windows.h so using ::BOOL >>>>>>>>>>>>>>>>>>> wouldn't work. We don't want to create a namespace, we just want a cleaner >>>>>>>>>>>>>>>>>>> interface to windows api. The namespace with c linkage is the way to trick >>>>>>>>>>>>>>>>>>> cl into allowing us to in some files have both windows.h and Win32.h. I >>>>>>>>>>>>>>>>>>> really don't see any way for us to have this Win32.h without this cl >>>>>>>>>>>>>>>>>>> support, so maybe we should either put windows.h in a compiled header >>>>>>>>>>>>>>>>>>> somewhere and not care that it is infecting everything or just have one >>>>>>>>>>>>>>>>>>> place we can call to clean up after including windows.h (a massive set of >>>>>>>>>>>>>>>>>>> undefs). >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> So using can't work, because we never intentionally >>>>>>>>>>>>>>>>>>> import windows.h and win32.h on the same translation unit. 
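The point made earlier in this thread about POINT being declared with int in win32.h but with long in the Windows headers can be shown with a few lines of standard C++. The sketch below uses two made-up structs, `PointInt` and `PointLong`, in place of the two POINT definitions. It only demonstrates that `int` and `long` are distinct types that select different overloads, regardless of whether they happen to have the same size on a given platform.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// int and long are always distinct types in C++, even where both are 32 bits.
static_assert(!std::is_same<int, long>::value,
              "int and long are different types");

// Stand-ins for the two conflicting POINT definitions (hypothetical names).
struct PointInt {  // like win32.h: typedef int32_t LONG
  int x;
  int y;
};
struct PointLong { // like the Windows headers: typedef long LONG
  long x;
  long y;
};

// Two overloads that differ only in int versus long.
std::string describe(int) { return "int"; }
std::string describe(long) { return "long"; }
```

Calling `describe(PointInt{1, 2}.x)` picks the `int` overload while `describe(PointLong{1, 2}.x)` picks the `long` one, so code compiled against the two headers can silently end up calling different functions, or reference differently mangled symbols at link time.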
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 7:08 PM, Zachary Turner < >>>>>>>>>>>>>>>>>>> zturner at google.com> wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> This is pretty gross, honestly :) >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Can't you just use using declarations? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> namespace Win32 { >>>>>>>>>>>>>>>>>>>> extern "C" { >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> using ::BOOL; >>>>>>>>>>>>>>>>>>>> using ::LONG; >>>>>>>>>>>>>>>>>>>> using ::POINT; >>>>>>>>>>>>>>>>>>>> using ::LPPOINT; >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> using ::GetCursorPos; >>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> This works with clang-cl. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 5:39 AM Leonardo Santagada < >>>>>>>>>>>>>>>>>>>> santagada at gmail.com> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Here it is a minimal example, we do this so we don't >>>>>>>>>>>>>>>>>>>>> have to import the whole windows api everywhere. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> https://gist.github.com/santagada/ >>>>>>>>>>>>>>>>>>>>> 7977e929d31c629c4bf18ebb987f6be3 >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On Sun, Jan 21, 2018 at 2:31 AM, Zachary Turner < >>>>>>>>>>>>>>>>>>>>> zturner at google.com> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Clang-cl maintains compatibility with msvc even in >>>>>>>>>>>>>>>>>>>>>> cases where it’s non standards compliant (eg 2 phase name lookup), but we >>>>>>>>>>>>>>>>>>>>>> try to keep these cases few and far between. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> To help me understand your case, do you mean you copy >>>>>>>>>>>>>>>>>>>>>> windows.h and modify it? How does this lead to the same struct being >>>>>>>>>>>>>>>>>>>>>> defined twice? 
If i were to write this: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> struct Foo {}; >>>>>>>>>>>>>>>>>>>>>> struct Foo {}; >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Is this a small repro of the issue you’re talking >>>>>>>>>>>>>>>>>>>>>> about? >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 3:44 PM Leonardo Santagada < >>>>>>>>>>>>>>>>>>>>>> santagada at gmail.com> wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I can totally see something like incremental linking >>>>>>>>>>>>>>>>>>>>>>> with a simple padding between obj and a mapping file (which can also help >>>>>>>>>>>>>>>>>>>>>>> with edit and continue, something we also would love to have). >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> We have another developer doing the port to support >>>>>>>>>>>>>>>>>>>>>>> clang-cl, but although most of our code also goes trough a version of >>>>>>>>>>>>>>>>>>>>>>> clang, migrating the rest to clang-cl has been a fight. From what I heard >>>>>>>>>>>>>>>>>>>>>>> the main problem is that we have a copy of parts of windows.h (so not to >>>>>>>>>>>>>>>>>>>>>>> bring the awful parts of it like lower case macros) and that totally works >>>>>>>>>>>>>>>>>>>>>>> on cl, but clang (at least 6.0) complains about two struct/vars with the >>>>>>>>>>>>>>>>>>>>>>> same name, even though they are exactly the same. Making clang-cl as broken >>>>>>>>>>>>>>>>>>>>>>> as cl.exe is not an option I suppose? I would love to turn on a flag >>>>>>>>>>>>>>>>>>>>>>> --accept-that-cl-made-bad-decisions-and-live-with-it >>>>>>>>>>>>>>>>>>>>>>> and have this at least until this is completely fixed in our code base. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> the biggest win with moving to cl would be a better >>>>>>>>>>>>>>>>>>>>>>> more standards compliant compiler, no 1 minute compiles on heavily >>>>>>>>>>>>>>>>>>>>>>> templated files and maybe the holy grail of ThinLTO. 
>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 10:56 PM, Zachary Turner < >>>>>>>>>>>>>>>>>>>>>>> zturner at google.com> wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> 10-15s will be hard without true incremental >>>>>>>>>>>>>>>>>>>>>>>> linking. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> At some point that's going to be the only way to >>>>>>>>>>>>>>>>>>>>>>>> get any faster, but incremental linking is hard (putting it lightly), and >>>>>>>>>>>>>>>>>>>>>>>> since our full links are already really fast we think we can get reasonably >>>>>>>>>>>>>>>>>>>>>>>> close to link.exe incremental speeds with full links. But it's never >>>>>>>>>>>>>>>>>>>>>>>> enough and I will always want it to be faster, so you may see incremental >>>>>>>>>>>>>>>>>>>>>>>> linking in the future after we hit a performance wall with full link speed >>>>>>>>>>>>>>>>>>>>>>>> :) >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> In any case, I'm definitely interested in seeing >>>>>>>>>>>>>>>>>>>>>>>> what kind of numbers you get with /debug:ghash after you get this >>>>>>>>>>>>>>>>>>>>>>>> llvm-objcopy feature implemented. So keep me updated :) >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> As an aside, have you tried building with clang >>>>>>>>>>>>>>>>>>>>>>>> instead of cl? If you build with clang you wouldn't even have to do this >>>>>>>>>>>>>>>>>>>>>>>> llvm-objcopy work, because it would "just work". If you've tried but ran >>>>>>>>>>>>>>>>>>>>>>>> into issues I'm interested in hearing about those too. On the other hand, >>>>>>>>>>>>>>>>>>>>>>>> it's also reasonable to only switch one thing at a time. 
>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 1:34 PM Leonardo Santagada < >>>>>>>>>>>>>>>>>>>>>>>> santagada at gmail.com> wrote: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> if we get to < 30s I think most users would prefer >>>>>>>>>>>>>>>>>>>>>>>>> it to link.exe, just hopping there is still some more optimizations to get >>>>>>>>>>>>>>>>>>>>>>>>> closer to ELF linking times (around 10-15s here). >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 9:50 PM, Zachary Turner < >>>>>>>>>>>>>>>>>>>>>>>>> zturner at google.com> wrote: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Generally speaking a good rule of thumb is that >>>>>>>>>>>>>>>>>>>>>>>>>> /debug:ghash will be close to or faster than /debug:fastlink, but with none >>>>>>>>>>>>>>>>>>>>>>>>>> of the penalties like slow debug time >>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 12:44 PM Zachary Turner < >>>>>>>>>>>>>>>>>>>>>>>>>> zturner at google.com> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Chrome is actually one of my exact benchmark >>>>>>>>>>>>>>>>>>>>>>>>>>> cases. When building blink_core.dll and browser_tests.exe, i get anywhere >>>>>>>>>>>>>>>>>>>>>>>>>>> from a 20-40% reduction in link time. We have some other optimizations in >>>>>>>>>>>>>>>>>>>>>>>>>>> the pipeline but not upstream yet. 
>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> My best time so far (including other >>>>>>>>>>>>>>>>>>>>>>>>>>> optimizations not yet upstream) is 28s on blink_core.dll, compared to 110s >>>>>>>>>>>>>>>>>>>>>>>>>>> with /debug >>>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 12:28 PM Leonardo >>>>>>>>>>>>>>>>>>>>>>>>>>> Santagada <santagada at gmail.com> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 9:05 PM, Zachary Turner >>>>>>>>>>>>>>>>>>>>>>>>>>>> <zturner at google.com> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> You probably don't want to go down the same >>>>>>>>>>>>>>>>>>>>>>>>>>>>> route that clang goes through to write the object file. If you think >>>>>>>>>>>>>>>>>>>>>>>>>>>>> yaml2coff is convoluted, the way clang does it will just give you a >>>>>>>>>>>>>>>>>>>>>>>>>>>>> headache. There are multiple abstractions involved to account for >>>>>>>>>>>>>>>>>>>>>>>>>>>>> different object file formats (ELF, COFF, MachO) and output formats >>>>>>>>>>>>>>>>>>>>>>>>>>>>> (Assembly, binary file). At least with yaml2coff >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> I think your phrase got cut there, but yeah I >>>>>>>>>>>>>>>>>>>>>>>>>>>> just found AsmPrinter.cpp and it is convoluted. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> It's true that yaml2coff is using the >>>>>>>>>>>>>>>>>>>>>>>>>>>>> COFFParser structure, but if you look at the writeCOFF >>>>>>>>>>>>>>>>>>>>>>>>>>>>> function in yaml2coff it's pretty bare-metal. The logic you need will be >>>>>>>>>>>>>>>>>>>>>>>>>>>>> almost identical, except that instead of checking the COFFParser for the >>>>>>>>>>>>>>>>>>>>>>>>>>>>> various fields, you'll check the existing COFFObjectFile, which should have >>>>>>>>>>>>>>>>>>>>>>>>>>>>> similar fields. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> The only thing you need to different is when >>>>>>>>>>>>>>>>>>>>>>>>>>>>> writing the section table and section contents, to insert a new entry. Since >>>>>>>>>>>>>>>>>>>>>>>>>>>>> you're injecting a section into the middle, you'll also probably need to >>>>>>>>>>>>>>>>>>>>>>>>>>>>> push back the file pointer of all subsequent sections so that they don't >>>>>>>>>>>>>>>>>>>>>>>>>>>>> overlap. (e.g. if the original sections are 1, 2, 3, 4, 5 and you insert >>>>>>>>>>>>>>>>>>>>>>>>>>>>> between 2 and 3, then the original sections 3, 4, and 5 would need to have >>>>>>>>>>>>>>>>>>>>>>>>>>>>> their FilePointerToRawData offset by the size of the new section). >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> I have the PE/COFF spec open here and I'm happy >>>>>>>>>>>>>>>>>>>>>>>>>>>> that I read a bit of it so I actually know what you are talking about... >>>>>>>>>>>>>>>>>>>>>>>>>>>> yeah it doesn't seem too complicated. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> If you need to know what values to put for the >>>>>>>>>>>>>>>>>>>>>>>>>>>>> other fields in a section header, run `dumpbin /headers foo.obj` on a >>>>>>>>>>>>>>>>>>>>>>>>>>>>> clang-generated object file that has a .debug$H section already (e.g. run >>>>>>>>>>>>>>>>>>>>>>>>>>>>> clang with -emit-codeview-ghash-section, and look at the properties of the >>>>>>>>>>>>>>>>>>>>>>>>>>>>> .debug$H section and use the same values). >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks I will do that and then also look at how >>>>>>>>>>>>>>>>>>>>>>>>>>>> the CodeView part of the code does it if I can't understand some of it. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The only invariant that needs to be maintained
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> is that Section[N]->FilePointerOfRawData ==
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Section[N-1]->FilePointerOfRawData +
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Section[N-1]->SizeOfRawData
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Well, that and all the sections need to be in
>>>>>>>>>>>>>>>>>>>>>>>>>>>> the final file... But I'm hopeful.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Does anyone have timings for linking a big project like
>>>>>>>>>>>>>>>>>>>>>>>>>>>> chrome with this, so that at least I know what kind of performance to expect?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> My numbers are something like:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 1 pdb per obj file: link.exe takes ~15 minutes
>>>>>>>>>>>>>>>>>>>>>>>>>>>> and 16GB of ram, lld-link.exe takes 2:30 minutes and ~8GB of ram
>>>>>>>>>>>>>>>>>>>>>>>>>>>> around 10 pdbs per folder: link.exe takes 1
>>>>>>>>>>>>>>>>>>>>>>>>>>>> minute and 2-3GB of ram, lld-link.exe takes 1:30 minutes and ~6GB of ram
>>>>>>>>>>>>>>>>>>>>>>>>>>>> fastlink: link.exe takes 40 seconds, but then 20
>>>>>>>>>>>>>>>>>>>>>>>>>>>> seconds of loading at the first breakpoint in the debugger, and we lost DIA
>>>>>>>>>>>>>>>>>>>>>>>>>>>> support for listing symbols.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> incremental: link.exe takes 8 seconds, but it
>>>>>>>>>>>>>>>>>>>>>>>>>>>> only happens when very minor changes happen.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> We have a non-negligible number of symbols
>>>>>>>>>>>>>>>>>>>>>>>>>>>> used on some runtime systems.
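The invariant and the "push back the file pointer of all subsequent sections" step discussed above can be sketched in C++ as follows. This is a simplified model, not the objcopy patch: `SectionHeader` is a hypothetical three-field struct (the real COFF header field is named PointerToRawData), and the code assumes raw data is packed back to back with no relocation tables or padding in between. In a real object file, adding an entry to the section table also moves everything after the table by the size of one section header, and relocation records and the symbol table pointer have to be shifted as well.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, reduced section header: only the fields needed here.
struct SectionHeader {
  std::string Name;
  uint32_t PointerToRawData; // file offset of the section contents
  uint32_t SizeOfRawData;    // size of the section contents in bytes
};

// Checks that each section starts exactly where the previous one ends.
bool isContiguous(const std::vector<SectionHeader> &Sections) {
  for (size_t N = 1; N < Sections.size(); ++N)
    if (Sections[N].PointerToRawData !=
        Sections[N - 1].PointerToRawData + Sections[N - 1].SizeOfRawData)
      return false;
  return true;
}

// Inserts a new section at position Index (0-based) and moves the raw data
// of every later section back by the size of the new one.
void insertSection(std::vector<SectionHeader> &Sections, size_t Index,
                   std::string Name, uint32_t Size) {
  assert(Index <= Sections.size());
  uint32_t Start = 0;
  if (Index < Sections.size())
    Start = Sections[Index].PointerToRawData; // take over this slot
  else if (!Sections.empty())
    Start = Sections.back().PointerToRawData + Sections.back().SizeOfRawData;

  for (size_t I = Index; I < Sections.size(); ++I)
    Sections[I].PointerToRawData += Size;

  Sections.insert(Sections.begin() + Index,
                  SectionHeader{std::move(Name), Start, Size});
}
```

With five original sections and an insertion at index 2, the former sections 3, 4 and 5 each move by the size of the new section, which is the example given earlier in this thread.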
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 11:52 AM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the tips, I now have something that reads the obj file, finds .debug$T sections and global hashes them (proof of concept kind of code). What I can't find is how clang itself writes the coff files with global hashes, as that might help me understand how to create the .debug$H section, how to update the file section count and how to properly write this back.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The code in yaml2coff expects to be working on the yaml COFFParser struct and I'm having quite a bit of a headache turning the COFFObjectFile into a COFFParser object or something compatible... Tomorrow I might try the very inefficient path of coff2yaml and then yaml2coff with the hashes header... but it seems way too inefficient and convoluted.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 10:38 PM, Zachary Turner <zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 1:02 PM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 9:44 PM, Zachary Turner <zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 12:29 PM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> No I didn't, I used cl.exe from the visual studio toolchain. What I'm proposing is a tool for processing .obj files in COFF format, reading them and generating the GHASH part.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> To make our build faster we use hundreds of unity build files (.cpp's with a lot of other .cpp's in them aka munch files) but still have a lot of single .cpp's as well (in total something like 3.4k .obj files).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ps: sorry for sending to the wrong list, I was reading about llvm mailing lists and jumped when I saw what I thought was a lld exclusive list.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> A tool like this would be useful, yes. We've talked about it internally as well and agreed it would be useful, we just haven't prioritized it. If you're interested in submitting a patch along those lines though, I think it would be a good addition.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm not sure what the best place for it would be. llvm-readobj and llvm-objdump seem like obvious choices, but they are intended to be read-only, so perhaps they wouldn't be a good fit.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> llvm-pdbutil is kind of a hodgepodge of everything else related to PDBs and symbols, so I wouldn't be opposed to making a new subcommand there called "ghash" or something that could process an object file and output a new object file with a .debug$H section.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> A third option would be to make a new tool for it.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I don't think it would be that hard to write. If you're interested in trying to make a patch for this, I can offer some guidance on where to look in the code. Otherwise it's something that we'll probably get to, I'm just not sure when.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would love to write it and contribute it back, please do tell. I did find some of the code of ghash in lld, but I'm fuzzy on the llvm codeview part of it and have never seen llvm-readobj/objdump or llvm-pdbutil, but I'm not afraid to look :)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Luckily all of the important code is hidden behind library calls, and it should already just do the right thing, so I suspect you won't need to know much about CodeView to do this.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think Peter has the right idea about putting this in llvm-objcopy.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> You can look at one of the existing CopyBinary functions there, which currently only work for ELF, but you can just make a new overload that accepts a COFFObjectFile.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would probably start by iterating over each of the sections (getNumberOfSections / getSectionName) looking for .debug$T and .debug$H sections.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If you find a .debug$H section then you can just skip that object file.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If you find a .debug$T but not a .debug$H, then basically do the same thing that LLD does in PDBLinker::mergeDebugT (create a CVTypeArray, and pass it to GloballyHashedType::hashTypes). That will return an array of hash values. (The format of .debug$H is the header, followed by the hash values.) Then when you're writing the list of sections, just add in the .debug$H section right after the .debug$T section.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Currently llvm-objcopy only writes ELF files, so it would need to be taught to write COFF files. We have code to do this in the yaml2obj utility (specifically, in yaml2coff.cpp in the function writeCOFF). There may be a way to move this code to somewhere else (llvm/Object/COFF.h?) so that it can be re-used by both yaml2coff and llvm-objcopy, but in the worst case scenario you could copy the code and re-write it to work with these new structures.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Lastly, you'll probably want to put all of this behind an option in llvm-objcopy such as -add-codeview-ghash-section
--
Leonardo Santagada
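The layout rule quoted above (each section's raw data starts where the previous section's raw data ends, and everything after an inserted section moves back by its size) can be sketched in isolation. This is only an illustration and not llvm-objcopy code: `SectionInfo`, `insertSectionAfter` and `isPacked` are hypothetical names, and the struct models just the two header fields that matter for layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a COFF section header; only the
// fields needed for the layout rule are modelled.
struct SectionInfo {
  std::string Name;
  uint32_t PointerToRawData; // file offset of the section contents
  uint32_t SizeOfRawData;    // size of the section contents in bytes
};

// Inserts NewSec right after the section at index After and shifts the raw
// data of every later section by the size of the new section.
// Assumes the sections are stored in file order and packed back to back.
void insertSectionAfter(std::vector<SectionInfo> &Sections, size_t After,
                        SectionInfo NewSec) {
  assert(After < Sections.size() && "no such section to insert after");
  const SectionInfo &Prev = Sections[After];
  NewSec.PointerToRawData = Prev.PointerToRawData + Prev.SizeOfRawData;
  for (size_t I = After + 1; I < Sections.size(); ++I)
    Sections[I].PointerToRawData += NewSec.SizeOfRawData;
  Sections.insert(Sections.begin() + After + 1, std::move(NewSec));
}

// Checks the invariant from the thread:
// Section[N].PointerToRawData ==
//     Section[N-1].PointerToRawData + Section[N-1].SizeOfRawData
bool isPacked(const std::vector<SectionInfo> &Sections) {
  for (size_t I = 1; I < Sections.size(); ++I)
    if (Sections[I].PointerToRawData !=
        Sections[I - 1].PointerToRawData + Sections[I - 1].SizeOfRawData)
      return false;
  return true;
}
```

For example, inserting an 8-byte `.debug$H` after the first of three packed sections moves the two later sections back by 8 bytes and leaves `isPacked` true. The sketch deliberately ignores everything else a real writer has to update: the extra 40-byte entry in the section table, relocation pointers, the symbol table offset and the section numbers stored in symbols.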
Leonardo Santagada via llvm-dev
2018-Jan-26 17:59 UTC
[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)
I'm now thinking that there's a bug in either obj2yaml or yaml2obj: if I
run just those two tools on my codebase, they generate yaml files that
can't be decoded. I will now try not adding any section to the obj file in
llvm-objcopy, to see if I can link with obj files that I rewrite (but
without adding symbols or sections).

One of the bugs that annoys me is that the TimeDateStamp is not carried
over when obj2yaml writes a file. Also, the layout function in yaml2coff
generates different indexes for the sections. None of them look wrong, but
it seems to leave some padding; I didn't have time to look too closely at
why.

On Fri, Jan 26, 2018 at 6:52 PM, Zachary Turner <zturner at google.com> wrote:

> Hmm, ok. In that case let me try again without my local changes. Maybe
> they are getting in the way :-/
>
> On Fri, Jan 26, 2018 at 9:51 AM Leonardo Santagada <santagada at gmail.com>
> wrote:
>
>> it is identical to me... weird.
>>
>> On Fri, Jan 26, 2018 at 6:49 PM, Zachary Turner <zturner at google.com>
>> wrote:
>>
>>> (Ignore the fact that my hashes are 8 byte in the "good" file, this is
>>> due to some local changes I've been experimenting with)
>>>
>>> On Fri, Jan 26, 2018 at 9:48 AM Zachary Turner <zturner at google.com>
>>> wrote:
>>>
>>>> I did this:
>>>>
>>>> // a.cpp
>>>> static int x = 0;
>>>> void b(int);
>>>> void a(int) {
>>>>   if (x)
>>>>     b(x);
>>>> }
>>>> int main(int argc, char **argv) {
>>>>   a(argc);
>>>>   return x;
>>>> }
>>>>
>>>> clang-cl /Z7 /c a.cpp /Foa.noghash.obj
>>>> clang-cl /Z7 /c a.cpp -mllvm -emit-codeview-ghash-section /Foa.ghash.good.obj
>>>> llvm-objcopy a.noghash.obj a.ghash.bad.obj
>>>> obj2yaml a.ghash.good.obj > a.ghash.good.yaml
>>>> obj2yaml a.ghash.bad.obj > a.ghash.bad.yaml
>>>>
>>>> Then open these 2 yaml files up in a diff viewer. It looks like the
>>>> hashes aren't getting emitted at all. For example, in the good yaml
>>>> file I see this:
>>>>
>>>>   - Name:            '.debug$H'
>>>>     Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_DISCARDABLE, IMAGE_SCN_MEM_READ ]
>>>>     Alignment:       4
>>>>     SectionData:     C5C93301000001005549419E78044E3896D45CD7009428758BE4A1E2B3E022BA267DEE221F5C42B17BCA182AF84584814A8B5E7E3FB17B397A9E3DEA75CD5627
>>>>     GlobalHashes:
>>>>       Version:         0
>>>>       HashAlgorithm:   1
>>>>       HashValues:
>>>>         - 5549419E78044E38
>>>>         - 96D45CD700942875
>>>>         - 8BE4A1E2B3E022BA
>>>>         - 267DEE221F5C42B1
>>>>         - 7BCA182AF8458481
>>>>         - 4A8B5E7E3FB17B39
>>>>         - 7A9E3DEA75CD5627
>>>>   - Name:            .pdata
>>>>
>>>> And in the bad yaml file I see this:
>>>>
>>>>   - Name:            '.debug$H'
>>>>     Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_DISCARDABLE, IMAGE_SCN_MEM_READ ]
>>>>     Alignment:       4
>>>>     SectionData:     C5C9330100000000
>>>>     GlobalHashes:
>>>>       Version:         0
>>>>       HashAlgorithm:   0
>>>>   - Name:            .pdata
>>>>
>>>> Don't focus too much on trying to figure out weird linker errors. Just
>>>> get the output of obj2yaml to be identical when run under a diff
>>>> utility, then everything should work fine.
>>>>
>>>> On Fri, Jan 26, 2018 at 7:27 AM Leonardo Santagada <santagada at gmail.com>
>>>> wrote:
>>>>
>>>>> I'm so close I can almost smell it :)
>>>>>
>>>>> I know how bad the code looks, I don't intend to submit this, but if
>>>>> you want to try it out it's at:
>>>>> https://gist.github.com/santagada/544136b1ee143bf31653b1158ac6829e
>>>>>
>>>>> I'm seeing: lld-link.exe: error: duplicate symbol:
>>>>> "<redacted_unmangled>" (<redacted>) in <internal> and in
>>>>> <redacted_filename>.obj. Looking at the .yaml dump, the symbols are
>>>>> all similar to this:
>>>>>
>>>>>   - Name:            <redacted>
>>>>>     Value:           0
>>>>>     SectionNumber:   0
>>>>>     SimpleType:      IMAGE_SYM_TYPE_NULL
>>>>>     ComplexType:     IMAGE_SYM_DTYPE_FUNCTION
>>>>>     StorageClass:    IMAGE_SYM_CLASS_WEAK_EXTERNAL
>>>>>     WeakExternal:
>>>>>       TagIndex:        134
>>>>>       Characteristics: IMAGE_WEAK_EXTERN_SEARCH_LIBRARY
>>>>>
>>>>> On Thu, Jan 25, 2018 at 8:01 PM, Zachary Turner <zturner at google.com>
>>>>> wrote:
>>>>>
>>>>>> I haven't really dabbled in this part of the COFF format personally,
>>>>>> so hopefully I'm not leading you astray :)
>>>>>>
>>>>>> But I checked the code for coff2yaml, and I see this:
>>>>>>
>>>>>>     } else if (Symbol.isSectionDefinition()) {
>>>>>>       // This symbol represents a section definition.
>>>>>>       assert(Symbol.getNumberOfAuxSymbols() == 1 &&
>>>>>>              "Expected a single aux symbol to describe this section!");
>>>>>>       const object::coff_aux_section_definition *ObjSD =
>>>>>>           reinterpret_cast<const object::coff_aux_section_definition *>(
>>>>>>               AuxData.data());
>>>>>>
>>>>>> So it looks like you need exactly 1 aux symbol for each section
>>>>>> symbol.
>>>>>>
>>>>>> I then scrolled up in this function to figure out where AuxData comes
>>>>>> from, and it comes from COFFObjectFile::getSymbolAuxData. I think
>>>>>> that function holds the clue to what you need to do. It looks like you
>>>>>> need to set coff::symbol::NumberOfAuxSymbols to 1, and then there is
>>>>>> a comment in getSymbolAuxData which says:
>>>>>>
>>>>>>     // AUX data comes immediately after the symbol in COFF
>>>>>>     Aux = reinterpret_cast<const uint8_t *>(Symbol.getRawPtr()) +
>>>>>>           SymbolSize;
>>>>>>
>>>>>> So I think you just need to write the bytes immediately after the
>>>>>> coff::symbol. The thing you need to write looks like a
>>>>>> coff::coff_aux_section_definition structure.
>>>>>>
>>>>>> For the CheckSum, look at WinCOFFObjectWriter::writeSection. It
>>>>>> looks like it's a CRC32 of the actual section contents, which you can
>>>>>> generate with a couple of lines of code:
>>>>>>
>>>>>>     JamCRC JC(/*Init=*/0);
>>>>>>     JC.update(DebugHContents);
>>>>>>     AuxSymbol.CheckSum = JC.getCRC();
>>>>>>
>>>>>> Hope this helps

--
Leonardo Santagada
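For anyone who wants to see what the quoted JamCRC snippet computes without pulling in LLVM, here is a small standalone sketch. It assumes, based on my reading of JamCRC, that the checksum is the usual reflected CRC-32 (polynomial 0xEDB88320) run from a caller-supplied initial value with no final XOR. `crc32Update` is a hypothetical helper name, not an LLVM function, and the result should be compared against what clang emits before relying on it.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Reflected CRC-32 update (polynomial 0xEDB88320), computed bit by bit,
// starting from the caller-supplied value and without a final XOR.
// Hypothetical stand-in for JamCRC::update followed by getCRC().
uint32_t crc32Update(uint32_t Crc, const std::string &Data) {
  for (unsigned char Byte : Data) {
    Crc ^= Byte;
    for (int Bit = 0; Bit < 8; ++Bit)
      // Shift right; fold in the polynomial when the dropped bit was set.
      Crc = (Crc >> 1) ^ (0xEDB88320u & (0u - (Crc & 1u)));
  }
  return Crc;
}
```

With an initial value of 0xFFFFFFFF this yields the standard CRC-32/JAMCRC check value 0x340BC6D9 for the string "123456789". The snippet from WinCOFFObjectWriter passes an initial value of 0 instead, so under the same assumption the CheckSum for the aux symbol would be `crc32Update(0, Contents)`, where `Contents` holds the raw bytes of the .debug$H section.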