Leonardo Santagada via llvm-dev
2018-Jan-26 15:27 UTC
[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)
I'm so close I can almost smell it :) I know how bad the code looks, and I
don't intend to submit this, but if you want to try it out it's at:
https://gist.github.com/santagada/544136b1ee143bf31653b1158ac6829e

I'm seeing:

lld-link.exe: error: duplicate symbol: "<redacted_unmangled>" (<redacted>) in <internal> and in <redacted_filename>.obj

Looking at the .yaml dump, the symbols are all similar to this:

  - Name: <redacted>
    Value: 0
    SectionNumber: 0
    SimpleType: IMAGE_SYM_TYPE_NULL
    ComplexType: IMAGE_SYM_DTYPE_FUNCTION
    StorageClass: IMAGE_SYM_CLASS_WEAK_EXTERNAL
    WeakExternal:
      TagIndex: 134
      Characteristics: IMAGE_WEAK_EXTERN_SEARCH_LIBRARY

On Thu, Jan 25, 2018 at 8:01 PM, Zachary Turner <zturner at google.com> wrote:

> I haven't really dabbled in this part of the COFF format personally, so
> hopefully I'm not leading you astray :)
>
> But I checked the code for coff2yaml, and I see this:
>
>   } else if (Symbol.isSectionDefinition()) {
>     // This symbol represents a section definition.
>     assert(Symbol.getNumberOfAuxSymbols() == 1 &&
>            "Expected a single aux symbol to describe this section!");
>     const object::coff_aux_section_definition *ObjSD =
>         reinterpret_cast<const object::coff_aux_section_definition *>(
>             AuxData.data());
>
> So it looks like you need exactly 1 aux symbol for each section symbol.
>
> I then scrolled up in this function to figure out where AuxData comes
> from, and it comes from COFFObjectFile::getSymbolAuxData. I think that
> function holds the clue to what you need to do. It looks like you need to
> set coff::symbol::NumberOfAuxSymbols to 1, and then there is a comment in
> getSymbolAuxData which says:
>
>   // AUX data comes immediately after the symbol in COFF
>   Aux = reinterpret_cast<const uint8_t *>(Symbol.getRawPtr()) +
>         SymbolSize;
>
> So I think you just need to write the bytes immediately after the
> coff::symbol. The thing you need to write looks like a
> coff::coff_aux_section_definition structure.
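The layout described above (a section symbol with NumberOfAuxSymbols set to 1, followed immediately by its aux record) can be sketched roughly as follows. This is only an illustration, assuming the regular (non-bigobj) on-disk format from the PE/COFF specification, where the symbol record and the aux section-definition record are 18 bytes each. `SymbolRecord`, `AuxSectionDefinition`, `putLE` and `appendSectionSymbol` are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical in-memory view of a COFF symbol (18 bytes on disk).
struct SymbolRecord {
  std::string Name;               // short name, at most 8 bytes
  uint32_t Value = 0;
  int16_t SectionNumber = 0;      // 1-based section index
  uint16_t Type = 0;              // IMAGE_SYM_TYPE_NULL
  uint8_t StorageClass = 3;       // IMAGE_SYM_CLASS_STATIC
  uint8_t NumberOfAuxSymbols = 0;
};

// Hypothetical in-memory view of the aux section definition (18 bytes on disk).
struct AuxSectionDefinition {
  uint32_t Length = 0;
  uint16_t NumberOfRelocations = 0;
  uint16_t NumberOfLinenumbers = 0;
  uint32_t CheckSum = 0;
  uint16_t Number = 0;
  uint8_t Selection = 0;
};

// Append the low `Bytes` bytes of V in little-endian order.
static void putLE(std::vector<uint8_t> &Out, uint64_t V, unsigned Bytes) {
  for (unsigned I = 0; I < Bytes; ++I)
    Out.push_back(static_cast<uint8_t>(V >> (8 * I)));
}

// Write the symbol record, then the aux record directly after it.
void appendSectionSymbol(std::vector<uint8_t> &Out, const SymbolRecord &Sym,
                         const AuxSectionDefinition &Aux) {
  assert(Sym.Name.size() <= 8 && "longer names need the string table");
  assert(Sym.NumberOfAuxSymbols == 1 && "section symbols carry one aux record");

  char Name[8] = {};  // zero-padded short name
  std::memcpy(Name, Sym.Name.data(), Sym.Name.size());
  Out.insert(Out.end(), Name, Name + 8);
  putLE(Out, Sym.Value, 4);
  putLE(Out, static_cast<uint16_t>(Sym.SectionNumber), 2);
  putLE(Out, Sym.Type, 2);
  putLE(Out, Sym.StorageClass, 1);
  putLE(Out, Sym.NumberOfAuxSymbols, 1);

  putLE(Out, Aux.Length, 4);
  putLE(Out, Aux.NumberOfRelocations, 2);
  putLE(Out, Aux.NumberOfLinenumbers, 2);
  putLE(Out, Aux.CheckSum, 4);
  putLE(Out, Aux.Number, 2);
  putLE(Out, Aux.Selection, 1);
  putLE(Out, 0, 3);  // unused bytes, pads the aux record to 18 bytes
}
```

Serializing field by field avoids depending on struct packing or host endianness; the real values for each field would be copied from what obj2yaml shows for a clang-cl generated .debug$H symbol.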
>
> For the CheckSum, look at WinCOFFObjectWriter::writeSection. It looks
> like it's a CRC32 of the actual section contents, which you can generate
> with a couple of lines of code:
>
>   JamCRC JC(/*Init=*/0);
>   JC.update(DebugHContents);
>   AuxSymbol.CheckSum = JC.getCRC();
>
> Hope this helps
>
> On Thu, Jan 25, 2018 at 10:46 AM Leonardo Santagada <santagada at gmail.com>
> wrote:
>
>> I see that there is an aux symbol per section symbol, and also in the yaml
>> representation there is a checksum, selection and unused, all of which I
>> have no idea how to fill in. Also, this aux symbol might have some important
>> information for me to patch on the other symbols. Can you find the part in
>> llvm that writes those? Because at least for the aux symbol, the yaml part
>> of the code treats it as a binary blob, so there is no info on what they
>> should be.
>>
>> On Thu, Jan 25, 2018 at 7:15 PM, Zachary Turner <zturner at google.com>
>> wrote:
>>
>>> If you run obj2yaml against a very simple object file, you'll see
>>> something like this at the end:
>>> ```
>>> symbols:
>>>   - Name: '@comp.id'
>>>     Value: 17130443
>>>     SectionNumber: -1
>>>     SimpleType: IMAGE_SYM_TYPE_NULL
>>>     ComplexType: IMAGE_SYM_DTYPE_NULL
>>>     StorageClass: IMAGE_SYM_CLASS_STATIC
>>>   - Name: '@feat.00'
>>>     Value: 2147484048
>>>     SectionNumber: -1
>>>     SimpleType: IMAGE_SYM_TYPE_NULL
>>>     ComplexType: IMAGE_SYM_DTYPE_NULL
>>>     StorageClass: IMAGE_SYM_CLASS_STATIC
>>>   - Name: .drectve
>>>     Value: 0
>>>     SectionNumber: 1
>>>     SimpleType: IMAGE_SYM_TYPE_NULL
>>>     ComplexType: IMAGE_SYM_DTYPE_NULL
>>>     StorageClass: IMAGE_SYM_CLASS_STATIC
>>>     SectionDefinition:
>>>       Length: 47
>>>       NumberOfRelocations: 0
>>>       NumberOfLinenumbers: 0
>>>       CheckSum: 0
>>>       Number: 0
>>> ...
>>> ```
>>>
>>> There's a structure called coff::symbol which basically represents each
>>> one of these records.
>>> It looks like this:
>>>
>>> ```
>>> struct symbol {
>>>   char Name[NameSize];
>>>   uint32_t Value;
>>>   int32_t SectionNumber;
>>>   uint16_t Type;
>>>   uint8_t StorageClass;
>>>   uint8_t NumberOfAuxSymbols;
>>> };
>>> ```
>>>
>>> So you'll need to create one for the .debug$H section and stick it into
>>> the list. This particular list doesn't have to be in any special order, so
>>> you can just put it at the end (although it's probably not that much harder
>>> to insert into the middle, and it will make for a good test that you've
>>> done it right, since the output can then be diffed against a clang-cl
>>> object file and should be identical). So write all the normal symbols as
>>> you probably already are, then write one for the .debug$H section.
>>> Initialize the fields to the same thing that you see when you run obj2yaml
>>> against an object file generated by clang-cl for the .debug$H section.
>>>
>>> This structure doesn't contain any kind of file pointers or offsets, so
>>> all you really need to fix up are the "SectionNumber" fields. Basically as
>>> you are writing the existing symbols, you would do something like:
>>>
>>>   for (auto Sym : ObjFile.symbols()) {  // copy, so it can be patched
>>>     // Symbols in sections at or after the insertion point move up by one.
>>>     if (Sym.SectionNumber >= DebugHInsertionIndex)
>>>       ++Sym.SectionNumber;
>>>     writeSymbol(Sym);
>>>   }
>>>   writeSymbol(DebugHSym);
>>>
>>>
>>> On Thu, Jan 25, 2018 at 9:57 AM Leonardo Santagada <santagada at gmail.com>
>>> wrote:
>>>
>>>> Any idea on how to create this new symbol there? I saw that there is a
>>>> symbol pointing to each section, but didn't understand the format, and
>>>> yaml2obj doesn't check it or do anything with the list.
>>>>
>>>> On Thu, Jan 25, 2018 at 6:56 PM, Leonardo Santagada <
>>>> santagada at gmail.com> wrote:
>>>>
>>>>> YES, THANK YOU... I WAS THINKING THIS BUT COMPLETELY FORGOT.
>>>>>
>>>>> sorry for the caps... it was a long day of working on this, and using
>>>>> vs 2017, which adds a new section type .chks64 that I couldn't find
>>>>> documentation for anywhere, was difficult.
I highly recommend everyone to just not using vs >>>>> 2017 until 15.8 or something, our internal bug list is gigantic. >>>>> >>>>> On Thu, Jan 25, 2018 at 6:52 PM, Zachary Turner <zturner at google.com> >>>>> wrote: >>>>> >>>>>> Actually I already have a theory that even though you are adding the >>>>>> section to the section table, you might not be adding a *symbol* for the >>>>>> section to the symbol table. So the existing symbols (which reference >>>>>> sections by index) will all be wrong because you've inserted a new >>>>>> section. Still though, obj2yaml would expose that. >>>>>> >>>>>> On Thu, Jan 25, 2018 at 9:50 AM Zachary Turner <zturner at google.com> >>>>>> wrote: >>>>>> >>>>>>> Yea as long as you compare clang-cl object file with automatically >>>>>>> generated .debug$H section against clang-cl object file without .debug$H >>>>>>> but added after the fact with llvm-objcopy, that should expose the problem >>>>>>> I think when you run obj2yaml on them. >>>>>>> >>>>>>> On Thu, Jan 25, 2018 at 9:49 AM Leonardo Santagada < >>>>>>> santagada at gmail.com> wrote: >>>>>>> >>>>>>>> I did reorder my sections, so that .debug$H is in the correct >>>>>>>> place, but now I get some errors on dubplicate symbols, I created a folder >>>>>>>> with examples: >>>>>>>> >>>>>>>> https://www.dropbox.com/sh/nmvzi44pi0boe76/ >>>>>>>> AAA0f47O5PCJ9JiUc6wVuwBra?dl=0 >>>>>>>> >>>>>>>> t.obj is generated by vs 2015 and it links fine with lld-link.exe, >>>>>>>> but tout.obj gives this errors: >>>>>>>> >>>>>>>> lld-link.exe /DEBUG:GHASH tout.obj >>>>>>>> LLD-LINK.EXE: error: duplicate symbol: __local_stdio_printf_options >>>>>>>> in tout.obj and in LIBCMT.lib(default_local_stdio_options.obj) >>>>>>>> LLD-LINK.EXE: error: duplicate symbol: __local_stdio_printf_options >>>>>>>> in tout.obj and in libvcruntime.lib(undname.obj) >>>>>>>> >>>>>>>> I'm using PEView from http://wjradburn.com/software/ to look at >>>>>>>> the files and can't see anything wrong, except some valid 
differences in >>>>>>>> the offsets being used for the data (so pointer to data is different >>>>>>>> between them). >>>>>>>> >>>>>>>> I will look into yaml2obj now to see if I see anything else weird >>>>>>>> going on. >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Jan 25, 2018 at 6:41 PM, Zachary Turner <zturner at google.com >>>>>>>> > wrote: >>>>>>>> >>>>>>>>> I'm pretty confident that cl is not putting anything strange in >>>>>>>>> the .debug$T sections. We've done a lot of testing and never seen anything >>>>>>>>> except CodeView type records in a .debug$T. My hunch is that your objcopy >>>>>>>>> patch is probably not doing the right thing in one or more of the section >>>>>>>>> headers, and this is confusing the linker. >>>>>>>>> >>>>>>>>> One idea might be to build a simple object file with clang-cl but >>>>>>>>> without the magic -mllvm -emit-codeview-ghash-section, then run your >>>>>>>>> llvm-objcopy on it. Then build the same object file passing -mllvm >>>>>>>>> -emit-codeview-ghash-section. Then run obj2yaml on both and diff the >>>>>>>>> results. They should be byte-for-byte identical. That should give you a >>>>>>>>> clue about if objcopy is doing something wrong. >>>>>>>>> >>>>>>>>> On Thu, Jan 25, 2018 at 2:21 AM Leonardo Santagada < >>>>>>>>> santagada at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Don't worry, I definetly want to perfect this to generate legal >>>>>>>>>> obj files, this is just to speed up testing. 
>>>>>>>>>> >>>>>>>>>> Now after patching all the obj files I get this errors when >>>>>>>>>> linking a small part of our code base (msvc 2017 15.5.3, lld and >>>>>>>>>> llvm-objcopy 7.0.0): >>>>>>>>>> lld-link.exe : error : relocation against symbol in discarded >>>>>>>>>> section: $LN8 >>>>>>>>>> lld-link.exe : error : relocation against symbol in discarded >>>>>>>>>> section: $LN43 >>>>>>>>>> lld-link.exe : error : relocation against symbol in discarded >>>>>>>>>> section: $LN37 >>>>>>>>>> >>>>>>>>>> I'm starting to guess that cl.exe might be putting some random >>>>>>>>>> comdat or other discardable symbols in the .debug$T and clang doesn't? I >>>>>>>>>> will try to debug this and see what more I can uncover. >>>>>>>>>> >>>>>>>>>> Linking works perfectly without my llvm-objcopy pass to add >>>>>>>>>> .debug$H? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Thu, Jan 25, 2018 at 1:53 AM, Zachary Turner < >>>>>>>>>> zturner at google.com> wrote: >>>>>>>>>> >>>>>>>>>>> It might not influence LLD, but at the same time we don't want >>>>>>>>>>> to upstream something that is producing technically illegal COFF files. >>>>>>>>>>> Also good to hear about the planned changes to your header files. Looking >>>>>>>>>>> forward to hearing about your experiences with clang-cl. >>>>>>>>>>> >>>>>>>>>>> On Wed, Jan 24, 2018 at 10:41 AM Leonardo Santagada < >>>>>>>>>>> santagada at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi, >>>>>>>>>>>> >>>>>>>>>>>> I finally got my first .obj file patched with .debug$H to look >>>>>>>>>>>> somewhat right. I added the new section at the end of the file so I don't >>>>>>>>>>>> have to recalculate all sections (although now I probably could position it >>>>>>>>>>>> in the middle, knowing that each section is: SizeOfRawData + (last.Header.NumberOfRelocations >>>>>>>>>>>> * (4+4+2)) and the $H needs to come right after $T in the file). That >>>>>>>>>>>> although illegal based on the coff specs doesn't seem its going to >>>>>>>>>>>> influence lld. 
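As a rough illustration of the size arithmetic in the message above: a COFF relocation record is 10 bytes on disk (a 4-byte VirtualAddress, a 4-byte SymbolTableIndex and a 2-byte Type), which is where the (4+4+2) comes from. The sketch below assumes, as that message does, that a section's relocations sit directly after its raw data; a real file locates them through PointerToRelocations, so this is not guaranteed. `sectionEndOffset` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// On-disk size of one COFF relocation record:
// VirtualAddress (4) + SymbolTableIndex (4) + Type (2).
constexpr uint64_t RelocationSize = 4 + 4 + 2;

// Hypothetical helper: file offset just past a section's raw data and its
// relocations, assuming the relocations immediately follow the raw data.
uint64_t sectionEndOffset(uint32_t PointerToRawData, uint32_t SizeOfRawData,
                          uint16_t NumberOfRelocations) {
  assert((SizeOfRawData == 0 || PointerToRawData != 0) &&
         "a section with data needs a file pointer");
  return uint64_t(PointerToRawData) + SizeOfRawData +
         uint64_t(NumberOfRelocations) * RelocationSize;
}
```

For example, a section with 100 bytes of data at offset 1000 and 3 relocations would end at offset 1130 under this assumption.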
>>>>>>>>>>>> >>>>>>>>>>>> Also we talked and we are probably going to do something >>>>>>>>>>>> similar to a bunch of windows defines and a check for our own define (to >>>>>>>>>>>> guarantee that no one imported windows.h before win32.h) and drop the >>>>>>>>>>>> namespace and the conflicting names. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Tue, Jan 23, 2018 at 12:46 AM, Zachary Turner < >>>>>>>>>>>> zturner at google.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> That's very possible that a 3rd party indirect header include >>>>>>>>>>>>> is involved. One idea might be like I suggested where you #define >>>>>>>>>>>>> _WINDOWS_ in win32.h and guarantee that it's always included first. Then >>>>>>>>>>>>> those other headers won't be able to #include <windows.h>. but it will >>>>>>>>>>>>> probably greatly expand the amount of stuff you have to add to win32.h, as >>>>>>>>>>>>> you will probably find some callers of functions that aren't yet in your >>>>>>>>>>>>> win32.h that you'd have to add. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, Jan 22, 2018 at 3:28 PM Leonardo Santagada < >>>>>>>>>>>>> santagada at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Ok some information was lost on getting this example to you, >>>>>>>>>>>>>> I'm sorry for not being clear. >>>>>>>>>>>>>> >>>>>>>>>>>>>> We have a huge code base, let's say 90% of it doesn't include >>>>>>>>>>>>>> either header, 9% include win32.h and 1% includes both, I will try to >>>>>>>>>>>>>> discover why, but my guess is they include both a third party that includes >>>>>>>>>>>>>> windows.h and some of our libs that use win32.h. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I will try to fully understand this tomorrow. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I guess clang will not implement this ever so finishing the >>>>>>>>>>>>>> object copier is the best solution until all code is ported to clang. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> On 23 Jan 2018 00:02, "Zachary Turner" <zturner at google.com> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> You said win32.h doesn't include windows.h, but main.cpp >>>>>>>>>>>>>>> does. So what's the disadvantage of just including it in win32.h anyway, >>>>>>>>>>>>>>> since it's already going to be in every translation unit? (Unless you >>>>>>>>>>>>>>> didn't mean to #include it in main.cpp) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I guess all I can do is warn you how bad of an idea this >>>>>>>>>>>>>>> is. For starters, I already found a bug in your code ;-) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> // stdint.h >>>>>>>>>>>>>>> typedef int int32_t; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> // winnt.h >>>>>>>>>>>>>>> typedef long LONG; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> // windef.h >>>>>>>>>>>>>>> typedef struct tagPOINT >>>>>>>>>>>>>>> { >>>>>>>>>>>>>>> LONG x; // long x >>>>>>>>>>>>>>> LONG y; // long y >>>>>>>>>>>>>>> } POINT, *PPOINT, NEAR *NPPOINT, FAR *LPPOINT; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> // win32.h >>>>>>>>>>>>>>> typedef int32_t LONG; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> struct POINT >>>>>>>>>>>>>>> { >>>>>>>>>>>>>>> LONG x; // int x >>>>>>>>>>>>>>> LONG y; // int y >>>>>>>>>>>>>>> }; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> So POINT is defined two different ways. In your minimal >>>>>>>>>>>>>>> interface, it's declared as 2 int32's, which are int. In the actual >>>>>>>>>>>>>>> Windows header files, it's declared as 2 longs. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> This might seem like a unimportant bug since int and long >>>>>>>>>>>>>>> are the same size, but int and long also mangle differently and affect >>>>>>>>>>>>>>> overload resolution, so you could have weird linker errors or call the >>>>>>>>>>>>>>> wrong function overload. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Plus, it illustrates the fact that this struct *actually is* >>>>>>>>>>>>>>> a different type from the one in the windows header. 
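The point about int and long can be seen in a small standalone sketch. The two struct names below are hypothetical stand-ins for the POINT from the Windows headers and the one from win32.h; `int` is used directly for the second one on the assumption that int32_t is int, as it is on MSVC.

```cpp
#include <string>
#include <type_traits>

// int and long are distinct types even on targets where they have the same
// size, so they mangle differently and select different overloads.
static_assert(!std::is_same<int, long>::value,
              "int and long are distinct types");

// Hypothetical stand-ins for the two POINT definitions.
struct PointFromWindows { long x; long y; };  // LONG is long in winnt.h
struct PointFromWin32   { int x;  int y;  };  // LONG redefined as int32_t

std::string describe(int)  { return "int overload"; }
std::string describe(long) { return "long overload"; }

// The same-looking call picks a different overload for each struct.
std::string describeX(const PointFromWindows &P) { return describe(P.x); }
std::string describeX(const PointFromWin32 &P)   { return describe(P.x); }
```

Calling `describeX` on each struct returns a different string, which is the kind of silent divergence that can turn into a linker error or a wrong overload once both definitions are visible in one program.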
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> You said at the end that you never intentionally import >>>>>>>>>>>>>>> win32.h and windows.h from the same translation unit. But then in this >>>>>>>>>>>>>>> example you did. I wonder if you could enforce that by doing this: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> // win32.h >>>>>>>>>>>>>>> #pragma once >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> // Error if windows.h was included before us. >>>>>>>>>>>>>>> #if defined(_WINDOWS_) >>>>>>>>>>>>>>> #error "You're including win32.h after having already >>>>>>>>>>>>>>> included windows.h. Don't do this!" >>>>>>>>>>>>>>> #endif >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> // And also make sure windows.h can't get included after us >>>>>>>>>>>>>>> #define _WINDOWS_ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> For the record, I tried the test case you linked when >>>>>>>>>>>>>>> windows.h is not included in main.cpp and it works (but still has the bug >>>>>>>>>>>>>>> about int and long). >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 2:23 PM Leonardo Santagada < >>>>>>>>>>>>>>> santagada at gmail.com> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> It is super gross, but we copy parts of windows.h because >>>>>>>>>>>>>>>> having all of it if both gigantic and very very messy. So our win32.h has a >>>>>>>>>>>>>>>> couple thousands of lines and not 30k+ for windows.h and we try to have >>>>>>>>>>>>>>>> zero macros. Win32.h doesn't include windows.h so using ::BOOL wouldn't >>>>>>>>>>>>>>>> work. We don't want to create a namespace, we just want a cleaner interface >>>>>>>>>>>>>>>> to windows api. The namespace with c linkage is the way to trick cl into >>>>>>>>>>>>>>>> allowing us to in some files have both windows.h and Win32.h. 
I really >>>>>>>>>>>>>>>> don't see any way for us to have this Win32.h without this cl support, so >>>>>>>>>>>>>>>> maybe we should either put windows.h in a compiled header somewhere and not >>>>>>>>>>>>>>>> care that it is infecting everything or just have one place we can call to >>>>>>>>>>>>>>>> clean up after including windows.h (a massive set of undefs). >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> So using can't work, because we never intentionally import >>>>>>>>>>>>>>>> windows.h and win32.h on the same translation unit. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 7:08 PM, Zachary Turner < >>>>>>>>>>>>>>>> zturner at google.com> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> This is pretty gross, honestly :) >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Can't you just use using declarations? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> namespace Win32 { >>>>>>>>>>>>>>>>> extern "C" { >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> using ::BOOL; >>>>>>>>>>>>>>>>> using ::LONG; >>>>>>>>>>>>>>>>> using ::POINT; >>>>>>>>>>>>>>>>> using ::LPPOINT; >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> using ::GetCursorPos; >>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> This works with clang-cl. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 5:39 AM Leonardo Santagada < >>>>>>>>>>>>>>>>> santagada at gmail.com> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Here it is a minimal example, we do this so we don't have >>>>>>>>>>>>>>>>>> to import the whole windows api everywhere. 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> https://gist.github.com/santagada/ >>>>>>>>>>>>>>>>>> 7977e929d31c629c4bf18ebb987f6be3 >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Sun, Jan 21, 2018 at 2:31 AM, Zachary Turner < >>>>>>>>>>>>>>>>>> zturner at google.com> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Clang-cl maintains compatibility with msvc even in cases >>>>>>>>>>>>>>>>>>> where it’s non standards compliant (eg 2 phase name lookup), but we try to >>>>>>>>>>>>>>>>>>> keep these cases few and far between. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> To help me understand your case, do you mean you copy >>>>>>>>>>>>>>>>>>> windows.h and modify it? How does this lead to the same struct being >>>>>>>>>>>>>>>>>>> defined twice? If i were to write this: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> struct Foo {}; >>>>>>>>>>>>>>>>>>> struct Foo {}; >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Is this a small repro of the issue you’re talking about? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 3:44 PM Leonardo Santagada < >>>>>>>>>>>>>>>>>>> santagada at gmail.com> wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I can totally see something like incremental linking >>>>>>>>>>>>>>>>>>>> with a simple padding between obj and a mapping file (which can also help >>>>>>>>>>>>>>>>>>>> with edit and continue, something we also would love to have). >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> We have another developer doing the port to support >>>>>>>>>>>>>>>>>>>> clang-cl, but although most of our code also goes trough a version of >>>>>>>>>>>>>>>>>>>> clang, migrating the rest to clang-cl has been a fight. From what I heard >>>>>>>>>>>>>>>>>>>> the main problem is that we have a copy of parts of windows.h (so not to >>>>>>>>>>>>>>>>>>>> bring the awful parts of it like lower case macros) and that totally works >>>>>>>>>>>>>>>>>>>> on cl, but clang (at least 6.0) complains about two struct/vars with the >>>>>>>>>>>>>>>>>>>> same name, even though they are exactly the same. 
Making clang-cl as broken >>>>>>>>>>>>>>>>>>>> as cl.exe is not an option I suppose? I would love to turn on a flag >>>>>>>>>>>>>>>>>>>> --accept-that-cl-made-bad-decisions-and-live-with-it >>>>>>>>>>>>>>>>>>>> and have this at least until this is completely fixed in our code base. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> the biggest win with moving to cl would be a better >>>>>>>>>>>>>>>>>>>> more standards compliant compiler, no 1 minute compiles on heavily >>>>>>>>>>>>>>>>>>>> templated files and maybe the holy grail of ThinLTO. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 10:56 PM, Zachary Turner < >>>>>>>>>>>>>>>>>>>> zturner at google.com> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> 10-15s will be hard without true incremental linking. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> At some point that's going to be the only way to get >>>>>>>>>>>>>>>>>>>>> any faster, but incremental linking is hard (putting it lightly), and since >>>>>>>>>>>>>>>>>>>>> our full links are already really fast we think we can get reasonably close >>>>>>>>>>>>>>>>>>>>> to link.exe incremental speeds with full links. But it's never enough and >>>>>>>>>>>>>>>>>>>>> I will always want it to be faster, so you may see incremental linking in >>>>>>>>>>>>>>>>>>>>> the future after we hit a performance wall with full link speed :) >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> In any case, I'm definitely interested in seeing what >>>>>>>>>>>>>>>>>>>>> kind of numbers you get with /debug:ghash after you get this llvm-objcopy >>>>>>>>>>>>>>>>>>>>> feature implemented. So keep me updated :) >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> As an aside, have you tried building with clang >>>>>>>>>>>>>>>>>>>>> instead of cl? If you build with clang you wouldn't even have to do this >>>>>>>>>>>>>>>>>>>>> llvm-objcopy work, because it would "just work". If you've tried but ran >>>>>>>>>>>>>>>>>>>>> into issues I'm interested in hearing about those too. 
On the other hand, >>>>>>>>>>>>>>>>>>>>> it's also reasonable to only switch one thing at a time. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 1:34 PM Leonardo Santagada < >>>>>>>>>>>>>>>>>>>>> santagada at gmail.com> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> if we get to < 30s I think most users would prefer it >>>>>>>>>>>>>>>>>>>>>> to link.exe, just hopping there is still some more optimizations to get >>>>>>>>>>>>>>>>>>>>>> closer to ELF linking times (around 10-15s here). >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 9:50 PM, Zachary Turner < >>>>>>>>>>>>>>>>>>>>>> zturner at google.com> wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Generally speaking a good rule of thumb is that >>>>>>>>>>>>>>>>>>>>>>> /debug:ghash will be close to or faster than /debug:fastlink, but with none >>>>>>>>>>>>>>>>>>>>>>> of the penalties like slow debug time >>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 12:44 PM Zachary Turner < >>>>>>>>>>>>>>>>>>>>>>> zturner at google.com> wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Chrome is actually one of my exact benchmark cases. >>>>>>>>>>>>>>>>>>>>>>>> When building blink_core.dll and browser_tests.exe, i get anywhere from a >>>>>>>>>>>>>>>>>>>>>>>> 20-40% reduction in link time. We have some other optimizations in the >>>>>>>>>>>>>>>>>>>>>>>> pipeline but not upstream yet. 
>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> My best time so far (including other optimizations >>>>>>>>>>>>>>>>>>>>>>>> not yet upstream) is 28s on blink_core.dll, compared to 110s with /debug >>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 12:28 PM Leonardo Santagada >>>>>>>>>>>>>>>>>>>>>>>> <santagada at gmail.com> wrote: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 9:05 PM, Zachary Turner < >>>>>>>>>>>>>>>>>>>>>>>>> zturner at google.com> wrote: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> You probably don't want to go down the same route >>>>>>>>>>>>>>>>>>>>>>>>>> that clang goes through to write the object file. If you think yaml2coff >>>>>>>>>>>>>>>>>>>>>>>>>> is convoluted, the way clang does it will just give you a headache. There >>>>>>>>>>>>>>>>>>>>>>>>>> are multiple abstractions involved to account for different object file >>>>>>>>>>>>>>>>>>>>>>>>>> formats (ELF, COFF, MachO) and output formats (Assembly, binary file). At >>>>>>>>>>>>>>>>>>>>>>>>>> least with yaml2coff >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I think your phrase got cut there, but yeah I just >>>>>>>>>>>>>>>>>>>>>>>>> found AsmPrinter.cpp and it is convoluted. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> It's true that yaml2coff is using the COFFParser >>>>>>>>>>>>>>>>>>>>>>>>>> structure, but if you look at the writeCOFF >>>>>>>>>>>>>>>>>>>>>>>>>> function in yaml2coff it's pretty bare-metal. The logic you need will be >>>>>>>>>>>>>>>>>>>>>>>>>> almost identical, except that instead of checking the COFFParser for the >>>>>>>>>>>>>>>>>>>>>>>>>> various fields, you'll check the existing COFFObjectFile, which should have >>>>>>>>>>>>>>>>>>>>>>>>>> similar fields. 
>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> The only thing you need to different is when >>>>>>>>>>>>>>>>>>>>>>>>>> writing the section table and section contents, to insert a new entry. Since >>>>>>>>>>>>>>>>>>>>>>>>>> you're injecting a section into the middle, you'll also probably need to >>>>>>>>>>>>>>>>>>>>>>>>>> push back the file pointer of all subsequent sections so that they don't >>>>>>>>>>>>>>>>>>>>>>>>>> overlap. (e.g. if the original sections are 1, 2, 3, 4, 5 and you insert >>>>>>>>>>>>>>>>>>>>>>>>>> between 2 and 3, then the original sections 3, 4, and 5 would need to have >>>>>>>>>>>>>>>>>>>>>>>>>> their FilePointerToRawData offset by the size of the new section). >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I have the PE/COFF spec open here and I'm happy >>>>>>>>>>>>>>>>>>>>>>>>> that I read a bit of it so I actually know what you are talking about... >>>>>>>>>>>>>>>>>>>>>>>>> yeah it doesn't seem too complicated. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> If you need to know what values to put for the >>>>>>>>>>>>>>>>>>>>>>>>>> other fields in a section header, run `dumpbin /headers foo.obj` on a >>>>>>>>>>>>>>>>>>>>>>>>>> clang-generated object file that has a .debug$H section already (e.g. run >>>>>>>>>>>>>>>>>>>>>>>>>> clang with -emit-codeview-ghash-section, and look at the properties of the >>>>>>>>>>>>>>>>>>>>>>>>>> .debug$H section and use the same values). >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Thanks I will do that and then also look at how >>>>>>>>>>>>>>>>>>>>>>>>> the CodeView part of the code does it if I can't understand some of it. 
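The file-pointer adjustment described above (inserting a section in the middle and pushing back every later section) might look roughly like this. It is a simplified model, assuming raw data is packed back to back and ignoring relocations, alignment, and the extra 40-byte section header that a real file would also need room for. `SectionInfo` and `insertSection` are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, stripped-down view of a section header.
struct SectionInfo {
  std::string Name;
  uint32_t PointerToRawData;
  uint32_t SizeOfRawData;
};

// Insert a new section at position Index (0-based), placing its data right
// after the previous section and shifting all later sections by its size.
void insertSection(std::vector<SectionInfo> &Sections, std::size_t Index,
                   const std::string &Name, uint32_t Size) {
  assert(Index >= 1 && Index <= Sections.size() &&
         "this sketch only inserts after an existing section");

  const SectionInfo &Prev = Sections[Index - 1];
  SectionInfo New{Name, Prev.PointerToRawData + Prev.SizeOfRawData, Size};

  // Push back every section that will come after the new one.
  for (std::size_t I = Index; I < Sections.size(); ++I)
    Sections[I].PointerToRawData += Size;

  Sections.insert(Sections.begin() + static_cast<std::ptrdiff_t>(Index), New);
}
```

With sections at offsets 100, 110 and 130, inserting an 8-byte section at index 2 leaves the first two untouched, places the new one at 130, and moves the last one to 138.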
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> The only invariant that needs to be maintained is
>>>>>>>>>>>>>>>>>>>>>>>>>> that Section[N]->FilePointerOfRawData ==
>>>>>>>>>>>>>>>>>>>>>>>>>> Section[N-1]->FilePointerOfRawData +
>>>>>>>>>>>>>>>>>>>>>>>>>> Section[N-1]->SizeOfRawData
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Well, that and all the sections need to be in the
>>>>>>>>>>>>>>>>>>>>>>>>> final file... But I'm hopeful.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Does anyone have timings for linking a big project like
>>>>>>>>>>>>>>>>>>>>>>>>> chrome with this, so that at least I know what kind of
>>>>>>>>>>>>>>>>>>>>>>>>> performance to expect?
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> My numbers are something like:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> 1 pdb per obj file: link.exe takes ~15 minutes and
>>>>>>>>>>>>>>>>>>>>>>>>> 16GB of ram, lld-link.exe takes 2:30 minutes and ~8GB of ram
>>>>>>>>>>>>>>>>>>>>>>>>> around 10 pdbs per folder: link.exe takes 1 minute
>>>>>>>>>>>>>>>>>>>>>>>>> and 2-3GB of ram, lld-link.exe takes 1:30 minutes and ~6GB of ram
>>>>>>>>>>>>>>>>>>>>>>>>> fastlink: link.exe takes 40 seconds, but then 20
>>>>>>>>>>>>>>>>>>>>>>>>> seconds of loading at the first break point in the debugger
>>>>>>>>>>>>>>>>>>>>>>>>> and we lost DIA support for listing symbols.
>>>>>>>>>>>>>>>>>>>>>>>>> incremental: link.exe takes 8 seconds, but it only
>>>>>>>>>>>>>>>>>>>>>>>>> happens when very minor changes happen.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> We have a non-negligible number of symbols used
>>>>>>>>>>>>>>>>>>>>>>>>> on some runtime systems.
>>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 11:52 AM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the tips, I now have something that reads the obj file, finds .debug$T sections
>>>>>>>>>>>>>>>>>>>>>>>>>>> and global hashes it (proof of concept kind of code). What I can't find is: how does clang
>>>>>>>>>>>>>>>>>>>>>>>>>>> itself write the coff files with global hashes, as that might help me understand how to
>>>>>>>>>>>>>>>>>>>>>>>>>>> create the .debug$H section, how to update the file section count and how to properly
>>>>>>>>>>>>>>>>>>>>>>>>>>> write this back.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> The code on yaml2coff is expecting to be working on the yaml COFFParser struct and I'm
>>>>>>>>>>>>>>>>>>>>>>>>>>> having quite a bit of a headache turning the COFFObjectFile into a COFFParser object or
>>>>>>>>>>>>>>>>>>>>>>>>>>> compatible... Tomorrow I might try the very non efficient path of coff2yaml and then
>>>>>>>>>>>>>>>>>>>>>>>>>>> yaml2coff with the hashes header... but it seems way too inefficient and convoluted.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 10:38 PM, Zachary Turner <zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 1:02 PM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 9:44 PM, Zachary Turner <zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 12:29 PM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> No I didn't, I used cl.exe from the visual studio toolchain. What I'm proposing is a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> tool for processing .obj files in COFF format, reading them and generating the GHASH part.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> To make our build faster we use hundreds of unity build files (.cpp's with a lot of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> other .cpp's in them aka munch files) but still have a lot of single .cpp's as well
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (in total something like 3.4k .obj files).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ps: sorry for sending to the wrong list, I was reading about llvm mailing lists and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> jumped when I saw what I thought was a lld exclusive list.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> A tool like this would be useful, yes. We've talked about it internally as well and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> agreed it would be useful, we just haven't prioritized it. If you're interested in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> submitting a patch along those lines though, I think it would be a good addition.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm not sure what the best place for it would be. llvm-readobj and llvm-objdump seem
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> like obvious choices, but they are intended to be read-only, so perhaps they wouldn't
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> be a good fit.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> llvm-pdbutil is kind of a hodgepodge of everything else related to PDBs and symbols,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> so I wouldn't be opposed to making a new subcommand there called "ghash" or something
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> that could process an object file and output a new object file with a .debug$H section.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> A third option would be to make a new tool for it.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I don't think it would be that hard to write. If you're interested in trying to make
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> a patch for this, I can offer some guidance on where to look in the code. Otherwise
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it's something that we'll probably get to, I'm just not sure when.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would love to write it and contribute it back, please do tell. I did find some of the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> code of ghash in lld, but I'm fuzzy on the llvm codeview part of it and have never seen
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> llvm-readobj/objdump or llvm-pdbutil, but I'm not afraid to look :)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Luckily all of the important code is hidden behind library calls, and it should already
>>>>>>>>>>>>>>>>>>>>>>>>>>>> just do the right thing, so I suspect you won't need to know much about CodeView to do this.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think Peter has the right idea about putting this in llvm-objcopy.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> You can look at one of the existing CopyBinary functions there, which currently only work
>>>>>>>>>>>>>>>>>>>>>>>>>>>> for ELF, but you can just make a new overload that accepts a COFFObjectFile.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would probably start by iterating over each of the sections (getNumberOfSections /
>>>>>>>>>>>>>>>>>>>>>>>>>>>> getSectionName) looking for .debug$T and .debug$H sections.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> If you find a .debug$H section then you can just skip that object file.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> If you find a .debug$T but not a .debug$H, then basically do the same thing that LLD does
>>>>>>>>>>>>>>>>>>>>>>>>>>>> in PDBLinker::mergeDebugT (create a CVTypeArray, and pass it to
>>>>>>>>>>>>>>>>>>>>>>>>>>>> GloballyHashedType::hashTypes). That will return an array of hash values. (the format of
>>>>>>>>>>>>>>>>>>>>>>>>>>>> .debug$H is the header, followed by the hash values). Then when you're writing the list
>>>>>>>>>>>>>>>>>>>>>>>>>>>> of sections, just add in the .debug$H section right after the .debug$T section.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Currently llvm-objcopy only writes ELF files, so it would need to be taught to write COFF
>>>>>>>>>>>>>>>>>>>>>>>>>>>> files. We have code to do this in the yaml2obj utility (specifically, in yaml2coff.cpp in
>>>>>>>>>>>>>>>>>>>>>>>>>>>> the function writeCOFF). There may be a way to move this code to somewhere else
>>>>>>>>>>>>>>>>>>>>>>>>>>>> (llvm/Object/COFF.h?) so that it can be re-used by both yaml2coff and llvm-objcopy, but
>>>>>>>>>>>>>>>>>>>>>>>>>>>> in the worst case scenario you could copy the code and re-write it to work with these
>>>>>>>>>>>>>>>>>>>>>>>>>>>> new structures.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Lastly, you'll probably want to put all of this behind an option in llvm-objcopy such as
>>>>>>>>>>>>>>>>>>>>>>>>>>>> -add-codeview-ghash-section
Zachary Turner via llvm-dev
2018-Jan-26 17:48 UTC
[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)
I did this:
// a.cpp
static int x = 0;
void b(int);
void a(int) {
  if (x)
    b(x);
}
int main(int argc, char **argv) {
  a(argc);
  return x;
}
clang-cl /Z7 /c a.cpp /Foa.noghash.obj
clang-cl /Z7 /c a.cpp -mllvm -emit-codeview-ghash-section /Foa.ghash.good.obj
llvm-objcopy a.noghash.obj a.ghash.bad.obj
obj2yaml a.ghash.good.obj > a.ghash.good.yaml
obj2yaml a.ghash.bad.obj > a.ghash.bad.yaml
Then open these 2 yaml files up in a diff viewer. It looks like the hashes
aren't getting emitted at all. For example, in the good yaml file I see
this:
  - Name: '.debug$H'
    Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_DISCARDABLE, IMAGE_SCN_MEM_READ ]
    Alignment: 4
    SectionData: C5C93301000001005549419E78044E3896D45CD7009428758BE4A1E2B3E022BA267DEE221F5C42B17BCA182AF84584814A8B5E7E3FB17B397A9E3DEA75CD5627
    GlobalHashes:
      Version: 0
      HashAlgorithm: 1
      HashValues:
        - 5549419E78044E38
        - 96D45CD700942875
        - 8BE4A1E2B3E022BA
        - 267DEE221F5C42B1
        - 7BCA182AF8458481
        - 4A8B5E7E3FB17B39
        - 7A9E3DEA75CD5627
  - Name: .pdata
And in the bad yaml file I see this:
  - Name: '.debug$H'
    Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_DISCARDABLE, IMAGE_SCN_MEM_READ ]
    Alignment: 4
    SectionData: C5C9330100000000
    GlobalHashes:
      Version: 0
      HashAlgorithm: 0
  - Name: .pdata
Don't focus too much on trying to figure out weird linker errors. Just get
the output of obj2yaml to be identical when run under a diff utility, then
everything should work fine.
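
For reference, the two SectionData dumps above can be decoded by hand. The following is only a minimal standalone sketch, assuming the layout those dumps suggest: a 4-byte magic, a 2-byte version, a 2-byte hash algorithm, then 8-byte hash values, all little-endian. The names DebugH, readLE and parseDebugH are made up for illustration and are not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumed .debug$H layout, inferred from the obj2yaml output in this mail.
struct DebugH {
  uint32_t Magic = 0;
  uint16_t Version = 0;
  uint16_t HashAlgorithm = 0;
  std::vector<std::string> Hashes; // each hash kept as 16 hex digits
};

// Reads a little-endian integer of NumBytes bytes (at most 4) from a hex string.
static uint32_t readLE(const std::string &Hex, size_t ByteOffset,
                       size_t NumBytes) {
  uint32_t Value = 0;
  for (size_t I = 0; I < NumBytes; ++I) {
    uint32_t Byte = static_cast<uint32_t>(
        std::stoul(Hex.substr((ByteOffset + I) * 2, 2), nullptr, 16));
    Value |= Byte << (8 * I);
  }
  return Value;
}

// Parses the SectionData hex string as printed by obj2yaml.
DebugH parseDebugH(const std::string &Hex) {
  DebugH Result;
  if (Hex.size() < 16) // the header is 8 bytes, i.e. 16 hex digits
    return Result;
  Result.Magic = readLE(Hex, 0, 4);
  Result.Version = static_cast<uint16_t>(readLE(Hex, 4, 2));
  Result.HashAlgorithm = static_cast<uint16_t>(readLE(Hex, 6, 2));
  // Everything after the header is treated as a run of 8-byte hashes.
  for (size_t Pos = 16; Pos + 16 <= Hex.size(); Pos += 16)
    Result.Hashes.push_back(Hex.substr(Pos, 16));
  return Result;
}
```

Fed the bad file's data (C5C9330100000000), this yields hash algorithm 0 and no hash values, which is exactly the difference the yaml diff shows.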
On Fri, Jan 26, 2018 at 7:27 AM Leonardo Santagada <santagada at gmail.com> wrote:
> I'm so close I can almost smell it :)
>
> I know how bad the code looks, I don't intend to submit this, but if you
> want to try it out it's at:
> https://gist.github.com/santagada/544136b1ee143bf31653b1158ac6829e
>
> I'm seeing: lld-link.exe: error: duplicate symbol: "<redacted_unmangled>"
> (<redacted>) in <internal> and in <redacted_filename>.obj, looking at the
> .yaml dump the symbols are all similar to this:
>
>   - Name: <redacted>
>     Value: 0
>     SectionNumber: 0
>     SimpleType: IMAGE_SYM_TYPE_NULL
>     ComplexType: IMAGE_SYM_DTYPE_FUNCTION
>     StorageClass: IMAGE_SYM_CLASS_WEAK_EXTERNAL
>     WeakExternal:
>       TagIndex: 134
>       Characteristics: IMAGE_WEAK_EXTERN_SEARCH_LIBRARY
>
> On Thu, Jan 25, 2018 at 8:01 PM, Zachary Turner <zturner at google.com> wrote:
>
>> I haven't really dabbled in this part of the COFF format personally, so
>> hopefully I'm not leading you astray :)
>>
>> But I checked the code for coff2yaml, and I see this:
>>
>> } else if (Symbol.isSectionDefinition()) {
>>   // This symbol represents a section definition.
>>   assert(Symbol.getNumberOfAuxSymbols() == 1 &&
>>          "Expected a single aux symbol to describe this section!");
>>   const object::coff_aux_section_definition *ObjSD =
>>       reinterpret_cast<const object::coff_aux_section_definition *>(
>>           AuxData.data());
>>
>> So it looks like you need exactly 1 aux symbol for each section symbol.
>>
>> I then scrolled up in this function to figure out where AuxData comes
>> from, and it comes from COFFObjectFile::getSymbolAuxData. I think that
>> function holds the clue to what you need to do. It looks like you need to
>> set coff::symbol::NumberOfAuxSymbols to 1, and then there is a comment in
>> getSymbolAuxData which says:
>>
>> // AUX data comes immediately after the symbol in COFF
>> Aux = reinterpret_cast<const uint8_t *>(Symbol.getRawPtr()) +
>>       SymbolSize;
>>
>> So I think you just need to write the bytes immediately after the
>> coff::symbol. The thing you need to write looks like a
>> coff::coff_aux_section_definition structure.
>>
>> For the CheckSum, look at WinCOFFObjectWriter::writeSection. It looks
>> like it's a CRC32 of the actual section contents, which you can generate
>> with a couple of lines of code:
>>
>> JamCRC JC(/*Init=*/0);
>> JC.update(DebugHContents);
>> AuxSymbol.CheckSum = JC.getCRC();
>>
>> Hope this helps
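
A standalone sketch of the aux record and checksum described above, for readers without the LLVM headers at hand. It assumes the 18-byte auxiliary section-definition layout from the PE/COFF specification (Length, NumberOfRelocations, NumberOfLinenumbers, CheckSum, then Number/Selection/unused bytes left at zero), and it assumes JamCRC behaves as a reflected CRC-32 with polynomial 0xEDB88320, a caller-supplied initial value and no final XOR. jamCrc and makeSectionAux are hypothetical helper names, not LLVM functions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bitwise reflected CRC-32 without the final XOR (assumed JamCRC semantics).
uint32_t jamCrc(const std::vector<uint8_t> &Data, uint32_t Init) {
  uint32_t Crc = Init;
  for (uint8_t Byte : Data) {
    Crc ^= Byte;
    for (int Bit = 0; Bit < 8; ++Bit)
      Crc = (Crc >> 1) ^ (0xEDB88320u & (0u - (Crc & 1u)));
  }
  return Crc;
}

// Builds the 18 bytes that would follow a section symbol whose
// NumberOfAuxSymbols is 1. Fields are little-endian.
std::array<uint8_t, 18> makeSectionAux(const std::vector<uint8_t> &Contents,
                                       uint16_t NumRelocations) {
  std::array<uint8_t, 18> Rec{}; // zero-initialized
  auto Put = [&Rec](size_t Offset, uint32_t Value, size_t NumBytes) {
    for (size_t I = 0; I < NumBytes; ++I)
      Rec[Offset + I] = static_cast<uint8_t>(Value >> (8 * I));
  };
  Put(0, static_cast<uint32_t>(Contents.size()), 4); // Length
  Put(4, NumRelocations, 2);                         // NumberOfRelocations
  Put(6, 0, 2);                                      // NumberOfLinenumbers
  Put(8, jamCrc(Contents, /*Init=*/0), 4);           // CheckSum
  // Bytes 12..17 (Number, Selection, unused) stay zero for a non-COMDAT section.
  return Rec;
}
```

The record would be written right after the 18-byte symbol for the section; whether the linker actually inspects the checksum of a .debug$H section is not something this thread establishes.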
>>
>> On Thu, Jan 25, 2018 at 10:46 AM Leonardo Santagada <santagada at gmail.com> wrote:
>>
>>>
>>> I see that there is an auxsymbol per section symbol, and also on the
>>> yaml representation there is a checksum, selection and unused, all of which I
>>> have no idea how to fill in. Also this aux symbol might have some important
>>> information for me to patch on the other symbols. Can you find the part in
>>> llvm that writes those? Because at least for auxsymbol the yaml part of
>>> the code treats it as a binary blob, so there is no info on what they should
>>> be.
>>>
>>> On Thu, Jan 25, 2018 at 7:15 PM, Zachary Turner <zturner at google.com> wrote:
>>>
>>>> If you run obj2yaml against a very simple object file, you'll see
>>>> something like this at the end:
>>>> ```
>>>> symbols:
>>>>   - Name: '@comp.id'
>>>>     Value: 17130443
>>>>     SectionNumber: -1
>>>>     SimpleType: IMAGE_SYM_TYPE_NULL
>>>>     ComplexType: IMAGE_SYM_DTYPE_NULL
>>>>     StorageClass: IMAGE_SYM_CLASS_STATIC
>>>>   - Name: '@feat.00'
>>>>     Value: 2147484048
>>>>     SectionNumber: -1
>>>>     SimpleType: IMAGE_SYM_TYPE_NULL
>>>>     ComplexType: IMAGE_SYM_DTYPE_NULL
>>>>     StorageClass: IMAGE_SYM_CLASS_STATIC
>>>>   - Name: .drectve
>>>>     Value: 0
>>>>     SectionNumber: 1
>>>>     SimpleType: IMAGE_SYM_TYPE_NULL
>>>>     ComplexType: IMAGE_SYM_DTYPE_NULL
>>>>     StorageClass: IMAGE_SYM_CLASS_STATIC
>>>>     SectionDefinition:
>>>>       Length: 47
>>>>       NumberOfRelocations: 0
>>>>       NumberOfLinenumbers: 0
>>>>       CheckSum: 0
>>>>       Number: 0
>>>> ...
>>>> ```
>>>>
>>>> There's a structure called coff::symbol which basically represents each
>>>> one of these records. It looks like this:
>>>>
>>>> ```
>>>> struct symbol {
>>>>   char Name[NameSize];
>>>>   uint32_t Value;
>>>>   int32_t SectionNumber;
>>>>   uint16_t Type;
>>>>   uint8_t StorageClass;
>>>>   uint8_t NumberOfAuxSymbols;
>>>> };
>>>> ```
>>>>
>>>> So you'll need to create one for the .debug$H section and stick it into
>>>> the list. This particular list doesn't have to be in any special order, so
>>>> you can just put it at the end (although it's probably not that much harder
>>>> to insert into the middle, and it will make for a good test that you've
>>>> done it right. The output can be diffed against clang-cl object file and
>>>> be identical this way). So write all the normal symbols as you probably
>>>> already are, then write one for the .debug$H section. Initialize the
>>>> fields to the same thing that you see when you run obj2yaml against an
>>>> object file generated by clang-cl for the .debug$H section.
>>>>
>>>> This structure doesn't contain any kind of file pointers or offsets, so
>>>> all you really need to fix up are the "SectionNumber" fields. Basically as
>>>> you are writing the existing symbols, you would do something like:
>>>>
>>>> for (auto &Sym : ObjFile.symbols()) {
>>>>   if (Sym.SectionNumber >= DebugHInsertionIndex)
>>>>     ++Sym.SectionNumber;
>>>>   writeSymbol(Sym);
>>>> }
>>>> writeSymbol(DebugHSym);
>>>>
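
The renumbering loop above can be sketched as standalone code. This is an illustration under assumptions, not the objcopy patch: Symbol is a simplified stand-in for a symbol-table slot, renumberSections is a hypothetical helper, and section numbers are taken to be 1-based so that the special values 0, -1 and -2 are never shifted. Aux records occupy table slots of their own and hold raw bytes rather than a section number, so the sketch skips over them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for one slot of the COFF symbol table.
struct Symbol {
  int32_t SectionNumber;      // 1-based; 0, -1, -2 are special values
  uint8_t NumberOfAuxSymbols; // number of aux slots that follow this symbol
};

// After a new section is inserted at 1-based index InsertedAt, every symbol
// that referred to that index or a later one must point one section further.
void renumberSections(std::vector<Symbol> &Table, int32_t InsertedAt) {
  for (size_t I = 0; I < Table.size(); ++I) {
    Symbol &Sym = Table[I];
    if (Sym.SectionNumber >= InsertedAt)
      ++Sym.SectionNumber;
    // Skip the aux slots: their bytes are not a symbol record.
    I += Sym.NumberOfAuxSymbols;
  }
}
```

The symbol for the new .debug$H section itself would then be appended with its own section number and one aux slot, as described in the mail.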
>>>>
>>>> On Thu, Jan 25, 2018 at 9:57 AM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>
>>>>> Any idea on how to create this new symbol there? I saw that there is a
>>>>> symbol pointing to each section, but didn't understand the format, and
>>>>> yaml2obj doesn't check it or do anything with the list.
>>>>>
>>>>> On Thu, Jan 25, 2018 at 6:56 PM, Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>
>>>>>> YES, THANK YOU... I WAS THINKING THIS BUT COMPLETELY FORGOT.
>>>>>>
>>>>>> sorry for the caps... long day of working on this, and using vs 2017,
>>>>>> which adds a new section type .chks64 for which I couldn't find documentation
>>>>>> anywhere, was difficult. I highly recommend everyone to just not use vs
>>>>>> 2017 until 15.8 or something, our internal bug list is gigantic.
>>>>>>
>>>>>> On Thu, Jan 25, 2018 at 6:52 PM, Zachary Turner <zturner at google.com> wrote:
>>>>>>
>>>>>>> Actually I already have a theory that even though you are adding the
>>>>>>> section to the section table, you might not be adding a *symbol* for the
>>>>>>> section to the symbol table. So the existing symbols (which reference
>>>>>>> sections by index) will all be wrong because you've inserted a new
>>>>>>> section. Still though, obj2yaml would expose that.
>>>>>>>
>>>>>>> On Thu, Jan 25, 2018 at 9:50 AM Zachary Turner <zturner at google.com> wrote:
>>>>>>>
>>>>>>>> Yea as long as you compare clang-cl object file with automatically
>>>>>>>> generated .debug$H section against clang-cl object file without .debug$H
>>>>>>>> but added after the fact with llvm-objcopy, that should expose the problem
>>>>>>>> I think when you run obj2yaml on them.
>>>>>>>>
>>>>>>>> On Thu, Jan 25, 2018 at 9:49 AM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I did reorder my sections, so that .debug$H is in the correct
>>>>>>>>> place, but now I get some errors on duplicate symbols. I created a folder
>>>>>>>>> with examples:
>>>>>>>>>
>>>>>>>>> https://www.dropbox.com/sh/nmvzi44pi0boe76/AAA0f47O5PCJ9JiUc6wVuwBra?dl=0
>>>>>>>>>
>>>>>>>>> t.obj is generated by vs 2015 and it links fine with lld-link.exe,
>>>>>>>>> but tout.obj gives these errors:
>>>>>>>>>
>>>>>>>>> lld-link.exe /DEBUG:GHASH tout.obj
>>>>>>>>> LLD-LINK.EXE: error: duplicate symbol:
>>>>>>>>> __local_stdio_printf_options in tout.obj and in
>>>>>>>>> LIBCMT.lib(default_local_stdio_options.obj)
>>>>>>>>> LLD-LINK.EXE: error: duplicate symbol:
>>>>>>>>> __local_stdio_printf_options in tout.obj and in
>>>>>>>>> libvcruntime.lib(undname.obj)
>>>>>>>>>
>>>>>>>>> I'm using PEView from http://wjradburn.com/software/ to look at
>>>>>>>>> the files and can't see anything wrong, except some valid differences in
>>>>>>>>> the offsets being used for the data (so pointer to data is different
>>>>>>>>> between them).
>>>>>>>>>
>>>>>>>>> I will look into yaml2obj now to see if I see anything else weird
>>>>>>>>> going on.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Jan 25, 2018 at 6:41 PM, Zachary Turner <zturner at google.com> wrote:
>>>>>>>>>
>>>>>>>>>> I'm pretty confident that cl is not putting anything strange in
>>>>>>>>>> the .debug$T sections. We've done a lot of testing and never seen anything
>>>>>>>>>> except CodeView type records in a .debug$T. My hunch is that your objcopy
>>>>>>>>>> patch is probably not doing the right thing in one or more of the section
>>>>>>>>>> headers, and this is confusing the linker.
>>>>>>>>>>
>>>>>>>>>> One idea might be to build a simple object file with clang-cl but
>>>>>>>>>> without the magic -mllvm -emit-codeview-ghash-section, then run your
>>>>>>>>>> llvm-objcopy on it. Then build the same object file passing -mllvm
>>>>>>>>>> -emit-codeview-ghash-section. Then run obj2yaml on both and diff the
>>>>>>>>>> results. They should be byte-for-byte identical. That should give you a
>>>>>>>>>> clue about if objcopy is doing something wrong.
>>>>>>>>>>
>>>>>>>>>> On Thu, Jan 25, 2018 at 2:21 AM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Don't worry, I definitely want to perfect this to generate legal
>>>>>>>>>>> obj files, this is just to speed up testing.
>>>>>>>>>>>
>>>>>>>>>>> Now after patching all the obj files I get these errors when
>>>>>>>>>>> linking a small part of our code base (msvc 2017 15.5.3, lld and
>>>>>>>>>>> llvm-objcopy 7.0.0):
>>>>>>>>>>> lld-link.exe : error : relocation against symbol in discarded
>>>>>>>>>>> section: $LN8
>>>>>>>>>>> lld-link.exe : error : relocation against symbol in discarded
>>>>>>>>>>> section: $LN43
>>>>>>>>>>> lld-link.exe : error : relocation against symbol in discarded
>>>>>>>>>>> section: $LN37
>>>>>>>>>>>
>>>>>>>>>>> I'm starting to guess that cl.exe might be putting some random
>>>>>>>>>>> comdat or other discardable symbols in the .debug$T and clang doesn't? I
>>>>>>>>>>> will try to debug this and see what more I can uncover.
>>>>>>>>>>>
>>>>>>>>>>> Linking works perfectly without my llvm-objcopy pass to add
>>>>>>>>>>> .debug$H?
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jan 25, 2018 at 1:53 AM, Zachary Turner <zturner at google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> It might not influence LLD, but at the same time we don't want
>>>>>>>>>>>> to upstream something that is producing technically illegal COFF files.
>>>>>>>>>>>> Also good to hear about the planned changes to your header files. Looking
>>>>>>>>>>>> forward to hearing about your experiences with clang-cl.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Jan 24, 2018 at 10:41 AM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I finally got my first .obj file patched with .debug$H to look
>>>>>>>>>>>>> somewhat right. I added the new section at the end of the file so I don't
>>>>>>>>>>>>> have to recalculate all sections (although now I probably could position it
>>>>>>>>>>>>> in the middle, knowing that each section is: SizeOfRawData +
>>>>>>>>>>>>> (last.Header.NumberOfRelocations * (4+4+2)) and the $H needs to come right
>>>>>>>>>>>>> after $T in the file). That, although illegal based on the coff specs,
>>>>>>>>>>>>> doesn't seem like it's going to influence lld.
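
The size formula in the paragraph above can be written out as a small sketch. It assumes each section's relocations directly follow its raw data and that no alignment padding sits between sections, which a real object file is not obliged to honor. SectionInfo, sectionFootprint and layoutSections are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical subset of a COFF section header.
struct SectionInfo {
  uint32_t SizeOfRawData;
  uint16_t NumberOfRelocations;
};

// One relocation entry: VirtualAddress (4) + SymbolTableIndex (4) + Type (2).
constexpr uint32_t RelocationEntrySize = 4 + 4 + 2;

// Bytes a section occupies in the file: raw data plus its relocations.
uint32_t sectionFootprint(const SectionInfo &S) {
  return S.SizeOfRawData + S.NumberOfRelocations * RelocationEntrySize;
}

// File offset of each section's raw data when sections are laid out back to
// back starting at FirstDataOffset.
std::vector<uint32_t> layoutSections(const std::vector<SectionInfo> &Sections,
                                     uint32_t FirstDataOffset) {
  std::vector<uint32_t> Offsets;
  uint32_t Next = FirstDataOffset;
  for (const SectionInfo &S : Sections) {
    Offsets.push_back(Next);
    Next += sectionFootprint(S);
  }
  return Offsets;
}
```

Inserting .debug$H right after .debug$T then amounts to adding one entry to the list and recomputing the offsets of everything behind it.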
>>>>>>>>>>>>>
>>>>>>>>>>>>> Also we talked and we are probably going to do something
>>>>>>>>>>>>> similar to a bunch of windows defines and a check for our own define (to
>>>>>>>>>>>>> guarantee that no one imported windows.h before win32.h) and drop the
>>>>>>>>>>>>> namespace and the conflicting names.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Jan 23, 2018 at 12:46 AM, Zachary Turner <zturner at google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> That's very possible that a 3rd party indirect header include
>>>>>>>>>>>>>> is involved. One idea might be like I suggested where you #define
>>>>>>>>>>>>>> _WINDOWS_ in win32.h and guarantee that it's always included first. Then
>>>>>>>>>>>>>> those other headers won't be able to #include <windows.h>. but it will
>>>>>>>>>>>>>> probably greatly expand the amount of stuff you have to add to win32.h, as
>>>>>>>>>>>>>> you will probably find some callers of functions that aren't yet in your
>>>>>>>>>>>>>> win32.h that you'd have to add.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 3:28 PM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Ok some information was lost on getting this example to you,
>>>>>>>>>>>>>>> I'm sorry for not being clear.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We have a huge code base, let's say 90% of it doesn't
>>>>>>>>>>>>>>> include either header, 9% include win32.h and 1% includes both, I will try
>>>>>>>>>>>>>>> to discover why, but my guess is they include both a third party that
>>>>>>>>>>>>>>> includes windows.h and some of our libs that use win32.h.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I will try to fully understand this tomorrow.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I guess clang will not implement this ever so finishing the
>>>>>>>>>>>>>>> object copier is the best solution until all code is ported to clang.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 23 Jan 2018 00:02, "Zachary Turner" <zturner at google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> You said win32.h doesn't include windows.h, but main.cpp
>>>>>>>>>>>>>>>> does. So what's the disadvantage of just including it in win32.h anyway,
>>>>>>>>>>>>>>>> since it's already going to be in every translation unit? (Unless you
>>>>>>>>>>>>>>>> didn't mean to #include it in main.cpp)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I guess all I can do is warn you how bad of an idea this
>>>>>>>>>>>>>>>> is. For starters, I already found a bug in your code ;-)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> // stdint.h
>>>>>>>>>>>>>>>> typedef int int32_t;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> // winnt.h
>>>>>>>>>>>>>>>> typedef long LONG;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> // windef.h
>>>>>>>>>>>>>>>> typedef struct tagPOINT
>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>     LONG x; // long x
>>>>>>>>>>>>>>>>     LONG y; // long y
>>>>>>>>>>>>>>>> } POINT, *PPOINT, NEAR *NPPOINT, FAR *LPPOINT;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> // win32.h
>>>>>>>>>>>>>>>> typedef int32_t LONG;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> struct POINT
>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>     LONG x; // int x
>>>>>>>>>>>>>>>>     LONG y; // int y
>>>>>>>>>>>>>>>> };
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> So POINT is defined two different ways. In your minimal
>>>>>>>>>>>>>>>> interface, it's declared as 2 int32's, which are int. In the actual
>>>>>>>>>>>>>>>> Windows header files, it's declared as 2 longs.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This might seem like an unimportant bug since int and long
>>>>>>>>>>>>>>>> are the same size, but int and long also mangle differently and affect
>>>>>>>>>>>>>>>> overload resolution, so you could have weird linker errors or call the
>>>>>>>>>>>>>>>> wrong function overload.
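
The int versus long point can be checked with a few lines of standard C++. This is only an illustrative sketch: WinLong and CopyLong are made-up aliases standing in for the two LONG definitions quoted above, and describe is a hypothetical overload set.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

typedef long WinLong; // how winnt.h defines LONG, per the quoted header
typedef int CopyLong; // how the trimmed win32.h ends up defining it

// Same size on Windows, but still two distinct types to the compiler.
static_assert(!std::is_same<WinLong, CopyLong>::value,
              "int and long are different types");

// Two separate overloads, so they also get two different mangled names.
std::string describe(int) { return "int"; }
std::string describe(long) { return "long"; }
```

Calling describe with a WinLong picks the long overload and calling it with a CopyLong picks the int overload, so code compiled against the two headers can silently resolve to different functions.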
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Plus, it illustrates the fact that this struct *actually
>>>>>>>>>>>>>>>> is* a different type from the one in the windows header.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> You said at the end that you never intentionally import
>>>>>>>>>>>>>>>> win32.h and windows.h from the same translation unit. But then in this
>>>>>>>>>>>>>>>> example you did. I wonder if you could enforce that by doing this:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> // win32.h
>>>>>>>>>>>>>>>> #pragma once
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> // Error if windows.h was included before us.
>>>>>>>>>>>>>>>> #if defined(_WINDOWS_)
>>>>>>>>>>>>>>>> #error "You're including win32.h after having already included windows.h. Don't do this!"
>>>>>>>>>>>>>>>> #endif
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> // And also make sure windows.h can't get included after us
>>>>>>>>>>>>>>>> #define _WINDOWS_
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For the record, I tried the test case you linked when
>>>>>>>>>>>>>>>> windows.h is not included in main.cpp and it works (but still has the bug
>>>>>>>>>>>>>>>> about int and long).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 2:23 PM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It is super gross, but we copy parts of windows.h because
>>>>>>>>>>>>>>>>> having all of it is both gigantic and very very messy. So our win32.h has a
>>>>>>>>>>>>>>>>> couple thousands of lines and not 30k+ for windows.h and we try to have
>>>>>>>>>>>>>>>>> zero macros. Win32.h doesn't include windows.h so using ::BOOL wouldn't
>>>>>>>>>>>>>>>>> work. We don't want to create a namespace, we just want a cleaner interface
>>>>>>>>>>>>>>>>> to windows api. The namespace with c linkage is the way to trick cl into
>>>>>>>>>>>>>>>>> allowing us to in some files have both windows.h and Win32.h. I really
>>>>>>>>>>>>>>>>> don't see any way for us to have this Win32.h without this cl support, so
>>>>>>>>>>>>>>>>> maybe we should either put windows.h in a compiled header somewhere and not
>>>>>>>>>>>>>>>>> care that it is infecting everything or just have one place we can call to
>>>>>>>>>>>>>>>>> clean up after including windows.h (a massive set of undefs).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> So using can't work, because we never intentionally import
>>>>>>>>>>>>>>>>> windows.h and win32.h on the same translation unit.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 7:08 PM, Zachary Turner <zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This is pretty gross, honestly :)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Can't you just use using declarations?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> namespace Win32 {
>>>>>>>>>>>>>>>>>> extern "C" {
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> using ::BOOL;
>>>>>>>>>>>>>>>>>> using ::LONG;
>>>>>>>>>>>>>>>>>> using ::POINT;
>>>>>>>>>>>>>>>>>> using ::LPPOINT;
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> using ::GetCursorPos;
>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This works with clang-cl.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 5:39 AM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Here is a minimal example, we do this so we don't
>>>>>>>>>>>>>>>>>>> have to import the whole windows api everywhere.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> https://gist.github.com/santagada/7977e929d31c629c4bf18ebb987f6be3
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sun, Jan 21, 2018 at 2:31 AM, Zachary Turner <zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Clang-cl maintains compatibility with msvc even in
>>>>>>>>>>>>>>>>>>>> cases where it’s non standards compliant (eg 2 phase name lookup), but we
>>>>>>>>>>>>>>>>>>>> try to keep these cases few and far between.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> To help me understand your case, do you mean you copy
>>>>>>>>>>>>>>>>>>>> windows.h and modify it? How does this lead to the same struct being
>>>>>>>>>>>>>>>>>>>> defined twice? If i were to write this:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> struct Foo {};
>>>>>>>>>>>>>>>>>>>> struct Foo {};
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Is this a small repro of the issue you’re talking about?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 3:44 PM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I can totally see something like incremental linking
>>>>>>>>>>>>>>>>>>>>> with a simple padding between obj and a mapping file (which can also help
>>>>>>>>>>>>>>>>>>>>> with edit and continue, something we also would love to have).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> We have another developer doing the port to support
>>>>>>>>>>>>>>>>>>>>> clang-cl, but although most of our code also goes through a version of
>>>>>>>>>>>>>>>>>>>>> clang, migrating the rest to clang-cl has been a fight. From what I heard
>>>>>>>>>>>>>>>>>>>>> the main problem is that we have a copy of parts of windows.h (so not to
>>>>>>>>>>>>>>>>>>>>> bring the awful parts of it like lower case macros) and that totally works
>>>>>>>>>>>>>>>>>>>>> on cl, but clang (at least 6.0) complains about two struct/vars with the
>>>>>>>>>>>>>>>>>>>>> same name, even though they are exactly the same. Making clang-cl as broken
>>>>>>>>>>>>>>>>>>>>> as cl.exe is not an option I suppose? I would love to turn on a flag
>>>>>>>>>>>>>>>>>>>>> --accept-that-cl-made-bad-decisions-and-live-with-it and have this at least
>>>>>>>>>>>>>>>>>>>>> until this is completely fixed in our code base.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> the biggest win with moving to cl would be a better
>>>>>>>>>>>>>>>>>>>>> more standards compliant compiler, no 1 minute compiles on heavily
>>>>>>>>>>>>>>>>>>>>> templated files and maybe the holy grail of ThinLTO.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 10:56 PM, Zachary Turner <zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> 10-15s will be hard without true incremental linking.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> At some point that's going to be the only way to get
>>>>>>>>>>>>>>>>>>>>>> any faster, but incremental linking is hard (putting it lightly), and since
>>>>>>>>>>>>>>>>>>>>>> our full links are already really fast we think we can get reasonably close
>>>>>>>>>>>>>>>>>>>>>> to link.exe incremental speeds with full links. But it's never enough and
>>>>>>>>>>>>>>>>>>>>>> I will always want it to be faster, so you may see incremental linking in
>>>>>>>>>>>>>>>>>>>>>> the future after we hit a performance wall with full link speed :)
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> In any case, I'm definitely interested in seeing what
>>>>>>>>>>>>>>>>>>>>>> kind of numbers you get with /debug:ghash after you get this llvm-objcopy
>>>>>>>>>>>>>>>>>>>>>> feature implemented. So keep me updated :)
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> As an aside, have you tried building with clang
>>>>>>>>>>>>>>>>>>>>>> instead of cl? If you build with clang you wouldn't even have to do this
>>>>>>>>>>>>>>>>>>>>>> llvm-objcopy work, because it would "just work". If you've tried but ran
>>>>>>>>>>>>>>>>>>>>>> into issues I'm interested in hearing about those too. On the other hand,
>>>>>>>>>>>>>>>>>>>>>> it's also reasonable to only switch one thing at a time.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 1:34 PM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> if we get to < 30s I think most users would prefer
>>>>>>>>>>>>>>>>>>>>>>> it to link.exe, just hoping there are still some more optimizations to get
>>>>>>>>>>>>>>>>>>>>>>> closer to ELF linking times (around 10-15s here).
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 9:50 PM, Zachary Turner <zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
Generally speaking a good rule of thumb is that
>>>>>>>>>>>>>>>>>>>>>>>>
/debug:ghash will be close to or faster than /debug:fastlink, but with none
>>>>>>>>>>>>>>>>>>>>>>>>
of the penalties like slow debug time
>>>>>>>>>>>>>>>>>>>>>>>>
On Sat, Jan 20, 2018 at 12:44 PM Zachary Turner <
>>>>>>>>>>>>>>>>>>>>>>>>
zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
Chrome is actually one of my exact benchmark
>>>>>>>>>>>>>>>>>>>>>>>>>
cases. When building blink_core.dll and browser_tests.exe, I get anywhere
>>>>>>>>>>>>>>>>>>>>>>>>>
from a 20-40% reduction in link time. We have some other optimizations in
>>>>>>>>>>>>>>>>>>>>>>>>>
the pipeline but not upstream yet.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
My best time so far (including other optimizations
>>>>>>>>>>>>>>>>>>>>>>>>>
not yet upstream) is 28s on blink_core.dll, compared to 110s with /debug
>>>>>>>>>>>>>>>>>>>>>>>>>
On Sat, Jan 20, 2018 at 12:28 PM Leonardo
>>>>>>>>>>>>>>>>>>>>>>>>>
Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
On Sat, Jan 20, 2018 at 9:05 PM, Zachary Turner <
>>>>>>>>>>>>>>>>>>>>>>>>>>
zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
You probably don't want to go down the same
>>>>>>>>>>>>>>>>>>>>>>>>>>>
route that clang goes through to write the object file. If you think
>>>>>>>>>>>>>>>>>>>>>>>>>>>
yaml2coff is convoluted, the way clang does it will just give you a
>>>>>>>>>>>>>>>>>>>>>>>>>>>
headache. There are multiple abstractions involved to account for
>>>>>>>>>>>>>>>>>>>>>>>>>>>
different object file formats (ELF, COFF, MachO) and output formats
>>>>>>>>>>>>>>>>>>>>>>>>>>>
(Assembly, binary file). At least with yaml2coff
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
I think your phrase got cut there, but yeah I
>>>>>>>>>>>>>>>>>>>>>>>>>>
just found AsmPrinter.cpp and it is convoluted.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
It's true that yaml2coff is using the COFFParser
>>>>>>>>>>>>>>>>>>>>>>>>>>>
structure, but if you look at the writeCOFF
>>>>>>>>>>>>>>>>>>>>>>>>>>>
function in yaml2coff it's pretty bare-metal. The logic you need will be
>>>>>>>>>>>>>>>>>>>>>>>>>>>
almost identical, except that instead of checking the COFFParser for the
>>>>>>>>>>>>>>>>>>>>>>>>>>>
various fields, you'll check the existing COFFObjectFile, which should have
>>>>>>>>>>>>>>>>>>>>>>>>>>>
similar fields.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
The only thing you need to do differently is when
>>>>>>>>>>>>>>>>>>>>>>>>>>>
writing the section table and section contents, to insert a new entry. Since
>>>>>>>>>>>>>>>>>>>>>>>>>>>
you're injecting a section into the middle, you'll also probably need to
>>>>>>>>>>>>>>>>>>>>>>>>>>>
push back the file pointer of all subsequent sections so that they don't
>>>>>>>>>>>>>>>>>>>>>>>>>>>
overlap. (e.g. if the original sections are 1, 2, 3, 4, 5 and you insert
>>>>>>>>>>>>>>>>>>>>>>>>>>>
between 2 and 3, then the original sections 3, 4, and 5 would need to have
>>>>>>>>>>>>>>>>>>>>>>>>>>>
their FilePointerToRawData offset by the size of the new section).
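The file-pointer adjustment described above can be sketched as follows. This is a minimal illustration, not llvm-objcopy code: `SectionHeader` and `insertSection` are hypothetical names, and the sketch assumes section contents are laid out back to back, ignoring relocations, alignment, and the growth of the section table itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a COFF section header.
struct SectionHeader {
  std::string Name;
  uint32_t PointerToRawData; // file offset of the section's contents
  uint32_t SizeOfRawData;    // size of the section's contents in bytes
};

// Insert a new section before position Index (0-based) and push back the
// file pointer of every section that follows it by the new section's size.
void insertSection(std::vector<SectionHeader> &Sections, std::size_t Index,
                   const std::string &Name, uint32_t Size) {
  if (Index > Sections.size())
    Index = Sections.size();

  // The new contents start where the displaced section used to start, or
  // directly after the last section when appending.
  uint32_t Pointer = 0;
  if (Index < Sections.size())
    Pointer = Sections[Index].PointerToRawData;
  else if (!Sections.empty())
    Pointer = Sections.back().PointerToRawData + Sections.back().SizeOfRawData;

  for (std::size_t I = Index; I < Sections.size(); ++I)
    Sections[I].PointerToRawData += Size;

  Sections.insert(Sections.begin() + static_cast<std::ptrdiff_t>(Index),
                  SectionHeader{Name, Pointer, Size});
}
```

With sections 1 to 5 and an insertion between 2 and 3, only the original sections 3, 4 and 5 move; sections 1 and 2 keep their offsets.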
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
I have the PE/COFF spec open here and I'm happy
>>>>>>>>>>>>>>>>>>>>>>>>>>
that I read a bit of it so I actually know what you are talking about...
>>>>>>>>>>>>>>>>>>>>>>>>>>
yeah it doesn't seem too complicated.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
If you need to know what values to put for the
>>>>>>>>>>>>>>>>>>>>>>>>>>>
other fields in a section header, run `dumpbin /headers foo.obj` on a
>>>>>>>>>>>>>>>>>>>>>>>>>>>
clang-generated object file that has a .debug$H section already (e.g. run
>>>>>>>>>>>>>>>>>>>>>>>>>>>
clang with -emit-codeview-ghash-section, and look at the properties of the
>>>>>>>>>>>>>>>>>>>>>>>>>>>
.debug$H section and use the same values).
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
Thanks I will do that and then also look at how
>>>>>>>>>>>>>>>>>>>>>>>>>>
the CodeView part of the code does it if I can't understand some of it.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
The only invariant that needs to be maintained
>>>>>>>>>>>>>>>>>>>>>>>>>>>
is that Section[N]->FilePointerOfRawData ==
>>>>>>>>>>>>>>>>>>>>>>>>>>>
Section[N-1]->FilePointerOfRawData + Section[N-1]->SizeOfRawData
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
Well, that and all the sections need to be on the
>>>>>>>>>>>>>>>>>>>>>>>>>>
final file... But I'm hopeful.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
Does anyone have times for linking a big project like
>>>>>>>>>>>>>>>>>>>>>>>>>>
chrome with this so that at least I know what kind of performance to expect?
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
My numbers are something like:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
1 pdb per obj file: link.exe takes ~15 minutes
>>>>>>>>>>>>>>>>>>>>>>>>>>
and 16GB of ram, lld-link.exe takes 2:30 minutes and ~8GB of ram
>>>>>>>>>>>>>>>>>>>>>>>>>>
around 10 pdbs per folder: link.exe takes 1
>>>>>>>>>>>>>>>>>>>>>>>>>>
minute and 2-3GB of ram, lld-link.exe takes 1:30 minutes and ~6GB of ram
>>>>>>>>>>>>>>>>>>>>>>>>>>
fastlink: link.exe takes 40 seconds, but then 20
>>>>>>>>>>>>>>>>>>>>>>>>>>
seconds of loading at the first break point in the debugger and we lost DIA
>>>>>>>>>>>>>>>>>>>>>>>>>>
support for listing symbols.
>>>>>>>>>>>>>>>>>>>>>>>>>>
incremental: link.exe takes 8 seconds, but it
>>>>>>>>>>>>>>>>>>>>>>>>>>
only happens when very minor changes happen.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
We have a non-negligible number of symbols used
>>>>>>>>>>>>>>>>>>>>>>>>>>
on some runtime systems.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
On Sat, Jan 20, 2018 at 11:52 AM Leonardo
>>>>>>>>>>>>>>>>>>>>>>>>>>>
Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Thanks for the tips, I now have something that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
reads the obj file, finds .debug$T sections and global hashes it (proof of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
concept kind of code). What I can't find is: how does clang itself write
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
the coff files with global hashes, as that might help me understand how to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
create the .debug$H section, how to update the file section count and how
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
to properly write this back.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
The code on yaml2coff is expecting to be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
working on the yaml COFFParser struct and I'm having quite a bit of a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
headache turning the COFFObjectFile into a COFFParser object or
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
compatible... Tomorrow I might try the very inefficient path of coff2yaml
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
and then yaml2coff with the hashes header... but it seems way too
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
inefficient and convoluted.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
On Fri, Jan 19, 2018 at 10:38 PM, Zachary
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Turner <zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
On Fri, Jan 19, 2018 at 1:02 PM Leonardo
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
On Fri, Jan 19, 2018 at 9:44 PM, Zachary
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Turner <zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
On Fri, Jan 19, 2018 at 12:29 PM Leonardo
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Hi,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
No I didn't, I used cl.exe from the visual
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
studio toolchain. What I'm proposing is a tool for processing .obj files in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
COFF format, reading them and generating the GHASH part.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
To make our build faster we use hundreds of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
unity build files (.cpp's with a lot of other .cpp's in them aka munch
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
files) but still have a lot of single .cpp's as well (in total something
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
like 3.4k .obj files).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
ps: sorry for sending to the wrong list, I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
was reading about llvm mailing lists and jumped when I saw what I thought
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
was a lld exclusive list.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
A tool like this would be useful, yes.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
We've talked about it internally as well and agreed it would be useful, we
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
just haven't prioritized it. If you're interested in submitting a patch
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
along those lines though, I think it would be a good addition.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
I'm not sure what the best place for it
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
would be. llvm-readobj and llvm-objdump seem like obvious choices, but
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
they are intended to be read-only, so perhaps they wouldn't be a good fit.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
llvm-pdbutil is kind of a hodgepodge of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
everything else related to PDBs and symbols, so I wouldn't be opposed to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
making a new subcommand there called "ghash" or something that could
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
process an object file and output a new object file with a .debug$H section.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
A third option would be to make a new tool
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
for it.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
I don't think it would be that hard to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
write. If you're interested in trying to make a patch for this, I can
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
offer some guidance on where to look in the code. Otherwise it's something
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
that we'll probably get to, I'm just not sure when.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
I would love to write it and contribute it
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
back, please do tell, I did find some of the code of ghash in lld, but I'm
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
fuzzy on the llvm codeview part of it and have never seen llvm-readobj/objdump
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
or llvm-pdbutil, but I'm not afraid to look :)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Luckily all of the important code is hidden
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
behind library calls, and it should already just do the right thing, so I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
suspect you won't need to know much about CodeView to do this.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
I think Peter has the right idea about putting
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
this in llvm-objcopy.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
You can look at one of the existing CopyBinary
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
functions there, which currently only work for ELF, but you can just make a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
new overload that accepts a COFFObjectFile.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
I would probably start by iterating over each
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
of the sections (getNumberOfSections / getSectionName) looking for .debug$T
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
and .debug$H sections.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
If you find a .debug$H section then you can
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
just skip that object file.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
If you find a .debug$T but not a .debug$H,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
then basically do the same thing that LLD does in PDBLinker::mergeDebugT
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
(create a CVTypeArray, and pass it to GloballyHashedType::hashTypes). That
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
will return an array of hash values. (the format of .debug$H is the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
header, followed by the hash values). Then when you're writing the list of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
sections, just add in the .debug$H section right after the .debug$T section.
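The per-file decision described above (skip files that already have .debug$H, otherwise place the new section right after .debug$T) can be sketched like this. It is only an illustration over a plain list of section names: `ghashInsertionIndex` and `NoInsertion` are hypothetical, and a real implementation would walk the sections of a COFFObjectFile instead.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Sentinel meaning "leave this object file alone".
const std::size_t NoInsertion = static_cast<std::size_t>(-1);

// Returns the 0-based index at which a .debug$H section should be inserted
// (directly after the first .debug$T), or NoInsertion when the file already
// has a .debug$H section or has no .debug$T section to hash.
std::size_t ghashInsertionIndex(const std::vector<std::string> &SectionNames) {
  std::size_t DebugT = NoInsertion;
  for (std::size_t I = 0; I < SectionNames.size(); ++I) {
    if (SectionNames[I] == ".debug$H")
      return NoInsertion; // global hashes already present: skip the file
    if (SectionNames[I] == ".debug$T" && DebugT == NoInsertion)
      DebugT = I;
  }
  return DebugT == NoInsertion ? NoInsertion : DebugT + 1;
}
```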
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Currently llvm-objcopy only writes ELF files,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
so it would need to be taught to write COFF files. We have code to do this
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
in the yaml2obj utility (specifically, in yaml2coff.cpp in the function
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
writeCOFF). There may be a way to move this code to somewhere else
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
(llvm/Object/COFF.h?) so that it can be re-used by both yaml2coff and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
llvm-objcopy, but in the worst case scenario you could copy the code and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
re-write it to work with these new structures.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Lastly, you'll probably want to put all of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
this behind an option in llvm-objcopy such as -add-codeview-ghash-section
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
--
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Leonardo Santagada
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
Zachary Turner via llvm-dev
2018-Jan-26 17:49 UTC
[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)
(Ignore the fact that my hashes are 8 bytes in the "good" file, this is due to some local changes I've been experimenting with)

On Fri, Jan 26, 2018 at 9:48 AM Zachary Turner <zturner at google.com> wrote:
> I did this:
>
> // a.cpp
> static int x = 0;
> void b(int);
> void a(int) {
>   if (x)
>     b(x);
> }
> int main(int argc, char **argv) {
>   a(argc);
>   return x;
> }
>
>
> clang-cl /Z7 /c a.cpp /Foa.noghash.obj
> clang-cl /Z7 /c a.cpp -mllvm -emit-codeview-ghash-section /Foa.ghash.good.obj
> llvm-objcopy a.noghash.obj a.ghash.bad.obj
> obj2yaml a.ghash.good.obj > a.ghash.good.yaml
> obj2yaml a.ghash.bad.obj > a.ghash.bad.yaml
>
> Then open these 2 yaml files up in a diff viewer. It looks like the
> hashes aren't getting emitted at all. For example, in the good yaml file I
> see this:
>
>   - Name: '.debug$H'
>     Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_DISCARDABLE, IMAGE_SCN_MEM_READ ]
>     Alignment: 4
>     SectionData: C5C93301000001005549419E78044E3896D45CD7009428758BE4A1E2B3E022BA267DEE221F5C42B17BCA182AF84584814A8B5E7E3FB17B397A9E3DEA75CD5627
>     GlobalHashes:
>       Version: 0
>       HashAlgorithm: 1
>       HashValues:
>         - 5549419E78044E38
>         - 96D45CD700942875
>         - 8BE4A1E2B3E022BA
>         - 267DEE221F5C42B1
>         - 7BCA182AF8458481
>         - 4A8B5E7E3FB17B39
>         - 7A9E3DEA75CD5627
>   - Name: .pdata
>
> And in the bad yaml file I see this:
>
>   - Name: '.debug$H'
>     Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_DISCARDABLE, IMAGE_SCN_MEM_READ ]
>     Alignment: 4
>     SectionData: C5C9330100000000
>     GlobalHashes:
>       Version: 0
>       HashAlgorithm: 0
>   - Name: .pdata
>
> Don't focus too much on trying to figure out weird linker errors. Just
> get the output of obj2yaml to be identical when run under a diff utility,
> then everything should work fine.
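To make the difference between the two SectionData dumps easier to see, here is a small standalone sketch that decodes such a hex string. The layout it assumes (a 4-byte magic, a 2-byte version and a 2-byte hash algorithm, all little-endian, followed by fixed-size hashes) is read off the dumps above rather than taken from LLVM headers, and `DebugHSection`, `readLE` and `parseDebugH` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

struct DebugHSection {
  uint32_t Magic;
  uint16_t Version;
  uint16_t HashAlgorithm;
  std::vector<std::string> Hashes; // each hash kept as hex text
};

// Reads NumBytes little-endian bytes from a hex string, starting at byte Offset.
static uint32_t readLE(const std::string &Hex, std::size_t Offset,
                       std::size_t NumBytes) {
  uint32_t Value = 0;
  for (std::size_t I = 0; I < NumBytes; ++I) {
    const std::string Byte = Hex.substr((Offset + I) * 2, 2);
    Value |= static_cast<uint32_t>(std::stoul(Byte, nullptr, 16)) << (8 * I);
  }
  return Value;
}

// Splits obj2yaml-style SectionData into header fields and hashes.
// HashSize is the size of one hash in bytes (8 in the "good" dump above).
DebugHSection parseDebugH(const std::string &Hex, std::size_t HashSize) {
  const std::size_t HeaderBytes = 8;
  if (HashSize == 0 || Hex.size() % 2 != 0 || Hex.size() / 2 < HeaderBytes)
    throw std::invalid_argument("truncated .debug$H section data");
  const std::size_t PayloadBytes = Hex.size() / 2 - HeaderBytes;
  if (PayloadBytes % HashSize != 0)
    throw std::invalid_argument("hash data is not a multiple of the hash size");

  DebugHSection Result;
  Result.Magic = readLE(Hex, 0, 4);
  Result.Version = static_cast<uint16_t>(readLE(Hex, 4, 2));
  Result.HashAlgorithm = static_cast<uint16_t>(readLE(Hex, 6, 2));
  for (std::size_t I = 0; I < PayloadBytes / HashSize; ++I)
    Result.Hashes.push_back(
        Hex.substr((HeaderBytes + I * HashSize) * 2, HashSize * 2));
  return Result;
}
```

Run on the two dumps, the "good" data yields seven hashes with hash algorithm 1, while the "bad" data has the same magic but hash algorithm 0 and no hashes at all, which matches what the yaml diff shows.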
>
> On Fri, Jan 26, 2018 at 7:27 AM Leonardo Santagada <santagada at gmail.com> wrote:
>
>> I'm so close I can almost smell it :)
>>
>> I know how bad the code looks, I don't intend to submit this, but if you
>> want to try it out it's at:
>> https://gist.github.com/santagada/544136b1ee143bf31653b1158ac6829e
>>
>> I'm seeing: lld-link.exe: error: duplicate symbol: "<redacted_unmangled>"
>> (<redacted>) in <internal> and in <redacted_filename>.obj. Looking at the
>> .yaml dump, the symbols are all similar to this:
>>
>>   - Name: <redacted>
>>     Value: 0
>>     SectionNumber: 0
>>     SimpleType: IMAGE_SYM_TYPE_NULL
>>     ComplexType: IMAGE_SYM_DTYPE_FUNCTION
>>     StorageClass: IMAGE_SYM_CLASS_WEAK_EXTERNAL
>>     WeakExternal:
>>       TagIndex: 134
>>       Characteristics: IMAGE_WEAK_EXTERN_SEARCH_LIBRARY
>>
>> On Thu, Jan 25, 2018 at 8:01 PM, Zachary Turner <zturner at google.com> wrote:
>>
>>> I haven't really dabbled in this part of the COFF format personally, so
>>> hopefully I'm not leading you astray :)
>>>
>>> But I checked the code for coff2yaml, and I see this:
>>>
>>>   } else if (Symbol.isSectionDefinition()) {
>>>     // This symbol represents a section definition.
>>>     assert(Symbol.getNumberOfAuxSymbols() == 1 &&
>>>            "Expected a single aux symbol to describe this section!");
>>>     const object::coff_aux_section_definition *ObjSD =
>>>         reinterpret_cast<const object::coff_aux_section_definition *>(
>>>             AuxData.data());
>>>
>>> So it looks like you need exactly 1 aux symbol for each section symbol.
>>>
>>> I then scrolled up in this function to figure out where AuxData comes
>>> from, and it comes from COFFObjectFile::getSymbolAuxData. I think that
>>> function holds the clue to what you need to do.
>>> It looks like you need to
>>> set coff::symbol::NumberOfAuxSymbols to 1, and then there is a comment in
>>> getSymbolAuxData which says:
>>>
>>>   // AUX data comes immediately after the symbol in COFF
>>>   Aux = reinterpret_cast<const uint8_t *>(Symbol.getRawPtr()) + SymbolSize;
>>>
>>> So I think you just need to write the bytes immediately after the
>>> coff::symbol. The thing you need to write looks like a
>>> coff::coff_aux_section_definition structure.
>>>
>>> For the CheckSum, look at WinCOFFObjectWriter::writeSection. It looks
>>> like it's a CRC32 of the actual section contents, which you can generate
>>> with a couple of lines of code:
>>>
>>>   JamCRC JC(/*Init=*/0);
>>>   JC.update(DebugHContents);
>>>   AuxSymbol.CheckSum = JC.getCRC();
>>>
>>> Hope this helps
>>>
>>> On Thu, Jan 25, 2018 at 10:46 AM Leonardo Santagada <santagada at gmail.com> wrote:
>>>
>>>>
>>>> I see that there is an aux symbol per section symbol, and also in the
>>>> yaml representation there is a checksum, selection and unused, all of which I
>>>> have no idea how to fill in. Also this aux symbol might have some important
>>>> information for me to patch on the other symbols. Can you find the part in
>>>> llvm that writes those? Because at least for the aux symbol the yaml part of
>>>> the code treats it as a binary blob, so there is no info on what they should
>>>> be.
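For readers without LLVM at hand, the CheckSum computation mentioned above can be approximated with a standalone routine. This is a sketch under an assumption: that JamCRC is the usual reflected CRC-32 (polynomial 0xEDB88320) with no final bit inversion, started from the given initial value. `jamCrc` is a hypothetical helper, not the LLVM class.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Feeds Bytes into a reflected CRC-32 (polynomial 0xEDB88320) starting from
// Init and returns the raw register, i.e. without the final inversion that
// the ordinary CRC-32 applies.
uint32_t jamCrc(const std::string &Bytes, uint32_t Init = 0) {
  uint32_t Crc = Init;
  for (unsigned char Byte : Bytes) {
    Crc ^= Byte;
    for (int Bit = 0; Bit < 8; ++Bit)
      Crc = (Crc >> 1) ^ ((Crc & 1u) ? 0xEDB88320u : 0u);
  }
  return Crc;
}
```

The default `Init` of 0 mirrors the `/*Init=*/0` in the snippet quoted above; passing 0xFFFFFFFF instead gives the conventional JAMCRC variant.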
>>>>
>>>> On Thu, Jan 25, 2018 at 7:15 PM, Zachary Turner <zturner at google.com> wrote:
>>>>
>>>>> If you run obj2yaml against a very simple object file, you'll see
>>>>> something like this at the end:
>>>>> ```
>>>>> symbols:
>>>>>   - Name: '@comp.id'
>>>>>     Value: 17130443
>>>>>     SectionNumber: -1
>>>>>     SimpleType: IMAGE_SYM_TYPE_NULL
>>>>>     ComplexType: IMAGE_SYM_DTYPE_NULL
>>>>>     StorageClass: IMAGE_SYM_CLASS_STATIC
>>>>>   - Name: '@feat.00'
>>>>>     Value: 2147484048
>>>>>     SectionNumber: -1
>>>>>     SimpleType: IMAGE_SYM_TYPE_NULL
>>>>>     ComplexType: IMAGE_SYM_DTYPE_NULL
>>>>>     StorageClass: IMAGE_SYM_CLASS_STATIC
>>>>>   - Name: .drectve
>>>>>     Value: 0
>>>>>     SectionNumber: 1
>>>>>     SimpleType: IMAGE_SYM_TYPE_NULL
>>>>>     ComplexType: IMAGE_SYM_DTYPE_NULL
>>>>>     StorageClass: IMAGE_SYM_CLASS_STATIC
>>>>>     SectionDefinition:
>>>>>       Length: 47
>>>>>       NumberOfRelocations: 0
>>>>>       NumberOfLinenumbers: 0
>>>>>       CheckSum: 0
>>>>>       Number: 0
>>>>> ...
>>>>> ```
>>>>>
>>>>> There's a structure called coff::symbol which basically represents
>>>>> each one of these records. It looks like this:
>>>>>
>>>>> ```
>>>>> struct symbol {
>>>>>   char Name[NameSize];
>>>>>   uint32_t Value;
>>>>>   int32_t SectionNumber;
>>>>>   uint16_t Type;
>>>>>   uint8_t StorageClass;
>>>>>   uint8_t NumberOfAuxSymbols;
>>>>> };
>>>>> ```
>>>>>
>>>>> So you'll need to create one for the debug$H section and stick it into
>>>>> the list. This particular list doesn't have to be in any special order, so
>>>>> you can just put it at the end (although it's probably not that much harder
>>>>> to insert into the middle, and it will make for a good test that you've
>>>>> done it right. The output can be diffed against clang-cl object file and
>>>>> be identical this way). So write all the normal symbols as you probably
>>>>> already are, then write one for the .debug$H section.
>>>>> Initialize the
>>>>> fields to the same thing that you see when you run obj2yaml against an
>>>>> object file generated by clang-cl for the .debug$H section.
>>>>>
>>>>> This structure doesn't contain any kind of file pointers or offsets,
>>>>> so all you really need to fix up are the "SectionNumber" fields. Basically
>>>>> as you are writing the existing symbols, you would do something like:
>>>>>
>>>>>   for (auto Sym : ObjFile.symbols()) {  // copy, so it can be adjusted
>>>>>     if (Sym.SectionNumber >= DebugHInsertionIndex)
>>>>>       ++Sym.SectionNumber;
>>>>>     writeSymbol(Sym);
>>>>>   }
>>>>>   writeSymbol(DebugHSym);
>>>>>
>>>>>
>>>>> On Thu, Jan 25, 2018 at 9:57 AM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>
>>>>>> Any idea on how to create this new symbol there? I saw that there is
>>>>>> a symbol pointing to each section, but didn't understand the format, and
>>>>>> yaml2obj doesn't check it or do anything with the list.
>>>>>>
>>>>>> On Thu, Jan 25, 2018 at 6:56 PM, Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>
>>>>>>> YES, THANK YOU... I WAS THINKING THIS BUT COMPLETELY FORGOT.
>>>>>>>
>>>>>>> sorry for the caps... long day of working on this, and using vs
>>>>>>> 2017, which adds a new section type .chks64 for which I couldn't find
>>>>>>> documentation anywhere, was difficult. I highly recommend everyone to just
>>>>>>> not use vs 2017 until 15.8 or something, our internal bug list is
>>>>>>> gigantic.
>>>>>>>
>>>>>>> On Thu, Jan 25, 2018 at 6:52 PM, Zachary Turner <zturner at google.com> wrote:
>>>>>>>
>>>>>>>> Actually I already have a theory that even though you are adding
>>>>>>>> the section to the section table, you might not be adding a *symbol* for
>>>>>>>> the section to the symbol table. So the existing symbols (which reference
>>>>>>>> sections by index) will all be wrong because you've inserted a new
>>>>>>>> section. Still though, obj2yaml would expose that.
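The SectionNumber fix-up discussed above can be tried out in isolation with a simplified symbol record. This is only a sketch: `Symbol` here is a hypothetical two-field struct rather than coff::symbol, aux symbols are ignored, and the new section's number is assumed to be 1-based and at least 1, so the special values 0, -1 and -2 are never touched.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, cut-down symbol record: just the fields the fix-up needs.
struct Symbol {
  std::string Name;
  int32_t SectionNumber; // 1-based; 0, -1 and -2 are special values
};

// Returns a copy of Symbols in which every symbol that refers to a section
// at or after the new section's number is shifted by one, with a symbol for
// the new .debug$H section appended at the end.
std::vector<Symbol> addDebugHSymbol(std::vector<Symbol> Symbols,
                                    int32_t DebugHSectionNumber) {
  for (Symbol &Sym : Symbols)
    if (Sym.SectionNumber >= DebugHSectionNumber)
      ++Sym.SectionNumber;
  Symbols.push_back(Symbol{".debug$H", DebugHSectionNumber});
  return Symbols;
}
```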
>>>>>>>>
>>>>>>>> On Thu, Jan 25, 2018 at 9:50 AM Zachary Turner <zturner at google.com> wrote:
>>>>>>>>
>>>>>>>>> Yea as long as you compare clang-cl object file with automatically
>>>>>>>>> generated .debug$H section against clang-cl object file without .debug$H
>>>>>>>>> but added after the fact with llvm-objcopy, that should expose the problem
>>>>>>>>> I think when you run obj2yaml on them.
>>>>>>>>>
>>>>>>>>> On Thu, Jan 25, 2018 at 9:49 AM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> I did reorder my sections, so that .debug$H is in the correct
>>>>>>>>>> place, but now I get some errors on duplicate symbols. I created a folder
>>>>>>>>>> with examples:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> https://www.dropbox.com/sh/nmvzi44pi0boe76/AAA0f47O5PCJ9JiUc6wVuwBra?dl=0
>>>>>>>>>>
>>>>>>>>>> t.obj is generated by vs 2015 and it links fine with
>>>>>>>>>> lld-link.exe, but tout.obj gives these errors:
>>>>>>>>>>
>>>>>>>>>> lld-link.exe /DEBUG:GHASH tout.obj
>>>>>>>>>> LLD-LINK.EXE: error: duplicate symbol:
>>>>>>>>>> __local_stdio_printf_options in tout.obj and in
>>>>>>>>>> LIBCMT.lib(default_local_stdio_options.obj)
>>>>>>>>>> LLD-LINK.EXE: error: duplicate symbol:
>>>>>>>>>> __local_stdio_printf_options in tout.obj and in
>>>>>>>>>> libvcruntime.lib(undname.obj)
>>>>>>>>>>
>>>>>>>>>> I'm using PEView from http://wjradburn.com/software/ to look at
>>>>>>>>>> the files and can't see anything wrong, except some valid differences in
>>>>>>>>>> the offsets being used for the data (so pointer to data is different
>>>>>>>>>> between them).
>>>>>>>>>>
>>>>>>>>>> I will look into yaml2obj now to see if I see anything else weird
>>>>>>>>>> going on.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Jan 25, 2018 at 6:41 PM, Zachary Turner <zturner at google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I'm pretty confident that cl is not putting anything strange in
>>>>>>>>>>> the .debug$T sections.
>>>>>>>>>>> We've done a lot of testing and never seen anything
>>>>>>>>>>> except CodeView type records in a .debug$T. My hunch is that your objcopy
>>>>>>>>>>> patch is probably not doing the right thing in one or more of the section
>>>>>>>>>>> headers, and this is confusing the linker.
>>>>>>>>>>>
>>>>>>>>>>> One idea might be to build a simple object file with clang-cl
>>>>>>>>>>> but without the magic -mllvm -emit-codeview-ghash-section, then run your
>>>>>>>>>>> llvm-objcopy on it. Then build the same object file passing -mllvm
>>>>>>>>>>> -emit-codeview-ghash-section. Then run obj2yaml on both and diff the
>>>>>>>>>>> results. They should be byte-for-byte identical. That should give you a
>>>>>>>>>>> clue about if objcopy is doing something wrong.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jan 25, 2018 at 2:21 AM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Don't worry, I definitely want to perfect this to generate legal
>>>>>>>>>>>> obj files, this is just to speed up testing.
>>>>>>>>>>>>
>>>>>>>>>>>> Now after patching all the obj files I get these errors when
>>>>>>>>>>>> linking a small part of our code base (msvc 2017 15.5.3, lld and
>>>>>>>>>>>> llvm-objcopy 7.0.0):
>>>>>>>>>>>> lld-link.exe : error : relocation against symbol in discarded
>>>>>>>>>>>> section: $LN8
>>>>>>>>>>>> lld-link.exe : error : relocation against symbol in discarded
>>>>>>>>>>>> section: $LN43
>>>>>>>>>>>> lld-link.exe : error : relocation against symbol in discarded
>>>>>>>>>>>> section: $LN37
>>>>>>>>>>>>
>>>>>>>>>>>> I'm starting to guess that cl.exe might be putting some random
>>>>>>>>>>>> comdat or other discardable symbols in the .debug$T and clang doesn't? I
>>>>>>>>>>>> will try to debug this and see what more I can uncover.
>>>>>>>>>>>>
>>>>>>>>>>>> Linking works perfectly without my llvm-objcopy pass to add
>>>>>>>>>>>> .debug$H?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jan 25, 2018 at 1:53 AM, Zachary Turner <zturner at google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> It might not influence LLD, but at the same time we don't want
>>>>>>>>>>>>> to upstream something that is producing technically illegal COFF files.
>>>>>>>>>>>>> Also good to hear about the planned changes to your header files. Looking
>>>>>>>>>>>>> forward to hearing about your experiences with clang-cl.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Jan 24, 2018 at 10:41 AM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I finally got my first .obj file patched with .debug$H to
>>>>>>>>>>>>>> look somewhat right. I added the new section at the end of the file so I
>>>>>>>>>>>>>> don't have to recalculate all sections (although now I probably could
>>>>>>>>>>>>>> position it in the middle, knowing that each section is: SizeOfRawData +
>>>>>>>>>>>>>> (last.Header.NumberOfRelocations * (4+4+2)) and the $H needs to come right
>>>>>>>>>>>>>> after $T in the file). That, although illegal based on the COFF spec,
>>>>>>>>>>>>>> doesn't seem like it's going to influence lld.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Also we talked and we are probably going to do something
>>>>>>>>>>>>>> similar to a bunch of windows defines and a check for our own define (to
>>>>>>>>>>>>>> guarantee that no one imported windows.h before win32.h) and drop the
>>>>>>>>>>>>>> namespace and the conflicting names.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Jan 23, 2018 at 12:46 AM, Zachary Turner <zturner at google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It's very possible that a 3rd party indirect header
>>>>>>>>>>>>>>> include is involved. One idea might be like I suggested where you #define
>>>>>>>>>>>>>>> _WINDOWS_ in win32.h and guarantee that it's always included first. Then
>>>>>>>>>>>>>>> those other headers won't be able to #include <windows.h>.
>>>>>>>>>>>>>>> But it will
>>>>>>>>>>>>>>> probably greatly expand the amount of stuff you have to add to win32.h, as
>>>>>>>>>>>>>>> you will probably find some callers of functions that aren't yet in your
>>>>>>>>>>>>>>> win32.h that you'd have to add.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 3:28 PM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Ok, some information was lost on getting this example to
>>>>>>>>>>>>>>>> you, I'm sorry for not being clear.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> We have a huge code base, let's say 90% of it doesn't
>>>>>>>>>>>>>>>> include either header, 9% includes win32.h and 1% includes both. I will try
>>>>>>>>>>>>>>>> to discover why, but my guess is they include both a third party that
>>>>>>>>>>>>>>>> includes windows.h and some of our libs that use win32.h.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I will try to fully understand this tomorrow.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I guess clang will not implement this ever so finishing the
>>>>>>>>>>>>>>>> object copier is the best solution until all code is ported to clang.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 23 Jan 2018 00:02, "Zachary Turner" <zturner at google.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> You said win32.h doesn't include windows.h, but main.cpp
>>>>>>>>>>>>>>>>> does. So what's the disadvantage of just including it in win32.h anyway,
>>>>>>>>>>>>>>>>> since it's already going to be in every translation unit? (Unless you
>>>>>>>>>>>>>>>>> didn't mean to #include it in main.cpp)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I guess all I can do is warn you how bad of an idea this
>>>>>>>>>>>>>>>>> is.
>>>>>>>>>>>>>>>>> For starters, I already found a bug in your code ;-)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> // stdint.h
>>>>>>>>>>>>>>>>> typedef int int32_t;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> // winnt.h
>>>>>>>>>>>>>>>>> typedef long LONG;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> // windef.h
>>>>>>>>>>>>>>>>> typedef struct tagPOINT
>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>   LONG x; // long x
>>>>>>>>>>>>>>>>>   LONG y; // long y
>>>>>>>>>>>>>>>>> } POINT, *PPOINT, NEAR *NPPOINT, FAR *LPPOINT;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> // win32.h
>>>>>>>>>>>>>>>>> typedef int32_t LONG;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> struct POINT
>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>   LONG x; // int x
>>>>>>>>>>>>>>>>>   LONG y; // int y
>>>>>>>>>>>>>>>>> };
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> So POINT is defined two different ways. In your minimal
>>>>>>>>>>>>>>>>> interface, it's declared as 2 int32's, which are int. In the actual
>>>>>>>>>>>>>>>>> Windows header files, it's declared as 2 longs.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This might seem like an unimportant bug since int and long
>>>>>>>>>>>>>>>>> are the same size, but int and long also mangle differently and affect
>>>>>>>>>>>>>>>>> overload resolution, so you could have weird linker errors or call the
>>>>>>>>>>>>>>>>> wrong function overload.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Plus, it illustrates the fact that this struct *actually
>>>>>>>>>>>>>>>>> is* a different type from the one in the windows header.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> You said at the end that you never intentionally import
>>>>>>>>>>>>>>>>> win32.h and windows.h from the same translation unit. But then in this
>>>>>>>>>>>>>>>>> example you did. I wonder if you could enforce that by doing this:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> // win32.h
>>>>>>>>>>>>>>>>> #pragma once
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> // Error if windows.h was included before us.
>>>>>>>>>>>>>>>>> #if defined(_WINDOWS_) >>>>>>>>>>>>>>>>> #error "You're including win32.h after having already >>>>>>>>>>>>>>>>> included windows.h. Don't do this!" >>>>>>>>>>>>>>>>> #endif >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> // And also make sure windows.h can't get included after us >>>>>>>>>>>>>>>>> #define _WINDOWS_ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> For the record, I tried the test case you linked when >>>>>>>>>>>>>>>>> windows.h is not included in main.cpp and it works (but still has the bug >>>>>>>>>>>>>>>>> about int and long). >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 2:23 PM Leonardo Santagada < >>>>>>>>>>>>>>>>> santagada at gmail.com> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> It is super gross, but we copy parts of windows.h because >>>>>>>>>>>>>>>>>> having all of it if both gigantic and very very messy. So our win32.h has a >>>>>>>>>>>>>>>>>> couple thousands of lines and not 30k+ for windows.h and we try to have >>>>>>>>>>>>>>>>>> zero macros. Win32.h doesn't include windows.h so using ::BOOL wouldn't >>>>>>>>>>>>>>>>>> work. We don't want to create a namespace, we just want a cleaner interface >>>>>>>>>>>>>>>>>> to windows api. The namespace with c linkage is the way to trick cl into >>>>>>>>>>>>>>>>>> allowing us to in some files have both windows.h and Win32.h. I really >>>>>>>>>>>>>>>>>> don't see any way for us to have this Win32.h without this cl support, so >>>>>>>>>>>>>>>>>> maybe we should either put windows.h in a compiled header somewhere and not >>>>>>>>>>>>>>>>>> care that it is infecting everything or just have one place we can call to >>>>>>>>>>>>>>>>>> clean up after including windows.h (a massive set of undefs). >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> So using can't work, because we never intentionally >>>>>>>>>>>>>>>>>> import windows.h and win32.h on the same translation unit. 
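The int-versus-long point raised in this exchange can be checked in isolation. The following is only a minimal sketch: `WinntLong`, `Win32Long`, `which_overload` and `check_overloads` are hypothetical names invented for illustration, and it assumes (as the quoted stdint.h snippet does) that int32_t is a typedef for int.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-ins for the two competing definitions of LONG.
using WinntLong = long;  // mirrors winnt.h: typedef long LONG;
using Win32Long = int;   // mirrors the trimmed win32.h: typedef int32_t LONG;

// Two overloads that differ only in int versus long.
inline std::string which_overload(int)  { return "int"; }
inline std::string which_overload(long) { return "long"; }

// int and long are distinct types even when they have the same size,
// so the two LONG definitions are not interchangeable.
static_assert(!std::is_same<WinntLong, Win32Long>::value,
              "int and long are distinct types");

// Runtime check: each alias selects a different overload.
inline void check_overloads() {
  assert(which_overload(Win32Long{0}) == "int");
  assert(which_overload(WinntLong{0}) == "long");
}
```

Because the two aliases pick different overloads, a function declared against one LONG and defined against the other would also get two different mangled names, which is where the linker errors mentioned in the thread would come from.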
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 7:08 PM, Zachary Turner < >>>>>>>>>>>>>>>>>> zturner at google.com> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> This is pretty gross, honestly :) >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Can't you just use using declarations? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> namespace Win32 { >>>>>>>>>>>>>>>>>>> extern "C" { >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> using ::BOOL; >>>>>>>>>>>>>>>>>>> using ::LONG; >>>>>>>>>>>>>>>>>>> using ::POINT; >>>>>>>>>>>>>>>>>>> using ::LPPOINT; >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> using ::GetCursorPos; >>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> This works with clang-cl. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Mon, Jan 22, 2018 at 5:39 AM Leonardo Santagada < >>>>>>>>>>>>>>>>>>> santagada at gmail.com> wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Here it is a minimal example, we do this so we don't >>>>>>>>>>>>>>>>>>>> have to import the whole windows api everywhere. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> https://gist.github.com/santagada/7977e929d31c629c4bf18ebb987f6be3 >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Sun, Jan 21, 2018 at 2:31 AM, Zachary Turner < >>>>>>>>>>>>>>>>>>>> zturner at google.com> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Clang-cl maintains compatibility with msvc even in >>>>>>>>>>>>>>>>>>>>> cases where it’s non standards compliant (eg 2 phase name lookup), but we >>>>>>>>>>>>>>>>>>>>> try to keep these cases few and far between. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> To help me understand your case, do you mean you copy >>>>>>>>>>>>>>>>>>>>> windows.h and modify it? How does this lead to the same struct being >>>>>>>>>>>>>>>>>>>>> defined twice? 
If i were to write this: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> struct Foo {}; >>>>>>>>>>>>>>>>>>>>> struct Foo {}; >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Is this a small repro of the issue you’re talking >>>>>>>>>>>>>>>>>>>>> about? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 3:44 PM Leonardo Santagada < >>>>>>>>>>>>>>>>>>>>> santagada at gmail.com> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> I can totally see something like incremental linking >>>>>>>>>>>>>>>>>>>>>> with a simple padding between obj and a mapping file (which can also help >>>>>>>>>>>>>>>>>>>>>> with edit and continue, something we also would love to have). >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> We have another developer doing the port to support >>>>>>>>>>>>>>>>>>>>>> clang-cl, but although most of our code also goes trough a version of >>>>>>>>>>>>>>>>>>>>>> clang, migrating the rest to clang-cl has been a fight. From what I heard >>>>>>>>>>>>>>>>>>>>>> the main problem is that we have a copy of parts of windows.h (so not to >>>>>>>>>>>>>>>>>>>>>> bring the awful parts of it like lower case macros) and that totally works >>>>>>>>>>>>>>>>>>>>>> on cl, but clang (at least 6.0) complains about two struct/vars with the >>>>>>>>>>>>>>>>>>>>>> same name, even though they are exactly the same. Making clang-cl as broken >>>>>>>>>>>>>>>>>>>>>> as cl.exe is not an option I suppose? I would love to turn on a flag >>>>>>>>>>>>>>>>>>>>>> --accept-that-cl-made-bad-decisions-and-live-with-it and have this at least >>>>>>>>>>>>>>>>>>>>>> until this is completely fixed in our code base. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> the biggest win with moving to cl would be a better >>>>>>>>>>>>>>>>>>>>>> more standards compliant compiler, no 1 minute compiles on heavily >>>>>>>>>>>>>>>>>>>>>> templated files and maybe the holy grail of ThinLTO. 
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 10:56 PM, Zachary Turner < >>>>>>>>>>>>>>>>>>>>>> zturner at google.com> wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> 10-15s will be hard without true incremental linking. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> At some point that's going to be the only way to get >>>>>>>>>>>>>>>>>>>>>>> any faster, but incremental linking is hard (putting it lightly), and since >>>>>>>>>>>>>>>>>>>>>>> our full links are already really fast we think we can get reasonably close >>>>>>>>>>>>>>>>>>>>>>> to link.exe incremental speeds with full links. But it's never enough and >>>>>>>>>>>>>>>>>>>>>>> I will always want it to be faster, so you may see incremental linking in >>>>>>>>>>>>>>>>>>>>>>> the future after we hit a performance wall with full link speed :) >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> In any case, I'm definitely interested in seeing >>>>>>>>>>>>>>>>>>>>>>> what kind of numbers you get with /debug:ghash after you get this >>>>>>>>>>>>>>>>>>>>>>> llvm-objcopy feature implemented. So keep me updated :) >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> As an aside, have you tried building with clang >>>>>>>>>>>>>>>>>>>>>>> instead of cl? If you build with clang you wouldn't even have to do this >>>>>>>>>>>>>>>>>>>>>>> llvm-objcopy work, because it would "just work". If you've tried but ran >>>>>>>>>>>>>>>>>>>>>>> into issues I'm interested in hearing about those too. On the other hand, >>>>>>>>>>>>>>>>>>>>>>> it's also reasonable to only switch one thing at a time. 
>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 1:34 PM Leonardo Santagada < >>>>>>>>>>>>>>>>>>>>>>> santagada at gmail.com> wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> if we get to < 30s I think most users would prefer >>>>>>>>>>>>>>>>>>>>>>>> it to link.exe, just hopping there is still some more optimizations to get >>>>>>>>>>>>>>>>>>>>>>>> closer to ELF linking times (around 10-15s here). >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 9:50 PM, Zachary Turner < >>>>>>>>>>>>>>>>>>>>>>>> zturner at google.com> wrote: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Generally speaking a good rule of thumb is that >>>>>>>>>>>>>>>>>>>>>>>>> /debug:ghash will be close to or faster than /debug:fastlink, but with none >>>>>>>>>>>>>>>>>>>>>>>>> of the penalties like slow debug time >>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 12:44 PM Zachary Turner < >>>>>>>>>>>>>>>>>>>>>>>>> zturner at google.com> wrote: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Chrome is actually one of my exact benchmark >>>>>>>>>>>>>>>>>>>>>>>>>> cases. When building blink_core.dll and browser_tests.exe, i get anywhere >>>>>>>>>>>>>>>>>>>>>>>>>> from a 20-40% reduction in link time. We have some other optimizations in >>>>>>>>>>>>>>>>>>>>>>>>>> the pipeline but not upstream yet. 
>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> My best time so far (including other >>>>>>>>>>>>>>>>>>>>>>>>>> optimizations not yet upstream) is 28s on blink_core.dll, compared to 110s >>>>>>>>>>>>>>>>>>>>>>>>>> with /debug >>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 12:28 PM Leonardo >>>>>>>>>>>>>>>>>>>>>>>>>> Santagada <santagada at gmail.com> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 9:05 PM, Zachary Turner >>>>>>>>>>>>>>>>>>>>>>>>>>> <zturner at google.com> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> You probably don't want to go down the same >>>>>>>>>>>>>>>>>>>>>>>>>>>> route that clang goes through to write the object file. If you think >>>>>>>>>>>>>>>>>>>>>>>>>>>> yaml2coff is convoluted, the way clang does it will just give you a >>>>>>>>>>>>>>>>>>>>>>>>>>>> headache. There are multiple abstractions involved to account for >>>>>>>>>>>>>>>>>>>>>>>>>>>> different object file formats (ELF, COFF, MachO) and output formats >>>>>>>>>>>>>>>>>>>>>>>>>>>> (Assembly, binary file). At least with yaml2coff >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> I think your phrase got cut there, but yeah I >>>>>>>>>>>>>>>>>>>>>>>>>>> just found AsmPrinter.cpp and it is convoluted. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> It's true that yaml2coff is using the >>>>>>>>>>>>>>>>>>>>>>>>>>>> COFFParser structure, but if you look at the writeCOFF >>>>>>>>>>>>>>>>>>>>>>>>>>>> function in yaml2coff it's pretty bare-metal. The logic you need will be >>>>>>>>>>>>>>>>>>>>>>>>>>>> almost identical, except that instead of checking the COFFParser for the >>>>>>>>>>>>>>>>>>>>>>>>>>>> various fields, you'll check the existing COFFObjectFile, which should have >>>>>>>>>>>>>>>>>>>>>>>>>>>> similar fields. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> The only thing you need to do differently is, when >>>>>>>>>>>>>>>>>>>>>>>>>>>> writing the section table and section contents, to insert a new entry. Since >>>>>>>>>>>>>>>>>>>>>>>>>>>> you're injecting a section into the middle, you'll also probably need to >>>>>>>>>>>>>>>>>>>>>>>>>>>> push back the file pointer of all subsequent sections so that they don't >>>>>>>>>>>>>>>>>>>>>>>>>>>> overlap. (e.g. if the original sections are 1, 2, 3, 4, 5 and you insert >>>>>>>>>>>>>>>>>>>>>>>>>>>> between 2 and 3, then the original sections 3, 4, and 5 would need to have >>>>>>>>>>>>>>>>>>>>>>>>>>>> their PointerToRawData offset by the size of the new section). >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> I have the PE/COFF spec open here and I'm happy >>>>>>>>>>>>>>>>>>>>>>>>>>> that I read a bit of it so I actually know what you are talking about... >>>>>>>>>>>>>>>>>>>>>>>>>>> yeah, it doesn't seem too complicated. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> If you need to know what values to put for the >>>>>>>>>>>>>>>>>>>>>>>>>>>> other fields in a section header, run `dumpbin /headers foo.obj` on a >>>>>>>>>>>>>>>>>>>>>>>>>>>> clang-generated object file that has a .debug$H section already (e.g. run >>>>>>>>>>>>>>>>>>>>>>>>>>>> clang with -emit-codeview-ghash-section, and look at the properties of the >>>>>>>>>>>>>>>>>>>>>>>>>>>> .debug$H section and use the same values). >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, I will do that and then also look at how >>>>>>>>>>>>>>>>>>>>>>>>>>> the CodeView part of the code does it if I can't understand some of it. 
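The offset adjustment described in this exchange (insert between sections 2 and 3, then push sections 3, 4 and 5 back by the size of the new one) can be sketched as follows. This is an illustration under stated assumptions, not code from llvm-objcopy or yaml2coff: `SectionInfo` and `insertSection` are hypothetical names, and only the two COFF section header fields under discussion are modeled. A real object file also needs room for the extra section header in the section table and has relocation and symbol table offsets to adjust, which this sketch ignores.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified section record: only the two fields discussed here.
struct SectionInfo {
  uint32_t PointerToRawData;  // file offset of the section contents
  uint32_t SizeOfRawData;     // size of the section contents in bytes
};

// Insert a new section of `size` bytes at position `index` (0-based) and
// shift the raw-data offset of every section that follows it.
inline void insertSection(std::vector<SectionInfo> &sections,
                          std::size_t index, uint32_t size) {
  // Assumption of this sketch: the new section always has a predecessor.
  assert(index > 0 && index <= sections.size());

  // The new section starts where its predecessor ends.
  const SectionInfo &prev = sections[index - 1];
  const SectionInfo added{prev.PointerToRawData + prev.SizeOfRawData, size};

  // Push back everything that used to live at or after that offset.
  for (std::size_t i = index; i < sections.size(); ++i)
    sections[i].PointerToRawData += size;

  sections.insert(sections.begin() + static_cast<std::ptrdiff_t>(index), added);
}
```

With five contiguous sections and a call such as `insertSection(sections, 2, size)`, the new entry takes the offset the old third section had, and the old third, fourth and fifth sections each move by `size` bytes, so the layout stays contiguous.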
>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> The only invariant that needs to be maintained >>>>>>>>>>>>>>>>>>>>>>>>>>>> is that Section[N]->PointerToRawData == >>>>>>>>>>>>>>>>>>>>>>>>>>>> Section[N-1]->PointerToRawData + Section[N-1]->SizeOfRawData >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Well, that and all the sections need to be in >>>>>>>>>>>>>>>>>>>>>>>>>>> the final file... But I'm hopeful. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Does anyone have timings for linking a big project like >>>>>>>>>>>>>>>>>>>>>>>>>>> chrome with this, so that at least I know what kind of performance to expect? >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> My numbers are something like: >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> 1 pdb per obj file: link.exe takes ~15 minutes >>>>>>>>>>>>>>>>>>>>>>>>>>> and 16GB of ram, lld-link.exe takes 2:30 minutes and ~8GB of ram >>>>>>>>>>>>>>>>>>>>>>>>>>> around 10 pdbs per folder: link.exe takes 1 >>>>>>>>>>>>>>>>>>>>>>>>>>> minute and 2-3GB of ram, lld-link.exe takes 1:30 minutes and ~6GB of ram >>>>>>>>>>>>>>>>>>>>>>>>>>> fastlink: link.exe takes 40 seconds, but then 20 >>>>>>>>>>>>>>>>>>>>>>>>>>> seconds of loading at the first break point in the debugger and we lose DIA >>>>>>>>>>>>>>>>>>>>>>>>>>> support for listing symbols. >>>>>>>>>>>>>>>>>>>>>>>>>>> incremental: link.exe takes 8 seconds, but it >>>>>>>>>>>>>>>>>>>>>>>>>>> only happens when very minor changes happen. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> We have a non-negligible number of symbols used >>>>>>>>>>>>>>>>>>>>>>>>>>> on some runtime systems. 
>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Jan 20, 2018 at 11:52 AM Leonardo >>>>>>>>>>>>>>>>>>>>>>>>>>>> Santagada <santagada at gmail.com> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the tips, I now have something that >>>>>>>>>>>>>>>>>>>>>>>>>>>>> reads the obj file, finds .debug$T sections and global hashes it (proof of >>>>>>>>>>>>>>>>>>>>>>>>>>>>> concept kind of code). What I can't find is: how does clang itself writes >>>>>>>>>>>>>>>>>>>>>>>>>>>>> the coff files with global hashes, as that might help me understand how to >>>>>>>>>>>>>>>>>>>>>>>>>>>>> create the .debug$H section, how to update the file section count and how >>>>>>>>>>>>>>>>>>>>>>>>>>>>> to properly write this back. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> The code on yaml2coff is expecting to be >>>>>>>>>>>>>>>>>>>>>>>>>>>>> working on the yaml COFFParser struct and I'm having quite a bit of a >>>>>>>>>>>>>>>>>>>>>>>>>>>>> headache turning the COFFObjectFile into a COFFParser object or >>>>>>>>>>>>>>>>>>>>>>>>>>>>> compatible... Tomorrow I might try the very non efficient path of coff2yaml >>>>>>>>>>>>>>>>>>>>>>>>>>>>> and then yaml2coff with the hashes header... but it seems way too >>>>>>>>>>>>>>>>>>>>>>>>>>>>> inefficient and convoluted. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 10:38 PM, Zachary >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Turner <zturner at google.com> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 1:02 PM Leonardo >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Santagada <santagada at gmail.com> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 9:44 PM, Zachary >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Turner <zturner at google.com> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 12:29 PM Leonardo >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Santagada <santagada at gmail.com> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> No I didn't, I used cl.exe from the visual >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> studio toolchain. What I'm proposing is a tool for processing .obj files in >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> COFF format, reading them and generating the GHASH part. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> To make our build faster we use hundreds >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> of unity build files (.cpp's with a lot of other .cpp's in them aka munch >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> files) but still have a lot of single .cpp's as well (in total something >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> like 3.4k .obj files). >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ps: sorry for sending to the wrong list, I >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> was reading about llvm mailing lists and jumped when I saw what I thought >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> was a lld exclusive list. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> A tool like this would be useful, yes. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> We've talked about it internally as well and agreed it would be useful; we >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> just haven't prioritized it. If you're interested in submitting a patch >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> along those lines though, I think it would be a good addition. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm not sure what the best place for it >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> would be. llvm-readobj and llvm-objdump seem like obvious choices, but >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> they are intended to be read-only, so perhaps they wouldn't be a good fit. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> llvm-pdbutil is kind of a hodgepodge of >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> everything else related to PDBs and symbols, so I wouldn't be opposed to >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> making a new subcommand there called "ghash" or something that could >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> process an object file and output a new object file with a .debug$H section. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> A third option would be to make a new tool >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for it. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I don't think it would be that hard to >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> write. If you're interested in trying to make a patch for this, I can >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> offer some guidance on where to look in the code. Otherwise it's something >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> that we'll probably get to, I'm just not sure when. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would love to write it and contribute it >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> back, please do tell, I did find some of the code of ghash in lld, but in >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> fuzzy on the llvm codeview part of it and never seen llvm-readobj/objdump >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> or llvm-pdbutil, but I'm not afraid to look :) >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Luckily all of the important code is hidden >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> behind library calls, and it should already just do the right thing, so I >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> suspect you won't need to know much about CodeView to do this. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think Peter has the right idea about >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> putting this in llvm-objcopy. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> You can look at one of the existing >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> CopyBinary functions there, which currently only work for ELF, but you can >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> just make a new overload that accepts a COFFObjectFile. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would probably start by iterating over each >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> of the sections (getNumberOfSections / getSectionName) looking for .debug$T >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and .debug$H sections. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If you find a .debug$H section then you can >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> just skip that object file. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If you find a .debug$T but not a .debug$H, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> then basically do the same thing that LLD does in PDBLinker::mergeDebugT >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (create a CVTypeArray, and pass it to GloballyHashedType::hashTypes. 
That >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> will return an array of hash values. (the format of .debug$H is the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> header, followed by the hash values). Then when you're writing the list of >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sections, just add in the .debug$H section right after the .debug$T section. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Currently llvm-objcopy only writes ELF files, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> so it would need to be taught to write COFF files. We have code to do this >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> in the yaml2obj utility (specifically, in yaml2coff.cpp in the function >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> writeCOFF). There may be a way to move this code to somewhere else >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (llvm/Object/COFF.h?) so that it can be re-used by both yaml2coff and >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> llvm-objcopy, but in the worst case scenario you could copy the code and >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> re-write it to work with these new structures. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Lastly, you'll probably want to put all of >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> this behind an option in llvm-objcopy such as -add-codeview-ghash-section >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Leonardo Santagada >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Leonardo Santagada >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Leonardo Santagada >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Leonardo Santagada >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Leonardo Santagada >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Leonardo Santagada >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> >>>>>>>>>>>>>> Leonardo Santagada >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> >>>>>>>>>>>> Leonardo Santagada >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> >>>>>>>>>> Leonardo Santagada >>>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> >>>>>>> Leonardo Santagada >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> >>>>>> Leonardo Santagada >>>>>> >>>>> >>>> >>>> >>>> -- >>>> >>>> Leonardo Santagada >>>> >>> >> >> >> -- >> >> Leonardo Santagada >> 