Leonardo Santagada via llvm-dev
2018-Jan-31 12:44 UTC
[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)
So I found all the hard-coded 20-byte sizes and changed them to GHASH_SIZE
(a const I defined in typehashing.h) and finished the switch to xxHash64,
which saved me around 50 seconds, down to 56s. Then I changed it to a
uint64_t instead of an 8-byte uint8_t array and that gave me 48s. With a
release config and a PGO pass I'm now linking in 38s... so faster than
link.exe in VS 2017 (which is faster than VS 2015) doing fastlink.

Now we are in a very good place: still not as fast as an incremental
link, but generating better code and a full PDB.

Do you want any of these patches? They aren't as clean as they could be,
but they might help someone doing this properly.

On Wed, Jan 31, 2018 at 9:39 AM, Leonardo Santagada <santagada at gmail.com> wrote:
> Uhmm I changed only type hashing... Ok back to trying it again. Let me
> find where it is looking at 20 bytes instead of using the size of global
> type hash.
>
> On 30 Jan 2018 21:33, "Zachary Turner" <zturner at google.com> wrote:
>>
>> Did you change both the compiler and linker (or make sure that your
>> objcopy was updated to write your 64 bit hashes)?
>>
>> The linker is hardcoded to expect 20-byte SHA-1s; anything else and it
>> will recompute them in serial
>> On Tue, Jan 30, 2018 at 12:28 PM Leonardo Santagada <santagada at gmail.com>
>> wrote:
>>>
>>> Nice and why are you trying blake2 instead of a faster hash algorithm?
>>> And do you have any guess as to why xxHash64 wasn't faster than SHA1? I
>>> still have to see how many collisions I get with it, but it seems so
>>> improbable that collisions on 64 bit hashes would be the problem.
>>>
>>> On 30 Jan 2018 18:39, "Zachary Turner" <zturner at google.com> wrote:
>>>
>>> It turns out there were some problems with the measurements in that blog
>>> post. I built LLD with the RelWithDebInfo configuration which we later
>>> found out uses /Ob1 instead of /Ob2. That was worth some cycles. Then
>>> there were some more optimizations that went in after that.
>>> And to get down to 28s I also used an LTO'ed build of lld.
>>>
>>> If you're building LLD at ToT you should have everything needed to
>>> reproduce those numbers, but it will vary depending on the speed of your CPU
>>> obviously.
>>>
>>> On Tue, Jan 30, 2018 at 9:21 AM Leonardo Santagada <santagada at gmail.com>
>>> wrote:
>>>>
>>>> Today I played around replacing the sha1 with xxHash64 and the results
>>>> so far are bad. Linking times almost doubled and I can't really explain why,
>>>> the only thing that comes to mind is hash collisions but on type names they
>>>> should be very few in 64bit hashes.
>>>>
>>>> Any reason why you are trying blake2 and not murmurhash3 or xxHash64?
>>>>
>>>> About creating a pdb per lib, you can tell msvc to put the pdb of
>>>> every .obj compilation in the same file, but you can't, after 20 files
>>>> are compiled to .obj (with /Z7 or /Zi), then merge all the debug information
>>>> into one .pdb file AFAIK. That would make our links much faster I think as
>>>> people either are changing headers (and then they know they have to wait) or
>>>> changing a single/few .cpp files. It would be great to group our 3k obj
>>>> debug information in groups so that this linking step can be parallelizable.
>>>> Is there any support maybe for merging pdb with pdb util and then feeding
>>>> that to lld-link instead of .obj debug info?
>>>>
>>>> I also re-read the post about ghash and it says blink links in 88s, the
>>>> 28s you talk about is with unreleased optimizations only?
>>>>
>>>> On Tue, Jan 30, 2018 at 5:54 AM, Zachary Turner <zturner at google.com>
>>>> wrote:
>>>>>
>>>>> You can make a PDB per lib (consider msvcrtd.pdb which ships with
>>>>> MSVC), but all these per-lib PDBs would have to be merged into a single
>>>>> master PDB at the end, so you still can't avoid that final merge.
>>>>> In a way,
>>>>> that's similar to the idea behind /DEBUG:FASTLINK (keep the debug info in
>>>>> object files to eliminate the cost of merging types and symbol records) and
>>>>> we know what the problems with /DEBUG:FASTLINK are.
>>>>>
>>>>> The PDB generation code in LLD is still completely single threaded, so
>>>>> that's one area for huge potential gains, but only some parts of the
>>>>> algorithm are parallelizable. We're trying to squeeze every last bit of
>>>>> performance out of the single-threaded case first before we parallelize, but
>>>>> that option is definitely still there for us.
>>>>>
>>>>> On Mon, Jan 29, 2018 at 4:35 PM Leonardo Santagada
>>>>> <santagada at gmail.com> wrote:
>>>>>>
>>>>>> Does packing obj files in .lib help linking in any way? My
>>>>>> understanding is that there would be no difference. It could help if I could
>>>>>> make a pdb per lib, but there is no way to do so... Maybe we could implement
>>>>>> this on lld?
>>>>>>
>>>>>> On 29 Jan 2018 22:14, "Zachary Turner" <zturner at google.com> wrote:
>>>>>>>
>>>>>>> Yes we've discussed many different ideas for incremental linking, but
>>>>>>> our conclusion is that you can only get one of Fast|Simple. If you want it
>>>>>>> to be fast it has to be complicated and if you want it to be simple then
>>>>>>> it's going to be slow.
>>>>>>>
>>>>>>> Consider the case where you edit one .cpp file and change this:
>>>>>>>
>>>>>>>     int x = 0, y = 7;
>>>>>>>
>>>>>>> to this:
>>>>>>>
>>>>>>>     int x = 0;
>>>>>>>     short y = 7;
>>>>>>>
>>>>>>> Because different instructions operate on shorts vs ints, some of the
>>>>>>> instruction encodings will be different and potentially of a different size.
>>>>>>>
>>>>>>> Because of this, the contribution to the .text section from this
>>>>>>> object file is going to be a different size.
>>>>>>>
>>>>>>> Because of that, all subsequent object files will start at a
>>>>>>> different absolute file address in the final executable.
>>>>>>>
>>>>>>> Because of that, every single symbol in every single object file will
>>>>>>> need to be updated in the final PDB.
>>>>>>>
>>>>>>> There are many other things that need to happen as well, but the
>>>>>>> point is that a trivial change to a cpp file can explode into many changes in
>>>>>>> the final PDB.
>>>>>>>
>>>>>>> There are ways to handle this, but they're not simple. We have some
>>>>>>> ideas, but for the moment we are focused on making full linking as fast as
>>>>>>> possible because it's much easier and still provides benefits. We think we
>>>>>>> can get it fast enough that it will be acceptable, and that should give us
>>>>>>> some extra time to do incremental linking properly.
>>>>>>>
>>>>>>> On Mon, Jan 29, 2018 at 1:07 PM Leonardo Santagada
>>>>>>> <santagada at gmail.com> wrote:
>>>>>>>>
>>>>>>>> About incremental linking, the only thing from my benchmark that
>>>>>>>> needs to be incremental is the pdb patching, as generating the binary seems
>>>>>>>> faster than incremental linking on link.exe. So did anyone propose renaming
>>>>>>>> the current binary, writing a new one and then diffing the coff obj and
>>>>>>>> using that info to just rewrite that part of the pdb? Or another idea is
>>>>>>>> making the build system feed into the linker which files changed so only their
>>>>>>>> types/debug information can be compared instead of all of them?
>>>>>>>>
>>>>>>>> On Mon, Jan 29, 2018 at 7:55 PM, Zachary Turner <zturner at google.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Not a lot.
>>>>>>>>>
>>>>>>>>> /TIME will show high level timing of the various phases (this is
>>>>>>>>> the same option MSVC uses).
>>>>>>>>>
>>>>>>>>> If you want anything more detailed than that, vTune or ETW+WPA
>>>>>>>>> (github.com/google/UIforETW/releases) are probably what you'll need
>>>>>>>>> to use.
>>>>>>>>>
>>>>>>>>> (We'd definitely love patches to improve performance, or even just
>>>>>>>>> ideas about how to make things faster.
>>>>>>>>> Improving link speed is one of our
>>>>>>>>> biggest priorities.)
>>>>>>>>>
>>>>>>>>> On Mon, Jan 29, 2018 at 10:47 AM Leonardo Santagada
>>>>>>>>> <santagada at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Yeah true, are there any switches to profile the linker?
>>>>>>>>>>
>>>>>>>>>> On 29 Jan 2018 18:43, "Zachary Turner" <zturner at google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Part of the reason why lld is so fast is because we map every
>>>>>>>>>>> input file into memory up front and rely on the virtual memory manager in
>>>>>>>>>>> the kernel to make this fast. Generally speaking, this is a lot faster than
>>>>>>>>>>> opening a file, reading and processing it, and closing the file. The
>>>>>>>>>>> downside, as you note, is that it uses a lot of memory.
>>>>>>>>>>>
>>>>>>>>>>> But there's a catch. The kernel is smart enough to share the
>>>>>>>>>>> physical memory pages when you map the same file multiple times from
>>>>>>>>>>> multiple processes. So it only looks like the memory usage is high because
>>>>>>>>>>> it reserves a large amount of address space in each process. But the total
>>>>>>>>>>> amount of physical memory used will not increase when additional instances
>>>>>>>>>>> of the same file are mapped.
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jan 29, 2018 at 9:24 AM Leonardo Santagada
>>>>>>>>>>> <santagada at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I cleaned up my tests and figured that the obj file generated
>>>>>>>>>>>> with problems was only with msvc 2015, so trying again with msvc 2017 I get:
>>>>>>>>>>>>
>>>>>>>>>>>> lld-link: 4s
>>>>>>>>>>>> lld-link /debug: 1m30s and ~20gb of ram
>>>>>>>>>>>> lld-link /debug:ghash: 59s and ~20gb of ram
>>>>>>>>>>>> link: 13s
>>>>>>>>>>>> link /debug:fastlink: 43s and 1gb of ram
>>>>>>>>>>>> link specialpdb: 1m10s and 4gb of ram
>>>>>>>>>>>> link /debug: 9m16s min and >14gb of ram
>>>>>>>>>>>>
>>>>>>>>>>>> link incremental: 8s when it works.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> *specialpdb is created by passing a set of compilation
>>>>>>>>>>>> units (e.g. a folder) the same pdb to be written to, so it dedups the symbols
>>>>>>>>>>>> before the final linking, but that does decrease the concurrency as this
>>>>>>>>>>>> step can't be done after linking.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> My question is, in the set of patches you guys haven't
>>>>>>>>>>>> upstreamed is there anything that makes linking use less memory? Or
>>>>>>>>>>>> just asking more directly, when will those patches make it upstream, or can
>>>>>>>>>>>> I try them? The memory usage of lld-link is a little worrying as we have
>>>>>>>>>>>> around 6-8 binaries that we link for windows and they mostly use the same
>>>>>>>>>>>> libraries so 20gb of ram each means we probably can't link them all together
>>>>>>>>>>>> anymore.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Tomorrow I will send my tool and changes to lld so more people
>>>>>>>>>>>> can try this out and tell if it helps with their msvc only code.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Jan 28, 2018 at 11:22 PM, Zachary Turner
>>>>>>>>>>>> <zturner at google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> I don’t have pgo numbers. When I build using -flto=thin the
>>>>>>>>>>>>> link time is significantly faster than msvc /ltcg and runtime is slightly
>>>>>>>>>>>>> faster, but I haven’t tested on a large variety of different workloads, so
>>>>>>>>>>>>> YMMV. Link time will definitely be faster though
>>>>>>>>>>>>> On Sun, Jan 28, 2018 at 2:20 PM Leonardo Santagada
>>>>>>>>>>>>> <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This part is only for objects with /Z7 debug information in
>>>>>>>>>>>>>> them right? I think most of the third parties are either: .lib/obj without
>>>>>>>>>>>>>> debug information, the same with information on pdb files.
>>>>>>>>>>>>>> Rewriting all
>>>>>>>>>>>>>> .lib/.obj with /Z7 information seems doable with a small python script, the
>>>>>>>>>>>>>> pdb one is going to be more work, but I always wanted to know how a pdb file
>>>>>>>>>>>>>> is structured so "fun" times ahead. But yeah printing it out, and timing it
>>>>>>>>>>>>>> might be very useful indeed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Did anyone try to compile/link lld-link.exe with LTO+PGO to
>>>>>>>>>>>>>> see how much faster it can get? I might try that as well, as a 10% speed
>>>>>>>>>>>>>> improvement might be handy.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Jan 28, 2018 at 11:14 PM, Zachary Turner
>>>>>>>>>>>>>> <zturner at google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Look for this code in lld/COFF/PDB.cpp
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   if (Config->DebugGHashes) {
>>>>>>>>>>>>>>>     ArrayRef<GloballyHashedType> Hashes;
>>>>>>>>>>>>>>>     std::vector<GloballyHashedType> OwnedHashes;
>>>>>>>>>>>>>>>     if (Optional<ArrayRef<uint8_t>> DebugH = getDebugH(File))
>>>>>>>>>>>>>>>       Hashes = getHashesFromDebugH(*DebugH);
>>>>>>>>>>>>>>>     else {
>>>>>>>>>>>>>>>       OwnedHashes = GloballyHashedType::hashTypes(Types);
>>>>>>>>>>>>>>>       Hashes = OwnedHashes;
>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In the else block there, add a log message that says
>>>>>>>>>>>>>>> "synthesizing .debug$h section for " + Obj->Name
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> See how many of these you get. When I build chrome + all
>>>>>>>>>>>>>>> third party libraries this way I get about 100, which is small enough to
>>>>>>>>>>>>>>> still see large performance gains.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If you have many 3rd party libraries, it may be necessary to
>>>>>>>>>>>>>>> rewrite the .lib files too, not just the .obj files.
>>>>>>>>>>>>>>> Eventually I’ll get
>>>>>>>>>>>>>>> around to implementing all of this as well, as well as better heuristics in
>>>>>>>>>>>>>>> lld-link to disable ghash if it’s going to be slow
>>>>>>>>>>>>>>> On Sun, Jan 28, 2018 at 1:51 PM Leonardo Santagada
>>>>>>>>>>>>>>> <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Ok I went for kind of a middle ground solution: I patch the
>>>>>>>>>>>>>>>> obj files, but as adding a new section didn't seem to work, I add a "shadow"
>>>>>>>>>>>>>>>> section, by editing the pointer to line number and the virtual size on the
>>>>>>>>>>>>>>>> .debug$T section. Although technically broken, both link.exe and
>>>>>>>>>>>>>>>> lld-link.exe don't seem to mind the alterations and as the shadow .debug$H
>>>>>>>>>>>>>>>> is not really a section anymore (it's just some bytes at the end of the file)
>>>>>>>>>>>>>>>> it doesn't change anything else that does matter. With that I could do my
>>>>>>>>>>>>>>>> first test with a subset of our code base, and the results are not good. I
>>>>>>>>>>>>>>>> found one of our sources that breaks the ghash computation, I will get more
>>>>>>>>>>>>>>>> info on this and post a proper bug report, but I guess it's type information
>>>>>>>>>>>>>>>> that is generated only by msvc. The other more alarming problem is that
>>>>>>>>>>>>>>>> linking is way slower with the ghashes... my guess is that we have a bunch of
>>>>>>>>>>>>>>>> pdb files for some third party libraries and calculating those ghashes takes
>>>>>>>>>>>>>>>> more time than actual linking of this small part of the source (it links in
>>>>>>>>>>>>>>>> 4s in both link.exe and lld-link.exe without ghashes).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Jan 26, 2018 at 8:52 PM, Leonardo Santagada
>>>>>>>>>>>>>>>> <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> We don't generate any .lib as those don't work well with
>>>>>>>>>>>>>>>>> incremental linking (and give zero advantages when linking AFAIK), and it
>>>>>>>>>>>>>>>>> would be pretty easy to have a modern format for having a .ghash for
>>>>>>>>>>>>>>>>> multiple files, something simple like size prefixed name and then size
>>>>>>>>>>>>>>>>> prefixed ghash blobs.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, Jan 26, 2018 at 8:44 PM, Zachary Turner
>>>>>>>>>>>>>>>>> <zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> We considered that early on, but most object files
>>>>>>>>>>>>>>>>>> actually end up in .lib files so unless there were a way to connect the
>>>>>>>>>>>>>>>>>> objects in the .lib to the corresponding .ghash files, this would disable
>>>>>>>>>>>>>>>>>> ghash usage for a large amount of inputs. Supporting both is an option, but
>>>>>>>>>>>>>>>>>> it adds a bit of complexity and I’m not totally convinced it’s worth it
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Fri, Jan 26, 2018 at 11:38 AM Leonardo Santagada
>>>>>>>>>>>>>>>>>> <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> it does.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I just had an epiphany: why not just write a .ghash file
>>>>>>>>>>>>>>>>>>> and have lld read those if they exist for an .obj file?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Seems much simpler than trying to wire up a 20 year old
>>>>>>>>>>>>>>>>>>> file format. I will try to do this, is something like this acceptable for
>>>>>>>>>>>>>>>>>>> LLD? The cool thing is that I can generate .ghash for .lib or any obj lying
>>>>>>>>>>>>>>>>>>> around (maybe even for pdb in the future).
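The "size prefixed name and then size prefixed ghash blobs" layout suggested above could look roughly like the sketch below. Nothing here is an actual LLD format: the 32-bit little-endian length prefix, the `Bytes` alias, and the function names `appendRecord` and `parseRecords` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Append a 32-bit little-endian length followed by the payload bytes.
// (The prefix width and byte order are assumptions of this sketch.)
static void appendPrefixed(Bytes &Out, const uint8_t *Data, size_t Size) {
  assert(Size <= UINT32_MAX && "payload too large for a 32-bit prefix");
  for (int Shift = 0; Shift < 32; Shift += 8)
    Out.push_back(static_cast<uint8_t>((Size >> Shift) & 0xff));
  Out.insert(Out.end(), Data, Data + Size);
}

// One record: the object's name, then that object's concatenated ghash bytes.
void appendRecord(Bytes &Out, const std::string &Name, const Bytes &Hashes) {
  appendPrefixed(Out, reinterpret_cast<const uint8_t *>(Name.data()),
                 Name.size());
  appendPrefixed(Out, Hashes.data(), Hashes.size());
}

// Read one length-prefixed field starting at Pos; false if the input is
// truncated. On success Pos is advanced past the field.
static bool readPrefixed(const Bytes &In, size_t &Pos, Bytes &Field) {
  if (Pos > In.size() || In.size() - Pos < 4)
    return false;
  uint32_t Size = 0;
  for (int I = 0; I < 4; ++I)
    Size |= static_cast<uint32_t>(In[Pos + I]) << (8 * I);
  Pos += 4;
  if (In.size() - Pos < Size)
    return false;
  Field.assign(In.begin() + Pos, In.begin() + Pos + Size);
  Pos += Size;
  return true;
}

// Parse every (name, hashes) record; false if any record is malformed.
bool parseRecords(const Bytes &In,
                  std::vector<std::pair<std::string, Bytes>> &Records) {
  size_t Pos = 0;
  while (Pos < In.size()) {
    Bytes Name, Hashes;
    if (!readPrefixed(In, Pos, Name) || !readPrefixed(In, Pos, Hashes))
      return false;
    Records.emplace_back(std::string(Name.begin(), Name.end()),
                         std::move(Hashes));
  }
  return true;
}
```

A writer would call `appendRecord` once per object and dump the buffer to disk; a reader would map the file and call `parseRecords`, then look hashes up by object name. Keying by name is what would let one such file cover the members of a .lib, which is the concern raised in the reply quoted above.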
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Fri, Jan 26, 2018 at 8:32 PM, Zachary Turner
>>>>>>>>>>>>>>>>>>> <zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> In general, we should be able to accept any MSVC .obj
>>>>>>>>>>>>>>>>>>>> file to LLD. At the very least, we're not aware of any cases that don't
>>>>>>>>>>>>>>>>>>>> work.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Does your MSVC .obj file link fine before you add the
>>>>>>>>>>>>>>>>>>>> .debug$H?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Fri, Jan 26, 2018 at 11:23 AM Leonardo Santagada
>>>>>>>>>>>>>>>>>>>> <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Okay, apparently coff2yaml and yaml2coff are not in a
>>>>>>>>>>>>>>>>>>>>> great place as they both don't deal well with the fact that you can have
>>>>>>>>>>>>>>>>>>>>> overlapping sections, which seems to be what clang-cl produces (the .data
>>>>>>>>>>>>>>>>>>>>> section points to the same place as a later section). Which is not a big
>>>>>>>>>>>>>>>>>>>>> problem for me particularly because msvc doesn't even generate .data
>>>>>>>>>>>>>>>>>>>>> sections in .obj.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I'm trying to put support for .bss sections in both
>>>>>>>>>>>>>>>>>>>>> coff2yaml and yaml2coff... but I can still link my transformed
>>>>>>>>>>>>>>>>>>>>> clang-cl generated files just fine... what does give me problems is
>>>>>>>>>>>>>>>>>>>>> msvc .obj files. Have you tried to link one of these?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Fri, Jan 26, 2018 at 8:05 PM, Leonardo Santagada
>>>>>>>>>>>>>>>>>>>>> <santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> yeah, apparently .bss has a flag of uninitialized data
>>>>>>>>>>>>>>>>>>>>>> that is not being respected on the layout of the coff files (it should skip
>>>>>>>>>>>>>>>>>>>>>> those sections) but I dunno what to do with .data as it doesn't have a size.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> (resending as apparently my pastes generated a ton of
>>>>>>>>>>>>>>>>>>>>>> hidden html data and this message hit the mailinglist limit of 100k)
>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Leonardo Santagada
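The incremental-linking argument quoted earlier in this message (one object's .text contribution changes size, so every later object moves) can be illustrated with a small sketch. It assumes contributions are laid out back to back in input order with no alignment padding, which is a simplification; `layoutText` and `countMoved` are hypothetical names, not LLD functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Start offset of each object's .text contribution, assuming contributions
// are simply concatenated in input order (alignment is ignored here).
std::vector<uint64_t> layoutText(const std::vector<uint64_t> &Sizes) {
  std::vector<uint64_t> Starts;
  Starts.reserve(Sizes.size());
  uint64_t Offset = 0;
  for (uint64_t Size : Sizes) {
    Starts.push_back(Offset);
    Offset += Size;
  }
  return Starts;
}

// Number of objects whose start offset differs between two layouts. Every
// symbol in such an object would need a new address in the PDB.
size_t countMoved(const std::vector<uint64_t> &Before,
                  const std::vector<uint64_t> &After) {
  assert(Before.size() == After.size() && "same set of objects expected");
  size_t Moved = 0;
  for (size_t I = 0; I < Before.size(); ++I)
    if (Before[I] != After[I])
      ++Moved;
  return Moved;
}
```

With made-up sizes {100, 40, 64, 16, 200}, shrinking the second object by two bytes leaves the first two start offsets alone and moves the remaining three. In a link with thousands of objects, an edit near the front of the input order moves nearly all of them, which is why patching the PDB in place is not simple.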
Zachary Turner via llvm-dev
2018-Jan-31 17:18 UTC
[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)
The quickest route would be for you to upload the patches for review and
then go through a couple of iterations until it's cleaned up. Do you want
to go that route? If not you can upload them anyway and whenever I get
some spare cycles I can take it over and get it committed. But that might
be slower since I have other things on my plate at the moment so I'm not
sure when I'll be able to get to it. Maybe in a month or two?

If you are willing to do the work to get it cleaned up and ok for
committing, then the first steps would be:

1) Fix variable and function names to conform to LLVM conventions, and
2) Remove the dependency in llvm-objcopy on ObjectYAML.

BTW, I've also tried 8 byte ghashes locally, and in theory that should be
faster, but I found a case where it's slower than 20 byte ghashes. This
almost seems impossible, so I need to spend some time examining profiles
to see what's going on.

On Wed, Jan 31, 2018 at 4:44 AM Leonardo Santagada <santagada at gmail.com> wrote:
> So I found all 20 bytes and changed them to GHASH_SIZE (a const I
> defined in typehashing.h) and finished the switch to xxHash64, that
> saved me around 50 seconds to 56s, then I changed it to uint64_t
> instead of an 8 byte uint8_t array and that gave me 48s. With release
> config and a pgo pass I'm now linking in 38s... so faster than
> link.exe in vs 2017 (which is faster than vs 2015) doing fastlink.
>
> Now we are in a very good place, still not as fast as an incremental
> link, but generating better code and a full pdb.
>
> Do you want any of these patches? They don't look the nicest they can
> be but might help someone doing this cleanly.
>
> On Wed, Jan 31, 2018 at 9:39 AM, Leonardo Santagada <santagada at gmail.com>
> wrote:
> > Uhmm I changed only type hashing... Ok back to trying it again. Let me
> > find where it is looking at 20 bytes instead of using the size of global
> > type hash.
> > > > On 30 Jan 2018 21:33, "Zachary Turner" <zturner at google.com> wrote: > >> > >> Did you change both the compiler and linker (or make sure that your > >> objcopy was updated to write your 64 bit hashes)? > >> > >> The linker is hardcodes to expect 20-byte sha 1s , anything else and it > >> will recompute them in serial > >> On Tue, Jan 30, 2018 at 12:28 PM Leonardo Santagada < > santagada at gmail.com> > >> wrote: > >>> > >>> Nice and why are you trying blake2 instead of a faster hash algorithm? > >>> And do you have any guess as to why xxHash64 wasn't faster than SHA1? I > >>> still have to see how many collision I get with it, but it seems so > >>> improbable that collisions on 64 bit hashes would be the problem. > >>> > >>> On 30 Jan 2018 18:39, "Zachary Turner" <zturner at google.com> wrote: > >>> > >>> It turns out there were some problems with the measurements in that > blog > >>> post. I built LLD with the RelWithDebInfo configuration which we later > >>> found out uses /Ob1 instead of /Ob2. That was worth some cycles. Then > >>> there were some more optimizations that went in after that. And to > get down > >>> to 28s I also used an LTO'ed build of lld. > >>> > >>> If you're building LLD at ToT you should have everything needed to > >>> reproduce those numbers, but it will vary depending on the speed of > your CPU > >>> obviously. > >>> > >>> On Tue, Jan 30, 2018 at 9:21 AM Leonardo Santagada < > santagada at gmail.com> > >>> wrote: > >>>> > >>>> Today I played around replacing the sha1 with xxHash64 and the results > >>>> so far are bad. Linking times almost doubled and I can't really > explain why, > >>>> the only thing that comes to mind is hash collisions but on type > names they > >>>> should be very few in 64bit hashes. > >>>> > >>>> Any reason why you are trying blake2 and not murmurhash3 or xxHash64? 
> >>>> > >>>> About creating a pdb per lib, you can say to msvc to put the pdb of > >>>> every .obj compilation to the same file, but you can't after 20 files > >>>> compiled to .obj (with /Z7 or /Zi) to them merge all the debug > information > >>>> in one .pdb file AFAIK. That would make our links much faster I think > as > >>>> people either are changing headers (and then they know they have to > wait) or > >>>> changing a single/few .cpp files. It would be great to group our 3k > obj > >>>> debug information in groups so that this linking steps can be > paralelizable. > >>>> Is there any support maybe for merging pdb with pdb util and then > feeding > >>>> that to lld-link instead of .obj debug info? > >>>> > >>>> I also re-read the post about ghash and it says blink links in 88s, > the > >>>> 28s you talk about is with unrelased optimizations only? > >>>> > >>>> On Tue, Jan 30, 2018 at 5:54 AM, Zachary Turner <zturner at google.com> > >>>> wrote: > >>>>> > >>>>> You can make a PDB per lib (consider msvcrtd.pdb which ships with > >>>>> MSVC), but all these per-lib PDBs would have to be merged into a > single > >>>>> master PDB at the end, so you still can't avoid that final . In a > way, > >>>>> that's similar to the idea behind /DEBUG:FASTLINK (keep the debug > info in > >>>>> object files to eliminate the cost of merging types and symbol > records) and > >>>>> we know what the problems with /DEBUG:FASTLINK are. > >>>>> > >>>>> The PDB generation code in LLD is still completely single threaded, > so > >>>>> that's one area for huge potential gains, but only some parts of the > >>>>> algorithm are parallelizable. We're trying to squeeze every last > bit of > >>>>> performance out of the single-threaded case first before we > parallelize, but > >>>>> that option is definitely still there for us. 
> >>>>> > >>>>> On Mon, Jan 29, 2018 at 4:35 PM Leonardo Santagada > >>>>> <santagada at gmail.com> wrote: > >>>>>> > >>>>>> Does packing obj files in .lib helps linking in any way? My > >>>>>> understanding is that there would be no difference. It could help > if I could > >>>>>> make a pdb per lib, but there is no way to do so... Maybe we could > implement > >>>>>> this on lld? > >>>>>> > >>>>>> On 29 Jan 2018 22:14, "Zachary Turner" <zturner at google.com> wrote: > >>>>>>> > >>>>>>> Yes we've discussed many different ideas for incremental linking, > but > >>>>>>> our conclusion is that you can only get one of Fast|Simple. If > you want it > >>>>>>> to be fast it has to be complicated and if you want it to be > simple then > >>>>>>> it's going to be slow. > >>>>>>> > >>>>>>> Consider the case where you edit one .cpp file and change this: > >>>>>>> > >>>>>>> int x = 0, y = 7; > >>>>>>> > >>>>>>> to this: > >>>>>>> > >>>>>>> int x = 0; > >>>>>>> short y = 7; > >>>>>>> > >>>>>>> Because different instructions operate on shorts vs ints, some of > the > >>>>>>> instruction encodings will be different and potentially of a > different size. > >>>>>>> > >>>>>>> Because of this, the contribution to the .text section from this > >>>>>>> object file is going to be a different size. > >>>>>>> > >>>>>>> Because of that, all subsequent object files will start at a > >>>>>>> different absolute file address in the final executable. > >>>>>>> > >>>>>>> Because of that, every single symbol in every single object file > will > >>>>>>> need to be updated in the final PDB. > >>>>>>> > >>>>>>> There are many other things that need to happen as well, but the > >>>>>>> point is that trivial change to a cpp file can explode into many > changes in > >>>>>>> the final PDB. > >>>>>>> > >>>>>>> There are ways to handle this, but they're not simple. 
We have > some > >>>>>>> ideas, but for the moment we are focused on making full linking as > fast as > >>>>>>> possible because it's much easier and still provides benefits. We > think we > >>>>>>> can get it fast enough that it will be acceptable, and that should > give us > >>>>>>> some extra time to do incremental linking properly. > >>>>>>> > >>>>>>> On Mon, Jan 29, 2018 at 1:07 PM Leonardo Santagada > >>>>>>> <santagada at gmail.com> wrote: > >>>>>>>> > >>>>>>>> About incremental linking, the only thing from my benchmark that > >>>>>>>> needs to be incremental is the pdb patching as generating the > binary seems > >>>>>>>> faster than incremental linking on link.exe, so did anyone > propose renaming > >>>>>>>> the current binary, writing a new one and then diffing the coff > obj and > >>>>>>>> using that info to just rewriting that part of the pdb. Or > another idea is > >>>>>>>> making the build system feed into the linker which files changed > so the > >>>>>>>> types/debug information can be compared instead of all of them? > >>>>>>>> > >>>>>>>> On Mon, Jan 29, 2018 at 7:55 PM, Zachary Turner < > zturner at google.com> > >>>>>>>> wrote: > >>>>>>>>> > >>>>>>>>> Not a lot. > >>>>>>>>> > >>>>>>>>> /TIME will show high level timing of the various phases (this is > >>>>>>>>> the same option MSVC uses). > >>>>>>>>> > >>>>>>>>> If you want anything more detailed than that, vTune or ETW+WPA > >>>>>>>>> (github.com/google/UIforETW/releases) are probably what > you'll need > >>>>>>>>> to do. > >>>>>>>>> > >>>>>>>>> (We'd definitely love patches to improve performance, or even > just > >>>>>>>>> ideas about how to make things faster. Improving link speed is > one of our > >>>>>>>>> biggest priorities.) > >>>>>>>>> > >>>>>>>>> On Mon, Jan 29, 2018 at 10:47 AM Leonardo Santagada > >>>>>>>>> <santagada at gmail.com> wrote: > >>>>>>>>>> > >>>>>>>>>> Yeah true, is there any switches to profile the linker? 
> >>>>>>>>>> > >>>>>>>>>> On 29 Jan 2018 18:43, "Zachary Turner" <zturner at google.com> > wrote: > >>>>>>>>>>> > >>>>>>>>>>> Part of the reason why lld is so fast is because we map every > >>>>>>>>>>> input file into memory up front and rely on the virtual memory > manager in > >>>>>>>>>>> the kernel to make this fast. Generally speaking, this is a > lot faster than > >>>>>>>>>>> opening a file, reading it and processing a file, and closing > the file. The > >>>>>>>>>>> downside, as you note, is that it uses a lot of memory. > >>>>>>>>>>> > >>>>>>>>>>> But there's a catch. The kernel is smart enough to share the > >>>>>>>>>>> physical memory pages when you map the same file multiple > times from > >>>>>>>>>>> multiple processes. So it only looks like the memory usage is > high because > >>>>>>>>>>> it reserves a large amount of address space in each process. > But the total > >>>>>>>>>>> amount of physical memory used will not increase when > additional instances > >>>>>>>>>>> of the same file are mapped. > >>>>>>>>>>> > >>>>>>>>>>> On Mon, Jan 29, 2018 at 9:24 AM Leonardo Santagada > >>>>>>>>>>> <santagada at gmail.com> wrote: > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> I cleaned up my tests and figured that the obj file generated > >>>>>>>>>>>> with problems was only with msvc 2015, so trying again with > msvc 2017 I get: > >>>>>>>>>>>> > >>>>>>>>>>>> lld-link: 4s > >>>>>>>>>>>> lld-link /debug: 1m30s and ~20gb of ram > >>>>>>>>>>>> lld-link /debug:ghash: 59s and ~20gb of ram > >>>>>>>>>>>> link: 13s > >>>>>>>>>>>> link /debug:fastlink: 43s and 1gb of ram > >>>>>>>>>>>> link specialpdb: 1m10s and 4gb of ram > >>>>>>>>>>>> link /debug: 9m16s min and >14gb of ram > >>>>>>>>>>>> > >>>>>>>>>>>> link incremental: 8s when it works. 
> >>>>>>>>>>>>
> >>>>>>>>>>>> *specialpdb is created by passing the same pdb file to a set of compilation units (e.g. a folder), so it dedups the symbols before the final linking, but that does decrease the concurrency as this step can't be done after linking.
> >>>>>>>>>>>>
> >>>>>>>>>>>> My question is: in the set of patches you guys haven't upstreamed, is there anything that makes compilation use less memory? Or, asking more directly, when will those patches make it upstream, or can I try them? The memory usage of lld-link is a little worrying, as we have around 6-8 binaries that we link for Windows and they mostly use the same libraries, so 20gb of ram each means we probably can't link them all together anymore.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Tomorrow I will send my tool and changes to lld so more people can try this out and tell if it helps with their msvc-only code.
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Sun, Jan 28, 2018 at 11:22 PM, Zachary Turner <zturner at google.com> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I don't have pgo numbers. When I build using -flto=thin the link time is significantly faster than msvc /ltcg and runtime is slightly faster, but I haven't tested on a large variety of different workloads, so YMMV. Link time will definitely be faster though.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Sun, Jan 28, 2018 at 2:20 PM Leonardo Santagada <santagada at gmail.com> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> This part is only for objects with /Z7 debug information in them, right? I think most of the third parties are either .lib/.obj without debug information, or the same with the information in pdb files. Rewriting all .lib/.obj with /Z7 information seems doable with a small python script; the pdb one is going to be more work, but I always wanted to know how a pdb file is structured, so "fun" times ahead. But yeah, printing it out and timing it might be very useful indeed.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Has anyone tried to compile/link lld-link.exe with LTO+PGO to see how much faster it can get? I might try that as well, as a 10% speed improvement might be handy.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Sun, Jan 28, 2018 at 11:14 PM, Zachary Turner <zturner at google.com> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Look for this code in lld/coff/pdb.cpp
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>   if (Config->DebugGHashes) {
> >>>>>>>>>>>>>>>     ArrayRef<GloballyHashedType> Hashes;
> >>>>>>>>>>>>>>>     std::vector<GloballyHashedType> OwnedHashes;
> >>>>>>>>>>>>>>>     if (Optional<ArrayRef<uint8_t>> DebugH = getDebugH(File))
> >>>>>>>>>>>>>>>       Hashes = getHashesFromDebugH(*DebugH);
> >>>>>>>>>>>>>>>     else {
> >>>>>>>>>>>>>>>       OwnedHashes = GloballyHashedType::hashTypes(Types);
> >>>>>>>>>>>>>>>       Hashes = OwnedHashes;
> >>>>>>>>>>>>>>>     }
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> In the else block there, add a log message that says "synthesizing .debug$h section for " + Obj->Name
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> See how many of these you get. When I build chrome + all third party libraries this way I get about 100, which is small enough to still see large performance gains.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> If you have many 3rd party libraries, it may be necessary to rewrite the .lib files too, not just the .obj files. Eventually I'll get around to implementing all of this, as well as better heuristics in lld-link to disable ghash if it's going to be slow.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Sun, Jan 28, 2018 at 1:51 PM Leonardo Santagada <santagada at gmail.com> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Ok, I went for a kind of middle-ground solution: I patch the obj files, but as adding a new section didn't seem to work, I add a "shadow" section by editing the pointer to line numbers and the virtual size on the .debug$T section. Although technically broken, both link.exe and lld-link.exe don't seem to mind the alterations, and as the shadow .debug$H is not really a section anymore (it's just some bytes at the end of the file) it doesn't change anything else that matters.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> With that I could do my first test with a subset of our code base, and the results are not good. I found one of our sources that breaks the ghash computation; I will get more info on this and post a proper bug report, but I guess it's type information that is generated only by msvc. The other, more alarming problem is that linking is way slower with the ghashes... my guess is that we have a bunch of pdb files for some third party libraries and calculating those ghashes takes more time than the actual linking of this small part of the source (it links in 4s in both link.exe and lld-link.exe without ghashes).
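To get a rough count of the objects that would trigger the "synthesizing" path described above without patching lld, one option is to scan each .obj for a `.debug$H` section header. The following is a minimal sketch, assuming regular COFF objects (not /bigobj, which uses a different header) and assuming the hashes live in a real section; a "shadow" blob appended to the file as described above would not be found this way. The function names are illustrative and not part of any LLVM tool.

```python
import struct

# IMAGE_FILE_HEADER: Machine, NumberOfSections, TimeDateStamp,
# PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader, Characteristics
COFF_HEADER = struct.Struct("<HHIIIHH")  # 20 bytes
SECTION_HEADER_SIZE = 40                 # size of one IMAGE_SECTION_HEADER
SHORT_NAME_SIZE = 8                      # section names are 8 bytes, NUL padded


def section_names(obj_bytes):
    """Return the short section names of a regular (non-/bigobj) COFF object."""
    if len(obj_bytes) < COFF_HEADER.size:
        raise ValueError("too small to be a COFF object")
    fields = COFF_HEADER.unpack_from(obj_bytes, 0)
    num_sections, opt_header_size = fields[1], fields[5]

    names = []
    offset = COFF_HEADER.size + opt_header_size
    for _ in range(num_sections):
        raw = obj_bytes[offset:offset + SHORT_NAME_SIZE]
        if len(raw) < SHORT_NAME_SIZE:
            raise ValueError("truncated section table")
        names.append(raw.rstrip(b"\0").decode("ascii", "replace"))
        offset += SECTION_HEADER_SIZE
    return names


def has_debug_h(obj_bytes):
    """True if the object carries a .debug$H section (the name fits in 8 bytes)."""
    return ".debug$H" in section_names(obj_bytes)
```

Running `has_debug_h(open(path, "rb").read())` over a build directory and counting the `False` results gives an estimate of how many inputs the linker would have to hash itself.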
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Fri, Jan 26, 2018 at 8:52 PM, Leonardo Santagada <santagada at gmail.com> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> We don't generate any .lib as those don't work well with incremental linking (and give zero advantages when linking AFAIK), and it would be pretty easy to have a modern format holding the .ghash for multiple files, something simple like a size-prefixed name followed by size-prefixed ghash blobs.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Fri, Jan 26, 2018 at 8:44 PM, Zachary Turner <zturner at google.com> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> We considered that early on, but most object files actually end up in .lib files, so unless there were a way to connect the objects in the .lib to the corresponding .ghash files, this would disable ghash usage for a large number of inputs. Supporting both is an option, but it adds a bit of complexity and I'm not totally convinced it's worth it.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Fri, Jan 26, 2018 at 11:38 AM Leonardo Santagada <santagada at gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> it does.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> I just had an epiphany: why not just write a .ghash file and have lld read those if they exist for an .obj file?
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Seems much simpler than trying to wire up a 20 year old file format. I will try to do this; is something like this acceptable for LLD? The cool thing is that I can generate .ghash for .lib or any obj lying around (maybe even for pdb in the future).
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On Fri, Jan 26, 2018 at 8:32 PM, Zachary Turner <zturner at google.com> wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> In general, we should be able to accept any MSVC .obj file to LLD. At the very least, we're not aware of any cases that don't work.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Does your MSVC .obj file link fine before you add the .debug$H?
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On Fri, Jan 26, 2018 at 11:23 AM Leonardo Santagada <santagada at gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Okay, apparently coff2yaml and yaml2coff are not in a great place, as neither deals well with the fact that you can have overlapping sections, which seems to be what clang-cl produces (the .data section points to the same place as a later section). That is not a big problem for me in particular, because msvc doesn't even generate .data sections in .obj.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> I'm trying to add support for .bss sections to both coff2yaml and yaml2coff... but I can still link clang-cl generated files just fine with my transformations... what does give me problems is msvc .obj files. Have you tried to link one of these?
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> On Fri, Jan 26, 2018 at 8:05 PM, Leonardo Santagada <santagada at gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> yeah, apparently .bss has a flag for uninitialized data that is not being respected in the layout of the coff files (it should skip those sections), but I dunno what to do with .data as it doesn't have a size.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> (resending as apparently my pastes generated a ton of hidden html data and this message hit the mailing list limit of 100k)
> >>>>>>>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Leonardo Santagada
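The .ghash sidecar idea quoted above ("size prefixed name and then size prefixed ghash blobs") can be sketched as follows. This is only an illustration of that one sentence, not a format that lld reads: the 32-bit little-endian length prefix, the UTF-8 names, and all function names are assumptions made for the example.

```python
import io
import struct

# Assumption for this sketch: every length prefix is a 32-bit little-endian integer.
_LEN = struct.Struct("<I")


def write_ghash_entries(stream, entries):
    """Write (object_name, ghash_blob) pairs as length-prefixed records."""
    for name, blob in entries:
        encoded = name.encode("utf-8")
        stream.write(_LEN.pack(len(encoded)))
        stream.write(encoded)
        stream.write(_LEN.pack(len(blob)))
        stream.write(blob)


def _read_exact(stream, size):
    """Read exactly `size` bytes or fail loudly on a truncated file."""
    data = stream.read(size)
    if len(data) != size:
        raise ValueError("truncated .ghash data")
    return data


def read_ghash_entries(stream):
    """Read records back until a clean end of stream."""
    entries = []
    while True:
        prefix = stream.read(_LEN.size)
        if not prefix:
            return entries
        if len(prefix) != _LEN.size:
            raise ValueError("truncated .ghash data")
        (name_len,) = _LEN.unpack(prefix)
        name = _read_exact(stream, name_len).decode("utf-8")
        (blob_len,) = _LEN.unpack(_read_exact(stream, _LEN.size))
        entries.append((name, _read_exact(stream, blob_len)))


def roundtrip(entries):
    """Serialize to memory and parse again, for a quick self-check."""
    buffer = io.BytesIO()
    write_ghash_entries(buffer, entries)
    buffer.seek(0)
    return read_ghash_entries(buffer)
```

Keying each record by object name is what would let one sidecar cover several objects; how those names would be matched to members of a .lib is the open question raised in the reply above.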
Leonardo Santagada via llvm-dev
2018-Feb-14 22:12 UTC
[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)
Sorry for taking so long to answer. I would love to push my xxHash64 changes upstream, but what I did was use the standard xxHash64 C code, as that allows an incremental hash computation; the one in LLVM only has the simple interface that takes a StringRef, so I really don't know how to move this along. Would it require me to rewrite the current internal xxHash64 implementation? I can definitely post the code online and you and others can decide what to do with it.

The patch that makes cl.exe .obj files carry the hashes is pretty ugly, but it can be of service to someone. I wouldn't push it upstream, though; just providing a patchset for other people who feel the urge to try it should be enough.

On Wed, Jan 31, 2018 at 6:18 PM, Zachary Turner <zturner at google.com> wrote:
> The quickest route would be for you to upload the patches for review and then go through a couple of iterations until it's cleaned up. Do you want to go that route? If not, you can upload them anyway and whenever I get some spare cycles I can take it over and get it committed. But that might be slower since I have other things on my plate at the moment, so I'm not sure when I'll be able to get to it. Maybe in a month or two?
>
> If you are willing to do the work to get it cleaned up and ok for committing, then the first steps would be 1) Fix variable and function names to conform to LLVM conventions, and 2) Remove the dependency in llvm-objcopy on ObjectYAML.
>
> BTW, I've also tried 8 byte ghashes locally, and in theory that should be faster, but I found a case where it's slower than 20 byte ghashes. This almost seems impossible, so I need to spend some time examining profiles to see what's going on.
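The difference between a one-shot hash interface and an incremental one, which is the obstacle mentioned at the top of this message, can be illustrated with a short sketch. It uses SHA-1 from Python's standard `hashlib` purely as a stand-in, because xxHash64 is not in the standard library; it says nothing about how the LLVM or xxHash APIs are actually shaped, and the function names are made up for the example.

```python
import hashlib


def hash_one_shot(data):
    """One-shot interface: the whole input must already be one contiguous buffer."""
    return hashlib.sha1(data).digest()


def hash_incremental(chunks):
    """Incremental interface: feed pieces as they become available."""
    hasher = hashlib.sha1()
    for chunk in chunks:
        hasher.update(chunk)
    return hasher.digest()


if __name__ == "__main__":
    pieces = [b"record header", b"referenced type hash", b"trailing bytes"]
    # Both paths give the same 20-byte digest; only the incremental one
    # avoids building the concatenated buffer first.
    print(hash_incremental(pieces) == hash_one_shot(b"".join(pieces)))
```

With only a one-shot interface, a caller that assembles its input from several pieces has to copy them into a temporary buffer before hashing, which is the extra work an incremental interface avoids.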
--
Leonardo Santagada