David Blaikie via llvm-dev
2017-Jun-14 19:47 UTC
[llvm-dev] [cfe-dev] RFC: ODR checker for Clang and LLD
Sean Silva:
Very nice and simple implementation!

Do you have any statistics on how large these odr tables are compared to other object file data? I assume that if these tables contain full mangled symbol names, they could end up being very large and may want to share the symbol name strings with the overall string table in the .o.

Peter Collingbourne:
Looking at Chromium's object files, it looks like the total size of the odrtabs is about 50% of the total size of the object files, which isn't great. The current implementation only looks at records, so I imagine that it would be hard to share any of the strings that I'm currently creating.
(I guess it's possible that some types will have a mangled vtable name in the string table, so we may be able to share a little that way.) Note, however, that this was without debug info.

One option for reducing size would be to:
1) store hashes of ODR names in ODR tables, per Rui's suggestion (alongside a reference to the name itself in the string table);
2) compress the string table for the ODR names with a standard compression algorithm like gzip.

This wouldn't seem to affect link time performance much, because I think we should only need to look at the strings if we see an ODR name hash match together with an ODR hash mismatch, which would mean an ODR violation with high probability (i.e. unless there was an ODR name hash collision, we have found an ODR violation). If we don't expect a lot of sharing with regular string tables (see below), it seems even more reasonable.

Sean Silva:
Neat observation!

FWIW, it is a birthday-problem type situation though, so with a 32-bit hash we would expect a collision once we have on the order of 2^16 distinct hashes (and 2^16 seems pretty easy to hit in a large project). So 64-bit hashes might be preferable.

Peter Collingbourne:
Oh right, good point, using a 64-bit hash does seem like a good idea here.

Sean Silva:
Also, do you have any numbers on the performance of your initial implementation?

Peter Collingbourne:
I measured the link time for Chromium's unit_tests (the largest single binary in Chromium) at 5.05s without ODR checks and 6.61s with ODR checks. So about 30% overhead, but in absolute terms it doesn't seem too bad.
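Sean's estimate follows from the birthday bound: collisions among k values hashed into a space of size N become likely around k ≈ √N, i.e. roughly 2^16 distinct values for a 32-bit hash. A quick check of the arithmetic (my own sketch, not part of the proposal):

```python
import math

def collision_probability(num_items: int, hash_bits: int) -> float:
    """Approximate birthday-bound probability of at least one collision
    among num_items values drawn uniformly from a hash_bits-bit space."""
    space = 2.0 ** hash_bits
    # P(collision) ~= 1 - exp(-k(k-1) / 2N)
    return 1.0 - math.exp(-num_items * (num_items - 1) / (2.0 * space))

# ~2^16 distinct 32-bit hashes already gives roughly a 40% chance of a collision...
print(collision_probability(2**16, 32))   # ~0.39
# ...while a 64-bit hash keeps even a million names effectively collision-free.
print(collision_probability(10**6, 64))   # ~2.7e-8
```

This is why a false positive (name-hash collision masquerading as an ODR violation) is a realistic concern at 32 bits but negligible at 64 bits for project-sized symbol counts.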
So I think this may be acceptable for an initial implementation, but it certainly seems worth trying to do better.

Sean Silva:
I know that things aren't currently apples-to-apples, but how does that compare to gold?

Peter Collingbourne:
I will measure it.

Peter Collingbourne:
For that unit_tests binary I measured the overhead at about 5 seconds (average of 10 runs). That is with debug info, of course.

Sean Silva:
W.r.t. LLD and having it always on by default (and hence making it as fast as possible), it seems like right now you are implementing the checking process with a hash table. That's simple and fine for a first implementation, but it's probably worth mentioning in a comment that the problem of checking the tables, at least from the linker's perspective, fits into a map-reduce pattern and could easily be parallelized if needed, e.g. a parallel sort to coalesce all entries for symbols of the same name, followed by a parallel forEach to check each bucket with the same symbol name (roughly speaking).

Peter Collingbourne:
Right, that's one approach. I was thinking of a simpler approach where at compile time we sort ODR names by hash and partition them using (say) the upper bits of the hash, so that at link time we can have N threads each building a hash table for a specific partition.

And of course this work can be started right after symbol resolution finishes and parallelised with the rest of the work done by the linker.

Sean Silva:
Even better than doing it faster is just doing less work. There's a lot of work that the linker is already doing that may be reusable for the ODR checking. E.g.:
- maybe we could get the coalescing step as a byproduct of our existing string deduping, which we are generally doing anyway;
- we are already coalescing symbol names for the symbol table. If the ODR table is keyed off of symbols in the binary that we are inserting into the symbol table, then I think we could do the entire ODR check with no extra "string" work on LLD's part.

I see Rui already mentioned some of this in https://bugs.chromium.org/p/chromium/issues/detail?id=726071#c4. You mentioned that not everything is necessarily directly keyed on a symbol (such as types), but I think that it would really simplify things if the check was done that way. Do you have any idea exactly how much of what we want to check is not keyed on symbols? If most things are keyed on symbols, then for the things that are not we can just emit extra symbols prefixed by __clang_odr_check_ or whatever.

Peter Collingbourne:
Since the current implementation only works with records, there is basically zero overlap right now between ODR names and symbols. I suppose that I could estimate the amount of function overlap in a hypothetical implementation that computes ODR hashes of functions by comparing the number of *_odr functions after Clang has finished IRgen with the number after optimization finishes. This of course would be strictly more than functions + types.

Sean Silva:
Wouldn't any function or symbol using the record type have the type name somewhere in it? If we used an offset+length encoding (instead of offset + NUL termination) we might be able to reuse it then (at some cost in finding the reference).
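The offset+length idea works because a record's mangled name is often a substring of a longer mangled symbol already in the string table, whereas NUL-terminated references can only share whole strings (or shared tails). A toy illustration with invented names:

```python
# A mangled constructor already present in the object file's string table.
strtab = b"_ZN3foo3BarC2Ev\x00"

def find_ref(table: bytes, needle: bytes):
    """Return an (offset, length) reference into the table, if the bytes
    occur anywhere; None otherwise."""
    off = table.find(needle)
    return None if off < 0 else (off, len(needle))

# The record's name component "3foo3Bar" sits inside the ctor's mangled
# name, so an offset+length reference can reuse those bytes...
print(find_ref(strtab, b"3foo3Bar"))   # (3, 8)
# ...whereas a NUL-terminated reference could only reuse bytes that run
# right up to a terminator, which these don't.
```

The cost Sean mentions is the substring search itself: the string table builder has to find these embedded occurrences rather than just deduplicating whole entries.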
Peter Collingbourne:
That may be possible with some work in the string table builder. But at that point, of course, we're not dealing with regular symbols any more. I guess we could have two ODR tables per object file: an array of (ODR hash, location) tuples for ODR names that correspond to symbol table symbols (i.e. Rui's proposal on the Chromium bug), and an array of (ODR name, ODR hash, location) tuples for all other ODR names. I guess if we wanted a "low overhead" mode we could just omit the second table or put fewer symbols in it.

Sean Silva:
With debug info, surely there is some sort of string representing the record name or something like that, no?

Peter Collingbourne:
Not the record name on its own (they do appear, but a bit awkwardly -- each namespace component is stored in a separate string), but if the record has at least one member function, the mangled type name will appear somewhere in .debug_str, so we could in principle reuse that with the offset/length trick.

Sean Silva:
I guess we may have to make our "low-overhead" user-facing behavior a bit more nuanced, e.g.:
1. does this feature bloat object files significantly?
2. does this feature slow down link times significantly?

Intuitively, it seems like we should be able to get 1 when debug info happens to be enabled (not sure about split DWARF?) and possibly in all cases at the cost of complexity. We may be able to get 2 in all cases with proper design.

Peter Collingbourne:
I think that would be my rough assessment as well. I think we have a good shot at 1 for all cases with some of the ideas that have been mentioned already.
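The two-table split Peter describes could be modeled roughly like this (a hypothetical sketch; the field names and shapes are invented, not the prototype's actual on-disk format):

```python
from typing import NamedTuple

class SymbolKeyedEntry(NamedTuple):
    """The ODR name IS the symbol, so only an index into the object file's
    regular symbol table is stored alongside the hash."""
    symtab_index: int
    odr_hash: int

class NamedEntry(NamedTuple):
    """For ODR names with no symbol-table counterpart (e.g. record types),
    the name -- or its 64-bit hash -- must be carried explicitly."""
    odr_name: bytes
    odr_hash: int

symtab = [b"_Z1fv"]                      # regular symbol table
table1 = [SymbolKeyedEntry(0, 0xAAAA)]   # keyed off existing symbols: cheap
table2 = [NamedEntry(b"3Foo", 0xBBBB)]   # everything else; a "low overhead"
                                         # mode could omit or thin this table

# Resolving the ODR name of a symbol-keyed entry needs no extra strings:
print(symtab[table1[0].symtab_index])    # b'_Z1fv'
```

The point of the split is that table 1 adds almost no string data, so trimming or dropping table 2 is what a "low overhead" mode would buy.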
If we can avoid creating dependencies on DWARF, I think that would be ideal -- I'd ideally like this to work for COFF as well, where you'd typically expect to find CodeView in object files. If I were to try this, the first thing I would try is hash/compression combined with the two ODR tables (no reuse for non-symbol ODR names to start with, as compression may be enough on its own).

Peter Collingbourne:
I developed a second prototype which uses hash/compression with no attempt to reuse. It is available here: https://github.com/pcc/llvm-project/tree/odr-checker2

For Chromium the object file size overhead was 536566007 bytes, or in relative terms about 25%, or about 4% with debug info. I measured perf overhead for unit_tests at about 6%, but after I moved the checker onto another thread, the overhead disappeared into the noise.

David Blaikie:
Still seems like quite a big increase.

Any chance of compression?

Peter Collingbourne:
That was with compression -- the implementation compresses the parts of the ODR table that aren't hashes (aside from the header and the Clang version, which is a small fixed cost), as well as the string table. The hashes were left uncompressed because they are in the critical path of the linker and because I imagine that they wouldn't really be that compressible.

David Blaikie:
I'd be a bit surprised if they weren't especially compressible.

Peter Collingbourne:
Maybe I'm wrong, but my intuition about compression is that it works best when the data contains repeated patterns. If we use a hash function with good dispersion, then I'd expect each hash to have little in common with other hashes.

David Blaikie:
And how much of the size increase is the compressed data vs. the uncompressed data?

Peter Collingbourne:
The ratio was roughly 60% compressed data to 40% uncompressed data.
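Peter's intuition is easy to sanity-check: a hash with good dispersion is statistically close to random bytes, which general-purpose compressors cannot shrink, whereas mangled-name string data is highly repetitive. A quick illustration (my own sketch, not taken from the prototype; the name is invented):

```python
import os
import zlib

# 64 KiB of random bytes stands in for well-dispersed 64-bit ODR hashes.
hashes = os.urandom(64 * 1024)
# Repetitive mangled names stand in for the ODR string table.
names = b"_ZN7content15WebContentsImpl4InitEv" * 2000

# Random data does not compress (the output is even slightly larger)...
print(len(zlib.compress(hashes)) / len(hashes))   # ~1.0
# ...while the name data shrinks dramatically.
print(len(zlib.compress(names)) / len(names))     # well under 0.05
```

This also explains the later gunzip observation: decompressing the string data is cheap to justify, but compressing the hash arrays would cost decompression time at link time for essentially no size savings.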
David Blaikie:
Is it still in the hot path when parallelized?

Peter Collingbourne:
Not right now according to my benchmarking, but decompression could push it into the critical path if it ends up taking longer than the rest of the work done by the linker after symbol resolution. On the same machine that I used for benchmarking, gunzip'ing 200MB of /dev/urandom (which is roughly what I'd expect the hashes to look like) takes around 1.1s, i.e. a not insignificant fraction of lld's runtime.

So I think the remaining gains would come either through limiting the number of ODR table entries, or through reuse of data.

Limiting might be something to explore -- one possibility is that we could limit the ODR table entries to the declarations that are "used" by a particular translation unit (it appears that Clang tracks something like that in Decl::Used/Decl::Referenced, but I'm not sure if that is exactly what we need -- I think we would basically need to test for reference reachability from the functions/globals that are IRgen'd).

David Blaikie:
Currently it has every type and function that is in the AST? Yeah, that's a lot -- perhaps it should be more like the things that go in the DWARF? (Though we would need to add some cases there, since the DWARF logic already relies on the ODR to not emit duplicates in some cases.)

Peter Collingbourne:
Just every record declaration -- Clang only supports ODR hashes for record declarations right now. I understand that function declarations (including function bodies) are still a work in progress.

I think it should indeed be roughly the things that go in the DWARF. I think that at one point I observed that every record declaration, even unused ones, was going into the DWARF, but I might have been mistaken because I can no longer reproduce that.
I'll take a closer look to see if I can reuse whatever logic presumably already exists for DWARF.

In terms of reuse, it seems that of the 536566007 bytes of overhead, 319309579 were the compressed part of the ODR tables. So even if we achieved 100% sharing,

David Blaikie:
100% sharing? You mean if all the data were compressed, and assuming the hashes were compressible at the same ratio as the other data?

Peter Collingbourne:
Sorry, I mean if 100% of the data in the compressed part of the ODR table could be eliminated by reusing data stored elsewhere (e.g. in the object file string table or in the DWARF). With the current scheme, I think that our minimum achievable overhead would be ~15% (no debug info) or ~2% (with debug info).

David Blaikie:
Could this go into .dwo files with Fission and be checked by dwp instead, perhaps?

Peter Collingbourne:
I think it could also work that way, yes.

I'm reasonably happy with these figures, at least for a first implementation. We may be able to do even better for file size with reuse, but I'd leave that for version 2.

David Blaikie:
What's the story with compatibility between versions, then? Is there a version header?

Peter Collingbourne:
Yes, the header contains a version number.

David Blaikie:
Will old formats be supported by lld indefinitely? Not at all?

Peter Collingbourne:
I think we should drop support for old formats when we introduce a new format. My understanding is that the ODR hash can change whenever Clang changes (the implementation only performs ODR checking if all ODR tables were produced by the same revision of Clang), so there wouldn't seem to be a huge benefit in keeping support for old formats around.

David Blaikie:
I imagine it's possible people aren't necessarily going to rev lld in exact lock-step with clang, but I could be wrong.
(Certainly binutils ld and gold aren't released or kept in lock-step with GCC, for example.)

Peter Collingbourne:
That's certainly possible, but I'd say that the bar for dropping backwards compatibility is lower because ODR tables are not required for correctness. We could keep compatibility with the last version or so if it isn't too burdensome, or otherwise print a warning.

David Blaikie:
They aren't required for correctness, but upgrading your compiler or linker and then silently losing ODR checking would be bad (or even not silently losing it, but having no choice except to rev both to keep the functionality and hold the ODR-cleanliness bar). It's the sort of thing where, if you lost the checking and then gained it back again later, the regression cleanup would be annoying and an impediment to using the feature.

Peter Collingbourne:
Makes sense, I guess. I'd be fine with a policy where the Nth open source release should be able to read ODR tables produced by the N-1th and possibly the N-2th release.

David Blaikie:
Still strikes me as a bit awkward -- I wonder how that compares to other (similar or different) linker features.

Any idea what Daniel Jasper & co have been working on w.r.t. ODR checking, and how this feature integrates (or doesn't) with their work? I imagine they might be working on something more like a Clang Tooling style approach, but I'm not sure.

Peter Collingbourne:
I'm not aware of any work like that, only of Richard Trieu's efforts for modules that I'm piggybacking on.
David Blaikie:
+Djasper -- perhaps you could provide some context on other ODR detection efforts?

Sean Silva:
The issue of retaining the ODR check for functions even if they get inlined may inherently pose an extra cost that can't be folded into existing work the linker is doing, so there might be a reason for clang to have a default mode that has practically no linking overhead and one that does more thorough checking but imposes extra linking overhead. Think of something like a crazy boost library with thousands of functions that get inlined away, but which have gigantic mangled names and so are precisely the ones that are going to impose extra cost on the linker. Simply due to the extra volume of strings that the linker would need to look at, I don't think there's a way to include checking of all inlined functions "for free" at the linker level using the symbol approach.

I guess those inlined functions would still have those symbol names in debug info (I think?), so piggybacking on the string deduplication we're already doing might make it possible to fold away the work in that case (but then again, it would still impose extra cost with split DWARF...).

Anyway, let's wait and see what the actual performance numbers are.
Peter Collingbourne (original RFC, Jun 6):
Hi all,

I'd like to propose an ODR checker feature for Clang and LLD. The feature would be similar to gold's --detect-odr-violations feature, but better: we can rely on integration with clang to avoid relying on debug info and to perform more precise matching.

The basic idea is that we use clang's ability to create ODR hashes for declarations. ODR hashes are computed using all information about a declaration that is ODR-relevant. If the flag -fdetect-odr-violations is passed, Clang will store the ODR hashes in a so-called ODR table in each object file. Each ODR table will contain a mapping from mangled declaration names to ODR hashes. At link time, the linker will read the ODR tables and report any mismatches.

To make this work:
- LLVM will be extended with the ability to represent ODR tables in the IR and emit them to object files
- Clang will be extended with the ability to emit ODR tables using ODR hashes
- LLD will be extended to read ODR tables from object files

I have implemented a prototype of this feature. It is available here: https://github.com/pcc/llvm-project/tree/odr-checker and some results from applying it to chromium are here: crbug.com/726071
As you can see, it did indeed find a number of real ODR violations in Chromium, including some that wouldn't be detectable using debug info.
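The scheme described above -- per-translation-unit tables mapping mangled names to ODR hashes, merged and checked by the linker -- can be sketched in miniature. This is a toy model, not the prototype's format: the real ODR hash is computed over the ODR-relevant parts of the AST, not over source text.

```python
import hashlib

def odr_hash(definition_text: str) -> int:
    # Stand-in for Clang's ODR hash over the ODR-relevant parts of a decl.
    return int.from_bytes(
        hashlib.sha256(definition_text.encode()).digest()[:8], "little")

def odr_table(decls: dict) -> dict:
    # One table per translation unit: mangled name -> ODR hash.
    return {name: odr_hash(body) for name, body in decls.items()}

def link_check(tables: dict) -> list:
    # Linker-side merge: the same mangled name carrying different ODR
    # hashes in two object files is an ODR violation.
    seen, violations = {}, []
    for obj, table in tables.items():
        for name, h in table.items():
            prev = seen.setdefault(name, (h, obj))
            if prev[0] != h:
                violations.append((name, prev[1], obj))
    return violations

# a.o and b.o disagree about struct S: a classic ODR violation.
tables = {
    "a.o": odr_table({"_ZTS1S": "struct S { int x; };"}),
    "b.o": odr_table({"_ZTS1S": "struct S { long x; };"}),
}
print(link_check(tables))   # [('_ZTS1S', 'a.o', 'b.o')]
```

Identical definitions hash identically, so the common case (the same header included everywhere) produces no reports and costs only the table merge.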
If you're interested in what the format of the ODR table would look like, that prototype shows pretty much what I had in mind, but I expect many other aspects of the implementation to change as it is upstreamed.

Thanks,
--
Peter
Peter Collingbourne via llvm-dev
2017-Jun-14 20:41 UTC
[llvm-dev] [cfe-dev] RFC: ODR checker for Clang and LLD
On Wed, Jun 14, 2017 at 12:47 PM, David Blaikie <dblaikie at gmail.com> wrote:> > > On Tue, Jun 13, 2017, 11:30 PM Peter Collingbourne <peter at pcc.me.uk> > wrote: > >> On Tue, Jun 13, 2017 at 11:06 PM, David Blaikie <dblaikie at gmail.com> >> wrote: >> >>> >>> >>> On Tue, Jun 13, 2017 at 10:05 PM Peter Collingbourne <peter at pcc.me.uk> >>> wrote: >>> >>>> On Tue, Jun 13, 2017 at 8:48 PM, David Blaikie <dblaikie at gmail.com> >>>> wrote: >>>> >>>>> >>>>> >>>>> On Tue, Jun 13, 2017 at 8:43 PM Peter Collingbourne <peter at pcc.me.uk> >>>>> wrote: >>>>> >>>>>> On Tue, Jun 13, 2017 at 7:54 PM, David Blaikie <dblaikie at gmail.com> >>>>>> wrote: >>>>>> >>>>>>> >>>>>>> >>>>>>> On Tue, Jun 13, 2017 at 6:34 PM Peter Collingbourne via cfe-dev < >>>>>>> cfe-dev at lists.llvm.org> wrote: >>>>>>> >>>>>>>> On Wed, Jun 7, 2017 at 11:28 PM, Peter Collingbourne < >>>>>>>> peter at pcc.me.uk> wrote: >>>>>>>> >>>>>>>>> On Wed, Jun 7, 2017 at 8:06 PM, Sean Silva <chisophugis at gmail.com> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Wed, Jun 7, 2017 at 4:31 PM, Peter Collingbourne < >>>>>>>>>> peter at pcc.me.uk> wrote: >>>>>>>>>> >>>>>>>>>>> On Wed, Jun 7, 2017 at 12:17 AM, Sean Silva < >>>>>>>>>>> chisophugis at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> Very nice and simple implementation! >>>>>>>>>>>> >>>>>>>>>>>> Do you have any statistics on how large these odr tables are >>>>>>>>>>>> compared to other object file data? I assume that if these tables contain >>>>>>>>>>>> full mangled symbol names, they could end up being very large and may want >>>>>>>>>>>> to share the symbol name strings with the overall string table in the .o >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Looking at Chromium's object files it looks like the total size >>>>>>>>>>> of the odrtabs is about 50% of the total size of the object files, which >>>>>>>>>>> isn't great. 
The current implementation only looks at records, so I imagine >>>>>>>>>>> that it would be hard to share any of the strings that I'm currently >>>>>>>>>>> creating. (I guess it's possible that some types will have a mangled vtable >>>>>>>>>>> name in the string table, so we may be able to share a little that way.) >>>>>>>>>>> Note however that this was without debug info. >>>>>>>>>>> >>>>>>>>>>> One option for reducing size would be to >>>>>>>>>>> 1) store hashes of ODR names in ODR tables, per Rui's suggestion >>>>>>>>>>> (alongside a reference to the name itself in the string table) >>>>>>>>>>> 2) compress the string table for the ODR names with a standard >>>>>>>>>>> compression algorithm like gzip. >>>>>>>>>>> This wouldn't seem to affect link time performance much because >>>>>>>>>>> I think we should only need to look at the strings if we see a ODR name >>>>>>>>>>> hash match together with an ODR hash mismatch, which would mean an ODR >>>>>>>>>>> violation with a high probability (i.e. unless there was an ODR name hash >>>>>>>>>>> collision, we have found an ODR violation). If we don't expect a lot of >>>>>>>>>>> sharing with regular string tables (see below), it seems even more >>>>>>>>>>> reasonable. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Neat observation! >>>>>>>>>> >>>>>>>>>> FWIW, it is a birthday problem type situation though, so for a >>>>>>>>>> 32-bit hash, we would expect a collision in about 1 in 2^16 distinct hashes >>>>>>>>>> (and 2^16 seems pretty easy to hit in a large project). So 64-bit hashes >>>>>>>>>> might be preferable. >>>>>>>>>> >>>>>>>>> >>>>>>>>> Oh right, good point, using a 64-bit hash does seem like a good >>>>>>>>> idea here. >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Also, do you have any numbers on the performance of your initial >>>>>>>>>>>> implementation? 
>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I measured the link time for chromium's unit_tests (the largest >>>>>>>>>>> single binary in chromium) at 5.05s without ODR checks and 6.61s with ODR >>>>>>>>>>> checks. So about 30% overhead, but in absolute terms it doesn't seem too >>>>>>>>>>> bad. So I think this may be acceptable for an initial implementation, but >>>>>>>>>>> it certainly seems worth trying to do better. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I know that things aren't currently apples-to-apples, but how >>>>>>>>>> does that compare to gold? >>>>>>>>>> >>>>>>>>> >>>>>>>>> I will measure it. >>>>>>>>> >>>>>>>> >>>>>>>> For that unit_tests binary I measured the overhead at about 5 >>>>>>>> seconds (average of 10 runs). That is with debug info, of course. >>>>>>>> >>>>>>>> W.r.t. LLD and having it always on by default (and hence making it >>>>>>>>>>>> as fast as possible), it seems like right now you are implementing the >>>>>>>>>>>> checking process with a hash table. That's simple and fine for a first >>>>>>>>>>>> implementation, but it's probably worth mentioning in a comment the problem >>>>>>>>>>>> of checking the tables, at least from the linker's perspective, does fit >>>>>>>>>>>> into a map-reduce pattern and could be easily parallelized if needed. E.g. >>>>>>>>>>>> a parallel sort to coalesce all entries for symbols of the same name >>>>>>>>>>>> followed by a parallel forEach to check each bucket with the same symbol >>>>>>>>>>>> name (roughly speaking). >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Right, that's one approach. I was thinking of a simpler approach >>>>>>>>>>> where at compile time we sort ODR names by hash and partition them using >>>>>>>>>>> (say) the upper bits of the hash, so that at link time we can have N >>>>>>>>>>> threads each building a hash table for a specific partition. 
>>>>>>>>>>> >>>>>>>>>>> And of course this work can be started right after symbol >>>>>>>>>>> resolution finishes and parallelised with the rest of the work done by the >>>>>>>>>>> linker. >>>>>>>>>>> >>>>>>>>>>> Even better than doing it faster is just doing less work. >>>>>>>>>>>> There's a lot of work that the linker is already doing that may be reusable >>>>>>>>>>>> for the ODR checking. >>>>>>>>>>>> E.g. >>>>>>>>>>>> - maybe we could get the coalescing step as a byproduct of our >>>>>>>>>>>> existing string deduping, which we are generally doing anyway. >>>>>>>>>>>> - we are already coalescing symbol names for the symbol table. >>>>>>>>>>>> If the ODR table is keyed off of symbols in the binary that we are >>>>>>>>>>>> inserting into the symbol table, then I think we could do the entire ODR >>>>>>>>>>>> check with no extra "string" work on LLD's part. >>>>>>>>>>>> >>>>>>>>>>>> I see Rui already mentioned some of this in >>>>>>>>>>>> https://bugs.chromium.org/p/chromium/issues/detail?id=726071#c4 >>>>>>>>>>>> . >>>>>>>>>>>> You mentioned that not everything is necessarily directly keyed >>>>>>>>>>>> on a symbol (such as types), but I think that it would really simplify >>>>>>>>>>>> things if the check was done as such. Do you have any idea exactly how much >>>>>>>>>>>> of the things that we want to check are not keyed on symbols? If most >>>>>>>>>>>> things are keyed on symbols, for the things we are not we can just emit >>>>>>>>>>>> extra symbols prefixed by __clang_odr_check_ or whatever. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Since the current implementation only works with records there >>>>>>>>>>> is basically zero overlap right now between ODR names and symbols. I >>>>>>>>>>> suppose that I could estimate the amount of function overlap in a >>>>>>>>>>> hypothetical implementation that computes ODR hashes of functions by >>>>>>>>>>> comparing the number of *_odr functions after clang has finished IRgen with >>>>>>>>>>> the number after optimization finishes. 
This of course would be strictly >>>>>>>>>>> more than functions + types. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Wouldn't any function or symbol using the record type have the >>>>>>>>>> type name somewhere in it? If we used an offset+length encoding (instead of >>>>>>>>>> offset + NUL termination) we might be able to reuse it then (at some cost >>>>>>>>>> in finding the reference). >>>>>>>>>> >>>>>>>>> >>>>>>>>> That may be possible with some work in the string table builder. >>>>>>>>> But at that point of course we're not dealing with regular symbols any >>>>>>>>> more. I guess we could have two ODR tables per object file: an array of >>>>>>>>> (ODR hash, location) tuples for ODR names that correspond to symbol table >>>>>>>>> symbols (i.e. Rui's proposal on the chromium bug), and an array of (ODR >>>>>>>>> name, ODR hash, location) tuples for all other ODR names. I guess if we >>>>>>>>> wanted a "low overhead" mode we could just omit the second table or put >>>>>>>>> fewer symbols in it. >>>>>>>>> >>>>>>>>> With debug info surely there is some sort of string representing >>>>>>>>>> the record name or something like that, no? >>>>>>>>>> >>>>>>>>> >>>>>>>>> Not the record name on its own (they do appear but a bit awkwardly >>>>>>>>> -- each namespace component is stored in a separate string), but if the >>>>>>>>> record has at least one member function the mangled type name will appear >>>>>>>>> somewhere in .debug_str, so we could in principle reuse that with the >>>>>>>>> offset/length trick. >>>>>>>>> >>>>>>>>> I guess we may have to have our "low-overhead" user-facing >>>>>>>>>> behavior be a bit more nuanced. E.g.: >>>>>>>>>> 1. does this feature bloat object files significantly >>>>>>>>>> 2. does this feature slow down link times significantly >>>>>>>>>> >>>>>>>>>> Intuitively, it seems like we should be able to get 1. when debug >>>>>>>>>> info happens to be enabled (not sure about split dwarf?) and possibly in >>>>>>>>>> all cases at the cost of complexity. 
We may be able to get 2. in all cases >>>>>>>>>> with proper design. >>>>>>>>>> >>>>>>>>> >>>>>>>>> I think that would be my rough assessment as well. I think we have >>>>>>>>> a good shot at 1 for all cases with some of the ideas that have been >>>>>>>>> mentioned already. If we can avoid creating dependencies on DWARF I think >>>>>>>>> that would be ideal -- I'd ideally like this to work for COFF as well, >>>>>>>>> where you'd typically expect to find CodeView in object files. If I were to >>>>>>>>> try this I think the first thing that I would try is hash/compression >>>>>>>>> combined with the two ODR tables (no reuse for non-symbol ODR names to >>>>>>>>> start with, as compression may be enough on its own). >>>>>>>>> >>>>>>>> >>>>>>>> I developed a second prototype which uses hash/compression with no >>>>>>>> attempt to reuse. It is available here: https://github.com/pcc/ >>>>>>>> llvm-project/tree/odr-checker2 >>>>>>>> >>>>>>>> For Chromium the object file size overhead was 536566007 bytes, or >>>>>>>> in relative terms about 25%, or about 4% with debug info. I measured perf >>>>>>>> overhead for unit_tests at about 6%, but after I moved the checker onto >>>>>>>> another thread, the overhead disappeared into the noise. >>>>>>>> >>>>>>> >>>>>>> Still seems like quite a big increase. >>>>>>> >>>>>>> Any chance of compression? >>>>>>> >>>>>> >>>>>> That was with compression -- the implementation compresses the parts >>>>>> of the ODR table that aren't hashes (aside from the header and the Clang >>>>>> version, which is a small fixed cost), as well as the string table. The >>>>>> hashes were left uncompressed because they are in the critical path of the >>>>>> linker and because I imagine that they wouldn't really be that compressible. >>>>>> >>>>> >>>>> I'd be a bit surprised if they weren't especially compressible - >>>>> >>>> >>>> Maybe I'm wrong, but my intuition about compression is that it works >>>> best when the data contains repeated patterns. 
If we use a hash function >>>> with good dispersion then I'd expect each hash to have little in common >>>> with other hashes. >>>> >>>> >>>>> and how much of the size increase is the compressed data V the >>>>> uncompressed data? >>>>> >>>> >>>> The ratio was roughly 60% compressed data to 40% uncompressed data. >>>> >>>> >>>>> Is it still in the hot path when parallelized? >>>>> >>>> >>>> Not right now according to my benchmarking, but decompression could >>>> push it into the critical path if it ends up taking longer than the rest of >>>> the work done by the linker after symbol resolution. On the same machine >>>> that I used for benchmarking, gunzip'ing 200MB of /dev/urandom (which is >>>> roughly what I'd expect the hashes to look like) takes around 1.1s, i.e. a >>>> not insignificant fraction of lld's runtime. >>>> >>>> >>>>> So I think the remaining gains would either be through limiting the >>>>>> number of ODR table entries, or through reuse of data. >>>>>> >>>>>> Limiting might be something to explore -- one possibility is that we >>>>>> could limit the ODR table entries to the declarations that are "used" by a >>>>>> particular translation unit (it appears that Clang tracks something like >>>>>> that in Decl::Used/Decl::Referenced, but I'm not sure if that is exactly >>>>>> what we need -- I think we would basically need to test for reference >>>>>> reachability from the functions/globals that are IRgen'd). >>>>>> >>>>> >>>>> Currently it has every type and function that is in the AST? Yeah, >>>>> that's a lot - perhaps it should be more like the things that go in the >>>>> DWARF? (though would need to add some cases there - since the DWARF logic >>>>> already relies on the ODR to not emit duplicates in some cases) >>>>> >>>> >>>> Just every record declaration -- Clang only supports ODR hashes for >>>> record declarations right now. I understand that function declarations >>>> (including function bodies) are still works in progress. 
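The intuition above — that well-dispersed hashes barely compress while string-heavy table data compresses well — is easy to sanity-check with zlib. A rough sketch; random bytes stand in for hash data and a repeated mangled name stands in for the string-table portion:

```python
import os
import zlib

# Well-dispersed hash bytes behave like random data and are essentially
# incompressible; repetitive string data (closer to the non-hash parts of
# an ODR table) compresses to a small fraction of its size.
random_like_hashes = os.urandom(1 << 20)  # stand-in for 1 MiB of 64-bit hashes
repetitive = b"_ZN2ns10SomeRecordE\0" * 50000

c_hashes = zlib.compress(random_like_hashes, 9)
c_strings = zlib.compress(repetitive, 9)
print(len(c_hashes) / len(random_like_hashes))  # close to (or above) 1.0
print(len(c_strings) / len(repetitive))         # a tiny fraction
```

This also matches the /dev/urandom gunzip measurement above: decompressing data that didn't shrink is pure overhead on the link's critical path.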
>>>> >>>> I think it should indeed just be roughly the things that go in the >>>> DWARF. I think that at one point I observed that every record declaration, >>>> even unused ones, were going into the DWARF, but I might have been mistaken >>>> because I can no longer reproduce that. I'll take a closer look to see if I >>>> can reuse what logic presumably already exists for DWARF. >>>> >>>> In terms of reuse, it seems that of the 536566007 bytes of >>>>>> overhead, 319309579 were the compressed part of the ODR tables. So even if >>>>>> we achieved 100% sharing, >>>>>> >>>>> >>>>> 100% sharing? You mean if all the data were compressed, and assuming >>>>> the hashes were compressible at the same ratio as the other data? >>>>> >>>> >>>> Sorry, I mean if 100% of the data in the compressed part of the ODR >>>> table could be eliminated by reusing data stored elsewhere (e.g. in the >>>> object file string table or in the DWARF). >>>> >>>> with the current scheme I think that our minimum achievable overhead >>>>>> would be ~15% (no debug info) or ~2% (with debug info). >>>>>> >>>>>> >>>>>>> Could this go into .dwo files with Fission and be checked by dwp >>>>>>> instead, perhaps? >>>>>>> >>>>>> >>>>>> I think it could also work that way, yes. >>>>>> >>>>>> I'm reasonably happy with these figures, at least for a first >>>>>>>> implementation. We may be able to do even better for file size with reuse, >>>>>>>> but I'd leave that for version 2. >>>>>>>> >>>>>>> >>>>>>> What's the story with compatibility between versions, then? Is there >>>>>>> a version header? >>>>>>> >>>>>> >>>>>> Yes, the header contains a version number. >>>>>> >>>>>> >>>>>>> Will old formats be supported by lld indefinitely? Not at all? >>>>>>> >>>>>> >>>>>> I think we should drop support for old formats when we introduce a >>>>>> new format. 
My understanding is that the ODR hash can change whenever Clang >>>>>> changes (the implementation only performs ODR checking if all ODR tables >>>>>> were produced by the same revision of Clang), so there wouldn't seem to be >>>>>> a huge benefit in keeping support for old formats around. >>>>>> >>>>> >>>>> I imagine it's possible people aren't necessarily going to rev lld in >>>>> exact lock-step with clang, but I could be wrong. (certainly binutils ld or >>>>> gold aren't released/kept in lock-step with GCC, for example) >>>>> >>>> >>>> That's certainly possible, but I'd say that the bar for dropping >>>> backwards compatibility is lower because ODR tables are not required for >>>> correctness. We could keep compatibility with the last version or so if it >>>> isn't too burdensome, or otherwise print a warning. >>>> >>> >>> They aren't required for correctness, but upgrading your compiler or >>> linker then silently losing ODR checking would be bad (or even not silently >>> losing it, but having no choice but to rev both to keep the functionality & >>> hold the ODR-cleanliness bar) - it's the sort of thing where if you lost >>> the checking, then gained it back again later, the regression cleanup would >>> be annoying/an impediment to using the feature. >>> >> >> Makes sense I guess. I'd be fine with a policy where the Nth open source >> release should be able to read ODR tables produced by the N-1th and >> possibly the N-2th release. >> > > Still strikes me as a bit awkward - wonder how that compared to other > (similar or different) linker features. >I think the most similar existing feature is .gdb_index. They have already gone through a few format revisions: https://sourceware.org/gdb/onlinedocs/gdb/Index-Section-Format.html and have deprecated/removed support for older formats. 
Because the requirements for ODR tables are simpler than those for .gdb_index, I'd expect us to converge on a final format sooner, so in practice the window of compatibility would end up being longer than a year. Peter> >> Any idea what Daniel Jasper & co have been working on WRT ODR checking & >>> how this feature integrates or doesn't with their work? I imagine they >>> might be working on something more like a Clang Tooling style approach, but >>> I'm not sure. >>> >> >> I'm not aware of any work like that, only of Richard Trieu's efforts for >> modules that I'm piggybacking on. >> > > +Djasper - perhaps you could provide some context on other odr detection > efforts? > > >> Peter >> >>> >>> - Dave >>> >>> >>>> >>>> Peter >>>> >>>>> >>>>> >>>>>> >>>>>> Peter >>>>>> >>>>>>> >>>>>>> - Dave >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Peter >>>>>>>> >>>>>>>>> >>>>>>>>> Peter >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- Sean Silva >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> The issue of retaining the ODR check for functions even if they >>>>>>>>>>>> get inlined may inherently pose an extra cost that can't be folded into >>>>>>>>>>>> existing work the linker is doing, so there might be a reason for clang to >>>>>>>>>>>> have a default mode that has practically no linking overhead and one that >>>>>>>>>>>> does more thorough checking but imposes extra linking overhead. Think >>>>>>>>>>>> something like a crazy boost library with thousands of functions that get >>>>>>>>>>>> inlined away, but have gigantic mangled names and so precisely are the ones >>>>>>>>>>>> that are going to impose extra cost on the linker. Simply due to the extra >>>>>>>>>>>> volume of strings that the linker would need to look at, I don't think >>>>>>>>>>>> there's a way to include checking of all inlined function "for free" at the >>>>>>>>>>>> linker level using the symbol approach. 
>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> I guess those inlined functions would still have those symbol >>>>>>>>>>>> names in debug info (I think?), so piggybacking on the string deduplication >>>>>>>>>>>> we're already doing might make it possible to fold away the work in that >>>>>>>>>>>> case (but then again, would still impose extra cost with split dwarf...). >>>>>>>>>>>> >>>>>>>>>>>> Anyway, let's wait to see what the actual performance numbers >>>>>>>>>>>> are. >>>>>>>>>>>> >>>>>>>>>>>> -- Sean Silva >>>>>>>>>>>> >>>>>>>>>>>> On Tue, Jun 6, 2017 at 10:40 PM, Peter Collingbourne via >>>>>>>>>>>> cfe-dev <cfe-dev at lists.llvm.org> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi all, >>>>>>>>>>>>> >>>>>>>>>>>>> I'd like to propose an ODR checker feature for Clang and LLD. >>>>>>>>>>>>> The feature would be similar to gold's --detect-odr-violations feature, but >>>>>>>>>>>>> better: we can rely on integration with clang to avoid relying on debug >>>>>>>>>>>>> info and to perform more precise matching. >>>>>>>>>>>>> >>>>>>>>>>>>> The basic idea is that we use clang's ability to create ODR >>>>>>>>>>>>> hashes for declarations. ODR hashes are computed using all information >>>>>>>>>>>>> about a declaration that is ODR-relevant. If the flag >>>>>>>>>>>>> -fdetect-odr-violations is passed, Clang will store the ODR hashes in a >>>>>>>>>>>>> so-called ODR table in each object file. Each ODR table will contain a >>>>>>>>>>>>> mapping from mangled declaration names to ODR hashes. At link time, the >>>>>>>>>>>>> linker will read the ODR table and report any mismatches. 
>>>>>>>>>>>>> >>>>>>>>>>>>> To make this work: >>>>>>>>>>>>> - LLVM will be extended with the ability to represent ODR >>>>>>>>>>>>> tables in the IR and emit them to object files >>>>>>>>>>>>> - Clang will be extended with the ability to emit ODR tables >>>>>>>>>>>>> using ODR hashes >>>>>>>>>>>>> - LLD will be extended to read ODR tables from object files >>>>>>>>>>>>> >>>>>>>>>>>>> I have implemented a prototype of this feature. It is >>>>>>>>>>>>> available here: https://github.com/pcc/ >>>>>>>>>>>>> llvm-project/tree/odr-checker and some results from applying >>>>>>>>>>>>> it to chromium are here: crbug.com/726071 >>>>>>>>>>>>> As you can see it did indeed find a number of real ODR >>>>>>>>>>>>> violations in Chromium, including some that wouldn't be detectable using >>>>>>>>>>>>> debug info. >>>>>>>>>>>>> >>>>>>>>>>>>> If you're interested in what the format of the ODR table would >>>>>>>>>>>>> look like, that prototype shows pretty much what I had in mind, but I >>>>>>>>>>>>> expect many other aspects of the implementation to change as it is >>>>>>>>>>>>> upstreamed. 
>>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> -- >>>>>>>>>>>>> -- >>>>>>>>>>>>> Peter >>>>>>>>>>>>> >>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>> cfe-dev mailing list >>>>>>>>>>>>> cfe-dev at lists.llvm.org >>>>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev -- -- Peter -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170614/76716b88/attachment.html>
Rui Ueyama via llvm-dev
2017-Jun-15 01:00 UTC
[llvm-dev] [cfe-dev] RFC: ODR checker for Clang and LLD
Is the entry in your ODR table 64-bit? Sean mentioned that this is a birthday-paradox situation, but I don't think we need hash values that large, as our aim is not to avoid all collisions. A small number of collisions is okay, as it only slightly increases false negatives. I think the hash could even be 16-bit if space saving is important. If we choose a 16-bit hash, the probability that a given ODR violation goes undetected is 1/65536, which is still quite low. On Wed, Jun 14, 2017 at 1:41 PM, Peter Collingbourne via cfe-dev < cfe-dev at lists.llvm.org> wrote:> > > On Wed, Jun 14, 2017 at 12:47 PM, David Blaikie <dblaikie at gmail.com> > wrote: > >> >> >> On Tue, Jun 13, 2017, 11:30 PM Peter Collingbourne <peter at pcc.me.uk> >> wrote: >> >>> On Tue, Jun 13, 2017 at 11:06 PM, David Blaikie <dblaikie at gmail.com> >>> wrote: >>> >>>> >>>> >>>> On Tue, Jun 13, 2017 at 10:05 PM Peter Collingbourne <peter at pcc.me.uk> >>>> wrote: >>>> >>>>> On Tue, Jun 13, 2017 at 8:48 PM, David Blaikie <dblaikie at gmail.com> >>>>> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Tue, Jun 13, 2017 at 8:43 PM Peter Collingbourne <peter at pcc.me.uk> >>>>>> wrote: >>>>>> >>>>>>> On Tue, Jun 13, 2017 at 7:54 PM, David Blaikie <dblaikie at gmail.com> >>>>>>> wrote: >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Jun 13, 2017 at 6:34 PM Peter Collingbourne via cfe-dev < >>>>>>>> cfe-dev at lists.llvm.org> wrote: >>>>>>>> >>>>>>>>> On Wed, Jun 7, 2017 at 11:28 PM, Peter Collingbourne < >>>>>>>>> peter at pcc.me.uk> wrote: >>>>>>>>> >>>>>>>>>> On Wed, Jun 7, 2017 at 8:06 PM, Sean Silva <chisophugis at gmail.com >>>>>>>>>> > wrote: >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Wed, Jun 7, 2017 at 4:31 PM, Peter Collingbourne < >>>>>>>>>>> peter at pcc.me.uk> wrote: >>>>>>>>>>> >>>>>>>>>>>> On Wed, Jun 7, 2017 at 12:17 AM, Sean Silva < >>>>>>>>>>>> chisophugis at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Very nice and simple implementation!
>>>>>>>>>>>>> >>>>>>>>>>>>> Do you have any statistics on how large these odr tables are >>>>>>>>>>>>> compared to other object file data? I assume that if these tables contain >>>>>>>>>>>>> full mangled symbol names, they could end up being very large and may want >>>>>>>>>>>>> to share the symbol name strings with the overall string table in the .o >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Looking at Chromium's object files it looks like the total size >>>>>>>>>>>> of the odrtabs is about 50% of the total size of the object files, which >>>>>>>>>>>> isn't great. The current implementation only looks at records, so I imagine >>>>>>>>>>>> that it would be hard to share any of the strings that I'm currently >>>>>>>>>>>> creating. (I guess it's possible that some types will have a mangled vtable >>>>>>>>>>>> name in the string table, so we may be able to share a little that way.) >>>>>>>>>>>> Note however that this was without debug info. >>>>>>>>>>>> >>>>>>>>>>>> One option for reducing size would be to >>>>>>>>>>>> 1) store hashes of ODR names in ODR tables, per Rui's >>>>>>>>>>>> suggestion (alongside a reference to the name itself in the string table) >>>>>>>>>>>> 2) compress the string table for the ODR names with a standard >>>>>>>>>>>> compression algorithm like gzip. >>>>>>>>>>>> This wouldn't seem to affect link time performance much because >>>>>>>>>>>> I think we should only need to look at the strings if we see a ODR name >>>>>>>>>>>> hash match together with an ODR hash mismatch, which would mean an ODR >>>>>>>>>>>> violation with a high probability (i.e. unless there was an ODR name hash >>>>>>>>>>>> collision, we have found an ODR violation). If we don't expect a lot of >>>>>>>>>>>> sharing with regular string tables (see below), it seems even more >>>>>>>>>>>> reasonable. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Neat observation! 
>>>>>>>>>>> >>>>>>>>>>> FWIW, it is a birthday problem type situation though, so for a >>>>>>>>>>> 32-bit hash, we would expect a collision in about 1 in 2^16 distinct hashes >>>>>>>>>>> (and 2^16 seems pretty easy to hit in a large project). So 64-bit hashes >>>>>>>>>>> might be preferable. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Oh right, good point, using a 64-bit hash does seem like a good >>>>>>>>>> idea here. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> Also, do you have any numbers on the performance of your >>>>>>>>>>>>> initial implementation? >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I measured the link time for chromium's unit_tests (the largest >>>>>>>>>>>> single binary in chromium) at 5.05s without ODR checks and 6.61s with ODR >>>>>>>>>>>> checks. So about 30% overhead, but in absolute terms it doesn't seem too >>>>>>>>>>>> bad. So I think this may be acceptable for an initial implementation, but >>>>>>>>>>>> it certainly seems worth trying to do better. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I know that things aren't currently apples-to-apples, but how >>>>>>>>>>> does that compare to gold? >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I will measure it. >>>>>>>>>> >>>>>>>>> >>>>>>>>> For that unit_tests binary I measured the overhead at about 5 >>>>>>>>> seconds (average of 10 runs). That is with debug info, of course. >>>>>>>>> >>>>>>>>> W.r.t. LLD and having it always on by default (and hence making it >>>>>>>>>>>>> as fast as possible), it seems like right now you are implementing the >>>>>>>>>>>>> checking process with a hash table. That's simple and fine for a first >>>>>>>>>>>>> implementation, but it's probably worth mentioning in a comment the problem >>>>>>>>>>>>> of checking the tables, at least from the linker's perspective, does fit >>>>>>>>>>>>> into a map-reduce pattern and could be easily parallelized if needed. E.g. 
>>>>>>>>>>>>> a parallel sort to coalesce all entries for symbols of the same name >>>>>>>>>>>>> followed by a parallel forEach to check each bucket with the same symbol >>>>>>>>>>>>> name (roughly speaking). >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Right, that's one approach. I was thinking of a simpler >>>>>>>>>>>> approach where at compile time we sort ODR names by hash and partition them >>>>>>>>>>>> using (say) the upper bits of the hash, so that at link time we can have N >>>>>>>>>>>> threads each building a hash table for a specific partition. >>>>>>>>>>>> >>>>>>>>>>>> And of course this work can be started right after symbol >>>>>>>>>>>> resolution finishes and parallelised with the rest of the work done by the >>>>>>>>>>>> linker. >>>>>>>>>>>> >>>>>>>>>>>> Even better than doing it faster is just doing less work. >>>>>>>>>>>>> There's a lot of work that the linker is already doing that may be reusable >>>>>>>>>>>>> for the ODR checking. >>>>>>>>>>>>> E.g. >>>>>>>>>>>>> - maybe we could get the coalescing step as a byproduct of our >>>>>>>>>>>>> existing string deduping, which we are generally doing anyway. >>>>>>>>>>>>> - we are already coalescing symbol names for the symbol table. >>>>>>>>>>>>> If the ODR table is keyed off of symbols in the binary that we are >>>>>>>>>>>>> inserting into the symbol table, then I think we could do the entire ODR >>>>>>>>>>>>> check with no extra "string" work on LLD's part. >>>>>>>>>>>>> >>>>>>>>>>>>> I see Rui already mentioned some of this in >>>>>>>>>>>>> https://bugs.chromium.org/p/chromium/issues/detail?id=726 >>>>>>>>>>>>> 071#c4. >>>>>>>>>>>>> You mentioned that not everything is necessarily directly >>>>>>>>>>>>> keyed on a symbol (such as types), but I think that it would really >>>>>>>>>>>>> simplify things if the check was done as such. Do you have any idea exactly >>>>>>>>>>>>> how much of the things that we want to check are not keyed on symbols? 
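The compile-time partitioning scheme sketched above — route entries to shards by the top bits of the name hash so N linker threads each build their own lock-free hash table — can be mocked up as follows. Shard count, field layout, and hash values are all invented for illustration:

```python
import threading

# Sketch of the sharding idea: each (name hash, ODR hash) entry goes to
# one of N partitions chosen by the top bits of the 64-bit name hash, so
# each worker thread owns one partition and needs no locking to build it.
N = 4            # number of partitions/threads; arbitrary here
BITS = 2         # 2**BITS == N

def partition(name_hash):
    return name_hash >> (64 - BITS)

def check_shard(entries, violations, lock):
    seen = {}
    for name_hash, odr_hash in entries:
        # setdefault returns the first hash seen for this name; any later
        # entry with a different ODR hash is a suspected ODR violation.
        if seen.setdefault(name_hash, odr_hash) != odr_hash:
            with lock:
                violations.append(name_hash)

entries = [(0x0123456789ABCDEF, 1), (0x0123456789ABCDEF, 2),   # mismatch
           (0xFEDCBA9876543210, 7), (0xFEDCBA9876543210, 7)]   # consistent
shards = [[] for _ in range(N)]
for e in entries:
    shards[partition(e[0])].append(e)

violations, lock = [], threading.Lock()
threads = [threading.Thread(target=check_shard, args=(s, violations, lock))
           for s in shards]
for t in threads: t.start()
for t in threads: t.join()
print(violations)  # the one name hash whose ODR hashes disagreed
```

The lock is only taken on the (rare) violation path; the common case touches only thread-local state, which is what makes the partitioned design parallelize cleanly.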
If >>>>>>>>>>>>> most things are keyed on symbols, for the things we are not we can just >>>>>>>>>>>>> emit extra symbols prefixed by __clang_odr_check_ or whatever. >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Since the current implementation only works with records there >>>>>>>>>>>> is basically zero overlap right now between ODR names and symbols. I >>>>>>>>>>>> suppose that I could estimate the amount of function overlap in a >>>>>>>>>>>> hypothetical implementation that computes ODR hashes of functions by >>>>>>>>>>>> comparing the number of *_odr functions after clang has finished IRgen with >>>>>>>>>>>> the number after optimization finishes. This of course would be strictly >>>>>>>>>>>> more than functions + types. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Wouldn't any function or symbol using the record type have the >>>>>>>>>>> type name somewhere in it? If we used an offset+length encoding (instead of >>>>>>>>>>> offset + NUL termination) we might be able to reuse it then (at some cost >>>>>>>>>>> in finding the reference). >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> That may be possible with some work in the string table builder. >>>>>>>>>> But at that point of course we're not dealing with regular symbols any >>>>>>>>>> more. I guess we could have two ODR tables per object file: an array of >>>>>>>>>> (ODR hash, location) tuples for ODR names that correspond to symbol table >>>>>>>>>> symbols (i.e. Rui's proposal on the chromium bug), and an array of (ODR >>>>>>>>>> name, ODR hash, location) tuples for all other ODR names. I guess if we >>>>>>>>>> wanted a "low overhead" mode we could just omit the second table or put >>>>>>>>>> fewer symbols in it. >>>>>>>>>> >>>>>>>>>> With debug info surely there is some sort of string representing >>>>>>>>>>> the record name or something like that, no? 
>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Not the record name on its own (they do appear but a bit >>>>>>>>>> awkwardly -- each namespace component is stored in a separate string), but >>>>>>>>>> if the record has at least one member function the mangled type name will >>>>>>>>>> appear somewhere in .debug_str, so we could in principle reuse that with >>>>>>>>>> the offset/length trick. >>>>>>>>>> >>>>>>>>>> I guess we may have to have our "low-overhead" user-facing >>>>>>>>>>> behavior be a bit more nuanced. E.g.: >>>>>>>>>>> 1. does this feature bloat object files significantly >>>>>>>>>>> 2. does this feature slow down link times significantly >>>>>>>>>>> >>>>>>>>>>> Intuitively, it seems like we should be able to get 1. when >>>>>>>>>>> debug info happens to be enabled (not sure about split dwarf?) and possibly >>>>>>>>>>> in all cases at the cost of complexity. We may be able to get 2. in all >>>>>>>>>>> cases with proper design. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I think that would be my rough assessment as well. I think we >>>>>>>>>> have a good shot at 1 for all cases with some of the ideas that have been >>>>>>>>>> mentioned already. If we can avoid creating dependencies on DWARF I think >>>>>>>>>> that would be ideal -- I'd ideally like this to work for COFF as well, >>>>>>>>>> where you'd typically expect to find CodeView in object files. If I were to >>>>>>>>>> try this I think the first thing that I would try is hash/compression >>>>>>>>>> combined with the two ODR tables (no reuse for non-symbol ODR names to >>>>>>>>>> start with, as compression may be enough on its own). >>>>>>>>>> >>>>>>>>> >>>>>>>>> I developed a second prototype which uses hash/compression with no >>>>>>>>> attempt to reuse. It is available here: https://github.com/pcc/l >>>>>>>>> lvm-project/tree/odr-checker2 >>>>>>>>> >>>>>>>>> For Chromium the object file size overhead was 536566007 bytes, or >>>>>>>>> in relative terms about 25%, or about 4% with debug info. 
I measured perf >>>>>>>>> overhead for unit_tests at about 6%, but after I moved the checker onto >>>>>>>>> another thread, the overhead disappeared into the noise. >>>>>>>>> >>>>>>>> >>>>>>>> Still seems like quite a big increase. >>>>>>>> >>>>>>>> Any chance of compression? >>>>>>>> >>>>>>> >>>>>>> That was with compression -- the implementation compresses the parts >>>>>>> of the ODR table that aren't hashes (aside from the header and the Clang >>>>>>> version, which is a small fixed cost), as well as the string table. The >>>>>>> hashes were left uncompressed because they are in the critical path of the >>>>>>> linker and because I imagine that they wouldn't really be that compressible. >>>>>>> >>>>>> >>>>>> I'd be a bit surprised if they weren't especially compressible - >>>>>> >>>>> >>>>> Maybe I'm wrong, but my intuition about compression is that it works >>>>> best when the data contains repeated patterns. If we use a hash function >>>>> with good dispersion then I'd expect each hash to have little in common >>>>> with other hashes. >>>>> >>>>> >>>>>> and how much of the size increase is the compressed data V the >>>>>> uncompressed data? >>>>>> >>>>> >>>>> The ratio was roughly 60% compressed data to 40% uncompressed data. >>>>> >>>>> >>>>>> Is it still in the hot path when parallelized? >>>>>> >>>>> >>>>> Not right now according to my benchmarking, but decompression could >>>>> push it into the critical path if it ends up taking longer than the rest of >>>>> the work done by the linker after symbol resolution. On the same machine >>>>> that I used for benchmarking, gunzip'ing 200MB of /dev/urandom (which is >>>>> roughly what I'd expect the hashes to look like) takes around 1.1s, i.e. a >>>>> not insignificant fraction of lld's runtime. >>>>> >>>>> >>>>>> So I think the remaining gains would either be through limiting the >>>>>>> number of ODR table entries, or through reuse of data. 
>>>>>>> >>>>>>> Limiting might be something to explore -- one possibility is that we >>>>>>> could limit the ODR table entries to the declarations that are "used" by a >>>>>>> particular translation unit (it appears that Clang tracks something like >>>>>>> that in Decl::Used/Decl::Referenced, but I'm not sure if that is exactly >>>>>>> what we need -- I think we would basically need to test for reference >>>>>>> reachability from the functions/globals that are IRgen'd). >>>>>>> >>>>>> >>>>>> Currently it has every type and function that is in the AST? Yeah, >>>>>> that's a lot - perhaps it should be more like the things that go in the >>>>>> DWARF? (though would need to add some cases there - since the DWARF logic >>>>>> already relies on the ODR to not emit duplicates in some cases) >>>>>> >>>>> >>>>> Just every record declaration -- Clang only supports ODR hashes for >>>>> record declarations right now. I understand that function declarations >>>>> (including function bodies) are still works in progress. >>>>> >>>>> I think it should indeed just be roughly the things that go in the >>>>> DWARF. I think that at one point I observed that every record declaration, >>>>> even unused ones, were going into the DWARF, but I might have been mistaken >>>>> because I can no longer reproduce that. I'll take a closer look to see if I >>>>> can reuse what logic presumably already exists for DWARF. >>>>> >>>>> In terms of reuse, it seems that of the 536566007 bytes of >>>>>>> overhead, 319309579 were the compressed part of the ODR tables. So even if >>>>>>> we achieved 100% sharing, >>>>>>> >>>>>> >>>>>> 100% sharing? You mean if all the data were compressed, and assuming >>>>>> the hashes were compressible at the same ratio as the other data? >>>>>> >>>>> >>>>> Sorry, I mean if 100% of the data in the compressed part of the ODR >>>>> table could be eliminated by reusing data stored elsewhere (e.g. in the >>>>> object file string table or in the DWARF). 
>>>>> >>>>> with the current scheme I think that our minimum achievable overhead >>>>>>> would be ~15% (no debug info) or ~2% (with debug info). >>>>>>> >>>>>>> >>>>>>>> Could this go into .dwo files with Fission and be checked by dwp >>>>>>>> instead, perhaps? >>>>>>>> >>>>>>> >>>>>>> I think it could also work that way, yes. >>>>>>> >>>>>>> I'm reasonably happy with these figures, at least for a first >>>>>>>>> implementation. We may be able to do even better for file size with reuse, >>>>>>>>> but I'd leave that for version 2. >>>>>>>>> >>>>>>>> >>>>>>>> What's the story with compatibility between versions, then? Is >>>>>>>> there a version header? >>>>>>>> >>>>>>> >>>>>>> Yes, the header contains a version number. >>>>>>> >>>>>>> >>>>>>>> Will old formats be supported by lld indefinitely? Not at all? >>>>>>>> >>>>>>> >>>>>>> I think we should drop support for old formats when we introduce a >>>>>>> new format. My understanding is that the ODR hash can change whenever Clang >>>>>>> changes (the implementation only performs ODR checking if all ODR tables >>>>>>> were produced by the same revision of Clang), so there wouldn't seem to be >>>>>>> a huge benefit in keeping support for old formats around. >>>>>>> >>>>>> >>>>>> I imagine it's possible people aren't necessarily going to rev lld in >>>>>> exact lock-step with clang, but I could be wrong. (certainly binutils ld or >>>>>> gold aren't released/kept in lock-step with GCC, for example) >>>>>> >>>>> >>>>> That's certainly possible, but I'd say that the bar for dropping >>>>> backwards compatibility is lower because ODR tables are not required for >>>>> correctness. We could keep compatibility with the last version or so if it >>>>> isn't too burdensome, or otherwise print a warning. 
>>>>> >>>> >>>> They aren't required for correctness, but upgrading your compiler or >>>> linker then silently losing ODR checking would be bad (or even not silently >>>> losing it, but having no choice but to rev both to keep the functionality & >>>> hold the ODR-cleanliness bar) - it's the sort of thing where if you lost >>>> the checking, then gained it back again later, the regression cleanup would >>>> be annoying/an impediment to using the feature. >>>> >>> >>> Makes sense I guess. I'd be fine with a policy where the Nth open source >>> release should be able to read ODR tables produced by the N-1th and >>> possibly the N-2th release. >>> >> >> Still strikes me as a bit awkward - wonder how that compared to other >> (similar or different) linker features. >> > > I think the most similar existing feature is .gdb_index. They have already > gone through a few format revisions: > https://sourceware.org/gdb/onlinedocs/gdb/Index-Section-Format.html > and have deprecated/removed support for older formats. > > Because the requirements for ODR tables are simpler than those for > .gdb_index, I'd expect us to converge on a final format sooner, so in > practice the window of compatibility would end up being longer than a year. > > Peter > > >> >>> Any idea what Daniel Jasper & co have been working on WRT ODR checking & >>>> how this feature integrates or doesn't with their work? I imagine they >>>> might be working on something more like a Clang Tooling style approach, but >>>> I'm not sure. >>>> >>> >>> I'm not aware of any work like that, only of Richard Trieu's efforts for >>> modules that I'm piggybacking on. >>> >> >> +Djasper - perhaps you could provide some context on other odr detection >> efforts? 
>>>>>>>>>>>>> The issue of retaining the ODR check for functions even if they get inlined may inherently pose an extra cost that can't be folded into existing work the linker is doing, so there might be a reason for clang to have a default mode that has practically no linking overhead and one that does more thorough checking but imposes extra linking overhead. Think something like a crazy boost library with thousands of functions that get inlined away, but have gigantic mangled names and so precisely are the ones that are going to impose extra cost on the linker. Simply due to the extra volume of strings that the linker would need to look at, I don't think there's a way to include checking of all inlined functions "for free" at the linker level using the symbol approach.

>>>>>>>>>>>>> I guess those inlined functions would still have those symbol names in debug info (I think?), so piggybacking on the string deduplication we're already doing might make it possible to fold away the work in that case (but then again, would still impose extra cost with split dwarf...).

>>>>>>>>>>>>> Anyway, let's wait to see what the actual performance numbers are.

>>>>>>>>>>>>> -- Sean Silva

>>>>>>>>>>>>> On Tue, Jun 6, 2017 at 10:40 PM, Peter Collingbourne via cfe-dev <cfe-dev at lists.llvm.org> wrote:

>>>>>>>>>>>>>> Hi all,

>>>>>>>>>>>>>> I'd like to propose an ODR checker feature for Clang and LLD. The feature would be similar to gold's --detect-odr-violations feature, but better: we can rely on integration with clang to avoid relying on debug info and to perform more precise matching.

>>>>>>>>>>>>>> The basic idea is that we use clang's ability to create ODR hashes for declarations. ODR hashes are computed using all information about a declaration that is ODR-relevant. If the flag -fdetect-odr-violations is passed, Clang will store the ODR hashes in a so-called ODR table in each object file. Each ODR table will contain a mapping from mangled declaration names to ODR hashes. At link time, the linker will read the ODR tables and report any mismatches.

>>>>>>>>>>>>>> To make this work:
>>>>>>>>>>>>>> - LLVM will be extended with the ability to represent ODR tables in the IR and emit them to object files
>>>>>>>>>>>>>> - Clang will be extended with the ability to emit ODR tables using ODR hashes
>>>>>>>>>>>>>> - LLD will be extended to read ODR tables from object files

>>>>>>>>>>>>>> I have implemented a prototype of this feature. It is available here: https://github.com/pcc/llvm-project/tree/odr-checker and some results from applying it to chromium are here: crbug.com/726071
>>>>>>>>>>>>>> As you can see it did indeed find a number of real ODR violations in Chromium, including some that wouldn't be detectable using debug info.

>>>>>>>>>>>>>> If you're interested in what the format of the ODR table would look like, that prototype shows pretty much what I had in mind, but I expect many other aspects of the implementation to change as it is upstreamed.

>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Peter

_______________________________________________
cfe-dev mailing list
cfe-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
David Blaikie via llvm-dev
2017-Jun-15 01:14 UTC
[llvm-dev] [cfe-dev] RFC: ODR checker for Clang and LLD
On Wed, Jun 14, 2017 at 1:41 PM Peter Collingbourne <peter at pcc.me.uk> wrote:

> On Wed, Jun 14, 2017 at 12:47 PM, David Blaikie <dblaikie at gmail.com> wrote:

>> On Tue, Jun 13, 2017, 11:30 PM Peter Collingbourne <peter at pcc.me.uk> wrote:

>>> On Tue, Jun 13, 2017 at 11:06 PM, David Blaikie <dblaikie at gmail.com> wrote:

>>>> On Tue, Jun 13, 2017 at 10:05 PM Peter Collingbourne <peter at pcc.me.uk> wrote:

>>>>> On Tue, Jun 13, 2017 at 8:48 PM, David Blaikie <dblaikie at gmail.com> wrote:

>>>>>> On Tue, Jun 13, 2017 at 8:43 PM Peter Collingbourne <peter at pcc.me.uk> wrote:

>>>>>>> On Tue, Jun 13, 2017 at 7:54 PM, David Blaikie <dblaikie at gmail.com> wrote:

>>>>>>>> On Tue, Jun 13, 2017 at 6:34 PM Peter Collingbourne via cfe-dev <cfe-dev at lists.llvm.org> wrote:

>>>>>>>>> On Wed, Jun 7, 2017 at 11:28 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:

>>>>>>>>>> On Wed, Jun 7, 2017 at 8:06 PM, Sean Silva <chisophugis at gmail.com> wrote:

>>>>>>>>>>> On Wed, Jun 7, 2017 at 4:31 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:

>>>>>>>>>>>> On Wed, Jun 7, 2017 at 12:17 AM, Sean Silva <chisophugis at gmail.com> wrote:

>>>>>>>>>>>>> Very nice and simple implementation!

>>>>>>>>>>>>> Do you have any statistics on how large these odr tables are compared to other object file data? I assume that if these tables contain full mangled symbol names, they could end up being very large and may want to share the symbol name strings with the overall string table in the .o

>>>>>>>>>>>> Looking at Chromium's object files, it looks like the total size of the odrtabs is about 50% of the total size of the object files, which isn't great. The current implementation only looks at records, so I imagine that it would be hard to share any of the strings that I'm currently creating. (I guess it's possible that some types will have a mangled vtable name in the string table, so we may be able to share a little that way.) Note however that this was without debug info.

>>>>>>>>>>>> One option for reducing size would be to 1) store hashes of ODR names in ODR tables, per Rui's suggestion (alongside a reference to the name itself in the string table), and 2) compress the string table for the ODR names with a standard compression algorithm like gzip. This wouldn't seem to affect link time performance much, because I think we should only need to look at the strings if we see an ODR name hash match together with an ODR hash mismatch, which would mean an ODR violation with high probability (i.e. unless there was an ODR name hash collision, we have found an ODR violation). If we don't expect a lot of sharing with regular string tables (see below), it seems even more reasonable.

>>>>>>>>>>> Neat observation!

>>>>>>>>>>> FWIW, it is a birthday problem type situation though, so for a 32-bit hash we would expect a collision at around 2^16 distinct hashes (and 2^16 seems pretty easy to hit in a large project).
>>>>>>>>>>> So 64-bit hashes might be preferable.

>>>>>>>>>> Oh right, good point, using a 64-bit hash does seem like a good idea here.

>>>>>>>>>>>>> Also, do you have any numbers on the performance of your initial implementation?

>>>>>>>>>>>> I measured the link time for chromium's unit_tests (the largest single binary in chromium) at 5.05s without ODR checks and 6.61s with ODR checks. So about 30% overhead, but in absolute terms it doesn't seem too bad. So I think this may be acceptable for an initial implementation, but it certainly seems worth trying to do better.

>>>>>>>>>>> I know that things aren't currently apples-to-apples, but how does that compare to gold?

>>>>>>>>>> I will measure it.

>>>>>>>>> For that unit_tests binary I measured the overhead at about 5 seconds (average of 10 runs). That is with debug info, of course.

>>>>>>>>>>>>> W.r.t. LLD and having it always on by default (and hence making it as fast as possible), it seems like right now you are implementing the checking process with a hash table. That's simple and fine for a first implementation, but it's probably worth mentioning in a comment that the problem of checking the tables, at least from the linker's perspective, fits into a map-reduce pattern and could be easily parallelized if needed. E.g. a parallel sort to coalesce all entries for symbols of the same name, followed by a parallel forEach to check each bucket with the same symbol name (roughly speaking).

>>>>>>>>>>>> Right, that's one approach. I was thinking of a simpler approach where at compile time we sort ODR names by hash and partition them using (say) the upper bits of the hash, so that at link time we can have N threads each building a hash table for a specific partition.

>>>>>>>>>>>> And of course this work can be started right after symbol resolution finishes and parallelised with the rest of the work done by the linker.

>>>>>>>>>>>>> Even better than doing it faster is just doing less work. There's a lot of work that the linker is already doing that may be reusable for the ODR checking. E.g.:
>>>>>>>>>>>>> - maybe we could get the coalescing step as a byproduct of our existing string deduping, which we are generally doing anyway.
>>>>>>>>>>>>> - we are already coalescing symbol names for the symbol table. If the ODR table is keyed off of symbols in the binary that we are inserting into the symbol table, then I think we could do the entire ODR check with no extra "string" work on LLD's part.

>>>>>>>>>>>>> I see Rui already mentioned some of this in https://bugs.chromium.org/p/chromium/issues/detail?id=726071#c4.

>>>>>>>>>>>>> You mentioned that not everything is necessarily directly keyed on a symbol (such as types), but I think that it would really simplify things if the check was done as such. Do you have any idea exactly how much of the things that we want to check are not keyed on symbols? If most things are keyed on symbols, then for the things that are not we can just emit extra symbols prefixed by __clang_odr_check_ or whatever.
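The partitioning idea described above can be sketched concretely. This is a toy model, not the prototype's actual format: `name_hash64`, the table shapes, and the example mangled names are all made up for illustration.

```python
# Toy model of the scheme discussed above: each object file's ODR table
# is modeled as {mangled_name: odr_hash}.  Entries are bucketed by the
# upper bits of a 64-bit name hash so that each partition can be checked
# independently (e.g. one thread per partition) at link time.
import hashlib

NUM_PARTITIONS = 8                            # power of two
PART_BITS = NUM_PARTITIONS.bit_length() - 1   # log2(NUM_PARTITIONS)

def name_hash64(mangled_name: str) -> int:
    # 64-bit name hash; per the birthday-problem discussion above, a
    # 32-bit hash would expect a collision at around 2**16 names.
    h = hashlib.blake2b(mangled_name.encode(), digest_size=8).digest()
    return int.from_bytes(h, "little")

def partition_of(name_hash: int) -> int:
    # Pick a partition from the upper bits of the hash.
    return name_hash >> (64 - PART_BITS)

def check_partition(entries):
    """Flag name hashes seen with two different ODR hashes --
    with high probability an ODR violation."""
    seen, mismatches = {}, set()
    for name_hash, odr_hash in entries:
        if seen.setdefault(name_hash, odr_hash) != odr_hash:
            mismatches.add(name_hash)
    return mismatches

def check_odr(odr_tables):
    parts = [[] for _ in range(NUM_PARTITIONS)]
    for table in odr_tables:
        for name, odr_hash in table.items():
            nh = name_hash64(name)
            parts[partition_of(nh)].append((nh, odr_hash))
    # Each check_partition call is independent and could run on its
    # own thread; here they run sequentially for simplicity.
    bad = set()
    for p in parts:
        bad |= check_partition(p)
    return bad

# Two TUs that disagree about struct S's definition:
tu1 = {"_ZTS1S": 0xAAAA, "_ZTS1T": 0x1234}
tu2 = {"_ZTS1S": 0xBBBB, "_ZTS1T": 0x1234}
assert check_odr([tu1, tu2]) == {name_hash64("_ZTS1S")}
```

With 64-bit name hashes, a name-hash match combined with an ODR-hash mismatch indicates a violation with high probability, per the birthday-bound discussion above.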
>>>>>>>>>>>> Since the current implementation only works with records, there is basically zero overlap right now between ODR names and symbols. I suppose that I could estimate the amount of function overlap in a hypothetical implementation that computes ODR hashes of functions by comparing the number of *_odr functions after clang has finished IRgen with the number after optimization finishes. This of course would be strictly more than functions + types.

>>>>>>>>>>> Wouldn't any function or symbol using the record type have the type name somewhere in it? If we used an offset+length encoding (instead of offset + NUL termination) we might be able to reuse it then (at some cost in finding the reference).

>>>>>>>>>> That may be possible with some work in the string table builder. But at that point of course we're not dealing with regular symbols any more. I guess we could have two ODR tables per object file: an array of (ODR hash, location) tuples for ODR names that correspond to symbol table symbols (i.e. Rui's proposal on the chromium bug), and an array of (ODR name, ODR hash, location) tuples for all other ODR names. I guess if we wanted a "low overhead" mode we could just omit the second table or put fewer symbols in it.

>>>>>>>>>>> With debug info surely there is some sort of string representing the record name or something like that, no?

>>>>>>>>>> Not the record name on its own (they do appear, but a bit awkwardly -- each namespace component is stored in a separate string), but if the record has at least one member function the mangled type name will appear somewhere in .debug_str, so we could in principle reuse that with the offset/length trick.

>>>>>>>>>>> I guess we may have to have our "low-overhead" user-facing behavior be a bit more nuanced. E.g.:
>>>>>>>>>>> 1. does this feature bloat object files significantly
>>>>>>>>>>> 2. does this feature slow down link times significantly

>>>>>>>>>>> Intuitively, it seems like we should be able to get 1 when debug info happens to be enabled (not sure about split dwarf?) and possibly in all cases at the cost of complexity. We may be able to get 2 in all cases with proper design.

>>>>>>>>>> I think that would be my rough assessment as well. I think we have a good shot at 1 for all cases with some of the ideas that have been mentioned already. If we can avoid creating dependencies on DWARF I think that would be ideal -- I'd ideally like this to work for COFF as well, where you'd typically expect to find CodeView in object files. If I were to try this, I think the first thing that I would try is hash/compression combined with the two ODR tables (no reuse for non-symbol ODR names to start with, as compression may be enough on its own).

>>>>>>>>> I developed a second prototype which uses hash/compression with no attempt to reuse. It is available here: https://github.com/pcc/llvm-project/tree/odr-checker2

>>>>>>>>> For Chromium the object file size overhead was 536566007 bytes, or in relative terms about 25%, or about 4% with debug info. I measured perf overhead for unit_tests at about 6%, but after I moved the checker onto another thread, the overhead disappeared into the noise.

>>>>>>>> Still seems like quite a big increase.

>>>>>>>> Any chance of compression?

>>>>>>> That was with compression -- the implementation compresses the parts of the ODR table that aren't hashes (aside from the header and the Clang version, which is a small fixed cost), as well as the string table. The hashes were left uncompressed because they are in the critical path of the linker and because I imagine that they wouldn't really be that compressible.

>>>>>> I'd be a bit surprised if they weren't especially compressible -

>>>>> Maybe I'm wrong, but my intuition about compression is that it works best when the data contains repeated patterns. If we use a hash function with good dispersion, then I'd expect each hash to have little in common with other hashes.

>>>>>> and how much of the size increase is the compressed data vs. the uncompressed data?

>>>>> The ratio was roughly 60% compressed data to 40% uncompressed data.

>>>>>> Is it still in the hot path when parallelized?

>>>>> Not right now according to my benchmarking, but decompression could push it into the critical path if it ends up taking longer than the rest of the work done by the linker after symbol resolution. On the same machine that I used for benchmarking, gunzip'ing 200MB of /dev/urandom (which is roughly what I'd expect the hashes to look like) takes around 1.1s, i.e. a not insignificant fraction of lld's runtime.

>>>>>>> So I think the remaining gains would either be through limiting the number of ODR table entries, or through reuse of data.
>>>>>>> Limiting might be something to explore -- one possibility is that we could limit the ODR table entries to the declarations that are "used" by a particular translation unit (it appears that Clang tracks something like that in Decl::Used/Decl::Referenced, but I'm not sure if that is exactly what we need -- I think we would basically need to test for reference reachability from the functions/globals that are IRgen'd).

>>>>>> Currently it has every type and function that is in the AST? Yeah, that's a lot - perhaps it should be more like the things that go in the DWARF? (though we would need to add some cases there - since the DWARF logic already relies on the ODR to not emit duplicates in some cases)

>>>>> Just every record declaration -- Clang only supports ODR hashes for record declarations right now. I understand that function declarations (including function bodies) are still a work in progress.

>>>>> I think it should indeed just be roughly the things that go in the DWARF. I think that at one point I observed that every record declaration, even unused ones, was going into the DWARF, but I might have been mistaken because I can no longer reproduce that. I'll take a closer look to see if I can reuse the logic that presumably already exists for DWARF.

>>>>>>> In terms of reuse, it seems that of the 536566007 bytes of overhead, 319309579 were the compressed part of the ODR tables. So even if we achieved 100% sharing,

>>>>>> 100% sharing? You mean if all the data were compressed, and assuming the hashes were compressible at the same ratio as the other data?

>>>>> Sorry, I mean if 100% of the data in the compressed part of the ODR table could be eliminated by reusing data stored elsewhere (e.g. in the object file string table or in the DWARF).
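The offset+length reuse idea from earlier in the thread (referencing an ODR name that already occurs inside some other string, such as a mangled vtable symbol) can be sketched as follows. The API and the naive O(n*m) substring search are hypothetical; a real string table builder would need something smarter, but the sharing idea is the same.

```python
# Sketch of (offset, length) string references: adding a string costs
# nothing if it already occurs as a substring of the blob, e.g. an ODR
# name inside a previously added mangled vtable symbol.  With
# NUL-terminated entries this sharing would not be possible, since the
# shared name is not followed by a NUL.
class StringTable:
    def __init__(self):
        self.blob = ""

    def add(self, s: str):
        pos = self.blob.find(s)
        if pos < 0:             # not already present: append it
            pos = len(self.blob)
            self.blob += s
        return (pos, len(s))    # (offset, length) reference

    def get(self, ref):
        off, n = ref
        return self.blob[off:off + n]

st = StringTable()
r1 = st.add("_ZTV7MyClass")   # vtable symbol enters the table first
r2 = st.add("7MyClass")       # reused as a substring: blob does not grow
assert st.get(r2) == "7MyClass" and len(st.blob) == len("_ZTV7MyClass")
```

The cost Sean mentions ("at some cost in finding the reference") shows up here as the substring search in `add`.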
>>>>>>> with the current scheme I think that our minimum achievable overhead would be ~15% (no debug info) or ~2% (with debug info).

>>>>>>>> Could this go into .dwo files with Fission and be checked by dwp instead, perhaps?

>>>>>>> I think it could also work that way, yes.

>>>>>>>>> I'm reasonably happy with these figures, at least for a first implementation. We may be able to do even better for file size with reuse, but I'd leave that for version 2.

>>>>>>>> What's the story with compatibility between versions, then? Is there a version header?

>>>>>>> Yes, the header contains a version number.

>>>>>>>> Will old formats be supported by lld indefinitely? Not at all?

>>>>>>> I think we should drop support for old formats when we introduce a new format. My understanding is that the ODR hash can change whenever Clang changes (the implementation only performs ODR checking if all ODR tables were produced by the same revision of Clang), so there wouldn't seem to be a huge benefit in keeping support for old formats around.

>>>>>> I imagine it's possible people aren't necessarily going to rev lld in exact lock-step with clang, but I could be wrong. (certainly binutils ld or gold aren't released/kept in lock-step with GCC, for example)

>>>>> That's certainly possible, but I'd say that the bar for dropping backwards compatibility is lower because ODR tables are not required for correctness. We could keep compatibility with the last version or so if it isn't too burdensome, or otherwise print a warning.
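A version field in the table header makes the N-1 compatibility policy discussed above cheap to implement. A sketch with an entirely made-up header layout (a magic string plus a little-endian u16 version; the prototype's real header may differ):

```python
# Hypothetical sketch of version gating for an ODR table header: the
# linker fully checks the current version, warns but still reads tables
# one version behind, and skips anything else (losing ODR checking for
# those objects rather than failing the link).
import struct

MAGIC = b"ODRT"            # made-up magic for illustration
CURRENT_VERSION = 2

def classify_table(blob: bytes) -> str:
    if len(blob) < 6 or blob[:4] != MAGIC:
        return "not-an-odr-table"
    (version,) = struct.unpack_from("<H", blob, 4)
    if version == CURRENT_VERSION:
        return "check"
    if version == CURRENT_VERSION - 1:
        return "warn-and-check"    # N-1 release: readable, but warn
    return "skip"                  # too old or too new: no checking

assert classify_table(MAGIC + struct.pack("<H", 2)) == "check"
assert classify_table(MAGIC + struct.pack("<H", 1)) == "warn-and-check"
assert classify_table(MAGIC + struct.pack("<H", 9)) == "skip"
```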
>>>> They aren't required for correctness, but upgrading your compiler or linker and then silently losing ODR checking would be bad (or even not silently losing it, but having no choice but to rev both to keep the functionality & hold the ODR-cleanliness bar) - it's the sort of thing where if you lost the checking, then gained it back again later, the regression cleanup would be annoying/an impediment to using the feature.

>>> Makes sense I guess. I'd be fine with a policy where the Nth open source release should be able to read ODR tables produced by the N-1th and possibly the N-2th release.

>> Still strikes me as a bit awkward - I wonder how that compares to other (similar or different) linker features.

> I think the most similar existing feature is .gdb_index. They have already gone through a few format revisions: https://sourceware.org/gdb/onlinedocs/gdb/Index-Section-Format.html and have deprecated/removed support for older formats.

Not sure it's quite the same - that's an output of the linker & produces a performance degradation in GDB when not present. Here the discussion is about a format input to the linker that produces warnings, potentially errors (-Werror), for users, so they might want to maintain compatibility for it. Seems like a higher bar.

Feels to me like a bit more time could be spent on a design/size impact/etc to give a better chance of ongoing stability (and/or backcompat at least) for users. But just a thought.

- Dave

> Because the requirements for ODR tables are simpler than those for .gdb_index, I'd expect us to converge on a final format sooner, so in practice the window of compatibility would end up being longer than a year.

> Peter

>>>> Any idea what Daniel Jasper & co have been working on WRT ODR checking & how this feature integrates or doesn't with their work? I imagine they might be working on something more like a Clang Tooling style approach, but I'm not sure.

>>> I'm not aware of any work like that, only of Richard Trieu's efforts for modules that I'm piggybacking on.

>> +Djasper - perhaps you could provide some context on other odr detection efforts?

>>> Peter