David Blaikie via llvm-dev
2017-Mar-13 04:39 UTC
[llvm-dev] Use of the C++ standard library in XRay compiler-rt
On Sun, Mar 12, 2017, 4:10 PM Dean Michael Berris <dean.berris at gmail.com> wrote:> > > On 9 Mar 2017, at 09:32, David Blaikie via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > > > I agree that we should clean up the standard library usage even just for > consistency. > > > > +1 -- now that I think about it, it should be fairly doable (also happy to > help with reviews if that helps). > > > Searching the xray directory for dependencies: > > ...compiler-rt/lib/xray % grep '#include <[^>.]*>' -oh `find . -type > f|grep -v 'tests'` | sort | uniq -c > > 1 #include <algorithm> > > 10 #include <atomic> > > 1 #include <bitset> > > 6 #include <cassert> > > 1 #include <cerrno> > > 1 #include <cstddef> > > 7 #include <cstdint> > > 2 #include <cstdio> > > 1 #include <cstdlib> > > 2 #include <cstring> > > 1 #include <deque> > > 2 #include <iterator> > > 2 #include <limits> > > 2 #include <memory> > > 4 #include <mutex> > > 1 #include <system_error> > > 1 #include <thread> > > 2 #include <tuple> > > 1 #include <unordered_map> > > 1 #include <unordered_set> > > 3 #include <utility> > > I think the biggest part is containers, and they are mostly in > ./xray_buffer_queue.h and ./xray_fdr_logging.cc. > > Yes, buffer_queue can definitely live without using system_error, > unordered_map, and unordered_set. It might make it a bit more complex (we'd > need to implement a correct and fairly efficient hash set) but if it means > the deployment model is simpler then I'm happy with that trade-off. When we > were implementing this, we made a decision to make it so that the "mismatch > of standard library implementations" was treated as a lower priority issue > -- something we don't think comes up as often, and is easily solvable by > re-building the runtime with the standard library the end > application/binary will be using anyway. Since XRay is only ever > statically-linked (we don't have a dynamic version of it), I thought the > rebuild option is slightly simpler than trying to implement the whole XRay > runtime in a constrained version of C++ and libc-only functions. >Except that's not how llvm is distributed. In releases it will ship with the compiler and runtime libraries but can be used with any c++ standard library. This isn't a quality if implementation thing, this is more a correctness issue.> > > > dependencies without buffer queue and fdr logging: > > ...compiler-rt/lib/xray % grep '#include <[^>.]*>' -oh `find . -type > f|egrep -v 'tests|buffer|fdr'` | sort | uniq -c > > 9 #include <atomic> > > 4 #include <cassert> > > 1 #include <cerrno> > > 1 #include <cstddef> > > 6 #include <cstdint> > > 2 #include <cstdio> > > 1 #include <cstring> > > 2 #include <iterator> > > 2 #include <limits> > > 1 #include <memory> > > 3 #include <mutex> > > 1 #include <thread> > > 2 #include <tuple> > > 2 #include <utility> > > I believe that this is relatively easy to cleanup. I can do that. > > > > I don't know how hard it is to rewrite buffer queue and fdr logging > using compiler_rt infrastructure. > > > > The crucial things we need in FDR mode logging are: > > - Aligned storage (I suspect this could be done without > std::aligned_storage<...>) > - memcpy (we use std::memcpy there, but probably didn't need to) > - thread-local storage (using C++'s `thread_local` keyword) > > The buffer queue can be re-written to not use std::system_error in the > APIs (use XRay-specific enums instead), internally not use std::atomic<...> > types, and have to implement the FIFO queue, a lookup table for ownership, > etc. -- while none of these seem hard, they didn't seem worth it at the > time of implementation. > > > I think buffer_queue's probably sufficiently well bounded that it > shouldn't be drastically hard to replace it with a custom implementation. > Haven't looked at fdr_logging. > > > > Maps/dictionary-like things might be a bit of a pain in particular. Not > sure if the sanitizers already have some reusable idioms/libraries for that. > > > > I'm also not really clear on where the boundary is - which headers or > language features ('new'?) can be used, and which can't. Can't say I've > ever tried to make code library agnostic. > > > > One thing we rely on heavily on in the FDR mode implementation is C++'s > `thread_local` keyword. I'm not sure what that entails runtime-wise (does > it need pthreads? or something else?) but I'm sure a functional replacement > would be alright too. >No doubt we can find some common API for that, I'd guess tsan probably has already had to figure out things like that.> > > > > > (this came up for me due to what's probably a bug in the way compiler-rt > is built - where the lib itself is built with the host compiler but the > tests are built/linked with the just-bulit clang. My host compiler uses > libstdc++ 6, whereas the just-built clang will use libstdc++ 4.8. So it > fails to link due to this mismatch) > > How hard would it be to fix that bug? >I've started a separate thread on that, but even if that's fixed it's still necessary to fix the dependency/distribution model here.> I would think this error would come up in the wild too, but could be > remedied by just rebuilding XRay with the right C++ standard library. > > -- Dean > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170313/91fb45c4/attachment.html>
David Chisnall via llvm-dev
2017-Mar-13 09:06 UTC
[llvm-dev] Use of the C++ standard library in XRay compiler-rt
On 13 Mar 2017, at 04:39, David Blaikie via llvm-dev <llvm-dev at lists.llvm.org> wrote:> > One thing we rely on heavily on in the FDR mode implementation is C++'s `thread_local` keyword. I'm not sure what that entails runtime-wise (does it need pthreads? or something else?) but I'm sure a functional replacement would be alright too. > > No doubt we can find some common API for that, I'd guess tsan probably has already had to figure out things like that.thread_local depends on any runtime (ABI library) features for destruction, so it’s only a dependency if you have thread-local global with a non-trivial destructor. Local static (non-thread-local) variables also depend on the ABI library for atomic creation. These are less of an issue than standard library dependencies though, because these calls are all defined by the platform ABI and so if there are multiple options (libcxxrt, libsupc++, libc++abi) then you substitute any of them at run time. This does mean that you’ll need to ensure that one of them is linked in the final binary though, and so must be overridden by the flag that selects C++ standard library implementation during linking. David
Dean Michael Berris via llvm-dev
2017-Mar-15 00:34 UTC
[llvm-dev] Use of the C++ standard library in XRay compiler-rt
> On 13 Mar 2017, at 15:39, David Blaikie <dblaikie at gmail.com> wrote: > > > > On Sun, Mar 12, 2017, 4:10 PM Dean Michael Berris <dean.berris at gmail.com <mailto:dean.berris at gmail.com>> wrote: > > > On 9 Mar 2017, at 09:32, David Blaikie via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > > > > I agree that we should clean up the standard library usage even just for consistency. > > > > +1 -- now that I think about it, it should be fairly doable (also happy to help with reviews if that helps). > > > Searching the xray directory for dependencies: > > ...compiler-rt/lib/xray % grep '#include <[^>.]*>' -oh `find . -type f|grep -v 'tests'` | sort | uniq -c > > 1 #include <algorithm> > > 10 #include <atomic> > > 1 #include <bitset> > > 6 #include <cassert> > > 1 #include <cerrno> > > 1 #include <cstddef> > > 7 #include <cstdint> > > 2 #include <cstdio> > > 1 #include <cstdlib> > > 2 #include <cstring> > > 1 #include <deque> > > 2 #include <iterator> > > 2 #include <limits> > > 2 #include <memory> > > 4 #include <mutex> > > 1 #include <system_error> > > 1 #include <thread> > > 2 #include <tuple> > > 1 #include <unordered_map> > > 1 #include <unordered_set> > > 3 #include <utility> > > I think the biggest part is containers, and they are mostly in ./xray_buffer_queue.h and ./xray_fdr_logging.cc. > > Yes, buffer_queue can definitely live without using system_error, unordered_map, and unordered_set. It might make it a bit more complex (we'd need to implement a correct and fairly efficient hash set) but if it means the deployment model is simpler then I'm happy with that trade-off. When we were implementing this, we made a decision to make it so that the "mismatch of standard library implementations" was treated as a lower priority issue -- something we don't think comes up as often, and is easily solvable by re-building the runtime with the standard library the end application/binary will be using anyway. Since XRay is only ever statically-linked (we don't have a dynamic version of it), I thought the rebuild option is slightly simpler than trying to implement the whole XRay runtime in a constrained version of C++ and libc-only functions. > > Except that's not how llvm is distributed. In releases it will ship with the compiler and runtime libraries but can be used with any c++ standard library. >That's fair. I was mostly thinking of the majority cases where I assumed that the user would be using clang/llvm/compiler-rt with the system-provided standard library anyway, and any exotic set-ups (not system libraries used) would require more work from the user anyway.> This isn't a quality if implementation thing, this is more a correctness issue.I'm less convinced of that, mostly because I've assumed that users using libraries and runtimes will be aware of the potential ABI incompatibilities that arise when using different standard libraries to build and then link. But if the constraints are different for runtimes in compiler-rt, then I'm alright with abiding by those. :)> > One thing we rely on heavily on in the FDR mode implementation is C++'s `thread_local` keyword. I'm not sure what that entails runtime-wise (does it need pthreads? or something else?) but I'm sure a functional replacement would be alright too. > > No doubt we can find some common API for that, I'd guess tsan probably has already had to figure out things like that. >I suspect I'm more worried about costs rather than APIs. I've learned a while back that system-provided TLS functionality isn't as good as compiler-provided (and system/platform-specific) TLS implementations.> > > > > > > (this came up for me due to what's probably a bug in the way compiler-rt is built - where the lib itself is built with the host compiler but the tests are built/linked with the just-bulit clang. My host compiler uses libstdc++ 6, whereas the just-built clang will use libstdc++ 4.8. So it fails to link due to this mismatch) > > How hard would it be to fix that bug? > > I've started a separate thread on that, but even if that's fixed it's still necessary to fix the dependency/distribution model here. >Agreed. Cheers -- Dean -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170315/8e6d6286/attachment.html>
Dean Michael Berris via llvm-dev
2017-Mar-15 00:38 UTC
[llvm-dev] Use of the C++ standard library in XRay compiler-rt
> On 13 Mar 2017, at 20:06, David Chisnall <david.chisnall at cl.cam.ac.uk> wrote: > > On 13 Mar 2017, at 04:39, David Blaikie via llvm-dev <llvm-dev at lists.llvm.org> wrote: >> >> One thing we rely on heavily on in the FDR mode implementation is C++'s `thread_local` keyword. I'm not sure what that entails runtime-wise (does it need pthreads? or something else?) but I'm sure a functional replacement would be alright too. >> >> No doubt we can find some common API for that, I'd guess tsan probably has already had to figure out things like that. > > thread_local depends on any runtime (ABI library) features for destruction, so it’s only a dependency if you have thread-local global with a non-trivial destructor. Local static (non-thread-local) variables also depend on the ABI library for atomic creation. >We use at-thread-exit thread_local variables at global scope to do cleanup for "at end of thread" functionality. So this means we do need some runtime support for that.> These are less of an issue than standard library dependencies though, because these calls are all defined by the platform ABI and so if there are multiple options (libcxxrt, libsupc++, libc++abi) then you substitute any of them at run time. This does mean that you’ll need to ensure that one of them is linked in the final binary though, and so must be overridden by the flag that selects C++ standard library implementation during linking. >Fair enough. We already do some link-time flag selection in Clang for when you use it to link -fxray-instrument built/linked binaries. We currently do not change the C++ library dependency though (nor force one). Cheer -- Dean -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170315/c8a6bbd2/attachment.html>
Chandler Carruth via llvm-dev
2017-Mar-15 00:38 UTC
[llvm-dev] Use of the C++ standard library in XRay compiler-rt
On Tue, Mar 14, 2017 at 5:34 PM Dean Michael Berris via llvm-dev < llvm-dev at lists.llvm.org> wrote:> On 13 Mar 2017, at 15:39, David Blaikie <dblaikie at gmail.com> wrote: > > > One thing we rely on heavily on in the FDR mode implementation is C++'s > `thread_local` keyword. I'm not sure what that entails runtime-wise (does > it need pthreads? or something else?) but I'm sure a functional replacement > would be alright too. > > > No doubt we can find some common API for that, I'd guess tsan probably has > already had to figure out things like that. > > > I suspect I'm more worried about costs rather than APIs. I've learned a > while back that system-provided TLS functionality isn't as good as > compiler-provided (and system/platform-specific) TLS implementations. >I think what you want here (see David Chisnall's post as well) is essentially the GNU extension '__thread' which both Clang and GCC have strong support for. It works like thread_local except that it has the kind of restrictions David mentions so that it doesn't become a runtime burden. (Note that thread_local, if it can at all require dynamic initialization, ends up quite bad even when the constructor is "trivial". sadly....) -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170315/1246d381/attachment.html>
David Blaikie via llvm-dev
2017-Mar-15 01:19 UTC
[llvm-dev] Use of the C++ standard library in XRay compiler-rt
On Tue, Mar 14, 2017 at 5:34 PM Dean Michael Berris <dean.berris at gmail.com> wrote:> On 13 Mar 2017, at 15:39, David Blaikie <dblaikie at gmail.com> wrote: > > > > On Sun, Mar 12, 2017, 4:10 PM Dean Michael Berris <dean.berris at gmail.com> > wrote: > > > > On 9 Mar 2017, at 09:32, David Blaikie via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > > > I agree that we should clean up the standard library usage even just for > consistency. > > > > +1 -- now that I think about it, it should be fairly doable (also happy to > help with reviews if that helps). > > > Searching the xray directory for dependencies: > > ...compiler-rt/lib/xray % grep '#include <[^>.]*>' -oh `find . -type > f|grep -v 'tests'` | sort | uniq -c > > 1 #include <algorithm> > > 10 #include <atomic> > > 1 #include <bitset> > > 6 #include <cassert> > > 1 #include <cerrno> > > 1 #include <cstddef> > > 7 #include <cstdint> > > 2 #include <cstdio> > > 1 #include <cstdlib> > > 2 #include <cstring> > > 1 #include <deque> > > 2 #include <iterator> > > 2 #include <limits> > > 2 #include <memory> > > 4 #include <mutex> > > 1 #include <system_error> > > 1 #include <thread> > > 2 #include <tuple> > > 1 #include <unordered_map> > > 1 #include <unordered_set> > > 3 #include <utility> > > I think the biggest part is containers, and they are mostly in > ./xray_buffer_queue.h and ./xray_fdr_logging.cc. > > Yes, buffer_queue can definitely live without using system_error, > unordered_map, and unordered_set. It might make it a bit more complex (we'd > need to implement a correct and fairly efficient hash set) but if it means > the deployment model is simpler then I'm happy with that trade-off. When we > were implementing this, we made a decision to make it so that the "mismatch > of standard library implementations" was treated as a lower priority issue > -- something we don't think comes up as often, and is easily solvable by > re-building the runtime with the standard library the end > application/binary will be using anyway. Since XRay is only ever > statically-linked (we don't have a dynamic version of it), I thought the > rebuild option is slightly simpler than trying to implement the whole XRay > runtime in a constrained version of C++ and libc-only functions. > > > Except that's not how llvm is distributed. In releases it will ship with > the compiler and runtime libraries but can be used with any c++ standard > library. > > > That's fair. I was mostly thinking of the majority cases where I assumed > that the user would be using clang/llvm/compiler-rt with the > system-provided standard library anyway, and any exotic set-ups (not system > libraries used) would require more work from the user anyway. >Not much work - Clang discovers existing GCC installations and picks the newest. So at the very least, if a user installs a newer GCC, they get a newer libstdc++ and Clang implicitly picks that up without any further work. Also Clang distribution might be available for several platforms which may have different system compilers/std libraries, the distribution is statically linked - so libstdc++ doesn't have to match or be present on the destination system, I think?> > This isn't a quality if implementation thing, this is more a correctness > issue. > > > I'm less convinced of that, mostly because I've assumed that users using > libraries and runtimes will be aware of the potential ABI incompatibilities > that arise when using different standard libraries to build and then link. >But they won't be aware that they're mismatched - they take the LLVM distribution, compile their code on a system with a libstdc++ different from the one on the system where the LLVM distribution was built... and badness.> But if the constraints are different for runtimes in compiler-rt, then I'm > alright with abiding by those. :) >*nod* so far as I know, the runtimes have historically not depended on libstdc++ for these reasons (& the added bonus that the sanitizers need to sit below such libraries)> > > One thing we rely on heavily on in the FDR mode implementation is C++'s > `thread_local` keyword. I'm not sure what that entails runtime-wise (does > it need pthreads? or something else?) but I'm sure a functional replacement > would be alright too. > > > No doubt we can find some common API for that, I'd guess tsan probably has > already had to figure out things like that. > > > I suspect I'm more worried about costs rather than APIs. I've learned a > while back that system-provided TLS functionality isn't as good as > compiler-provided (and system/platform-specific) TLS implementations. > > > > > > > > (this came up for me due to what's probably a bug in the way compiler-rt > is built - where the lib itself is built with the host compiler but the > tests are built/linked with the just-bulit clang. My host compiler uses > libstdc++ 6, whereas the just-built clang will use libstdc++ 4.8. So it > fails to link due to this mismatch) > > How hard would it be to fix that bug? > > > I've started a separate thread on that, but even if that's fixed it's > still necessary to fix the dependency/distribution model here. > > > Agreed. > > Cheers > > -- Dean > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170315/e37c30e3/attachment.html>