David Blaikie via llvm-dev
2017-Mar-08 22:32 UTC
[llvm-dev] Use of the C++ standard library in XRay compiler-rt
On Wed, Mar 8, 2017 at 2:28 PM Tim Shen <timshen at google.com> wrote:

> On Wed, Mar 8, 2017 at 1:49 PM David Blaikie <dblaikie at gmail.com> wrote:
>
>> So I stumbled across an issue that I think is a bit fundamental:
>>
>> The xray runtime uses the C++ standard library.
>>
>> This seems like a problem because whatever C++ standard library is used to
>> compile the XRay runtime may not be the same as the C++ standard library
>> (if any) that is used to build the target application and link XRay into.
>>
>> Does this make sense? Is this a problem?
>>
>> Talking to Chandler over lunch it sounds like there's a couple of options
>> - either remove the dependency (much like, I believe, the sanitizer
>> runtimes - use nothing from the C++ standard library, replace everything
>> with custom data structures, etc) or, perhaps more drastically, change the
>> way the runtimes are built such that they statically link a private version
>> of, say, libc++.
>
> What's the reason for not statically linking a C++ standard library into the
> sanitizer runtimes, going back to when they were created?

Not sure - Evgeniy (cc'd) might know. Partly perhaps the development cost of having to isolate that statically linked library from colliding with any other (some kind of mangling scheme would have to be used, I think? to avoid such a collision).

>> Chandler seemed to think maybe we could do this state-side (Tim? Might be
>> something you could handle) rather than pushing it back on to Dean, if that
>> sounds reasonable?
>
> I believe that "state-side" is LLVM team side?

Right, yes, sorry.

> I agree that we should clean up the standard library usage even just for
> consistency.
>
> Searching the xray directory for dependencies:
> ...compiler-rt/lib/xray % grep '#include <[^>.]*>' -oh `find . -type f|grep -v 'tests'` | sort | uniq -c
>       1 #include <algorithm>
>      10 #include <atomic>
>       1 #include <bitset>
>       6 #include <cassert>
>       1 #include <cerrno>
>       1 #include <cstddef>
>       7 #include <cstdint>
>       2 #include <cstdio>
>       1 #include <cstdlib>
>       2 #include <cstring>
>       1 #include <deque>
>       2 #include <iterator>
>       2 #include <limits>
>       2 #include <memory>
>       4 #include <mutex>
>       1 #include <system_error>
>       1 #include <thread>
>       2 #include <tuple>
>       1 #include <unordered_map>
>       1 #include <unordered_set>
>       3 #include <utility>
> I think the biggest part is containers, and they are mostly
> in ./xray_buffer_queue.h and ./xray_fdr_logging.cc.
>
> dependencies without buffer queue and fdr logging:
> ...compiler-rt/lib/xray % grep '#include <[^>.]*>' -oh `find . -type f|egrep -v 'tests|buffer|fdr'` | sort | uniq -c
>       9 #include <atomic>
>       4 #include <cassert>
>       1 #include <cerrno>
>       1 #include <cstddef>
>       6 #include <cstdint>
>       2 #include <cstdio>
>       1 #include <cstring>
>       2 #include <iterator>
>       2 #include <limits>
>       1 #include <memory>
>       3 #include <mutex>
>       1 #include <thread>
>       2 #include <tuple>
>       2 #include <utility>
> I believe that this is relatively easy to clean up. I can do that.
>
> I don't know how hard it is to rewrite buffer queue and fdr logging using
> compiler_rt infrastructure.

I think buffer_queue's probably sufficiently well bounded that it shouldn't be drastically hard to replace it with a custom implementation. Haven't looked at fdr_logging.

Maps/dictionary-like things might be a bit of a pain in particular. Not sure if the sanitizers already have some reusable idioms/libraries for that.

I'm also not really clear on where the boundary is - which headers or language features ('new'?) can be used, and which can't. Can't say I've ever tried to make code library agnostic.

>> (this came up for me due to what's probably a bug in the way compiler-rt
>> is built - where the lib itself is built with the host compiler but the
>> tests are built/linked with the just-built clang. My host compiler uses
>> libstdc++ 6, whereas the just-built clang will use libstdc++ 4.8. So it
>> fails to link due to this mismatch)
Evgenii Stepanov via llvm-dev
2017-Mar-08 23:12 UTC
[llvm-dev] Use of the C++ standard library in XRay compiler-rt
On Wed, Mar 8, 2017 at 2:32 PM, David Blaikie <dblaikie at gmail.com> wrote:

> On Wed, Mar 8, 2017 at 2:28 PM Tim Shen <timshen at google.com> wrote:
>>
>> What's the reason of not static-linking a C++ standard library for
>> sanitizer runtimes back to when it was created?
>
> Not sure - Evgeniy (cc'd) might know. Partly perhaps the development cost of
> having to isolate that statically linked library from colliding with any
> other (some kind of mangling scheme would have to be used, I think? to avoid
> such a collision).

This. But we also want to avoid libc++ calling libc, because we may be inside a libc interceptor. Sanitizer_common stuff mainly uses internal_* implementations and raw system calls.

Building such an isolated library is hard, especially if it has to be a static library - then you need to use either a relocatable link (which is buggy) or LTO (which was in bad shape back then).

We do something like this for the symbolizer (see lib/sanitizer_common/symbolizer/scripts/build_symbolizer.sh), but not by default, and it is not integrated in the build system properly.
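A minimal sketch of what "internal_* implementations" can look like in practice, for readers unfamiliar with the idea: small helpers that do their work without calling into libc, so they stay safe to use from inside an interceptor. The `sketch_` names and the `uptr` alias are hypothetical and are not taken from sanitizer_common; only the general approach follows the message above.

```cpp
// Hypothetical helpers; the sketch_ prefix marks them as illustrative only.
// No standard headers are included on purpose.
using uptr = decltype(sizeof(0)); // size type without <cstddef>

// Byte-wise copy that does not call libc's memcpy.
inline void *sketch_memcpy(void *dest, const void *src, uptr n) {
  unsigned char *d = static_cast<unsigned char *>(dest);
  const unsigned char *s = static_cast<const unsigned char *>(src);
  for (uptr i = 0; i < n; ++i)
    d[i] = s[i];
  return dest;
}

// Length of a NUL-terminated string, again without calling libc.
inline uptr sketch_strlen(const char *s) {
  uptr n = 0;
  while (s[n] != '\0')
    ++n;
  return n;
}
```

An optimizing compiler may still turn such loops back into calls to `memcpy`, so a real runtime would presumably also need build flags such as `-fno-builtin`; that detail is an assumption here, not something stated in the thread.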
Dean Michael Berris via llvm-dev
2017-Mar-12 23:10 UTC
[llvm-dev] Use of the C++ standard library in XRay compiler-rt
> On 9 Mar 2017, at 09:32, David Blaikie via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> I agree that we should clean up the standard library usage even just for consistency.

+1 -- now that I think about it, it should be fairly doable (also happy to help with reviews if that helps).

> I think the biggest part is containers, and they are mostly in ./xray_buffer_queue.h and ./xray_fdr_logging.cc.

Yes, buffer_queue can definitely live without using system_error, unordered_map, and unordered_set. It might make it a bit more complex (we'd need to implement a correct and fairly efficient hash set) but if it means the deployment model is simpler then I'm happy with that trade-off.

When we were implementing this, we decided to treat the "mismatch of standard library implementations" as a lower priority issue -- something we don't think comes up as often, and is easily solvable by re-building the runtime with the standard library the end application/binary will be using anyway. Since XRay is only ever statically-linked (we don't have a dynamic version of it), I thought the rebuild option was slightly simpler than trying to implement the whole XRay runtime in a constrained version of C++ and libc-only functions.

> I don't know how hard it is to rewrite buffer queue and fdr logging using compiler_rt infrastructure.

The crucial things we need in FDR mode logging are:

- Aligned storage (I suspect this could be done without std::aligned_storage<...>)
- memcpy (we use std::memcpy there, but probably didn't need to)
- thread-local storage (using C++'s `thread_local` keyword)

The buffer queue can be re-written to not use std::system_error in the APIs (using XRay-specific enums instead) and to not use std::atomic<...> types internally; we would also have to implement the FIFO queue, a lookup table for ownership, etc. -- while none of these seem hard, they didn't seem worth it at the time of implementation.

> I'm also not really clear on where the boundary is - which headers or language features ('new'?) can be used, and which can't. Can't say I've ever tried to make code library agnostic.

One thing we rely heavily on in the FDR mode implementation is C++'s `thread_local` keyword. I'm not sure what that entails runtime-wise (does it need pthreads? or something else?) but I'm sure a functional replacement would be alright too.

> (this came up for me due to what's probably a bug in the way compiler-rt is built - where the lib itself is built with the host compiler but the tests are built/linked with the just-built clang. My host compiler uses libstdc++ 6, whereas the just-built clang will use libstdc++ 4.8. So it fails to link due to this mismatch)

How hard would it be to fix that bug? I would think this error would come up in the wild too, but could be remedied by just rebuilding XRay with the right C++ standard library.

-- Dean
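As a rough illustration of the buffer queue rewrite described above (library-specific enums instead of std::system_error, and a hand-written FIFO instead of a standard container), here is a minimal sketch. Every name in it (`QueueError`, `Buffer`, `FixedBufferQueue`) is hypothetical and not taken from the XRay sources. It is deliberately single-threaded; the ownership lookup table and any locking or atomics are left out.

```cpp
// Hypothetical sketch: none of these names come from the XRay sources.
// No standard headers are included on purpose.
using usize = decltype(sizeof(0)); // size type without <cstddef>

// Error codes specific to this queue, standing in for std::error_code.
enum class QueueError { Ok, QueueFull, QueueEmpty };

struct Buffer {
  void *Data;
  usize Size;
};

// Fixed-capacity FIFO implemented as a ring buffer over a plain array.
// Not thread-safe: a real runtime would guard it with its own lock.
template <usize Capacity> class FixedBufferQueue {
  Buffer Slots[Capacity];
  usize Head = 0;  // index of the oldest element
  usize Count = 0; // number of stored elements

public:
  // Appends B at the tail, or reports QueueFull without modifying the queue.
  QueueError push(const Buffer &B) {
    if (Count == Capacity)
      return QueueError::QueueFull;
    Slots[(Head + Count) % Capacity] = B;
    ++Count;
    return QueueError::Ok;
  }

  // Removes the oldest element into Out, or reports QueueEmpty.
  QueueError pop(Buffer &Out) {
    if (Count == 0)
      return QueueError::QueueEmpty;
    Out = Slots[Head];
    Head = (Head + 1) % Capacity;
    --Count;
    return QueueError::Ok;
  }

  usize size() const { return Count; }
};
```

Callers check the returned enum rather than catching or inspecting a standard-library error object, which keeps the queue's interface free of C++ standard library types.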
David Blaikie via llvm-dev
2017-Mar-13 04:39 UTC
[llvm-dev] Use of the C++ standard library in XRay compiler-rt
On Sun, Mar 12, 2017, 4:10 PM Dean Michael Berris <dean.berris at gmail.com> wrote:

> When we were implementing this, we made a decision to make it so that the
> "mismatch of standard library implementations" was treated as a lower
> priority issue -- something we don't think comes up as often, and is easily
> solvable by re-building the runtime with the standard library the end
> application/binary will be using anyway. Since XRay is only ever
> statically-linked (we don't have a dynamic version of it), I thought the
> rebuild option is slightly simpler than trying to implement the whole XRay
> runtime in a constrained version of C++ and libc-only functions.

Except that's not how LLVM is distributed. In releases it will ship with the compiler and runtime libraries but can be used with any C++ standard library. This isn't a quality of implementation thing, this is more a correctness issue.

> One thing we rely on heavily on in the FDR mode implementation is C++'s
> `thread_local` keyword. I'm not sure what that entails runtime-wise (does
> it need pthreads? or something else?) but I'm sure a functional replacement
> would be alright too.

No doubt we can find some common API for that, I'd guess tsan probably has already had to figure out things like that.

> How hard would it be to fix that bug?

I've started a separate thread on that, but even if that's fixed it's still necessary to fix the dependency/distribution model here.
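On the `thread_local` question, one possible functional replacement is the GCC/Clang `__thread` extension, sketched below. Unlike `thread_local`, `__thread` only accepts constant initializers and trivially destructible types, so the compiler emits no per-thread initialization or destructor registration (the part of `thread_local` that can require C++ runtime support). `ThreadState` and the two functions are hypothetical names, and whether this fits XRay's actual needs is an assumption.

```cpp
// Hypothetical sketch; ThreadState and the function names are illustrative.
// No standard headers are included on purpose.
struct ThreadState {
  unsigned long EventCount; // events recorded by the current thread
  void *CurrentBuffer;      // buffer the current thread is writing into
};

// `__thread` is a GCC/Clang extension. It requires a constant initializer and
// a trivially destructible type, so no dynamic initialization is generated.
static __thread ThreadState State = {0, nullptr};

// Increments and returns the calling thread's own event counter.
inline unsigned long recordEvent() { return ++State.EventCount; }

// Reads the calling thread's event counter without modifying it.
inline unsigned long eventCount() { return State.EventCount; }
```

Each thread sees its own zero-initialized `State`, so `recordEvent()` needs no locking. State that needs a constructor or cleanup at thread exit would need a different mechanism.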