> On 2 Sep 2016, at 11:18, Dmitry Vyukov via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > On Thu, Sep 1, 2016 at 2:30 PM, Kuba Brecka <kuba.brecka at gmail.com> wrote: >> Hi, >> >> I'm trying to write a TSan interceptor for the C++11 call_once function. There are currently false positive reports, because the inner __call_once function is located in the (non-instrumented) libcxx library, and on macOS we can't expect the users to build their own instrumented libcxx. >> >> TSan already supports pthread_once and dispatch_once by having interceptors that re-implement the logic. However, doing the same for call_once/__call_once doesn't work, because call_once is explicitly supposed to be exception-safe, but the sanitizer runtime libraries disallow exception handling. >> >> Any ideas how to handle call_once in TSan? > > Does anybody remember exact reasons we disable exceptions in sanitizer > runtimes? One is that it won't link with C programs. Are there any > other? > If C is the only reason: there is already a part of tsan runtime is > linked only to C++ programs (it contains operator new/delete > interceptors). We could add additional files to the cxx part of > runtime and build them with exceptions. > > Alternatively, the interceptor can handle only synchronization but > forward actual logic to the real function. Along the lines of: > > INTERCEPTOR(call_once, o) { > __tsan_acquire_release(o); > REAL(call_once)(o); > } > > That will have some performance impact. If we hardcode the "fully > initialized" value, then we can eliminate the additional overhead: > > INTERCEPTOR(call_once, o) { > if (__atomic_load(o, acquire) == FULLY_INITIALIZED) { > __tsan_acquire(o); > return; > } > __tsan_acquire_release(o); > REAL(call_once)(o); > }Unfortunately, the first fast-path check is inlined and cannot be intercepted. We can only intercept the inner call to __call_once. But how would __tsan_acquire_release help here? The issue is that we need to perform the release *after* user code has run, but before the "o" flag is changed. Otherwise, TSan will still see a false positive where one thread has already run user code, and another thread already sees that call_once is finished, but the acquire has no release to pair with. Kuba
On Fri, Sep 2, 2016 at 12:09 PM, Kuba Brecka <kuba.brecka at gmail.com> wrote:> >> On 2 Sep 2016, at 11:18, Dmitry Vyukov via llvm-dev <llvm-dev at lists.llvm.org> wrote: >> >> On Thu, Sep 1, 2016 at 2:30 PM, Kuba Brecka <kuba.brecka at gmail.com> wrote: >>> Hi, >>> >>> I'm trying to write a TSan interceptor for the C++11 call_once function. There are currently false positive reports, because the inner __call_once function is located in the (non-instrumented) libcxx library, and on macOS we can't expect the users to build their own instrumented libcxx. >>> >>> TSan already supports pthread_once and dispatch_once by having interceptors that re-implement the logic. However, doing the same for call_once/__call_once doesn't work, because call_once is explicitly supposed to be exception-safe, but the sanitizer runtime libraries disallow exception handling. >>> >>> Any ideas how to handle call_once in TSan? >> >> Does anybody remember exact reasons we disable exceptions in sanitizer >> runtimes? One is that it won't link with C programs. Are there any >> other? >> If C is the only reason: there is already a part of tsan runtime is >> linked only to C++ programs (it contains operator new/delete >> interceptors). We could add additional files to the cxx part of >> runtime and build them with exceptions. >> >> Alternatively, the interceptor can handle only synchronization but >> forward actual logic to the real function. Along the lines of: >> >> INTERCEPTOR(call_once, o) { >> __tsan_acquire_release(o); >> REAL(call_once)(o); >> } >> >> That will have some performance impact. If we hardcode the "fully >> initialized" value, then we can eliminate the additional overhead: >> >> INTERCEPTOR(call_once, o) { >> if (__atomic_load(o, acquire) == FULLY_INITIALIZED) { >> __tsan_acquire(o); >> return; >> } >> __tsan_acquire_release(o); >> REAL(call_once)(o); >> } > > Unfortunately, the first fast-path check is inlined and cannot be intercepted. We can only intercept the inner call to __call_once. But how would __tsan_acquire_release help here? The issue is that we need to perform the release *after* user code has run, but before the "o" flag is changed. Otherwise, TSan will still see a false positive where one thread has already run user code, and another thread already sees that call_once is finished, but the acquire has no release to pair with.Will then the following work? INTERCEPTOR(call_once, o) { REAL(call_once)(o); __tsan_release_merge(o); __tsan_acquire(o); }
> On 2 Sep 2016, at 12:11, Dmitry Vyukov <dvyukov at google.com> wrote: > > On Fri, Sep 2, 2016 at 12:09 PM, Kuba Brecka <kuba.brecka at gmail.com> wrote: >> >>> On 2 Sep 2016, at 11:18, Dmitry Vyukov via llvm-dev <llvm-dev at lists.llvm.org> wrote: >>> >>> On Thu, Sep 1, 2016 at 2:30 PM, Kuba Brecka <kuba.brecka at gmail.com> wrote: >>>> Hi, >>>> >>>> I'm trying to write a TSan interceptor for the C++11 call_once function. There are currently false positive reports, because the inner __call_once function is located in the (non-instrumented) libcxx library, and on macOS we can't expect the users to build their own instrumented libcxx. >>>> >>>> TSan already supports pthread_once and dispatch_once by having interceptors that re-implement the logic. However, doing the same for call_once/__call_once doesn't work, because call_once is explicitly supposed to be exception-safe, but the sanitizer runtime libraries disallow exception handling. >>>> >>>> Any ideas how to handle call_once in TSan? >>> >>> Does anybody remember exact reasons we disable exceptions in sanitizer >>> runtimes? One is that it won't link with C programs. Are there any >>> other? >>> If C is the only reason: there is already a part of tsan runtime is >>> linked only to C++ programs (it contains operator new/delete >>> interceptors). We could add additional files to the cxx part of >>> runtime and build them with exceptions. >>> >>> Alternatively, the interceptor can handle only synchronization but >>> forward actual logic to the real function. Along the lines of: >>> >>> INTERCEPTOR(call_once, o) { >>> __tsan_acquire_release(o); >>> REAL(call_once)(o); >>> } >>> >>> That will have some performance impact. If we hardcode the "fully >>> initialized" value, then we can eliminate the additional overhead: >>> >>> INTERCEPTOR(call_once, o) { >>> if (__atomic_load(o, acquire) == FULLY_INITIALIZED) { >>> __tsan_acquire(o); >>> return; >>> } >>> __tsan_acquire_release(o); >>> REAL(call_once)(o); >>> } >> >> Unfortunately, the first fast-path check is inlined and cannot be intercepted. We can only intercept the inner call to __call_once. But how would __tsan_acquire_release help here? The issue is that we need to perform the release *after* user code has run, but before the "o" flag is changed. Otherwise, TSan will still see a false positive where one thread has already run user code, and another thread already sees that call_once is finished, but the acquire has no release to pair with. > > > > Will then the following work? > > INTERCEPTOR(call_once, o) { > REAL(call_once)(o); > __tsan_release_merge(o); > __tsan_acquire(o); > }Still racy. Suppose thread A is still inside REAL(call_once), but already after it has run user code and updated "o" to ~0. Thread B load-acquires "o", finds ~0, assumes it's fully initialized, keeps going, but user code stores hasn't been properly published. Kuba