thr3ads.net - llvm dev - [LLVMdev] radr://12777299, "potential pthread/eh bug exposed by libsanitizer" [Dec 2012]

Alexander Potapenko

2012-Nov-30 18:32 UTC

[LLVMdev] radr://12777299, "potential pthread/eh bug exposed by libsanitizer"

No, we are not going to use mach_inject. This isn't portable and may
be even harder to set up than mach_override.
The new ASan runtime will use the dylib interposition and will in fact
require DYLD_INSERT_LIBRARIES to work. However ASan already handles it
correctly itself: if the corresponding env var is missing the app is
just re-execed.
Dylib interposition is supported by Apple and should work on iOS as
well as Mac OS. It will also probably simplify hooking the memory
allocations in ASan, which is now very tricky.
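
The re-exec step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual ASan runtime code: the helper names `insert_list_contains` and `maybe_reexec`, the fixed-size buffer, and the exact-match comparison of list entries are all hypothetical. It relies only on the documented fact that DYLD_INSERT_LIBRARIES is a colon-separated list of dylib paths.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Returns true if `dylib` already appears as one of the colon-separated
   entries of `env_value` (the current DYLD_INSERT_LIBRARIES, or NULL).
   Hypothetical helper: entries are compared as exact strings. */
bool insert_list_contains(const char *env_value, const char *dylib) {
  if (env_value == NULL) return false;
  size_t len = strlen(dylib);
  for (const char *p = env_value; *p != '\0';) {
    const char *end = strchr(p, ':');
    size_t n = end ? (size_t)(end - p) : strlen(p);
    if (n == len && strncmp(p, dylib, len) == 0) return true;
    p += n;
    if (*p == ':') ++p;
  }
  return false;
}

/* If the runtime dylib is not listed yet, append it to the variable and
   replace the process image with a fresh copy of the same program.
   Returns 0 when no re-exec was needed, -1 when the re-exec failed. */
int maybe_reexec(const char *exe_path, char *const argv[], const char *dylib) {
  const char *cur = getenv("DYLD_INSERT_LIBRARIES");
  if (insert_list_contains(cur, dylib)) return 0; /* already interposed */

  char buf[4096];
  int n = (cur && *cur) ? snprintf(buf, sizeof buf, "%s:%s", cur, dylib)
                        : snprintf(buf, sizeof buf, "%s", dylib);
  if (n < 0 || (size_t)n >= sizeof buf) return -1;
  if (setenv("DYLD_INSERT_LIBRARIES", buf, 1) != 0) return -1;

  execv(exe_path, argv); /* the new image inherits the updated environment */
  return -1;             /* execv only returns on failure */
}
```

A caller would invoke `maybe_reexec` very early at startup, before any interposed function is expected to be active. Because the second run finds the dylib in the list, the check cannot loop.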

On Fri, Nov 30, 2012 at 6:56 AM, Jack Howarth <howarth at bromo.med.uc.edu> wrote:
> On Fri, Nov 30, 2012 at 01:41:05PM +0400, Kostya Serebryany wrote:
>> Just want to remind everyone that we plan to stop using mach_override in
>> asan in favor of OSX's native function interposition.
>> So, we probably don't want to spend too much effort fixing mach_override.
>>
>> --kcc
>
> Kostya,
>     Is the native function interposition that is being adopted based on...
>
> https://github.com/rentzsch/mach_inject
>
> ? I assume that any method used will be transparent to the user and not require
> manually setting DYLD_INSERT_LIBRARIES, correct?
>           Jack
>
>>
>> On Fri, Nov 30, 2012 at 4:46 AM, Alexander Potapenko <glider at google.com> wrote:
>>
>> > Looks like this happens on x86_64 because the position of __cxa_throw
>> > is too far from the allocated branch island (should be <2G). This can
>> > be solved by allocating the branch islands somewhere near the text
>> > segment (look for kIslandEnd in asan_mac.cc, this is currently
>> > 0x7fffffdf0000) or by patching the function with a longer instruction
>> > sequence that stores the jump target in a register and jumps to that
>> > target (which is a bit more complex to implement).
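
The 2G limit in the paragraph above comes from the 5-byte JMP rel32 encoding, whose signed 32-bit displacement is counted from the end of the instruction. The sketch below checks whether a target is reachable and shows where a blindly truncated displacement would land. The function names are hypothetical, and the island address used in the usage note (0x7fffffd27000) is an assumed value picked just below the quoted kIslandEnd; the thread does not state the real island address.

```c
#include <stdbool.h>
#include <stdint.h>

enum { kJmpRel32Len = 5 }; /* opcode 0xE9 plus a 32-bit displacement */

/* Exact displacement a JMP rel32 placed at `from` needs to reach `to`. */
int64_t jmp_displacement(uint64_t from, uint64_t to) {
  return (int64_t)(to - (from + kJmpRel32Len));
}

/* True if the displacement can be encoded in a signed 32-bit field. */
bool fits_rel32(uint64_t from, uint64_t to) {
  int64_t d = jmp_displacement(from, to);
  return d >= INT32_MIN && d <= INT32_MAX;
}

/* Address the CPU would jump to if the displacement were truncated to
   32 bits without a range check (the field is sign-extended again). */
uint64_t truncated_jmp_target(uint64_t from, uint64_t to) {
  int32_t d32 = (int32_t)(uint32_t)jmp_displacement(from, to);
  return from + kJmpRel32Len + (uint64_t)(int64_t)d32;
}
```

With `from = 0x102244870` (the __cxa_throw address in the gdb session quoted further down) and the assumed island address, `fits_rel32` returns false and `truncated_jmp_target` yields 0xffd27000. That matches the garbage jump target shown in the gdb output, although the thread itself does not confirm that truncation is what happened.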
>> >
>> > Once this problem is fixed, another one is going to arise. This is
>> > what the first bytes of __cxa_throw look like:
>> >
>> > 0x0020c49ba5d916e0 <__cxa_throw+0>: lea    0xb4f01(%rip),%rax        # 0x20c49ba5e465e8 <_ZN10__cxxabiv120__unexpected_handlerE>
>> > 0x0020c49ba5d916e7 <__cxa_throw+7>: push   %rbx
>> > 0x0020c49ba5d916e8 <__cxa_throw+8>: lea    -0x20(%rdi),%rbx
>> >
>> > If we move the relative LEA instruction somewhere, we must fix the
>> > constant in order to keep it pointing to the same address.
>> > mach_override already does this for relative CALL and JMP
>> > instructions, but not for LEA. This should be fairly simple to fix.
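
The displacement fix-up described above is simple arithmetic: a RIP-relative operand is resolved against the address of the next instruction, so moving the instruction by some delta means adjusting the stored constant by the opposite delta. The sketch below is an illustration only, not mach_override code. The function names are hypothetical, and it assumes the instruction keeps the same length after the move.

```c
#include <stdint.h>

/* Absolute address referenced by a RIP-relative instruction: the
   displacement is added to the address of the *next* instruction. */
uint64_t rip_relative_target(uint64_t insn_addr, unsigned insn_len,
                             int32_t disp) {
  return insn_addr + insn_len + (uint64_t)(int64_t)disp;
}

/* Computes the displacement to store after moving an instruction from
   old_addr to new_addr so that it still references the same target.
   Returns 0 on success, -1 if the new displacement does not fit in
   32 bits (the instruction was moved too far away). */
int relocate_disp32(uint64_t old_addr, uint64_t new_addr, int32_t old_disp,
                    int32_t *new_disp) {
  int64_t delta = (int64_t)(old_addr - new_addr);
  /* Reject huge moves early; this also keeps the sum below from
     overflowing. */
  if (delta > (INT64_C(1) << 33) || delta < -(INT64_C(1) << 33)) return -1;
  int64_t adjusted = (int64_t)old_disp + delta;
  if (adjusted < INT32_MIN || adjusted > INT32_MAX) return -1;
  *new_disp = (int32_t)adjusted;
  return 0;
}
```

For the 7-byte LEA quoted above, `rip_relative_target(0x0020c49ba5d916e0, 7, 0xb4f01)` gives 0x20c49ba5e465e8, the address gdb prints in its comment. The failure return also shows why this fix depends on the first one: if the island is more than 2G away, no 32-bit displacement can reach the original target.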
>> >
>> > Note that the 32-bit variant crashes on another invalid address:
>> >
>> > ASAN:SIGSEGV
>> > ================================================================
>> > ==89768== ERROR: AddressSanitizer: SEGV on unknown address 0xcccccccc
>> > (pc 0x00061f8c sp 0xbffa8bd0 bp 0xbffa8cc8 T0)
>> > AddressSanitizer can not provide additional info.
>> >     #0 0x61f8b (/Users/glider/src/gcc-asan/inst/lib/i386/libstdc++.6.dylib+0x3f8b)
>> >     #1 0x91391724 (/usr/lib/system/libdyld.dylib+0x2724)
>> >     #2 0x0
>> > Stats: 0M malloced (0M for red zones) by 3 calls
>> > Stats: 0M realloced by 0 calls
>> > Stats: 0M freed by 0 calls
>> > Stats: 0M really freed by 0 calls
>> > Stats: 1M (256 full pages) mmaped in 2 calls
>> >   mmaps   by size class: 7:4095; 8:2047;
>> >   mallocs by size class: 7:1; 8:2;
>> >   frees   by size class:
>> >   rfrees  by size class:
>> > Stats: malloc large: 0 small slow: 2
>> > ==89768== ABORTING
>> >
>> > My guess is that this is caused by the following code being moved
>> > to a branch island:
>> >
>> > Dump of assembler code for function __cxa_throw:
>> > 0x00008f60 <__cxa_throw+0>: push   %esi
>> > 0x00008f61 <__cxa_throw+1>: push   %ebx
>> > 0x00008f62 <__cxa_throw+2>: call   0x7a60 <__x86.get_pc_thunk.bx>
>> >
>> > Perhaps this makes __x86.get_pc_thunk.bx return an incorrect value.
>> >
>> > Since libstdc++-v3 is built together with gcc, the two issues related
>> > to instructions being moved to another place can be solved by padding
>> > __cxa_throw() with five NOP instructions (enough to hold a JMP). I
>> > believe this should be acceptable, because the performance penalty for
>> > additional NOPs is negligible, and __cxa_throw() isn't a hot spot.
>> >
>> > On Thu, Nov 29, 2012 at 1:01 PM, Nick Kledzik <kledzik at apple.com> wrote:
>> > > I debugged this a bit and it seems the mach_override patching of
>> > > __cxa_throw is bogus.  The start of that function is patched to jump to
>> > > garbage.
>> > >
>> > > Breakpoint 1, 0x0000000100001c19 in main ()
>> > > (gdb) display/i $pc
>> > > 2: x/i $pc  0x100001c19 <main+318>:     callq  0x100016386 <dyld_stub___cxa_throw>
>> > > (gdb) si
>> > > 0x0000000100016386 in dyld_stub___cxa_throw ()
>> > > 2: x/i $pc  0x100016386 <dyld_stub___cxa_throw>:        jmpq   *0xae1c(%rip)        # 0x1000211a8
>> > > (gdb)
>> > > 0x0000000102244870 in __cxa_throw ()
>> > > 2: x/i $pc  0x102244870 <__cxa_throw>:  jmpq   0xffd27000
>> > > (gdb)  # the above is __cxa_throw in gcc's libstdc++.6.dylib.  The
>> > > first instruction has been patched to jump to a garbage address.
>> > >
>> > > (gdb) x/8i 0x102244870-8
>> > > 0x102244868 <_ZL23__gxx_exception_cleanup19_Unwind_Reason_CodeP17_Unwind_Exception+56>: std
>> > > 0x102244869 <_ZL23__gxx_exception_cleanup19_Unwind_Reason_CodeP17_Unwind_Exception+57>: (bad)
>> > > 0x10224486a <_ZL23__gxx_exception_cleanup19_Unwind_Reason_CodeP17_Unwind_Exception+58>: decl   (%rdi)
>> > > 0x10224486c <_ZL23__gxx_exception_cleanup19_Unwind_Reason_CodeP17_Unwind_Exception+60>: (bad)
>> > > 0x10224486d <_ZL23__gxx_exception_cleanup19_Unwind_Reason_CodeP17_Unwind_Exception+61>: add    %r8b,(%rax)
>> > > 0x102244870 <__cxa_throw>:      jmpq   0xffd27000
>> > > 0x102244875 <__cxa_throw+5>:    or     (%rax),%eax
>> > > 0x102244877 <__cxa_throw+7>:    push   %rbx
>> > > (gdb)
>> > > (gdb) watch *0x102244870
>> > > Hardware watchpoint 2: *4330899568
>> > > (gdb) r
>> > >
>> > > Old value = -788165304
>> > > New value = -1373139991
>> > > 0x0000000100016203 in __asan_mach_override_ptr_custom ()
>> > > (gdb) bt
>> > > #0  0x0000000100016203 in __asan_mach_override_ptr_custom ()
>> > > #1  0x0000000100015a9e in __interception::OverrideFunction ()
>> > > #2  0x00007fff5fc13378 in ImageLoaderMachO::doModInitFunctions ()
>> > > #3  0x00007fff5fc13762 in ImageLoaderMachO::doInitialization ()
>> > > #4  0x00007fff5fc1006e in ImageLoader::recursiveInitialization ()
>> > > #5  0x00007fff5fc0feba in ImageLoader::runInitializers ()
>> > > #6  0x00007fff5fc01fc0 in dyld::initializeMainExecutable ()
>> > > #7  0x00007fff5fc05b04 in dyld::_main ()
>> > > #8  0x00007fff5fc01397 in dyldbootstrap::start ()
>> > > #9  0x00007fff5fc0105e in _dyld_start ()
>> > > (gdb) x/8i 0x102244870
>> > > 0x102244870 <__cxa_throw>:      jmpq   0xffd27000
>> > > 0x102244875 <__cxa_throw+5>:    or     (%rax),%eax
>> > > 0x102244877 <__cxa_throw+7>:    push   %rbx
>> > > 0x102244878 <__cxa_throw+8>:    lea    -0x20(%rdi),%rbx
>> > > 0x10224487c <__cxa_throw+12>:   mov    %rsi,-0x70(%rdi)
>> > > # Here is where the patching is being done
>> > >
>> > > -Nick
>> > >
>> > > On Nov 29, 2012, at 11:07 AM, Alexander Potapenko wrote:
>> > >>> On Thu, Nov 29, 2012 at 9:55 PM, Jack Howarth <howarth at bromo.med.uc.edu>
>> > >>> wrote:
>> > >>>>
>> > >>>> Nick,
>> > >>>>   Can you take a quick look at the asan_eh_bug.tar.bz testcase
>> > >>>> I uploaded into the newly opened radr://12777299, "potential
>> > >>>> pthread/eh bug exposed by libsanitizer". The FSF gcc developers
>> > >>>> have ported llvm.org's asan code into FSF gcc (and are keeping
>> > >>>> it synced to the upstream llvm.org code). I have been helping
>> > >>>> with the darwin build and testing -fsanitize=address against the
>> > >>>> complete FSF gcc testsuite. This seems to have exposed a potential
>> > >>>> bug in pthread or eh on darwin under libasan. Hundreds of test cases
>> > >>>> in the g++ and libstdc++ testsuites fail under -fsanitize=address
>> > >>>> in the following manner...
>> > >>>>
>> > >>>> ASAN:SIGSEGV
>> > >>>> ================================================================
>> > >>>> ==2738== ERROR: AddressSanitizer: SEGV on unknown address 0x0000ffd27000
>> > >>>> (pc 0x0000ffd27000 sp 0x7fff55e40828 bp 0x7fff55e408f0 T0)
>> > >>>> AddressSanitizer can not provide additional info.
>> > >>>>    #0 0xffd26fff (/Users/howarth/asan_eh_bug/./cond1_asan.exe+0xf5f67fff)
>> > >>>>    #1 0x7fff8bd827e0 (/usr/lib/system/libdyld.dylib+0x27e0)
>> > >>>>    #2 0x0
>> > >>>> Stats: 0M malloced (0M for red zones) by 3 calls
>> > >>>> Stats: 0M realloced by 0 calls
>> > >>>> Stats: 0M freed by 0 calls
>> > >>>> Stats: 0M really freed by 0 calls
>> > >>>> Stats: 1M (384 full pages) mmaped in 3 calls
>> > >>>>  mmaps   by size class: 7:4095; 8:2047; 9:1023;
>> > >>>>  mallocs by size class: 7:1; 8:1; 9:1;
>> > >>>>  frees   by size class:
>> > >>>>  rfrees  by size class:
>> > >>>> Stats: malloc large: 0 small slow: 3
>> > >>>> ==2738== ABORTING
>> > >>>>
>> > >>>> The failure of...
>> > >>>>
>> > >>>> FAIL: g++.dg/eh/cond1.C -std=c++98 execution test
>> > >>>>
>> > >>>> was used as the test case for the radar report and compiled with...
>> > >>>>
>> > >>>> g++-fsf-4.8 -static-libasan -fsanitize=address -std=c++98 cond1.C -g -O0 -o cond1_asan.exe
>> > >>>>
>> > >>>> to produce the above failure. When compiled without libasan as...
>> > >>>>
>> > >>>> g++-fsf-4.8 -std=c++98 cond1.C -g -O0 -o cond1_no_asan.exe
>> > >>>>
>> > >>>> the resulting executable runs fine. Debugging this in gdb seems to show
>> > >>>> that the failure is occurring in the final call to
>> > >>>> dyld_stub_pthread_once (). The same test case compiles fine with
>> > >>>> -fsanitize=address under llvm 3.2 clang++ and produces no runtime errors
>> > >>>> but the code execution path is very different in that case (because of the
>> > >>>> different libstdc++).
>> > >>>>    Can you take a quick peek at this and determine if this is a darwin
>> > >>>> pthread or unwinder bug or an issue with libasan that FSF gcc's
>> > >>>> compiler is exposing? Thanks in advance for any help on this.
>> > >>>>         Jack
>> > >>>> _______________________________________________
>> > >>>> LLVM Developers mailing list
>> > >>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> > >>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>> > >>>
>> > >>>
>> > >>
>> > >>
>> > >>
>> > >> --
>> > >> Alexander Potapenko
>> > >> Software Engineer
>> > >> Google Moscow
>> > >
>> >
>> >
>> >
>> > --
>> > Alexander Potapenko
>> > Software Engineer
>> > Google Moscow
>> >


-- 
Alexander Potapenko
Software Engineer
Google Moscow

Alexander Potapenko

2012-Dec-04 17:46 UTC

[LLVMdev] radr://12777299, "potential pthread/eh bug exposed by libsanitizer"

+kledzik at apple.com
The dynamic runtime is using dylib interposition (google for
"__DATA,__interpose").
If I'm understanding correctly (Nick, can you please confirm this?)
this allows interposing a function regardless of the two-level
namespace.
Support for the dynamic runtime in ASan is almost there. But the new
interposition method has revealed some issues with the allocator which
were plugged here and there before. Most of those are caused by a
CoreFoundation dependency, which I'm trying to eliminate now.
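
For readers unfamiliar with the "__DATA,__interpose" section mentioned above, here is a minimal sketch of what such an entry looks like. It is not taken from the ASan sources. The names `interpose_tuple`, `INTERPOSE_SECTION`, `wrap_malloc`, and `wrap_malloc_calls` are made up for illustration, and the choice of malloc as the interposed function is an assumption. Each entry is a pair of pointers (replacement, replacee) that dyld reads when the dylib is loaded, typically via DYLD_INSERT_LIBRARIES. On non-Apple platforms the section attribute is dropped, so the table is just ordinary data there.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One entry of the interpose table: calls to `replacee` made from other
   images are rebound by dyld so that they reach `replacement`. */
typedef struct {
  const void *replacement; /* function that should be called instead */
  const void *replacee;    /* function being interposed */
} interpose_tuple;

#ifdef __APPLE__
#define INTERPOSE_SECTION __attribute__((used, section("__DATA,__interpose")))
#else
#define INTERPOSE_SECTION __attribute__((used)) /* no dyld: plain data */
#endif

static size_t wrap_malloc_calls = 0; /* hypothetical bookkeeping */

/* Hypothetical wrapper: counts the call, then forwards to the real
   malloc. dyld does not rebind calls made from the interposing image
   itself, so this does not recurse. */
void *wrap_malloc(size_t size) {
  ++wrap_malloc_calls;
  return malloc(size);
}

INTERPOSE_SECTION static const interpose_tuple interpose_malloc = {
  (const void *)(uintptr_t)&wrap_malloc,
  (const void *)(uintptr_t)&malloc,
};
```

Built as a dylib on Mac OS and listed in DYLD_INSERT_LIBRARIES, a table like this redirects malloc calls from the main program and its other libraries to `wrap_malloc` without patching any machine code, which is the property that makes it attractive as a replacement for mach_override.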


On Mon, Dec 3, 2012 at 8:50 PM, Rafael Espíndola
<rafael.espindola at gmail.com> wrote:
> On 30 November 2012 13:32, Alexander Potapenko <glider at google.com> wrote:
>> No, we are not going to use mach_inject. This isn't portable and may
>> be even harder to set up than mach_override.
>> The new ASan runtime will use the dylib interposition and will in fact
>> require DYLD_INSERT_LIBRARIES to work. However ASan already handles it
>> correctly itself: if the corresponding env var is missing the app is
>> just re-execed.
>> Dylib interposition is supported by Apple and should work on iOS as
>> well as Mac OS. It will also probably simplify hooking the memory
>> allocations in ASan, which is now very tricky.
>
> This is interesting! I had some difficulties with mach_override myself
> in firefox. Don't you have to disable the two-level namespace to be
> able to override the functions you want? What currently blocks using
> DYLD_INSERT_LIBRARIES instead of mach_override?
>
> Cheers,
> Rafael


--
Alexander Potapenko
Software Engineer
Google Moscow
