thr3ads.net - llvm dev - [LLVMdev] radr://12777299, "potential pthread/eh bug exposed by libsanitizer" [Nov 2012]

Alexander Potapenko

2012-Nov-29 19:07 UTC

[LLVMdev] radr://12777299, "potential pthread/eh bug exposed by libsanitizer"

Jack, can you please upload this test somewhere?

On Thu, Nov 29, 2012 at 10:09 AM, Kostya Serebryany <kcc at google.com> wrote:
> +glider
> The compiler hardly matters here, I would expect the same failures with
> clang.
> Alex, could you please take a look?
>
> --kcc
>
>
> On Thu, Nov 29, 2012 at 9:55 PM, Jack Howarth <howarth at bromo.med.uc.edu>
> wrote:
>>
>> Nick,
>>    Can you take a quick look at the asan_eh_bug.tar.bz testcase
>> I uploaded into the newly opened radr://12777299, "potential
>> pthread/eh bug exposed by libsanitizer". The FSF gcc developers
>> have ported llvm.org's asan code into FSF gcc (and are keeping
>> it synced to the upstream llvm.org code). I have been helping
>> with the darwin build and testing -fsanitize=address against the
>> complete FSF gcc testsuite. This seems to have exposed a potential
>> bug in pthread or eh on darwin under libasan. Hundreds of test cases
>> in the g++ and libstdc++ testsuites fail under -fsanitize=address
>> in the following manner...
>>
>> ASAN:SIGSEGV
>> =================================================================
>> ==2738== ERROR: AddressSanitizer: SEGV on unknown address 0x0000ffd27000
>> (pc 0x0000ffd27000 sp 0x7fff55e40828 bp 0x7fff55e408f0 T0)
>> AddressSanitizer can not provide additional info.
>>     #0 0xffd26fff (/Users/howarth/asan_eh_bug/./cond1_asan.exe+0xf5f67fff)
>>     #1 0x7fff8bd827e0 (/usr/lib/system/libdyld.dylib+0x27e0)
>>     #2 0x0
>> Stats: 0M malloced (0M for red zones) by 3 calls
>> Stats: 0M realloced by 0 calls
>> Stats: 0M freed by 0 calls
>> Stats: 0M really freed by 0 calls
>> Stats: 1M (384 full pages) mmaped in 3 calls
>>   mmaps   by size class: 7:4095; 8:2047; 9:1023;
>>   mallocs by size class: 7:1; 8:1; 9:1;
>>   frees   by size class:
>>   rfrees  by size class:
>> Stats: malloc large: 0 small slow: 3
>> ==2738== ABORTING
>>
>> The failure of...
>>
>> FAIL: g++.dg/eh/cond1.C -std=c++98 execution test
>>
>> was used as the test case for the radar report and compiled with...
>>
>> g++-fsf-4.8 -static-libasan -fsanitize=address -std=c++98 cond1.C -g -O0 -o cond1_asan.exe
>>
>> to produce the above failure. When compiled without libasan as...
>>
>> g++-fsf-4.8 -std=c++98 cond1.C -g -O0 -o cond1_no_asan.exe
>>
>> the resulting executable runs fine. Debugging this in gdb seems to show
>> that the failure is occurring in the final call to dyld_stub_pthread_once ().
>> The same test case compiles fine with -fsanitize=address under llvm 3.2
>> clang++ and produces no runtime errors, but the code execution path is very
>> different in that case (because of the different libstdc++).
>>     Can you take a quick peek at this and determine if this is a darwin
>> pthread or unwinder bug or an issue with libasan that FSF gcc's compiler
>> is exposing? Thanks in advance for any help on this.
>>          Jack
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>


-- 
Alexander Potapenko
Software Engineer
Google Moscow

Alexander Potapenko

2012-Nov-29 19:15 UTC

If this is the same test:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/g++.dg/eh/cond1.C.diff?cvsroot=gcc&r1=NONE&r2=1.1,
then it doesn't fail for me on Mac OS 10.8 with ASan (Clang r168632)
Have you tried symbolizing the report?


-- 
Alexander Potapenko
Software Engineer
Google Moscow

Jack Howarth

2012-Nov-29 20:12 UTC

On Thu, Nov 29, 2012 at 11:15:37AM -0800, Alexander Potapenko wrote:
> If this is the same test:
> http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/g++.dg/eh/cond1.C.diff?cvsroot=gcc&r1=NONE&r2=1.1,
> then it doesn't fail for me on Mac OS 10.8 with ASan (Clang r168632)
> Have you tried symbolizing the report?

Alexander,
    Which C++ compiler are you testing with? You won't reproduce the problem
on clang++ because it is using a wildly different libstdc++ than FSF g++ 4.8
(see the gdb traces I posted earlier in reply to Kostya). I have reproduced
this failure with...

g++-fsf-4.8 -fsanitize=address -std=c++98 cond1.C -o cond1_asan.exe

on darwin10, darwin11 and darwin12 against current FSF gcc trunk's libasan.
I also verified that switching FSF gcc on x86_64 Fedora 15 to use emutls
(like darwin) via --disable-tls doesn't trigger the bug on linux. I would also
note that the gdb trace for the plain FSF gcc build of cond1.C seems to execute
exactly the same as the -fsanitize=address build with FSF gcc, except that the
latter crashes at the final call to dyld_stub_pthread_once()...

0x00000001023f842a in dyld_stub_pthread_once ()
(gdb) 
Single stepping until exit from function dyld_stub_pthread_once, 
which has no line number information.

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x00000000ffd27000
0x00000000ffd27000 in ?? ()

                     Jack

Nick Kledzik

2012-Nov-29 21:01 UTC

I debugged this a bit and it seems the mach_override patching of __cxa_throw is
bogus.  The start of that function is patched to jump to garbage.

Breakpoint 1, 0x0000000100001c19 in main ()
(gdb) display/i $pc
2: x/i $pc  0x100001c19 <main+318>:	callq  0x100016386 <dyld_stub___cxa_throw>
(gdb) si
0x0000000100016386 in dyld_stub___cxa_throw ()
2: x/i $pc  0x100016386 <dyld_stub___cxa_throw>:	jmpq   *0xae1c(%rip)        # 0x1000211a8
(gdb) 
0x0000000102244870 in __cxa_throw ()
2: x/i $pc  0x102244870 <__cxa_throw>:	jmpq   0xffd27000
(gdb)  # the above is __cxa_throw in gcc's libstdc++.6.dylib.  The first instruction has been patched to jump to a garbage address.

(gdb) x/8i 0x102244870-8
0x102244868 <_ZL23__gxx_exception_cleanup19_Unwind_Reason_CodeP17_Unwind_Exception+56>:	std
0x102244869 <_ZL23__gxx_exception_cleanup19_Unwind_Reason_CodeP17_Unwind_Exception+57>:	(bad)
0x10224486a <_ZL23__gxx_exception_cleanup19_Unwind_Reason_CodeP17_Unwind_Exception+58>:	decl   (%rdi)
0x10224486c <_ZL23__gxx_exception_cleanup19_Unwind_Reason_CodeP17_Unwind_Exception+60>:	(bad)
0x10224486d <_ZL23__gxx_exception_cleanup19_Unwind_Reason_CodeP17_Unwind_Exception+61>:	add    %r8b,(%rax)
0x102244870 <__cxa_throw>:	jmpq   0xffd27000
0x102244875 <__cxa_throw+5>:	or     (%rax),%eax
0x102244877 <__cxa_throw+7>:	push   %rbx
(gdb) 
(gdb) watch *0x102244870
Hardware watchpoint 2: *4330899568
(gdb) r

Old value = -788165304
New value = -1373139991
0x0000000100016203 in __asan_mach_override_ptr_custom ()
(gdb) bt
#0  0x0000000100016203 in __asan_mach_override_ptr_custom ()
#1  0x0000000100015a9e in __interception::OverrideFunction ()
#2  0x00007fff5fc13378 in ImageLoaderMachO::doModInitFunctions ()
#3  0x00007fff5fc13762 in ImageLoaderMachO::doInitialization ()
#4  0x00007fff5fc1006e in ImageLoader::recursiveInitialization ()
#5  0x00007fff5fc0feba in ImageLoader::runInitializers ()
#6  0x00007fff5fc01fc0 in dyld::initializeMainExecutable ()
#7  0x00007fff5fc05b04 in dyld::_main ()
#8  0x00007fff5fc01397 in dyldbootstrap::start ()
#9  0x00007fff5fc0105e in _dyld_start ()
(gdb) x/8i 0x102244870
0x102244870 <__cxa_throw>:	jmpq   0xffd27000
0x102244875 <__cxa_throw+5>:	or     (%rax),%eax
0x102244877 <__cxa_throw+7>:	push   %rbx
0x102244878 <__cxa_throw+8>:	lea    -0x20(%rdi),%rbx
0x10224487c <__cxa_throw+12>:	mov    %rsi,-0x70(%rdi)
# Here is where the patching is being done

-Nick
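
The watchpoint values in the session above can be read as raw instruction bytes. The following is a minimal sketch, assuming a little-endian x86 target and that gdb printed the watched location as a signed 32-bit integer; the helper names watch_value_to_bytes and print_bytes are made up for illustration and are not part of gdb or libasan.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Split a 32-bit value, as printed by a gdb watchpoint on an int-sized
 * location, into the four bytes stored in memory on little-endian x86. */
void watch_value_to_bytes(int32_t value, uint8_t out[4]) {
    assert(out != NULL);
    uint32_t u = (uint32_t)value;          /* keep the bit pattern, drop the sign */
    for (int i = 0; i < 4; i++) {
        out[i] = (uint8_t)(u >> (8 * i));  /* lowest address first */
    }
}

/* Print the bytes in space-separated hex, lowest address first. */
void print_bytes(const char *label, const uint8_t b[4]) {
    printf("%s: %02x %02x %02x %02x\n", label, b[0], b[1], b[2], b[3]);
}
```

Feeding it the two values from the watchpoint gives 48 8d 05 d1 for the old contents (the start of a RIP-relative lea into %rax) and e9 8b 27 ae for the new contents, where e9 is the opcode of a 5-byte relative jmp. That is consistent with the patched disassembly shown by gdb.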


Jack Howarth

2012-Nov-29 21:46 UTC

In case it helps at all, I've attached the output from an executable with a
debug version of mach_override in libasan linked in. I am unclear if all of
the patching is done prior to code execution or on the fly. In any case, the
local context of the error appears to be...

Replacing function at 0x7fff91c19830
First 16 bytes of the function: 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 81 ec 
To disassemble, save the following function as disas.c and run:
  gcc -c disas.c && gobjdump -d disas.o
The first 16 bytes of the original function will start after four nop instructions.

void foo() {
  asm volatile("nop;nop;nop;nop;");
  asm volatile(".byte 0x55, 0x48, 0x89, 0xe5, 0x41, 0x57, 0x41, 0x56;");
  asm volatile(".byte 0x41, 0x55, 0x41, 0x54, 0x53, 0x48, 0x81, 0xec;");
}

Matching: 55  FAIL
Matching: 55  FAIL
Matching: 55  OK
Matching: 48  FAIL
Matching: 48  FAIL
Matching: 48  FAIL
Matching: 48  FAIL
Matching: 48 89 e5  OK
Matching: 41  FAIL
Matching: 41  FAIL
Matching: 41  FAIL
Matching: 41  FAIL
Matching: 41  FAIL
Matching: 41  FAIL
Matching: 41  FAIL
Matching: 41  FAIL
Matching: 41  FAIL
Matching: 41 57  OK
BEFORE FIXING:
55 48 89 E5 41 57 41 56 41 55 41 54 53 48 81 EC 
55 48 89 E5 41 57 90 90 90 90 90 90 90 90 90 90 
AFTER_FIXING:
55 48 89 E5 41 57 41 56 41 55 41 54 53 48 81 EC 
55 48 89 E5 41 57 90 90 90 90 90 90 90 90 90 90 
First 16 bytes of the function after slicing: e9 cb f7 11 6e 57 41 56 41 55 41 54 53 48 81 ec
ASAN:SIGSEGV
=================================================================
==29051== ERROR: AddressSanitizer: SEGV on unknown address 0x0000ffd27000 (pc 0x0000ffd27000 sp 0x7fff59ad3858 bp 0x7fff59ad3920 T0)
AddressSanitizer can not provide additional info.
    #0 0xffd26fff (/Users/howarth/./cond1_asan.exe+0xf9bfafff)
    #1 0x7fff8bd827e0 (/usr/lib/system/libdyld.dylib+0x27e0)
    #2 0x0
Stats: 0M malloced (0M for red zones) by 3 calls
Stats: 0M realloced by 0 calls
Stats: 0M freed by 0 calls
Stats: 0M really freed by 0 calls
Stats: 1M (384 full pages) mmaped in 3 calls
  mmaps   by size class: 7:4095; 8:2047; 9:1023; 
  mallocs by size class: 7:1; 8:1; 9:1; 
  frees   by size class: 
  rfrees  by size class: 
Stats: malloc large: 0 small slow: 3
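
As a cross-check of the "after slicing" bytes in the log above, the sketch below decodes a 5-byte jmp rel32 and computes where it lands. It is only an illustration: decode_jmp_rel32 is a made-up helper, not a mach_override function, and it relies on the usual two's-complement conversion from uint32_t to int32_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 5-byte x86 `jmp rel32` (opcode 0xE9) located at insn_addr.
 * Returns 1 and stores the destination in *target, or 0 if the bytes
 * do not start with 0xE9. */
int decode_jmp_rel32(uint64_t insn_addr, const uint8_t bytes[5],
                     uint64_t *target) {
    assert(bytes != NULL && target != NULL);
    if (bytes[0] != 0xE9) {
        return 0;
    }
    /* The displacement is stored little-endian after the opcode. */
    uint32_t raw = (uint32_t)bytes[1]
                 | ((uint32_t)bytes[2] << 8)
                 | ((uint32_t)bytes[3] << 16)
                 | ((uint32_t)bytes[4] << 24);
    int32_t rel = (int32_t)raw;  /* the CPU sign-extends it to 64 bits */
    /* rel32 is relative to the address of the next instruction. */
    *target = insn_addr + 5 + (uint64_t)(int64_t)rel;
    return 1;
}
```

For the function at 0x7fff91c19830 with bytes e9 cb f7 11 6e, this yields 0x7fffffd39000, so the jump written for that particular function lands in the high address range rather than at 0xffd27000.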

-------------- next part --------------
A non-text attachment was scrubbed...
Name: debug_asan_cond1_run.log.bz2
Type: application/x-bzip2
Size: 3857 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20121129/b237c98e/attachment.bin>

Alexander Potapenko

2012-Nov-30 00:46 UTC

Looks like this happens on x86_64 because the position of __cxa_throw
is too far from the allocated branch island (should be <2G). This can
be solved by allocating the branch islands somewhere near the text
segment (look for kIslandEnd in asan_mac.cc, this is currently
0x7fffffdf0000) or by patching the function with a longer instruction
sequence that stores the jump target in a register and jumps to that
target (which is a bit more complex to implement).
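
To make the 2G limit concrete, here is a rough sketch of the reach check and of the longer register-based sequence. Several things in it are assumptions rather than facts from this thread: the function names are invented, the island address 0x7fffffd27000 used in the usage note is inferred from the truncated target 0xffd27000 and the kIslandEnd value, and %r11 is picked only because it is a scratch register in the System V x86-64 ABI. It is not what mach_override actually does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if a 5-byte `jmp rel32` placed at `from` can reach `to`. */
int jmp_rel32_reaches(uint64_t from, uint64_t to) {
    int64_t disp = (int64_t)(to - (from + 5));
    return disp >= INT32_MIN && disp <= INT32_MAX;
}

/* Where a `jmp rel32` really lands if the displacement is cut down
 * to 32 bits without a range check. */
uint64_t truncated_jmp_target(uint64_t from, uint64_t to) {
    int32_t rel = (int32_t)(uint32_t)(to - (from + 5));
    return from + 5 + (uint64_t)(int64_t)rel;
}

/* Write a jump from `from` to `to` into buf (at least 13 bytes).
 * Uses `jmp rel32` when in reach, otherwise
 * `movabs $to,%r11 ; jmp *%r11`. Returns the number of bytes written. */
size_t emit_jump(uint64_t from, uint64_t to, uint8_t *buf) {
    assert(buf != NULL);
    if (jmp_rel32_reaches(from, to)) {
        uint32_t rel = (uint32_t)(to - (from + 5));
        buf[0] = 0xE9;
        for (int i = 0; i < 4; i++) {
            buf[1 + i] = (uint8_t)(rel >> (8 * i));
        }
        return 5;
    }
    buf[0] = 0x49;              /* REX.W + REX.B */
    buf[1] = 0xBB;              /* movabs imm64 into %r11 */
    for (int i = 0; i < 8; i++) {
        buf[2 + i] = (uint8_t)(to >> (8 * i));
    }
    buf[10] = 0x41;             /* REX.B */
    buf[11] = 0xFF;             /* jmp r/m64 */
    buf[12] = 0xE3;             /* ModRM selecting %r11 */
    return 13;
}
```

With from = 0x102244870 (the __cxa_throw address in Nick's session) and the assumed island at 0x7fffffd27000, jmp_rel32_reaches returns 0 and truncated_jmp_target returns 0xffd27000, which is the address in the crash report. The 13-byte form overwrites more of the function prologue, so more original instructions would have to be relocated to the island.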

Once this problem is fixed, another one is going to arise. This is what
the first bytes of __cxa_throw look like:

0x0020c49ba5d916e0 <__cxa_throw+0>: lea    0xb4f01(%rip),%rax        # 0x20c49ba5e465e8 <_ZN10__cxxabiv120__unexpected_handlerE>
0x0020c49ba5d916e7 <__cxa_throw+7>: push   %rbx
0x0020c49ba5d916e8 <__cxa_throw+8>: lea    -0x20(%rdi),%rbx

If we move the relative LEA instruction somewhere, we must fix the
constant in order to keep it pointing to the same address.
mach_override already does this for relative CALL and JMP
instructions, but not for LEA. This should be fairly simple to fix.
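
The fix-up for a moved RIP-relative instruction amounts to recomputing the 32-bit displacement against the new location. The sketch below shows that arithmetic only; the function names are hypothetical, and the 7-byte length is the size of the lea in the listing above (the next instruction is at +7).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Absolute address referenced by a RIP-relative operand: the address
 * of the next instruction plus the signed 32-bit displacement. */
uint64_t rip_relative_target(uint64_t insn_addr, unsigned insn_len,
                             int32_t disp) {
    return insn_addr + insn_len + (uint64_t)(int64_t)disp;
}

/* Compute the displacement a copy of the instruction at new_addr needs
 * in order to reference the same absolute address. Returns 1 and stores
 * it in *new_disp, or 0 if the target is out of disp32 range. */
int relocate_rip_disp32(uint64_t old_addr, uint64_t new_addr,
                        unsigned insn_len, int32_t old_disp,
                        int32_t *new_disp) {
    assert(new_disp != NULL);
    uint64_t target = rip_relative_target(old_addr, insn_len, old_disp);
    int64_t d = (int64_t)(target - (new_addr + insn_len));
    if (d < INT32_MIN || d > INT32_MAX) {
        return 0;  /* the moved copy cannot address the target at all */
    }
    *new_disp = (int32_t)d;
    return 1;
}
```

For the lea at 0x0020c49ba5d916e0 with displacement 0xb4f01, rip_relative_target gives 0x20c49ba5e465e8, matching the gdb annotation. The failure case matters here too: if the island is more than 2G away from the library, the adjusted displacement does not fit, so this fix-up only works once the islands are allocated near the text segment.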

Note that the 32-bit variant crashes on another invalid address:

ASAN:SIGSEGV
=================================================================
==89768== ERROR: AddressSanitizer: SEGV on unknown address 0xcccccccc
(pc 0x00061f8c sp 0xbffa8bd0 bp 0xbffa8cc8 T0)
AddressSanitizer can not provide additional info.
    #0 0x61f8b (/Users/glider/src/gcc-asan/inst/lib/i386/libstdc++.6.dylib+0x3f8b)
    #1 0x91391724 (/usr/lib/system/libdyld.dylib+0x2724)
    #2 0x0
Stats: 0M malloced (0M for red zones) by 3 calls
Stats: 0M realloced by 0 calls
Stats: 0M freed by 0 calls
Stats: 0M really freed by 0 calls
Stats: 1M (256 full pages) mmaped in 2 calls
  mmaps   by size class: 7:4095; 8:2047;
  mallocs by size class: 7:1; 8:2;
  frees   by size class:
  rfrees  by size class:
Stats: malloc large: 0 small slow: 2
==89768== ABORTING

My guess is that this is caused by the following code being moved to a
branch island:

Dump of assembler code for function __cxa_throw:
0x00008f60 <__cxa_throw+0>: push   %esi
0x00008f61 <__cxa_throw+1>: push   %ebx
0x00008f62 <__cxa_throw+2>: call   0x7a60 <__x86.get_pc_thunk.bx>

Perhaps this makes __x86.get_pc_thunk.bx return an incorrect value.
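
One way to picture this guess: a get_pc thunk returns the return address of the call that invoked it, and the code after the call adds a link-time constant to that value. The sketch below models only that arithmetic, under the assumption that the call is a 5-byte call rel32 and that it is copied to an island unchanged apart from its branch target. The function names and the offset values are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Value a get_pc thunk hands back: the return address of the 5-byte
 * `call rel32` that invoked it, i.e. the address right after the call. */
uint32_t pc_thunk_result(uint32_t call_addr) {
    return call_addr + 5u;
}

/* Address that position-independent code derives by adding a constant,
 * fixed at link time, to the thunk result. */
uint32_t pic_reference(uint32_t call_addr, int32_t link_time_offset) {
    return pc_thunk_result(call_addr) + (uint32_t)link_time_offset;
}

/* How far off the derived address is when the same call executes from
 * island_call_addr instead of orig_call_addr. */
int64_t pic_error_after_move(uint32_t orig_call_addr,
                             uint32_t island_call_addr,
                             int32_t link_time_offset) {
    assert(orig_call_addr != island_call_addr);  /* only meaningful if moved */
    return (int64_t)pic_reference(island_call_addr, link_time_offset)
         - (int64_t)pic_reference(orig_call_addr, link_time_offset);
}
```

In this model the call at 0x00008f62 makes the thunk return 0x00008f67. Executed from an island, every address derived from %ebx would be shifted by exactly the distance the call was moved.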

Since libstdc++-v3 is built together with gcc, the two issues related
to instructions being moved to another place can be solved by padding
the start of __cxa_throw() with five NOP instructions (enough to hold a
5-byte relative JMP). I believe this should be acceptable, because the
performance penalty for the additional NOPs is negligible, and
__cxa_throw() isn't a hot spot.



-- 
Alexander Potapenko
Software Engineer
Google Moscow
