Alexander Potapenko
2012-Nov-29 19:07 UTC
[LLVMdev] radr://12777299, "potential pthread/eh bug exposed by libsanitizer"
Jack, can you please upload this test somewhere?

On Thu, Nov 29, 2012 at 10:09 AM, Kostya Serebryany <kcc at google.com> wrote:
> +glider
> The compiler hardly matters here, I would expect the same failures with
> clang.
> Alex, could you please take a look?
>
> --kcc
>
>
> On Thu, Nov 29, 2012 at 9:55 PM, Jack Howarth <howarth at bromo.med.uc.edu>
> wrote:
>>
>> Nick,
>>    Can you take a quick look at the asan_eh_bug.tar.bz testcase
>> I uploaded into the newly opened radr://12777299, "potential
>> pthread/eh bug exposed by libsanitizer". The FSF gcc developers
>> have ported llvm.org's asan code into FSF gcc (and are keeping
>> it synced to the upstream llvm.org code). I have been helping
>> with the darwin build and testing -fsanitize=address against the
>> complete FSF gcc testsuite. This seems to have exposed a potential
>> bug in pthread or eh on darwin under libasan. Hundreds of test cases
>> in the g++ and libstdc++ testsuites fail under -fsanitize=address
>> in the following manner...
>>
>> ASAN:SIGSEGV
>> ================================================================
>> ==2738== ERROR: AddressSanitizer: SEGV on unknown address 0x0000ffd27000
>> (pc 0x0000ffd27000 sp 0x7fff55e40828 bp 0x7fff55e408f0 T0)
>> AddressSanitizer can not provide additional info.
>>     #0 0xffd26fff (/Users/howarth/asan_eh_bug/./cond1_asan.exe+0xf5f67fff)
>>     #1 0x7fff8bd827e0 (/usr/lib/system/libdyld.dylib+0x27e0)
>>     #2 0x0
>> Stats: 0M malloced (0M for red zones) by 3 calls
>> Stats: 0M realloced by 0 calls
>> Stats: 0M freed by 0 calls
>> Stats: 0M really freed by 0 calls
>> Stats: 1M (384 full pages) mmaped in 3 calls
>>   mmaps   by size class: 7:4095; 8:2047; 9:1023;
>>   mallocs by size class: 7:1; 8:1; 9:1;
>>   frees   by size class:
>>   rfrees  by size class:
>> Stats: malloc large: 0 small slow: 3
>> ==2738== ABORTING
>>
>> The failure of...
>>
>> FAIL: g++.dg/eh/cond1.C -std=c++98 execution test
>>
>> was used as the test case for the radar report and compiled with...
>>
>> g++-fsf-4.8 -static-libasan -fsanitize=address -std=c++98 cond1.C -g -O0
>> -o cond1_asan.exe
>>
>> to produce the above failure. When compiled without libasan as...
>>
>> g++-fsf-4.8 -std=c++98 cond1.C -g -O0 -o cond1_no_asan.exe
>>
>> the resulting executable runs fine. Debugging this in gdb seems to show
>> that the failure is occurring in the final call to
>> dyld_stub_pthread_once (). The same test case compiles fine with
>> -fsanitize=address under llvm 3.2 clang++ and produces no runtime errors,
>> but the code execution path is very different in that case (because of
>> the different libstdc++).
>>    Can you take a quick peek at this and determine if this is a darwin
>> pthread or unwinder bug or an issue with libasan that FSF gcc's compiler
>> is exposing? Thanks in advance for any help on this.
>>             Jack

--
Alexander Potapenko
Software Engineer
Google Moscow
Alexander Potapenko
2012-Nov-29 19:15 UTC
[LLVMdev] radr://12777299, "potential pthread/eh bug exposed by libsanitizer"
If this is the same test:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/g++.dg/eh/cond1.C.diff?cvsroot=gcc&r1=NONE&r2=1.1,
then it doesn't fail for me on Mac OS 10.8 with ASan (Clang r168632).
Have you tried symbolizing the report?

On Thu, Nov 29, 2012 at 11:07 AM, Alexander Potapenko <glider at google.com> wrote:
> Jack, can you please upload this test somewhere?
Jack Howarth
2012-Nov-29 20:12 UTC
[LLVMdev] radr://12777299, "potential pthread/eh bug exposed by libsanitizer"
On Thu, Nov 29, 2012 at 11:15:37AM -0800, Alexander Potapenko wrote:
> If this is the same test:
> http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/g++.dg/eh/cond1.C.diff?cvsroot=gcc&r1=NONE&r2=1.1,
> then it doesn't fail for me on Mac OS 10.8 with ASan (Clang r168632)
> Have you tried symbolizing the report?

Alexander,
   Which c++ compiler are you testing with? You won't reproduce the
problem on clang++ because it is using a wildly different libstdc++
than FSF g++ 4.8 (see the gdb traces I posted earlier in reply to Kostya).
I have reproduced this failure with...

g++-fsf-4.8 -fsanitize=address -std=c++98 cond1.C -o cond1_asan.exe

on darwin10, darwin11 and darwin12 against current FSF gcc trunk's libasan.
I also verified that switching FSF gcc on x86_64 Fedora 15 to use emutls
(like darwin) via --disable-tls doesn't trigger the bug on linux.
   I would also note that the gdb trace for the plain FSF gcc build of
cond1.C seems to execute exactly the same as the -fsanitize=address build
with FSF gcc, except that the latter crashes at the final call to
dyld_stub_pthread_once()...

0x00000001023f842a in dyld_stub_pthread_once ()
(gdb)
Single stepping until exit from function dyld_stub_pthread_once,
which has no line number information.

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x00000000ffd27000
0x00000000ffd27000 in ?? ()

            Jack
Nick Kledzik
2012-Nov-29 21:01 UTC
[LLVMdev] radr://12777299, "potential pthread/eh bug exposed by libsanitizer"
I debugged this a bit and it seems the mach_override patching of __cxa_throw is bogus. The start of that function is patched to jump to garbage.

Breakpoint 1, 0x0000000100001c19 in main ()
(gdb) display/i $pc
2: x/i $pc  0x100001c19 <main+318>: callq 0x100016386 <dyld_stub___cxa_throw>
(gdb) si
0x0000000100016386 in dyld_stub___cxa_throw ()
2: x/i $pc  0x100016386 <dyld_stub___cxa_throw>: jmpq *0xae1c(%rip) # 0x1000211a8
(gdb)
0x0000000102244870 in __cxa_throw ()
2: x/i $pc  0x102244870 <__cxa_throw>: jmpq 0xffd27000
(gdb)

# The above is __cxa_throw in gcc's libstdc++.6.dylib. The first instruction has been patched to jump to a garbage address.

(gdb) x/8i 0x102244870-8
0x102244868 <_ZL23__gxx_exception_cleanup19_Unwind_Reason_CodeP17_Unwind_Exception+56>: std
0x102244869 <_ZL23__gxx_exception_cleanup19_Unwind_Reason_CodeP17_Unwind_Exception+57>: (bad)
0x10224486a <_ZL23__gxx_exception_cleanup19_Unwind_Reason_CodeP17_Unwind_Exception+58>: decl (%rdi)
0x10224486c <_ZL23__gxx_exception_cleanup19_Unwind_Reason_CodeP17_Unwind_Exception+60>: (bad)
0x10224486d <_ZL23__gxx_exception_cleanup19_Unwind_Reason_CodeP17_Unwind_Exception+61>: add %r8b,(%rax)
0x102244870 <__cxa_throw>: jmpq 0xffd27000
0x102244875 <__cxa_throw+5>: or (%rax),%eax
0x102244877 <__cxa_throw+7>: push %rbx
(gdb)
(gdb) watch *0x102244870
Hardware watchpoint 2: *4330899568
(gdb) r

Old value = -788165304
New value = -1373139991
0x0000000100016203 in __asan_mach_override_ptr_custom ()
(gdb) bt
#0 0x0000000100016203 in __asan_mach_override_ptr_custom ()
#1 0x0000000100015a9e in __interception::OverrideFunction ()
#2 0x00007fff5fc13378 in ImageLoaderMachO::doModInitFunctions ()
#3 0x00007fff5fc13762 in ImageLoaderMachO::doInitialization ()
#4 0x00007fff5fc1006e in ImageLoader::recursiveInitialization ()
#5 0x00007fff5fc0feba in ImageLoader::runInitializers ()
#6 0x00007fff5fc01fc0 in dyld::initializeMainExecutable ()
#7 0x00007fff5fc05b04 in dyld::_main ()
#8 0x00007fff5fc01397 in dyldbootstrap::start ()
#9 0x00007fff5fc0105e in _dyld_start ()
(gdb) x/8i 0x102244870
0x102244870 <__cxa_throw>: jmpq 0xffd27000
0x102244875 <__cxa_throw+5>: or (%rax),%eax
0x102244877 <__cxa_throw+7>: push %rbx
0x102244878 <__cxa_throw+8>: lea -0x20(%rdi),%rbx
0x10224487c <__cxa_throw+12>: mov %rsi,-0x70(%rdi)

# Here is where the patching is being done

-Nick

On Nov 29, 2012, at 11:07 AM, Alexander Potapenko wrote:
> Jack, can you please upload this test somewhere?
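The "Old value" and "New value" that the gdb watchpoint prints in Nick's session are the first four bytes of __cxa_throw read as a signed 32-bit integer. A minimal sketch of turning them back into instruction bytes follows; the helper names le_bytes and print_le_bytes are made up for illustration and are not part of gdb, libasan, or mach_override.

```c
#include <stdint.h>
#include <stdio.h>

/* Split a 32-bit value, as printed by a gdb watchpoint on an int,
   into the four bytes stored in memory on little-endian x86.
   (Hypothetical helper, for illustration only.) */
void le_bytes(int32_t value, uint8_t out[4]) {
    uint32_t u = (uint32_t)value;
    for (int i = 0; i < 4; i++) {
        out[i] = (uint8_t)(u >> (8 * i));  /* lowest address first */
    }
}

/* Print the bytes in the order they appear at increasing addresses. */
void print_le_bytes(const char *label, int32_t value) {
    uint8_t b[4];
    le_bytes(value, b);
    printf("%s: %02x %02x %02x %02x\n", label, b[0], b[1], b[2], b[3]);
}
```

Called with the two values from the session, print_le_bytes("old", -788165304) prints 48 8d 05 d1 and print_le_bytes("new", -1373139991) prints e9 8b 27 ae. The old bytes begin with 48 8d 05, the encoding of a RIP-relative LEA into %rax, and the new bytes begin with e9, the opcode of a JMP with a 32-bit relative displacement, which agrees with the jmpq that gdb disassembles at __cxa_throw.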
Jack Howarth
2012-Nov-29 21:46 UTC
[LLVMdev] radr://12777299, "potential pthread/eh bug exposed by libsanitizer"
On Thu, Nov 29, 2012 at 01:01:42PM -0800, Nick Kledzik wrote:
> I debugged this a bit and it seems the mach_override patching of __cxa_throw is bogus. The start of that function is patched to jump to garbage.

In case it helps at all, I've attached the output from an executable
with a debug version of mach_override in libasan linked in. I am unclear
if all of the patching is done prior to code execution or on the fly.
In any case, the local context of the error appears to be...

Replacing function at 0x7fff91c19830
First 16 bytes of the function: 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 81 ec
To disassemble, save the following function as disas.c and run:
  gcc -c disas.c && gobjdump -d disas.o
The first 16 bytes of the original function will start after four nop instructions.
void foo() {
  asm volatile("nop;nop;nop;nop;");
  asm volatile(".byte 0x55, 0x48, 0x89, 0xe5, 0x41, 0x57, 0x41, 0x56;");
  asm volatile(".byte 0x41, 0x55, 0x41, 0x54, 0x53, 0x48, 0x81, 0xec;");
}
Matching: 55 FAIL
Matching: 55 FAIL
Matching: 55 OK
Matching: 48 FAIL
Matching: 48 FAIL
Matching: 48 FAIL
Matching: 48 FAIL
Matching: 48 89 e5 OK
Matching: 41 FAIL
Matching: 41 FAIL
Matching: 41 FAIL
Matching: 41 FAIL
Matching: 41 FAIL
Matching: 41 FAIL
Matching: 41 FAIL
Matching: 41 FAIL
Matching: 41 FAIL
Matching: 41 57 OK
BEFORE FIXING:
55 48 89 E5 41 57 41 56 41 55 41 54 53 48 81 EC
55 48 89 E5 41 57 90 90 90 90 90 90 90 90 90 90
AFTER_FIXING:
55 48 89 E5 41 57 41 56 41 55 41 54 53 48 81 EC
55 48 89 E5 41 57 90 90 90 90 90 90 90 90 90 90
First 16 bytes of the function after slicing: e9 cb f7 11 6e 57 41 56 41 55 41 54 53 48 81 ec
ASAN:SIGSEGV
================================================================
==29051== ERROR: AddressSanitizer: SEGV on unknown address 0x0000ffd27000 (pc 0x0000ffd27000 sp 0x7fff59ad3858 bp 0x7fff59ad3920 T0)
AddressSanitizer can not provide additional info.
    #0 0xffd26fff (/Users/howarth/./cond1_asan.exe+0xf9bfafff)
    #1 0x7fff8bd827e0 (/usr/lib/system/libdyld.dylib+0x27e0)
    #2 0x0
Stats: 0M malloced (0M for red zones) by 3 calls
Stats: 0M realloced by 0 calls
Stats: 0M freed by 0 calls
Stats: 0M really freed by 0 calls
Stats: 1M (384 full pages) mmaped in 3 calls
  mmaps   by size class: 7:4095; 8:2047; 9:1023;
  mallocs by size class: 7:1; 8:1; 9:1;
  frees   by size class:
  rfrees  by size class:
Stats: malloc large: 0 small slow: 3

-------------- next part --------------
A non-text attachment was scrubbed...
Name: debug_asan_cond1_run.log.bz2
Type: application/x-bzip2
Size: 3857 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20121129/b237c98e/attachment.bin>
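The "after slicing" bytes in the log above start with e9, an x86 JMP with a signed 32-bit displacement measured from the end of the 5-byte instruction. The sketch below decodes where such a jump lands; jmp_rel32_target is a hypothetical helper written for this note, not a function from mach_override.

```c
#include <stdint.h>

/* Decode the destination of an x86 "JMP rel32" (opcode 0xE9).
   The displacement is a signed 32-bit little-endian value added to
   the address of the next instruction (insn_addr + 5).
   Returns 0 if the bytes do not start with 0xE9.
   (Hypothetical helper, for illustration only.) */
uint64_t jmp_rel32_target(uint64_t insn_addr, const uint8_t insn[5]) {
    if (insn[0] != 0xE9) {
        return 0;
    }
    uint32_t raw = (uint32_t)insn[1]
                 | ((uint32_t)insn[2] << 8)
                 | ((uint32_t)insn[3] << 16)
                 | ((uint32_t)insn[4] << 24);
    int32_t rel = (int32_t)raw;            /* sign-extend the displacement */
    return insn_addr + 5 + (uint64_t)(int64_t)rel;
}
```

For the function at 0x7fff91c19830 with bytes e9 cb f7 11 6e, the displacement is 0x6e11f7cb and the result is 0x7fffffd39000. That displacement fits in 32 bits, so this particular patch reaches its destination; the crash address 0xffd27000 comes from a different patched function.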
Alexander Potapenko
2012-Nov-30 00:46 UTC
[LLVMdev] radr://12777299, "potential pthread/eh bug exposed by libsanitizer"
Looks like this happens on x86_64 because __cxa_throw is too far from the allocated branch island (the distance should be <2G). This can be solved by allocating the branch islands somewhere near the text segment (look for kIslandEnd in asan_mac.cc, this is currently 0x7fffffdf0000) or by patching the function with a longer instruction sequence that stores the jump target in a register and jumps to that target (which is a bit more complex to implement).

Once this problem is fixed, another one is going to arise. This is what the first bytes of __cxa_throw look like:

0x0020c49ba5d916e0 <__cxa_throw+0>: lea 0xb4f01(%rip),%rax # 0x20c49ba5e465e8 <_ZN10__cxxabiv120__unexpected_handlerE>
0x0020c49ba5d916e7 <__cxa_throw+7>: push %rbx
0x0020c49ba5d916e8 <__cxa_throw+8>: lea -0x20(%rdi),%rbx

If we move the relative LEA instruction somewhere, we must fix the constant in order to keep it pointing to the same address. mach_override already does this for relative CALL and JMP instructions, but not for LEA. This should be fairly simple to fix.

Note that the 32-bit variant crashes on another invalid address:

ASAN:SIGSEGV
================================================================
==89768== ERROR: AddressSanitizer: SEGV on unknown address 0xcccccccc (pc 0x00061f8c sp 0xbffa8bd0 bp 0xbffa8cc8 T0)
AddressSanitizer can not provide additional info.
    #0 0x61f8b (/Users/glider/src/gcc-asan/inst/lib/i386/libstdc++.6.dylib+0x3f8b)
    #1 0x91391724 (/usr/lib/system/libdyld.dylib+0x2724)
    #2 0x0
Stats: 0M malloced (0M for red zones) by 3 calls
Stats: 0M realloced by 0 calls
Stats: 0M freed by 0 calls
Stats: 0M really freed by 0 calls
Stats: 1M (256 full pages) mmaped in 2 calls
  mmaps   by size class: 7:4095; 8:2047;
  mallocs by size class: 7:1; 8:2;
  frees   by size class:
  rfrees  by size class:
Stats: malloc large: 0 small slow: 2
==89768== ABORTING

My guess is that this is caused by the following code being moved to a branch island:

Dump of assembler code for function __cxa_throw:
0x00008f60 <__cxa_throw+0>: push %esi
0x00008f61 <__cxa_throw+1>: push %ebx
0x00008f62 <__cxa_throw+2>: call 0x7a60 <__x86.get_pc_thunk.bx>

Perhaps this makes __x86.get_pc_thunk.bx return an incorrect value.

Since libstdc++-v3 is built together with gcc, the two issues related to instructions being moved to another place can be solved by padding __cxa_throw() with five NOP instructions (enough to hold a JMP). I believe this should be acceptable, because the performance penalty for additional NOPs is negligible, and __cxa_throw() isn't a hot spot.

On Thu, Nov 29, 2012 at 1:01 PM, Nick Kledzik <kledzik at apple.com> wrote:
> I debugged this a bit and it seems the mach_override patching of __cxa_throw is bogus. The start of that function is patched to jump to garbage.
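Both points in Alexander's analysis come down to 32-bit displacement arithmetic, which can be sketched as follows. This is not mach_override's code: the three function names are invented for illustration, and the island address 0x7fffffd27000 used in the note after the code is an assumption (the thread only states kIslandEnd = 0x7fffffdf0000 and the bogus target 0xffd27000).

```c
#include <stdint.h>

/* Return 1 if a 5-byte "JMP rel32" placed at `from` can reach `to`,
   i.e. the distance from the end of the instruction fits in int32. */
int reachable_rel32(uint64_t from, uint64_t to) {
    int64_t delta = (int64_t)(to - (from + 5));
    return delta >= INT32_MIN && delta <= INT32_MAX;
}

/* Where a JMP rel32 at `from` actually lands if the 64-bit distance
   to `to` is silently truncated to its low 32 bits. */
uint64_t truncated_jmp_target(uint64_t from, uint64_t to) {
    uint32_t low = (uint32_t)(to - (from + 5));
    return from + 5 + (uint64_t)(int64_t)(int32_t)low;
}

/* Recompute a RIP-relative disp32 (as in "lea disp(%rip),%rax") after
   the instruction is moved. old_next and new_next are the addresses
   just past the instruction before and after the move. Returns 1 and
   stores the new displacement, or 0 if the original target can no
   longer be reached from the new location. */
int relocate_rip_disp32(uint64_t old_next, uint64_t new_next,
                        int32_t old_disp, int32_t *new_disp) {
    uint64_t target = old_next + (uint64_t)(int64_t)old_disp;
    int64_t delta = (int64_t)(target - new_next);
    if (delta < INT32_MIN || delta > INT32_MAX) {
        return 0;
    }
    *new_disp = (int32_t)delta;
    return 1;
}
```

Under the assumed island address, reachable_rel32(0x102244870, 0x7fffffd27000) returns 0, and truncated_jmp_target with the same arguments returns 0xffd27000, the address in Nick's jmpq and in the ASan report. For the LEA above (next-instruction address 0x0020c49ba5d916e7, displacement 0xb4f01), moving the instruction 0x1000 bytes higher would require the displacement 0xb3f01, while moving it to an island near 0x7fffffd27000 makes relocate_rip_disp32 return 0, so near islands or NOP padding would still be needed in that case.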