Christian Schafmeister
2013-Oct-14 05:51 UTC
[LLVMdev] A weird, reproducable problem with MCJIT
I switched my Common Lisp compiler to use MCJIT on the weekend and ran into a weird problem compiling one particular function.

It crashes with an EXC_BAD_ACCESS error in MCJIT::finalizeObject when calling processFDE.

The weird part is that the function does not appear to do anything special and I've whittled it down to the minimum size that still causes the crash. If I remove even one statement it compiles fine. Note: The function doesn't make much sense anymore but it does compile fine. It does have a lot of nested scopes.

I can single step through processFDE and I see it pulls up a Length in processFDE of 1 and then a length of 16#1000000 - clearly something has been corrupted.

Here is the top of the backtrace from lldb:
----------------------
* thread #1: tid = 0x557509, 0x0000000107e6e066 libLLVM-3.4svn.dylib`llvm::processFDE(unsigned char*, long, long) + 134 at /Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:35, stop reason = EXC_BAD_ACCESS (code=2, address=0x1102adad1)
    frame #0: 0x0000000107e6e066 libLLVM-3.4svn.dylib`llvm::processFDE(unsigned char*, long, long) + 134 at /Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:35
    frame #1: 0x0000000107e6df04 libLLVM-3.4svn.dylib`llvm::RuntimeDyldMachO::registerEHFrames() + 356 at /Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:81
    frame #2: 0x0000000107e36f39 libLLVM-3.4svn.dylib`llvm::RuntimeDyld::registerEHFrames() + 25 at /Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp:595
    frame #3: 0x0000000107aca9c7 libLLVM-3.4svn.dylib`llvm::MCJIT::finalizeObject() + 775 at /Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/MCJIT/MCJIT.cpp:226
    frame #4: 0x0000000103da55d3 libllvmo_dbg.dylib`llvmo::ExecutionEngine_O::getCompiledFunction(mem::smart_ptr<core::Symbol_O>, mem::smart_ptr<llvmo::Function_O>, mem::smart_ptr<core::ActivationFrame_O>, mem::smart_ptr<core::Symbol_O>) + 115 at /Users/meister/Development/cando/brcl/src/llvmo/../../src/llvmo/llvmoExpose.cc:811

Here is the function that causes the crash (it's Common Lisp and all macros have been expanded).

(defun match-dimensions (array pat)
  (let ((zzz (eq pat '*)))
    (if zzz
        zzz
        (let ((rank (array-rank array) ))
          (if (listp pat)
              (BLOCK NIL
                (LET* (
                       (%DOTIMES-VAR 100)
                       (I 0)
                       )
                  (TAGBODY
                    (GO bot)
                   top
                    (if (not t) (progn))
                    (SETQ PAT (CDR PAT))
                    (SETQ I (1+ I))
                   bot
                    (if (< I %DOTIMES-VAR) (GO top))
                    )
                  (NULL PAT))))))))

The LLVM-IR file generated by compiling this function definition is here: https://dl.dropboxusercontent.com/u/6229900/broken.ll

Any suggestions or pointers on how to debug this are welcome.
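One way to narrow this down might be to validate the length-prefixed records before anything walks them. The following is only a minimal sketch under stated assumptions: `firstBadRecord` and `readLE32` are hypothetical helpers (not LLVM functions), the input is assumed to be a copy of the EH frame section bytes, each record is assumed to start with a 4-byte little-endian length followed by a 4-byte ID/CIE-pointer field, and the 64-bit extended-length form is not handled. It needs C++17 for `std::optional`.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Decode a 32-bit little-endian value byte by byte, so unaligned offsets
// are safe to read and the result does not depend on host endianness.
static std::uint32_t readLE32(const std::uint8_t *p) {
  return static_cast<std::uint32_t>(p[0]) |
         (static_cast<std::uint32_t>(p[1]) << 8) |
         (static_cast<std::uint32_t>(p[2]) << 16) |
         (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical helper (not part of LLVM): walk records that each begin with
// a 4-byte length and return the offset of the first record whose length
// cannot be valid. Returns std::nullopt if the walk ends cleanly.
std::optional<std::size_t> firstBadRecord(const std::vector<std::uint8_t> &buf) {
  std::size_t off = 0;
  while (off < buf.size()) {
    if (buf.size() - off < 4)
      return off;  // not enough bytes left for a length field
    std::uint32_t len = readLE32(buf.data() + off);
    if (len == 0)
      return std::nullopt;  // zero length is treated as a terminator
    std::size_t remaining = buf.size() - off - 4;
    if (len < 4)
      return off;  // too short to hold the 4-byte field after the length
    if (len > remaining)
      return off;  // record would run past the end of the buffer
    off += 4 + len;
  }
  return std::nullopt;
}
```

With a check like this, a first record that reports a length of 1 would be flagged at offset 0 instead of sending the walk into unrelated memory.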
Hi Christian,

Thanks for sharing this.

Yaron Keren has been investigating some problems in the EH frame registration code recently, and I think this may be related. It at least sounds similar to the type of variations in behavior based on code size that Yaron was seeing.

-Andy
Hi,

I had similar problems with EH in ELF in RTDyldMemoryManager::registerEHFrames() calling __register_frame().

I'm not sure these problems are related to this problem since your crash happens in RuntimeDyldMachO::registerEHFrames() in its own processFDE (there are two functions named processFDE(), one in RuntimeDyldMachO.cpp and one in RTDyldMemoryManager.cpp) *before* RTDyldMemoryManager::registerEHFrames() and __register_frame() are called.

It would seem that even if RTDyldMemoryManager::registerEHFrames() and __register_frame() got problematic input (as with the ELF dyn. linker) it should not cause a crash in the calling code but either a malfunction of exceptions or crash in RTDyldMemoryManager::registerEHFrames() / __register_frame(). A crash like the one you see should be related to RuntimeDyldMachO::registerEHFrames() inputs only.

Yaron
Christian Schafmeister
2013-Oct-14 18:53 UTC
[LLVMdev] A weird, reproducable problem with MCJIT
Andrew,

Thanks for following up.

Some people on IRC suggested that perhaps my BasicBlocks that contain only an "unreachable" IR instruction were being removed and two BasicBlock labels were getting the same address. I inserted a "call @unreachableError()" instruction before every "unreachable" instruction that I generated. That caused the function below to compile fine, but another function now exhibits the same crash. It seems very sensitive to the size of something within the function.

Another thing: I inserted printf statements in processFDE to print the lengths of the FDEs each time it gets called. Here's what it looks like when everything compiles and finalizeObject doesn't crash:

"/Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:81 In registerEHFrames P = 0x10975ec20
/Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:26 processFDE Length = 28/1c
/Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:26 processFDE Length = 68/44
/Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:26 processFDE Length = 44/2c
/Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:26 processFDE Length = 68/44
/Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:26 processFDE Length = 68/44
/Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:26 processFDE Length = 68/44
/Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:26 processFDE Length = 52/34
/Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:26 processFDE Length = 20/14
/Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:26 processFDE Length = 28/1c
/Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:26 processFDE Length = 28/1c
/Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:26 processFDE Length = 44/2c
/Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:26 processFDE Length = 36/24
"Loading bitcode file: /Users/meister/Development/cando/brcl/src/lisp/kernel/lsp/setf.bc
"/Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:81 In registerEHFrames P = 0x231875000
/Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:26 processFDE Length = 28/1c
/Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:26 processFDE Length = 68/44
/Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:26 processFDE Length = 44/2c
... and so on

Here's what it looks like when it crashes:

"" In environment: COMMON-LISP:NIL
"/Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:81 In registerEHFrames P = 0x27ddf71b8
/Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:26 processFDE Length = 1/1
/Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:26 processFDE Length = 16777216/1000000
/Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:26 processFDE Length = 17267682/1077be2
/Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:26 processFDE Length = 1515870810/5a5a5a5a
Process 85800 stopped
* thread #1: tid = 0x59640d, 0x0000000106627f2c libLLVM-3.4svn.dylib`llvm::processFDE(unsigned char*, long, long) + 44 at /Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:25, stop reason = EXC_BAD_ACCESS (code=1, address=0x2da414805)
    frame #0: 0x0000000106627f2c libLLVM-3.4svn.dylib`llvm::processFDE(unsigned char*, long, long) + 44 at /Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:25
   22   namespace llvm {
   23
   24   static unsigned char *processFDE(unsigned char *P, intptr_t DeltaForText, intptr_t DeltaForEH) {
-> 25     uint32_t Length = *((uint32_t*)P);
   26     printf("%s:%d processFDE Length = %u/%x\n", __FILE__, __LINE__, Length, Length);
   27     P += 4;
   28     unsigned char *Ret = P + Length;

Notice that the "Length" field of the very first FDE is 1 byte, which is crazy; after that it crashes.

I'm trying to track down where the FDE entries are created, but I don't know the code well at all.

I can run any other tests you suggest. This problem is reproducible.

Best,

.Chris.
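The first two lengths in the crashing run (1, then 0x1000000) are consistent with an unchecked "P += 4 + Length" walk that lands on an unaligned offset after a single bogus length. The sketch below only illustrates that effect: `lengthsSeenByNaiveWalk` is a hypothetical function, the bytes used with it are made up to reproduce those two values, and they are not the actual contents of the section.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration: record every length field that an unchecked
// "advance by 4 + Length" walk would read, stopping as soon as the next
// step would leave the buffer (the real code has no such stop).
std::vector<std::uint32_t>
lengthsSeenByNaiveWalk(const std::vector<std::uint8_t> &buf) {
  std::vector<std::uint32_t> seen;
  std::size_t off = 0;
  while (off + 4 <= buf.size()) {
    const std::uint8_t *p = buf.data() + off;
    // Little-endian decode, byte by byte, so unaligned reads are safe.
    std::uint32_t len = static_cast<std::uint32_t>(p[0]) |
                        (static_cast<std::uint32_t>(p[1]) << 8) |
                        (static_cast<std::uint32_t>(p[2]) << 16) |
                        (static_cast<std::uint32_t>(p[3]) << 24);
    seen.push_back(len);
    if (len > buf.size() - off - 4)
      break;  // an unchecked walk would now read outside the buffer
    off += 4 + len;
  }
  return seen;
}
```

For example, with the made-up bytes `01 00 00 00 AA 00 00 00 01`, the first read yields 1, the walk then skips 4 + 1 bytes to offset 5, and the next read decodes `00 00 00 01` as 0x1000000. If the real section looks anything like this, the first bad length is the one worth tracing; the later values would just be whatever memory the walk happens to land on.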