Hi,

I had similar problems with EH in ELF in RTDyldMemoryManager::registerEHFrames() calling __register_frame().

I'm not sure those problems are related to yours, since your crash happens in RuntimeDyldMachO::registerEHFrames(), in its own processFDE() (there are two functions named processFDE(), one in RuntimeDyldMachO.cpp and one in RTDyldMemoryManager.cpp), *before* RTDyldMemoryManager::registerEHFrames() and __register_frame() are called.

It would seem that even if RTDyldMemoryManager::registerEHFrames() and __register_frame() got problematic input (as with the ELF dynamic linker), that should not cause a crash in the calling code, but rather either a malfunction of exceptions or a crash in RTDyldMemoryManager::registerEHFrames() / __register_frame(). A crash like the one you see should be related to the inputs of RuntimeDyldMachO::registerEHFrames() only.

Yaron

2013/10/14 Kaylor, Andrew <andrew.kaylor at intel.com>

> Hi Christian,
>
> Thanks for sharing this.
>
> Yaron Keren has been investigating some problems in the EH frame registration code recently, and I think this may be related. It at least sounds similar to the type of variations in behavior based on code size that Yaron was seeing.
>
> -Andy
>
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Christian Schafmeister
> Sent: Sunday, October 13, 2013 10:52 PM
> To: llvmdev at cs.uiuc.edu
> Subject: [LLVMdev] A weird, reproducable problem with MCJIT
>
> I switched my Common Lisp compiler to use MCJIT on the weekend and ran into a weird problem compiling one particular function.
>
> It crashes with an EXC_BAD_ACCESS error in MCJIT::finalizeObject when calling processFDE.
>
> The weird part is that the function does not appear to do anything special and I've whittled it down to the minimum size that still causes the crash. If I remove even one statement it compiles fine.
> Note: The function doesn't make much sense anymore but it does compile fine.
> It does have a lot of nested scopes.
>
> I can single step through processFDE and I see it pulls up a Length in processFDE of 1 and then a length of 16#1000000 - clearly something has been corrupted.
>
> Here is the top of the backtrace from lldb:
> ----------------------
> * thread #1: tid = 0x557509, 0x0000000107e6e066 libLLVM-3.4svn.dylib`llvm::processFDE(unsigned char*, long, long) + 134 at /Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:35, stop reason = EXC_BAD_ACCESS (code=2, address=0x1102adad1)
>     frame #0: 0x0000000107e6e066 libLLVM-3.4svn.dylib`llvm::processFDE(unsigned char*, long, long) + 134 at /Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:35
>     frame #1: 0x0000000107e6df04 libLLVM-3.4svn.dylib`llvm::RuntimeDyldMachO::registerEHFrames() + 356 at /Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp:81
>     frame #2: 0x0000000107e36f39 libLLVM-3.4svn.dylib`llvm::RuntimeDyld::registerEHFrames() + 25 at /Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp:595
>     frame #3: 0x0000000107aca9c7 libLLVM-3.4svn.dylib`llvm::MCJIT::finalizeObject() + 775 at /Users/meister/Development/cando/brcl/externals/src/llvm/lib/ExecutionEngine/MCJIT/MCJIT.cpp:226
>     frame #4: 0x0000000103da55d3 libllvmo_dbg.dylib`llvmo::ExecutionEngine_O::getCompiledFunction(mem::smart_ptr<core::Symbol_O>, mem::smart_ptr<llvmo::Function_O>, mem::smart_ptr<core::ActivationFrame_O>, mem::smart_ptr<core::Symbol_O>) + 115 at /Users/meister/Development/cando/brcl/src/llvmo/../../src/llvmo/llvmoExpose.cc:811
>
> Here is the function that causes the crash (it's Common Lisp and all macros have been expanded).
>
> (defun match-dimensions (array pat)
>   (let ((zzz (eq pat '*)))
>     (if zzz
>         zzz
>         (let ((rank (array-rank array) ))
>           (if (listp pat)
>               (BLOCK NIL
>                 (LET* (
>                        (%DOTIMES-VAR 100)
>                        (I 0)
>                        )
>                   (TAGBODY
>                     (GO bot)
>                    top
>                     (if (not t) (progn))
>                     (SETQ PAT (CDR PAT))
>                     (SETQ I (1+ I))
>                    bot
>                     (if (< I %DOTIMES-VAR) (GO top))
>                     )
>                   (NULL PAT))))))))
>
> The LLVM-IR file generated by compiling this function definition is here: https://dl.dropboxusercontent.com/u/6229900/broken.ll
>
> Any suggestions or pointers on how to debug this are welcome.
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Hi,

One possibility (discussed on IRC) is that zero-sized atoms are being created where BBs contain only a single 'unreachable'. This situation (BBs with only unreachable) occurs in a number of places in Christian's code.

If this were the case on OSX with ld64, then ld64 would complain about it. I'm not sure what happens if a similar situation is presented to MCJIT.

Also, now I see the debugging comment below: it seems that the size might be '1' rather than 0, so perhaps this is a red herring.

cheers
Iain

On 14 Oct 2013, at 19:36, Yaron Keren wrote:

> [...]
>
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Christian Schafmeister
>
> [...]
>
> I can single step through processFDE and I see it pulls up a Length in processFDE of 1 and then a length of 16#1000000 - clearly something has been corrupted.
>
> [...]
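For readers following the Length values reported from processFDE (1, then 16#1000000): each .eh_frame record starts with a 4-byte length that does not count the length field itself, and a zero length ends the section. The following is a minimal sketch of a bounds-checked walk over such a buffer. It is not LLVM's processFDE; the name countEHFrameEntries is hypothetical, it assumes host-endian lengths, and it simply rejects the 64-bit extended-length form (0xffffffff).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper, not part of LLVM: walks the length-prefixed records
// of an .eh_frame buffer and returns how many were seen, or -1 if a length
// field would run past the end of the buffer.
long countEHFrameEntries(const uint8_t *data, std::size_t size) {
  assert(data != nullptr || size == 0);
  std::size_t offset = 0;
  long count = 0;
  while (offset < size) {
    if (size - offset < 4)
      return -1;                       // truncated length field
    uint32_t length;
    std::memcpy(&length, data + offset, sizeof(length));  // host-endian read
    offset += 4;
    if (length == 0)
      break;                           // zero-length terminator
    if (length == 0xffffffffu)
      return -1;                       // extended length: not handled here
    if (length > size - offset)
      return -1;                       // record overruns the section
    offset += length;                  // skip the record body
    ++count;
  }
  return count;
}
```

A walker that trusts a length of 1 resumes reading one byte past the length field, so the next "length" is assembled from unrelated bytes; a check like the one above turns that into an error return instead of an out-of-bounds access.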
Christian Schafmeister
2013-Oct-14 19:40 UTC
[LLVMdev] A weird, reproducable problem with MCJIT
Yaron,

Did you find a way around the problem?

It looks like the problem comes before processFDE, because by the time it gets to processFDE the eh_frame data is already corrupted.

Do ELF and MachO share the same eh_frame format?

I am developing this code in parallel on an Ubuntu Linux system, but I haven't tried to run it there for a couple of weeks. I'll bring it up to date, try my test case on it, and we'll see what happens.

Best,

.Chris.

Yaron Keren <yaron.keren at gmail.com> writes:

> Hi,
>
> I had similar problems with EH in ELF in RTDyldMemoryManager::registerEHFrames() calling __register_frame().
>
> [...]
Hi,

There may be two problems with __register_frame usage. However, based on

http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-April/061768.html

I think the existing code is correct for OS-X but likely buggy for Linux and Windows systems. Your crash is on OS-X, right?

Anyhow, the first problem is very easy to fix. On Linux and Windows (at least), __register_frame should be called once, not on every FDE as processFDE in RTDyldMemoryManager.cpp does. So RTDyldMemoryManager::registerEHFrames was modified to:

    void RTDyldMemoryManager::registerEHFrames(uint8_t *Addr,
                                               uint64_t LoadAddr,
                                               size_t Size) {
      __register_frame(Addr);
    }

On Windows 7 / MinGW (gcc) this completely solved the problems I had with erratic exception behaviour.

The second issue is a bit more complicated. With executable files, the linker combines the .eh_frame sections and terminates the result with four zero bytes from crtend, marking the end of the .eh_frame section. As Rafael writes, this can't be done in codegen, since it's a linker function performed when all .eh_frame sections are combined. The dynamic linker must perform the same function; otherwise __register_frame(.eh_frame) might continue processing past the end of .eh_frame, depending on whether four zero bytes happen to follow it.

However, this again isn't likely to be the source of your problem, as __register_frame on OS-X processes one FDE at a time, and the calling function processFDE() in RTDyldMemoryManager.cpp does know the size of eh_frame, so it will not overrun the frame.

The solution would be to allocate a larger buffer and copy .eh_frame into it with four zero bytes appended. This buffer needs to live as long as it's registered in the runtime library.

Yaron
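The buffer idea in the last paragraph can be sketched roughly as follows. This is only an illustration under stated assumptions: copyWithTerminator is a hypothetical helper name, not an LLVM function, and the sketch leaves out the registration call itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper, not part of LLVM: returns a copy of an .eh_frame
// section followed by four zero bytes, which read as a zero-length
// terminator entry.
std::vector<uint8_t> copyWithTerminator(const uint8_t *ehFrame,
                                        std::size_t size) {
  assert(ehFrame != nullptr || size == 0);
  std::vector<uint8_t> padded(size + 4, 0);  // every byte starts as zero
  if (size != 0)
    std::memcpy(padded.data(), ehFrame, size);
  return padded;                             // last four bytes remain zero
}
```

In this sketch the caller would pass padded.data() to __register_frame and must keep the vector alive (and unmoved) for as long as the frames stay registered, which is the lifetime requirement described above.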