Yaron Keren
2013-Oct-22 20:40 UTC
[LLVMdev] Size limitations in MCJIT / ELF Dynamic Linker/ ELF codegen?
Hi,

Thanks for your ideas.

Memory allocation already exceeds 2x64K in the "working" case, so the trigger is not allocating more than 64K. To be sure, I modified SectionMemoryManager::allocateSection to allocate four times the required memory, but it did not trigger more crashes. I debugged through the allocation code, including the Win32 code, and it seems to work well. I also tried disabling the MemGroup.FreeMem cache, which did not matter.

I added an assert for no stubs at the end of RuntimeDyldImpl::loadObject:

    processRelocationRef(SectionID, *i, *obj, LocalSections, LocalSymbols,
                         Stubs);
    assert(!Stubs.size());

It caught nothing, so no stubs are created.

Disabling (de)registerEH did not help.

Looking at the relocation and section printouts, the exception is:

    Unhandled exception at 0x0A3600D1 :
    0xC0000005: Access violation writing location 0x00BC7680.

which is right after the start of .text:

    emitSection SectionID: 1 Name: .text obj addr: 0A3F1350 new addr: 0A360000 DataSize: 253203 StubBufSize: 0 Allocate: 253203
    ...
    Resolving relocations Section #1 0A360000

So at least it is running code, but it tries to write to a wrong location. Another run exhibits a similar crash, still in .text but somewhat later.

I have checked, and the address of the function I'm running is located towards the end of .text, as expected since it's the last function added to the Module.

I also speculated that it crashes when .text crosses 128K, but no: it happens only when .text is larger than that.

I attached gdb to the process hoping it would show more information, but it showed even less than the Visual C++ debugger.

Out of ideas...

Yaron

2013/10/22 Kaylor, Andrew <andrew.kaylor at intel.com>

> I would guess that it's crashing somewhere in the generated code. On
> Windows we don't have a way to get call stacks to the generated code
> (though if you want to try it on Linux, that should work). You can
> probably look at the address where the crash is occurring and verify that
> it is in the generated code.
>
> There are a couple of things I would look for.
>
> First, I'd take a look at the SectionMemoryManager allocation handling.
> The fact that the problem is code size dependent strongly points in this
> direction. It may be that SectionMemoryManager does something wrong when
> it hits a page boundary or something.
>
> Second, I'd look at the relocation processing. If it is generating any
> stubs, that would be a potential problem spot, but it shouldn't be
> generating any stubs. So the obvious thing to look at is whether any of
> the relocations are writing to the spot where the crash occurs.
>
> -Andy
>
> From: Yaron Keren [mailto:yaron.keren at gmail.com]
> Sent: Tuesday, October 22, 2013 10:17 AM
> To: Kaylor, Andrew
> Cc: <llvmdev at cs.uiuc.edu>
> Subject: Re: Size limitations in MCJIT / ELF Dynamic Linker/ ELF codegen?
>
> The OS is 64-bit Windows 7; the compiler is 32-bit Visual C++ 2012.
> The target is i686-pc-mingw32-elf so I can use the ELF dynamic loader.
> Code model, relocation model and memory manager are whatever is the
> default for this target; I did not modify them.
>
> The Module comes from clang. The source is 1000 or more lines repeating
> C++ code in one big function:
>
>     A+1;
>     A*B.t();
>
> where A and B are matrices from Armadillo http://arma.sourceforge.net/.
> This is a stress and performance test due to the large number of EH and
> temporary objects created.
>
> I am using the Engine Builder and MCJIT unmodified (except for the
> multi-modules patches, which are not relevant as there is only one module)
> like this:
>
>     OwningPtr<llvm::ExecutionEngine> EE(llvm::EngineBuilder(M)
>         .setErrorStr(&Error)
>         .setUseMCJIT(true)
>         .create());
>
> To run the function, either
>
>     llvm::Function *F = M->getFunction(Name);
>     void *FN = EE->getPointerToFunction(F);
>
> or
>
>     uint64_t FN = EE->getFunctionAddress(Name);
>
> followed by
>
>     ((void (*)())FN)();
>
> or
>
>     EE->runFunction(F, std::vector<llvm::GenericValue>());
>
> All of these work the same with a module of fewer than about 1000 lines
> of the above code, and crash the same with more code. The call stack is
> unhelpful. Visual C++ says "Frames below may be incorrect and/or missing",
> which indicates a real problem with it. I have tried to provide less
> stack space (default is 10M) for the compiled program without any change.
>
> Yaron
>
> 2013/10/22 Kaylor, Andrew <andrew.kaylor at intel.com>
>
> I'm not aware of such a limitation.
>
> What architecture, code model and relocation model are you using? Are
> you using the SectionMemoryManager?
>
> -Andy
>
> From: Yaron Keren [mailto:yaron.keren at gmail.com]
> Sent: Tuesday, October 22, 2013 8:12 AM
> To: <llvmdev at cs.uiuc.edu>; Kaylor, Andrew
> Subject: Size limitations in MCJIT / ELF Dynamic Linker/ ELF codegen?
>
> I'm running in MCJIT a module generated from one C++ function. Every line
> of the source function uses C++ classes and may throw an exception. As
> long as there are fewer than (about) 1000 lines, everything works. With
> more lines the compiled code crashes when running it, with no sensible
> stack trace.
>
> Is there any kind of hard-coded size limitation in MCJIT / ELF Dynamic
> Linker / ELF codegen / number of EH states in a function?
>
> I did browse the code but could not find anything obvious.
>
> Yaron
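The check described in this message (is the faulting instruction inside the emitted .text, and is the written address outside it?) can be sketched in a few lines. This is only an illustration: `SectionRange` and `checkReportedCrash` are hypothetical names, not part of LLVM, and the numbers are copied from the emitSection printout and the exception text quoted in the message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type (not an LLVM class): one emitted section,
// described by its load address and allocated size.
struct SectionRange {
  std::uint64_t Base;
  std::uint64_t Size;

  // True when Addr falls inside [Base, Base + Size).
  bool contains(std::uint64_t Addr) const {
    return Addr >= Base && Addr - Base < Size;
  }
};

// Classifies the two addresses from the crash report in the thread.
inline void checkReportedCrash() {
  // "new addr: 0A360000 ... Allocate: 253203" from the emitSection line.
  const SectionRange Text{0x0A360000, 253203};

  // The faulting instruction is 0xD1 bytes into .text.
  assert(Text.contains(0x0A3600D1));

  // The address being written is nowhere near the emitted section.
  assert(!Text.contains(0x00BC7680));
}
```

Calling `checkReportedCrash()` passes both assertions, which matches the reading in the message: the program counter is a valid location early in .text, while the write target lies outside the section.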
Kaylor, Andrew
2013-Oct-22 20:55 UTC
[LLVMdev] Size limitations in MCJIT / ELF Dynamic Linker/ ELF codegen?
So it looks like 0x0A3600D1 is a good code address and there's no problem executing the code there, but 0x00BC7680 is a bad data address. Is that correct?

If so, this is almost certainly a relocation problem. You just need to find a relocation that writes an entry (probably a relative offset) at 0x0A3600D1 + the size of the instruction at that address.

BTW, what I said before about not being aware of any size limitations wasn't quite correct. If you have enough code and data that we end up putting sections at addresses that are more than 2GB apart we'll have problems, but you should see an assertion in that case. That can happen if we weren't able to get the address we requested from allocateMappedMemory, but it doesn't look like that's what's happening here.

-Andy
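One way to carry out the search Andy suggests is to list every applied relocation and keep those whose patched bytes overlap the faulting instruction. The sketch below assumes the reader has already collected the relocation addresses and sizes somehow (for example from the debug printout); `AppliedReloc` and `relocsTouching` are hypothetical names, not RuntimeDyld APIs. Since the instruction length is unknown without a disassembler, the window defaults to 15 bytes, the maximum length of an x86 instruction.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record of one applied relocation (not an LLVM type).
struct AppliedReloc {
  std::uint64_t Address; // load address of the first patched byte
  unsigned Size;         // width of the patched field in bytes
};

// Returns the relocations whose patched bytes overlap the window
// [InstrAddr, InstrAddr + MaxInstrLen).
inline std::vector<AppliedReloc>
relocsTouching(const std::vector<AppliedReloc> &Relocs,
               std::uint64_t InstrAddr, unsigned MaxInstrLen = 15) {
  std::vector<AppliedReloc> Hits;
  const std::uint64_t End = InstrAddr + MaxInstrLen;
  for (const AppliedReloc &R : Relocs) {
    // Two half-open ranges overlap when each starts before the other ends.
    if (R.Address < End && R.Address + R.Size > InstrAddr)
      Hits.push_back(R);
  }
  return Hits;
}
```

For the crash in this thread the call would be `relocsTouching(Relocs, 0x0A3600D1)`. An empty result would suggest that the bad operand was not produced by a relocation at all.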
Yaron Keren
2013-Oct-22 23:22 UTC
[LLVMdev] Size limitations in MCJIT / ELF Dynamic Linker/ ELF codegen?
Yes, this is correct: a good code address accessing a bad data address. However, there is no other relocation before .text or near it. I'll send you the full debug printout; maybe you'll notice something.

The problem could be the result of something entirely other than the linker, such as some library initialization code that by chance worked with smaller code but fails now. I need to debug and see what's going on. The trouble is that there is no debug information. Maybe I can do without the source-level information and debug the assembly, but without any symbols it's really a challenge to understand anything. I did try to make MCJIT emit debug info, but for some reason the attached gdb did not understand it. Maybe this could be solved.

I assumed there may be some limitations around 31-32 bits, as there are various int32 members in the ELF structures, but that's far, far away. Problems start at a .text size of about 150K.

Yaron
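To put the "far, far away" remark in numbers, here is a small sketch of the range check that a signed 32-bit field implies. It is an illustration under stated assumptions only: `fitsInInt32` and `reachableWithRel32` are hypothetical helpers rather than LLVM functions, and the check models a generic signed 32-bit displacement, not the exact logic RuntimeDyld uses.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True when Value can be stored in a signed 32-bit field without truncation.
inline bool fitsInInt32(std::int64_t Value) {
  return Value >= std::numeric_limits<std::int32_t>::min() &&
         Value <= std::numeric_limits<std::int32_t>::max();
}

// True when the distance from From to To fits a signed 32-bit displacement.
// Assumes both addresses are below 2^63 so the subtraction cannot overflow.
inline bool reachableWithRel32(std::uint64_t From, std::uint64_t To) {
  const std::int64_t Delta =
      static_cast<std::int64_t>(To) - static_cast<std::int64_t>(From);
  return fitsInInt32(Delta);
}
```

With the figures from the thread, `fitsInInt32(253203)` is true and `reachableWithRel32(0x0A360000, 0x0A360000 + 253203)` is true, while a distance of exactly 2GB (`1LL << 31`) is the first one that no longer fits. A .text of 150K to 250K is several orders of magnitude below that bound.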