Reid Kleckner
2009-Jun-30 00:41 UTC
[LLVMdev] JIT allocates global data in function body memory
So I (think I) found a bug in the JIT: http://llvm.org/bugs/show_bug.cgi?id=4483

Basically, globals used by a function are allocated in the same buffer as the first code that uses them. However, when you free the machine code, you also free the memory holding the global's data. The address is still in the GlobalValue map, so any other code using that global will access freed memory, which will cause problems as soon as you reallocate that memory for something else.

I tracked down the commit that introduced the bug: http://llvm.org/viewvc/llvm-project?view=rev&revision=54442

It very nicely explains what it does, but not why it does it, which I'd like to know before I change it. I couldn't find the author (johannes) on IRC, so ssen told me to ask LLVMdev about this behavior.

There's even a patch to work around this behavior on Apple ARM platforms: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ExecutionEngine/JIT/JIT.cpp?view=diff&pathrev=72630&r1=58687&r2=58688

So what should the right long-term behavior be? It makes sense to me to use the JITMemoryManager for this so that clients of the JIT can customize allocation instead of using malloc or new char[]. On the other hand, that complicates the API and requires a homegrown malloc implementation in the DefaultMemoryManager.

Reid
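The failure mode described above can be sketched with a toy model. This is not LLVM code: `ToyJIT`, `emitFunction`, `freeMachineCode`, and `isDangling` are hypothetical names, and buffers are modeled as integer ids so the stale entry can be observed without actually touching freed memory.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Toy model only: none of these names are LLVM APIs.
struct GlobalAddr {
    int buffer;          // id of the buffer that holds the global's storage
    std::size_t offset;  // offset of the global inside that buffer
};

class ToyJIT {
public:
    // Emit a function into a fresh buffer. Any global that has not been
    // emitted yet is placed in this same buffer.
    int emitFunction(const std::string &name,
                     const std::set<std::string> &globalsUsed) {
        int id = nextBuffer_++;
        liveBuffers_.insert(id);
        functionBuffer_[name] = id;
        std::size_t offset = 0;
        for (const std::string &g : globalsUsed) {
            if (globalMap_.count(g) == 0) {
                globalMap_.emplace(g, GlobalAddr{id, offset});
                offset += 8;  // pretend every global is 8 bytes
            }
        }
        return id;
    }

    // Free a function's buffer. The global map is deliberately left
    // untouched, which leaves a stale address behind.
    void freeMachineCode(const std::string &name) {
        auto it = functionBuffer_.find(name);
        if (it == functionBuffer_.end()) return;
        liveBuffers_.erase(it->second);
        functionBuffer_.erase(it);
    }

    // True if the map still records an address whose buffer is gone.
    bool isDangling(const std::string &global) const {
        auto it = globalMap_.find(global);
        return it != globalMap_.end() &&
               liveBuffers_.count(it->second.buffer) == 0;
    }

private:
    int nextBuffer_ = 0;
    std::set<int> liveBuffers_;
    std::map<std::string, int> functionBuffer_;
    std::map<std::string, GlobalAddr> globalMap_;
};

// Usage: returns true when the stale map entry is observed.
bool demoStaleGlobal() {
    ToyJIT jit;
    jit.emitFunction("f", {"counter"});  // "counter" lands in f's buffer
    jit.emitFunction("g", {"counter"});  // g reuses the recorded address
    assert(!jit.isDangling("counter"));
    jit.freeMachineCode("f");            // frees the storage g still uses
    return jit.isDangling("counter");
}
```

In the sketch, `g` is still live after `f` is freed, yet the only recorded address for `counter` points into a buffer that no longer exists.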
Dale Johannesen
2009-Jun-30 00:50 UTC
[LLVMdev] JIT allocates global data in function body memory
On Jun 29, 2009, at 5:41 PM PDT, Reid Kleckner wrote:

> So I (think I) found a bug in the JIT:
> http://llvm.org/bugs/show_bug.cgi?id=4483
>
> Basically, globals used by a function are allocated in the same buffer as the first code that uses it. However, when you free the machine code, you also free the memory holding the global's data. The address is still in the GlobalValue map, so any other code using that global will access freed memory, which will cause problems as soon as you reallocate that memory for something else.
>
> I tracked down the commit that introduced the bug:
> http://llvm.org/viewvc/llvm-project?view=rev&revision=54442
>
> It very nicely explains what it does, but not why it does it, which I'd like to know before I change it. I couldn't find the author (johannes) on IRC so ssen told me to ask LLVMdev about this behavior.

That's me (and I'm not on IRC because I like messages to be archived). The reason everything needs to go in the same buffer is that we're JITting code on one machine, then sending it to another to be executed, and references from one buffer to another won't work in that environment. So that model needs to continue to work. If you want to generalize it so other models work as well, go ahead.

> There's even a patch to work around this behavior on Apple ARM platforms:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ExecutionEngine/JIT/JIT.cpp?view=diff&pathrev=72630&r1=58687&r2=58688
>
> So what should the right long-term behavior be? It makes sense to me to use the JITMemoryManager for this so that clients of the JIT can customize allocation instead of using malloc or new char[]. On the other hand, that complicates the API and requires a homegrown malloc implementation in the DefaultMemoryManager.
>
> Reid
Reid Kleckner
2009-Jun-30 02:23 UTC
[LLVMdev] JIT allocates global data in function body memory
> That's me (and I'm not on IRC because I like messages to be archived). The reason everything needs to go in the same buffer is that we're JITting code on one machine, then sending it to another to be executed, and references from one buffer to another won't work in that environment. So that model needs to continue to work. If you want to generalize it so other models work as well, go ahead.

Maybe what I should do then is change TargetJITInfo::allocateSeparateGVMemory to allocateGVsWithCode and invert the meaning, since I feel like most users probably just want malloc or something similar. You could then subclass the appropriate TJI class and override that method. Would that be a reasonable API change? No one else calls or overrides that method.

In order to do that we'd also need to hear from Evan about why Apple ARM needs to use malloc.

Is it worth allocating global data through the memory manager, or is it better to use malloc? Currently global data lives forever (or rather, as long as the JIT does, which makes sense since globals live forever in a statically compiled program), so I was thinking it might be good for memory usage and locality to put small globals together into a buffer similar to the function stub buffer. That way you don't call malloc for every global int64, and you can lay them out sequentially in memory. Large global arrays should probably get their own block.

Reid
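The pooling idea in the last paragraph can be sketched roughly as follows. `GlobalPool`, the 4096-byte slab size, and the 256-byte "large" threshold are all assumptions made for illustration; nothing here is taken from LLVM's memory manager.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical pool: small globals are bump-allocated from shared slabs,
// large globals each get a dedicated block. Nothing is freed until the
// pool itself is destroyed, matching "globals live as long as the JIT".
class GlobalPool {
public:
    static constexpr std::size_t kSlabSize = 4096;       // illustrative
    static constexpr std::size_t kLargeThreshold = 256;  // illustrative

    // Returns zero-initialized storage. align must be a power of two no
    // larger than alignof(std::max_align_t).
    void *allocate(std::size_t size, std::size_t align) {
        assert(align != 0 && (align & (align - 1)) == 0);
        assert(align <= alignof(std::max_align_t));
        if (size > kLargeThreshold) {
            // Large global: its own block, outside the slabs.
            large_.push_back(std::make_unique<unsigned char[]>(size));
            return large_.back().get();
        }
        // Round the current slab offset up to the requested alignment.
        std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (slabs_.empty() || start + size > kSlabSize) {
            // Current slab is missing or full: open a new one.
            slabs_.push_back(std::make_unique<unsigned char[]>(kSlabSize));
            start = 0;
        }
        used_ = start + size;
        return slabs_.back().get() + start;
    }

    std::size_t slabCount() const { return slabs_.size(); }
    std::size_t largeBlockCount() const { return large_.size(); }

private:
    std::vector<std::unique_ptr<unsigned char[]>> slabs_;
    std::vector<std::unique_ptr<unsigned char[]>> large_;
    std::size_t used_ = 0;  // bytes consumed in the newest slab
};
```

With this layout, two consecutive 8-byte globals end up adjacent in one slab, while a 1024-byte array gets a separate block and does not disturb the slab's bump pointer.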
Jeffrey Yasskin
2009-Jun-30 18:18 UTC
[LLVMdev] JIT allocates global data in function body memory
On Mon, Jun 29, 2009 at 5:50 PM, Dale Johannesen <dalej at apple.com> wrote:

> On Jun 29, 2009, at 5:41 PM PDT, Reid Kleckner wrote:
>
>> So I (think I) found a bug in the JIT:
>> http://llvm.org/bugs/show_bug.cgi?id=4483
>>
>> Basically, globals used by a function are allocated in the same buffer as the first code that uses it. However, when you free the machine code, you also free the memory holding the global's data. The address is still in the GlobalValue map, so any other code using that global will access freed memory, which will cause problems as soon as you reallocate that memory for something else.
>>
>> I tracked down the commit that introduced the bug:
>> http://llvm.org/viewvc/llvm-project?view=rev&revision=54442
>>
>> It very nicely explains what it does, but not why it does it, which I'd like to know before I change it. I couldn't find the author (johannes) on IRC so ssen told me to ask LLVMdev about this behavior.
>
> That's me (and I'm not on IRC because I like messages to be archived). The reason everything needs to go in the same buffer is that we're JITting code on one machine, then sending it to another to be executed, and references from one buffer to another won't work in that environment. So that model needs to continue to work. If you want to generalize it so other models work as well, go ahead.

So, you're moving code across machines without running any relocations on it? How can that work? Are you just assuming that everything winds up at the same addresses? Or is everything PC-relative on your platform, so all that matters is that globals and the code are in the same relative positions? How are you getting the size of the code you need to copy? MachineCodeInfo didn't exist when you wrote this patch, so I assume you've written your own JITMemoryManager.
Even then, if you JIT more than one function, and they share any globals, you have to deal with multiple calls into the MemoryManager and functions that use globals allocated inside other buffers. You should be able to deal with having separate calls to allocate global space and allocate code space. You'd just remember the answers you gave and preserve them when copying to a new system.

I'd like freeMachineCodeForFunction to avoid corrupting emitted globals, and with the current arrangement of information within the JIT, that means globals and code have to live in different allocations. I think Reid's suggesting a flag of some sort, with one setting for "freeMachineCodeForFunction works" and another for "globals and code are allocated by a single call into the MemoryManager." I'd like to avoid new knobs if it's possible, so do you really need that second option? Or do you just need globals to be allocated by some call into the MemoryManager?

Thanks!
Jeffrey
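As a rough illustration of the "separate calls, remember the answers" arrangement, here is a minimal sketch. `RecordingMemoryManager` and its methods are hypothetical and do not mirror the real JITMemoryManager interface; allocations are modeled as ids plus metadata rather than real memory.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

enum class Kind { Code, Global };

struct Allocation {
    Kind kind;
    std::string owner;  // function or global name
    std::size_t size;   // requested size in bytes
};

// Hypothetical manager: code and globals come from separate calls, and
// every answer is recorded so a client could replay the layout elsewhere.
class RecordingMemoryManager {
public:
    int allocateCode(const std::string &fn, std::size_t size) {
        return record(Kind::Code, fn, size);
    }
    int allocateGlobal(const std::string &global, std::size_t size) {
        return record(Kind::Global, global, size);
    }

    // Releases only the code owned by fn; global allocations are untouched.
    void freeCode(const std::string &fn) {
        for (auto it = live_.begin(); it != live_.end();) {
            if (it->second.kind == Kind::Code && it->second.owner == fn)
                it = live_.erase(it);
            else
                ++it;
        }
    }

    bool isLive(int id) const { return live_.count(id) != 0; }

    // The remembered answers, in allocation order.
    std::vector<Allocation> snapshot() const {
        std::vector<Allocation> out;
        for (const auto &entry : live_) out.push_back(entry.second);
        return out;
    }

private:
    int record(Kind kind, const std::string &owner, std::size_t size) {
        int id = next_++;
        live_.emplace(id, Allocation{kind, owner, size});
        return id;
    }

    int next_ = 0;
    std::map<int, Allocation> live_;
};

// Usage: freeing f's code leaves the shared global alive for g.
bool demoSeparateAllocations() {
    RecordingMemoryManager mm;
    int counter = mm.allocateGlobal("counter", 8);
    int f = mm.allocateCode("f", 64);
    int g = mm.allocateCode("g", 48);
    mm.freeCode("f");
    assert(!mm.isLive(f));
    return mm.isLive(counter) && mm.isLive(g);
}
```

Whether a remote-execution client can work with separately recorded allocations like this, rather than a single combined buffer, is exactly the open question in the message above; the sketch only shows that the bookkeeping itself is simple.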