Evan Cheng wrote:> On Apr 13, 2008, at 1:28 PM, Török Edwin wrote: > > >> Hi, >> >> In PR2193 LiveVariables runs out of memory on a 512M limit, after >> processing 11557 basicblocks. >> VirtRegInfo has ~180000 entries with ~700 bytes each. >> If I give it more memory (1.5G) it runs out of memory in LiveInterval. >> > > Some of the information kept by LiveVariables are somewhat redundant > and can be removed. I think there was talk of folding much of its > computation into LiveIntervalAnalysis. It's not clear whether that > would make a difference without careful analysis. > > It's not clear to me what to do about LiveInterval. It does keep a lot > of information but it's all essential. > > I suspect StrongPHIElimination may help the situation somewhat but > again I can't guess its impact. > >Another question to ask, is why that function became so large in the first place [X86DAGToDAGISel::SelectCode(llvm::SDOperand)] We have inline limits, don't we?>> I don't see any easy solution to reduce memory usage, but should we >> optimize such a huge function at once? >> If the function is over some reasonable limit (5k basic-blocks?) we >> could split it into smaller functions, and avoid >> these problems. >> > > But what is the right limit? How do you estimate the limit given > different host configuration?I think having a limit is better than having none. It will prevent memory usage to get out of control. The memory usage of LiveVariables/LiveInterval looks like O(N^2) to me, if we limit N, the memory usage can differ at most by a constant factor. Having a function twice the size, won't suddenly require 4x the memory. Determining a right limit, to fit exactly into a given amount of memory is hard. We could gather some statistics on medium and maximum basic-block count in applications from llvm-test. Then choose a default value, and allow the user to override via a command-line option. For example once gcc once gave me a warning, something like: more than 15000 BBs, disabling gcse (IIRC, I can't find the source file that gave me that warning anymore). Disabling an optimization entirely doesn't seem the best choice to me, I think it is better if we split it into smaller functions, and optimize those. But maybe the right solution is to avoid generating functions that large, instead of splitting it afterwards.> If you can make the decision earlier, > than it can default to the local register allocator path which doesn't > require LiveIntervalAnalysis. >Yep, the local allocator works. How much worse is the performance of generated code with the local allocator vs. the default?> > >> I am writing a pass to do this split at llvm IR level. What do you >> think? >> > > This functionality sounds useful, but I am not sure it's the *right* > fix. We should definitely try harder to make codegen more scalable. > But it's not a trivial problem.Good point. If we have the size-limiting pass, we could become lazy and not care of scalability too much. So maybe it is better if I don't write it ;) Best regards, --Edwin
Török Edwin wrote:> Evan Cheng wrote: > >> On Apr 13, 2008, at 1:28 PM, Török Edwin wrote: >> >> >> >>> Hi, >>> >>> In PR2193 LiveVariables runs out of memory on a 512M limit, after >>> processing 11557 basicblocks. >>> VirtRegInfo has ~180000 entries with ~700 bytes each. >>> If I give it more memory (1.5G) it runs out of memory in LiveInterval. >>> >>> >> Some of the information kept by LiveVariables are somewhat redundant >> and can be removed. I think there was talk of folding much of its >> computation into LiveIntervalAnalysis. It's not clear whether that >> would make a difference without careful analysis. >> >> It's not clear to me what to do about LiveInterval. It does keep a lot >> of information but it's all essential. >> >> I suspect StrongPHIElimination may help the situation somewhat but >> again I can't guess its impact. >> >> >> > > Another question to ask, is why that function became so large in the > first place [X86DAGToDAGISel::SelectCode(llvm::SDOperand)] > We have inline limits, don't we?most of functions called by SelectCode get a -30000 cost reduction because they are internal. Even if Caller.size() is 40000, the penalty is only 2000, thus we still have negative costs. Removing the /20 factor from here improves the situation a lot, however I think there is a reason for /20, and it can't be just removed: InlineCost += Caller->size()/20; Best regards, --Edwin
On Monday 14 April 2008 03:39, Török Edwin wrote:> Another question to ask, is why that function became so large in the > first place [X86DAGToDAGISel::SelectCode(llvm::SDOperand)] > We have inline limits, don't we?Some functions are just big. Consider production versions of the WRF code from spec 2006. It is machine-generated and can be HUGE!> >> I am writing a pass to do this split at llvm IR level. What do you > >> think? > > > > This functionality sounds useful, but I am not sure it's the *right* > > fix. We should definitely try harder to make codegen more scalable. > > But it's not a trivial problem. > > Good point. If we have the size-limiting pass, we could become lazy and > not care of scalability too much. So maybe it is better if I don't write > it ;)Another option is regioned compilation. Generally, I think degrading optimization (using a local allocator rather than a global one) is hackish. Regioning does degrade optimization due to visibility limits, but it doesn't require special-casing which flavors of optimization are run based on code size. I would very much like to see some kind of regioning make its way into llvm at some point. I'd do it myself if I had free cycles, which I don't. :( -Dave
Chris Lattner
2008-Apr-14  17:43 UTC
[LLVMdev] LiveVariables/LiveInterval on huge functions
On Mon, 14 Apr 2008, [ISO-8859-1] Török Edwin wrote:>> Another question to ask, is why that function became so large in the >> first place [X86DAGToDAGISel::SelectCode(llvm::SDOperand)] >> We have inline limits, don't we? > > most of functions called by SelectCode get a -30000 cost reduction > because they are internal. > Even if Caller.size() is 40000, the penalty is only 2000, thus we still > have negative costs. > > Removing the /20 factor from here improves the situation a lot, however > I think there is a reason for /20, and it can't be just removed: > InlineCost += Caller->size()/20;This sounds like unanticipated fallout from Evan's recent tweaks of the inliner. Evan, thoughts? -Chris -- http://nondot.org/sabre/ http://llvm.org/
Possibly Parallel Threads
- [LLVMdev] LiveVariables/LiveInterval on huge functions
- [LLVMdev] LiveVariables/LiveInterval on huge functions
- [LLVMdev] LiveVariables/LiveInterval on huge functions
- [LLVMdev] LiveVariables/LiveInterval on huge functions
- [LLVMdev] LiveVariables/LiveInterval on huge functions