Russell Wallace via llvm-dev
2015-Aug-22 22:03 UTC
[llvm-dev] Scaling to many basic blocks
How well does LLVM scale to many basic blocks? Let's say you have a single function consisting of a million basic blocks each with a few tens of instructions (and assuming the whole thing isn't trivially repetitive so the number of simultaneously live variables and whatever is large) and you feed that through the optimisers into the backend code generator, will this work okay, or will it take a very long time, or will it run into a scenario of 'look, all compilers are designed on the assumption that nobody is going to write a million lines of code in a single function'? (Not that I'm planning to do so by hand either, I'm just thinking about certain scenarios in automatic code generation.) -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150822/6929b50b/attachment.html>
Michael Zolotukhin via llvm-dev
2015-Aug-22 22:57 UTC
[llvm-dev] Scaling to many basic blocks
Hi, Several passes would have troubles with such code (namely, GVN and JumpThreading). Recently we exposed similar issues when new more aggressive unrolling heuristic was implemented - we unrolled big loops expecting that the linear code would be then optimized, and it indeed was optimized, but compile time regressed significantly. But probably some of these issues have been fixed already - I didn’t check. Thanks, Michael> On Aug 22, 2015, at 3:03 PM, Russell Wallace via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > How well does LLVM scale to many basic blocks? Let's say you have a single function consisting of a million basic blocks each with a few tens of instructions (and assuming the whole thing isn't trivially repetitive so the number of simultaneously live variables and whatever is large) and you feed that through the optimisers into the backend code generator, will this work okay, or will it take a very long time, or will it run into a scenario of 'look, all compilers are designed on the assumption that nobody is going to write a million lines of code in a single function'? (Not that I'm planning to do so by hand either, I'm just thinking about certain scenarios in automatic code generation.) > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.llvm.org_cgi-2Dbin_mailman_listinfo_llvm-2Ddev&d=BQIGaQ&c=eEvniauFctOgLOKGJOplqw&r=ygVmcuuQ1MUhRUoJm-IgPtgjmvM0byfjlHDg99vufEI&m=z0N7FJAyaeJqJqE_W9ltYzHHpYI5tfa3q7x8UoMNaLA&s=mVoa5Fp3pKx6PIZ3F96zYIKlJmBSyzqARoAtuu0ws5A&e=
A somewhat more general question: What characteristics of generated IR contribute to poor optimization/code generation times? I recently found a pathological case in my own code of data tables that caused poor performance in the IR verifier. In general, I am not happy with the performance of LLVM on my code, so I know I need to do better. But how? I have a few ideas, but trying any of them could be time-consuming, with no guaranteed return at the end of it. Currently, I don't emit TBAA metadata. I have a hunch that doing so might speed up compilation, as alias analyses can be reduced in size, but I don't know that for sure, or how much of a benefit it would be. Yes, I need to do this in any case, but if my users complain of long compile times, I'd like to know what to work on first. On Sat, Aug 22, 2015 at 6:57 PM, Michael Zolotukhin via llvm-dev < llvm-dev at lists.llvm.org> wrote:> Hi, > > Several passes would have troubles with such code (namely, GVN and > JumpThreading). Recently we exposed similar issues when new more aggressive > unrolling heuristic was implemented - we unrolled big loops expecting that > the linear code would be then optimized, and it indeed was optimized, but > compile time regressed significantly. But probably some of these issues > have been fixed already - I didn’t check. > > Thanks, > Michael > > On Aug 22, 2015, at 3:03 PM, Russell Wallace via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > > > How well does LLVM scale to many basic blocks? Let's say you have a > single function consisting of a million basic blocks each with a few tens > of instructions (and assuming the whole thing isn't trivially repetitive so > the number of simultaneously live variables and whatever is large) and you > feed that through the optimisers into the backend code generator, will this > work okay, or will it take a very long time, or will it run into a scenario > of 'look, all compilers are designed on the assumption that nobody is going > to write a million lines of code in a single function'? (Not that I'm > planning to do so by hand either, I'm just thinking about certain scenarios > in automatic code generation.) > > _______________________________________________ > > LLVM Developers mailing list > > llvm-dev at lists.llvm.org > > > https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.llvm.org_cgi-2Dbin_mailman_listinfo_llvm-2Ddev&d=BQIGaQ&c=eEvniauFctOgLOKGJOplqw&r=ygVmcuuQ1MUhRUoJm-IgPtgjmvM0byfjlHDg99vufEI&m=z0N7FJAyaeJqJqE_W9ltYzHHpYI5tfa3q7x8UoMNaLA&s=mVoa5Fp3pKx6PIZ3F96zYIKlJmBSyzqARoAtuu0ws5A&e> > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150822/19f760fb/attachment.html>
Russell Wallace via llvm-dev
2015-Aug-23 00:46 UTC
[llvm-dev] Scaling to many basic blocks
On Sat, Aug 22, 2015 at 11:57 PM, Michael Zolotukhin <mzolotukhin at apple.com> wrote:> Hi, > > Several passes would have troubles with such code (namely, GVN and > JumpThreading).Can you just choose not to run those particular passes? I suppose the big problem would be if there's a problem with the code generation and related stuff like instruction scheduling and register allocation - is that likely to be the case? -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150823/eb228cee/attachment.html>
GVN and some other optimizers have really bad behavior on this number of basic blocks. Most algorithms are linear, but some are not, and can't be made linear. They mostly have cutoffs, but sometimes cutoffs get missed, etc :) Otherwise, most algorithms are either linear in the number of instructions/BB, or can be made linear in the number of instructions/BBs if they aren't. On Sat, Aug 22, 2015 at 3:03 PM, Russell Wallace via llvm-dev <llvm-dev at lists.llvm.org> wrote:> How well does LLVM scale to many basic blocks? Let's say you have a single > function consisting of a million basic blocks each with a few tens of > instructions (and assuming the whole thing isn't trivially repetitive so the > number of simultaneously live variables and whatever is large) and you feed > that through the optimisers into the backend code generator, will this work > okay, or will it take a very long time, or will it run into a scenario of > 'look, all compilers are designed on the assumption that nobody is going to > write a million lines of code in a single function'? (Not that I'm planning > to do so by hand either, I'm just thinking about certain scenarios in > automatic code generation.) > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >