On Aug 26, 2009, at 8:32 AM, Chris Lattner <clattner at apple.com> wrote:

> On Aug 26, 2009, at 7:47 AM, Óscar Fuentes wrote:
>
>>> Also if you use -time-passes with llc it should show which pass in
>>> llc takes so much time.
>>
>> These are the three main culprits for llc -O0
>>
>>   ---User Time---   --System Time--   --User+System--   ---Wall Time---   --- Name ---
>>   10.9531 ( 30.0%)   0.4687 ( 58.8%)  11.4218 ( 30.6%)  11.5468 ( 30.6%)  X86 DAG->DAG Instruction Selection
>>   10.2500 ( 28.0%)   0.0156 (  1.9%)  10.2656 ( 27.5%)  10.2500 ( 27.2%)  Live Variable Analysis
>>    4.8593 ( 13.3%)   0.0000 (  0.0%)   4.8593 ( 13.0%)   4.8593 ( 12.9%)  Linear Scan Register Allocator
>>
>> And there for -pre-RA-sched=fast -regalloc=simple -O0 code.bc
>>
>>   10.7187 ( 45.4%)   0.4375 ( 60.8%)  11.1562 ( 45.8%)  11.1718 ( 45.4%)  X86 DAG->DAG Instruction Selection
>>    7.4687 ( 31.6%)   0.0156 (  2.1%)   7.4843 ( 30.7%)   7.5312 ( 30.6%)  Simple Register Allocator
>>    1.9531 (  8.2%)   0.1406 ( 19.5%)   2.0937 (  8.6%)   2.1093 (  8.5%)  X86 Intel-Style Assembly Printer
>>
>> I suppose we can't get rid of instruction selection :-)
>
> Pass -fast-isel to speed up instruction selection.
>
> Dan, I think that this should be made "non hidden" and updated (from
> llc --help):
>
>   -fast-isel   - Enable the experimental "fast" instruction selector

It's turned on by -O0. And I guess it's not so "experimental" at this
point :). It hasn't been tuned for a wide variety of applications yet
though.

An interesting option to add is -fast-isel-verbose, which prints out
LLVM instructions that aren't going down the fast path. If there's
something that shows up a lot, it may be worthwhile looking into why
the front-end is using it, or looking into adding support for that
instruction to the fast path.

LLVM has made progress in this area, but there's more to be done.

Dan

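For anyone who wants to rank passes from a saved -time-passes report rather than read it by eye, here is a minimal Python sketch. It assumes every row has the four "seconds ( percent%)" columns shown in the report quoted above, followed by the pass name; reports with a different column layout would need the pattern adjusted. The names `parse_rows` and `slowest` are made up for this illustration and are not part of any LLVM tool.

```python
import re

# Assumption: each data row has four "seconds ( percent%)" cells
# (user, system, user+system, wall) followed by the pass name,
# as in the report quoted in this thread.
ROW = re.compile(r"^\s*((?:\d+\.\d+\s*\(\s*\d+\.\d+%\)\s+){4})(.+?)\s*$")
CELL = re.compile(r"(\d+\.\d+)\s*\(\s*(\d+\.\d+)%\)")

SAMPLE = """\
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
10.2500 ( 28.0%) 0.0156 ( 1.9%) 10.2656 ( 27.5%) 10.2500 ( 27.2%) Live Variable Analysis
10.9531 ( 30.0%) 0.4687 ( 58.8%) 11.4218 ( 30.6%) 11.5468 ( 30.6%) X86 DAG->DAG Instruction Selection
4.8593 ( 13.3%) 0.0000 ( 0.0%) 4.8593 ( 13.0%) 4.8593 ( 12.9%) Linear Scan Register Allocator
"""

def parse_rows(text):
    """Return one dict per data row; header and unrelated lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue
        cells = [(float(s), float(p)) for s, p in CELL.findall(m.group(1))]
        user, system, total, wall = cells
        rows.append({
            "name": m.group(2),
            "user": user[0],
            "system": system[0],
            "total": total[0],
            "wall": wall[0],
            "wall_pct": wall[1],
        })
    return rows

def slowest(text, n=3):
    """Return the n rows with the largest wall time, slowest first."""
    return sorted(parse_rows(text), key=lambda r: r["wall"], reverse=True)[:n]

if __name__ == "__main__":
    for row in slowest(SAMPLE):
        print(f'{row["wall"]:9.4f}s  {row["wall_pct"]:5.1f}%  {row["name"]}')
```

Sorting on wall time is a choice made here for illustration; sorting on the user+system column would work the same way.
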
Dan Gohman <gohman at apple.com> writes:

[snip]

> An interesting option to add is -fast-isel-verbose, which prints out
> LLVM instructions that aren't going down the fast path. If there's
> something that shows up a lot, it may be worthwhile looking into why
> the front-end is using it, or looking into adding support for that
> instruction to the fast path.

From a bytecode file that disassembles into a 240K lines LLVM assembly
file, -fast-isel-verbose outputs ~7600 missed instructions.

There are lots of loads/stores of boolean values (i1), bitcasts and
calls.

store i1 : 1802 occurrences (23%)
load i1* : 1076 occurrences (13%)
call     : 2590 occurrences (34%)
bitcast  : 1848 occurrences (24%)

Almost all the calls have void return value and use the sret
attribute.

I can send the bytecode file to anyone interested.

--
Óscar

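A breakdown like the one above can be produced by tallying the -fast-isel-verbose output by opcode. The following is a rough Python sketch, not the script actually used in this thread. It assumes each missed instruction is reported on its own line after the prefix "FastISel miss: " (an assumption about the log format; pass a different `prefix` if your build prints something else). The sample log lines and the helper names are hypothetical.

```python
from collections import Counter

# Assumption: each missed instruction appears on one line after this prefix.
ASSUMED_PREFIX = "FastISel miss: "

# Leading keywords that can come before the opcode in textual LLVM IR.
MODIFIERS = {"tail", "volatile"}

# Hypothetical log excerpt, written only to exercise the functions below.
SAMPLE_LOG = """\
FastISel miss:   %b = load i1* %flag
FastISel miss:   store i1 true, i1* %flag
FastISel miss:   call void @f(%struct.S* sret %tmp)
FastISel miss:   %p = bitcast i8* %q to i32*
FastISel miss:   store i1 false, i1* %flag
some unrelated line
"""

def opcode_of(instr):
    """Extract the opcode from one textual instruction."""
    lhs, sep, rhs = instr.partition(" = ")
    # Only treat "x = y" as an assignment when it starts with a value name.
    body = rhs if (sep and lhs.startswith("%")) else instr
    tokens = [t for t in body.split() if t not in MODIFIERS]
    return tokens[0] if tokens else "?"

def tally(log, prefix=ASSUMED_PREFIX):
    """Count missed instructions per opcode."""
    counts = Counter()
    for line in log.splitlines():
        _, found, instr = line.partition(prefix)
        if found:
            counts[opcode_of(instr.strip())] += 1
    return counts

def report(counts):
    """Format counts as 'opcode : N occurrences (P%)' lines, largest first."""
    total = sum(counts.values())
    return [
        f"{op:<8} : {n} occurrences ({100 * n / total:.0f}%)"
        for op, n in counts.most_common()
    ]

if __name__ == "__main__":
    print("\n".join(report(tally(SAMPLE_LOG))))
```

This groups by opcode only. Splitting further by operand type, as in the "store i1" and "load i1*" rows above, would mean also keeping the token that follows the opcode.
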
On Aug 26, 2009, at 9:49 AM, Óscar Fuentes wrote:

> Dan Gohman <gohman at apple.com> writes:
>
> [snip]
>
>> An interesting option to add is -fast-isel-verbose, which prints out
>> LLVM instructions that aren't going down the fast path. If there's
>> something that shows up a lot, it may be worthwhile looking into why
>> the front-end is using it, or looking into adding support for that
>> instruction to the fast path.
>
> From a bytecode file that disassembles into a 240K lines LLVM assembly
> file, -fast-isel-verbose outputs ~7600 missed instructions.
>
> There are lots of loads/stores of boolean values (i1), bitcasts and
> calls.
>
> store i1 : 1802 occurrences (23%)
> load i1* : 1076 occurrences (13%)

I've added fast-path support for loads and stores of i1 now.

> call : 2590 occurrences (34%)

The fast-path doesn't currently support sret (which you mention below).

> bitcast : 1848 occurrences (24%)

For bitcasts, it depends on the specific types involved.

Dan

> Almost all the calls have void return value and use the sret
> attribute.
>
> I can send the bytecode file to anyone interested.
>
> --
> Óscar
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev