hameeza ahmed via llvm-dev
2019-Oct-24 17:20 UTC
[llvm-dev] No vectorization with -loop-vectorize and -slp-vectorizer
There is one problem. For the matrix multiplication code, I am not getting the same IR with -O3 and with the individual passes, as follows:

Documents/clang+llvm-9.0.0-x86_64-linux-gnu-ubuntu-18.04/bin/clang -O1 -march=native -Xclang -disable-O0-optnone -disable-llvm-passes -emit-llvm -S matmul.c -o matmul-noopt.ll

Documents/clang+llvm-9.0.0-x86_64-linux-gnu-ubuntu-18.04/bin/opt -O3 -S matmul-noopt.ll -o matmul-noopt-o3.ll

Documents/clang+llvm-9.0.0-x86_64-linux-gnu-ubuntu-18.04/bin/clang -mllvm -debug-pass=Arguments -O3 -march=native -S -emit-llvm matmul.c -o matmul-o3.ll

*I obtained the individual -O3 flags from the command above and gave them as input to the following command:*

Documents/clang+llvm-9.0.0-x86_64-linux-gnu-ubuntu-18.04/bin/opt -tti -targetlibinfo -tbaa -scoped-noalias -assumption-cache-tracker -profile-summary-info -forceattrs -inferattrs -domtree -callsite-splitting -ipsccp -called-value-propagation -attributor -globalopt -domtree -mem2reg -deadargelim -domtree -basicaa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -simplifycfg -basiccg -globals-aa -prune-eh -inline -functionattrs -argpromotion -domtree -sroa -basicaa -aa -memoryssa -early-cse-memssa -speculative-execution -basicaa -aa -lazy-value-info -jump-threading -correlated-propagation -simplifycfg -domtree -aggressive-instcombine -basicaa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -libcalls-shrinkwrap -loops -branch-prob -block-freq -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -pgo-memop-opt -basicaa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -tailcallelim -simplifycfg -reassociate -domtree -loops -loop-simplify -lcssa-verification -lcssa -basicaa -aa -scalar-evolution -loop-rotate -licm -loop-unswitch -simplifycfg -domtree -basicaa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -loop-simplify -lcssa-verification -lcssa -scalar-evolution -indvars -loop-idiom -loop-deletion -loop-unroll
-mldst-motion -phi-values -basicaa -aa -memdep -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -gvn -phi-values -basicaa -aa -memdep -memcpyopt -sccp -demanded-bits -bdce -basicaa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -lazy-value-info -jump-threading -correlated-propagation -basicaa -aa -phi-values -memdep -dse -loops -loop-simplify -lcssa-verification -lcssa -basicaa -aa -scalar-evolution -licm -postdomtree -adce -simplifycfg -domtree -basicaa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -barrier -elim-avail-extern -basiccg -rpo-functionattrs -globalopt -globaldce -basiccg -globals-aa -float2int -domtree -loops -loop-simplify -lcssa-verification -lcssa -basicaa -aa -scalar-evolution -loop-rotate -loop-accesses -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -loop-distribute -branch-prob -block-freq -scalar-evolution -basicaa -aa -loop-accesses -demanded-bits -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -loop-vectorize -loop-simplify -scalar-evolution -aa -loop-accesses -lazy-branch-prob -lazy-block-freq -loop-load-elim -basicaa -aa -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -simplifycfg -domtree -loops -scalar-evolution -basicaa -aa -demanded-bits -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -slp-vectorizer -opt-remark-emitter -instcombine -loop-simplify -lcssa-verification -lcssa -scalar-evolution -loop-unroll -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -loop-simplify -lcssa-verification -lcssa -scalar-evolution -licm -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -transform-warning -alignment-from-assumptions -strip-dead-prototypes -globaldce -constmerge -domtree -loops -branch-prob -block-freq -loop-simplify -lcssa-verification -lcssa -basicaa -aa -scalar-evolution -block-freq -loop-sink -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instsimplify -div-rem-pairs -simplifycfg -S 
matmul-noopt.ll -o matmul-noopt_o3_opt_parms.ll

Why are the IRs matmul-noopt-o3.ll and matmul-noopt_o3_opt_parms.ll not the same? What is the mistake here?

On Mon, Oct 14, 2019 at 2:33 PM HAPPY Mahto <cs17btech11018 at iith.ac.in> wrote:

> On Mon, Oct 14, 2019 at 2:52 PM hameeza ahmed via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> Hello,
>> I am trying to test individual optimizations of opt on unoptimized IR.
>> But the IR is not getting vectorized by the following commands.
>>
>> Documents/clang+llvm-9.0.0-x86_64-linux-gnu-ubuntu-18.04/bin/clang -O1 -Xclang -disable-llvm-passes -emit-llvm -S vecsum.c -o vecsum-noopt.ll
>
> Try this:
> Documents/clang+llvm-9.0.0-x86_64-linux-gnu-ubuntu-18.04/bin/clang -O1 -Xclang -disable-O0-optnone -disable-llvm-passes -emit-llvm -S vecsum.c -o vecsum-noopt.ll
> Same for the other commands too.
> You need to add the '-disable-O0-optnone' flag, or you can remove the optnone attribute from attributes #0 = { noinline , nounwind, optnone ... }
> I hope this helps.
>
> -Happy
>
>> Documents/clang+llvm-9.0.0-x86_64-linux-gnu-ubuntu-18.04/bin/opt -loop-vectorize -slp-vectorizer -S vecsum-noopt.ll -o vecto.ll
>>
>> The IR does get vectorized with the O3 flag, though. I even tried to replicate the O3 behavior via individual flags with opt, but it does not produce the same IR.
>>
>> Where am I making a mistake?
>>
>> Please help
>>
>> Thank You
>> Regards
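
For anyone following the thread, the checks discussed above can be sketched as a few small shell helpers. This is only a rough sketch, not an LLVM facility: the function names strip_optnone, count_vector_lines and count_pass_argument_lines are hypothetical, the vector-type pattern is a heuristic, and the last helper assumes the legacy pass manager's -debug-pass=Arguments output, in which each pass manager prints its own line beginning with "Pass Arguments:". The self-check at the end runs on inline sample text, so it needs no LLVM installation.

```shell
#!/bin/sh
# Sketch only. The helper names are hypothetical; they are not LLVM tools.

# Remove the optnone attribute from attribute groups in textual IR (stdin),
# as suggested in the quoted reply. Attributes are space-separated, so
# replacing " optnone " with a single space keeps the group well-formed.
strip_optnone() {
    sed '/^attributes #/ s/ optnone / /g'
}

# Count lines of textual IR (stdin) that mention a fixed-width vector type
# such as <4 x i32> or <8 x float>. A count of 0 suggests that nothing was
# vectorized. This is a heuristic, not a full IR parse.
count_vector_lines() {
    grep -c -E '<[0-9]+ x [a-z0-9]+>'
}

# Count the "Pass Arguments:" lines in saved -debug-pass=Arguments output
# (stdin). Assumption: one such line per legacy pass manager.
count_pass_argument_lines() {
    grep -c '^Pass Arguments:'
}

# Self-check on inline sample text.
printf '%s\n' 'attributes #0 = { noinline nounwind optnone uwtable }' | strip_optnone
# prints: attributes #0 = { noinline nounwind uwtable }

printf '%s\n' '  %5 = add <4 x i32> %3, %4' '  %6 = add i32 %1, %2' | count_vector_lines
# prints: 1

printf '%s\n' 'Pass Arguments:  -tti -targetlibinfo' 'Pass Arguments:  -domtree -loops' | count_pass_argument_lines
# prints: 2
```

With the real tools, the helpers could be used along the lines of `strip_optnone < vecsum-noopt.ll > vecsum-clean.ll` (vecsum-clean.ll is a made-up output name) followed by `opt -loop-vectorize -slp-vectorizer -S vecsum-clean.ll -o - | count_vector_lines`. If count_pass_argument_lines reports more than one line for the saved clang output, a single copied line covers only part of what clang ran. Pass names also do not record the parameters each pass was constructed with, so some difference from `opt -O3` may remain even with a complete list.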