Mehdi Amini via llvm-dev
2017-Apr-11 05:12 UTC
[llvm-dev] Relationship between clang, opt and llc
> On Apr 10, 2017, at 5:21 PM, Craig Topper via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> clang -O0 does not disable all optimization passes that modify the IR. In fact, it causes most functions to get tagged with noinline to prevent inlining.

It also disables lifetime intrinsics emission and TBAA, etc.

> What you really need to do is
>
> clang -O3 -c -emit-llvm source.c -o source.bc -v
>
> Find the -cc1 command line from that output. Execute that command with -disable-llvm-passes. Leave the -O3 and everything else.

That's a bit complicated: CC1 options can be passed through with -Xclang, for example here by just adding `-Xclang -disable-llvm-passes` to the regular clang invocation.

Best,
Mehdi

> You should be able to feed the output from that command to opt/llc and get consistent results.
>
> ~Craig
>
> On Mon, Apr 10, 2017 at 4:57 PM, Peizhao Ou via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> > Hi folks,
> >
> > I am wondering about the relationship between clang, opt and llc. I understand that this has been asked before, e.g., http://stackoverflow.com/questions/40350990/relationship-between-clang-opt-llc-and-llvm-linker. Sorry for posting a similar question again, but I still have something that hasn't been resolved yet.
> >
> > More specifically, I am wondering about the following two approaches to compiling an optimized executable:
> >
> > 1. clang -O3 -c source.c -o source.o
> >    ...
> >    clang a.o b.o c.o ... -o executable
> >
> > 2. clang -O0 -c -emit-llvm source.c -o source.bc
> >    opt -O3 source.bc -o source.bc
> >    llc -O3 -filetype=obj source.bc -o source.o
> >    ...
> >    clang a.o b.o c.o ... -o executable
> >
> > I took a look at the source code of the clang tool and the opt tool. They both seem to use the PassManagerBuilder::populateModulePassManager() and PassManagerBuilder::populateFunctionPassManager() functions to add passes to their optimization pipeline; for the backend, clang and llc both use the addPassesToEmitFile() function to generate object code.
> >
> > So presumably the above two approaches to generating an optimized executable should do the same thing. However, I am seeing that the second approach is around 2% slower than the first approach (which is the way developers usually build) pretty consistently.
> >
> > Can anyone point me to the reasons why this happens? Or even correct my wrong understanding of the relationship between these two approaches?
> >
> > PS: I used the -debug-pass=Structure option to print out the passes. They seem the same except that the first approach has an extra pass called "-add-discriminator", but I don't think that's the reason.
> >
> > Peizhao
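One way to check the point made in the replies above is to emit textual IR both ways and count a few markers in each file. This is only a sketch: the output names o0.ll and o3-nopasses.ll and the helpers count_marker and compare_frontend_ir are made up for illustration, and the actual counts depend on the clang version and on the source file you pass in.

```shell
#!/bin/sh
# Sketch only: helper and file names are hypothetical, not part of any LLVM tool.

# Print the number of lines in file $2 that contain the literal string $1.
count_marker() {
    # -F: match literally, -c: count matching lines, -e: pattern may start with '!' or '-'
    # grep exits non-zero when the count is 0, so mask that with '|| true'.
    grep -c -F -e "$1" "$2" || true
}

# Emit unoptimized IR with -O0 and with -O3 plus -disable-llvm-passes,
# then report how often each marker shows up in the two files.
compare_frontend_ir() {
    src=$1
    clang -O0 -S -emit-llvm "$src" -o o0.ll || return 1
    clang -O3 -Xclang -disable-llvm-passes -S -emit-llvm "$src" -o o3-nopasses.ll || return 1
    for marker in noinline llvm.lifetime.start '!tbaa'; do
        printf '%-22s O0=%s O3-nopasses=%s\n' "$marker" \
            "$(count_marker "$marker" o0.ll)" \
            "$(count_marker "$marker" o3-nopasses.ll)"
    done
}

# Only run when a source file was given and clang is on the PATH.
if [ -n "${1:-}" ] && command -v clang >/dev/null 2>&1; then
    compare_frontend_ir "$1"
fi
```

Saved under a name of your choice and run with a C source file as its only argument, the script prints one line per marker. If the replies above hold for your clang version, the -O0 file should show noinline on most functions and little or no lifetime or TBAA information, with the opposite pattern in the other file.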
Peizhao Ou via llvm-dev
2017-Apr-11 16:15 UTC
[llvm-dev] Relationship between clang, opt and llc
It's really nice of you to point out the -Xclang option; it makes things much easier. I really appreciate your help!

Best,
Peizhao

On Mon, Apr 10, 2017 at 10:12 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
> That's a bit complicated: CC1 options can be passed through with -Xclang, for example here by just adding `-Xclang -disable-llvm-passes` to the regular clang invocation.
toddy wang via llvm-dev
2018-Jan-05 21:19 UTC
[llvm-dev] Relationship between clang, opt and llc
I tried the following on the LULESH 1.0 serial version (https://codesign.llnl.gov/lulesh/LULESH.cc):

1. clang++ -O3 LULESH.cc; ./a.out 20
   Runtime: 9.487353 seconds

2. clang++ -O0 -Xclang -disable-llvm-passes -c -emit-llvm -o a.bc LULESH.cc; opt -O3 a.bc -o b.bc; llc -O3 -filetype=obj b.bc -o b.o; clang++ b.o -o b.out; ./b.out 20
   Runtime: 24.15 seconds

3. clang++ -O3 -Xclang -disable-llvm-passes -c -emit-llvm -o a.bc LULESH.cc; opt -O3 a.bc -o b.bc; llc -O3 -filetype=obj b.bc -o b.o; clang++ b.o -o b.out; ./b.out 20
   Runtime: 9.53 seconds

1 and 3 have almost the same performance, while 2 is significantly worse (roughly 2.5x slower). I expected 1, 2, and 3 to differ only trivially. Is this a wrong expectation?

@Peizhao, what did you try in your last post?
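Variants 2 and 3 in the message above differ only in the optimization level given to the front end, so they can be rebuilt from one small script. This is a sketch under assumptions: the helper names frontend_flags and build_via_opt and the v2.*/v3.* output names are hypothetical, and clang++, opt and llc are assumed to come from the same LLVM build.

```shell
#!/bin/sh
# Sketch only: helper and output names are hypothetical.

# Print the front-end flags used by variant 2 or 3 from the message above.
frontend_flags() {
    case "$1" in
        2) echo "-O0 -Xclang -disable-llvm-passes" ;;
        3) echo "-O3 -Xclang -disable-llvm-passes" ;;
        *) return 1 ;;
    esac
}

# Build one variant: front end -> opt -O3 -> llc -O3 -> link.
build_via_opt() {
    variant=$1
    src=$2
    flags=$(frontend_flags "$variant") || return 1
    # $flags is left unquoted on purpose so it splits into separate arguments.
    clang++ $flags -c -emit-llvm -o "v$variant.bc" "$src" &&
    opt -O3 "v$variant.bc" -o "v$variant.opt.bc" &&
    llc -O3 -filetype=obj "v$variant.opt.bc" -o "v$variant.o" &&
    clang++ "v$variant.o" -o "v$variant.out"
}

# Only run when a source file was given and all three tools are on the PATH.
if [ -n "${1:-}" ] &&
   command -v clang++ >/dev/null 2>&1 &&
   command -v opt >/dev/null 2>&1 &&
   command -v llc >/dev/null 2>&1; then
    for v in 2 3; do
        build_via_opt "$v" "$1" || exit 1
    done
fi
```

Given LULESH.cc as its argument, the script should leave v2.out and v3.out in the current directory. Each can then be timed with the same input (20 in the message above) and compared against a plain clang++ -O3 build.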