Emanuele Del Sozzo via llvm-dev
2018-Aug-16 10:45 UTC
[llvm-dev] Replication -O3 optimizations manually
Hello llvm-dev,

My name is Emanuele and I am an intern at ARM. As part of the project I am doing here, I would like to manually replicate the optimizations that LLVM applies when I pass -O3. In other words, I would like to know which compilation flags and passes -O3 triggers.

I noticed that the GCC website lists all the flags enabled by -O3 (https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html), but I wasn't able to find anything similar in the LLVM documentation.

On the other hand, I found that this command displays all the optimization passes that opt applies when the -O3 flag is on:

    llvm-as < /dev/null | opt -O3 -disable-output -debug-pass=Arguments

I tried to apply the same optimization passes through opt but, even though the performance is similar, the resulting binary is slower than the one generated using -O3 (the binaries also differ, of course).

I also found this other command, which does something similar (it lists the sequence of optimization passes applied):

    clang -O3 -mllvm -debug-pass=Arguments file.c

In this case the performance still differs, and some of the optimization passes listed in the last block of passes (e.g. -machinemoduleinfo, -stack-protector, etc.) are unknown to opt.

That said, my question is: how can I find out which optimization passes and flags -O3 enables, so that I can apply the same optimizations manually and, hopefully, obtain the same binary and performance?

I am currently using LLVM version 5.0.2.

Thank you for both your help and your time!

Best regards,
Emanuele

IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
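The comparison described in the message above (the pass list printed by opt versus the one printed by clang) is easier to do when each list is split into one flag per line. A minimal sketch, assuming the legacy pass manager output format of LLVM 5.x, where -debug-pass=Arguments prints lines beginning with "Pass Arguments:" on stderr. The helper name pass_list and the output file names are hypothetical, not part of LLVM.

```shell
#!/bin/sh
# pass_list (hypothetical helper): read -debug-pass=Arguments output on stdin
# and print every pass flag on its own line. Lines that do not start with
# "Pass Arguments:" are ignored.
pass_list() {
    sed -n 's/^Pass Arguments:[[:space:]]*//p' |
        tr -s ' ' '\n' |
        sed '/^$/d'
}

# Possible usage, with the two commands quoted in the message (needs LLVM
# installed; the pass lists are written to stderr, hence 2>&1):
#
#   llvm-as < /dev/null | opt -O3 -disable-output -debug-pass=Arguments 2>&1 \
#       | pass_list > opt-passes.txt
#   clang -O3 -mllvm -debug-pass=Arguments -c file.c -o /dev/null 2>&1 \
#       | pass_list > clang-passes.txt
#   diff opt-passes.txt clang-passes.txt
```

A diff of the two files shows which passes appear only on the clang side. It does not show differences in pass parameters or in the backend configuration, so identical lists would still not guarantee identical binaries.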
Stefano Cherubin via llvm-dev
2018-Aug-16 14:13 UTC
[llvm-dev] Replication -O3 optimizations manually
Hello Emanuele,

When you pass the optimization level -O3 to the clang driver, it does not simply schedule a sequence of passes to be run on the intermediate representation. It schedules both target-independent and target-dependent passes. Moreover, IIRC, the optimization level is also used in the later stages of code generation to apply target-dependent optimizations (e.g. the vectorizer).

The most common workflow for someone who wants to test their own pass or work within the LLVM toolchain is the following:

- use clang to generate an LLVM-IR file
- use opt to run your desired pass / pass sequence and output another LLVM-IR file
- use clang -O3 to compile to executable machine code

However, with this approach the passes run on the LLVM-IR twice: once in opt and again in the final clang -O3 step. There are use cases where this could invalidate your results. As opt stops at the LLVM-IR level, I would suggest also using other LLVM tools (such as llc / llvm-mc) to run individually the backend stages and pass sequences that opt cannot run. An extensive list of the tools and commands you can use is available at [0]. For your specific case, I would suggest having a look at this restricted schema [1].

There is yet another way to get into even finer-grained detail. You can check which clang DriverActions a given command line runs; see [2]. From there you can rebuild the exact sequence of commands that the clang driver triggers.

If you can provide more details about your use case (performance measurement, pass development and testing, flag selection, phase ordering), we can suggest the most suitable approach.

Kind regards,
Stefano Cherubin

[0] http://llvm.org/docs/CommandGuide/
[1] https://github.com/skeru/LLVM-intro/blob/master/img/03/toolchain.pdf
[2] https://clang.llvm.org/docs/DriverInternals.html#driver-stages

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
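The staged workflow from the reply above (clang for the IR, opt for the IR passes, llc for code generation, then the driver for linking) might look roughly like the sketch below. The helper names run, staged_o3 and driver_plan, the DRY_RUN variable and the file names are hypothetical. The -Xclang -disable-llvm-passes step is an assumption that the thread does not mention: as far as I know, clang 5.0 and later mark functions optnone at -O0, which would keep opt from optimizing them.

```shell
#!/bin/sh
# run (hypothetical helper): print the command; execute it only when
# DRY_RUN=0. The default is a dry run, so the sketch can be tried without LLVM.
run() {
    printf '+ %s\n' "$*"
    [ "${DRY_RUN:-1}" = 1 ] || "$@"
}

# staged_o3 (hypothetical helper): build SRC one toolchain stage at a time.
staged_o3() {
    src=$1
    base=${src%.c}
    # 1. Front end only. -O3 avoids 'optnone' on functions (assumption, see
    #    above); -disable-llvm-passes keeps clang from running IR passes.
    run clang -O3 -Xclang -disable-llvm-passes -S -emit-llvm "$src" -o "$base.ll"
    # 2. Middle end: IR-to-IR optimization with opt.
    run opt -O3 -S "$base.ll" -o "$base.opt.ll"
    # 3. Back end: llc takes its own optimization level for code generation.
    run llc -O3 -filetype=obj "$base.opt.ll" -o "$base.o"
    # 4. Link through the driver, with no further optimization of the IR.
    run clang "$base.o" -o "$base"
}

# driver_plan (hypothetical helper): ask the driver what it would do for SRC.
driver_plan() {
    run clang -ccc-print-phases -O3 "$1"   # driver phases for this command line
    run clang '-###' -O3 "$1"              # cc1 and linker commands, not executed
}
```

Calling `staged_o3 file.c` prints the four commands; `DRY_RUN=0 staged_o3 file.c` also executes them. Even with matching optimization levels at every stage, the result is not guaranteed to be identical to a single `clang -O3 file.c`, because the driver passes further target and code generation options that this sketch does not replicate. The `clang -###` output from driver_plan is the place to look for those.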
cszide via llvm-dev
2018-Aug-17 01:55 UTC
[llvm-dev] Replication -O3 optimizations manually
Hi Stefano,

I have the same problem that Emanuele describes. You say that clang schedules target-independent and target-dependent passes. However, when I use lli to execute bitcode generated by opt with -O3 and bitcode generated by opt with the same optimization passes as -O3, the performance is still different. So, does the -O3 option perform some special operations? I read the source code of opt, but I could not find the reason.

Best regards,
Zide