Star Tan
2013-Jul-01 13:51 UTC
[LLVMdev] [Polly][GSOC2013] FastPolly -- SCOP Detection Pass
> Great. Now we have two test cases we can work with. Can you
> upload the LLVM-IR produced by clang -O0 (without Polly)?

Since tramp3d-v4.ll is too large (19M, 267 thousand lines), I will focus on the oggenc benchmark first. I attached oggenc.ll (the LLVM-IR produced by clang -O0 without Polly), compressed into the file oggenc.tgz.

> 2) Check why the Polly scop detection is failing
>
> You can use 'opt -polly-detect -analyze' to see the most common reasons
> the scop detection failed. We should verify that we perform the most
> common and cheap tests early.

I also attached the output file oggenc_polly_detect_analyze.log, produced by "polly-opt -O3 -polly-detect -analyze oggenc.ll". Unfortunately, it only dumps valid scop regions. At first I thought of dumping all debugging information with the "-debug" option, but that also dumps too much unrelated information produced by other passes. Do you know of any option that lets me dump debugging information for the "-polly-detect" pass only, while disabling debugging information for all other passes?

Star Tan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: oggenc.tgz
Type: application/octet-stream
Size: 657372 bytes
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130701/40997612/attachment.obj>
Tobias Grosser
2013-Jul-01 15:47 UTC
[LLVMdev] [Polly][GSOC2013] FastPolly -- SCOP Detection Pass
On 07/01/2013 06:51 AM, Star Tan wrote:
>> Great. Now we have two test cases we can work with. Can you
>> upload the LLVM-IR produced by clang -O0 (without Polly)?
>
> Since tramp3d-v4.ll is too large (19M, 267 thousand lines), I will focus on the oggenc benchmark first.
> I attached oggenc.ll (the LLVM-IR produced by clang -O0 without Polly), compressed into the file oggenc.tgz.

Sounds good.

>> 2) Check why the Polly scop detection is failing
>>
>> You can use 'opt -polly-detect -analyze' to see the most common reasons
>> the scop detection failed. We should verify that we perform the most
>> common and cheap tests early.
>
> I also attached the output file oggenc_polly_detect_analyze.log, produced by "polly-opt -O3 -polly-detect -analyze oggenc.ll". Unfortunately, it only dumps valid scop regions. At first I thought of dumping all debugging information with the "-debug" option, but that also dumps too much unrelated information produced by other passes. Do you know of any option that lets me dump debugging information for the "-polly-detect" pass only, while disabling debugging information for all other passes?

I really propose to not attach such large files. ;-)

To dump debug info of just one pass you can use -debug-only=polly-detect. However, for performance measurements, you want to use a release build to get accurate numbers.

Another flag that is interesting is '-stats'. It gives me the following information:

     4 polly-detect - Number of bad regions for Scop: CFG too complex
   183 polly-detect - Number of bad regions for Scop: Expression not affine
   103 polly-detect - Number of bad regions for Scop: Found base address alias
   167 polly-detect - Number of bad regions for Scop: Found invalid region entering edges
    59 polly-detect - Number of bad regions for Scop: Function call with side effects appeared
   725 polly-detect - Number of bad regions for Scop: Loop bounds can not be computed
    93 polly-detect - Number of bad regions for Scop: Non canonical induction variable in loop
     8 polly-detect - Number of bad regions for Scop: Others
    53 polly-detect - Number of regions that a valid part of Scop

This seems to suggest that most scops fail due to loop bounds that can not be computed. It would be interesting to see what kind of expressions these are. In case SCEV often does not deliver a result, this may be one of the cases where bottom-up scop detection would help a lot, as outer regions are automatically invalidated if we can not get a SCEV for the loop bounds of the inner regions.

However, I still have the feeling the test case is too large. To reduce it, I propose to first run opt with 'opt -O3 -polly -disable-inlining -time-passes'. You then turn all internal function definitions into external ones with s/define internal/define/. After this preprocessing you can use a regexp such as "'<,'>s/define \([^{}]* \){\_[^{}]*}/declare \1" to replace function definitions with their declarations. You can use this to binary search for functions that have a large overhead in ScopDetect time.

I tried this a little, but realized that no matter whether I removed the first or the second part of a module, the relative scop-detect time always went down. This is surprising. If you see similar effects, it would be interesting to investigate.

Cheers,
Tobi
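To make the ranking of rejection reasons in the '-stats' output above easier to read, the counters can be tallied with a few lines of Python. This is only a sketch: the helper name `tally_bad_regions` is hypothetical, and the parser assumes the line layout shown in the message ("<count> polly-detect - Number of bad regions for Scop: <reason>").

```python
import re

# Matches lines such as:
#   725 polly-detect - Number of bad regions for Scop: Loop bounds can not be computed
# Assumption: the layout is exactly the one quoted in the message above.
STAT_RE = re.compile(
    r"^\s*(\d+)\s+polly-detect\s+-\s+Number of bad regions for Scop:\s*(.+?)\s*$"
)


def tally_bad_regions(stats_text):
    """Return (reason, count, share) tuples, most frequent reason first."""
    counts = {}
    for line in stats_text.splitlines():
        match = STAT_RE.match(line)
        if match:
            counts[match.group(2)] = int(match.group(1))
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(reason, n, n / total) for reason, n in ranked]


# Counters copied from the message; the "valid part of Scop" line is not a
# rejection reason, so the pattern deliberately skips it.
STATS = """\
     4 polly-detect - Number of bad regions for Scop: CFG too complex
   183 polly-detect - Number of bad regions for Scop: Expression not affine
   103 polly-detect - Number of bad regions for Scop: Found base address alias
   167 polly-detect - Number of bad regions for Scop: Found invalid region entering edges
    59 polly-detect - Number of bad regions for Scop: Function call with side effects appeared
   725 polly-detect - Number of bad regions for Scop: Loop bounds can not be computed
    93 polly-detect - Number of bad regions for Scop: Non canonical induction variable in loop
     8 polly-detect - Number of bad regions for Scop: Others
    53 polly-detect - Number of regions that a valid part of Scop
"""

for reason, n, share in tally_bad_regions(STATS):
    print(f"{n:5d}  {share:6.1%}  {reason}")
```

With these numbers, "Loop bounds can not be computed" covers 725 of the 1342 rejected regions, a bit more than half, which matches the reading given in the message.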
Star Tan
2013-Jul-02 02:04 UTC
[LLVMdev] [Polly][GSOC2013] FastPolly -- SCOP Detection Pass
At 2013-07-01 23:47:06, "Tobias Grosser" <tobias at grosser.es> wrote:
> On 07/01/2013 06:51 AM, Star Tan wrote:
>>> Great. Now we have two test cases we can work with. Can you
>>> upload the LLVM-IR produced by clang -O0 (without Polly)?
>>
>> Since tramp3d-v4.ll is too large (19M, 267 thousand lines), I will focus on the oggenc benchmark first.
>> I attached oggenc.ll (the LLVM-IR produced by clang -O0 without Polly), compressed into the file oggenc.tgz.
>
> Sounds good.
>
>>> 2) Check why the Polly scop detection is failing
>>>
>>> You can use 'opt -polly-detect -analyze' to see the most common reasons
>>> the scop detection failed. We should verify that we perform the most
>>> common and cheap tests early.
>>
>> I also attached the output file oggenc_polly_detect_analyze.log, produced by "polly-opt -O3 -polly-detect -analyze oggenc.ll". Unfortunately, it only dumps valid scop regions. At first I thought of dumping all debugging information with the "-debug" option, but that also dumps too much unrelated information produced by other passes. Do you know of any option that lets me dump debugging information for the "-polly-detect" pass only, while disabling debugging information for all other passes?
>
> I really propose to not attach such large files. ;-)
>
> To dump debug info of just one pass you can use
> -debug-only=polly-detect. However, for performance measurements, you
> want to use a release build to get accurate numbers.
>
> Another flag that is interesting is '-stats'. It gives me the
> following information:
>
>      4 polly-detect - Number of bad regions for Scop: CFG too complex
>    183 polly-detect - Number of bad regions for Scop: Expression not affine
>    103 polly-detect - Number of bad regions for Scop: Found base address alias
>    167 polly-detect - Number of bad regions for Scop: Found invalid region entering edges
>     59 polly-detect - Number of bad regions for Scop: Function call with side effects appeared
>    725 polly-detect - Number of bad regions for Scop: Loop bounds can not be computed
>     93 polly-detect - Number of bad regions for Scop: Non canonical induction variable in loop
>      8 polly-detect - Number of bad regions for Scop: Others
>     53 polly-detect - Number of regions that a valid part of Scop
>
> This seems to suggest that most scops fail due to loop bounds that
> can not be computed. It would be interesting to see what kind of
> expressions these are. In case SCEV often does not deliver a result,
> this may be one of the cases where bottom-up scop detection would help
> a lot, as outer regions are automatically invalidated if we can not get
> a SCEV for the loop bounds of the inner regions.

Thank you so much. This is what I need. I just want to know why these scops are invalid!

> However, I still have the feeling the test case is too large. To
> reduce it, I propose to first run opt with 'opt -O3 -polly
> -disable-inlining -time-passes'. You then turn all internal function
> definitions into external ones with s/define internal/define/. After
> this preprocessing you can use a regexp such as
> "'<,'>s/define \([^{}]* \){\_[^{}]*}/declare \1" to replace
> function definitions with their declarations. You can use this to binary
> search for functions that have a large overhead in ScopDetect time.
>
> I tried this a little, but realized that no matter whether I removed the
> first or the second part of a module, the relative scop-detect time
> always went down. This is surprising. If you see similar effects, it
> would be interesting to investigate.

No problem. I will try to reduce the code size.

Best,
Star Tan
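The reduction steps quoted above (drop "internal", then turn definitions into declarations) could also be scripted outside the editor. The following is a minimal Python sketch of that idea, not the procedure anyone in the thread actually ran: `externalize` and `strip_bodies` are hypothetical helper names, the pattern is a direct translation of the Vim regexp from the message, and the sample module is made up for illustration. Like the original regexp, it skips any function whose body contains braces.

```python
import re

# Python counterpart of the Vim pattern from the thread:
#   s/define \([^{}]* \){\_[^{}]*}/declare \1
# The signature must sit on one line; the body may span several lines but
# must not contain '{' or '}', otherwise the function is left untouched.
DEFINE_RE = re.compile(r"define ([^{}\n]* )\{[^{}]*\}")
NAME_RE = re.compile(r"@([\w.$-]+)")  # assumption: unquoted function names


def externalize(ir_text):
    """Equivalent of s/define internal/define/ applied to the whole module."""
    return ir_text.replace("define internal", "define")


def strip_bodies(ir_text, keep=frozenset()):
    """Replace definitions by declarations, except for the names in `keep`."""
    def to_declaration(match):
        signature = match.group(1)
        name = NAME_RE.search(signature)
        if name and name.group(1) in keep:
            return match.group(0)  # keep this definition as it is
        return "declare " + signature.rstrip()
    return DEFINE_RE.sub(to_declaration, ir_text)


# Made-up two-function module, only to exercise the helpers.
MODULE = """\
define internal i32 @f(i32 %x) {
entry:
  ret i32 %x
}

define i32 @g() {
entry:
  ret i32 0
}
"""

reduced = strip_bodies(externalize(MODULE), keep={"g"})
print(reduced)
```

For the binary search, one would pass half of the function names in `keep`, write the result to a new .ll file, and compare the ScopDetect time reported by -time-passes for each half. Whether opt accepts the reduced module in every case is not checked by this sketch.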
Sebastian Pop
2013-Jul-02 15:18 UTC
[LLVMdev] [Polly][GSOC2013] FastPolly -- SCOP Detection Pass
Star Tan wrote:
> I attached oggenc.ll (the LLVM-IR produced by clang -O0 without Polly), compressed into the file oggenc.tgz.

Let me repeat what Tobi said: please do *not* send out large files to the mailing lists.

Thanks,
Sebastian
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation