Björn Ruytenberg via llvm-dev
2018-Mar-09 14:05 UTC
[llvm-dev] [Polly] Reduced code analyzability moving from LLVM 3.9.0 to 5.0.1
Hi Johannes,

Perfect, thanks! The CFG now looks very similar to what I got on LLVM
3.9.0 ([1] vs [2]).

Any idea why setting -simplifycfg-sink-common=false is necessary?
Similar to LLVM 5.0.1, the default for 3.9.0 is true [3], and setting it
to false wasn't necessary in the latter version.

[1] https://nautilus.bjornweb.nl/files/polly501-cfg-simplifycfg-sink-common.pdf
[2] https://nautilus.bjornweb.nl/files/polly390-cfg.pdf
[3] https://github.com/llvm-mirror/llvm/blob/release_39/lib/Transforms/Utils/SimplifyCFG.cpp#L71

--
Kind regards,
Björn Ruytenberg
https://bjornweb.nl

On 09/03/2018 09:18, Johannes Doerfert wrote:
> Hi Björn,
>
> try to add this:
>
>   -mllvm -simplifycfg-sink-common=false
>
> Cheers,
>   Johannes
>
> On 03/08, Björn Ruytenberg via llvm-dev wrote:
>> Hi,
>>
>> Recently I was looking at the potential of optimizing through Polly. The
>> code that I am trying to optimize [1] adjusts a picture's colors to get
>> an Instagram-like effect.
>>
>> To improve code analyzability on LLVM 3.9.0, I made the following changes:
>> - Improve SCoP detection through -polly-process-unprofitable
>> - Enable outer loop vectorization through -polly-vectorizer=stripmine,
>>   disabling timeouts with -polly-dependences-computeout=0
>> - Avoid sign extensions by replacing all 32-bit ints with longs, as
>>   Polly seems to model using 64-bit loop counters
>> - Avoid interrupting control flow through -ffast-math and moving mallocs
>>   to the top of the code
>>
>> So to compile, we have:
>> clang -I. -O3 -g3 -Wall -Wextra -std=c99 -D_POSIX_C_SOURCE=200000L
>> -ffast-math -mllvm -polly -mllvm -polly-dot -mllvm
>> -polly-process-unprofitable -mllvm -polly-vectorizer=stripmine -mllvm
>> -polly-dependences-computeout=0 -c -o localcolorcorrection.o
>> localcolorcorrection.c
>>
>> Unfortunately, LLVM 5.0.1 generates different results in analyzing the
>> CFG compared to LLVM 3.9.0. The latter version analyzes most of the CFG
>> [2], but 5.0.1 leaves large parts of the hot paths untouched due to "non
>> affine access functions" [3].
>>
>> What I have tried:
>> - Moving Polly to different positions in the LLVM pass pipeline
>>   (-polly-position=early vs. -polly-position=before-vectorizer). The
>>   latter option adds one large basic block, but otherwise doesn't seem to
>>   analyze the hot paths.
>> - Setting -polly-delicm-compute-known=true and
>>   polly-delicm-overapproximate-writes=true. This doesn't seem to have
>>   effect on the hot paths.
>>
>> Can anyone give me some pointers on how to fix this? Or could this be a
>> regression in Polly?
>>
>> Thanks!
>>
>> [1] https://nautilus.bjornweb.nl/files/localcolorcorrection.c
>> [2] https://nautilus.bjornweb.nl/files/polly390-cfg.pdf
>> [3] https://nautilus.bjornweb.nl/files/polly501-cfg.pdf
>>
>> --
>> Kind regards,
>> Björn Ruytenberg
>> https://bjornweb.nl
Johannes Doerfert via llvm-dev
2018-Mar-12 23:56 UTC
[llvm-dev] [Polly] Reduced code analyzability moving from LLVM 3.9.0 to 5.0.1
Hi Björn,

check block %for.inv147 in all your versions. In the 5.0.1 output (first
email) it says that %add118.sink is involved in some non-affine accesses,
and that is basically why the large region wasn't detected as a SCoP
anymore.

Now, the access is not really "non-affine" but piece-wise affine (which
ScalarEvolution, and thereby Polly, cannot distinguish from non-affine,
see [0]). This is a problem now because of one of the following:

  1) LLVM got smarter and figured out that it can sink and coalesce the
     memory loads in the predecessors of %for.inv147. This "simplifies"
     the code and reduces the code size.
  2) Polly runs later in the pipeline (which it does), which means the
     code was always transformed this way, but usually only after Polly
     was done.

Since I saw this problem a lot recently, I would guess it is because Polly
was moved to the later position in the pipeline. Though, I did not check
if this is true.

Cheers,
  Johannes

[0] https://www.youtube.com/watch?v=xSA0XLYJ-G0

--
Johannes Doerfert
Researcher / PhD Student
Compiler Design Lab (Prof. Hack)
Saarland Informatics Campus, Germany
Building E1.3, Room 4.31
Tel. +49 (0)681 302-57521 : doerfert at cs.uni-saarland.de
Fax. +49 (0)681 302-3065 : http://www.cdl.uni-saarland.de/people/doerfert
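To make the sinking effect described above concrete, here is a minimal C sketch. It is not taken from localcolorcorrection.c; the function names, the array `a`, and the bound `k` are made up for illustration. The first function has one load per branch, each with a subscript that is affine in `i`. The second is a hand-written version of what the code might look like after the common load has been sunk into the join block: a single load whose index is selected by the branch condition, playing the role of a value such as %add118.sink. That index is piece-wise affine, which, per the explanation above, ends up being reported as a non-affine access.

```c
#include <assert.h>

/* Hypothetical loop before sinking: two loads, one per branch.
 * Each subscript (i and i + 1) is affine in the loop counter. */
long sum_branchy(const long *a, long n, long k) {
    long s = 0;
    for (long i = 0; i < n - 1; i++) {
        if (i < k)
            s += a[i];      /* subscript: i     */
        else
            s += a[i + 1];  /* subscript: i + 1 */
    }
    return s;
}

/* Hand-written equivalent of the loop after the common load is sunk:
 * one load, with the index chosen by the condition. The index is
 * piece-wise affine in i (i for i < k, i + 1 otherwise). */
long sum_sunk(const long *a, long n, long k) {
    long s = 0;
    for (long i = 0; i < n - 1; i++) {
        long idx = (i < k) ? i : i + 1;  /* stands in for a "*.sink" value */
        s += a[idx];
    }
    return s;
}

/* Both variants read the same elements, so they must agree. */
void check_variants_agree(void) {
    const long a[6] = {1, 2, 3, 4, 5, 6};
    assert(sum_branchy(a, 6, 3) == sum_sunk(a, 6, 3));
}
```

Both functions compute the same result, so the transformation is fine from a correctness point of view; the only difference is the shape of the subscript the analysis gets to see. Whether the optimizer actually produces the second form from the first depends on the LLVM version and the position of Polly in the pipeline, so treat this as an illustration of the idea rather than a reproduction of the CFGs linked in this thread.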