manolis mi via llvm-dev
2019-Aug-23 15:42 UTC
[llvm-dev] Vectorization fails when dealing with a lot of for loops.
Hello,

Could you please have a look at this code posted on godbolt.org: https://godbolt.org/z/O-O-Q7

The problem is that inside the compute function, only the first loop vectorizes, while the remaining copies of it do not. But if I remove any one of the for loops, then the rest vectorize successfully. Could you please confirm that this is a bug, or otherwise give me more insight into why the vectorization fails? The message "Cannot identify array bounds" is not helpful.

Thank you for your time,
Emmanouil Michalainas, CERN
Nema, Ashutosh via llvm-dev
2019-Aug-26 05:53 UTC
[llvm-dev] Vectorization fails when dealing with a lot of for loops.
Hi Emmanouil,

It seems that when you comment out one of the loops in the "compute" function, the function gets inlined. This helps LV (the loop vectorizer) find the bounds, and it vectorizes the loops. With all the loops present, LAA (loop access analysis) is not able to identify the bounds for these accesses:

LAA: Can't find bounds for ptr: %arrayidx.i159 = getelementptr inbounds double, double* %30, i64 %and.i158, !dbg !46
LAA: Can't find bounds for ptr: %arrayidx.i155 = getelementptr inbounds double, double* %36, i64 %and.i154, !dbg !66
LAA: Can't find bounds for ptr: %arrayidx.i151 = getelementptr inbounds double, double* %42, i64 %and.i150, !dbg !91

It is a bit strange, as it is able to identify the bounds correctly for the first loop.

Without commenting out any loop, if you make the "compute" function inline, it gets vectorized:

- void compute( size_t batchSize,
+ void inline compute( size_t batchSize,

Regards,
Ashutosh
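For readers without the Godbolt page open: the index operand in each of the GEPs above is named %and..., which points to a subscript of the form ptr[i & mask]. Below is a minimal sketch of that access shape. The names MaskedView and multiply, and the convention that the mask is either all ones or zero, are assumptions made for illustration; this is not the code behind the link.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical adapter (not the code from the Godbolt link): every subscript
// is ANDed with a mask before it is used to index the underlying pointer.
struct MaskedView {
    const double* ptr;
    std::size_t mask;  // assumed: all ones for a real array, 0 for a single value

    double operator[](std::size_t i) const { return ptr[i & mask]; }
};

// The read address depends on (i & mask), which is not a linear function of i,
// so the lowest and highest address touched by the loop do not follow from
// the trip count alone.
void multiply(std::size_t n, double* out, MaskedView a, MaskedView b) {
    assert(out != nullptr || n == 0);
    for (std::size_t i = 0; i < n; ++i)
        out[i] = a[i] * b[i];
}
```

With an all-ones mask the view behaves like a plain array, and with a zero mask every iteration reads element 0, so one loop body can serve both cases.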
manolis mi via llvm-dev
2019-Aug-26 09:12 UTC
[llvm-dev] Re: Vectorization fails when dealing with a lot of for loops.
Hello Ashutosh,

Thank you for the answer. However, the "compute" function is intended to be compiled into a library, so it cannot be inlined. As you said, it is odd that the first loop vectorizes without the "cannot identify array bounds" problem, which makes us think that this is a bug. We have already written more complex code that vectorizes, which shows us the potential of the auto-vectorizer, and we really need it for our project.

Regards,
Emmanouil
Michael Kruse via llvm-dev
2019-Aug-26 21:22 UTC
[llvm-dev] Vectorization fails when dealing with a lot of for loops.
There is some weird interaction between alias analysis and the vectorized code. For the first loop, alias analysis is able to determine that "output" does not alias the read-only arrays:

LAA: Processing memory accesses...
  AST: Alias Set Tracker: 2 alias sets for 7 pointer values.
  AliasSet[0x16a131c3120, 1] must alias, No access Pointers: (double* %arrayidx6, unknown)
  AliasSet[0x16a131c1e90, 6] may alias, No access Pointers: (double* %arrayidx, unknown), (double* %arrayidx1, unknown), (double** %_pointer.i, unknown), (i64* %_mask.i, unknown), (double* %arrayidx.i, unknown), (double* %arrayidx5, unknown)

After vectorizing the first loop, it is not able to do this anymore (I did not investigate why). When trying to vectorize the second loop, it requires a runtime condition to guard against aliasing (for which it needs to determine the loop bounds), but it is unable to produce one because of the and-mask of "G"/BracketAdapterWithMask:

%arrayidx.i159 --- or as SCEV: ((8 * %and.i158)<nsw> + %36)<nsw>

When one of the for loops is removed, the entire compute function is inlined into the run function and this problem is magically resolved. I am not sure why.

Would you file a bug report?

Michael
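To make the "runtime condition" concrete: a guard of this kind compares the address ranges that the loop writes and reads, and runs the fast version only when they are disjoint. The sketch below writes such a guard by hand for a masked read, in[i & mask]. The names disjoint, maskedExtent and addMasked are hypothetical, and the bound used (i & mask never exceeds min(i, mask)) is a conservative hand-derived one; it illustrates the idea and makes no claim about what LLVM computes internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when [a, a + aLen) and [b, b + bLen) do not overlap.
// Comparing addresses as integers is implementation-defined, but behaves as
// expected on common flat-memory platforms.
bool disjoint(const double* a, std::size_t aLen, const double* b, std::size_t bLen) {
    const auto a0 = reinterpret_cast<std::uintptr_t>(a);
    const auto b0 = reinterpret_cast<std::uintptr_t>(b);
    const auto a1 = a0 + aLen * sizeof(double);
    const auto b1 = b0 + bLen * sizeof(double);
    return a1 <= b0 || b1 <= a0;
}

// Upper bound on how many elements in[i & mask] can touch for i in [0, n):
// (i & mask) <= min(i, mask), so the largest index is at most min(n - 1, mask).
std::size_t maskedExtent(std::size_t n, std::size_t mask) {
    if (n == 0) return 0;
    return std::min(n - 1, mask) + 1;
}

// Returns true when the guarded (no-overlap) path was taken.
bool addMasked(std::size_t n, double* out, const double* in, std::size_t mask) {
    assert((out != nullptr && in != nullptr) || n == 0);
    if (disjoint(out, n, in, maskedExtent(n, mask))) {
        // Ranges are disjoint: iterations are independent, so this loop
        // may be executed in vector-sized chunks.
        for (std::size_t i = 0; i < n; ++i)
            out[i] = in[i & mask] + 1.0;
        return true;
    }
    // Possible overlap: keep strict scalar order.
    for (std::size_t i = 0; i < n; ++i)
        out[i] = in[i & mask] + 1.0;
    return false;
}
```

The point of the sketch is that the guard needs a lower and upper address for every access. For out[i] that range follows directly from the trip count, while for the masked read it has to be derived from properties of the AND, which is the step the "Can't find bounds" message reports as failing.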