Hi Michael and Florian,
(+ llvm-dev for visibility)

I would like to quickly follow up on "Pragma vectorize_width() implies vectorize(enable)", which got reverted with commit 858a1ae for 2 reasons, see also that revert commit message. Ignore the assert, that's been fixed now.

The other thing is that with the patch behaviour is slightly changed and we could get a diagnostic we didn't get before:

    warning: loop not vectorized: the optimizer was unable to
    perform the requested transformation; the transformation might be disabled or
    specified as part of an unsupported transformation ordering
    [-Wpass-failed=transform-warning]

For the example given in revert 858a1ae, in both cases before and after my commit, the loop vectoriser was bailing because "Not vectorizing: The exiting block is not the loop latch". But the difference is that vectorize_width() now implies vectorize(enable), and so this is now marked as forced vectorisation which wasn't the case before. Because of this forced vectorization, and that the transformation wasn't applied, we now emit this diagnostic. The first part of this diagnostic is spot on: "the optimizer was unable to perform the requested transformation". We could argue about the suggestions given as to why the transformations didn't happen in this case, but overall I think this is an improvement.

I just wanted to check if we are happy with this behaviour? Okay to recommit?

Cheers,
Sjoerd.

> On Oct 2, 2019, at 11:13, Sjoerd Meijer <Sjoerd.Meijer at arm.com> wrote:
>
> Hi Michael and Florian,
> ( + llvm-dev for visibility)
>
> I would like to quickly follow up on "Pragma vectorize_width() implies vectorize(enable)",
> which got reverted with commit 858a1ae for 2 reasons, see also that revert commit message. Ignore the assert, that's been fixed now.
>
> The other thing is that with the patch behaviour is slightly changed and we could get a diagnostic we didn't get before:
>
> warning: loop not vectorized: the optimizer was unable to
> perform the requested transformation; the transformation might be disabled or
> specified as part of an unsupported transformation ordering
> [-Wpass-failed=transform-warning]
>
> For the example given in revert 858a1ae, in both cases before and after my commit, the loop vectoriser was bailing because "Not vectorizing: The exiting block is not the loop latch". But the difference is that vectorize_width() now implies vectorize(enable), and so this is now marked as forced vectorisation which wasn't the case before. Because of this forced vectorization, and that the transformation wasn't applied, we now emit this diagnostic. The first part of this diagnostic is spot on: "the optimizer was unable to perform the requested transformation". We could argue about the suggestions given as to why the transformations didn't happen in this case, but overall I think this is an improvement.
>
> I just wanted to check if we are happy with this behaviour? Okay to recommit?

The additional warning makes sense to me and I think is also beneficial to the user.

Before, we silently ignored vectorize_width() in the example [1] and I suppose the user was expecting vectorize_width(4) to be honored. Now we are more transparent in informing the user what is happening: we were not able to honor their requested pragma and I assume they would be interested in knowing.

But I think it would be good for the warning to also mention that the requested transformation might not be legal (which is the case for building [1] with -Oz). This makes it a little better.

Hans, as you reverted the patch, is the warning (modulo wording) in line with what you would expect?

Cheers,
Florian

[1]
float ScaleSumSamples_C(const float* src, float* dst, float scale, int width) {
  float fsum = 0.f;
  int i;
#if defined(__clang__)
#pragma clang loop vectorize_width(4)
#endif
  for (i = 0; i < width; ++i) {
    float v = *src++;
    fsum += v * v;
    *dst++ = v * scale;
  }
  return fsum;
}

On Wed, Oct 2, 2019 at 07:08, Florian Hahn via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> The other thing is that with the patch behaviour is slightly changed and we could get a diagnostic we didn't get before:
>
> warning: loop not vectorized: the optimizer was unable to
> perform the requested transformation; the transformation might be disabled or
> specified as part of an unsupported transformation ordering
> [-Wpass-failed=transform-warning]
>
> For the example given in revert 858a1ae, in both cases before and after my commit, the loop vectoriser was bailing because "Not vectorizing: The exiting block is not the loop latch".

The source looks like a straightforward canonical loop. What pass transformed it to have code between the exiting block and the latch?

> But the difference is that vectorize_width() now implies vectorize(enable), and so this is now marked as forced vectorisation which wasn't the case before. Because of this forced vectorization, and that the transformation wasn't applied, we now emit this diagnostic. The first part of this diagnostic is spot on: "the optimizer was unable to perform the requested transformation". We could argue about the suggestions given as to why the transformations didn't happen in this case, but overall I think this is an improvement.

The patch that added the warning as-is was by me (https://reviews.llvm.org/D55288). It changed the emitted message from "loop not vectorized: failed explicitly specified loop vectorization" to the lengthier description following review feedback. It's done by the WarnMissedTransformation pass, which just looks for transformation metadata that is still in the IR after all passes that should have transformed it have run. That is, it does not know why the metadata is still there -- it could be because the LoopVectorize pass is not even in the pipeline -- and so we cannot be more specific in the message.
However, -Rpass-missed=loop-vectorize may give more information.

> The additional warning makes sense to me and I think is also beneficial to the user.
>
> Before, we silently ignored vector_width() in the example [1] and I suppose the user was expecting vectorize_width(4) to be honored. Now we are more transparent in informing the user what is happening: we were not able to honor their requested pragma and I assume they would be interested in knowing.

As already mentioned, the loop indeed was never vectorized. It is again the question whether vectorize_width(4) means "vectorize with simd length 4" or "if this is vectorized, use simd length 4". Since we decided on the former (which is also what the docs say), the warning is correct.

That is, IMHO, Chrome should either remove the #pragma (since it has no effect) or add -Wno-pass-failed. We could also wait for the LoopVectorize pass to support this; Philip Reames is currently working on it.

Michael
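
To make the two readings concrete, here is a minimal sketch based on the example [1] quoted earlier in the thread. Under the behaviour discussed above, vectorize_width(4) on its own is treated like the explicit spelling vectorize(enable) vectorize_width(4), which is what the first function writes out. The second function drops the pragma, which is one of the two options suggested for Chrome. The function names scale_sum_forced and scale_sum_plain are made up for illustration and do not appear in the patch or in Chrome. Both functions are intended to compute the same result whether or not the loop ends up vectorized.

```c
/* Explicit spelling of what vectorize_width(4) implies under the
   discussed patch: vectorization is requested ("forced"), so a loop
   that is left unvectorized can be reported via -Wpass-failed. */
float scale_sum_forced(const float *src, float *dst, float scale, int width) {
  float fsum = 0.f;
  int i;
#if defined(__clang__)
#pragma clang loop vectorize(enable) vectorize_width(4)
#endif
  for (i = 0; i < width; ++i) {
    float v = *src++;
    fsum += v * v;      /* running sum of squares */
    *dst++ = v * scale; /* scaled copy of the input */
  }
  return fsum;
}

/* Same loop without any loop hint: the vectorizer decides on its own,
   and no forced-transformation request is attached to the loop. */
float scale_sum_plain(const float *src, float *dst, float scale, int width) {
  float fsum = 0.f;
  int i;
  for (i = 0; i < width; ++i) {
    float v = *src++;
    fsum += v * v;
    *dst++ = v * scale;
  }
  return fsum;
}
```

When building the first function with clang, adding -Rpass-missed=loop-vectorize (mentioned above) may show why the loop was not vectorized at a given optimization level, and -Wno-pass-failed silences the warning if the pragma is kept.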