Renato Golin
2013-Jun-07 16:53 UTC
[LLVMdev] NEON vector instructions and the fast math IR flags
On 7 June 2013 15:41, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:

> We don’t want to encode backend knowledge into the vectorizer (i.e. don’t
> vectorize type X because the backend does not support it).

We already do, via the cost table. This case is no different. It might not be the best choice, but it is how the cost table has been built over the last few months.

> The only way to get this result is indirectly via the cost model but the
> backend must still support vectorized IR (it is part of the language) via
> scalarization.

Absolutely! There are two problems to solve: increase the cost for SPFP when UseNEONForSinglePrecisionFP is false, so that vectorizers don't generate such code, and legalize correctly in the backend, for vector code that does not respect that flag.

> (You can of course assign UMAX cost for all floating point vector types in
> the cost model for ARM and get the desired result - this won’t solve the
> problem if somebody else writes the vectorized LLVM IR though)

I wouldn't use UMAX, since the idea is not to forbid, but to tell how expensive it is. But it would be a big number, yes. ;)

cheers,
--renato
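[Editor's sketch] The first of the two problems above (raising the cost of single-precision vector FP when UseNEONForSinglePrecisionFP is false) can be illustrated with a toy cost query. This is not LLVM's TargetTransformInfo interface: ToySubtarget, getToyFPArithCost and every number below are assumptions made up for illustration. The point it tries to show is the one made in the message: the cost becomes large because scalarized code really is more expensive, not because the maximum value is used to forbid vectorization.

```cpp
#include <cassert>

// Hypothetical stand-in for the ARM subtarget flag named in the thread.
struct ToySubtarget {
  bool UseNEONForSinglePrecisionFP;
};

// Toy cost query (an assumption, not an LLVM API): relative cost of one
// single-precision FP arithmetic operation on NumElts lanes.
// NumElts == 1 means a scalar operation.
unsigned getToyFPArithCost(const ToySubtarget &ST, unsigned NumElts) {
  assert(NumElts > 0 && "element count must be positive");
  const unsigned ScalarCost = 1; // assumed unit cost of one scalar FP op
  if (NumElts == 1)
    return ScalarCost;
  if (ST.UseNEONForSinglePrecisionFP)
    return 1; // assumed: one NEON instruction covers the whole vector
  // NEON must not be used for single-precision FP, so the backend would have
  // to scalarize: one scalar op per lane plus an assumed extract and insert
  // per lane.
  const unsigned PerLaneOverhead = 2;
  return NumElts * (ScalarCost + PerLaneOverhead);
}
```

With these made-up numbers, a 4-lane operation costs 1 when NEON may be used and 12 when it may not, so a vectorizer comparing against four scalar operations (cost 4) would prefer to stay scalar, while hand-written vector IR would still need the backend to legalize it correctly.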
Arnold Schwaighofer
2013-Jun-07 17:08 UTC
[LLVMdev] NEON vector instructions and the fast math IR flags
Renato,

I think we agree.

On Jun 7, 2013, at 11:53 AM, Renato Golin <renato.golin at linaro.org> wrote:

> On 7 June 2013 15:41, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
> > We don’t want to encode backend knowledge into the vectorizer (i.e. don’t
> > vectorize type X because the backend does not support it).
>
> We already do, via the cost table. This case is no different. It might not
> be the best choice, but it is how the cost table is being built over the
> last months.

Using the cost model to communicate that the backend will generate wrong code is an abuse (in my opinion, this is not what the cost model is for). This is what I meant by encoding backend knowledge. Of course, we use the cost model to tell us how expensive an operation might be, but we should not use it as an indicator of how wrong it will be ;). (Which is what we would do if we gave a v4f32 operation a high cost because the backend generates instructions that flush denormals to zero.)

What I wanted to say is that even if you give v4f32 a high cost, you still have to solve the real problem in the ARM backend.

> > The only way to get this result is indirectly via the cost model but the
> > backend must still support vectorized IR (it is part of the language) via
> > scalarization.
>
> Absolutely! There are two problems to solve: increase the cost for SPFP
> when UseNEONForSinglePrecisionFP is false, so that vectorizers don't
> generate such code, and legalize correctly in the backend, for vector code
> that does not respect that flag.
>
> > (You can of course assign UMAX cost for all floating point vector types
> > in the cost model for ARM and get the desired result - this won’t solve
> > the problem if somebody else writes the vectorize LLVM IR though)
>
> I wouldn't use UMAX, since the idea is not to forbid, but to tell how
> expensive it is. But it would be a big number, yes. ;)

I was referring to the case where you abuse the cost model to forbid vectorized v4f32 IR (which I thought you were proposing).

What I am suggesting is that (if you care about denormals):

* the ARM backend has to be fixed to scalarize floating point vector operations (behind a flag)
* the ARM target transform model has to correctly reflect that

What one could also do (but I don’t think it is a good idea) is to just give floating point vector operations a max cost. You might run into unforeseen problems, including that other clients are generating vectorized LLVM IR. (This makes me wonder whether we clamp the cost computation at TYPE_MAX :)

> cheers,
> --renato
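[Editor's sketch] The closing question about clamping at TYPE_MAX matters because a per-instruction cost equal to the maximum unsigned value wraps around as soon as anything is added to it, which would make a "forbidden" loop look cheap. The snippet below is a minimal illustration of a saturating sum; saturatingAddCost and totalCost are hypothetical helpers, and nothing here says whether or how LLVM's own cost computation clamps.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper (not an LLVM function): add two costs and clamp the
// result at the largest representable value instead of wrapping around.
unsigned saturatingAddCost(unsigned A, unsigned B) {
  const unsigned Max = std::numeric_limits<unsigned>::max();
  return (A > Max - B) ? Max : A + B;
}

// Sum N per-instruction costs with saturation.
unsigned totalCost(const unsigned *Costs, unsigned N) {
  assert((Costs != nullptr || N == 0) && "non-empty input needs a valid pointer");
  unsigned Total = 0;
  for (unsigned I = 0; I < N; ++I)
    Total = saturatingAddCost(Total, Costs[I]);
  return Total;
}
```

Summing a maximum-cost entry with ordinary costs through totalCost stays at the maximum, whereas plain unsigned addition of the same values would wrap to a small number.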
Renato Golin
2013-Jun-07 17:24 UTC
[LLVMdev] NEON vector instructions and the fast math IR flags
On 7 June 2013 18:08, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:

> What I am suggesting is that (if you care about denormals):
>
> * the arm backend has to be fixed to scalarize floating point vector
>   operations (behind a flag)
> * the arm target transform model has to correctly reflect that

Yup. What I had in mind, too. This is why I asked Tobi to create two bugs, and we would fix them accordingly. ;)

cheers,
--renato