On 6 February 2013 17:03, Nadav Rotem <nrotem at apple.com> wrote:
> I can see why freakishly large vectors would produce bad code. The type
> <50 x float> would be widened to the next power of two, and then split over
> and over again until it fits into registers. So, any <50 x float> would
> take 16 XMM registers, that will be spilled. The situation with integer
> types is even worse because you can truncate or extend from one type to
> another.

In that sense, an inner loop with sequential access would be vectorized into much better code than having a <50 x float>.

Whether this is something LLVM could do with <50 x float> or should always be up to the front-end developer, I don't know. It doesn't seem particularly hard to do in the vectorizer, but it also probably won't be high on the TODO list for a while.

cheers,
--renato
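The "16 XMM registers" figure in the quoted paragraph can be checked with a little arithmetic. The following is a minimal sketch in C, assuming 128-bit (16-byte) XMM registers and 4-byte floats. The function names next_pow2 and regs_needed are hypothetical, and the code models only the widen-then-split arithmetic, not what LLVM's type legalizer actually does internally.

```c
#include <stddef.h>

/* Round n up to the next power of two (for n >= 1). */
size_t next_pow2(size_t n) {
    size_t p = 1;
    while (p < n) {
        p <<= 1;
    }
    return p;
}

/*
 * Registers needed to hold a vector of `elems` elements of `elem_bytes`
 * bytes each, after widening the element count to a power of two and
 * splitting the result into registers of `reg_bytes` bytes.
 * Assumption: this is only the arithmetic, not LLVM's legalizer.
 */
size_t regs_needed(size_t elems, size_t elem_bytes, size_t reg_bytes) {
    size_t widened_bytes = next_pow2(elems) * elem_bytes;
    return (widened_bytes + reg_bytes - 1) / reg_bytes;
}
```

For <50 x float>, regs_needed(50, 4, 16) widens 50 elements to 64, giving 256 bytes, which is 16 registers of 16 bytes each.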
Renato Golin wrote:
[...]
> I can see why freakishly large vectors would produce bad code. The
> type <50 x float> would be widened to the next power of two, and
> then split over and over again until it fits into registers. So,
> any <50 x float> would take 16 XMM registers, that will be spilled.
> The situation with integer types is even worse because you can
> truncate or extend from one type to another.
>
> In that sense, an inner loop with sequential access would be vectorized
> into much better code than having a <50 x float>.
>
> Whether this is something LLVM could do with <50 x float> or should
> always be up to the front-end developer, I don't know. It doesn't seem
> particularly hard to do it in the vectorizer, but it also probably
> won't be high on the TODO list for a while.

I have actually been reading up on the vectorizer. I'm using LLVM 3.2, so the vectorizer isn't turned on by default. Would it be feasible to explicitly *not* use vectors --- switching to aggregates instead --- and then rely on the vectorizer to autovectorize the code where appropriate?

--
┌─── dg@cowlark.com ───── http://www.cowlark.com ─────
│
│ 𝕻𝖍'𝖓𝖌𝖑𝖚𝖎 𝖒𝖌𝖑𝖜'𝖓𝖆𝖋𝖍 𝕮𝖙𝖍𝖚𝖑𝖍𝖚 𝕽'𝖑𝖞𝖊𝖍 𝖜𝖌𝖆𝖍'𝖓𝖆𝖌𝖑 𝖋𝖍𝖙𝖆𝖌𝖓.
│
On 6 February 2013 18:02, David Given <dg at cowlark.com> wrote:
> I have actually been reading up on the vectorizer. I'm using LLVM 3.2,
> so the vectorizer isn't turned on by default.

Not just that, but there is also a lot more coverage since the last release (including floating point).

> Would it be feasible to
> explicitly *not* use vectors --- switching to aggregates instead --- and
> then rely on the vectorizer to autovectorize the code where appropriate?

It depends. If you use vectors that are within the boundaries of the target's vector sizes, then you can possibly generate better code directly.

For instance, if your array has only 3 elements, <3 x float>, the vectorizer could think it's not worth changing it. But if you generate everything in vector types all the way through, the cost of using the vector engines is reduced, and it may be worth it even if the vectorizer thinks otherwise.

As usual, this is not always true, as sometimes the vectorizer sees patterns you don't, or can add run-time checks to do selective vectorization and so on.

In the long term, I think it's best to expect the compiler to do the hard work for you, and to teach the compiler to recognize such cases, rather than add special cases to your own programs. As of now, though, you may have to balance.

It'd be interesting to see a comparison of IRs and benchmarks for programs running with long vectors vs. arrays, and short non-power-of-two vectors vs. arrays.

cheers,
--renato
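The remark about run-time checks for selective vectorization can be illustrated by hand. The following is a minimal sketch in C of the general idea, not the code LLVM emits: test at run time whether two memory ranges overlap, and take a path the compiler is free to vectorize only when they do not. The names ranges_disjoint, add_one_noalias, and add_one are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the n-float ranges starting at a and b do not overlap. */
int ranges_disjoint(const float *a, const float *b, size_t n) {
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    uintptr_t bytes = (uintptr_t)(n * sizeof(float));
    return pa + bytes <= pb || pb + bytes <= pa;
}

/* Fast path: `restrict` promises the compiler that dst and src do not
 * alias, so iterations are independent and can be vectorized. */
static void add_one_noalias(float *restrict dst, const float *restrict src,
                            size_t n) {
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i] + 1.0f;
    }
}

/* Computes dst[i] = src[i] + 1.0f for i in [0, n).
 * Returns 1 if the no-alias path was taken, 0 otherwise. */
int add_one(float *dst, const float *src, size_t n) {
    if (ranges_disjoint(dst, src, n)) {
        add_one_noalias(dst, src, n);
        return 1;
    }
    /* Possible overlap: keep strict element-by-element order. */
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i] + 1.0f;
    }
    return 0;
}
```

The check costs a few comparisons per call, which is why it pays off mainly when the loop body runs many times.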
> Whether this is something LLVM could do with <50 x float> or should
> always be up to the front-end developer, I don't know. It doesn't seem
> particularly hard to do it in the vectorizer, but it also probably
> won't be high on the TODO list for a while.

| I have actually been reading up on the vectorizer. I'm using LLVM 3.2,
| so the vectorizer isn't turned on by default. Would it be feasible to
| explicitly *not* use vectors --- switching to aggregates instead --- and
| then rely on the vectorizer to autovectorize the code where appropriate?

As a pragmatic approach to developing things, I'd say that it's best to view LLVM as a compiler that won't change your code in big ways (even if one or two passes/plugins might). So to rely on the autovectorizer you really want to be producing code that it can easily determine is vectorizable. So while your code generator may not actually be using vectors, I'd think you'd need to be thinking about vectorizability throughout it, to be aware of the way you're expressing things in LLVM IR.

I'd be particularly wary of using arrays with loops on each "conceptual vector" operation, since the compiler may well fail to fuse the loops and hence miss the opportunities. (But then I've been thinking a lot about loop fusion recently, so there's possibly an idée fixe there.)

Regards,
Dave
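The loop-fusion concern above can be made concrete with a small example. The following is a minimal sketch in C, assuming a 50-element "conceptual vector" as in the earlier discussion. The names VLEN, mul_add_unfused, and mul_add_fused are hypothetical. The first function is what a code generator might emit if it lowers each vector operation to its own loop. The second is the form one would hope to get after fusion.

```c
#include <stddef.h>

#define VLEN 50 /* length of the "conceptual vector" */

/* One loop per conceptual vector operation: tmp = a * b, then
 * out = tmp + c. The intermediate goes through memory, and the
 * compiler has to fuse the loops itself to avoid that. */
void mul_add_unfused(const float *a, const float *b, const float *c,
                     float *out) {
    float tmp[VLEN];
    for (size_t i = 0; i < VLEN; i++) {
        tmp[i] = a[i] * b[i];
    }
    for (size_t i = 0; i < VLEN; i++) {
        out[i] = tmp[i] + c[i];
    }
}

/* The same computation as a single loop: no temporary array, and the
 * whole expression is visible to the vectorizer in one loop body. */
void mul_add_fused(const float *a, const float *b, const float *c,
                   float *out) {
    for (size_t i = 0; i < VLEN; i++) {
        out[i] = a[i] * b[i] + c[i];
    }
}
```

A front end that emits the fused form directly does not depend on the optimizer performing the fusion.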