thr3ads.net - llvm dev - [LLVMdev] On large vectors [Feb 2013]

Renato Golin

2013-Feb-06 17:34 UTC

[LLVMdev] On large vectors

On 6 February 2013 17:03, Nadav Rotem <nrotem at apple.com> wrote:
> I can see why freakishly large vectors would produce bad code.  The type
> <50 x float> would be widened to the next power of two, and then split
> over and over again until it fits into registers.  So, any <50 x float>
> would take 16 XMM registers, that will be spilled. The situation with
> integer types is even worse because you can truncate or extend from one
> type to another.
>
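The register count in the quoted paragraph follows from simple arithmetic: 50 elements are widened to 64, and 64 floats at four lanes per 128-bit XMM register need 16 registers. A minimal sketch of that calculation, where `next_pow2` and `regs_needed` are hypothetical helper names for illustration and not LLVM functions:

```c
#include <assert.h>

/* Smallest power of two that is >= n (n must be at least 1). */
unsigned next_pow2(unsigned n) {
    assert(n >= 1);
    unsigned p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* Hypothetical helper: number of registers occupied once the vector has
 * been widened to a power of two and split into register-sized pieces.
 * lanes_per_reg is 4 for floats in a 128-bit XMM register. */
unsigned regs_needed(unsigned elems, unsigned lanes_per_reg) {
    assert(lanes_per_reg >= 1);
    unsigned widened = next_pow2(elems);
    return (widened + lanes_per_reg - 1) / lanes_per_reg;
}
```

With these definitions, `regs_needed(50, 4)` evaluates to 16, which matches the figure quoted above.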
In that sense, an inner loop with sequential access would be vectorized
into much better code than having a <50 x float>.

Whether this is something LLVM could do with <50 x float> or should always
be up to the front-end developer, I don't know. It doesn't seem
particularly hard to do in the vectorizer, but it also probably won't
be high on the TODO list for a while.
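For illustration, an inner loop with sequential access of the kind mentioned above might look like the sketch below. The function name `add_floats` and the use of `restrict` are assumptions made for this example; whether a given compiler version actually vectorizes the loop depends on its cost model and on the vectorizer being enabled.

```c
#include <assert.h>
#include <stddef.h>

/* Element-wise sum over n floats, written as a plain scalar loop.
 * `restrict` states that the buffers do not overlap, which removes one
 * common obstacle to vectorization. The compiler is free to pick the
 * vector width that suits the target instead of being handed a
 * fixed <50 x float>. */
void add_floats(const float *restrict a, const float *restrict b,
                float *restrict out, size_t n) {
    assert(a && b && out);
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

A front end would call this with `n = 50` for the case discussed here; the leftover elements that do not fill a whole vector are then the compiler's problem rather than the front end's.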

cheers,
--renato

David Given

2013-Feb-06 18:02 UTC

Renato Golin wrote:
[...]
>     I can see why freakishly large vectors would produce bad code.  The
>     type <50 x float> would be widened to the next power of two, and
>     then split over and over again until it fits into registers.  So,
>     any <50 x float> would take 16 XMM registers, that will be spilled.
>     The situation with integer types is even worse because you can
>     truncate or extend from one type to another.
>
> In that sense, an inner loop with sequential access would be vectorized
> into much better code than having a <50 x float>.
>
> Whether this is something LLVM could do with <50 x float> or should
> always be up to the front-end developer, I don't know. It doesn't seem
> particularly hard to do in the vectorizer, but it also probably
> won't be high on the TODO list for a while.

I have actually been reading up on the vectorizer. I'm using LLVM 3.2,
so the vectorizer isn't turned on by default. Would it be feasible to
explicitly *not* use vectors --- switching to aggregates instead --- and
then rely on the vectorizer to autovectorize the code where appropriate?

-- 
┌─── ｄｇ＠ｃｏｗｌａｒｋ．ｃｏｍ ───── http://www.cowlark.com ─────
│
│ 𝕻𝖍'𝖓𝖌𝖑𝖚𝖎 𝖒𝖌𝖑𝖜'𝖓𝖆𝖋𝖍 𝕮𝖙𝖍𝖚𝖑𝖍𝖚 𝕽'𝖑𝖞𝖊𝖍 𝖜𝖌𝖆𝖍'𝖓𝖆𝖌𝖑 𝖋𝖍𝖙𝖆𝖌𝖓.
│

Renato Golin

2013-Feb-06 18:40 UTC

On 6 February 2013 18:02, David Given <dg at cowlark.com> wrote:
> I have actually been reading up on the vectorizer. I'm using LLVM 3.2,
> so the vectorizer isn't turned on by default.

Not just that, but the vectorizer also covers a lot more cases since the
last release (including floating point).

> Would it be feasible to explicitly *not* use vectors --- switching to
> aggregates instead --- and then rely on the vectorizer to autovectorize
> the code where appropriate?

It depends. If you use vectors that are within the boundaries of the
target's vector sizes, then you can possibly generate better code directly.
For instance, if your array has only 3 elements (<3 x float>), the
vectorizer could decide it's not worth changing. But if you generate vector
types all the way through, the cost of using the vector engine is reduced,
and it may be worth it even if the vectorizer thinks otherwise. As usual,
this is not always true, as sometimes the vectorizer sees patterns you
don't, or can add run-time checks to do selective vectorization and so on.
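To make the two options concrete, here is a rough sketch at the C level. It assumes the GCC/Clang `vector_size` extension, which is not mentioned in this thread, and it pads the 3-element case to four lanes so that it fits one 128-bit register; the names `float4`, `add_vec` and `add_arr` are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: GCC/Clang vector extension. Four floats treated as one
 * SIMD value; a 3-element vector would use three lanes and ignore the
 * fourth. */
typedef float float4 __attribute__((vector_size(16)));

/* Explicit-vector version: the addition is a vector operation from the
 * start, so no vectorizer has to decide whether it is worth it. */
float4 add_vec(float4 a, float4 b) {
    return a + b;
}

/* Array version: a 3-iteration scalar loop. The compiler may unroll it,
 * vectorize it, or leave it alone, depending on its cost model. */
void add_arr(const float a[3], const float b[3], float out[3]) {
    assert(a && b && out);
    for (size_t i = 0; i < 3; ++i)
        out[i] = a[i] + b[i];
}
```

Both functions compute the same sums; comparing the IR or assembly a compiler emits for each is one way to see which form it handles better on a given target.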

In the long term, I think it's best to expect the compiler to do the hard
work for you, and to teach the compiler to recognize such cases, rather
than add special cases to your own programs. As of now, though, you may
have to strike a balance.

It'd be interesting to see a comparison of IRs and benchmarks for programs
running with long vectors vs. arrays, and short non-power-of-two vectors
vs. arrays.

cheers,
--renato

David Tweed

2013-Feb-07 10:00 UTC

> Whether this is something LLVM could do with <50 x float> or should
> always be up to the front-end developer, I don't know. It doesn't seem
> particularly hard to do in the vectorizer, but it also probably
> won't be high on the TODO list for a while.
| I have actually been reading up on the vectorizer. I'm using LLVM 3.2,
| so the vectorizer isn't turned on by default. Would it be feasible to
| explicitly *not* use vectors --- switching to aggregates instead --- and
| then rely on the vectorizer to autovectorize the code where appropriate?

As a pragmatic approach to developing things, I'd say that it's best to
view LLVM as a compiler that won't change your code in big ways (even if one
or two passes/plugins might). So to rely on the autovectorizer you really want
to be producing code that it can easily recognize as vectorizable. So
while your code-generator may not actually be using vectors, I'd think
you'd need to be thinking about vectorizability throughout it, to be aware of
the way you're expressing things in LLVM IR. I'd be particularly wary of
using arrays with a loop for each "conceptual vector" operation, since
the compiler may well fail to fuse the loops and therefore miss the
opportunities. (But then I've been thinking a lot about loop fusion recently,
so there's possibly an idée fixe there.)
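The loop-fusion concern can be illustrated with a small sketch. The function names and the fixed length `LEN` are invented for this example; the first version is what a code generator emitting one loop per "conceptual vector" operation would produce, the second is what it would look like after fusion, whether done by the front end or by the compiler.

```c
#include <assert.h>
#include <stddef.h>

enum { LEN = 50 }; /* illustrative length, matching the <50 x float> case */

/* One loop per operation: tmp = a + b, then out = tmp * c.
 * The intermediate array goes through memory unless the compiler
 * manages to fuse the two loops. */
void add_mul_unfused(const float *a, const float *b, const float *c,
                     float *out) {
    assert(a && b && c && out);
    float tmp[LEN];
    for (size_t i = 0; i < LEN; ++i)
        tmp[i] = a[i] + b[i];
    for (size_t i = 0; i < LEN; ++i)
        out[i] = tmp[i] * c[i];
}

/* Same computation as a single loop: no temporary array, and the whole
 * expression is visible to the vectorizer in one loop body. */
void add_mul_fused(const float *a, const float *b, const float *c,
                   float *out) {
    assert(a && b && c && out);
    for (size_t i = 0; i < LEN; ++i)
        out[i] = (a[i] + b[i]) * c[i];
}
```

Both functions return the same values; the difference is only in how much work is left to the optimizer.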

Regards,
Dave
