thr3ads.net - llvm dev - [LLVMdev] Haswell New Instructions [Jun 2011]

Nicolas Capens

2011-Jun-13 11:41 UTC

[LLVMdev] Haswell New Instructions

Hi all,

 

Intel has just revealed its AVX2 instruction set, to be supported by the
2013 Haswell architecture, and it's looking quite revolutionary:
http://software.intel.com/en-us/forums/showthread.php?t=83399&o=a&s=lr

 

It includes powerful 'gather' instructions, which allow reading multiple
vector elements from non-contiguous memory locations. It also extends all
integer vector instructions to 256-bit, and can shift vector elements by
independent counts. This offers tremendous opportunity for auto-vectorizing
loops, since practically every scalar operation will have a direct vector
equivalent. It also facilitates implementing throughput computing languages
like OpenCL.
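
For readers unfamiliar with the two operations mentioned above, here is a minimal scalar sketch in C of what a gather and a per-element variable shift compute, lane by lane. The function names and signatures are hypothetical, chosen only for illustration; they are not Intel intrinsics or LLVM constructs, and the handling of out-of-range shift counts is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar reference for a gather: each output lane reads from its own,
   possibly non-contiguous, location: out[i] = base[index[i]].
   Hypothetical helper for illustration only. */
void gather_i32(int32_t *out, const int32_t *base,
                const int32_t *index, size_t lanes)
{
    for (size_t i = 0; i < lanes; ++i)
        out[i] = base[index[i]];
}

/* Scalar reference for a variable shift: each lane is shifted left by
   its own count. Counts of 32 or more produce 0 here; that choice is an
   assumption of this sketch and also avoids undefined behavior in C. */
void shift_left_var_u32(uint32_t *out, const uint32_t *value,
                        const uint32_t *count, size_t lanes)
{
    for (size_t i = 0; i < lanes; ++i)
        out[i] = count[i] < 32 ? value[i] << count[i] : 0u;
}
```

A vectorizer that has both operations available can, in principle, turn a loop body such as `out[i] = table[idx[i]] << n[i]` into vector code without falling back to scalar loads or shifts.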

 

So I was wondering whether in LLVM a gather operation is best represented
with a 'load' instruction taking vector operands, or whether it's better to
define it as a separate 'gather' instruction. What would be the pros and
cons of each approach, and what do you think should be the long-term goals
for the LLVM instruction set?

 

Cheers,

Nicolas


Jose Fonseca

2011-Jun-13 12:37 UTC

The important thing, IMO, is not to represent the gather operation as an
instruction that takes a vector of pointers, because that is too restrictive
for architectures with 64-bit pointers. What one most frequently wants to do on
those architectures is to specify a 64-bit scalar base pointer with a vector of
32-bit offsets. This matches what the VGATHERxxx instructions described in the
spec provide, and it is what we need, e.g., for efficient texture lookups in
software rendering.

Whether this is represented by extending the LLVM load instruction to take
"load pointer, offset", or by adding a new "gather pointer, offset" LLVM
instruction, is a lesser issue IMO.
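
To make the two candidate operand shapes concrete, here is a minimal scalar sketch in C of both: a gather driven by a vector of pointers, and a gather driven by a scalar base pointer plus a vector of 32-bit element offsets. All names, and the lane count of eight, are assumptions made for illustration; neither function is an LLVM instruction or an Intel intrinsic.

```c
#include <stddef.h>
#include <stdint.h>

enum { LANES = 8 };  /* eight 32-bit lanes, as in a 256-bit vector (assumed) */

/* Form A: one full pointer per lane. */
void gather_from_pointers(float *out, const float *const ptr[LANES])
{
    for (size_t i = 0; i < LANES; ++i)
        out[i] = *ptr[i];
}

/* Form B: one scalar base pointer plus one 32-bit element offset per lane. */
void gather_from_base(float *out, const float *base,
                      const int32_t offset[LANES])
{
    for (size_t i = 0; i < LANES; ++i)
        out[i] = base[offset[i]];
}

/* Size in bytes of the per-lane address operand in each form. */
size_t pointer_vector_bytes(void) { return LANES * sizeof(const float *); }
size_t offset_vector_bytes(void)  { return LANES * sizeof(int32_t); }
```

On a target with 64-bit pointers, the pointer vector in form A occupies 64 bytes (two 256-bit registers' worth), while the offset vector in form B occupies 32 bytes and fits in one. Form B with 64-bit offsets would need the same 64 bytes as form A.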

Jose 

----- Original Message -----
> Hi all,
> Intel has just revealed its AVX2 instruction set, to be supported by
> the 2013 Haswell architecture, and it's looking quite revolutionary:
> http://software.intel.com/en-us/forums/showthread.php?t=83399&o=a&s=lr
> It includes powerful 'gather' instructions, which allow reading
> multiple vector elements from non-contiguous memory locations. It
> also extends all integer vector instructions to 256-bit, and can
> shift vector elements by independent counts. This offers tremendous
> opportunity for auto-vectorizing loops, since practically every
> scalar operation will have a direct vector equivalent. It also
> facilitates implementing throughput computing languages like OpenCL.
> So I was wondering whether in LLVM a gather operation is best
> represented with a 'load' instruction taking vector operands, or
> whether it's better to define it as a separate 'gather' instruction.
> What would be the pros and cons of each approach, and what do you
> think should be the long-term goals for the LLVM instruction set?
> Cheers,
> Nicolas

David A. Greene

2011-Jun-15 19:31 UTC

Jose Fonseca <jfonseca at vmware.com> writes:
> The important thing IMO, is to not represent the gather operation as
> an instruction which takes a vector of pointers, because that's too
> restrictive for architectures with 64bits pointers.
How is it restrictive?
> What one most frequently wants to do in those architectures is to specify a
> 64bit scalar base pointer with a vector of 32bit offsets.
Or 64-bit offsets.  We should not restrict offsets to 32 bits.

                                 -Dave

David A. Greene

2011-Jun-15 19:32 UTC

"Nicolas Capens" <nicolas.capens at gmail.com> writes:
> Hi all,
>
> Intel has just revealed its AVX2 instruction set, to be supported by the
> 2013 Haswell architecture, and it's looking quite revolutionary:
> http://software.intel.com/en-us/forums/showthread.php?t=83399&o=a&s=lr
Hooray!

But boo!  No 64-bit integer multiply yet.  :(

In any case, once I get the major AVX changes in, it should be almost
trivial to add HNI support.  I've got some more patches ready to go in
ASAP.  We're very close to sending "the big one" up.  :)

                              -Dave
