thr3ads.net - llvm dev - [LLVMdev] LLVMdev Digest, Vol 80, Issue 13 [Feb 2011]

If this information is useful, please help other people find it:
Share via:

Peter Lawrence

2011-Feb-14 20:49 UTC

[LLVMdev] LLVMdev Digest, Vol 80, Issue 13

Andrew,
                your response highlights a naming problem in LLVM,  
which is that  "array" and "vector"
mean the same thing in normal computer language and compiler theory  
usage, so it is
inconvenient and misleading within LLVM to give one a very specific  
meaning that is different
from the other....

to the LLVM developers I would suggest using the term "packed" to  
refer to the type of
data that the SPARC-VIS, the PPC-Altivec, and the Intel-mmx/sse  
(among others) instruction
sets support.

As far as I am aware not a single one of any of the above types of
instruction sets allows the "subscripting" of packed data within a  
register  (the Maspar
computer had hardware that would allow subscripting of sub-elements  
of data within
a larger/wider register, but it was the exception, not the rule, and  
it did not support
any of the saturating arithmetic that is part-and-parcel of the  
packed data types in
the currently existing "multi-media instruction sets").


sincerely,
Peter Lawrence.



>
> Could you use an array of pointers instead? As far as I'm aware, the
> vector types in LLVM are intended to abstract the vector units on  
> modern
> CPUs (SSE, etc.) which generally support operations on only  
> integers and
> floating point values, which may be why there isn't native support for
> vectors of pointers.
>
> Andrew

Andrew Clinton

2011-Feb-14 21:14 UTC

head link

[LLVMdev] LLVMdev Digest, Vol 80, Issue 13

Agreed, I too was wondering why we need both arrays and vectors.  It 
goes against the grain, I think, of the structure typing system used by 
LLVM.  For example, a vector of 4 floats and an array of 4 floats are 
structurally the same type.  Would it be feasible in the future to 
consolidate the two types by allowing "vector" operations (add, 
multiply, etc.) on arrays where it makes sense, and doing away with the 
specialized vector types?

Andrew

On 02/14/2011 03:49 PM, Peter Lawrence wrote:> Andrew,
>                  your response highlights a naming problem in LLVM,
> which is that  "array" and "vector"
> mean the same thing in normal computer language and compiler theory
> usage, so it is
> inconvenient and misleading within LLVM to give one a very specific
> meaning that is different
> from the other....
>
> to the LLVM developers I would suggest using the term "packed" to
> refer to the type of
> data that the SPARC-VIS, the PPC-Altivec, and the Intel-mmx/sse
> (among others) instruction
> sets support.
>
> As far as I am aware not a single one of any of the above types of
> instruction sets allows the "subscripting" of packed data within
a
> register  (the Maspar
> computer had hardware that would allow subscripting of sub-elements
> of data within
> a larger/wider register, but it was the exception, not the rule, and
> it did not support
> any of the saturating arithmetic that is part-and-parcel of the
> packed data types in
> the currently existing "multi-media instruction sets").
>
>
> sincerely,
> Peter Lawrence.
>

Reid Kleckner

2011-Feb-14 21:27 UTC

head link

[LLVMdev] LLVMdev Digest, Vol 80, Issue 13

On Mon, Feb 14, 2011 at 3:49 PM, Peter Lawrence
<peterl95124 at sbcglobal.net> wrote:> Andrew,
>                your response highlights a naming problem in LLVM,
> which is that  "array" and "vector"
> mean the same thing in normal computer language and compiler theory
> usage, so it is
> inconvenient and misleading within LLVM to give one a very specific
> meaning that is different
> from the other....
The only place in CS I know of that "vector" is used to mean
arbitrarily long sequence of elements is the C++ STL vector template
class.  So far as I know in the field of compilers "vectorization"
always refers to lowering code down to use SIMD operations.
> to the LLVM developers I would suggest using the term "packed" to
> refer to the type of
> data that the SPARC-VIS, the PPC-Altivec, and the Intel-mmx/sse
> (among others) instruction
> sets support.
"packed" is already used to refer to "packed structs", ie
structs
without any padding.

Reid

Peter Lawrence

2011-Feb-14 22:29 UTC

head link

[LLVMdev] LLVMdev Digest, Vol 80, Issue 13

Reid,
           although K&R's  "The C Programming Language" does
not use
the term vector,
note that Stroustrup's "The C++ Programming Language" does (way  
before STL was ever envisioned) !

Metcalf's "Effective Fortran-77" does  not use the term vector,   
except that it is frequently used as the "name"
for an array in array usage examples, but then in "Effective  
Fortran-90", when an array is used as an index
into an array it is referred to as a "vector-valued-subscript", not  
an "array-valued-subscript" !

Guy Steel's "Common Lisp"  quote: "one-dimensional arrays are
called
_vectors_ in Common Lisp...".

the equivalent and interchangable usage of "array" and
"vector" goes
back to the beginning of computing....

when multiple items are placed within a single word of computer  
memory or a single register, it is "packed"
data, regardless of whether that it is of uniform type and size  
(array/vector-like) or of different type
and size (struct/record-like).

best,
-Peter Lawrence.

On Feb 14, 2011, at 1:27 PM, Reid Kleckner wrote:
> On Mon, Feb 14, 2011 at 3:49 PM, Peter Lawrence
> <peterl95124 at sbcglobal.net> wrote:
>> Andrew,
>>                your response highlights a naming problem in LLVM,
>> which is that  "array" and "vector"
>> mean the same thing in normal computer language and compiler theory
>> usage, so it is
>> inconvenient and misleading within LLVM to give one a very specific
>> meaning that is different
>> from the other....
>
> The only place in CS I know of that "vector" is used to mean
> arbitrarily long sequence of elements is the C++ STL vector template
> class.  So far as I know in the field of compilers
"vectorization"
> always refers to lowering code down to use SIMD operations.
>
>> to the LLVM developers I would suggest using the term
"packed" to
>> refer to the type of
>> data that the SPARC-VIS, the PPC-Altivec, and the Intel-mmx/sse
>> (among others) instruction
>> sets support.
>
> "packed" is already used to refer to "packed structs",
ie structs
> without any padding.
>
> Reid

Villmow, Micah

2011-Feb-14 23:44 UTC

head link

[LLVMdev] LLVMdev Digest, Vol 80, Issue 13

Andrew,
 This is one area of LLVM that maps very nicely to our GPU architecture. A
vector is a native data type on these architectures. For example, on AMD's
GPUs, the native type is 4x32bit vector with sub-components. Each of the
individual 32bits can be indexed separately, but not dynamically. This is a big
difference from an array of 32bit values. So there are cases where the meaning
of the two are different on modern hardware.

Micah
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at
cs.uiuc.edu]
> On Behalf Of Andrew Clinton
> Sent: Monday, February 14, 2011 1:14 PM
> To: llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] LLVMdev Digest, Vol 80, Issue 13
> 
> Agreed, I too was wondering why we need both arrays and vectors.  It
> goes against the grain, I think, of the structure typing system used by
> LLVM.  For example, a vector of 4 floats and an array of 4 floats are
> structurally the same type.  Would it be feasible in the future to
> consolidate the two types by allowing "vector" operations (add,
> multiply, etc.) on arrays where it makes sense, and doing away with the
> specialized vector types?
> 
> Andrew
> 
> On 02/14/2011 03:49 PM, Peter Lawrence wrote:
> > Andrew,
> >                  your response highlights a naming problem in LLVM,
> > which is that  "array" and "vector"
> > mean the same thing in normal computer language and compiler theory
> > usage, so it is
> > inconvenient and misleading within LLVM to give one a very specific
> > meaning that is different
> > from the other....
> >
> > to the LLVM developers I would suggest using the term
"packed" to
> > refer to the type of
> > data that the SPARC-VIS, the PPC-Altivec, and the Intel-mmx/sse
> > (among others) instruction
> > sets support.
> >
> > As far as I am aware not a single one of any of the above types of
> > instruction sets allows the "subscripting" of packed data
within a
> > register  (the Maspar
> > computer had hardware that would allow subscripting of sub-elements
> > of data within
> > a larger/wider register, but it was the exception, not the rule, and
> > it did not support
> > any of the saturating arithmetic that is part-and-parcel of the
> > packed data types in
> > the currently existing "multi-media instruction sets").
> >
> >
> > sincerely,
> > Peter Lawrence.
> >
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

David A. Greene

2011-Feb-15 16:38 UTC

head link

[LLVMdev] LLVMdev Digest, Vol 80, Issue 13

Peter Lawrence <peterl95124 at sbcglobal.net> writes:
> Andrew, your response highlights a naming problem in LLVM, which is
> that "array" and "vector" mean the same thing in normal
computer
> language and compiler theory usage, so it is inconvenient and
> misleading within LLVM to give one a very specific meaning that is
> different from the other....
I think any sort of separation at all is counterproductive.  The
existing array/vector split is artificial.  It would be better to have
one array-like type and allow a reasonable set of operations on it.
Target lowering should take care of legality issues.  For best
performance various transformation patterns will want to know about the
target but that's true independent of vector types.  Scalar optimizers
want to know about targets too.
> As far as I am aware not a single one of any of the above types of
> instruction sets allows the "subscripting" of packed data within
a
> register
Given what we know of Larrabee and speculating that the "Knights"
family
is likely a derivative of it, it's safe to assume that future Intel
architectures will be much more like traditional vector machines.  That
means gather/scatter, element indexing, etc.  The existing PINSR/PEXTR
and shuffle instructions already allow a degree of element indexing.
Note that the existing LLVM vector types already have insert/extract
operators.

Unifying array and vector and generalizing the result would open a lot
of optimization opportunities.

                              -Dave

David A. Greene

2011-Feb-15 16:38 UTC

head link

[LLVMdev] LLVMdev Digest, Vol 80, Issue 13

Andrew Clinton <andrew at sidefx.com> writes:
> Agreed, I too was wondering why we need both arrays and vectors.  It 
> goes against the grain, I think, of the structure typing system used by 
> LLVM.  For example, a vector of 4 floats and an array of 4 floats are 
> structurally the same type.  Would it be feasible in the future to 
> consolidate the two types by allowing "vector" operations (add, 
> multiply, etc.) on arrays where it makes sense, and doing away with the 
> specialized vector types?
Actually, I do not believe an array of 4 floats and a vector of 4 floats
are the same.  There are various bit packing and alignment semantics
that differ.

                            -Dave

David A. Greene

2011-Feb-15 16:42 UTC

head link

[LLVMdev] LLVMdev Digest, Vol 80, Issue 13

Reid Kleckner <reid.kleckner at gmail.com> writes:
> So far as I know in the field of compilers "vectorization" always
> refers to lowering code down to use SIMD operations.
Not true at all.  "Vectorization" as mapping exclusicely to
fixed-length
SSE-style instructions is relatively new.  Traditionally, vectorization
has meant targeting vector registers with variable-length operations and
masks.  The SSE-style instructions sets are very limited in comparison.

The Allen/Kennedy book is a good resource.

http://www.amazon.com/Optimizing-Compilers-Modern-Architectures-Dependence-based/dp/1558602860

                           -Dave

Peter Lawrence

2011-Feb-16 18:20 UTC

head link

[LLVMdev] LLVMdev Digest, Vol 80, Issue 13

Dave,
> Unifying array and vector and generalizing the result would open a lot
> of optimization opportunities.
you would be piling an incomplete optimization on top of a pile of  
already
incomplete optimizations...  Vectorization in Fortran is already a  
"hard" problem,
requiring alias analysis (always an incomplete and inaccurate  
(conservative)
analysis) and loop-carried array subscript dependence analysis (which is
equivalent to the Diophantine Equation problem in Mathematics which  
is in
general not solvable, so you end up again with incomplete and inaccurate
(conservative) analysis). Doing this in C (without first class array/ 
matrix
types), and with even more alias analysis issues, makes it more  
problematic.

The final straw with all these multimedia instruction sets is that  
they require
large alignment on their "packed array" data types (even Intel which  
started
out not requiring alignment with MMX (though unaligned data invoked hugh
performance penalties), did evolve with SSE to what everyone else  
requires).

data alignment can (and should!) be analyzed within the same  
algorithms that do
alias analysis, and the analysis has the same inherent limitations.

The problem is that in real world applications it is typical for  
array data slices
(ie sections of arrays that are passed to subroutines to be  
processed)  to be unaligned,
even if the array base address is aligned, the bounds of the section  
being processed
are in general not aligned.

You end up with wanting to clone your algorithm kernel for various  
incoming
alignments (just like memcpy, memcmp, etc are often cloned internally  
for
various relative alignments of the incoming arguments), but with a  
kernel
accessing N different arrays you end up needing 2**N clones, which in  
general
is an impractical code-explosion.

The reason I object to the use of "vector" and "simd" when
describing
these
"packed data" multimedia instruction sets is that in practical  
reality the traditional
vectorization optimization technology  just does not apply.  You can  
always
come up with geewiz examples where it does, but you cannot make it work
in the general case.

No matter what fancy data shuffling/permuting/inserting/extracting  
instructions
get added to MMX/SSE/SSE2, they will still not solve the data  
alignment problem,
so the instruction sets remain incompatible with "traditional vector  
machines"
where there was always one-data-item-per-HW-register and there was never
any alignment issue.

best,
Peter Lawrence.

On Feb 15, 2011, at 8:38 AM, David A. Greene wrote:
> Peter Lawrence <peterl95124 at sbcglobal.net> writes:
>
>> Andrew, your response highlights a naming problem in LLVM, which is
>> that "array" and "vector" mean the same thing in
normal computer
>> language and compiler theory usage, so it is inconvenient and
>> misleading within LLVM to give one a very specific meaning that is
>> different from the other....
>
> I think any sort of separation at all is counterproductive.  The
> existing array/vector split is artificial.  It would be better to have
> one array-like type and allow a reasonable set of operations on it.
> Target lowering should take care of legality issues.  For best
> performance various transformation patterns will want to know about  
> the
> target but that's true independent of vector types.  Scalar optimizers
> want to know about targets too.
>
>> As far as I am aware not a single one of any of the above types of
>> instruction sets allows the "subscripting" of packed data
within a
>> register
>
> Given what we know of Larrabee and speculating that the "Knights"
> family
> is likely a derivative of it, it's safe to assume that future Intel
> architectures will be much more like traditional vector machines.   
> That
> means gather/scatter, element indexing, etc.  The existing PINSR/PEXTR
> and shuffle instructions already allow a degree of element indexing.
> Note that the existing LLVM vector types already have insert/extract
> operators.
>
> Unifying array and vector and generalizing the result would open a lot
> of optimization opportunities.
>
>                               -Dave

Possibly Parallel Threads

Search for more possibly parallel threads

llvm dev - Feb 2011 - [LLVMdev] LLVMdev Digest, Vol 80, Issue 13

[LLVMdev] LLVMdev Digest, Vol 80, Issue 13

[LLVMdev] LLVMdev Digest, Vol 80, Issue 13

[LLVMdev] LLVMdev Digest, Vol 80, Issue 13

[LLVMdev] LLVMdev Digest, Vol 80, Issue 13

[LLVMdev] LLVMdev Digest, Vol 80, Issue 13

[LLVMdev] LLVMdev Digest, Vol 80, Issue 13

[LLVMdev] LLVMdev Digest, Vol 80, Issue 13

[LLVMdev] LLVMdev Digest, Vol 80, Issue 13

[LLVMdev] LLVMdev Digest, Vol 80, Issue 13

Possibly Parallel Threads