Amara Emerson via llvm-dev
2017-Jul-06 22:03 UTC
[llvm-dev] [RFC][SVE] Supporting Scalable Vector Architectures in LLVM IR (take 2)
[Sending again to list]

Hi Chris,

Responses inline...

On 6 July 2017 at 21:02, Chris Lattner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Thanks for sending this out Graham. Here are some comments:
>
> This is a clever approach to unifying the two concepts, and I think that the approach is basically reasonable. The primary problem that this will introduce is:
>
> 1) Almost anything touching (e.g. transforming) vector operations will have to be aware of this concept. Given a first class implementation of SVE, I don’t see how that’s avoidable though, and your extension of VectorType is sensible.

Yes, however we have found that the vast majority of vector transforms don't need any modification to deal with scalable types. There are obvious exceptions, such as analysing shuffle vector masks for specific patterns.

> 2) This means that VectorType is sometimes fixed size, and sometimes unknowable. I don’t think we have an existing analog for that in the type system.
>
> Is this type a first class type? Can you PHI them, can you load/store them, can you pass them as function arguments without limitations? If not, that is a serious problem. How does struct layout with a scalable vector in it work? What does an alloca of one of them look like? What does a spill look like in codegen?

Yes, as an extension of VectorType they can be manipulated and passed around like normal vectors: loaded/stored directly, used in phis, put in LLVM structs, etc. Address computation generates expressions in terms of vscale, and it seems to work well.

> I think that a target-specific type (e.g. like we have X86_mmx) is the only reasonable alternative. A subclass of VectorType is just another implementation approach of your design above. This is assuming that scalable vectors are really first class types.
>
> The pros and cons of a separate type are that it avoids you having to touch everything that touches VectorTypes, and if it turns out that the code that needs to handle normal SIMD and scalable SIMD vectors is different, then it is a win to split them into two types. If, on the other hand, most code would treat the two types similarly, then it is better to just have one type.

Fortunately, the latter case is exactly what we've found. Most operations on vectors are not actually concerned with their absolute size; if anything, they are more usually concerned with relative sizes.

> The major concern I have here is that I’m not sure how scalable vectors can be considered to be first class types, given that we don’t know their size. If they can’t be put in an LLVM struct (for example), then this would pose a significant problem with your current approach. It would be a huge problem if VectorType could be in structs in some cases, but not others.

We can have them as first class types, but as you say it does require us to be careful when reasoning about their sizes. In practice there are architectural limits on the sizes of vectors, so it's possible to have an upper bound on the size. To be completely accurate, however, type sizes in LLVM would probably need some symbolic representation so that we can reason about them in terms of, essentially, the vscale constant. The other potential avenue is to make all type size queries in LLVM return optional values. We haven't implemented either of these and we haven't yet hit an issue, though that's not to say there isn't one. I think most uses of type size queries are comparisons against other type sizes, so relative comparisons still work even with scalable types. This is an area where we want community input to build consensus, though.

>> With a scalable vector type defined, we now need a way to generate addresses for consecutive vector values in memory and to be able to create basic constant vector values.
>>
>> For address generation, the `vscale` constant is added to represent the runtime value of `n` in `<n x m x type>`.
>
> This should probably be an intrinsic, not an llvm::Constant. The design of llvm::Constant is already wrong: it shouldn’t have operations like divide, and it would be better to not contribute to the problem.

Could you explain your position on this further? From our perspective, the Constant architecture has been a very natural fit for this concept.

>> Multiplying `vscale` by `m` and the number of bytes in `type` gives the total length of a scalable vector, and the backend can pattern match to the various load and store instructions in SVE that automatically scale with vector length.
>
> It is fine for the intrinsic to turn into a target specific ISD node in selection dag to allow your pattern matching.

>> How do we spill/fill scalable registers on the stack?
>> -----------------------------------------------------
>>
>> SVE registers have a (partially) unknown size at build time, and their associated fill/spill instructions require an offset that is implicitly scaled by the vector length instead of by bytes or element size. To accommodate this we created the concept of Stack Regions: areas on the stack associated with specific data types or register classes.
>
> Ok, that sounds complicated, but can surely be made to work. The bigger problem is that there are various LLVM IR transformations that want to put registers into memory. All of these will be broken with this sort of type.

Could you give an example?

Thanks for taking the time to review this,
Amara
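To make the address-generation idea concrete, a rough sketch in the RFC's proposed `<n x m x type>` notation might look like the following. The exact spelling of `vscale` expressions follows the RFC's proposal rather than any final IR syntax, and the loop body is purely illustrative:

```llvm
; Illustrative only: stride through a buffer one whole scalable vector
; (vscale * 4 i32 elements) per iteration.
define void @stride(i32* %base, i64 %nelts) {
entry:
  br label %loop
loop:
  %i = phi i64 [ 0, %entry ], [ %i.next, %loop ]
  %addr = getelementptr i32, i32* %base, i64 %i
  %vptr = bitcast i32* %addr to <n x 4 x i32>*
  %v = load <n x 4 x i32>, <n x 4 x i32>* %vptr
  ; ... process %v ...
  %i.next = add i64 %i, mul (i64 vscale, i64 4)  ; advance vscale*4 elements
  %done = icmp uge i64 %i.next, %nelts
  br i1 %done, label %exit, label %loop
exit:
  ret void
}
```

Here `vscale` appears inside a constant expression, which is the point of contention above: the backend would fold `mul (i64 vscale, i64 4)` into SVE's length-scaled addressing, but the same value could equally be produced by an intrinsic call.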
Chris Lattner via llvm-dev
2017-Jul-06 22:13 UTC
[llvm-dev] [RFC][SVE] Supporting Scalable Vector Architectures in LLVM IR (take 2)
On Jul 6, 2017, at 3:03 PM, Amara Emerson <amara.emerson at gmail.com> wrote:
>> 1) Almost anything touching (e.g. transforming) vector operations will have to be aware of this concept. Given a first class implementation of SVE, I don’t see how that’s avoidable though, and your extension of VectorType is sensible.
>
> Yes, however we have found that the vast majority of vector transforms don't need any modification to deal with scalable types. There are obviously exceptions, things like analysing shuffle vector masks for specific patterns etc.

Ok, great.

>> 2) This means that VectorType is sometimes fixed size, and sometimes unknowable. I don’t think we have an existing analog for that in the type system.
>>
>> Is this type a first class type? Can you PHI them, can you load/store them, can you pass them as function arguments without limitations? If not, that is a serious problem. How does struct layout with a scalable vector in it work? What does an alloca of one of them look like? What does a spill look like in codegen?
>
> Yes, as an extension to VectorType they can be manipulated and passed around like normal vectors, load/stored directly, phis, put in llvm structs etc. Address computation generates expressions in terms of vscale and it seems to work well.

Right, that works out through composition, but what does it mean? I can't have a global variable of a scalable vector type, nor does it make sense for a scalable vector to be embeddable in an LLVM IR struct: nothing that measures the size of a struct is prepared to deal with a non-constant answer.

>>> With a scalable vector type defined, we now need a way to generate addresses for consecutive vector values in memory and to be able to create basic constant vector values.
>>>
>>> For address generation, the `vscale` constant is added to represent the runtime value of `n` in `<n x m x type>`.
>>
>> This should probably be an intrinsic, not an llvm::Constant. The design of llvm::Constant is already wrong: it shouldn’t have operations like divide, and it would be better to not contribute to the problem.
>
> Could you explain your position more on this? The Constant architecture has been a very natural fit for this concept from our perspective.

It is appealing, but it is wrong. Constant should really only model primitive constants (ConstantInt/FP, etc), and we should have one more form for “relocatable” constants. Instead, we have intertwined constant folding and ConstantExpr logic that doesn’t make sense.

A better pattern to follow is intrinsics like (e.g.) llvm.coro.size.i32(), which always return a constant value.

>> Ok, that sounds complicated, but can surely be made to work. The bigger problem is that there are various LLVM IR transformations that want to put registers into memory. All of these will be broken with this sort of type.
>
> Could you give an example?

The concept of “reg2mem” is to put SSA values into allocas for passes that can’t (or don’t want to) update SSA. Similarly, function body extraction can turn SSA values into parameters, and depending on the implementation can pack them into structs. The coroutine logic similarly needs to store registers if they cross suspend points; there are multiple other examples.

-Chris
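For comparison, the intrinsic-based alternative being suggested might look roughly like this; the name `llvm.vscale.i64` is hypothetical, used only to mirror the `llvm.coro.size` pattern of an intrinsic that always yields the same value:

```llvm
; Hypothetical intrinsic modelling the runtime multiple n; like
; llvm.coro.size.i32(), every call returns the same value at runtime.
declare i64 @llvm.vscale.i64()

define i64 @vector_size_in_bytes() {
  %vs = call i64 @llvm.vscale.i64()  ; runtime value of n
  %bytes = mul i64 %vs, 16           ; n * 4 elements * 4 bytes for <n x 4 x i32>
  ret i64 %bytes
}
```

The trade-off is that an ordinary instruction result cannot appear in global initializers or other Constant positions, which is part of why the RFC found the Constant modelling natural.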
Amara Emerson via llvm-dev
2017-Jul-06 22:53 UTC
[llvm-dev] [RFC][SVE] Supporting Scalable Vector Architectures in LLVM IR (take 2)
On 6 July 2017 at 23:13, Chris Lattner <clattner at nondot.org> wrote:
>> Yes, as an extension to VectorType they can be manipulated and passed around like normal vectors, load/stored directly, phis, put in llvm structs etc. Address computation generates expressions in terms of vscale and it seems to work well.
>
> Right, that works out through composition, but what does it mean? I can't have a global variable of a scalable vector type, nor does it make sense for a scalable vector to be embeddable in an LLVM IR struct: nothing that measures the size of a struct is prepared to deal with a non-constant answer.

Although the absolute sizes of the types aren't known at compile time, there are upper bounds which the compiler can assume and use to allow allocation of storage for global variables and the like. The issue with composite type sizes again reduces to type sizes being either symbolic expressions or simply unknown in some cases.

>>> This should probably be an intrinsic, not an llvm::Constant. The design of llvm::Constant is already wrong: it shouldn’t have operations like divide, and it would be better to not contribute to the problem.
>>
>> Could you explain your position more on this? The Constant architecture has been a very natural fit for this concept from our perspective.
>
> It is appealing, but it is wrong. Constant should really only model primitive constants (ConstantInt/FP, etc), and we should have one more form for “relocatable” constants. Instead, we have intertwined constant folding and ConstantExpr logic that doesn’t make sense.
>
> A better pattern to follow are intrinsics like (e.g.) llvm.coro.size.i32(), which always returns a constant value.

Ok, we'll investigate this issue further.

>>> Ok, that sounds complicated, but can surely be made to work. The bigger problem is that there are various LLVM IR transformations that want to put registers into memory. All of these will be broken with this sort of type.
>>
>> Could you give an example?
>
> The concept of “reg2mem” is to put SSA values into allocas for passes that can’t (or don’t want to) update SSA. Similarly, function body extraction can turn SSA values into parameters, and depending on the implementation can pack them into structs. The coroutine logic similarly needs to store registers if they cross suspend points, there are multiple other examples.

I think this should still work. Allocas of scalable vectors are supported, and it's only later, at codegen, that the unknown sizes require more work to compute stack offsets correctly. The caveat is that a direct call to something like getTypeStoreSize() will need to be aware of expressions/sizeless types. If, however, these passes exclusively use allocas to put registers into memory, or use structs with extractvalue etc., then they shouldn't need to care, and codegen deals with the low-level details.

Thanks,
Amara
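As a sketch of why reg2mem-style demotion can still work, again in the RFC's `<n x m x type>` notation: the pass only needs an alloca/store/load round trip at the IR level, and the vscale-dependent slot size is codegen's problem:

```llvm
; Illustrative demotion of an SSA scalable-vector value to a stack slot.
; The alloca's size is not a compile-time constant; codegen sizes the slot
; in vscale-scaled units when laying out the frame.
define <n x 4 x i32> @demote(<n x 4 x i32> %v) {
  %slot = alloca <n x 4 x i32>
  store <n x 4 x i32> %v, <n x 4 x i32>* %slot
  %r = load <n x 4 x i32>, <n x 4 x i32>* %slot
  ret <n x 4 x i32> %r
}
```

A pass that instead queries getTypeStoreSize() to pack such values into a struct is exactly the kind of transformation that would need the symbolic or optional size representation discussed above.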