thr3ads.net - llvm dev - [llvm-dev] [RFC] Supporting ARM's SVE in LLVM [Nov 2016]

Renato Golin via llvm-dev

2016-Nov-28 14:23 UTC

[llvm-dev] [RFC] Supporting ARM's SVE in LLVM

On 28 November 2016 at 09:15, Alex Bradbury <asb at asbradbury.org> wrote:
> The RISC-V vector proposal is still in the development stage, but it
> will inevitably be vector length agnostic much like Hwacha. Krste gave
> a talk about his proposal for the 'V' extension last year
> <https://riscv.org/wp-content/uploads/2015/06/riscv-vector-workshop-june2015.pdf>
> and I'm looking forward to his update at the RISC-V Workshop this
> Wednesday, not least because I'm hoping he'll have done my homework
> for me and contrast his proposal to what is publicly known about SVE.

Thanks! This is really helpful!

> The proposal includes a vsetvl instruction (slide 20) which returns
> the minimum of the hardware vector length and requested vector length.

I haven't seen a similar instruction in SVE yet, but the compulsory
predicate on all instructions kinda makes that redundant, since you can
always use it to calculate the number of "affected" lanes, and thus
only increment the "right" amount per iteration and not rely on
additional instructions. But this also seems to fit the concept of
"vscale", so if you say:

  %scale = i64 vscale

In RISC-V, this would literally translate to:

  vsetvl t0, a0

Then you could use it to increment the induction variable(s):

  %index.next = add nuw nsw i64 %index, mul (i64 %scale, i64 4)
  %index2.next = add nuw nsw i64 %index, mul (i64 %scale, i64 16)

If using "vscale" directly, the back-end would have to know which
instructions it's pertinent to:

  %index.next = add nuw nsw i64 %index, mul (i64 vscale, i64 4)
  %index2.next = add nuw nsw i64 %index, mul (i64 vscale, i64 16)

If we assume that "vscale" is constant throughout the module, then
the distinction is irrelevant. If it could change, this becomes a
problem, as you'd need a validity domain, which %scale would give you
for free.
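
[Editor's sketch, not part of the original message.] The loop shape being discussed can be modelled in a few lines of Python. This assumes only the vsetvl behaviour quoted above (the minimum of the hardware and requested vector lengths); the function names and the hardware_vl parameter are hypothetical and exist only for illustration. Because the granted length is re-read on every iteration, the value plays the role that %scale plays above: it is only valid for the iteration in which it was obtained.

```python
def vsetvl(requested: int, hardware_vl: int) -> int:
    """Model of vsetvl as described in the thread: min(hardware, requested)."""
    return min(hardware_vl, requested)


def vector_add(a, b, hardware_vl):
    """Strip-mined element-wise add that never assumes a fixed vector length.

    hardware_vl is a hypothetical stand-in for the number of lanes the
    machine provides; the loop is correct for any positive value.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    if hardware_vl <= 0:
        raise ValueError("hardware_vl must be positive")
    out = [0] * len(a)
    index = 0
    while index < len(a):
        # Number of lanes that are active in this iteration.
        vl = vsetvl(len(a) - index, hardware_vl)
        # The inner loop stands in for a single vector instruction.
        for lane in range(vl):
            out[index + lane] = a[index + lane] + b[index + lane]
        # The induction variable advances by the granted length.
        index += vl
    return out
```

Running vector_add with the same inputs and different hardware_vl values gives the same result, which is the property a vector-length-agnostic loop has to preserve.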

Paul,

This is one of the issues we have to get right before any IR changes
are in effect, to make sure we won't need to change it again soon. In
SVE, such an operation would be a NOP, as the back-end is already
tracking it via the predicate registers.

cheers,
--renato

Paul Walker via llvm-dev

2016-Nov-28 14:36 UTC

[llvm-dev] [RFC] Supporting ARM's SVE in LLVM

> I haven't seen a similar instruction in SVE yet, but the compulsory
> predicate on all instructions kinda makes that redundant, since you can
> always use it to calculate the number of "affected" lanes, and thus
> only increment the "right" amount per iteration and not rely on
> additional instructions. But this also seems to fit the concept of
> "vscale", so if you say:
>
>   %scale = i64 vscale
>
> In RISC-V, this would literally translate to:
>
>   vsetvl t0, a0
>
> This is one of the issues we have to get right before any IR changes
> are in effect, to make sure we won't need to change it again soon. In
> SVE, such an operation would be a NOP, as the back-end is already
> tracking it via the predicate registers.

SVE has a similar instruction that returns the current vector length (rdvl).
The reason you don’t see it in our example loop’s instruction output is because
in that example we are able to use “incw x2” that increments x2 by the number of
i32s a vector can hold.
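
[Editor's sketch, not part of the original message.] The relationship between rdvl, incw and the "vscale" multipliers 4 and 16 used earlier in the thread can be checked with a small Python model. It assumes that SVE vector lengths are multiples of 128 bits (from 128 up to 2048) and that "vscale" counts 128-bit granules; the function names mirror the instruction mnemonics but are illustrative only and are not an LLVM or assembler API.

```python
def vscale(vl_bits: int) -> int:
    """Number of 128-bit granules in a vector of vl_bits bits (assumed meaning of vscale)."""
    if vl_bits % 128 != 0 or not 128 <= vl_bits <= 2048:
        raise ValueError("SVE vector lengths are multiples of 128 bits, 128..2048")
    return vl_bits // 128


def rdvl(vl_bits: int, imm: int = 1) -> int:
    """Model of rdvl: vector length in bytes, multiplied by an immediate."""
    return (vl_bits // 8) * imm


def incw(x: int, vl_bits: int) -> int:
    """Model of 'incw x': add the number of 32-bit elements a vector holds."""
    return x + vl_bits // 32


# For a hypothetical 256-bit implementation:
VL = 256
i32_step = incw(0, VL)   # elements per vector, equals 4 * vscale
byte_step = rdvl(VL)     # bytes per vector, equals 16 * vscale
```

Under these assumptions, incw advances the index by 4 * vscale and rdvl yields 16 * vscale, which matches the two multipliers in the induction-variable examples quoted above.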

None of our proposals are syntactic sugar; all of them are relevant to
directing efficient code generation.

Renato Golin via llvm-dev

2016-Nov-28 14:38 UTC

[llvm-dev] [RFC] Supporting ARM's SVE in LLVM

On 28 November 2016 at 14:36, Paul Walker <Paul.Walker at arm.com> wrote:
> SVE has a similar instruction that returns the current vector length
> (rdvl).  The reason you don’t see it in our example loop’s instruction
> output is because in that example we are able to use “incw x2” that
> increments x2 by the number of i32s a vector can hold.
>
> None of our proposals are syntactic sugar with all having relevance to
> directing efficient code generation.

Ok, makes sense. Your proposal is also not affected by using "vscale"
or %vscale on the induction variable.

cheers,
--renato
