thr3ads.net - llvm dev - [llvm-dev] Questions about vscale [Apr 2020]

Kai Wang via llvm-dev

2020-Apr-07 08:30 UTC

[llvm-dev] Questions about vscale

Hi,


In the RISC-V V extension, an instruction can operate on a group of vector
registers; the size of that group is called LMUL. If LMUL equals 2, an
instruction operates on 2 vector registers at the same time. So, we have the
following combinations of types.


          LMUL = 1           LMUL = 2            LMUL = 4            LMUL = 8
int64_t | vscale x 1 x i64 | vscale x  2 x i64 | vscale x  4 x i64 | vscale x  8 x i64
int32_t | vscale x 2 x i32 | vscale x  4 x i32 | vscale x  8 x i32 | vscale x 16 x i32
int16_t | vscale x 4 x i16 | vscale x  8 x i16 | vscale x 16 x i16 | vscale x 32 x i16
 int8_t | vscale x 8 x i8  | vscale x 16 x i8  | vscale x 32 x i8  | vscale x 64 x i8
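
Every entry in the table follows one rule: the minimum element count is LMUL * 64 / SEW, where SEW is the element width in bits. A minimal Python sketch of that rule follows. The 64-bit base is an assumption read off the table (one register holds vscale * 64 bits), and `scalable_type` is a hypothetical helper name, not an LLVM API.

```python
def scalable_type(sew: int, lmul: int, base_bits: int = 64) -> str:
    """Return the scalable vector type for an element width (SEW) and LMUL.

    base_bits is an assumption taken from the table: a single register
    holds vscale * 64 bits, so it fits 64 / sew elements per unit of vscale.
    """
    if base_bits % sew != 0:
        raise ValueError("element width must divide the base width")
    min_elements = lmul * base_bits // sew
    return f"<vscale x {min_elements} x i{sew}>"


# Rebuild the table row by row (one row per element width).
for sew in (64, 32, 16, 8):
    print(f"i{sew:<2}", *(scalable_type(sew, lmul) for lmul in (1, 2, 4, 8)))
```

Doubling LMUL doubles the element count, and halving the element width does the same, which is why each column and each row of the table steps by a factor of two.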


We have another architecture parameter, ELEN, which means the maximum size
of a single vector element in bits.

We hope the type system can be consistent under ELEN = 32 and ELEN = 64.
However, vscale may be a fractional value under ELEN = 32 in the above type
system. When ELEN = 32, i64 is an invalid type (we could ignore the first
row for ELEN = 32), and vscale may become 1/2 at run time to fit the
architecture (if the vector register has only 32 bits). Is there any
problem with assuming vscale to be fractional under some circumstances?
vscale should be an unknown value at compile time, so it should have no
impact on code generation and optimization. The relationships between types
are correct regardless of vscale's value. Is there anything I missed?


Thanks!

Hsiangkai

Renato Golin via llvm-dev

2020-Apr-07 09:04 UTC

On Tue, 7 Apr 2020 at 09:30, Kai Wang via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>           LMUL = 1           LMUL = 2            LMUL = 4            LMUL = 8
> int64_t | vscale x 1 x i64 | vscale x  2 x i64 | vscale x  4 x i64 | vscale x  8 x i64
> int32_t | vscale x 2 x i32 | vscale x  4 x i32 | vscale x  8 x i32 | vscale x 16 x i32
> int16_t | vscale x 4 x i16 | vscale x  8 x i16 | vscale x 16 x i16 | vscale x 32 x i16
>  int8_t | vscale x 8 x i8  | vscale x 16 x i8  | vscale x 32 x i8  | vscale x 64 x i8
>
> We have another architecture parameter, ELEN, which means the maximum size
> of a single vector element in bits.

Hi,

For my own education, some quick questions:

1. is LMUL always a multiple of ELEN?
2. Is this fixed on the hardware, depending on the actual lengths, or
is this dynamically set by software (on a register or status flag)?
2a. If dynamic, can it change from program to program? Function to function?

> We hope the type system could be consistent under ELEN = 32 and ELEN = 64.
> However, vscale may be a fractional value under ELEN = 32 in the above type
> system. When ELEN = 32, i64 is an invalid type (we could ignore the first row
> for ELEN = 32) and vscale may become 1/2 on run time to fit the architecture
> (if the vector register only has 32 bits).

Do you mean ELEN=32 like this?

int32_t | vscale x 1 x i32 | vscale x 2 x i32 | vscale x  4 x i32 | vscale x  8 x i32
int16_t | vscale x 2 x i16 | vscale x 4 x i16 | vscale x  8 x i16 | vscale x 16 x i16
 int8_t | vscale x 4 x i8  | vscale x 8 x i8  | vscale x 16 x i8  | vscale x 32 x i8

If the type is invalid, you would need to legalise it, and in that
case create some cluttered accessors (via insert/extract element) and
possibly use intrinsics to expose underlying instructions that can
deal with it.

Perhaps I'm not clear on what you need, but vscale is supposed to be
the number of valid elements (lanes), and given i64 is invalid, vscale
wouldn't apply?
> Is there any problem to assume vscale to be fractional under some
> circumstances? vscale should be an unknown value when compiling. So, it should
> have no impact on code generation and optimization. The relationship between
> types is correct regardless vscale’s value. Is there anything I missed?

I believe the assumption was always that vscale is an integer.
Representing it as a fraction would certainly need code changes, and
the assumptions built on an integer vscale would need to be re-evaluated.

I'm copying some SVE and LV people to give a more informed opinion.

cheers,
--renato

Hanna Kruppe via llvm-dev

2020-Apr-07 11:50 UTC

Hi all,

On Tue, 7 Apr 2020 at 11:04, Renato Golin via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>
> On Tue, 7 Apr 2020 at 09:30, Kai Wang via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> >           LMUL = 1           LMUL = 2            LMUL = 4            LMUL = 8
> > int64_t | vscale x 1 x i64 | vscale x  2 x i64 | vscale x  4 x i64 | vscale x  8 x i64
> > int32_t | vscale x 2 x i32 | vscale x  4 x i32 | vscale x  8 x i32 | vscale x 16 x i32
> > int16_t | vscale x 4 x i16 | vscale x  8 x i16 | vscale x 16 x i16 | vscale x 32 x i16
> >  int8_t | vscale x 8 x i8  | vscale x 16 x i8  | vscale x 32 x i8  | vscale x 64 x i8
> >
> > We have another architecture parameter, ELEN, which means the maximum
> > size of a single vector element in bits.
>
> Hi,
>
> For my own education, some quick questions:
>
> 1. is LMUL always a multiple of ELEN?

This happens to be true (at least in the current spec, disregarding
some in-progress proposals) just because both are powers of two and
the largest possible LMUL equals the smallest possible ELEN (8), but I
don't think there is any meaning to be found in this observation. The
two values govern unrelated aspects of the vector unit.

> 2. Is this fixed on the hardware, depending on the actual lengths, or
> is this dynamically set by software (on a register or status flag)?
> 2a. If dynamic, can it change from program to program? Function to
> function?

It's not clear whether by "this" you mean ELEN, LMUL, or something
else. ELEN is fixed in hardware. LMUL is a property of each individual
instruction. Most instructions take it from a control register, a few
encode it in the instruction as an immediate, but in any case it needs
to be statically determined (on a per-instruction basis) to be able to
allocate registers. This is not just a constraint for
compiler-generated code, but also for all hand-written assembly code
I've seen or can imagine.

>
> > We hope the type system could be consistent under ELEN = 32 and ELEN = 64.
> > However, vscale may be a fractional value under ELEN = 32 in the above
> > type system. When ELEN = 32, i64 is an invalid type (we could ignore the
> > first row for ELEN = 32) and vscale may become 1/2 on run time to fit the
> > architecture (if the vector register only has 32 bits).
>
> Do you mean ELEN=32 like this?
> int32_t | vscale x 1 x i32 | vscale x 2 x i32 | vscale x  4 x i32 | vscale x  8 x i32
> int16_t | vscale x 2 x i16 | vscale x 4 x i16 | vscale x  8 x i16 | vscale x 16 x i16
>  int8_t | vscale x 4 x i8  | vscale x 8 x i8  | vscale x 16 x i8  | vscale x 32 x i8
>
> If the type is invalid, you would need to legalise it, and in that
> case create some cluttered accessors (via insert/extract element) and
> possibly use intrinsics to expose underlying instructions that can
> deal with it.
>
> Perhaps I'm not clear on what you need, but vscale is supposed to be
> the number of valid elements (lanes), and given i64 is invalid, vscale
> wouldn't apply?

I don't know what "vscale wouldn't apply" is supposed to mean. Whether
it's legal or not, you can write LLVM IR using (for example) the type
<vscale x 1 x i64> even if the target doesn't natively support it. The
purpose of legalization is to make sure that this results in the behavior
the type is supposed to have. For <vscale x 1 x i64>, this means among
other things:

- it has the same number of elements as <vscale x 1 x i32>, but each
element is twice as big
- it has half as many elements (each of the same size) as <vscale x 2 x
i64>
- its total size in bits is the same as <vscale x 2 x i32>
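
The three relationships listed above for <vscale x 1 x i64> can be checked mechanically for any positive integer vscale. The following Python sketch models a scalable type as a (minimum element count, element width) pair; `ScalableVec` is a hypothetical illustration, not an LLVM class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalableVec:
    """Toy model of <vscale x min_elts x i<elt_bits>> (not an LLVM API)."""
    min_elts: int
    elt_bits: int

    def num_elements(self, vscale: int) -> int:
        return vscale * self.min_elts

    def size_in_bits(self, vscale: int) -> int:
        return self.num_elements(vscale) * self.elt_bits


v1i64 = ScalableVec(1, 64)
v1i32 = ScalableVec(1, 32)
v2i64 = ScalableVec(2, 64)
v2i32 = ScalableVec(2, 32)

for vscale in (1, 2, 4, 8):
    # Same element count as <vscale x 1 x i32>, each element twice as big.
    assert v1i64.num_elements(vscale) == v1i32.num_elements(vscale)
    assert v1i64.elt_bits == 2 * v1i32.elt_bits
    # Half as many elements as <vscale x 2 x i64>.
    assert 2 * v1i64.num_elements(vscale) == v2i64.num_elements(vscale)
    # Same total size in bits as <vscale x 2 x i32>.
    assert v1i64.size_in_bits(vscale) == v2i32.size_in_bits(vscale)
```

None of these checks depends on the target supporting i64 natively, which is the point: the type has a well-defined meaning that legalization has to preserve.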

I think that focusing on the completely illegal i64 might obscure the
real problem I see with the fractional vscale concept. Let's look at
<vscale x 1 x i32> instead. The elements are clearly legal in this
context, even in some vector types, but the <vscale x 1 x i32> type is
absent from Kai's table. This makes sense: the same vector register
fits 2x as many i32 elements as i64 elements, so if you start with
<vscale x 1 x i64> mapping to a single register, then <vscale x 2 x
i32> is the same size and fits in the same register class, while
<vscale x 1 x i32> is too small and must be legalized somehow.

But how? If we take Kai's table as gospel and look at a VLEN = ELEN = 32
machine, the vector type <vscale x 2 x i32> is supposed to map to a
single vector register, which is only 32 bits wide, and thus <vscale x 2 x
i32> would have just one element in this context (matching the "vscale
= 1/2" intuition). To be consistent with this, <vscale x 1 x i32>
would have to contain just *half* an element. This is not something
any legalization strategy can achieve, because it is a fundamentally
impossible notion. So we end up in a situation where some types are
not just illegal and have to be legalized, but are contradictory and
can't be legalized in any meaningful way.
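
The arithmetic behind the "half an element" problem can be worked through with exact fractions. This is a sketch under the assumption, taken from Kai's table, that <vscale x 2 x i32> fills exactly one register, so vscale = VLEN / 64; the function names are hypothetical.

```python
from fractions import Fraction


def vscale_for(vlen_bits: int) -> Fraction:
    """vscale implied by Kai's table: one register = vscale * 64 bits (assumed)."""
    return Fraction(vlen_bits, 64)


def num_elements(min_elts: int, vscale: Fraction) -> Fraction:
    """Element count of <vscale x min_elts x iN> for a given vscale."""
    return vscale * min_elts


vscale = vscale_for(32)            # VLEN = ELEN = 32
print(vscale)                      # 1/2
print(num_elements(2, vscale))     # 1   -> <vscale x 2 x i32> holds one i32
print(num_elements(1, vscale))     # 1/2 -> <vscale x 1 x i32> holds half an i32
```

With VLEN = 64 the same functions give vscale = 1 and whole-number element counts for both types, so the contradiction only appears once vscale drops below 1.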

I don't think LLVM can/should support this kind of contradiction. Some
types have to be legalized, sometimes the legalization is not
efficient, sometimes it's not even implemented, that's all fine. But
letting some targets decide that <vscale x 1 x i32> is a fundamentally
impossible type to even assign a meaning to... that seems
unprecedented and contrary to the philosophy of LLVM IR as reasonably
target-independent IR.

The obvious solution is to use a different set of legal vector types
(and thus, a different interpretation of vscale) depending on the
largest legal element type (ELEN in RISC-V jargon). This corresponds
to the table for ELEN=32 that Renato gave above. Kai's proposal is
intended to avoid this, and I can understand the desire for that, but
it really seems like the lesser evil to me.
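
The ELEN-dependent alternative can be sketched by deriving the minimum element count from ELEN instead of a fixed 64, that is, LMUL * ELEN / SEW. The Python below is illustrative only: `legal_types` is a hypothetical helper, not an LLVM API, and the formula is inferred from the two tables in this thread.

```python
def legal_types(elen: int, lmuls=(1, 2, 4, 8)) -> dict:
    """Map each legal element type to its scalable vector type per LMUL.

    Assumes element widths are powers of two from 8 bits up to ELEN,
    and that <vscale x 1 x i<ELEN>> occupies one register at LMUL = 1.
    """
    table = {}
    sew = elen
    while sew >= 8:
        table[f"i{sew}"] = [
            f"<vscale x {lmul * elen // sew} x i{sew}>" for lmul in lmuls
        ]
        sew //= 2
    return table


for elen in (64, 32):
    print(f"ELEN = {elen}")
    for elt, types in legal_types(elen).items():
        print(f"  {elt:<3}", *types)
```

Under this scheme vscale is VLEN / ELEN, which is an integer for any machine with VLEN >= ELEN, at the cost of the same IR type mapping to different register groupings on ELEN = 32 and ELEN = 64 targets.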

Best regards
Hanna

> > Is there any problem to assume vscale to be fractional under some
> > circumstances? vscale should be an unknown value when compiling. So, it
> > should have no impact on code generation and optimization. The relationship
> > between types is correct regardless vscale’s value. Is there anything I
> > missed?
>
> I believe the assumption was always that vscale is an integer.
> Representing it as a fraction would need code change for sure, but
> also reevaluate the assumptions.
>
> I'm copying some SVE and LV people to give a more informed opinion.
>
> cheers,
> --renato
