thr3ads.net - llvm dev - [llvm-dev] [SVE][AArch64] Codegen for a scalable vector splat [Aug 2019]

Cameron McInally via llvm-dev

2019-Aug-29 23:23 UTC

[llvm-dev] [SVE][AArch64] Codegen for a scalable vector splat

Just spitballing... why not have a splat construct straight through LLVM?
It would make the IR more readable, as opposed to the insert+shuffle method.

On Thu, Aug 29, 2019 at 19:06 Amara Emerson via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> +1 to a new node, we’d very likely do the same thing for GlobalISel and
> move to a canonical splat representation for all targets.
>
> Amara
>
> > On Aug 29, 2019, at 5:26 AM, Graham Hunter via llvm-dev
> > <llvm-dev at lists.llvm.org> wrote:
> >
> > Hi,
> >
> > During the discussion on introducing scalable vectors we established
> > that we could use the canonical IR form for splats of scalable vector
> > types (insert element into lane 0 of an undef vector, shuffle that with
> > another undef vector of the same type and a zeroinitializer mask).
> >
> > We do run into a problem for lowering to SelectionDAG however, since
> > the canonical form there is a BUILD_VECTOR with all elements being the
> > same. This obviously doesn't work if we don't know how many elements
> > there are. We have a couple of solutions and would like to know which
> > the community prefers.
> >
> > 1) Add a new SPLAT_VECTOR ISD node. This was part of our overall RFC
> > from 2016 and is the solution that we're currently using downstream. It
> > just accepts a single scalar value. This has worked well with just the
> > SVE codegen using it, but I don't know if we would run into problems if
> > we try to make this the canonical splat form for SDAG.
> >
> > 2) Extend BUILD_VECTOR to accept an initial boolean indicating whether
> > it is a splat, and if true the first element can be assumed to be the
> > same as all others. The splat form would be the only valid use of
> > BUILD_VECTOR for scalable vector types. For fixed length vectors we
> > could either change existing checks for splats to only look at the flag
> > and would only need one extra argument for the splat value, or use the
> > flag as a shortcut and fall back to checking all the elements if there's
> > the possibility of a fold generating a splat and it not being recognized.
> >
> > Given the existence of MVTs with >1000 elements per vector, the
> > SPLAT_VECTOR or BUILD_VECTOR with single element approach may also be
> > beneficial for some fixed length backends.
> >
> > Any thoughts?
> >
> > -Graham
> >
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> >
>
https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.llvm.org_cgi-2Dbin_mailman_listinfo_llvm-2Ddev&d=DwIGaQ&c=slrrB7dE8n7gBJbeO0g-IQ&r=O_4M49EtSpZ_-BQYeigzGv0P4__noMcSu2RYEjS1vKs&m=eIwz_I5be5PrK43j88xB5Sq6rozn9dgrd7VgeFkKkwM&s=yN2OEcjQuvdCMAhGa4lDVDxfYHUjQxhk-nfvMfoyYog&e>

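To make the two SelectionDAG options from Graham's quoted message concrete, here is a rough toy model. It is only a sketch: `BuildVector`, `SplatVector`, `FlaggedBuildVector`, `isSplat` and `expand` are hypothetical stand-ins invented for illustration, not LLVM classes or APIs, and operands are plain ints rather than DAG nodes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for today's BUILD_VECTOR: one operand per lane, so the
// lane count must be known when the node is created.
struct BuildVector {
  std::vector<int> operands;
};

// Option 1 (hypothetical model of SPLAT_VECTOR): a single scalar operand, so
// no lane count is needed to build the node.
struct SplatVector {
  int scalar;
};

// Option 2 (hypothetical model): BUILD_VECTOR plus a leading splat flag. When
// the flag is set, only the first operand is meaningful.
struct FlaggedBuildVector {
  bool splatFlag;
  std::vector<int> operands;
};

// Splat check for the unflagged form: every operand has to be compared.
bool isSplat(const BuildVector& bv) {
  if (bv.operands.empty())
    return false;
  for (std::size_t i = 1; i < bv.operands.size(); ++i)
    if (bv.operands[i] != bv.operands[0])
      return false;
  return true;
}

// Splat check for the flagged form: use the flag as a shortcut, otherwise fall
// back to comparing all operands (a fold may have produced an unflagged splat).
bool isSplat(const FlaggedBuildVector& bv) {
  if (bv.splatFlag)
    return true;
  return isSplat(BuildVector{bv.operands});
}

// Expanding a splat only becomes possible once a lane count is available.
std::vector<int> expand(const SplatVector& s, std::size_t lanes) {
  return std::vector<int>(lanes, s.scalar);
}
```

In this model the single-operand form never needs a lane count until `expand` is called, which is the property a scalable vector type requires. The flagged form keeps the existing node but gives it two meanings.
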
Amara Emerson via llvm-dev

2019-Aug-29 23:34 UTC


[llvm-dev] [SVE][AArch64] Codegen for a scalable vector splat

Good question. Years back the original SVE proposal was to have a new
instruction, “seriesvector”, to represent arithmetic series in vector elements.
That instruction could also be used to represent splats if the step value was 0,
but the instruction was deemed unnecessarily powerful during community feedback.
I believe the proposed replacement is “stepvector”
(https://reviews.llvm.org/D47774) which doesn’t have a variable step so
can’t do splats anymore.
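
As a rough illustration of why a variable step matters, here is a small toy model. The function names are made up for this sketch, and it assumes that "stepvector" produces the fixed sequence 0, 1, 2, ... across the lanes; the `lanes` argument stands in for a lane count that real scalable vectors only know at run time.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of the originally proposed "seriesvector":
// lane i holds start + i * step.
std::vector<std::int64_t> seriesVector(std::int64_t start, std::int64_t step,
                                       std::size_t lanes) {
  std::vector<std::int64_t> out(lanes);
  for (std::size_t i = 0; i < lanes; ++i)
    out[i] = start + static_cast<std::int64_t>(i) * step;
  return out;
}

// Assumed model of "stepvector": start and step are fixed at 0 and 1, so the
// caller has no way to ask for a step of 0.
std::vector<std::int64_t> stepVector(std::size_t lanes) {
  return seriesVector(0, 1, lanes);
}

// With a variable step, a splat is just a series whose step is 0.
std::vector<std::int64_t> splatViaSeries(std::int64_t value,
                                         std::size_t lanes) {
  return seriesVector(value, 0, lanes);
}
```

Under these assumptions `splatViaSeries(7, 4)` yields four lanes of 7, while `stepVector` has no parameter that could produce the same result.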

IIRC the overall feeling is the same as the other attempts to canonicalize
representations. We don’t introduce new representations unless absolutely
necessary, and insert+shufflevector is technically sufficient to achieve a
splat, even though it looks pretty horrible, bloats the IR etc.
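
For readers unfamiliar with the insert+shufflevector idiom, the following toy model sketches how it yields a splat. It is not LLVM code: `Lane`, `Vec`, `insertElement`, `shuffleVector` and `splat` are illustrative names, an empty `std::optional` stands in for an undef lane, and the lane count is an ordinary run-time value.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

using Lane = std::optional<int>;  // std::nullopt models an undef lane
using Vec = std::vector<Lane>;

// Model of insertelement: return a copy of v with one lane overwritten.
Vec insertElement(Vec v, int value, std::size_t index) {
  v.at(index) = value;
  return v;
}

// Model of shufflevector: each mask entry selects a lane from the
// concatenation of a and b.
Vec shuffleVector(const Vec& a, const Vec& b,
                  const std::vector<std::size_t>& mask) {
  Vec out;
  out.reserve(mask.size());
  for (std::size_t m : mask)
    out.push_back(m < a.size() ? a.at(m) : b.at(m - a.size()));
  return out;
}

// The canonical splat: insert the scalar into lane 0 of an undef vector, then
// shuffle with an all-zero mask so every result lane reads lane 0.
// Requires lanes >= 1.
Vec splat(int value, std::size_t lanes) {
  Vec undef(lanes);  // every lane starts out undef
  Vec inserted = insertElement(undef, value, 0);
  std::vector<std::size_t> zeroMask(lanes, 0);  // models zeroinitializer
  return shuffleVector(inserted, undef, zeroMask);
}
```

Two operations and two undef operands are needed to say "every lane is `value`", which is the readability cost being discussed.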

Amara

Renato Golin via llvm-dev

2019-Aug-30 10:25 UTC


[llvm-dev] [SVE][AArch64] Codegen for a scalable vector splat

On Fri, 30 Aug 2019 at 00:35, Amara Emerson via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> IIRC the overall feeling is the same as the other attempts to canonicalize
> representations. We don’t introduce new representations unless absolutely
> necessary, and insert+shufflevector is technically sufficient to achieve a
> splat, even though it looks pretty horrible, bloats the IR etc.

This is a recurrent enough question that perhaps we need to have a
blog post about it. :)

Mainly, the gist is flexibility and maintainability. Optimisations can
be done on standard IR, but new constructs need to be taught to all
passes before they become really useful. By introducing a new node
type for every language/machine concept, we'd have a combinatorial
explosion on the number of conversions to do, as well as lose the
ability to efficiently pattern match to find optimisations.

The side effect is having to be careful (and thus conservative) on
code transformation. You end up with longer and more complicated
def-use chains, which are hard to match, transform, move around,
simplify. But at least, simpler patterns work, and improving a pattern
becomes an incremental change.

The main benefit is being able to have a common infrastructure to do
all of those changes in the right place, and hopefully only once per
stage, at non-combinatorial time complexity.

Hope this helps.

--renato

PS: In this particular case, the ISD node would be temporary and
localised, and at this level, we really don't want code to change
anyway. Very different from IR.
