Cameron McInally via llvm-dev
2019-Aug-29 23:23 UTC
[llvm-dev] [SVE][AArch64] Codegen for a scalable vector splat
Just spitballing... why not have a splat construct straight through LLVM? It would make the IR more readable, as opposed to the insert+shuffle method.

On Thu, Aug 29, 2019 at 19:06 Amara Emerson via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> +1 to a new node, we’d very likely do the same thing for GlobalISel and
> move to a canonical splat representation for all targets.
>
> Amara
>
> > On Aug 29, 2019, at 5:26 AM, Graham Hunter via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> >
> > Hi,
> >
> > During the discussion on introducing scalable vectors we established
> > that we could use the canonical IR form for splats of scalable vector
> > types (insert element into lane 0 of an undef vector, shuffle that with
> > another undef vector of the same type and a zeroinitializer mask).
> >
> > We do run into a problem for lowering to SelectionDAG however, since the
> > canonical form there is a BUILD_VECTOR with all elements being the same.
> > This obviously doesn't work if we don't know how many elements there are.
> > We have a couple of solutions and would like to know which the community
> > prefers.
> >
> > 1) Add a new SPLAT_VECTOR ISD node. This was part of our overall RFC
> > from 2016 and is the solution that we're currently using downstream. It
> > just accepts a single scalar value. This has worked well with just the
> > SVE codegen using it, but I don't know if we would run into problems if
> > we try to make this the canonical splat form for SDAG.
> >
> > 2) Extend BUILD_VECTOR to accept an initial boolean indicating whether
> > it is a splat, and if true the first element can be assumed to be the
> > same as all others. The splat form would be the only valid use of
> > BUILD_VECTOR for scalable vector types. For fixed length vectors we
> > could either change existing checks for splats to only look at the flag
> > and would only need one extra argument for the splat value, or use the
> > flag as a shortcut and fall back to checking all the elements if there's
> > the possibility of a fold generating a splat and it not being recognized.
> >
> > Given the existence of MVTs with >1000 elements per vector, the
> > SPLAT_VECTOR or BUILD_VECTOR with single element approach may also be
> > beneficial for some fixed length backends.
> >
> > Any thoughts?
> >
> > -Graham
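As a rough illustration of the trade-off between options 1 and 2 above, here is a toy C++ model. It does not use any LLVM API: `ToyBuildVector`, `ToySplatVector`, `isSplat`, and `expand` are hypothetical names invented for this sketch. It only shows why a per-lane operand list needs a known lane count and a linear scan to detect a splat, while a single-scalar node does not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a BUILD_VECTOR-style node: one operand per lane,
// so the lane count must be known when the node is created.
struct ToyBuildVector {
    std::vector<std::int64_t> lanes;
};

// Hypothetical stand-in for a SPLAT_VECTOR-style node: a single scalar,
// with no lane count stored at all.
struct ToySplatVector {
    std::int64_t scalar;
};

// Splat detection on the per-lane form: scans every operand, O(lanes).
// On success, writes the common value to `value`.
bool isSplat(const ToyBuildVector &bv, std::int64_t &value) {
    assert(!bv.lanes.empty() && "toy model: at least one lane expected");
    for (std::int64_t lane : bv.lanes)
        if (lane != bv.lanes.front())
            return false;
    value = bv.lanes.front();
    return true;
}

// The single-scalar form can be expanded once the lane count is finally
// known (for a scalable vector, only at run time).
std::vector<std::int64_t> expand(const ToySplatVector &sv,
                                 std::size_t runtimeLanes) {
    return std::vector<std::int64_t>(runtimeLanes, sv.scalar);
}
```

With more than 1000 lanes, as in the large MVTs mentioned above, the scan in `isSplat` touches every operand, whereas the single-scalar form carries the answer directly.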
Amara Emerson via llvm-dev
2019-Aug-29 23:34 UTC
[llvm-dev] [SVE][AArch64] Codegen for a scalable vector splat
Good question. Years back the original SVE proposal was to have a new instruction, “seriesvector”, to represent arithmetic series in vector elements. That instruction could also be used to represent splats if the step value was 0, but the instruction was deemed unnecessarily powerful during community feedback. I believe the proposed replacement is “stepvector” (https://reviews.llvm.org/D47774), which doesn’t have a variable step so can’t do splats anymore.

IIRC the overall feeling is the same as the other attempts to canonicalize representations. We don’t introduce new representations unless absolutely necessary, and insert+shufflevector is technically sufficient to achieve a splat, even though it looks pretty horrible, bloats the IR etc.

Amara
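To make the seriesvector/stepvector distinction concrete, here is a small C++ sketch of the lane values each would produce. It is a toy model, not LLVM code: the function names are hypothetical, and it assumes that “stepvector” yields 0, 1, 2, ..., which the thread itself does not spell out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of "seriesvector": lane i holds start + i * step.
std::vector<std::int64_t> toySeriesVector(std::int64_t start,
                                          std::int64_t step,
                                          std::size_t lanes) {
    assert(lanes > 0 && "toy model: at least one lane expected");
    std::vector<std::int64_t> out(lanes);
    for (std::size_t i = 0; i < lanes; ++i)
        out[i] = start + static_cast<std::int64_t>(i) * step;
    return out;
}

// Toy model of "stepvector" (assumed here to be 0, 1, 2, ...): the step is
// fixed, so there is no parameter that could be set to 0 to get a splat.
std::vector<std::int64_t> toyStepVector(std::size_t lanes) {
    return toySeriesVector(0, 1, lanes);
}
```

In this model `toySeriesVector(x, 0, n)` fills every lane with `x`, which is the splat case described above; `toyStepVector` has no equivalent.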
Renato Golin via llvm-dev
2019-Aug-30 10:25 UTC
[llvm-dev] [SVE][AArch64] Codegen for a scalable vector splat
On Fri, 30 Aug 2019 at 00:35, Amara Emerson via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> IIRC the overall feeling is the same as the other attempts to canonicalize
> representations. We don’t introduce new representations unless absolutely
> necessary, and insert+shufflevector is technically sufficient to achieve a
> splat, even though it looks pretty horrible, bloats the IR etc.

This is a recurrent enough question that perhaps we need to have a blog post about it. :)

Mainly, the gist is flexibility and maintainability. Optimisations can be done on standard IR, but new constructs need to be taught to all passes before they become really useful.

By introducing a new node type for every language/machine concept, we'd have a combinatorial explosion in the number of conversions to do, as well as lose the ability to efficiently pattern match to find optimisations.

The side effect is having to be careful (and thus conservative) with code transformations. You end up with longer and more complicated def-use chains, which are hard to match, transform, move around, and simplify. But at least simpler patterns work, and improving a pattern becomes an incremental change.

The main benefit is being able to have a common infrastructure to do all of those changes in the right place, and hopefully only once per stage, at non-combinatorial time complexity.

Hope this helps.

--renato

PS: In this particular case, the ISD node would be temporary and localised, and at this level, we really don't want code to change anyway. Very different from IR.
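For readers unfamiliar with the insert+shufflevector idiom discussed in this thread, the following C++ sketch models what it computes. It is a toy under stated assumptions: `toyInsertElement`, `toyShuffle`, and `toySplat` are hypothetical helpers, the shuffle is simplified to a single source vector (the real instruction takes two), and undef lanes are stood in for by an arbitrary placeholder value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using ToyVec = std::vector<std::int64_t>;

// Toy insertelement: returns a copy of `vec` with lane `idx` set to `value`.
ToyVec toyInsertElement(ToyVec vec, std::size_t idx, std::int64_t value) {
    assert(idx < vec.size());
    vec[idx] = value;
    return vec;
}

// Toy single-source shuffle: result lane i is a[mask[i]].
ToyVec toyShuffle(const ToyVec &a, const std::vector<std::size_t> &mask) {
    ToyVec out;
    out.reserve(mask.size());
    for (std::size_t m : mask) {
        assert(m < a.size());
        out.push_back(a[m]);
    }
    return out;
}

// Splat as insert-into-lane-0 followed by a shuffle with an all-zero mask.
ToyVec toySplat(std::int64_t value, std::size_t lanes) {
    assert(lanes > 0 && "toy model: at least one lane expected");
    ToyVec undef(lanes, -1);  // placeholder; real undef lanes have no value
    ToyVec inserted = toyInsertElement(undef, 0, value);
    std::vector<std::size_t> zeroMask(lanes, 0);  // every lane reads lane 0
    return toyShuffle(inserted, zeroMask);
}
```

Two operations and a mask are needed to express what is conceptually a single broadcast, which is the readability cost raised earlier in the thread; the benefit argued above is that existing passes already understand both operations.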