James Molloy via llvm-dev
2016-Nov-24 20:49 UTC
[llvm-dev] [RFC] Supporting ARM's SVE in LLVM
Hi Graham,

One high level comment without reading the patchset too much - it seems
'vscale' in particular could be just as easy to implement as an intrinsic,
which would be a less invasive patch. Is there a reason you didn't go down
the intrinsic route?

James

On Thu, 24 Nov 2016 at 15:39, Graham Hunter via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> Hi,
>
> Paul Walker has now uploaded the first set of IR support patches to
> phabricator, which use our revised design. We managed to remove the need
> for new instructions for basic scalable vectorization in favor of adding
> two new constant classes; here's a subset of the revised documentation
> describing just those constants:
>
> ## *vscale*
>
> ### Syntax:
>
> > `vscale`
>
> ### Overview:
>
> This complex constant represents the runtime value of `n` for any scalable
> type `<n x m x ty>`. This is primarily used to increment induction
> variables and generate offsets.
>
> ### Interface:
>
> ```cpp
> Constant *VScaleValue::get(Type *Ty);
> ```
>
> ### Example:
>
> The following shows how an induction variable would be incremented for a
> scalable vector of type `<n x 4 x i32>`.
>
> ```llvm
> %index.next = add nuw nsw i64 %index, mul (i64 vscale, i64 4)
> ```
>
> ## *stepvector*
>
> ### Syntax:
>
> > `stepvector`
>
> ### Overview:
>
> This complex constant represents the runtime value of a vector of
> increasing integers in the arithmetic series:
>
> > `<0, 1, 2, ... num_elements-1>`
>
> This is the basis for a scalable form of vector constants. Adding a splat
> changes the effective starting point, and multiplying changes the step. The
> main uses for this are:
>
> * Predicate creation using vector compares for fully predicated loops (see
>   also: [*propff*](#propff), [*test*](#test)).
> * Creating offset vectors for gather/scatter via `getelementptr`.
> * Creating masks for `shufflevector`.
>
> For the following loop, a `stepvector` constant would be added to a splat
> of the loop induction variable to create the data vector to store:
>
> ```cpp
> unsigned a[LIMIT];
>
> for (unsigned i = 0; i < LIMIT; i++) {
>   a[i] = i;
> }
> ```
>
> ### Interface:
>
> ```cpp
> Constant *StepVectorValue::get(Type *Ty);
> ```
>
> ### Example:
>
> The following shows the construction of a scalable vector of the form
> <start, start-2, start-4, ...>:
>
> ```llvm
> %elt = insertelement <n x 4 x i32> undef, i32 %start, i32 0
> %widestart = shufflevector <n x 4 x i32> %elt, <n x 4 x i32> undef, <n x 4 x i32> zeroinitializer
> %step = insertelement <n x 4 x i32> undef, i32 -2, i32 0
> %widestep = shufflevector <n x 4 x i32> %step, <n x 4 x i32> undef, <n x 4 x i32> zeroinitializer
> %stridevec = mul <n x 4 x i32> stepvector, %widestep
> %finalvec = add <n x 4 x i32> %widestart, %stridevec
> ```
>
> Current patch set:
> https://reviews.llvm.org/D27101
> https://reviews.llvm.org/D27102
> https://reviews.llvm.org/D27103
> https://reviews.llvm.org/D27105
>
> -Graham
>
> On 22/11/2016, 14:49, "Graham Hunter via llvm-dev" <llvm-dev at lists.llvm.org> wrote:
>
> Hi Renato,
>
> Sorry for the delay in responding. We've been busy rethinking some of
> our changes after the feedback we've received thus far (particularly from
> the devmeeting). The incremental patches will use our revised design (which
> should be less invasive), and I'll be updating our document to match.
>
> On 16/11/2016, 12:46, "Renato Golin" <renato.golin at linaro.org> wrote:
>
> > This email is long and hard to read. I'm not surprised no one replied
> > yet. I think your PDF attached is a good start away from the
> > complexity, but we're not going to get far if we try to do things in
> > one step.
> >
> > Based on your repository, the number of changes is so great, and the
> > changes so invasive, that we really should look back at what we need
> > to do, one step at a time, and only perform the refactoring changes
> > that are needed for each step.
>
> We don't intend to do this all in one go; we fully expect that we'll
> need to refactor a few times based on community feedback as we
> incrementally add support for scalable vectors.
>
> > > * This is a warts-and-all release of our development tree, with plenty of TODOs and unfinished experiments
> > > * We haven't posted our clang changes yet
> >
> > I don't mind FIXMEs or TODOs, but I did see a lot of spurious name
> > changes, enum value moves (breaking old binaries) and a lot of new
> > high-level passes (LoopVectorisationAnalysis) which will need a long
> > review on their own before we even start thinking about SVE.
> >
> > I recommend you guys separate the refactoring from the implementation
> > and try to upstream the initial and uncontroversial refactorings (name
> > changes, etc), as well as move out the current functionality into new
> > passes, so then you can extend for SVE as a refactoring, not
> > move-and-extend in the same pass.
>
> So our highest priority is getting basic support for SVE into the
> codebase (types, codegen, assembler/disassembler, simple vectorization);
> after that is in, we'll be happy to discuss our other changes like
> separating out loop vectorization legality, controlling loops via
> predication, or adding search loop vectorization.
>
> > We want to minimise the number of changes, so that we can revert
> > breakages more easily, and have a steady progress, rather than a
> > break-the-world situation.
>
> Same for us. The individual patches will be relatively small; this
> repo was just for context if needed when discussing the smaller patches.
>
> > Finally, *every* test change needs to be scrutinised and guaranteed to
> > make sense.
> > We really dislike spurious test changes, unless we can
> > prove that the test was unstable to begin with, in which case we
> > change it to a better test.
>
> Yep, makes sense.
>
> Thanks,
>
> -Graham
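For readers following the quoted documentation, the behaviour of the two proposed constants can be modelled in ordinary C++. This is only a rough sketch under stated assumptions, not code from the patch set: `numElements`, `stepVector`, `stridedVector` and `fillWithIndex` are hypothetical helper names, `vscale` is passed in as a plain parameter rather than being a hardware-defined runtime value, and the tail of the loop is handled with a simple per-lane bounds check standing in for predication.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of elements in a <n x 4 x i32> vector for a given runtime vscale
// (the "n"). Mirrors `mul (i64 vscale, i64 4)` from the quoted example.
uint64_t numElements(uint64_t vscale) { return vscale * 4; }

// Models `stepvector` for <n x 4 x i32>: <0, 1, 2, ..., num_elements-1>.
std::vector<int32_t> stepVector(uint64_t vscale) {
  std::vector<int32_t> v(numElements(vscale));
  for (std::size_t lane = 0; lane < v.size(); ++lane)
    v[lane] = static_cast<int32_t>(lane);
  return v;
}

// Models add(splat(start), mul(stepvector, splat(stride))), which gives
// <start, start+stride, start+2*stride, ...>.
std::vector<int32_t> stridedVector(int32_t start, int32_t stride,
                                   uint64_t vscale) {
  std::vector<int32_t> out = stepVector(vscale);
  for (int32_t &elt : out)
    elt = start + elt * stride;
  return out;
}

// Simulates a vectorized form of: for (i = 0; i < limit; i++) a[i] = i;
// The induction variable advances by vscale * 4 per iteration, and each
// lane stores splat(index) + stepvector. Lanes past `limit` are skipped,
// which stands in for a predicate (an assumption of this sketch).
std::vector<uint32_t> fillWithIndex(uint64_t limit, uint64_t vscale) {
  assert(vscale > 0 && "this sketch assumes vscale is at least 1");
  std::vector<uint32_t> a(limit);
  const std::vector<int32_t> step = stepVector(vscale);
  const uint64_t lanes = numElements(vscale);
  for (uint64_t index = 0; index < limit; index += lanes)
    for (uint64_t lane = 0; lane < lanes && index + lane < limit; ++lane)
      a[index + lane] =
          static_cast<uint32_t>(index) + static_cast<uint32_t>(step[lane]);
  return a;
}
```

Calling `fillWithIndex(10, 1)` and `fillWithIndex(10, 3)` yields the same array, which is the point of the scalable form: the result does not depend on the vector length chosen at run time. `stridedVector(start, -2, vscale)` corresponds to the `<start, start-2, start-4, ...>` example.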
Paul Walker via llvm-dev
2016-Nov-25 14:42 UTC
[llvm-dev] [RFC] Supporting ARM's SVE in LLVM
Hi James,

With our goal of making scalable vectors a first class type we already want to change the IR to incorporate "vscale" as part of the printed type (i.e. it's the "n" in "<n x 4 x i32>"). For this reason it makes sense to us if the isolated representation of "n" (i.e. vscale) is also treated as first class, because the two representations go hand in hand.

The above is the stylistic answer, but from a more practical point of view there will be many instances where "vscale" gets queried. Folds that currently exist for ConstantInt will need to have "vscale" variants. We concluded that using an intrinsic would pollute the code base a lot more in the long run than the new constant approach.

Paul

From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of James Molloy via llvm-dev <llvm-dev at lists.llvm.org>
Reply-To: James Molloy <james at jamesmolloy.co.uk>
Date: Thursday, 24 November 2016 at 20:49
To: Graham Hunter <Graham.Hunter at arm.com>, "llvm-dev at lists.llvm.org" <llvm-dev at lists.llvm.org>
Cc: nd <nd at arm.com>, Will Lovett <Will.Lovett at arm.com>
Subject: Re: [llvm-dev] [RFC] Supporting ARM's SVE in LLVM

Hi Graham,

One high level comment without reading the patchset too much - it seems 'vscale' in particular could be just as easy to implement as an intrinsic, which would be a less invasive patch. Is there a reason you didn't go down the intrinsic route?

James
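To make the phrase "folds with vscale variants" a little more concrete, here is a toy model of what such a fold could look like. It is purely illustrative and is an assumption of this note, not something taken from the patches or from LLVM's constant folder: `ScalableConst`, `foldAdd` and `foldMul` are hypothetical names, and the representation (a fixed part plus a multiple of `vscale`) is chosen only because it keeps the arithmetic easy to follow.

```cpp
#include <cstdint>

// Hypothetical toy constant: value = fixed + scaled * vscale.
// Not an LLVM class; a plain integer constant has scaled == 0 and the
// bare `vscale` constant is {0, 1}.
struct ScalableConst {
  int64_t fixed;   // the ordinary, ConstantInt-like part
  int64_t scaled;  // multiple of the runtime vscale
};

// Addition folds component-wise whether or not vscale is involved.
ScalableConst foldAdd(ScalableConst a, ScalableConst b) {
  return {a.fixed + b.fixed, a.scaled + b.scaled};
}

// Multiplication folds only when at least one operand has no vscale term;
// a vscale * vscale product is not representable in this toy form, so the
// function reports failure and leaves `out` untouched.
bool foldMul(ScalableConst a, ScalableConst b, ScalableConst &out) {
  if (a.scaled == 0) {
    out = {a.fixed * b.fixed, a.fixed * b.scaled};
    return true;
  }
  if (b.scaled == 0) {
    out = {a.fixed * b.fixed, a.scaled * b.fixed};
    return true;
  }
  return false;
}
```

In this model, `mul (i64 vscale, i64 4)` folds to `{0, 4}`, and adding two such steps folds to `{0, 8}`, that is, `8 * vscale`, without the fold needing to look through a call to an intrinsic.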
Renato Golin via llvm-dev
2016-Nov-25 15:15 UTC
[llvm-dev] [RFC] Supporting ARM's SVE in LLVM
On 25 November 2016 at 14:42, Paul Walker via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> The above is the stylistic answer but from a more practical point of view
> there will be many instances where "vscale" gets queried. Folds that
> currently exist for ConstantInt will need to have "vscale" variants. We
> concluded using an intrinsic would pollute the code base a lot more in the
> long run when compared to the new constant approach.

Hi Paul,

Can you give us examples of where the vscale will be constant-folded?

cheers,
--renato