Hi Chandler,
Regardless of the canonical form we choose, we need code to match non-canonical
associated shuffle sequences and convert them into the canonical form. We also
need code to match the pattern where we extractelement on all elements and sum
them into this canonical form. This code needs to exist somewhere, so we need to
decide whether it exists in the frontend or the backend.
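To make the two patterns above concrete, here is a minimal sketch in plain C++ (not LLVM code) that models both forms on a 4-lane vector: the extractelement-and-add chain, and a log2-style shuffle-and-add sequence. The names Vec4, Mask4, sumByExtract, shuffleLanes, addLanes and sumByShuffle are hypothetical and exist only for this illustration, and the particular shuffle masks are just one of the many sequences a matcher would have to accept.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an IR value of type <4 x i32>.
using Vec4 = std::array<std::uint32_t, 4>;
// Shuffle mask: lane I of the result reads input lane Mask[I]; -1 models undef.
using Mask4 = std::array<int, 4>;

// Pattern 1: extractelement on every lane, then a chain of scalar adds.
std::uint32_t sumByExtract(const Vec4 &V) {
  std::uint32_t Sum = 0;
  for (std::size_t I = 0; I < V.size(); ++I)
    Sum += V[I];
  return Sum;
}

// Models a single-input shufflevector. Undef lanes are filled with 0 here;
// any value would do, because those lanes never feed lane 0 below.
Vec4 shuffleLanes(const Vec4 &V, const Mask4 &Mask) {
  Vec4 R{};
  for (std::size_t I = 0; I < R.size(); ++I) {
    assert(Mask[I] < 4 && "mask index out of range");
    R[I] = Mask[I] < 0 ? 0u : V[static_cast<std::size_t>(Mask[I])];
  }
  return R;
}

// Lane-wise vector add (wraps on overflow, like an IR 'add' without nsw/nuw).
Vec4 addLanes(const Vec4 &A, const Vec4 &B) {
  Vec4 R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = A[I] + B[I];
  return R;
}

// Pattern 2: log2(4) = 2 shuffle+add steps, then extractelement of lane 0.
std::uint32_t sumByShuffle(const Vec4 &V) {
  const Mask4 HighHalf = {{2, 3, -1, -1}};
  const Mask4 Lane1 = {{1, -1, -1, -1}};
  // After step 1, lanes 0 and 1 hold V[0]+V[2] and V[1]+V[3].
  Vec4 Step1 = addLanes(V, shuffleLanes(V, HighHalf));
  // After step 2, lane 0 holds the full sum; the other lanes are don't-care.
  Vec4 Step2 = addLanes(Step1, shuffleLanes(Step1, Lane1));
  return Step2[0];
}
```

Both functions compute the same value for any input, which is exactly why whatever code does the matching has to recognize each of them (and the other equivalent shuffle sequences) as the same reduction.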
Having an intrinsic is obviously smaller, in terms of IR memory overhead, than
these instructions. However, I'm not sure how many passes we'll need to
teach about the new intrinsic. Obviously there are many passes that understand
integer addition, but likely many fewer would really learn anything useful by
looking through the reduction. We would need to add code to InstCombine in order
to pull apart the reduction intrinsic when we learn that the vector has only one
contributing element.
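As a rough illustration of that last point, the sketch below models the decision in plain C++; it is not InstCombine code and uses no LLVM APIs. It assumes, purely for illustration, that "only one contributing element" means every other lane has been proven to be zero. LaneFacts, onlyContributingLane, foldReduceAdd and evaluate are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an IR value of type <4 x i32>.
using Vec4 = std::array<std::uint32_t, 4>;
// Assumed analysis result: KnownZero[I] is true if lane I is proven to be 0.
using LaneFacts = std::array<bool, 4>;

// Returns the single lane that may be nonzero, or -1 if no lane or more
// than one lane may contribute to the sum.
int onlyContributingLane(const LaneFacts &KnownZero) {
  int Lane = -1;
  for (std::size_t I = 0; I < KnownZero.size(); ++I) {
    if (KnownZero[I])
      continue;
    if (Lane >= 0)
      return -1; // a second possibly-nonzero lane: cannot fold
    Lane = static_cast<int>(I);
  }
  return Lane;
}

enum class Lowering { ExtractLane, FullReduction };

struct Folded {
  Lowering Kind;
  int Lane; // only meaningful when Kind == ExtractLane
};

// Decide whether an add-reduction can be pulled apart into one extract.
// The all-lanes-zero case is left as a full reduction to keep this short.
Folded foldReduceAdd(const LaneFacts &KnownZero) {
  int Lane = onlyContributingLane(KnownZero);
  if (Lane >= 0)
    return {Lowering::ExtractLane, Lane};
  return {Lowering::FullReduction, -1};
}

// Evaluate the chosen form so both outcomes can be compared on real data.
std::uint32_t evaluate(const Folded &F, const Vec4 &V) {
  if (F.Kind == Lowering::ExtractLane) {
    assert(F.Lane >= 0 && F.Lane < 4 && "lane out of range");
    return V[static_cast<std::size_t>(F.Lane)];
  }
  std::uint32_t Sum = 0;
  for (std::size_t I = 0; I < V.size(); ++I)
    Sum += V[I];
  return Sum;
}
```

With the composite form, existing folds on extractelement and add might get to the same result on their own; with an intrinsic, something like the check above would have to be written explicitly.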
In short, I don't have a strong opinion on this, because we need the
matching code somewhere regardless. Using the intrinsic means not matching
multiple times, but it means adding extra code to handle the intrinsic.
Regarding issues such as idiom recognition and use by the SLP vectorizer,
these seem independent of whether the canonical form is an intrinsic or a
composite, and I don't think either choice makes the vectorizer cost model
easier.
In any case, we now have relevant pattern-matching code in SDAGBuilder (although
it is currently somewhat specific to reductions after loops), so we already have
a better infrastructure to help backends with the shuffle-matching problem.
There is a corresponding 'VectorReduction' SDNode flag.
-Hal
----- Original Message -----
> From: "Chandler Carruth" <chandlerc at gmail.com>
> To: "Asghar-ahmad Shahid" <Asghar-ahmad.Shahid at amd.com>, "Rail Shafigulin" <rail at esenciatech.com>, "llvm-dev" <llvm-dev at lists.llvm.org>, "Hal Finkel" <hfinkel at anl.gov>
> Sent: Sunday, May 15, 2016 8:15:37 PM
> Subject: Re: [llvm-dev] sum elements in the vector
>
> I'm starting to think we should directly implement horizontal
> operations on vector types.
>
> My suspicion is that coming up with a nice model for this would help
> us a lot with things like:
> - Idiom recognition of reduction patterns that use horizontal
>   arithmetic
> - Ability to use horizontal operations in SLPVectorizer
> - Significantly easier cost modeling of vectorizing loops with
>   reductions in LoopVectorize
> - Other things I've not thought of?
>
> Curious what others think?
>
> -Chandler
>
> On Wed, May 11, 2016 at 10:07 PM Shahid, Asghar-ahmad via llvm-dev <
> llvm-dev at lists.llvm.org > wrote:
>
> > why in order to add this particular instruction (sum elements in a
> > vector) I need to add an intrinsic?
>
> Adding an intrinsic is not the only way; it is one of the ways, and the
> user will NOT be required to invoke it specifically.
>
> Currently LLVM does not have any instruction that directly represents
> “sum of elements in a vector” and generates your particular instruction.
> However, you can do it without an intrinsic by pattern matching the
> LLVM IR representing “sum of elements in a vector” to your particular
> instruction in DAGCombiner.
>
> Regards,
> Shahid
>
> From: Rail Shafigulin [mailto: rail at esenciatech.com ]
> Sent: Monday, May 09, 2016 11:59 PM
> To: Shahid, Asghar-ahmad; llvm-dev
> Cc: Das, Dibyendu
> Subject: Re: [llvm-dev] sum elements in the vector
>
> I'm a little confused. Here is why.
>
> I was able to add a vector add instruction to my target without using
> any intrinsics and without adding any new instructions to LLVM. So
> here is my question: how come I managed to add a new vector
> instruction without adding an intrinsic and why in order to add this
> particular instruction (sum elements in a vector) I need to add an
> intrinsic?
>
> Another question that I have is whether the compiler will be able to
> target this new instruction (sum elements in a vector) if it is
> implemented as an intrinsic, or whether the user will have to
> specifically invoke an intrinsic.
>
> Pardon me if the questions seem dumb; I'm still learning things.
>
> Any help is appreciated.
>
> On Fri, May 6, 2016 at 1:51 PM, Rail Shafigulin <
> rail at esenciatech.com > wrote:
>
> Thanks for the reply. These steps will add an instruction as an
> intrinsic. Is it possible to add an actual new instruction so that a
> compiler could target it during an optimization? How hard is it to
> do it? Is that a realistic objective?
>
> Rail
>
> On Mon, Apr 4, 2016 at 9:02 PM, Shahid, Asghar-ahmad <
> Asghar-ahmad.Shahid at amd.com > wrote:
>
> Hi Rail,
>
> We had done this for generation of the X86 PSAD (sum of absolute
> differences) instruction through an LLVM intrinsic. Doing this requires
> the following:
>
> 1. Define an intrinsic, xyz(), for the required instruction and the
>    corresponding SDNode.
> 2. Generate the “call xyz()” IR based on the matched pattern.
> 3. Map the “call xyz()” IR to the corresponding SDNode in
>    SelectionDAGBuilder.cpp.
> 4. Provide a default expansion of the xyz() intrinsic.
> 5. Legalize the type and/or operation.
> 6. Provide lowering of the intrinsic/SDNode to generate your target
>    instruction.
>
> You can visit http://llvm.org/docs/ExtendingLLVM.html for details.
>
> Regards,
> Shahid
>
> From: llvm-dev [mailto: llvm-dev-bounces at lists.llvm.org ] On Behalf
> Of Rail Shafigulin via llvm-dev
> Sent: Monday, April 04, 2016 11:00 PM
> To: Das, Dibyendu
> Cc: llvm-dev at lists.llvm.org
> Subject: Re: [llvm-dev] sum elements in the vector
>
> Thanks for the pointers. I looked at the hadd instructions. They seem to
> do something very similar to what I need. Unfortunately, as I said
> before, my LLVM experience is limited. My understanding is that when I
> create a new type of SDNode I need to specify a pattern for it, so that
> when LLVM is analyzing the code and sees a given pattern it will
> create this particular node. I'm really struggling to understand how
> it is done. So here are the problems that I'm having.
>
> 1. How do I identify the pattern that should be used?
> 2. How do I specify a given pattern?
>
> Do you (or someone else) mind helping me out?
>
> Any help is appreciated.
>
> On Mon, Apr 4, 2016 at 9:59 AM, Das, Dibyendu < Dibyendu.Das at amd.com > wrote:
>
> This is roughly along the lines of x86 hadd* instructions though the
> semantics of hadd* may not exactly match what you are looking for.
> This is probably more in line with x86/ARM SAD-like instructions but
> I don’t think LLVM generates SAD without intrinsics.
>
> From: llvm-dev [mailto: llvm-dev-bounces at lists.llvm.org ] On Behalf
> Of Rail Shafigulin via llvm-dev
> Sent: Monday, April 04, 2016 9:34 AM
> To: llvm-dev < llvm-dev at lists.llvm.org >
> Subject: [llvm-dev] sum elements in the vector
>
> My target has an instruction that adds up all elements in the vector
> and stores the result in a register. I'm trying to implement it in
> my compiler but I'm not sure even where to start.
>
> I did look at other targets, but they don't seem to have anything
> like it ( I could be wrong. My experience with LLVM is limited, so
> if I missed it, I'd appreciate if someone could point it out ).
>
> My understanding is that if SDNode for such an instruction doesn't
> exist I have to define one. Unfortunately, I don't know how to do
> it. I don't even know where to start looking. Would someone care to
> point me in the right direction?
>
> Any help is appreciated.
>
> --
> Rail Shafigulin
> Software Engineer
> Esencia Technologies
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory