thr3ads.net - llvm dev - [LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics [May 2015]

Shahid, Asghar-ahmad

2015-May-06 05:43 UTC

[LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics

Hi Renato,

That’s right. I agree with your *pattern vs complexity* thinking.

So I would drop llvm.sad() and go ahead with the remaining two.

Does it make sense in general?

Regards,
Shahid
> -----Original Message-----
> From: Renato Golin [mailto:renato.golin at linaro.org]
> Sent: Tuesday, May 05, 2015 8:40 PM
> To: Shahid, Asghar-ahmad
> Cc: James Molloy; llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics
> 
> On 5 May 2015 at 15:41, Shahid, Asghar-ahmad
> <Asghar-ahmad.Shahid at amd.com> wrote:
> > With llvm.sad() intrinsic:
> > VC1 (Vector Cost) = Cost associated with "PSAD" instruction.
> >
> > W/ llvm.absd() and llvm.hadd()
> > VC2 = Cost associated with "absolute diff" + "horizontal add" ( ??? )
> >
> > As I will be querying with getIntrinsicCost(ID) for these two intrinsics
> > separately, will VC1==VC2?
> 
> I see. You are correct to say that this is a crude approximation.
> 
> The way we do it today is to pick one of them and treat it as "cheap", or,
> if that is not possible, to hope it'll dilute amidst other more expensive
> instructions. Since the cost table is mostly there to get things going,
> having 2/4 of the cost instead of 1/4 of the cost (for diff+add of 4-way
> vectors instead of diff+add of 4 scalars) will count little towards the
> final score and it'll probably encourage vectorization. On the generic cases
> that we fail to vectorize, we end up increasing the cost of the scalar
> operations.
> 
> I agree this is far from ideal, but it works reasonably well. The
> alternative would be to have instruction pattern support, which would give
> us more fine-grained control. I suggested this many years ago, but so far
> the current model is working well enough that we haven't felt the need to
> implement complicated pattern matching support.
> 
> The cases where a pattern match would help are mainly:
> 
> * Detecting cases where the back end has special instructions for multiple
> IR instructions. This is your case, and is common enough that it should
> benefit almost all back-ends.
> 
> * Hazard detection, for instance when moving in and out of VFP registers,
> or when two instructions in sequence are really bad in specific CPUs. This
> would also benefit multiple back-ends, but probably has less impact on the
> quality of the choices.
> 
> However, we should first try the current model, and only go towards the
> more complex model if we have enough patterns that would benefit strongly
> enough to compensate for the increase in complexity. This should be a
> consensus decision, I think.
> 
> In any case, this is not an argument to implement intrinsics just because
> the cost model is not accurate enough. If anything, we should fix the cost
> model.
> 
> cheers,
> --renato
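
One way to read the cost question quoted above is as a toy calculation. The sketch below is not LLVM's cost model and does not call getIntrinsicCost(); the ToyCosts struct, its unit costs, and every function name are hypothetical, chosen only so that the numbers line up with the "2/4 instead of 1/4" figure for a 4-way vector.

```cpp
// Hypothetical unit costs. These numbers are illustrative only and are not
// taken from any LLVM target cost table.
struct ToyCosts {
  unsigned ScalarDiffAdd = 1; // one scalar |a - b| accumulated into a sum
  unsigned VectorAbsDiff = 1; // one vector absolute-difference operation
  unsigned VectorHAdd = 1;    // one vector horizontal-add operation
  unsigned FusedSad = 1;      // one PSAD-style fused instruction
};

// Scalar baseline: one diff+add per lane.
unsigned scalarCost(const ToyCosts &C, unsigned Lanes) {
  return Lanes * C.ScalarDiffAdd;
}

// VC2: absolute difference and horizontal add are costed separately.
unsigned splitVectorCost(const ToyCosts &C) {
  return C.VectorAbsDiff + C.VectorHAdd;
}

// VC1: a single fused instruction is costed.
unsigned fusedVectorCost(const ToyCosts &C) { return C.FusedSad; }

// Simplified decision rule: vectorize when the vector form is cheaper.
bool shouldVectorize(unsigned VectorCost, unsigned ScalarCost) {
  return VectorCost < ScalarCost;
}
```

Under these assumed costs, VC2 is 2 and VC1 is 1 against a scalar cost of 4, so VC1 != VC2, yet both still come out cheaper than the scalar form and the vectorization decision is the same either way.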

Renato Golin

2015-May-06 09:51 UTC

On 6 May 2015 at 06:43, Shahid, Asghar-ahmad
<Asghar-ahmad.Shahid at amd.com> wrote:
> That’s right. I agree with your *pattern vs complexity* thinking.
>
> So I would drop llvm.sad() and go ahead with the remaining two.
>
> Does it make sense in general?

We strive to keep things simple.

Creating too many intrinsics makes the IR brittle and hard to
optimise. Creating pattern matching rules can be complex, but they're
generally preferred to adding intrinsics. But creating a new pattern
matching engine just for the sake of one case is too much, so we end
up using heuristics instead.

However, heuristics are like connective tissue. They look functional,
but they add up to a big and ineffective blob, which becomes less
effective with every new piece of data you add. So, in the end, having
a plan to generate a more efficient pattern matching structure could
very well reduce the amount of code, if (and only if) the plan accounts
for most current cases as well as the new ones.

For the time being, if you can get away with heuristics, and that
fills your allocated time for this task, then it's the best way
forward for now.

cheers,
--renato

Shahid, Asghar-ahmad

2015-May-06 10:21 UTC

> For the time being, if you can get away with heuristics, and that fills
> your allocated time for this task, then it's the best way forward for now.

Sorry, I could not follow what exactly you mean by "heuristics".
Is it the "intrinsics approach" itself or something else?

BTW, now my plan is to just add the two intrinsics for 'absolute difference'
and 'horizontal add'.
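
To make the two remaining operations concrete, here is a minimal sketch of what they compute and how a sum of absolute differences falls out of composing them. The thread does not spell out element types or result widths, so the unsigned 8-bit lanes, the 32-bit sum, and the function names absd, hadd and sad are all assumptions made for illustration, not the proposed intrinsic signatures.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Element-wise absolute difference (assumed: unsigned 8-bit lanes).
template <std::size_t N>
std::array<std::uint8_t, N> absd(const std::array<std::uint8_t, N> &A,
                                 const std::array<std::uint8_t, N> &B) {
  std::array<std::uint8_t, N> R{};
  for (std::size_t I = 0; I < N; ++I)
    R[I] = static_cast<std::uint8_t>(A[I] > B[I] ? A[I] - B[I] : B[I] - A[I]);
  return R;
}

// Horizontal add: sum all lanes into one scalar (assumed: 32-bit result so
// the sum cannot overflow for small vectors).
template <std::size_t N>
std::uint32_t hadd(const std::array<std::uint8_t, N> &V) {
  std::uint32_t Sum = 0;
  for (std::uint8_t X : V)
    Sum += X;
  return Sum;
}

// Sum of absolute differences expressed as the composition of the two.
template <std::size_t N>
std::uint32_t sad(const std::array<std::uint8_t, N> &A,
                  const std::array<std::uint8_t, N> &B) {
  return hadd(absd(A, B));
}
```

With only absd and hadd in the IR, a back end that has a fused PSAD-style instruction would have to recognise the hadd(absd(a, b)) composition itself, which is the pattern-matching concern raised earlier in the thread.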

Regards,
Shahid
> -----Original Message-----
> From: Renato Golin [mailto:renato.golin at linaro.org]
> Sent: Wednesday, May 06, 2015 3:22 PM
> To: Shahid, Asghar-ahmad
> Cc: James Molloy; llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics
> 
> On 6 May 2015 at 06:43, Shahid, Asghar-ahmad
> <Asghar-ahmad.Shahid at amd.com> wrote:
> > That’s right. I agree with your *pattern vs complexity* thinking.
> >
> > So I would drop llvm.sad() and go ahead with the remaining two.
> >
> > Does it make sense in general?
> 
> We strive to keep things simple.
> 
> Creating too many intrinsics makes the IR brittle and hard to optimise.
> Creating pattern matching rules can be complex, but they're generally
> preferred to adding intrinsics. But creating a new pattern matching engine
> just for the sake of one case is too much, so we end up using heuristics
> instead.
> 
> However, heuristics are like connective tissue. They look functional, but
> they add up to a big and ineffective blob, which becomes less effective
> with every new piece of data you add. So, in the end, having a plan to
> generate a more efficient pattern matching structure could very well reduce
> the amount of code, if (and only if) the plan accounts for most current
> cases as well as the new ones.
> 
> For the time being, if you can get away with heuristics, and that fills
> your allocated time for this task, then it's the best way forward for now.
> 
> cheers,
> --renato
