Shahid, Asghar-ahmad
2015-May-06 05:43 UTC
[LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics
Hi Renato,

That’s right. I agree with your *pattern vs complexity* thinking.

So I would drop llvm.sad() and go ahead with the remaining two.

Does it make sense in general?

Regards,
Shahid

> -----Original Message-----
> From: Renato Golin [mailto:renato.golin at linaro.org]
> Sent: Tuesday, May 05, 2015 8:40 PM
> To: Shahid, Asghar-ahmad
> Cc: James Molloy; llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics
>
> On 5 May 2015 at 15:41, Shahid, Asghar-ahmad
> <Asghar-ahmad.Shahid at amd.com> wrote:
> > With llvm.sad() intrinsic:
> > VC1 (Vector Cost) = Cost associated with "PSAD" instruction.
> >
> > W/ llvm.absd() and llvm.hadd()
> > VC2 = Cost associated with "absolute diff" + "horizontal add" ( ??? )
> >
> > As I will be querying with getIntrinsicCost(ID) for these two
> > intrinsics separately, Will VC1==VC2?
>
> I see. You are correct to say that this is a crude approximation.
>
> The way we do it today is to get one of them and treat it as "cheap", or,
> if that is not possible, to hope it'll dilute amidst other more expensive
> instructions. Since the cost table is mostly to get it going, having
> 2/4 of the cost instead of 1/4 of the cost (for diff+add of 4-way vectors
> instead of diff+add of 4 scalars) will count little to the final score and
> it'll probably encourage vectorization. On the generic cases that we fail
> to vectorize, we end up increasing the cost of the scalar operations.
>
> I agree this is far from ideal, but it works reasonably well. The
> alternative would be to have instruction pattern support, which would give
> us more fine-grained control. I suggested this many years ago, but so far
> the current model is working well enough that we haven't felt the need to
> implement complicated pattern matching support.
>
> The cases where a pattern match would help are mainly:
>
> * Detecting cases where the back end has special instructions for multiple
> IR instructions. This is your case, and is common enough that it should
> benefit almost all back-ends.
>
> * Hazard detection, for instance when moving in and out of VFP registers,
> or when two instructions in sequence are really bad in specific CPUs. This
> would also benefit multiple back-ends, but probably has less impact on the
> quality of the choices.
>
> However, we should first try the current model, and only go towards the
> more complex model if we have enough patterns that would benefit strongly
> enough to compensate for the increase in complexity. This should be a
> consensus decision, I think.
>
> In any case, this is not an argument to implement intrinsics just because
> the cost model is not accurate enough. If anything, we should fix the cost
> model.
>
> cheers,
> --renato
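
To make the VC1 vs VC2 question concrete, here is a minimal Python sketch of how a sum of absolute differences decomposes into an absolute difference followed by a horizontal add. The function names mirror the intrinsics under discussion but are plain Python, not LLVM APIs. The unit costs are hypothetical and chosen only to mirror the "1/4 vs 2/4" figures in the quoted message; they are one possible reading of that remark, not numbers from any real cost table.

```python
def absd(a, b):
    """Element-wise absolute difference of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [abs(x - y) for x, y in zip(a, b)]


def hadd(v):
    """Horizontal add: reduce a vector to the sum of its lanes."""
    return sum(v)


def sad(a, b):
    """Sum of absolute differences, composed from the two smaller operations."""
    return hadd(absd(a, b))


# Hypothetical unit costs (assumptions for illustration only).
SCALAR_COST = 4  # four scalar diff+add steps, one unit each
FUSED_COST = 1   # VC1: a single PSAD-like instruction on a 4-way vector
SPLIT_COST = 2   # VC2: "absolute diff" and "horizontal add" costed separately


def should_vectorize(vector_cost, scalar_cost=SCALAR_COST):
    """Vectorize whenever the vector form is cheaper than the scalar form."""
    return vector_cost < scalar_cost
```

Under these assumed numbers VC1 != VC2, yet both are below the scalar cost, so the vectorization decision is the same either way. That is the sense in which the approximation is crude but still workable.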
On 6 May 2015 at 06:43, Shahid, Asghar-ahmad
<Asghar-ahmad.Shahid at amd.com> wrote:
> That’s right. I agree with your *pattern vs complexity* thinking.
>
> So I would drop llvm.sad() and go ahead with the remaining two.
>
> Does it make sense in general?

We strive to keep things simple.

Creating too many intrinsics makes the IR brittle and hard to optimise. Creating pattern matching rules can be complex, but they're generally preferred to adding intrinsics. But creating a new pattern matching engine just for the sake of one case is too much, so we end up using heuristics instead.

However, heuristics are like connective tissue. They look functional, but they add up to a big and ineffective blob, which gets less effective with every new piece of data you add. So, in the end, having a plan to generate a more efficient pattern matching structure could very well reduce the amount of code if (and only if) the plan accounts for most current cases as well as the new ones.

For the time being, if you can get away with heuristics, and that fills your allocated time for this task, then that's the best way forward for now.

cheers,
--renato
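
As an illustration of what a pattern matching rule for this case might look like, the sketch below rewrites a horizontal add of an absolute difference into a single fused operation. The expression format (nested tuples of an opcode followed by operands) and the function name `fuse_sad` are hypothetical and exist only for this example; they are not LLVM's IR or any LLVM API.

```python
def fuse_sad(expr):
    """Rewrite ("hadd", ("absd", a, b)) into ("sad", a, b), bottom-up.

    Leaves are plain value names (strings); everything else is a tuple
    of the form (opcode, operand, ...).
    """
    if not isinstance(expr, tuple):
        return expr  # leaf: nothing to rewrite
    op, *args = expr
    # Rewrite operands first so nested occurrences are fused as well.
    args = [fuse_sad(arg) for arg in args]
    if op == "hadd" and len(args) == 1:
        inner = args[0]
        if isinstance(inner, tuple) and len(inner) == 3 and inner[0] == "absd":
            return ("sad", inner[1], inner[2])
    return (op, *args)
```

For example, `fuse_sad(("hadd", ("absd", "a", "b")))` yields `("sad", "a", "b")`, while a `hadd` over anything else is left untouched. One rule like this is small; the complexity concern raised in the thread is about building and maintaining a general engine for many such rules.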
Shahid, Asghar-ahmad
2015-May-06 10:21 UTC
[LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics
> For the time being, if you can get away with heuristics, and that fills
> your allocated time for this task, then that's the best way forward for
> now.

Sorry, I could not work out what exactly you mean by "heuristics". Is it the "intrinsics approach" itself or something else?

BTW, my plan now is to just add the two intrinsics for 'absolute difference' and 'horizontal add'.

Regards,
Shahid