Shahid, Asghar-ahmad
2015-May-05 14:41 UTC
[LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics
Hi Renato,

Thanks for your response. My concern was actually this. For example, take vector type V8i16 on the X86 target.

With the llvm.sad() intrinsic:
    VC1 (Vector Cost) = Cost associated with the "PSAD" instruction.

With llvm.absd() and llvm.hadd():
    VC2 = Cost associated with "absolute diff" + "horizontal add" ( ??? )

As I will be querying getIntrinsicCost(ID) for these two intrinsics separately, will VC1 == VC2?

Maybe I am missing something obvious?

Regards,
Shahid

> -----Original Message-----
> From: Renato Golin [mailto:renato.golin at linaro.org]
> Sent: Tuesday, May 05, 2015 7:28 PM
> To: Shahid, Asghar-ahmad
> Cc: James Molloy; llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics
>
> On 4 May 2015 at 08:37, Shahid, Asghar-ahmad <Asghar-ahmad.Shahid at amd.com> wrote:
> > My worry is regarding the query for cost calculation for specific SAD
> > instructions such as ‘psad’ (X86) or ‘usad’ (ARM) in Loop Vectorizer.
>
> Hi Shahid,
>
> The vectorizer's cost model has the ability to return different costs for the
> same instruction based on the arguments (scalar/vector, big/small, special
> cases), so I don't think that adding intrinsics will help you in defining the
> correct cost. This is true for all of the vectorizer's other decisions and it
> works quite well.
>
> If you find something missing, maybe we should fix the cost model, not
> introduce more intrinsics.
>
> cheers,
> --renato
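The VC1 versus VC2 question above can be made concrete with a toy cost table. This is only a sketch: `CostTable`, `lookup`, `separateCost`, and `fusedCost` are hypothetical names, the table stands in for whatever a real per-intrinsic query would return, and any cost numbers fed to it are invented for illustration rather than taken from LLVM.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical per-intrinsic cost table for a single vector type.
// It is a stand-in for a target's real cost query, not LLVM's API.
using CostTable = std::map<std::string, unsigned>;

// Look up one intrinsic; the toy table must contain the name.
unsigned lookup(const CostTable &table, const std::string &name) {
    auto it = table.find(name);
    assert(it != table.end() && "unknown intrinsic in toy table");
    return it->second;
}

// VC2: absd and hadd are queried separately and their costs are summed.
unsigned separateCost(const CostTable &table) {
    return lookup(table, "absd") + lookup(table, "hadd");
}

// VC1: a single fused sad intrinsic is queried once.
unsigned fusedCost(const CostTable &table) {
    return lookup(table, "sad");
}
```

With two independent queries, nothing forces `separateCost` to equal `fusedCost`. The two agree only if the table entries happen to be chosen that way, which is the mismatch the question points at.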
Shahid, Asghar-ahmad
2015-May-06 05:43 UTC
[LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics
Hi Renato,

That’s right. I agree with your *pattern vs complexity* thinking. So I would drop llvm.sad() and go ahead with the remaining two. Does that make sense in general?

Regards,
Shahid

> -----Original Message-----
> From: Renato Golin [mailto:renato.golin at linaro.org]
> Sent: Tuesday, May 05, 2015 8:40 PM
> To: Shahid, Asghar-ahmad
> Cc: James Molloy; llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics
>
> On 5 May 2015 at 15:41, Shahid, Asghar-ahmad <Asghar-ahmad.Shahid at amd.com> wrote:
> > With llvm.sad() intrinsic:
> > VC1 (Vector Cost) = Cost associated with "PSAD" instruction.
> >
> > W/ llvm.absd() and llvm.hadd()
> > VC2 = Cost associated with "absolute diff" + "horizontal add" ( ??? )
> >
> > As I will be querying with getIntrinsicCost(ID) for these two intrinsics
> > separately, Will VC1==VC2?
>
> I see. You are correct to say that this is a crude approximation.
>
> The way we do it today is to get one of them and treat it as "cheap", or if
> that is not possible, to hope it'll dilute amidst other, more expensive
> instructions. Since the cost table is mostly to get it going, having
> 2/4 of the cost instead of 1/4 of the cost (for diff+add of 4-way vectors
> instead of diff+add of 4 scalars) will count little to the final score and
> it'll probably encourage vectorization. On the generic cases that we fail to
> vectorize, we end up increasing the cost of the scalar operations.
>
> I agree this is far from ideal, but it works reasonably well. The alternative
> would be to have instruction pattern support, which would give us more
> fine-grained control. I suggested this many years ago, but so far the
> current model is working well enough that we haven't felt the need to
> implement complicated pattern matching support.
>
> The cases where a pattern match would help are mainly:
>
> * Detecting cases where the back end has special instructions for multiple IR
> instructions. This is your case, and is common enough that it should benefit
> almost all back-ends.
>
> * Hazard detection, for instance when moving in and out of VFP registers, or
> when two instructions in sequence are really bad in specific CPUs. This would
> also benefit multiple back-ends, but probably has less impact on the quality
> of the choices.
>
> However, we should first try the current model, and only go towards the
> more complex model if we have enough patterns that would benefit strongly
> enough to compensate for the increase in complexity. This should be a
> consensus decision, I think.
>
> In any case, this is not an argument to implement intrinsics just because the
> cost model is not accurate enough. If anything, we should fix the cost model.
>
> cheers,
> --renato
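One possible reading of the "2/4 instead of 1/4" remark in the quoted reply is sketched below. It assumes unit costs, treats the scalar diff+add as one unit per lane, and uses a hypothetical helper `isProfitable`. It is a toy comparison, not the loop vectorizer's actual decision logic.

```cpp
#include <cassert>

// Toy profitability check (hypothetical, not LLVM's implementation):
// vectorizing at width vf is treated as worthwhile when the vector cost,
// spread over vf lanes, is lower than the scalar cost per lane.
bool isProfitable(unsigned scalarCostPerLane, unsigned vectorCost,
                  unsigned vf) {
    assert(vf > 0 && "vectorization factor must be positive");
    // Equivalent to vectorCost / vf < scalarCostPerLane, without division.
    return vectorCost < scalarCostPerLane * vf;
}
```

Under these assumptions, `isProfitable(1, 2, 4)` (two vector instructions, the 2/4 case) and `isProfitable(1, 1, 4)` (one fused instruction, the 1/4 case) both favour vectorization, so the overestimate would not change the outcome for this loop. That is consistent with the reply's point that the inaccuracy "will count little to the final score".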