On 1 February 2017 at 08:27, Renato Golin <renato.golin at linaro.org> wrote:
> Sorry, I meant min/max + reduce, just like above.
>
> %sum = add <N x float>, <N x float> %a, <N x float> %b
> %min = @llvm.minnum(<N x float> %sum)
> %red = @llvm.reduce(%min, float %acc)

No, this is wrong. I actually meant overriding the max/min intrinsics
to take vectors instead of two scalar operands.

The semantics of those intrinsics is to get a number of parameters and
return the max/min in their own type. A vector is just a different
packing of parameters, all of which have the same type, so there's no
semantic difference other than the number of arguments.

However, when they're lowered, they'll end up as a short sequence of
instructions (if supported) in the same way.

You'd only emit a max/min in vectorisation if the target supports it
and the cost is low. For instance, in AArch64 it would only emit it if
SVE support is enabled.

Makes sense?

cheers,
--renato

I think you mean take a single vector operand, not two vector
operands. If the intrinsic took two vectors then we would naturally
expect the result to be a vector, where each element is the result of
a scalar operation between each respective element in the vector
operands. Just like a normal binary operator.

On 1 February 2017 at 10:59, Renato Golin <renato.golin at linaro.org> wrote:
> On 1 February 2017 at 08:27, Renato Golin <renato.golin at linaro.org> wrote:
>> Sorry, I meant min/max + reduce, just like above.
>>
>> %sum = add <N x float>, <N x float> %a, <N x float> %b
>> %min = @llvm.minnum(<N x float> %sum)
>> %red = @llvm.reduce(%min, float %acc)
>
> No, this is wrong. I actually meant overriding the max/min intrinsics
> to take vectors instead of two scalar operands.
>
> The semantics of those intrinsics is to get a number of parameters and
> return the max/min in their own type. A vector is just a different
> packing of parameters, all of which have the same type, so there's no
> semantic difference other than the number of arguments.
>
> However, when they're lowered, they'll end up as a short sequence of
> instructions (if supported) in the same way.
>
> You'd only emit a max/min in vectorisation if the target supports it
> and the cost is low. For instance, in AArch64 it would only emit it if
> SVE support is enabled.
>
> Makes sense?
>
> cheers,
> --renato

On 1 February 2017 at 14:36, Amara Emerson <amara.emerson at gmail.com> wrote:
> I think you mean take a single vector operand, not two vector
> operands. If the intrinsic took two vectors then we would naturally
> expect the result to be a vector, where each element is the result of
> a scalar operation between each respective element in the vector
> operands. Just like a normal binary operator.

Precisely.

%vec = @llvm.maxnum(<N x float> %a, <N x float> %b)
%scalar = @llvm.maxnum(<N x float> %c)

In NEON, the former is a simple VMAX (VADD, etc), while the latter can
be a combination of VPMAX (VPADD etc).

cheers,
--renato
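
For readers following the thread, the distinction between the two forms can be sketched in plain Python. This is only an illustrative model, not LLVM or NEON code: the function names `elementwise_max`, `pairwise_max_step` and `horizontal_max` are made up for this sketch, lists stand in for vector registers, the pairwise step is a simplification of what a VPMAX-style instruction does, and NaN handling is ignored entirely.

```python
def elementwise_max(a, b):
    """Two vector operands -> vector result, lane by lane (VMAX-like)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of lanes")
    return [max(x, y) for x, y in zip(a, b)]


def pairwise_max_step(v):
    """One pairwise step: max of adjacent lanes, halving the lane count."""
    return [max(v[i], v[i + 1]) for i in range(0, len(v), 2)]


def horizontal_max(v):
    """Single vector operand -> scalar result, via repeated pairwise steps."""
    # Assumption for this sketch: lane count is a non-zero power of two.
    if not v or len(v) & (len(v) - 1):
        raise ValueError("lane count must be a non-zero power of two")
    while len(v) > 1:
        v = pairwise_max_step(v)
    return v[0]


if __name__ == "__main__":
    a = [1.0, 5.0, 3.0, 0.0]
    b = [2.0, 4.0, 3.5, -1.0]
    print(elementwise_max(a, b))  # vector result, same lane count as inputs
    print(horizontal_max(a))      # scalar result
```

The two-operand form keeps the lane count and maps to a single lane-wise operation, while the one-operand form collapses the vector to one value, which is why it would lower to a short sequence of pairwise steps rather than one instruction.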