Hi, I received an internal test case from a game team (it wasn't about this in particular), and I was wondering if there was maybe an opportunity to canonicalize a particular code pattern:

  %inputi = bitcast <4 x float> %input to <4 x i32>

  %row0i = and <4 x i32> %inputi, <i32 -1, i32 0, i32 0, i32 0>
  %row0 = bitcast <4 x i32> %row0i to <4 x float>

  %row1i = and <4 x i32> %inputi, <i32 0, i32 -1, i32 0, i32 0>
  %row1 = bitcast <4 x i32> %row1i to <4 x float>

  %row2i = and <4 x i32> %inputi, <i32 0, i32 0, i32 -1, i32 0>
  %row2 = bitcast <4 x i32> %row2i to <4 x float>

  %row3i = and <4 x i32> %inputi, <i32 0, i32 0, i32 0, i32 -1>
  %row3 = bitcast <4 x i32> %row3i to <4 x float>

This arises from code which expands a vector of scale factors into the diagonal of a 4x4 diagonal matrix. This code pattern is coming from intrinsics which are explicitly doing the masking like this.

My question is: should we canonicalize this to:

  %row0 = shufflevector <4 x float> %input, <4 x float> zeroinitializer, <4 x i32> <i32 0, i32 4, i32 4, i32 4>
  %row1 = shufflevector <4 x float> %input, <4 x float> zeroinitializer, <4 x i32> <i32 4, i32 1, i32 4, i32 4>
  %row2 = shufflevector <4 x float> %input, <4 x float> zeroinitializer, <4 x i32> <i32 4, i32 4, i32 2, i32 4>
  %row3 = shufflevector <4 x float> %input, <4 x float> zeroinitializer, <4 x i32> <i32 4, i32 4, i32 4, i32 3>

which seems to better express the intent, or a sequence of insertelement and extractelement (which is what we get for the attached code), or leave it as is? (or any better ideas?)

Forgive my naivete if there's something obvious I'm missing since I haven't done much w.r.t. vectors in LLVM.

-- Sean Silva

-------------- next part --------------
A non-text attachment was scrubbed...
Name: diagonalToScalingMatrix.cpp
Type: text/x-c++src
Size: 314 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140926/5ac3bb1d/attachment.cpp>
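
As a quick sanity check of the rewrite proposed in the question, the following is a minimal Python model, not LLVM code, of the two forms. The helpers `float_bits`, `bits_float`, `and_mask` and `shufflevector` are hypothetical names made up for this sketch; they only mimic the lane semantics of the IR instructions on small lists, and the input values are arbitrary.

```python
import struct

def float_bits(x):
    """Reinterpret a float as its 32-bit pattern (models bitcast float -> i32)."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_float(b):
    """Reinterpret a 32-bit pattern as a float (models bitcast i32 -> float)."""
    return struct.unpack("<f", struct.pack("<I", b))[0]

def and_mask(vec, mask):
    """Model of: bitcast to <4 x i32>, 'and' with a constant mask, bitcast back."""
    return [bits_float(float_bits(x) & (m & 0xFFFFFFFF)) for x, m in zip(vec, mask)]

def shufflevector(a, b, mask):
    """Model of shufflevector: index i < len(a) picks a[i], otherwise b[i - len(a)]."""
    both = list(a) + list(b)
    return [both[i] for i in mask]

inp = [2.0, 3.0, 5.0, 7.0]   # arbitrary scale factors, exactly representable as float
zero = [0.0] * 4             # stands in for zeroinitializer

# Row r keeps lane r and clears the others.
and_rows = [
    and_mask(inp, [-1 if lane == row else 0 for lane in range(4)])
    for row in range(4)
]

# Same rows via the masks from the question: lane r takes index r, others take index 4.
shuffle_rows = [
    shufflevector(inp, zero, [row if lane == row else 4 for lane in range(4)])
    for row in range(4)
]

for row in shuffle_rows:
    print(row)
```

On this input both lists hold the same four rows, which is the equivalence the canonicalization would rely on. The sketch says nothing about which form leads to better generated code.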

On 26 September 2014 19:22, Sean Silva <chisophugis at gmail.com> wrote:
> [...]
> My question is: should we canonicalize this to:
>
>   %row0 = shufflevector <4 x float> %input, <4 x float> zeroinitializer, <4 x i32> <i32 0, i32 4, i32 4, i32 4>
>   %row1 = shufflevector <4 x float> %input, <4 x float> zeroinitializer, <4 x i32> <i32 4, i32 1, i32 4, i32 4>
>   %row2 = shufflevector <4 x float> %input, <4 x float> zeroinitializer, <4 x i32> <i32 4, i32 4, i32 2, i32 4>
>   %row3 = shufflevector <4 x float> %input, <4 x float> zeroinitializer, <4 x i32> <i32 4, i32 4, i32 4, i32 3>
>
> which seems to better express the intent, or a sequence of insertelement and
> extractelement (which is what we get for the attached code), or leave it as
> is? (or any better ideas?)
>
> Forgive my naivete if there's something obvious I'm missing since I haven't
> done much w.r.t. vectors in LLVM.

shufflevector does look more canonical. In the past I think we avoided creating shufflevector for fear of producing bad code in CodeGen, but I think Chandler just fixed that :-)

Cheers,
Rafael

I think that the pattern below should be canonicalized into a vector 'select' instruction with a constant mask. I think that we already have code for canonicalizing select-like shuffles into selects.

> On Oct 6, 2014, at 12:36 PM, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
>
> On 26 September 2014 19:22, Sean Silva <chisophugis at gmail.com> wrote:
>> Hi, I received an internal test case from a game team (it wasn't about this
>> in particular), and I was wondering if there was maybe an opportunity to
>> canonicalize a particular code pattern:
>>
>>   %inputi = bitcast <4 x float> %input to <4 x i32>
>>
>>   %row0i = and <4 x i32> %inputi, <i32 -1, i32 0, i32 0, i32 0>
>>   %row0 = bitcast <4 x i32> %row0i to <4 x float>
>>
>>   %row1i = and <4 x i32> %inputi, <i32 0, i32 -1, i32 0, i32 0>
>>   %row1 = bitcast <4 x i32> %row1i to <4 x float>
>>
>>   %row2i = and <4 x i32> %inputi, <i32 0, i32 0, i32 -1, i32 0>
>>   %row2 = bitcast <4 x i32> %row2i to <4 x float>
>>
>>   %row3i = and <4 x i32> %inputi, <i32 0, i32 0, i32 0, i32 -1>
>>   %row3 = bitcast <4 x i32> %row3i to <4 x float>
>>
>> This arises from code which expands a vector of scale factors into the
>> diagonal of a 4x4 diagonal matrix. This code pattern is coming from
>> intrinsics which are explicitly doing the masking like this.
>>
>> My question is: should we canonicalize this to:
>>
>>   %row0 = shufflevector <4 x float> %input, <4 x float> zeroinitializer, <4 x i32> <i32 0, i32 4, i32 4, i32 4>
>>   %row1 = shufflevector <4 x float> %input, <4 x float> zeroinitializer, <4 x i32> <i32 4, i32 1, i32 4, i32 4>
>>   %row2 = shufflevector <4 x float> %input, <4 x float> zeroinitializer, <4 x i32> <i32 4, i32 4, i32 2, i32 4>
>>   %row3 = shufflevector <4 x float> %input, <4 x float> zeroinitializer, <4 x i32> <i32 4, i32 4, i32 4, i32 3>

I think that there is a bug in the shuffle pattern. It should be <i32 4, i32 5, i32 6, i32 3>.

>> which seems to better express the intent, or a sequence of insertelement and
>> extractelement (which is what we get for the attached code), or leave it as
>> is? (or any better ideas?)
>>
>> Forgive my naivete if there's something obvious I'm missing since I haven't
>> done much w.r.t. vectors in LLVM.
>
> shufflevector does look more canonical. In the past I think we avoided
> creating shufflevector for fear of producing bad code in CodeGen, but
> I think Chandler just fixed that :-)

Excellent!

> Cheers,
> Rafael
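
One way to read the remark about the row3 mask, together with the 'select' suggestion, is sketched below as a small Python model. This is an interpretation, not a statement about what the optimizer actually matches, and the helper names (`shufflevector`, `select`, `is_select_like`) are hypothetical, not LLVM APIs. A mask is "select-like" here when every result lane i takes index i (first operand) or i + 4 (second operand), so no element changes lanes.

```python
def shufflevector(a, b, mask):
    """Model of shufflevector: index i < len(a) picks a[i], otherwise b[i - len(a)]."""
    both = list(a) + list(b)
    return [both[i] for i in mask]

def select(cond, a, b):
    """Model of a vector select: lane-wise choice between a and b."""
    return [x if c else y for c, x, y in zip(cond, a, b)]

def is_select_like(mask, n):
    """True when lane i reads index i or i + n, i.e. nothing crosses lanes."""
    return all(idx in (lane, lane + n) for lane, idx in enumerate(mask))

inp = [2.0, 3.0, 5.0, 7.0]   # arbitrary example values
zero = [0.0] * 4             # stands in for zeroinitializer

original_mask = [4, 4, 4, 3]         # mask from the original message
lane_preserving_mask = [4, 5, 6, 3]  # mask suggested in the reply

row3_original = shufflevector(inp, zero, original_mask)
row3_lanewise = shufflevector(inp, zero, lane_preserving_mask)

# The corresponding select with a constant mask would be, in IR:
#   %row3 = select <4 x i1> <i1 false, i1 false, i1 false, i1 true>,
#                  <4 x float> %input, <4 x float> zeroinitializer
row3_select = select([False, False, False, True], inp, zero)

print(row3_original, row3_lanewise, row3_select)
print(is_select_like(original_mask, 4), is_select_like(lane_preserving_mask, 4))
```

In this model all three forms produce the same row, because every element of the second operand is zero. The two masks differ only in shape: <4, 5, 6, 3> keeps each element in its own lane, which is the form that lines up directly with a per-lane select.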