Chris Lattner
2009-Jul-23 05:54 UTC
[LLVMdev] Case where VSETCC DAGCombiner hack doesn't work
On Jul 21, 2009, at 11:14 PM, Eli Friedman wrote:
> Testcase (compile with clang >= r76726):
> #include <emmintrin.h>
> __m128i a(__m128 a, __m128 b) { return a==a & b==b; }
>
> CodeGen ends up scalarizing the comparison, which is really bad, and
> AFAIK different from what we did before vsetcc was removed. The ideal
> code is a single cmpordps, although I don't think clang ever generated
> that for this construct.

Ok, we were missing this specific case because of some instcombine xforms that were only applying to scalars, not vectors. I tweaked them to cover vectors and we're getting "perfect" code for this now (one cmpordps).

However, not all is sunshine and roses, there are some sad puppydog faces left. Specifically, things like this still get scalarized:

#include <emmintrin.h>
__m128i a(__m128 a, __m128 b, __m128 c) { return a==b & c==b; }

The problem is that the IR going into Codegen has been (nicely) simplified to:

define <2 x i64> @a(<4 x float> %a, <4 x float> %b, <4 x float> %c) nounwind readnone {
entry:
  %cmp = fcmp oeq <4 x float> %a, %b ; <<4 x i1>> [#uses=1]
  %cmp4 = fcmp oeq <4 x float> %c, %b ; <<4 x i1>> [#uses=1]
  %and6 = and <4 x i1> %cmp, %cmp4 ; <<4 x i1>> [#uses=1]
  %and = sext <4 x i1> %and6 to <4 x i32> ; <<4 x i32>> [#uses=1]
  %conv = bitcast <4 x i32> %and to <2 x i64> ; <<2 x i64>> [#uses=1]
  ret <2 x i64> %conv
}

When legalize types sees the sext from <4 x i1> -> <4 x i32>, its only solution right now is to scalarize the whole mess feeding into it, giving us really atrocious code.

IMO, the solution to this is to have a legalize-types action for vectors that corresponds to "promote" on scalars. In this case, since X86 supports VSETCC, the 4 x i1 SETCC should "vector promote" to a VSETCC node with a 4xi32 result, the and should vector promote to 4xi32, and the sext should vector promote as a vector sext_inreg.

I don't think that implementing this is particularly hard, but I have plenty of other things I'm working on right now. Is anyone else interested in working on this?

-Chris
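To make the proposed "vector promote" idea concrete, here is a rough lane-by-lane model in portable C. It is only a sketch of the semantics, not LLVM code: the function names, the LANES constant, and the use of plain arrays in place of vector registers are all assumptions made for illustration. The first function mirrors the IR above (i1 compares, i1 and, then sext); the second mirrors the promoted form, where each compare already yields a full-width 0/-1 mask and the final sext_inreg has nothing left to do.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4 /* models <4 x float> / <4 x i32>; hypothetical constant */

/* Unpromoted form: every lane of the compare is an i1 (0 or 1), the AND is
   done on i1 values, and a final sext widens each lane to i32. */
void and_of_cmps_i1(const float *a, const float *b, const float *c,
                    int32_t *out) {
    assert(a && b && c && out);
    for (int i = 0; i < LANES; ++i) {
        unsigned cmp  = (a[i] == b[i]); /* fcmp oeq -> i1 (false on NaN) */
        unsigned cmp4 = (c[i] == b[i]); /* fcmp oeq -> i1 */
        unsigned and6 = cmp & cmp4;     /* and <4 x i1> */
        out[i] = -(int32_t)and6;        /* sext i1 -> i32: 0 or -1 */
    }
}

/* One lane of a VSETCC-style compare: the result is already an i32 mask. */
int32_t cmp_mask(float x, float y) {
    return (x == y) ? -1 : 0;
}

/* "Vector promoted" form: compares produce 4 x i32 masks, the AND works on
   those masks, and sign-extending from bit 0 would change nothing. */
void and_of_cmps_promoted(const float *a, const float *b, const float *c,
                          int32_t *out) {
    assert(a && b && c && out);
    for (int i = 0; i < LANES; ++i) {
        int32_t cmp  = cmp_mask(a[i], b[i]);
        int32_t cmp4 = cmp_mask(c[i], b[i]);
        out[i] = cmp & cmp4;
    }
}
```

Both functions compute the same four lanes for any input, which is the property that would let the whole chain stay in vector registers instead of being scalarized.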
Duncan Sands
2009-Jul-23 12:09 UTC
[LLVMdev] Case where VSETCC DAGCombiner hack doesn't work
Hi Chris,

> When legalize types sees the sext from <4 x i1> -> <4 x i32>, its only
> solution right now is to scalarize the whole mess feeding into it,
> giving us really atrocious code.
>
> IMO, the solution to this is to have a legalize-types action for
> vectors that corresponds to "promote" on scalars. In this case, since
> X86 supports VSETCC, the 4 x i1 SETCC should "vector promote" to a
> VSETCC node with a 4xi32 result, the and should vector promote to
> 4xi32, and the sext should vector promote as a vector sext_inreg.
>
> I don't think that implementing this is particularly hard, but I have
> plenty of other things I'm working on right now. Is anyone else
> interested in working on this?

I agree that this should be straightforward: if the vector element type is illegal (e.g. i1), then legalize the element while keeping it a vector (e.g. <4 x i1> -> <4 x i8> or whatever, <4 x i128> -> <8 x i64>).

One question is whether type legalization should handle the element type in the same way it would if it were a scalar. For example, should <4 x i1> get turned into <4 x i8>, since an i1 gets turned into an i8, or into something else like <4 x i32>? I guess the first option would be slightly simpler/more regular from the type legalization viewpoint. Operation legalization could later turn <4 x i8> into <4 x i32> if that's better for the operation.

That said, I don't plan to work on this.

Ciao,

Duncan.
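The "handle the element like a scalar" option can be sketched as a small type-mapping function. This is a toy model in C, not anything from the LLVM tree: the struct, the function name, and the list of legal integer widths (8, 16, 32, 64) are assumptions chosen so that the examples from this message come out as written. Too-wide elements are split in half until they fit, and too-narrow ones are promoted to the smallest legal width.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a vector type <count x iN>. */
struct vec_type {
    unsigned count;    /* number of elements */
    unsigned elt_bits; /* integer element width in bits */
};

/* Assumed legal scalar integer widths for an imaginary 64-bit target. */
static const unsigned legal_bits[] = {8, 16, 32, 64};
static const unsigned max_legal_bits = 64;

/* Legalize the element type the way a scalar would be legalized, while
   keeping the value a vector. */
struct vec_type legalize_elt_like_scalar(struct vec_type vt) {
    assert(vt.count > 0 && vt.elt_bits > 0);

    /* Expand: split each too-wide element in half, doubling the count. */
    while (vt.elt_bits > max_legal_bits) {
        vt.elt_bits = (vt.elt_bits + 1) / 2;
        vt.count *= 2;
    }

    /* Promote: round the element up to the smallest legal width. */
    for (size_t i = 0; i < sizeof legal_bits / sizeof legal_bits[0]; ++i) {
        if (vt.elt_bits <= legal_bits[i]) {
            vt.elt_bits = legal_bits[i];
            break;
        }
    }
    return vt;
}
```

Under these assumptions <4 x i1> maps to <4 x i8> and <4 x i128> maps to <8 x i64>; the alternative of going straight to <4 x i32> would need extra, target-specific knowledge that this regular rule does not have.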