search for: cmpltps

Displaying 6 results from an estimated 6 matches for "cmpltps".

2008 Jun 17
2
[LLVMdev] VFCmp failing when unordered or UnsafeFPMath on x86
...39;t work per element. Say we're trying to vectorize the following C++ code: if(v[0] < 0) v[0] += 1.0f; if(v[1] < 0) v[1] += 1.0f; if(v[2] < 0) v[2] += 1.0f; if(v[3] < 0) v[3] += 1.0f; With SSE assembly this would be as simple as: movaps xmm1, xmm0 // v in xmm0 cmpltps xmm1, zero // zero = {0.0f, 0.0f, 0.0f, 0.0f} andps xmm1, one // one = {1.0f, 1.0f, 1.0f, 1.0f} addps xmm0, xmm1 With the current definition of VFCmp this seems hard if not impossible to achieve. Vector compare instructions that return all 1's or all 0's per element...
2012 Sep 05
0
[LLVMdev] branch on vector compare?
Am 05.09.2012 00:24, schrieb Stephen: > Roland Scheidegger <sroland <at> vmware.com> writes: >> This looks quite similar to something I filed a bug on (12312). Michael >> Liao submitted fixes for this, so I think >> if you change it to >> %16 = fcmp ogt <4 x float> %15, %cr >> %17 = sext <4 x i1> %16 to <4 x i32> >> %18 =
2012 Sep 04
2
[LLVMdev] branch on vector compare?
Roland Scheidegger <sroland <at> vmware.com> writes: > This looks quite similar to something I filed a bug on (12312). Michael > Liao submitted fixes for this, so I think > if you change it to > %16 = fcmp ogt <4 x float> %15, %cr > %17 = sext <4 x i1> %16 to <4 x i32> > %18 = bitcast <4 x i32> %17 to i128 > %19 = icmp ne i128 %18, 0
2008 Jun 16
0
[LLVMdev] VFCmp failing when unordered or UnsafeFPMath on x86
On Jun 13, 2008, at 12:27 AM, Nicolas Capens wrote: > Hi all, > > When trying to generate a VFCmp instruction when UnsafeFPMath is set > to true I get an assert “Unexpected CondCode” on my x86 system. This > also happens with UnsafeFPMath set to false and using an unordered > compare. Could someone look into this? > > While I’m at it, is there any reason why only the
2013 Feb 26
2
[LLVMdev] passing vector of booleans to functions
...cmp_add(<4 x float> %a, <4 x float> %b) { entry: %cmp = fcmp olt <4 x float> %a, %b %add = fadd <4 x float> %a, %b %sel = select <4 x i1> %cmp, <4 x float> %add, <4 x float> %a ret <4 x float> %sel } I will get (on SSE): movaps %xmm0, %xmm2 cmpltps %xmm1, %xmm0 addps %xmm2, %xmm1 blendvps %xmm1, %xmm2 movaps %xmm2, %xmm0 ret great :) But now, let us try to pass a mask to a function. define <4 x float> @masked_add_1(<4 x i1> %mask, <4 x float> %a, <4 x float> %b) { entry: %add = fadd <4 x float> %a, %b %...
2008 Jun 13
6
[LLVMdev] VFCmp failing when unordered or UnsafeFPMath on x86
Hi all, When trying to generate a VFCmp instruction when UnsafeFPMath is set to true I get an assert "Unexpected CondCode" on my x86 system. This also happens with UnsafeFPMath set to false and using an unordered compare. Could someone look into this? While I'm at it, is there any reason why only the most significant bit of the return value of VFCmp is defined (according to