Displaying 6 results from an estimated 6 matches for "cmpltps".
2008 Jun 17
2
[LLVMdev] VFCmp failing when unordered or UnsafeFPMath on x86
...'t work per element.
Say we're trying to vectorize the following C++ code:
if(v[0] < 0) v[0] += 1.0f;
if(v[1] < 0) v[1] += 1.0f;
if(v[2] < 0) v[2] += 1.0f;
if(v[3] < 0) v[3] += 1.0f;
With SSE assembly this would be as simple as:
movaps xmm1, xmm0 // v in xmm0
cmpltps xmm1, zero // zero = {0.0f, 0.0f, 0.0f, 0.0f}
andps xmm1, one // one = {1.0f, 1.0f, 1.0f, 1.0f}
addps xmm0, xmm1
With the current definition of VFCmp this seems hard if not impossible to
achieve. Vector compare instructions that return all 1's or all 0's per
element...
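The mask trick in the SSE listing above can be sketched in portable C++ for readers who want to step through it lane by lane. This is only an illustration under stated assumptions: the names `Vec4`, `lane_mask_lt`, `to_bits`, `from_bits` and `add_one_where_negative` are hypothetical, and a scalar loop emulates what cmpltps, andps and addps do on each of the four lanes.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical 4-lane float vector standing in for an XMM register.
using Vec4 = std::array<float, 4>;

// Emulates one lane of cmpltps: all ones if a < b, otherwise all zeros.
inline std::uint32_t lane_mask_lt(float a, float b) {
    return (a < b) ? 0xFFFFFFFFu : 0u;
}

// Reinterpret float bits as an integer and back (memcpy avoids aliasing issues).
inline std::uint32_t to_bits(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

inline float from_bits(std::uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Branch-free form of: if (v[i] < 0) v[i] += 1.0f;
inline Vec4 add_one_where_negative(Vec4 v) {
    for (float& x : v) {
        std::uint32_t mask = lane_mask_lt(x, 0.0f);      // cmpltps xmm1, zero
        float addend = from_bits(mask & to_bits(1.0f));  // andps xmm1, one
        x += addend;                                     // addps xmm0, xmm1
    }
    return v;
}
```

For example, `add_one_where_negative(Vec4{-0.5f, 2.0f, -3.0f, 0.0f})` adds 1.0f only to the two negative lanes. The AND step works only because the compare yields a full-width mask per element, which is the property the thread is asking for.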
2012 Sep 05
0
[LLVMdev] branch on vector compare?
On 05.09.2012 00:24, Stephen wrote:
> Roland Scheidegger <sroland <at> vmware.com> writes:
>> This looks quite similar to something I filed a bug on (12312). Michael
>> Liao submitted fixes for this, so I think
>> if you change it to
>> %16 = fcmp ogt <4 x float> %15, %cr
>> %17 = sext <4 x i1> %16 to <4 x i32>
>> %18 =
2012 Sep 04
2
[LLVMdev] branch on vector compare?
Roland Scheidegger <sroland <at> vmware.com> writes:
> This looks quite similar to something I filed a bug on (12312). Michael
> Liao submitted fixes for this, so I think
> if you change it to
> %16 = fcmp ogt <4 x float> %15, %cr
> %17 = sext <4 x i1> %16 to <4 x i32>
> %18 = bitcast <4 x i32> %17 to i128
> %19 = icmp ne i128 %18, 0
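The quoted IR sequence (fcmp ogt, sext, bitcast to i128, icmp ne 0) amounts to an "is any lane true" test. A rough portable C++ sketch of that meaning follows; `Vec4`, `Mask4`, `cmp_ogt` and `any_lane_set` are hypothetical names chosen for illustration, and the code says nothing about which instructions LLVM actually selects.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<float, 4>;           // stands in for <4 x float>
using Mask4 = std::array<std::uint32_t, 4>;  // stands in for <4 x i32>

// fcmp ogt followed by sext: ordered greater-than, widened to all ones or zero.
// The C++ '>' operator is false when either operand is NaN, matching "ordered".
inline Mask4 cmp_ogt(const Vec4& a, const Vec4& b) {
    Mask4 m{};
    for (std::size_t i = 0; i < 4; ++i) {
        m[i] = (a[i] > b[i]) ? 0xFFFFFFFFu : 0u;
    }
    return m;
}

// bitcast to i128 followed by icmp ne 0: true if any bit of any lane is set.
inline bool any_lane_set(const Mask4& m) {
    std::uint32_t acc = 0;
    for (std::uint32_t lane : m) {
        acc |= lane;
    }
    return acc != 0;
}
```

A branch can then be written as `if (any_lane_set(cmp_ogt(x, limit))) { ... }`, which is taken when at least one element of `x` exceeds the corresponding element of `limit`.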
2008 Jun 16
0
[LLVMdev] VFCmp failing when unordered or UnsafeFPMath on x86
On Jun 13, 2008, at 12:27 AM, Nicolas Capens wrote:
> Hi all,
>
> When trying to generate a VFCmp instruction when UnsafeFPMath is set
> to true I get an assert “Unexpected CondCode” on my x86 system. This
> also happens with UnsafeFPMath set to false and using an unordered
> compare. Could someone look into this?
>
> While I’m at it, is there any reason why only the
2013 Feb 26
2
[LLVMdev] passing vector of booleans to functions
...cmp_add(<4 x float> %a, <4 x float> %b) {
entry:
%cmp = fcmp olt <4 x float> %a, %b
%add = fadd <4 x float> %a, %b
%sel = select <4 x i1> %cmp, <4 x float> %add, <4 x float> %a
ret <4 x float> %sel
}
I will get (on SSE):
movaps %xmm0, %xmm2
cmpltps %xmm1, %xmm0
addps %xmm2, %xmm1
blendvps %xmm1, %xmm2
movaps %xmm2, %xmm0
ret
great :)
But now, let us try to pass a mask to a function.
define <4 x float> @masked_add_1(<4 x i1> %mask, <4 x float> %a, <4 x float> %b) {
entry:
%add = fadd <4 x float> %a, %b
%...
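For reference, the per-lane meaning of the first function in this snippet (fcmp olt, fadd, select) can be written as a scalar C++ loop. This is a sketch only: `Vec4` and `select_add` are hypothetical names, and the truncated `masked_add_1` body is deliberately not reconstructed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec4 = std::array<float, 4>;  // stands in for <4 x float>

// Per lane: cmp = a < b (ordered), add = a + b, result = cmp ? add : a.
inline Vec4 select_add(const Vec4& a, const Vec4& b) {
    Vec4 out{};
    for (std::size_t i = 0; i < 4; ++i) {
        bool cmp = a[i] < b[i];     // %cmp = fcmp olt
        float add = a[i] + b[i];    // %add = fadd
        out[i] = cmp ? add : a[i];  // %sel = select
    }
    return out;
}
```

Here the `<4 x i1>` compare result exists only inside the function, so the backend is free to keep it as a full-width register mask. The question raised in the thread is what happens when such a mask has to cross a function boundary.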
2008 Jun 13
6
[LLVMdev] VFCmp failing when unordered or UnsafeFPMath on x86
Hi all,
When trying to generate a VFCmp instruction when UnsafeFPMath is set to true
I get an assert "Unexpected CondCode" on my x86 system. This also happens
with UnsafeFPMath set to false and using an unordered compare. Could someone
look into this?
While I'm at it, is there any reason why only the most significant bit of
the return value of VFCmp is defined (according to