Displaying 4 results from an estimated 4 matches for "vcmpps".
2011 Jun 01
4
[LLVMdev] AVX Status?
...body (e.g. at Cray?) still actively working on it?
I have tried both LLVM 2.9 final and the latest trunk, and it seems like
some trivial stuff is already working and produces nice code for code
using <8 x float>.
Unfortunately, the backend gets confused about mask code as e.g.
produced by VCMPPS together with mask operations (which LLVM requires to
work on <8 x i32> atm) and corresponding bitcasts.
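To make the mask pattern described above concrete, here is a minimal scalar sketch in Python. It is not LLVM or backend code. It assumes the usual VCMPPS convention that a true lane is all ones and a false lane is zero, and it stands in for the IR bitcasts with `struct` reinterpretation. All function names are hypothetical and chosen only for illustration.

```python
import struct

ALL_ONES = 0xFFFFFFFF  # a "true" lane, assuming an all-ones compare result


def float_to_bits(x):
    """Reinterpret a float32 as a 32-bit integer (scalar stand-in for a bitcast)."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(u):
    """Reinterpret a 32-bit integer as a float32 (the reverse bitcast)."""
    return struct.unpack("<f", struct.pack("<I", u))[0]


def cmp_lt_mask(a, b):
    """Lane-wise a < b, producing one 32-bit integer mask per lane."""
    return [ALL_ONES if x < y else 0 for x, y in zip(a, b)]


def and_lanes(m1, m2):
    """Lane-wise bitwise AND of two integer masks."""
    return [x & y for x, y in zip(m1, m2)]


def apply_mask(values, mask):
    """AND the bit pattern of each float lane with its mask lane."""
    return [bits_to_float(float_to_bits(v) & m) for v, m in zip(values, mask)]


a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
m = [ALL_ONES, 0, ALL_ONES, 0, ALL_ONES, 0, ALL_ONES, 0]  # illustrative input mask

cmp = cmp_lt_mask(a, b)        # true in lanes 0..3
combined = and_lanes(cmp, m)   # integer mask operation between the two bitcasts
result = apply_mask(a, combined)
print(result)
```

The point of the sketch is only the shape of the data flow: a float compare, an integer mask operation, and a reinterpretation back to float lanes.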
Consider these two examples:
define <8 x float> @test1(<8 x float> %a, <8 x float> %b, <8 x i32> %m)
nounwind readnone {
entry:
%cmp = tail call <8 x floa...
2011 Jun 02
0
[LLVMdev] AVX Status?
...atched in tablegen files only by
extending the 128-bit PatFrags and PatLeafs to their 256-bit
counterparts should work, but besides that (which is where the
interesting stuff happens) there's no support yet!
> Unfortunately, the backend gets confused about mask code as e.g.
> produced by VCMPPS together with mask operations (which LLVM requires to
> work on <8 x i32> atm) and corresponding bitcasts.
>
> Consider these two examples:
>
> define <8 x float> @test1(<8 x float> %a, <8 x float> %b, <8 x i32> %m)
> nounwind readnone {
> entry:
>...
2011 Jun 03
1
[LLVMdev] AVX Status?
...ly by
> extending the 128-bit PatFrags and PatLeafs to their 256-bit
> counterparts should work, but besides that (which is where the
> interesting stuff happens) there's no support yet!
>
>> Unfortunately, the backend gets confused about mask code as e.g.
>> produced by VCMPPS together with mask operations (which LLVM requires to
>> work on <8 x i32> atm) and corresponding bitcasts.
>>
>> Consider these two examples:
>>
>> define <8 x float> @test1(<8 x float> %a, <8 x float> %b, <8 x i32> %m)
>> nounwind...
2011 Jun 03
2
[LLVMdev] AVX Status?
...> %res
>> }
>
> That would be nice indeed
Some lowering code would be needed to convert from i1 masks to i8 masks
(the so-called packed vs. sparse mask issue). I don't think I've added
anything to do this as our vectorizer doesn't generate code this way.
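One possible reading of the mask conversion mentioned above, sketched in Python rather than as actual lowering code: a mask with one bit per lane is widened so that every lane carries its own mask element, and a select then looks at the top bit of each element. The function names and the `lane_bits` parameter are hypothetical. The default width of 32 is an assumption made to match <8 x i32>, whereas the message itself speaks of i8 masks.

```python
def expand_packed_mask(bits, lanes=8, lane_bits=32):
    """Widen a one-bit-per-lane mask: bit i of `bits` becomes an all-ones or zero lane i."""
    all_ones = (1 << lane_bits) - 1
    return [all_ones if (bits >> i) & 1 else 0 for i in range(lanes)]


def pack_lane_mask(mask, lane_bits=32):
    """Inverse direction: collect the top bit of every lane into a single integer."""
    packed = 0
    for i, lane in enumerate(mask):
        packed |= ((lane >> (lane_bits - 1)) & 1) << i
    return packed


def blend(mask, if_set, if_clear, lane_bits=32):
    """Per-lane select keyed on the top bit of each mask lane."""
    top = lane_bits - 1
    return [t if (m >> top) & 1 else f for m, t, f in zip(mask, if_set, if_clear)]


packed = 0b00001111                      # lanes 0..3 selected
wide = expand_packed_mask(packed)        # eight 32-bit mask lanes
wide8 = expand_packed_mask(packed, lane_bits=8)  # same mask with 8-bit lanes

a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
selected = blend(wide, a, b)
print(selected)
```

Keying the select on the top bit of each lane is a design choice of this sketch. It keeps `pack_lane_mask` and `blend` consistent with each other, so a round trip through both representations preserves the selection.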
>> -> VCMPPS, VANDPS, BLENDVPS
>>
>> Nadav Rotem sent around a patch a few weeks ago in which he implemented
>> codegen for the select for SSE, unfortunately I did not have time to
>> look at it in more depth so far.
>>
>> Can anybody comment on the current status of AVX?
>...