search for: v8f32

Displaying 20 results from an estimated 48 matches for "v8f32".

2009 Dec 02
5
[LLVMdev] Selecting Vector Shuffle of Different Types
...28 : avx_fp_extract_vector_osta_node_mri_256<0x19, MRMDestReg, MRMDestMem, "extractf128", undef, X86f32, X86i32i8, // rr [(set VR128:$dst, (v4f32 (vector_shuffle (v8f32 undef), (v8f32 VR256:$src1), VEXTRACTF128_shuffle_mask:$src2)))]>; (This is simplified for the sake of exposition but this gets the idea across). TableGen reports a type contradiction: VEXTRACTF128_256mri: (st:isVoid (vector_shuffle:v4f32 (undef:v8f32), V...
2009 Jun 17
0
[LLVMdev] Regular Expressions
...or_osta_vintrinsic_rmi_rrmi<0x0C, > i32i8imm, "blendps", "blendps">; Here's another option: defm BLENDPS : sse41_avx_fp_binary_vector_osta_vintrinsic_rmi_rrmi<0x0C, i32i8imm, "blendps", "blendps", v4f32, v8f32>; This is somewhere between the first and second options. It's not as convenient as the second but is more intuitive than the first. Still, looking at some random individual instruction, it wouldn't be immediately clear to me what those multiple types mean. I might think they're...
2009 Dec 03
0
[LLVMdev] Selecting Vector Shuffle of Different Types
...t_vector_osta_node_mri_256<0x19, MRMDestReg, >                      MRMDestMem, "extractf128", undef, X86f32, X86i32i8, >                   // rr >                   [(set VR128:$dst, >                         (v4f32 (vector_shuffle >                                     (v8f32 undef), (v8f32 VR256:$src1), >                                     VEXTRACTF128_shuffle_mask:$src2)))]>; > > (This is simplified for the sake of exposition but this gets the idea across). > > TableGen reports a type contradiction: > > VEXTRACTF128_256mri:    (st:isVoid (vecto...
2009 Dec 03
2
[LLVMdev] Duplicate Label in Generates ISel
I've got the following problem in the X86 selector generated by TableGen: llvm/lib/Target/X86/X86GenDAGISel.inc:91821: error: duplicate case value llvm/lib/Target/X86/X86GenDAGISel.inc:91442: error: previously used here This seems to happen because of a pattern I added for VEXTRACTF128 which uses extract_subreg: [(set DSTREGCLASS:$dst, (DSTTYPE (extract_subreg
2009 Dec 03
0
[LLVMdev] Duplicate Label in Generates ISel
...x86_subreg_128bit)))], > > def x86_subreg_128bit : PatLeaf<(i32 1)>; Whoops, I forgot to fill in types: (outs VR128:$dst), (ins VR129:$src1, i32i8imm:$src2) [(set DSTREGCLASS:$dst, (v4f32 (extract_subreg (vector_shuffle (v8f32 undef), (v8f32 SRCREGCLASS:$src1), VEXTRACTF128_shuffle_mask:$src2), x86_subreg_128bit)))], -Dave
2009 Dec 03
1
[LLVMdev] Duplicate Label in Generates ISel
On Thursday 03 December 2009 13:43, David Greene wrote: > Whoops, I forgot to fill in types: > > (outs VR128:$dst), (ins VR129:$src1, i32i8imm:$src2) > > [(set DSTREGCLASS:$dst, > (v4f32 (extract_subreg > (vector_shuffle > (v8f32 undef), > (v8f32 SRCREGCLASS:$src1), > VEXTRACTF128_shuffle_mask:$src2), > x86_subreg_128bit)))], Well, it's conflicting with the hard-coded case statement from DAGISelEmitter.cpp. What's the best way to resolve this? I...
2009 Jun 17
3
[LLVMdev] Regular Expressions
...ass RegClass; string suffix; } Now we instantiate some concrete types: class X86_f32 : X86ValueType { let VT = f32; let RegClass = FR32; let suffix = "ss"; } class X86_v4f32 : X86ValueType { let VT = v4f32; let RegClass = VR128; let suffix = "ps"; } class X86_v8f32 : X86ValueType { let VT = v8f32; let RegClass = VR256; let suffix = "ps"; } Ok, you get the picture. Now let's look at how we would write instruction patterns: defm BLENDPS : sse41_avx_fp_binary_vector_osta_vintrinsic_rmi_rrmi<0x0C, i32i8imm, "ble...
2009 Jun 17
2
[LLVMdev] Regular Expressions
...x0C, >> i32i8imm, "blendps", "blendps">; > > Here's another option: > > defm BLENDPS : > sse41_avx_fp_binary_vector_osta_vintrinsic_rmi_rrmi<0x0C, > i32i8imm, "blendps", "blendps", v4f32, v8f32>; > > This is somewhere between the first and second options. It's not as > convenient as the second but is more intuitive than the first. Still, > looking at some random individual instruction, it wouldn't be > immediately > clear to me what those multiple types me...
2011 Jun 01
4
[LLVMdev] AVX Status?
...2> %mask to <8 x float> %res = tail call <8 x float> @llvm.x86.avx.blendv.ps.256(<8 x float> %a, <8 x float> %b, <8 x float> %blend_cond) nounwind readnone ret <8 x float> %res } llc (latest trunk) bails out with: LLVM ERROR: Cannot select: 0x2510540: v8f32 = bitcast 0x2532270 [ID=16] 0x2532270: v4i64 = and 0x2532070, 0x2532170 [ID=15] 0x2532070: v4i64 = bitcast 0x2510740 [ID=14] 0x2510740: v8f32 = llvm.x86.avx.cmp.ps.256 0x2510640, 0x2511340, 0x2510f40, 0x2511140 [ORD=3] [ID=12] ... The same counts for or and xor where VXORPS etc. sh...
2015 Aug 31
2
MCRegisterClass mandatory vs preferred alignment?
...get.td: > > class RegisterClass<string namespace, list<ValueType> regTypes, int alignment, > dag regList, RegAltNameIndex idx = NoRegAltName> > > X86RegisterInfo.td: > > def VR256 : RegisterClass<"X86", [v32i8, v16i16, v8i32, v4i64, v8f32, v4f64], > 256, (sequence "YMM%u", 0, 15)>; > def VR256X : RegisterClass<"X86", [v32i8, v16i16, v8i32, v4i64, v8f32, v4f64], > 256, (sequence "YMM%u", 0, 31)>; > > Seems to be 256bits/32bytes...
2011 Jun 02
0
[LLVMdev] AVX Status?
...  %res = tail call <8 x float> @llvm.x86.avx.blendv.ps.256(<8 x float> > %a, <8 x float> %b, <8 x float> %blend_cond) nounwind readnone >   ret <8 x float> %res > } > > llc (latest trunk) bails out with: > > LLVM ERROR: Cannot select: 0x2510540: v8f32 = bitcast 0x2532270 [ID=16] >   0x2532270: v4i64 = and 0x2532070, 0x2532170 [ID=15] >     0x2532070: v4i64 = bitcast 0x2510740 [ID=14] >       0x2510740: v8f32 = llvm.x86.avx.cmp.ps.256 0x2510640, 0x2511340, > 0x2510f40, 0x2511140 [ORD=3] [ID=12] > ... > > The same counts fo...
2018 Sep 25
2
Unsafe floating point operation (FDiv & FRem) in LoopVectorizer
...at>, <8 x float>* %1, align 4, !tbaa !2, !alias.scope !6 %2 = fcmp ogt <8 x float> %wide.load, %broadcast.splat30 %3 = getelementptr inbounds float, float* %B, i64 %index %4 = bitcast float* %3 to <8 x float>* %wide.masked.load = call <8 x float> @llvm.masked.load.v8f32.p0v8f32(<8 x float>* %4, i32 4, <8 x i1> %2, <8 x float> undef), !tbaa !2, !alias.scope !9 %5 = fdiv <8 x float> %wide.masked.load, %wide.load %6 = getelementptr inbounds float, float* %A, i64 %index %7 = bitcast float* %6 to <8 x float>* call void @llvm.masked...
2016 Sep 03
4
llc error
Hi all, The attached LLVM assembly file fails to generate x86 code when compiled using llc. compilation command - ../llvm-build/bin/llc -filetype=asm -march=x86-64 -mcpu=core-avx2 ex4.ll The error message is, LLVM ERROR: Cannot select: t95: v8f32 = X86ISD::SUBV_BROADCAST t17 t17: v4f32,ch = load<LD16[%scevgep](tbaa=<0x4dbcd98>)> t0, t16, undef:i64 t16: i64 = add t2, Constant:i64<16> t2: i64,ch = CopyFromReg t0, Register:i64 %vreg5 t1: i64 = Register %vreg5 t15: i64 = Constant<16> t4: i64...
2015 Aug 31
3
MCRegisterClass mandatory vs preferred alignment?
Looking around today, it appears that TargetRegisterClass and MCRegisterClass include only a single alignment. This is documented as being the minimum legal alignment, but it appears to often be greater than this in practice. For instance, on x86 the alignment of %ymm0 is listed as 32, not 1. Does anyone know why this is? Additionally, where are these alignments actually defined? I
2014 Dec 13
2
[LLVMdev] Cannot split vector result of AVX intrinsic _mm256_rsqrt_ps
I'm getting this on LLVM trunk: SplitVectorResult #0: 0x27e6250: v8f32 = llvm.x86.avx.rsqrt.ps.256 0x2739310, 0x2739420 [ORD=16] [ID=0] LLVM ERROR: Do not know how to split the result of this operator! clang: error: linker command failed with exit code 1 (use -v to see invocation) Oddly, when I build the same code without -flto I don't see this issue. I see a si...
2012 Jul 26
2
[LLVMdev] Why is this assertion here?
...VT.getSimpleVT().SimpleTy)) & 3); assert(Action != Promote && "Can't promote condition code!"); return Action; } The first part of the assertion I can understand, but why is there an assertion that there are only 32 types? In TOT LLVM, if this code is called with v8f32, v2f64 or v4f64, this assert is triggered. Shouldn't the assert be: (unsigned)VT.getSimpleVT().SimpleTy < MVT::MAX_ALLOWED_VALUETYPE && or (unsigned)VT.getSimpleVT().SimpleTy < MVT::LAST_VECTOR_VALUETYPE && ? Thanks, Micah
2009 Jun 18
0
[LLVMdev] Regular Expressions
...understand what you're saying with "munging strings is > easier". However, I still don't understand why you can't pass down > some 'my_f32' instead of '"f32"' and have the defm pull out the right > fields from my_f32. The AVX type would be v8f32, the SSE type would > be v4f32, etc. > > More generally, I don't see how strings can be better in any > circumstance: in any case where you pass down a string, you can pass > down a def that has fields relating to how you would otherwise munge > the string, no? Ok, I've b...
2011 Aug 25
2
[LLVMdev] AVX spill alignment
Hey guys, Are spills/reloads of AVX registers using aligned stores/loads? I can't seem to find the code that aligns the stack slots to 32-bytes. Could someone point me in the right direction? Thanks, Cameron
2011 Sep 01
0
[LLVMdev] AVX spill alignment
...s using aligned stores/loads? Yes. > I can't > seem to find the code that aligns the stack slots to 32-bytes. Could > someone point me in the right direction? The register class has 256-bit spill alignment: def VR256 : RegisterClass<"X86", [v32i8, v16i16, v8i32, v4i64, v8f32, v4f64], 256, (sequence "YMM%u", 0, 15)> { let SubRegClasses = [(FR32 sub_ss), (FR64 sub_sd), (VR128 sub_xmm)]; } /jakob
2016 Sep 03
2
llc error
...ttached LLVM assembly file fails to generate x86 code when compiled >> using llc. >> >> compilation command - ../llvm-build/bin/llc -filetype=asm -march=x86-64 >> -mcpu=core-avx2 ex4.ll >> >> The error message is, >> >> LLVM ERROR: Cannot select: t95: v8f32 = X86ISD::SUBV_BROADCAST t17 >> t17: v4f32,ch = load<LD16[%scevgep](tbaa=<0x4dbcd98>)> t0, t16, >> undef:i64 >> t16: i64 = add t2, Constant:i64<16> >> t2: i64,ch = CopyFromReg t0, Register:i64 %vreg5 >> t1: i64 = Register %vreg5 >...