Displaying 5 results from an estimated 5 matches for "2xfloat".
2012 Feb 28
0
[LLVMdev] Alias in LLVM 3.0
Hi Richard,
> In LLVM 2.9 and LLVM 3.0, our front-end generates:
>
> @__shuffle_2f32_2u32 = alias weak <2 x i32> (<2 x i32>, <2 x i32>)* @4
>
> And the calls, before linking, look like:
>
> %call9 = call <2 x float> @__shuffle_2f32_2u32(<2 x float> %tmp7, <2 x i32> %tmp8) nounwind
I don't see how this is possible - it should be
2012 Feb 27
2
[LLVMdev] Alias in LLVM 3.0
We use aliases extensively in our library to support OpenCL code generation for both our CPUs and GPUs. During the transition to LLVM 3.0 with the new type system, we're seeing two problems. Both involve type conversions occurring across an alias.
In one case, one of the types is a pointer to an opaque type, and it ends up triggering an assert in the verifier where it is checking that argument types
2012 Feb 03
1
[LLVMdev] Vectorization: Next Steps
...ffles are very sensitive to the ability of the codegen to lower them. If a vectorizer generates shuffle instructions which are not handled properly by the manual lowering code, then the instruction is scalarized.
2. Instructions with mixed types - Instructions which operate on mixed types, such as 2xfloat->2xdouble, are usually scalarized by the type legalizer.
Nadav
-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Duncan Sands
Sent: Friday, February 03, 2012 10:50
To: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Vectorizatio...
2012 Feb 03
0
[LLVMdev] Vectorization: Next Steps
Hi Hal,
> As some of you may know, I committed my basic-block autovectorization
> pass a few days ago. I encourage anyone interested to try it out (pass
> -vectorize to opt or -mllvm -vectorize to clang) and provide feedback.
> Especially in combination with -unroll-allow-partial, I have observed
> some significant benchmark speedups, but I have also observed some
> significant
2012 Feb 03
8
[LLVMdev] Vectorization: Next Steps
As some of you may know, I committed my basic-block autovectorization
pass a few days ago. I encourage anyone interested to try it out (pass
-vectorize to opt or -mllvm -vectorize to clang) and provide feedback.
Especially in combination with -unroll-allow-partial, I have observed
some significant benchmark speedups, but I have also observed some
significant slowdowns. I would like to share my