Displaying 7 results from an estimated 7 matches for "2011update".
2013 Jan 28
0
[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer
...; Is this sufficient to implement #pragma ivdep in clang?
I'm not completely sure of this:
"Note: The proven dependencies that prevent vectorization are not ignored,
only assumed dependencies are ignored."
http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011Update/compiler_c/cref_cls/common/cppref_pragma_ivdep.htm
Thus, there's a slight difference. It cannot be used to disable dependency
checking altogether (and just blame a sloppy programmer if there actually
are dependencies), but it just converts the "unknown alias" to "no alias"....
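
The distinction between assumed and proven dependencies quoted above can be illustrated with a small sketch. The function names `scale_maybe_alias` and `prefix_sum` are hypothetical and do not come from the thread. The pragma appears only as a comment because its spelling differs between compilers.

```c
#include <stddef.h>

/* Assumed dependence: the compiler cannot tell whether dst and src overlap,
   so it has to assume they might. An ivdep-style hint lets it treat this
   "unknown alias" as "no alias". */
void scale_maybe_alias(float *dst, const float *src, size_t n) {
    /* #pragma ivdep   (Intel spelling, left as a comment on purpose) */
    for (size_t i = 0; i < n; ++i)
        dst[i] = 2.0f * src[i];
}

/* Proven dependence: iteration i reads the value written by iteration i-1.
   Per the quoted note, a hint of this kind does not override it. */
void prefix_sum(int *a, size_t n) {
    for (size_t i = 1; i < n; ++i)
        a[i] += a[i - 1];
}
```

In the first loop the only obstacle to vectorization is the unknown relationship between the two pointers. In the second loop the dependence is visible in the loop body itself, so it stays in force whatever hint is given.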
2013 Jan 28
5
[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer
----- Original Message -----
> From: "Nadav Rotem" <nrotem at apple.com>
> To: "Pekka Jääskeläinen" <pekka.jaaskelainen at tut.fi>
> Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Monday, January 28, 2013 10:45:36 AM
> Subject: Re: [LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer
>
> Hi Pekka,
2013 Jan 28
0
[LLVMdev] parallel loop awareness to the LoopVectorizer
...'m not completely sure of this:
> >
> > "Note: The proven dependencies that prevent vectorization are not
> > ignored,
> > only assumed dependencies are ignored."
> >
> >
> http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011Update/compiler_c/cref_cls/common/cppref_pragma_ivdep.htm
> >
> > Thus, there's a slight difference. It cannot be used to disable
> > dependency
> > checking altogether (and just blame a sloppy programmer if there
> > actually
> > are dependencies), but it just conv...
2013 Apr 09
1
[LLVMdev] inefficient code generation for 128-bit->256-bit typecast intrinsics
...or the _mm256_castsi128_si256 intrinsic explicitly states that "the upper bits of the
resulting vector are undefined" and that "this intrinsic does not introduce extra moves to the
generated code".
http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011Update/compiler_c/intref_cls/common/intref_avx_castsi128_si256.htm
Clang implements these typecast intrinsics differently. Is this intentional? I suspect that this was done to avoid a hardware penalty caused by partial register writes. But, isn't the overall cost of 2 additional instructions (vxor...
2013 May 09
1
[LLVMdev] Predicated Vector Operations
...ing question but I am not sure how it is related. In our example you can see the problem with a single thread. Both MIC and AVX[1] have masked stores operations and they have a different memory model.
Thanks,
Nadav
[1] http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011Update/compiler_c/intref_cls/common/intref_avx_maskstore_pd.htm
2013 May 09
0
[LLVMdev] Predicated Vector Operations
On Thu, May 9, 2013 at 1:09 AM, Nadav Rotem <nrotem at apple.com> wrote:
> On May 8, 2013, at 4:00 PM, Eric Christopher <echristo at gmail.com> wrote:
>
>
> Thinking that a masked store is conservatively a store of the full
> width of the store right?
>
>
> It depends on the optimization. Consider this example:
>
> masked_store(Val, Ptr, M)
> X =
2013 May 08
4
[LLVMdev] Predicated Vector Operations
On May 8, 2013, at 4:00 PM, Eric Christopher <echristo at gmail.com> wrote:
>
> Thinking that a masked store is conservatively a store of the full
> width of the store right?
It depends on the optimization. Consider this example:
masked_store(Val, Ptr, M)
X = masked_load(Ptr, M2)
If you assume that your store actually overwrites everything in that memory location then you
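
One way to see why the two masks matter is a scalar model of the example. This is a sketch under stated assumptions, not the semantics of any particular target: the 4-lane width, the `bool` mask arrays, and the choice to return 0 for masked-off load lanes are all assumptions made for illustration, and the function bodies are hypothetical stand-ins for the `masked_store` and `masked_load` named in the message.

```c
#include <stdbool.h>
#include <stddef.h>

#define LANES 4 /* assumed vector width, chosen only for illustration */

/* Hypothetical scalar model: only lanes whose mask bit is set are written. */
void masked_store(const int *val, int *ptr, const bool *m) {
    for (size_t i = 0; i < LANES; ++i)
        if (m[i])
            ptr[i] = val[i];
}

/* Hypothetical scalar model: masked-off lanes yield 0 (an assumption). */
void masked_load(int *out, const int *ptr, const bool *m) {
    for (size_t i = 0; i < LANES; ++i)
        out[i] = m[i] ? ptr[i] : 0;
}
```

With memory `{10, 20, 30, 40}`, `Val = {1, 2, 3, 4}`, `M = {1, 0, 1, 0}` and `M2 = {1, 1, 0, 0}`, the store leaves lane 1 of memory at 20, so the load sees 20 in that lane. Replacing the load with `Val` on the assumption that the store wrote the full width would put 2 there instead.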