thr3ads.net - search: "2011update"

Displaying 7 results from an estimated 7 matches for "2011update".

[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer

2013 Jan 28

... Is this sufficient to implement #pragma ivdep in clang? I'm not completely sure of this: "Note: The proven dependencies that prevent vectorization are not ignored, only assumed dependencies are ignored." http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011Update/compiler_c/cref_cls/common/cppref_pragma_ivdep.htm Thus, there's a slight difference. It cannot be used to disable dependency checking altogether (and just blame a sloppy programmer if there actually are dependencies), but it just converts the "unknown alias" to "no alias"....
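The distinction quoted in this result, between assumed and proven dependencies, can be illustrated with a minimal C sketch. The function names `scale` and `running_sum` are hypothetical and do not come from the thread; the pragma is guarded so that it is only seen by Intel's compiler, and the comments describe the behavior the quoted Intel note suggests rather than a guarantee about any particular compiler version.

```c
#include <stddef.h>

/* Assumed dependency: the compiler cannot prove that dst and src do not
 * overlap, so it must treat the aliasing as unknown. Per the quoted note,
 * ivdep lets it treat such unknown aliasing as "no alias". */
void scale(float *dst, const float *src, float k, size_t n)
{
#if defined(__INTEL_COMPILER)
#pragma ivdep
#endif
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i] * k;
}

/* Proven dependency: iteration i reads the value written by iteration
 * i - 1. Per the quoted note, ivdep would not cause this loop-carried
 * dependence to be ignored. */
void running_sum(float *a, size_t n)
{
    for (size_t i = 1; i < n; ++i)
        a[i] += a[i - 1];
}
```

Calling `scale` with overlapping `dst` and `src` under ivdep would be the programmer's responsibility, which is the "unknown alias" to "no alias" conversion the poster describes.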

[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer

2013 Jan 28

----- Original Message -----
> From: "Nadav Rotem" <nrotem at apple.com>
> To: "Pekka Jääskeläinen" <pekka.jaaskelainen at tut.fi>
> Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Monday, January 28, 2013 10:45:36 AM
> Subject: Re: [LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer
>
> Hi Pekka,

[LLVMdev] parallel loop awareness to the LoopVectorizer

2013 Jan 28

> > ...I'm not completely sure of this:
> > "Note: The proven dependencies that prevent vectorization are not ignored, only assumed dependencies are ignored."
> > http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011Update/compiler_c/cref_cls/common/cppref_pragma_ivdep.htm
> > Thus, there's a slight difference. It cannot be used to disable dependency checking altogether (and just blame a sloppy programmer if there actually are dependencies), but it just conv...

[LLVMdev] inefficient code generation for 128-bit->256-bit typecast intrinsics

2013 Apr 09

...or the _mm256_castsi128_si256 intrinsic explicitly states that "the upper bits of the resulting vector are undefined" and that "this intrinsic does not introduce extra moves to the generated code". http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011Update/compiler_c/intref_cls/common/intref_avx_castsi128_si256.htm Clang implements these typecast intrinsics differently. Is this intentional? I suspect that this was done to avoid a hardware penalty caused by partial register writes. But, isn't the overall cost of 2 additional instructions (vxor...

[LLVMdev] Predicated Vector Operations

2013 May 09

...ing question but I am not sure how it is related. In our example you can see the problem with a single thread. Both MIC and AVX[1] have masked store operations and they have a different memory model. Thanks, Nadav [1] http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011Update/compiler_c/intref_cls/common/intref_avx_maskstore_pd.htm

[LLVMdev] Predicated Vector Operations

2013 May 09

On Thu, May 9, 2013 at 1:09 AM, Nadav Rotem <nrotem at apple.com> wrote:
> On May 8, 2013, at 4:00 PM, Eric Christopher <echristo at gmail.com> wrote:
> > Thinking that a masked store is conservatively a store of the full width of the store right?
>
> It depends on the optimization. Consider this example:
>
> masked_store(Val, Ptr, M)
> X =

[LLVMdev] Predicated Vector Operations

2013 May 08

On May 8, 2013, at 4:00 PM, Eric Christopher <echristo at gmail.com> wrote:
> Thinking that a masked store is conservatively a store of the full width of the store right?
It depends on the optimization. Consider this example:
masked_store(Val, Ptr, M)
X = masked_load(Ptr, M2)
If you assume that your store actually overwrites everything in that memory location then you
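The store/load pair in this result can be modeled with a small scalar sketch in C. This is an illustration under stated assumptions, not the semantics proposed in the thread: the lane count `LANES`, the use of `int` lanes, and the choice to zero inactive lanes on load (as AVX mask loads do) are all assumptions, and the functions are plain C stand-ins rather than real intrinsics.

```c
#include <stddef.h>

enum { LANES = 4 }; /* hypothetical vector width for the model */

/* Scalar model of a masked store: only lanes whose mask entry is
 * nonzero are written; the other lanes keep their old memory contents. */
void masked_store(const int *val, int *ptr, const unsigned char *mask)
{
    for (size_t i = 0; i < LANES; ++i)
        if (mask[i])
            ptr[i] = val[i];
}

/* Scalar model of a masked load: active lanes read memory, inactive
 * lanes are set to zero (an assumption of this model). */
void masked_load(int *out, const int *ptr, const unsigned char *mask)
{
    for (size_t i = 0; i < LANES; ++i)
        out[i] = mask[i] ? ptr[i] : 0;
}
```

With memory `{10, 20, 30, 40}`, `Val = {1, 2, 3, 4}`, `M = {1, 0, 1, 0}` and `M2 = {1, 1, 0, 0}`, lane 1 of `X` holds the old value 20, not `Val[1]`. An optimization that treated the store as overwriting the full width and forwarded `Val` to the load would therefore produce a different result in that lane.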
