thr3ads.net - search: "intref

Displaying 4 results from an estimated 4 matches for "intref_cl".

Did you mean: intref_cls

[LLVMdev] inefficient code generation for 128-bit->256-bit typecast intrinsics

2013 Apr 09

[LLVMdev] inefficient code generation for 128-bit->256-bit typecast intrinsics

...8_si256 intrinsic explicitly states that "the upper bits of the resulting vector are undefined" and that "this intrinsic does not introduce extra moves to the generated code". http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011Update/compiler_c/intref_cls/common/intref_avx_castsi128_si256.htm Clang implements these typecast intrinsics differently. Is this intentional? I suspect that this was done to avoid a hardware penalty caused by partial register writes. But, isn't the overall cost of 2 additional instructions (vxor + vinsertf128) for *e...

[LLVMdev] Predicated Vector Operations

2013 May 09

[LLVMdev] Predicated Vector Operations

...not sure how it is related. In our example you can see the problem with a single thread. Both MIC and AVX[1] have masked stores operations and they have a different memory model. Thanks, Nadav [1] http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011Update/compiler_c/intref_cls/common/intref_avx_maskstore_pd.htm -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130508/7637d894/attachment.html>

[LLVMdev] Predicated Vector Operations

2013 May 09

[LLVMdev] Predicated Vector Operations

On Thu, May 9, 2013 at 1:09 AM, Nadav Rotem <nrotem at apple.com> wrote: > On May 8, 2013, at 4:00 PM, Eric Christopher <echristo at gmail.com> wrote: > > > Thinking that a masked store is conservatively a store of the full > width of the store right? > > > It depends on the optimization. Consider this example: > > masked_store(Val, Ptr , M) > X =

[LLVMdev] Predicated Vector Operations

2013 May 08

[LLVMdev] Predicated Vector Operations

On May 8, 2013, at 4:00 PM, Eric Christopher <echristo at gmail.com> wrote: > > Thinking that a masked store is conservatively a store of the full > width of the store right? It depends on the optimization. Consider this example: masked_store(Val, Ptr , M) X = masked_load(Ptr, M2) If you assume that your store actually overwrites everything in that memory location then you

search for: intref_cl