2013 Apr 09
[LLVMdev] inefficient code generation for 128-bit->256-bit typecast intrinsics
...explicitly states that "the upper bits of the resulting vector are undefined" and that "this intrinsic does not introduce extra moves to the generated code" (http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011Update/compiler_c/intref_cls/common/intref_avx_castsi128_si256.htm). Clang implements these typecast intrinsics differently. Is this intentional? I suspect this was done to avoid a hardware penalty caused by partial register writes. But isn't the overall cost of two additional instructions (vxor + vinsertf128) for *every* 128-bit->256-bit typecast...
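
For readers unfamiliar with the intrinsic in question, the contrast the post draws can be sketched in C roughly as follows. This is a minimal illustration, not code from the thread: it assumes GCC or Clang on x86-64, and the helper names (widen_by_cast, widen_with_zero_upper, check_widening) are hypothetical. The first helper uses the cast intrinsic, whose upper 128 bits are undefined; the second writes the zero-extension out explicitly, which corresponds approximately to the vxor + vinsertf128 pair mentioned in the post.

```c
#include <assert.h>
#include <immintrin.h>
#include <stdint.h>
#include <string.h>

/* Cast-style widening: only the low 128 bits of the 256-bit value are
   defined, so this helper reads back just the low half. */
__attribute__((target("avx")))
static void widen_by_cast(const int32_t in[4], int32_t out_low[4])
{
    __m128i lo   = _mm_loadu_si128((const __m128i *)in);
    __m256i wide = _mm256_castsi128_si256(lo);    /* upper 128 bits undefined */
    __m128i back = _mm256_castsi256_si128(wide);  /* low 128 bits unchanged */
    _mm_storeu_si128((__m128i *)out_low, back);
}

/* Explicit zero-extension: start from an all-zero 256-bit vector and insert
   the 128-bit value into the low half, so all eight lanes are defined. */
__attribute__((target("avx")))
static void widen_with_zero_upper(const int32_t in[4], int32_t out[8])
{
    __m128i lo   = _mm_loadu_si128((const __m128i *)in);
    __m256i wide = _mm256_insertf128_si256(_mm256_setzero_si256(), lo, 0);
    _mm256_storeu_si256((__m256i *)out, wide);
}

/* Hypothetical self-check. Returns 0 without checking anything if the CPU
   lacks AVX, and 1 after both helpers have been verified. */
static int check_widening(void)
{
    if (!__builtin_cpu_supports("avx"))
        return 0;

    const int32_t in[4] = {1, 2, 3, 4};
    int32_t low[4];
    int32_t full[8];

    widen_by_cast(in, low);
    assert(memcmp(low, in, sizeof in) == 0);   /* low half round-trips */

    widen_with_zero_upper(in, full);
    assert(memcmp(full, in, sizeof in) == 0);  /* low half preserved */
    for (int i = 4; i < 8; i++)
        assert(full[i] == 0);                  /* upper half is zero */

    return 1;
}
```

Calling check_widening() from a main function exercises both paths. Which instructions a given compiler version actually emits for either helper is not something this sketch can show; comparing the generated assembly would be needed for that.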