thr3ads.net - search: "_zgv_llvm_n4v

Displaying 2 results from an estimated 2 matches for "_zgv_llvm_n4v_llvm".

LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target

2020 Jul 17

LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target

...of log10. Running > `opt -vector-library=Accelerate -inject-tli-mappings` on the IR below will > add the following attribute to the llvm.log10 call-site, indicating that > there’s a <4 x float> version of log10 called vlog10f. > > { "vector-function-abi-variant"="_ZGV_LLVM_N4v_llvm.log10.f32(vlog10f)" } > > > To double-check, if running -inject-tli-mappings on your example does not > add the vector-function-abi-variant attribute for `pow`, the vectorisers > won’t know about them. If the vector-function-abi-variant attribute is > actually created, but th...

LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target

2020 Jul 16

LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target

So for us we use SLEEF to actually implement the libcalls (LLVM intrinsics) that LLVM by default would generate - and since SLEEF has highly optimal 8-wide pow, optimized for AVX and AVX2, we really want to use that. So we would not see 4/8 libcalls and instead see 1 call to something that lights up the ymm registers. I guess the problem then is that the default expectation is that pow would be

search for: _zgv_llvm_n4v_llvm