2019 Sep 05
ARM vectorized fp16 support
Hi, I'm trying to compile a half-precision program for ARM, but LLVM seems to fail to automatically generate fused multiply-add instructions for c += a * b. Am I doing something wrong? If not, is this a missing feature that will be supported later? (I know there are fp16 FMLA intrinsics, though.) Test programs and outputs: $ clang -O3 -march=armv8.2-a+fp16fml
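For readers following along, here is a minimal sketch of the kind of loop the question describes. It is not the poster's test program, which is cut off above. The names `half_t` and `fma_accumulate` are hypothetical, and the choice of `_Float16` (guarded by the ACLE macro `__ARM_FEATURE_FP16_SCALAR_ARITHMETIC`, with a `float` fallback so the sketch builds on other targets) is an assumption about how one might write it.

```c
#include <stddef.h>

/* Assumption: use _Float16 only when the compiler reports native fp16
   scalar arithmetic; otherwise fall back to float so the sketch still
   compiles on targets without fp16 support. */
#if defined(__ARM_FEATURE_FP16_SCALAR_ARITHMETIC)
typedef _Float16 half_t;
#else
typedef float half_t;
#endif

/* Hypothetical kernel: element-wise c[i] += a[i] * b[i].
   `restrict` tells the compiler the arrays do not alias, which makes
   the loop easier to auto-vectorize. */
void fma_accumulate(half_t *restrict c,
                    const half_t *restrict a,
                    const half_t *restrict b,
                    size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        c[i] += a[i] * b[i];
    }
}
```

Adding -S to the clang invocation emits assembly, which can then be checked for a vector fmla on half-precision lanes versus separate multiply and add instructions. Whether contraction into a fused operation happens may also depend on the compiler's floating-point contraction settings.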