thr3ads.net - search: "vfp32"

Displaying 2 results from an estimated 2 matches for "vfp32".

2019 Sep 05

ARM vectorized fp16 support

...erate fused-multiply-add instructions for c += a * b. I'm wondering whether I did something wrong; if not, is it a missing feature that will be supported later? (I know there are fp16 FMLA intrinsics, though.)

Test programs and outputs:

    $ clang -O3 -march=armv8.2-a+fp16fml -ffast-math -S -o- vfp32.c
    test_vfma_lane_f16:                     // @test_vfma_lane_f16
            fmla    v2.4s, v1.4s, v0.4s     // fp32 is GOOD
            mov     v0.16b, v2.16b
            ret
    $ cat vfp32.c
    #include <arm_neon.h>
    float32x4_t test_vfma_lane_f16(float32x4_t a, float32x4_t b, float...
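For readers without an AArch64 toolchain, the `fmla v2.4s, v1.4s, v0.4s` line in the output above can be read as a per-lane multiply-accumulate over four 32-bit floats: each lane of v2 gets the product of the matching lanes of v1 and v0 added to it. The following is only a portable sketch of that arithmetic, not the poster's truncated vfp32.c; the names `mla4_f32` and `LANES` are made up for illustration, and plain C like this does not by itself guarantee that a compiler emits a fused instruction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constant: the four 32-bit lanes of a ".4s" vector register. */
enum { LANES = 4 };

/* Illustrative helper (not from the original post).
 * Computes acc[i] += a[i] * b[i] for each lane, i.e. the "c += a * b"
 * pattern from the post. Whether the multiply and add are fused into one
 * instruction is decided by the compiler, the target, and flags such as
 * -ffast-math; the source only describes the arithmetic. */
void mla4_f32(float acc[LANES], const float a[LANES], const float b[LANES]) {
    assert(acc != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < LANES; ++i) {
        acc[i] += a[i] * b[i];
    }
}
```

Calling `mla4_f32` with an accumulator of {1, 2, 3, 4}, a = {1, 2, 3, 4} and b = {2, 2, 2, 2} leaves {3, 6, 9, 12} in the accumulator. These values are exactly representable, so the result is the same with or without fusion.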

2019 Sep 05

ARM vectorized fp16 support

...ions
> for c += a * b. I'm wondering whether I did something wrong, if not,
> is it a missing feature that will be supported later? (I know there're
> fp16 FMLA intrinsics though)
>
> Test programs and outputs,
>
> $ clang -O3 -march=armv8.2-a+fp16fml -ffast-math -S -o- vfp32.c
> test_vfma_lane_f16: // @test_vfma_lane_f16
> fmla v2.4s, v1.4s, v0.4s // fp32 is GOOD
> mov v0.16b, v2.16b
> ret
> $ cat vfp32.c
> #include <arm_neon.h>
> float32x4_t test_vfma_lane_f16(...
