search for: f16_to_fp

Displaying 5 results from an estimated 5 matches for "f16_to_fp".

2019 Dec 10
2
TypePromoteFloat loses intermediate rounding operations
Thanks Eli. I forgot to bring up the strict FP questions which I was working on when I found this. If we're in a strict FP function, do the fp_to_f16/f16_to_fp emitted by promoting load/store/bitcast need to be strict versions of fp_to_f16/f16_to_fp? And if so, where do we get the chain, especially for the bitcast case, which isn't a chained node? ~Craig On Tue, Dec 10, 2019 at 3:18 PM Eli Friedman <efriedma at quicinc.com> wrote: > We cou...
2019 Dec 10
2
TypePromoteFloat loses intermediate rounding operations
...te. So we lost the intermediate rounding between the 2 adds that was in the original clang IR. I believe this occurs because the TypePromoteFloat legalization converts all arithmetic operations to their f32 equivalents, but does not place conversions to/from half around them. Instead, fp_to_f16 and f16_to_fp nodes are only generated at loads, stores, bitcasts, and probably a few other places: basically, only the places where the 16-bit size is needed to make the operation possible. What we have is an implementation very similar to promoting integers, but that doesn't work for FP because we...
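The lost rounding described in this snippet can be reproduced outside the compiler. The following is a minimal sketch, not LLVM code: it emulates half and single precision with Python's `struct` module, and the helper names (`round_to_half`, `round_to_single`, `add_half_each_step`, `add_promoted`) and the input values are assumptions chosen for illustration. It compares two adds that round to half after each step against two adds done in f32 with a single conversion to half at the end.

```python
import struct


def round_to_half(x: float) -> float:
    """Round x to the nearest IEEE 754 binary16 value (ties to even)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def round_to_single(x: float) -> float:
    """Round x to the nearest IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def add_half_each_step(a: float, b: float, c: float) -> float:
    # Hypothetical model of the original IR: each add produces a half,
    # so the intermediate sum is rounded to half before the second add.
    return round_to_half(round_to_half(a + b) + c)


def add_promoted(a: float, b: float, c: float) -> float:
    # Hypothetical model of the promoted code: both adds happen in f32,
    # and only the final result is converted back to half.
    t = round_to_single(a + b)
    return round_to_half(round_to_single(t + c))


# Illustrative inputs: halves in [2048, 4096) are spaced 2 apart,
# so 2049 is not representable in half and must be rounded.
a, b, c = 2048.0, 1.0, 1.0
per_step = add_half_each_step(a, b, c)
promoted = add_promoted(a, b, c)
print(per_step, promoted)
```

With these inputs the two strategies disagree: rounding after each add discards the `1.0` twice, while the promoted version keeps both increments until the final conversion.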
2018 Jan 18
1
[RFC] Half-Precision Support in the Arm Backends
...so: def VCVTBHS: ASuI<0b11101, 0b11, 0b0010, 0b01, 0, (outs SPR:$Sd), (ins SPR:$Sm), IIC_fpCVTSH, "vcvtb", ".f32.f16\t$Sd, $Sm", []>, Requires<[HasFP16]>, Sched<[WriteFPCVT]>; def : FP16Pat<(f16_to_fp GPR:$a), (VCVTBHS (COPY_TO_REGCLASS GPR:$a, SPR))>; def : FullFP16Pat<(f32 (fpextend HPR:$Sm)), (VCVTBHS (COPY_TO_REGCLASS HPR:$Sm, SPR))>; I'm not sure of the COPY_TO_REGCLASS semantics, but I would (dangerously) assume that when it comes to cop...
2018 Jan 18
0
[RFC] Half-Precision Support in the Arm Backends
...0, s0 vmov r0, s0 bx lr when we don't have the Armv8.2-A FP16 instructions available, and thus only have the conversion instructions. The problem is in the conversion rules, some rewrite rules to be more specific, and I think this is one of the culprits: def : Pat<(f16_to_fp GPR:$a), (VCVTBHS (COPY_TO_REGCLASS GPR:$a, SPR))>; This rewrite rule is supposed to first move the GPR reg into an S-register: vmov s0, r0 and then do the conversion: vcvtb.f32.f16 s0, s0 This rewrite rule gets triggered because the ISEL DAG has indeed this funny no...
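In the pattern quoted above, the operand of f16_to_fp lives in a general-purpose register, which suggests the node takes the half value as raw integer bits and produces a float. As a rough illustration of that reading, and only as a sketch, the conversion can be modelled in Python with the `struct` module. The function names `f16_bits_to_float` and `float_to_f16_bits` are hypothetical and do not correspond to any LLVM API.

```python
import struct


def f16_bits_to_float(bits: int) -> float:
    """Reinterpret the low 16 bits of an integer as an IEEE 754 half
    and widen it to a float (a model of an f16_to_fp-style conversion)."""
    return struct.unpack("<e", struct.pack("<H", bits & 0xFFFF))[0]


def float_to_f16_bits(x: float) -> int:
    """Round a float to half precision and return its 16-bit encoding
    as an integer (a model of an fp_to_f16-style conversion)."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


# 0x3C00 encodes 1.0 in half precision; 0xC000 encodes -2.0.
print(f16_bits_to_float(0x3C00), f16_bits_to_float(0xC000))
print(hex(float_to_f16_bits(2048.0)))
```

The integer-in, float-out shape is what forces the extra register move in the quoted rule: the bits start out on the integer side and have to be copied to a floating-point register before the convert instruction can run.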
2017 Dec 06
2
[RFC] Half-Precision Support in the Arm Backends
Thanks a lot for the suggestions! I will look into using vld1/vst1, sounds good. I am custom lowering the bitcasts, that's now the only place where FP_TO_FP16 and FP16_TO_FP nodes are created to avoid inefficient code generation. I will double check if I can't achieve the same without using these nodes (because I really would like to get completely rid of them). Cheers, Sjoerd.