Displaying 3 results from an estimated 3 matches for "vcvtbhs".
2018 Jan 18
1
[RFC] Half-Precision Support in the Arm Backends
Hi Sjoerd,
For ISel, I think having a separate register class will give you less headache. I wondering if you could get away with not touching the instructions descriptions at all, instead defining external pattens for the FullFP16 case, like so:
def VCVTBHS: ASuI<0b11101, 0b11, 0b0010, 0b01, 0, (outs SPR:$Sd), (ins SPR:$Sm),
IIC_fpCVTSH, "vcvtb", ".f32.f16\t$Sd, $Sm",
[]>,
Requires<[HasFP16]>,
Sched<[WriteFPCVT]>;
def : FP16Pat<(f16_to_fp GPR:$a),...
2018 Jan 18
0
[RFC] Half-Precision Support in the Arm Backends
...t$Sd, $Sn, $Sm",
[(set HPR:$Sd, (fadd HPR:$Sn, HPR:$Sm))]>, // <~~~ new match rule using HPR
This is straightforward business so far, but I already learned the hard way
that the conversion are the tricky ones, so I repeat this for an f16 -> f32
upconvert:
def VCVTBHS: ASuI<0b11101, 0b11, 0b0010, 0b01, 0, (outs SPR:$Sd), (ins HPR:$Sm),
IIC_fpCVTSH, "vcvtb", ".f32.f16\t$Sd, $Sm",
[(set SPR:$Sd, (fpextend HPR:$Sm))]>, // <~~~~ new match rule using HPR and SPR
Requires<[HasFP16]...
2017 Dec 06
2
[RFC] Half-Precision Support in the Arm Backends
Thanks a lot for the suggestions! I will look into using vld1/vst1, sounds good.
I am custom lowering the bitcasts, that's now the only place where FP_TO_FP16
and FP16_TO_FP nodes are created to avoid inefficient code generation. I will
double check if I can't achieve the same without using these nodes (because I
really would like to get completely rid of them).
Cheers,
Sjoerd.