2018 Jan 18
[RFC] Half-Precision Support in the Arm Backends
...nd yes, to make it even funnier, this node has an i32 operand, and that's
because we do the half-float load with an integer load instruction.
And after this rewrite, we end up with this DAG:
t0: ch = EntryToken
t2: i32,ch = CopyFromReg t0, Register:i32 %0
t16: i32,ch = t2LDRHi12<Mem:LD2[%addr]> t2, TargetConstant:i32<0>, TargetConstant:i32<14>, Register:i32 %noreg, t0
t20: f16 = COPY_TO_REGCLASS t16, TargetConstant:i32<1> <~~~~~~~~~~~~~ PROBLEM HERE
t12: f32 = VCVTBHS t20, TargetConstant:i32<14>, Register:i32 %noreg
t7: i3...
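To make the type mismatch in the DAG above concrete, here is a rough Python model of the data flow. It is not LLVM code. The helper names `ldrh` and `half_bits_to_float` are hypothetical and only stand in for the integer halfword load and the half-to-single conversion. The sketch assumes a little-endian memory layout. It shows that the integer load leaves the raw half-precision bit pattern in a 32-bit integer value, which has to be reinterpreted as a half before it can be widened.

```python
import struct


def ldrh(memory: bytes, offset: int = 0) -> int:
    """Model an integer halfword load: read 16 bits and zero-extend to 32 bits."""
    (value,) = struct.unpack_from("<H", memory, offset)
    return value & 0xFFFFFFFF


def half_bits_to_float(reg: int) -> float:
    """Reinterpret the low 16 bits of an integer value as an IEEE half, then widen."""
    (value,) = struct.unpack("<e", struct.pack("<H", reg & 0xFFFF))
    return value


# A half-precision 1.5 stored in memory (little-endian layout is an assumption).
memory = struct.pack("<e", 1.5)

reg = ldrh(memory)              # integer-typed value holding the half's raw bits
print(hex(reg))                 # 0x3e00
print(half_bits_to_float(reg))  # 1.5
```

In this model the reinterpretation step is explicit, whereas in the DAG the i32 result of the load is handed to a node that expects an f16.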
2017 Dec 06
[RFC] Half-Precision Support in the Arm Backends
Thanks a lot for the suggestions! I will look into using vld1/vst1; that sounds good.
I am custom lowering the bitcasts, which is now the only place where FP_TO_FP16
and FP16_TO_FP nodes are created, to avoid inefficient code generation. I will
double-check whether I can achieve the same without using these nodes (because I
would really like to get rid of them completely).
Cheers,
Sjoerd.