Displaying 20 results from an estimated 1000 matches similar to: "[RFC] Half-Precision Support in the Arm Backends"
2018 Jan 18
0
[RFC] Half-Precision Support in the Arm Backends
I would like to revive this thread, as I am struggling a lot with the FP16
implementation in the ARM backend. My implementation in
https://reviews.llvm.org/D38315 is finished (except one case), but a more
robust alternative implementation was suggested. One can indeed argue that my
current implementation is a bit fragile, because it involves manually patching
up the isel dags for a few cases. The
2018 Jan 18
1
[RFC] Half-Precision Support in the Arm Backends
Hi Sjoerd,
For ISel, I think having a separate register class will give you fewer headaches. I am wondering if you could get away with not touching the instruction descriptions at all, instead defining external patterns for the FullFP16 case, like so:
def VCVTBHS: ASuI<0b11101, 0b11, 0b0010, 0b01, 0, (outs SPR:$Sd), (ins SPR:$Sm),
IIC_fpCVTSH, "vcvtb",
2017 Dec 06
2
[RFC] Half-Precision Support in the Arm Backends
Thanks a lot for the suggestions! I will look into using vld1/vst1, sounds good.
I am custom lowering the bitcasts, that's now the only place where FP_TO_FP16
and FP16_TO_FP nodes are created to avoid inefficient code generation. I will
double check if I can't achieve the same without using these nodes (because I
really would like to get completely rid of them).
Cheers,
Sjoerd.
2019 Dec 10
2
TypePromoteFloat loses intermediate rounding operations
Thanks Eli.
I forgot to bring up the strict FP questions which I was working on when I
found this. If we're in a strict FP function, do the fp_to_f16/f16_to_fp
emitted by promoting load/store/bitcast need to be strict versions of
fp_to_f16/f16_to_fp? And if so, where do we get the chain, especially for
the bitcast case, which isn't a chained node?
~Craig
On Tue, Dec 10, 2019 at 3:18 PM
2019 Dec 10
2
TypePromoteFloat loses intermediate rounding operations
For the following C code
__fp16 x, y, z, w;
void foo() {
x = y + z;
x = x + w;
}
clang produces IR that extends each operand to float and then truncates to
half before assigning to x, like this:
define dso_local void @foo() #0 !dbg !18 {
%1 = load half, half* @y, align 2, !dbg !21
%2 = fpext half %1 to float, !dbg !21
%3 = load half, half* @z, align 2, !dbg !22
%4 = fpext half %3 to float, !dbg
2014 Jul 14
5
[LLVMdev] RFC: Do we still need @llvm.convert.to.fp16 and the reverse?
Hi all,
What do people think of doing away with the @llvm.convert.to.fp16 and
@llvm.convert.from.fp16 intrinsics, in favour of using "half" and
fpext/fptrunc? [1]
It looks like those intrinsics originally date from before "half"
actually existed in LLVM, and of course the backends have grown up
assuming that's what Clang will produce, so we'd have to improve their
2014 Jul 14
2
[LLVMdev] RFC: Do we still need @llvm.convert.to.fp16 and the reverse?
On Jul 14, 2014, at 7:23 AM, Tom Stellard <tom at stellard.net> wrote:
> On Mon, Jul 14, 2014 at 01:08:54PM +0100, Tim Northover wrote:
>> Hi all,
>>
>> What do people think of doing away with the @llvm.convert.to.fp16 and
>> @llvm.convert.from.fp16 intrinsics, in favour of using "half" and
>> fpext/fptrunc? [1]
>>
>
> I am in favor
2012 Nov 02
2
[LLVMdev] Half Float fp16 Native Support
Hi all,
I am trying to implement native support for fp16 in llvm-3.1.
I have already used the OpenCL patch for clang, so the IR that is generated
is correct.
I tried to add some code so that the fp16 type is handled correctly, but no
luck.
We have a target that has native fp16 units and tried to run a simple
program:
int main ()
{
__fp16 a,b,c,d;
a= 1.1;
b=2.2;
c=3.3;
2019 Jan 22
4
_Float16 support
I'd like to start a discussion about how clang supports _Float16 for target architectures that don't have direct support for 16-bit floating point arithmetic.
The current clang language extensions documentation says, "If half-precision instructions are unavailable, values will be promoted to single-precision, similar to the semantics of __fp16 except that the results will be stored
2017 Jan 23
2
Changes to TableGen in v4.0?
I am trying to upgrade to the LLVM v4.0 branch, but I am seeing failures in
my TableGen descriptions for conversion from FP32 to FP16 (scalar and
vector).
The patterns I have are along the lines of:
[(set (f16 RF16:$dst), (fround (f32 RF32:$src)))]
or:
[(set (v2f16 VF16:$dst), (fround (v2f32 VF32:$src)))]
and these now produce the errors:
error: In CONV_f32_f16: Type inference
2014 Jul 25
3
[LLVMdev] FPU cannot be compatible with -soft-float code on mips by llc
Hi all,
-soft-float cannot be used correctly with llc. All floating-point
operations call soft-float functions, not hardware ones.
My MIPS device does not support the half float type, so I hacked LLVM to
add soft half-float support and added the -soft-float option.
I added function definitions for __gnu_f2h_ieee() and __gnu_h2f_ieee(),
and it can now call the soft half-float routines.
However, all the other functions about
2019 Jan 24
4
[cfe-dev] _Float16 support
On 24 Jan 2019, at 4:46, Sjoerd Meijer wrote:
> Hello,
>
> I added _Float16 support to Clang and codegen support in the AArch64
> and ARM backends, but have not looked into x86. Ahmed is right:
> AArch64 is fine, only a few ACLE intrinsics are missing. ARM has rough
> edges: scalar codegen should be mostly fine, vector codegen needs some
> more work.
>
>
2014 Jul 09
2
[LLVMdev] Help!!!!Help!!!! " LLVM ERROR: Cannot select: 0x9fc9680: i32 = fp32_to_fp16 0x9fc0750 [ID=16] " problem!!!!!!!!!!!!!!!!!!
Hi all, I am new to LLVM. I need help. Thank you, everyone!
I want to implement the vcvtt.f16.f32 NEON instruction with LLVM. This instruction converts a single-precision value to f16 and stores the result in the top 16 bits of the destination register. I use the intrinsic function llvm.convert.to.fp16, but llc fails with the following problem:
LLVM ERROR: Cannot select: 0x9fc9680: i32 = fp32_to_fp16 0x9fc0750 [ID=16]
0x9fc0750: f32,ch = load 0x3aafd68,
2019 Jan 24
2
[cfe-dev] _Float16 support
It seems that there are several issues here:
1. Should the front end be concerned with whether or not the IR that it is emitting can be translated into a well-defined IR?
2. How should the selection DAG handle data types whose representation isn't defined by the ABI we're targeting?
3. What should the ABI do with half-precision floats?
Working backward...
The third question here is
2014 Jul 10
2
[LLVMdev] Help!!!!Help!!!! " LLVM ERROR: Cannot select: 0x9fc9680: i32 = fp32_to_fp16 0x9fc0750 [ID=16] " problem!!!!!!!!!!!!!!!!!!
Hi Daniel, Thank you for replying. Yes, the problem is about the MIPS backend. You gave me this message: "There is limited support for the <8 x f16> type when MSA (MIPS SIMD Architecture) is enabled but even then scalar half-precision is not currently supported." Could you give me an official link or some evidence? Thank you very much.
Robin
yalong at multicorewareinc.com
2019 Sep 05
2
ARM vectorized fp16 support
Thanks for the reply. I was using LLVM 8.0. Let me try trunk and will let
you know if it works.
On Wed, Sep 4, 2019 at 11:19 PM Sjoerd Meijer <Sjoerd.Meijer at arm.com> wrote:
>
> Hi,
> Which version of Clang are you using? I do get a "vfma.f16" with a recent trunk build. I haven't looked at older versions and when this landed, but we had an effort to plug the remaining
2014 Jul 09
6
[LLVMdev] Help!!!!Help!!!! " LLVM ERROR: Cannot select: 0x9fc9680: i32 = fp32_to_fp16 0x9fc0750 [ID=16] " problem!!!!!!!!!!!!!!!!!!
Thank you Kevin!!! I use fptrunc and bitcast to implement NEON vcvtt (I am sure that "fptrunc double %tmp to float" works, but "fptrunc float %tmp to half" does not). My target platform is MIPS. The commands are as follows:
NEON: vcvtt.f16.f32 s2, s0
llvm Code:
%Vt_2 = load float* %VFP_s0, align 4
%Vt3_1 = fptrunc float %Vt_2 to half
%Vt4_1 = bitcast half
2017 Dec 04
2
[RFC] - Deduplication of debug information in linkers (LLD)
At least one proprietary linker put a lot of effort into deduplicating and
rewriting debug information. This took up the majority of the link time
despite serious engineering time on performance optimisation. For example,
some sections were written from scratch by the linker because that proved
faster than parsing the input. Teaching LLD to dedup DWARF should be
expected to dramatically slow it
2014 Jul 09
4
[LLVMdev] Help!!!!Help!!!! " LLVM ERROR: Cannot select: 0x9fc9680: i32 = fp32_to_fp16 0x9fc0750 [ID=16] " problem!!!!!!!!!!!!!!!!!!
On 07/09/2014 12:41 PM, Matt Arsenault wrote:
> On 07/09/2014 03:30 PM, yalong at multicorewareinc.com wrote:
>> Thank you Kevin!!!
>> I use fptrunc and bitcast to implement NEON vcvtt (I am sure that
>> "fptrunc double %tmp to float" works, but "fptrunc float %tmp to
>> half" does not). My target platform is MIPS. The commands are as follows:
2015 Sep 08
2
Strange types on x86 vcvtph2ps and vcvtps2ph intrinsics
Hi,
I was looking at the x86 vector intrinsics for converting half
precision floating point numbers and I'm a bit confused as to why
certain types were chosen. I've gone ahead and used their current
definition with success but I'd like to understand why the types used
with these intrinsics are done this way.
For reference see ``include/llvm/IR/IntrinsicsX86.td``. Here are the