thr3ads.net - search: "0b0010"

Backend subtraction changed to negative addition

2017 Jan 25

2

...I have defined both my add and sub instructions: def ADD : ALUInst<0b0001, (outs GRRegs:$dst), (ins GRRegs:$src1, GRRegs:$src2), "add $src1, $src2, $dst", [(set i32:$dst, (add i32:$src1, i32:$src2))]>; def SUB : ALUInst<0b0010, (outs GRRegs:$dst), (ins GRRegs:$src1, GRRegs:$src2), "sub $src1, $src2, $dst", [(set i32:$dst, (sub i32:$src1, i32:$src2))]>; Is there a way to override this behaviour? Thanks --Phil
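The thread title describes a subtraction being turned into an addition of a negated value. The snippet is truncated, so which transformation the poster actually saw is not shown here. As a minimal sketch of the arithmetic identity that makes such a rewrite legal for 32-bit integers, the C below compares the two forms under wraparound. The function names sub_i32 and add_neg_i32 are hypothetical and do not come from the thread or from LLVM.

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* 32-bit subtraction with wraparound, as an i32 `sub` would compute it. */
uint32_t sub_i32(uint32_t x, uint32_t c) {
    return x - c;
}

/* The same value written as an addition of the negated operand.
 * Unsigned negation is well defined in C: 0u - c equals 2^32 - c (mod 2^32). */
uint32_t add_neg_i32(uint32_t x, uint32_t c) {
    return x + (0u - c);
}
```

Both functions return the same bit pattern for every input, which is why a backend that only sees the add form still computes the right result. Whether a target prefers to keep the sub form is a separate question that this sketch does not address.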

[LLVMdev] Possible missed optimization?

2010 Sep 04

6

Hello, while testing trivial functions in my backend I noticed a suboptimal register assignment with the following pattern. Consider the following function: typedef unsigned short t; t foo(t a, t b) { t a4 = b^a^18; return a4; } Argument "a" is passed in R15:R14 and argument "b" is passed in R13:R12; the return value is stored in R15:R14. Producing the

[RFC] Half-Precision Support in the Arm Backends

2018 Jan 18

1

Hi Sjoerd, For ISel, I think having a separate register class will give you less headache. I am wondering if you could get away with not touching the instruction descriptions at all, instead defining external patterns for the FullFP16 case, like so: def VCVTBHS: ASuI<0b11101, 0b11, 0b0010, 0b01, 0, (outs SPR:$Sd), (ins SPR:$Sm), IIC_fpCVTSH, "vcvtb", ".f32.f16\t$Sd, $Sm", []>, Requires<[HasFP16]>, Sched<[WriteFPCVT]>; def : FP16Pat<(f16_to_fp GPR:$a), (VCVTBHS (CO...

[LLVMdev] Possible missed optimization?

2010 Sep 04

1

Indeed, I've marked it as commutable: let isCommutable = 1, isTwoAddress = 1 in def XORRdRr : FRdRr<0b0010, 0b01, (outs GPR8:$dst), (ins GPR8:$src1, GPR8:$src2), "xor\t$dst, $src2", [(set GPR8:$dst, (xor GPR8:$src1, GPR8:$src2))]>;
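To illustrate what the two flags in this definition express, here is a minimal C sketch of a two-address XOR, where the destination register doubles as the first source. Because XOR is commutative, either input may be the one tied to the destination, which is the freedom isCommutable gives the register allocator. The function names are hypothetical and only model the idea; they are not part of the poster's backend.

```c
#include <stdint.h>

/* Hypothetical model of a two-address XOR on 8-bit registers:
 * the destination is also the first source and is overwritten in place,
 * mirroring the "xor $dst, $src2" assembly string. */
void xor_two_address(uint8_t *dst_and_src1, uint8_t src2) {
    *dst_and_src1 ^= src2;
}

/* Compute a ^ b with `a` tied to the destination register. */
uint8_t xor_tied_to_a(uint8_t a, uint8_t b) {
    uint8_t reg = a;
    xor_two_address(&reg, b);
    return reg;
}

/* Compute the same value with `b` tied to the destination instead.
 * Commutativity of XOR guarantees the result is identical. */
uint8_t xor_tied_to_b(uint8_t a, uint8_t b) {
    uint8_t reg = b;
    xor_two_address(&reg, a);
    return reg;
}
```

If one of the inputs already lives in the register that must hold the result, tying that input to the destination avoids an extra copy. This is the kind of choice the thread's register assignment question is about.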

[LLVMdev] Possible missed optimization?

2010 Sep 04

0

Hello
> and as the return value. Is this a missed optimization from LLVM or did I
> miss something out?
> Changing the register allocation order didn't work.
What are the patterns for xor / mov?
-- With best regards, Anton Korobeynikov, Faculty of Mathematics and Mechanics, Saint Petersburg State University

[RFC] Half-Precision Support in the Arm Backends

2018 Jan 18

0

...[(set HPR:$Sd, (fadd HPR:$Sn, HPR:$Sm))]>, // <~~~ new match rule using HPR This is straightforward business so far, but I already learned the hard way that the conversions are the tricky ones, so I repeat this for an f16 -> f32 upconvert: def VCVTBHS: ASuI<0b11101, 0b11, 0b0010, 0b01, 0, (outs SPR:$Sd), (ins HPR:$Sm), IIC_fpCVTSH, "vcvtb", ".f32.f16\t$Sd, $Sm", [(set SPR:$Sd, (fpextend HPR:$Sm))]>, // <~~~~ new match rule using HPR and SPR Requires<[HasFP16]>, Sched<[Wr...

[RFC] Half-Precision Support in the Arm Backends

2017 Dec 06

2

Thanks a lot for the suggestions! I will look into using vld1/vst1; that sounds good. I am custom lowering the bitcasts, and that is now the only place where FP_TO_FP16 and FP16_TO_FP nodes are created, to avoid inefficient code generation. I will double check whether I can achieve the same without using these nodes (because I really would like to get rid of them completely). Cheers, Sjoerd.
