thr3ads.net - search: "zextloadi8"

2019 Sep 10

2

tablegen exponential behavior

...))), (EXTRACT_SUBREG (i64 (DOT (DUPv2i32gpr WZR), (v8i8 (LD1Onev8b GPR64sp:$Rn)), (v8i8 (LD1Onev8b GPR64sp:$Rm)))), sub_32)>, Requires<[HasDotProd]>;
def : DotProductI32<SDOTv8i8, sextloadi8>;
def : DotProductI32<UDOTv8i8, zextloadi8>;
Then when I extended it to 8 element vectors, the time spent by tblgen exploded: from under 7 seconds (on A-72) on the AArch64 td files and the above patch to more than half an hour when I decided to terminate the processes. Here are the additional def'pats that produce the exponential b...

[LLVMdev] Sub-Register Allocation

2013 Jan 11

2

...with zero, and then copy into the small view. I'm working on this llvm function,
define i16 @zext_i8_to_i16_simple(i8 %x) {
  %1 = zext i8 %x to i16
  ret i16 %1
}
I have a pattern where I load the 16 bit portion of the register with 0, and then copy in the 8 bit portion.
def : Pat<(i16 (zextloadi8 addr:$src)),
          (INSERT_SUBREG (MOV16id 0), (MOV8md addr:$src), sub_byte)>;
which produces working but odd assembly,
zext_i8_to_i16_simple PROC ; @zext_i8_to_i16_simple
; BB#0:
  move.b 4(a7), d1
  move.w #0, d0
  move.b d1, d0
  rts
Notice the e...

[LLVMdev] Overlapping register classes

2009 Mar 15

5

..."Bfin", [i32], 32, [P0, P1, P2, P3, P4, P5, SP, FP]>;
For instance, a zero-extending byte load needs the address in a P-reg and can only load a D-reg:
def LOAD32p_8z: F1<(outs D:$dst), (ins P:$ptr),
                   "$dst = B[$ptr] (Z);",
                   [(set D:$dst, (zextloadi8 P:$ptr))]>;
Some instructions work on all registers:
def GR : RegisterClass<"Bfin", [i32], 32,
    [R0, R1, R2, R3, R4, R5, R6, R7,
     P0, P1, P2, P3, P4, P5, SP, FP,
     I0, I1, I2, I3, M0, M1, M2, M3,
     B0, B1, B2, B3, L0, L1, L2, L3]>;
For instance, I can load an arbi...

[LLVMdev] Sub-Register Allocation

2013 Jan 12

0

On Jan 10, 2013, at 9:54 PM, Kenneth Waters <kwwaters at gmail.com> wrote:
> I have a pattern where I load the 16 bit portion of the register with 0, and then copy in the 8 bit portion.
>
> def : Pat<(i16 (zextloadi8 addr:$src)),
> (INSERT_SUBREG (MOV16id 0), (MOV8md addr:$src), sub_byte)>;
>
> which produces working but odd assembly,
>
> zext_i8_to_i16_simple PROC ; @zext_i8_to_i16_simple
> ; BB#0:
> move.b 4(a7), d1
> move.w #0, d0
>...

[LLVMdev] Instr Description Problem of MCore Backend

2011 Jun 23

0

Hello
> Finally, I don't know how to describe following instructions in
> MCoreInstrInfo.td, because of its variable ins/outs. Or what other files
> should I use to finish this description?
Do you need the isel support for them? If yes, then you should custom isel them. iirc ARM and SystemZ backends have similar instructions, while only the first one supports full isel for them. In

[LLVMdev] Overlapping register classes

2009 Mar 16

0

...P3, P4, P5,
> SP, FP]>;
>
> For instance, a zero-extending byte load needs the address in a P-reg
> and can only load a D-reg:
>
> def LOAD32p_8z: F1<(outs D:$dst), (ins P:$ptr),
> "$dst = B[$ptr] (Z);",
> [(set D:$dst, (zextloadi8 P:$ptr))]>;
>
> Some instructions work on all registers:
>
> def GR : RegisterClass<"Bfin", [i32], 32,
> [R0, R1, R2, R3, R4, R5, R6, R7,
> P0, P1, P2, P3, P4, P5, SP, FP,
> I0, I1, I2, I3, M0, M1, M2, M3,
> B0, B1, B2, B3, L0, L1, L2, L3]>;...

[LLVMdev] Instr Description Problem of MCore Backend

2011 Jun 23

2

Hi, all: Now I'm working on writing a backend for Moto MCore, but I don't know how to describe some instructions. First, I've already written MCoreRegisterInfo.td like these:
class MCoreReg<bits<4> num, string name> : Register<name> {
  let Namespace = "MCore";
  field bits<4> Num = num;
}
def R0 : MCoreReg< 0, "R0">,

[LLVMdev] Target Dependent Hexagon Packetizer patch

2012 Apr 19

0

...dComplexity = 10, isPredicable = 1 in
>> def LDriub_indexed_V4 : LDInst<(outs IntRegs:$dst),
>> (ins IntRegs:$src1, IntRegs:$src2),
>> "$dst=memub($src1+$src2<<#0)",
>> - [(set IntRegs:$dst, (zextloadi8 (add IntRegs:$src1,
>> - IntRegs:$src2)))]>,
>> + [(set (i32 IntRegs:$dst),
>> + (i32 (zextloadi8 (add (i32 IntRegs:$src1),
>> +...