thr3ads.net - search: "imm1"

[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD

2016 Oct 02

2

.../gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp @@ -1008,13 +1008,22 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s) break; case OP_MAD: if (imm0.isInteger(0)) { + ImmediateValue imm1; i->setSrc(0, i->getSrc(2)); i->src(0).mod = i->src(2).mod; i->setSrc(1, NULL); i->setSrc(2, NULL); - i->op = i->src(0).mod.getOp(); - if (i->op != OP_CVT) - i->src(0).mod = 0; + if (i->src(0...

Tablegen pattern: How to emit a SDNode in an output pattern?

2018 Apr 09

2

I'm trying to write a tablegen pattern that matches a sequence of SDNodes and emits again an SDNode and another instruction. The pattern I've written looks like the following: def : Pat<(foo (bar GPR:$rs1), simm12:$imm1), (bar (BAZ GPR:$rs1, simm12:$imm1))>; foo and bar are SDNodes, BAZ is an instruction. In particular, bar is defined as follows: def bar : SDNode<"ISD::BAR", SDTIntUnaryOp>; The basic idea of this pattern is to propagate bar over certain instructions until the...

[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD

2016 Oct 02

2

...>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp >> @@ -1008,13 +1008,22 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s) >> break; >> case OP_MAD: >> if (imm0.isInteger(0)) { >> + ImmediateValue imm1; >> i->setSrc(0, i->getSrc(2)); >> i->src(0).mod = i->src(2).mod; >> i->setSrc(1, NULL); >> i->setSrc(2, NULL); >> - i->op = i->src(0).mod.getOp(); >> - if (i->op != OP_CVT) >...

[PATCH 1/2] nvc0/ir: detect AND/SHR pairs and convert into EXTBF

2015 Aug 19

5

...f (!prog->getTarget()->isOpSupported(cmp->op, TYPE_F32)) - return; - if (imm0.reg.data.f32 != 1.0) - return; - if (i->getSrc(t)->getInsn()->dType != TYPE_U32) - return; + Instruction *src = i->getSrc(t)->getInsn(); + ImmediateValue imm1; + if (imm0.reg.data.u32 == 0) { + i->op = OP_MOV; + i->setSrc(0, new_ImmediateValue(prog, 0u)); + i->src(0).mod = Modifier(0); + i->setSrc(1, NULL); + } else if (imm0.reg.data.u32 == ~0U) { + i->op = i->src(t).mod.getOp(); +...

[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD

2016 Oct 02

0

...en/nv50_ir_peephole.cpp > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp > @@ -1008,13 +1008,22 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s) > break; > case OP_MAD: > if (imm0.isInteger(0)) { > + ImmediateValue imm1; > i->setSrc(0, i->getSrc(2)); > i->src(0).mod = i->src(2).mod; > i->setSrc(1, NULL); > i->setSrc(2, NULL); > - i->op = i->src(0).mod.getOp(); > - if (i->op != OP_CVT) > - i->src(0)...

[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD

2016 Oct 02

0

...ouveau/codegen/nv50_ir_peephole.cpp >>> @@ -1008,13 +1008,22 @@ ConstantFolding::opnd(Instruction *i, >>> ImmediateValue &imm0, int s) >>> break; >>> case OP_MAD: >>> if (imm0.isInteger(0)) { >>> + ImmediateValue imm1; >>> i->setSrc(0, i->getSrc(2)); >>> i->src(0).mod = i->src(2).mod; >>> i->setSrc(1, NULL); >>> i->setSrc(2, NULL); >>> - i->op = i->src(0).mod.getOp(); >>> - if...

[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD

2016 Oct 02

1

...hole.cpp >>>> @@ -1008,13 +1008,22 @@ ConstantFolding::opnd(Instruction *i, >>>> ImmediateValue &imm0, int s) >>>> break; >>>> case OP_MAD: >>>> if (imm0.isInteger(0)) { >>>> + ImmediateValue imm1; >>>> i->setSrc(0, i->getSrc(2)); >>>> i->src(0).mod = i->src(2).mod; >>>> i->setSrc(1, NULL); >>>> i->setSrc(2, NULL); >>>> - i->op = i->src(0).mod.getOp(); ...

[LLVMdev] Question on tablegen

2009 May 08

0

...junath, I had a very similar problem and I solved it using a custom vector shuffle and addition instead of mov. For example, Vector_shuffle s1, s2, <0,3> is mapped to a custom instruction where I transform the swizzle to a 32bit integer mask and an inverted mask. So I have dst, src0, src1, imm1, imm2 And I have my asm look similar to: Add dst, src0.imm1, src1.imm2 and then in the asm printer I intercept vector_shuffle and I convert the integer to x,y,z,w, 0, 1 or _. For example if the mask is to take x from s1 and yzw from s2, I would generate 0x1000 and 0x0234. So my result looks like Ia...

Are there some strong naming conventions in TableGen?

2017 Jul 27

2

...e: t5: ch = store<ST2[%ptr2](align=4)> t0, Constant:i16<3>, FrameIndex:i16<1>, undef:i16 I have defined the following instruction and associated DAG pattern. def MOVSUTO_A_i32o : CLPFPU_A_i32o_Inst<0b1000001101, (ins IMM16Operand:$ImmA,FPUaOffsetOperand:$OffsetB), (outs ), [], "movsuto_a\t$ImmA,$OffsetB","I32O",...

[LLVMdev] TableGen pattern

2009 May 19

1

Hello, I am trying to convert the subtree (vector_shuffle v2f32, v2f32 (build_vector imm1, imm2)) to a machine instruction that takes 2 v2f32's and 2 immediates. I tried the following table gen pattern : (set v2f32Reg:$dst, (vector_shuffle v2f32Reg:$src1, v2f32Reg:$src2, (build_vector imm:$c1, imm:$c2))) Table gen barfs about typ...

[LLVMdev] Question on tablegen

2009 May 08

2

Dan, Thanks a lot. Using a modifier in the assembly string works for this case. I am trying to solve a related problem. I am trying to print out a set of "mov" ops for the vector_shuffle node. Since the source of the "mov" is from one of the sources to vector_shuffle, depending on the mask, I am not sure what assembly string to emit. For example, if I have d <-

[LLVMdev] Help with definition of subregisters; spill, rematerialization and implicit uses

2014 Aug 15

2

...r read or def:ed. It however seems that Codegen treats subregister definitions as somehow clobbering the whole register. The SSA-code looks like this after isel: (Reg0 and Reg1 are 16bit registers. Reg2, Reg3 and Reg4 are 32 bit registers with 16bit subregs, hi16 and lo16.) Reg0 = #imm0 Reg1 = #imm1 Reg2 = IMPLICIT_DEF Reg3 = INSERT_SUBREG Reg2, Reg0, hi16 Reg4 = INSERT_SUBREG Reg3, Reg1, lo16 After TwoAddressInstructionPass it becomes: Reg5:hi16<def,read-undef> = Reg0 Reg5:lo16<def> = Reg1 So, in my world this means a setting of the high 16 bits in Reg5 (not affecti...

[PATCH] nv50/ir: take postFactor into account when doing peephole optimizations

2015 Mar 25

0

...>setSrc(1, NULL); @@ -682,7 +685,7 @@ ConstantFolding::tryCollapseChainedMULs(Instruction *mul2, Instruction *insn; Instruction *mul1 = NULL; // mul1 before mul2 int e = 0; - float f = imm2.reg.data.f32; + float f = imm2.reg.data.f32 * exp2f(mul2->postFactor); ImmediateValue imm1; assert(mul2->op == OP_MUL && mul2->dType == TYPE_F32); @@ -782,9 +785,10 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s) i->op = OP_MOV; i->setSrc(0, new_ImmediateValue(prog, 0u)); i->src(0).mod = Modifier(0); +...
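
The changed line in the snippet above scales the immediate by 2^postFactor before the chained multiplies are collapsed. A minimal sketch of that arithmetic follows, assuming postFactor is a small signed power-of-two exponent applied to the MUL result. The function name `effective_multiplier` is hypothetical and not part of the patch, and the sketch uses `std::ldexp` in place of the patch's `exp2f` because it computes the same `imm * 2^p` scaling exactly.

```cpp
#include <cmath>

// Hypothetical helper, not taken from the patch: the multiplier a MUL
// effectively applies once its power-of-two post factor is folded in.
// Assumption: post_factor is a signed exponent, so the result is imm * 2^post_factor.
float effective_multiplier(float imm, int post_factor)
{
    // std::ldexp(x, e) returns x * 2^e; scaling by a power of two is exact
    // as long as the result stays within float range.
    return std::ldexp(imm, post_factor);
}
```

For example, an immediate of 3.0 with a post factor of 1 behaves like a multiplication by 6.0, which is the value the folding pass has to carry forward instead of the raw 3.0.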

[PATCH mesa 0/5] nouveau: codegen: Make use of double immediates

2015 Nov 05

7

Hi All, This series implements using double immediates in the nouveau codegen code. This turns the following (nvc0) code: 1: mov u32 $r2 0x00000000 (8) 2: mov u32 $r3 0x3fe00000 (8) 3: add f64 $r0d $r0d $r2d (8) Into: 1: add f64 $r0d $r0d 0.500000 (8) This has been tested with the 2 double shader tests which I just sent to the piglit list. On a gk208 (gk110 / SM35)
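
To see why the two `mov u32` instructions in the snippet above amount to the double immediate 0.5, here is a minimal sketch that reassembles the 64-bit value from its two 32-bit halves. It assumes the register pair `$r2d` keeps the low word in `$r2` and the high word in `$r3`, and that `double` is IEEE 754 binary64. The names `combine_halves` and `demo` are hypothetical and do not come from the patch series.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: rebuild a double from the two 32-bit words of a
// register pair. Assumption: lo is the lower register ($r2), hi the upper ($r3).
double combine_halves(std::uint32_t lo, std::uint32_t hi)
{
    const std::uint64_t bits = (static_cast<std::uint64_t>(hi) << 32) | lo;
    double value;
    static_assert(sizeof(value) == sizeof(bits), "double must be 64 bits wide");
    std::memcpy(&value, &bits, sizeof(value)); // bit-cast without aliasing issues
    return value;
}

// The pair from the quoted listing: 0x3fe00000'00000000 has exponent field
// 0x3fe (1022, i.e. 2^-1) and a zero mantissa, which is exactly 0.5.
void demo()
{
    assert(combine_halves(0x00000000u, 0x3fe00000u) == 0.5);
}
```

Folding the pair into one immediate is what lets the three-instruction sequence shrink to the single `add f64` shown in the snippet.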
