thr3ads.net - search: "192b"

pre-RA scheduling/live register analysis optimization (handle move) forcing spill of registers

2018 Apr 23

2

...eg8<def> = FADD_A_oo %vreg4, %vreg7, %RFLAGA<imp-def,dead>; FPUaROUTADDRegisterClass:%vreg8 FPUaOffsetClass:%vreg4,%vreg7 144B %FA_ROFF0<def> = COPY %vreg8; FPUaROUTADDRegisterClass:%vreg8 176B MOVSUTO_SU_os_rpc %SU_ROFF0<kill>, %RPC<imp-def,dead> 192B NOP # End machine code for function addproddivConst. handleMove 64B -> 104B: %vreg4<def> = COPY %vreg3; FPUaOffsetClass:%vreg4 FPUaROUTMULRegisterClass:%vreg3 %vreg4: [64r,128r:0) 0 at 64r --> [104r,128r:0) 0 at 104r %vreg3: [48r,64r:0) 0 at 4...

[LLVMdev] Assert in live update from MI scheduler.

2012 Jun 12

2

...g9<kill>, 0; mem:LD4[%stack.0.in] IntRegs:%vreg10,%vreg9 112B %vreg9<def> = ADD_ri %vreg10, 8; IntRegs:%vreg9,%vreg10 128B %vreg6<def> = CMPEQri %vreg10, 0; PredRegs:%vreg6 IntRegs:%vreg10 176B JMP_cNot %vreg6<kill>, <BB#1>, %PC<imp-def>; PredRegs:%vreg6 192B JMP <BB#2> Successors according to CFG: BB#2 BB#1 208B BB#2: derived from LLVM BB %for.end Predecessors according to CFG: BB#1 224B %vreg7<def> = LDriw %vreg1<kill>, 0; mem:LD4[%first1](tbaa=!"any pointer") IntRegs:%vreg7,%vreg1 240B STriw_GP...

[LLVMdev] Assert in live update from MI scheduler.

2012 Jun 13

0

...4[%stack.0.in] > IntRegs:%vreg10,%vreg9 > 112B %vreg9<def> = ADD_ri %vreg10, 8; IntRegs:%vreg9,%vreg10 > 128B %vreg6<def> = CMPEQri %vreg10, 0; PredRegs:%vreg6 IntRegs:%vreg10 > 176B JMP_cNot %vreg6<kill>, <BB#1>, %PC<imp-def>; PredRegs:%vreg6 > 192B JMP <BB#2> > Successors according to CFG: BB#2 BB#1 > > 208B BB#2: derived from LLVM BB %for.end > Predecessors according to CFG: BB#1 > 224B %vreg7<def> = LDriw %vreg1<kill>, 0; mem:LD4[%first1](tbaa=!"any > pointer") IntRegs:%v...

[LLVMdev] setCC and brcond

2013 Mar 19

0

...MOVri 1; GPRegs:%vreg3 128B STWi13 <fi#0>, 0, %vreg3<kill>; mem:ST4[%retval] GPRegs:%vreg3 144B BRrel <BB#3> Successors according to CFG: BB#3 160B BB#2: derived from LLVM BB %if.else Predecessors according to CFG: BB#0 176B %vreg2<def> = MOVri 0; GPRegs:%vreg2 192B STWi13 <fi#0>, 0, %vreg2<kill>; mem:ST4[%retval] GPRegs:%vreg2 Successors according to CFG: BB#3 208B BB#3: derived from LLVM BB %return Predecessors according to CFG: BB#2 BB#1 224B %vreg4<def> = LDWi13 <fi#0>, 0; mem:LD4[%retval] GPRegs:%vreg4 240B %R2<def...

[ARM] Register pressure with -mthumb forces register reload before each call

2020 Mar 31

2

...$s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp 160B ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp 176B ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp 192B $r0 = COPY %0:tgpr 208B $r1 = COPY %1:tgpr 224B $r2 = COPY %2:tgpr 240B tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit...

linear-scan RA

2018 Sep 11

2

...ri 17 > 96B JMP_1 %bb.3 > > 112B bb.2: > ; predecessors: %bb.0 > successors: %bb.3(0x80000000); %bb.3(100.00%) > > 128B NOOP implicit %0:gr32 > 144B %1:gr32 = COPY %0:gr32 > 160B JMP_1 %bb.3 > > 176B bb.3: > ; predecessors: %bb.1, %bb.2 > > 192B NOOP implicit %1:gr32 > > # End machine code for function somefunc. > > > If you look at the "intervals" (the class is a misnomer since nowadays it contains a list of ranges...) in the beginning you see that %0 and %1 do not overlap anywhere. > > - Matthias >...

Very unresponsive, sometimes stalling domU (5.4, x86_64)

2010 Mar 02

3

...50 0 0 0| 0 0 | 126B 178B| 0 0 | 113 11 46 3 51 0 0 0| 0 0 | 328B 470B| 0 0 | 133 22 0 0 73 28 0 0| 48k 0 | 198B 454B| 0 0 | 30 27 41 3 51 5 0 0| 192k 0 | 522B 1246B| 0 0 | 164 61 9 2 89 0 0 0|8192B 968k| 630B 1896B| 0 0 | 62 35 0 0 100 0 0 0| 0 0 | 136B 178B| 0 0 | 15 16 0 0 100 0 0 0| 0 0 | 246B 292B| 0 0 | 14 17 1 0 99 0 0 0| 0 0 |1231k 28k| 0 0 |1004 925 0 0 100 0 0 0| 0 0 |3394k 7...

[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.

2012 Oct 25

2

...y<def> = COPY %vreg15<kill>; R600_Reg128:%vreg23 R600_TReg32:%vreg15 register: %vreg23 replace range with [144r,160r:1) RESULT: [144r,160r:1)[160r,224r:0) 0 at 160r 1 at 144r 176B%vreg24<def> = COPY %vreg21<kill>; R600_Reg128:%vreg24,%vreg21 register: %vreg24 +[176r,256r:0) 192B%vreg24:sel_y<def> = COPY %vreg2; R600_Reg128:%vreg24 R600_Reg32:%vreg2 register: %vreg24 replace range with [176r,192r:1) RESULT: [176r,192r:1)[192r,256r:0) 0 at 192r 1 at 176r 208B%vreg25<def> = COPY %C1_Z; R600_Reg32:%vreg25 register: %vreg25 +[208r,272r:0) 224B%vreg26<def> = C...

[LLVMdev] scoreboard hazard det. and instruction groupings

2012 Jun 11

0

On Jun 11, 2012, at 12:07 PM, Hal Finkel <hfinkel at anl.gov> wrote: > Looking at VLIWPacketizerList::PacketizeMIs, it seems like the > instructions are first scheduled (via some external scheme?), and then > packetized 'in order'. Is that correct? Anshu? > In the PowerPC grouping scheme, resources are assigned on a group > basis (by the instruction dispatching

[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.

2012 Oct 25

0

...ill>; R600_Reg128:%vreg23 > R600_TReg32:%vreg15 > register: %vreg23 replace range with [144r,160r:1) RESULT: > [144r,160r:1)[160r,224r:0) 0 at 160r 1 at 144r > 176B%vreg24<def> = COPY %vreg21<kill>; R600_Reg128:%vreg24,%vreg21 > register: %vreg24 +[176r,256r:0) > 192B%vreg24:sel_y<def> = COPY %vreg2; R600_Reg128:%vreg24 > R600_Reg32:%vreg2 > register: %vreg24 replace range with [176r,192r:1) RESULT: > [176r,192r:1)[192r,256r:0) 0 at 192r 1 at 176r > 208B%vreg25<def> = COPY %C1_Z; R600_Reg32:%vreg25 > register: %vreg25 +[208r,272r:0)...

linear-scan RA

2018 Sep 11

2

...bb.2: > > ; predecessors: %bb.0 > > successors: %bb.3(0x80000000); %bb.3(100.00%) > > > > 128B NOOP implicit %0:gr32 > > 144B %1:gr32 = COPY %0:gr32 > > 160B JMP_1 %bb.3 > > > > 176B bb.3: > > ; predecessors: %bb.1, %bb.2 > > > > 192B NOOP implicit %1:gr32 > > > > # End machine code for function somefunc. > > > > > > If you look at the "intervals" (the class is a misnomer since nowadays > it contains a list of ranges...) in the beginning you see that %0 and %1 do > not overlap anywh...

linear-scan RA

2018 Sep 11

2

The phi instruction is irrelevant; just the way I think about things. The question is if the allocator believes that t0 and t2 interfere. Perhaps the coalescing example was too simple. In the general case, we can't coalesce without a notion of interference. My worry is that looking at interference by ranges of instruction numbers leads to inaccuracies when a range is introduced by a copy.
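The interference question raised in this message can be illustrated with a minimal sketch. It assumes each virtual register's liveness is a list of half-open segments `[start, end)` over instruction indices, matching the `[64r,128r:0)` notation in the dumps above. The segment values for `t0` and `t1` below are hypothetical, chosen to resemble the `somefunc` listing (a copy at index 144); they are not taken from the thread, and the helper names are invented for illustration.

```python
from typing import List, Tuple

# A segment is a half-open range [start, end) of instruction indices.
Segment = Tuple[int, int]


def segments_overlap(a: Segment, b: Segment) -> bool:
    """Two half-open ranges overlap only if each starts before the other ends."""
    return a[0] < b[1] and b[0] < a[1]


def interferes(x: List[Segment], y: List[Segment]) -> bool:
    """Registers interfere if any pair of their live segments overlaps."""
    return any(segments_overlap(a, b) for a in x for b in y)


# Hypothetical live ranges (assumed values, not from the thread):
# t0 is read for the last time by the copy at 144; t1 is defined by that copy.
t0 = [(16, 64), (112, 144)]
t1 = [(80, 112), (144, 192)]

# With half-open segments, the source ends exactly where the destination
# begins, so the copy alone does not make the two registers interfere.
print(interferes(t0, t1))
```

If the same ranges were treated as closed intervals, the shared endpoint at 144 would count as an overlap, which is one way a purely index-based check could become inaccurate around a copy.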

[ARM] Register pressure with -mthumb forces register reload before each call

2020 Apr 07

2

If I'm understanding what's going on in this test correctly, what's happening is: * ARMTargetLowering::LowerCall prefers indirect calls when a function is called at least 3 times in minsize * In thumb 1 (without -fno-omit-frame-pointer) we have effectively only 3 callee-saved registers (r4-r6) * The function has three arguments, so those three plus the register we need to hold the

[ARM] Register pressure with -mthumb forces register reload before each call

2020 Apr 15

4

...$s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp 160B ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp 176B ADJCALLSTACKDOWN 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp 192B $r0 = COPY %0:tgpr 208B $r1 = COPY %2:tgpr 224B $r2 = COPY %1:tgpr 240B tBLXr 14, $noreg, %3:tgpr, <regmask $lr $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $q4 $q5 $q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit...

linear-scan RA

2018 Sep 11

2

...bb.3(100.00%) > >> > > >> > 128B NOOP implicit %0:gr32 > >> > 144B %1:gr32 = COPY %0:gr32 > >> > 160B JMP_1 %bb.3 > >> > > >> > 176B bb.3: > >> > ; predecessors: %bb.1, %bb.2 > >> > > >> > 192B NOOP implicit %1:gr32 > >> > > >> > # End machine code for function somefunc. > >> > > >> > > >> > If you look at the "intervals" (the class is a misnomer since > nowadays it contains a list of ranges...) in the beginning you...

[LLVMdev] RegisterCoalescing pass crashes with ImplicitDef registers

2012 Oct 20

2

...vreg7<kill>, 0, 0, 0, %vreg5, 0, 0, 0, 1, pred:%PRED_SEL_OFF, 0; R600_Reg32:%vreg8,%vreg7,%vreg5 160B%vreg10<def> = IMPLICIT_DEF; R600_Reg128:%vreg10 176B%vreg9<def,tied1> = INSERT_SUBREG %vreg10<tied0>, %vreg6<kill>, sel_x; R600_Reg128:%vreg9,%vreg10 R600_Reg32:%vreg6 192B%vreg11<def,tied1> = INSERT_SUBREG %vreg9<tied0>, %vreg8<kill>, sel_y; R600_Reg128:%vreg11,%vreg9 R600_Reg32:%vreg8 208B%vreg13<def> = IMPLICIT_DEF; R600_Reg32:%vreg13 224B%vreg12<def,tied1> = INSERT_SUBREG %vreg11<tied0>, %vreg13, sel_z; R600_Reg128:%vreg12,%vreg...

[LLVMdev] Assert in live update from MI scheduler.

2012 Jun 13

4

...gs:%vreg10,%vreg9 > > 112B %vreg9<def> = ADD_ri %vreg10, 8; IntRegs:%vreg9,%vreg10 > > 128B %vreg6<def> = CMPEQri %vreg10, 0; PredRegs:%vreg6 > IntRegs:%vreg10 > > 176B JMP_cNot %vreg6<kill>, <BB#1>, %PC<imp-def>; PredRegs:%vreg6 > > 192B JMP <BB#2> > > Successors according to CFG: BB#2 BB#1 > > > > 208B BB#2: derived from LLVM BB %for.end > > Predecessors according to CFG: BB#1 > > 224B %vreg7<def> = LDriw %vreg1<kill>, 0; > mem:LD4[%first1](tbaa=!"any &...

[LLVMdev] scoreboard hazard det. and instruction groupings

2012 Jun 11

3

On Mon, 11 Jun 2012 10:48:18 -0700 Andrew Trick <atrick at apple.com> wrote: > On Jun 11, 2012, at 9:30 AM, Hal Finkel <hfinkel at anl.gov> wrote: > > > I'm considering writing more-detailed itineraries for some PowerPC > > CPUs that use the 'traditional' instruction grouping scheme. In > > essence, this means that multiple instructions will stall

[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.

2012 Oct 25

0

...Y %vreg6:sel_w<kill>; R600_Reg32:%vreg42 R600_Reg128:%vreg6 > 720B%T2_W<def> = COPY %vreg42<kill>; R600_Reg32:%vreg42 > > And after the pass : > > //Before Loop > ...Some COPYs... > 128B%vreg27:sel_x<def,read-undef> = COPY %C1_X; R600_Reg128:%vreg27 > 192B%vreg27:sel_y<def> = COPY %C1_Y; R600_Reg128:%vreg27 > 272B%vreg27:sel_z<def> = COPY %C1_Z; R600_Reg128:%vreg27 > 320B%vreg27:sel_w<def> = COPY %C1_W; R600_Reg128:%vreg27 > > //LOOP CONDITION > 512B%vreg30<def> = SETGT_INT 0, 0, 1, 0, 0, 0, %C0_X, 0, 0, 0, %vre...
