thr3ads.net - search: "add4"

Displaying 20 results from an estimated 26 matches for "add4".

[LLVMdev] Help me improve two-address code

2009 Apr 16

...asic function: int foo (int a, int b, int c, int d) { return a + b - c + d; } clang-cc -O2 yields: define i32 @foo(i32 %a, i32 %b, i32 %c, i32 %d) nounwind readnone { entry: %add = add i32 %b, %a ; <i32> [#uses=1] %sub = sub i32 %add, %c ; <i32> [#uses=1] %add4 = add i32 %sub, %d ; <i32> [#uses=1] ret i32 %add4 } which lowers to this assembler code (note: args arrive in r1..r12, and results are returned in r1..r3.): foo: add r2,r1 ### add r1,r2 is better sub r2,r3 mov r1,r2 ### unnecessary!! add r1,r4 jmp...

[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC

2008 Jul 08

...do< > > The "let Uses = [R0]" is not needed. The pseudo instruction will be > expanded like this later: > > + BuildMI(BB, TII->get(is64bit ? PPC::LDARX : PPC::LWARX), dest) > + .addReg(ptrA).addReg(ptrB); > + BuildMI(BB, TII->get(is64bit ? PPC::ADD4 : PPC::ADD8), PPC::R0) > + .addReg(incr).addReg(dest); > + BuildMI(BB, TII->get(is64bit ? PPC::STDCX : PPC::STWCX)) > + .addReg(PPC::R0).addReg(ptrA).addReg(ptrB); > > The second instruction defines R0 and the 3rd reads R0 which is > enough to tell the register al...

[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC

2008 Jul 08

...The "let Uses = [R0]" is not needed. The pseudo instruction will be >> expanded like this later: >> >> + BuildMI(BB, TII->get(is64bit ? PPC::LDARX : PPC::LWARX), dest) >> + .addReg(ptrA).addReg(ptrB); >> + BuildMI(BB, TII->get(is64bit ? PPC::ADD4 : PPC::ADD8), PPC::R0) >> + .addReg(incr).addReg(dest); >> + BuildMI(BB, TII->get(is64bit ? PPC::STDCX : PPC::STWCX)) >> + .addReg(PPC::R0).addReg(ptrA).addReg(ptrB); >> >> The second instruction defines R0 and the 3rd reads R0 which is >> enough...

Data structure improvement for the SLP vectorizer

2017 Mar 15

...double %load20, %load10 %mul2 = fmul fast double %load21, %load11 %mul3 = fmul fast double %load22, %load10 %mul4 = fmul fast double %load23, %load11 %add1 = fadd fast double %load30, %mul1 %add2 = fadd fast double %load31, %mul2 %add3 = fadd fast double %load32, %mul3 %add4 = fadd fast double %load33, %mul4 %out0 = getelementptr inbounds double, double* %out, i32 0 %out1 = getelementptr inbounds double, double* %out, i32 1 %out2 = getelementptr inbounds double, double* %out, i32 2 %out3 = getelementptr inbounds double, double* %out, i32 3 store d...

[LLVMdev] Help me improve two-address code

2009 Apr 16

...d) > { > return a + b - c + d; > } > > clang-cc -O2 yields: > > define i32 @foo(i32 %a, i32 %b, i32 %c, i32 %d) nounwind readnone { > entry: > %add = add i32 %b, %a ; <i32> [#uses=1] > %sub = sub i32 %add, %c ; <i32> [#uses=1] > %add4 = add i32 %sub, %d ; <i32> [#uses=1] > ret i32 %add4 > } > > which lowers to this assembler code (note: args arrive in r1..r12, and > results are returned in r1..r3.): > > foo: > add r2,r1 ### add r1,r2 is better > sub r2,r3 > mov r1,r2...

[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC

2008 Jul 04

...+ def ATOMIC_LOAD_ADD_I32 : Pseudo< The "let Uses = [R0]" is not needed. The pseudo instruction will be expanded like this later: + BuildMI(BB, TII->get(is64bit ? PPC::LDARX : PPC::LWARX), dest) + .addReg(ptrA).addReg(ptrB); + BuildMI(BB, TII->get(is64bit ? PPC::ADD4 : PPC::ADD8), PPC::R0) + .addReg(incr).addReg(dest); + BuildMI(BB, TII->get(is64bit ? PPC::STDCX : PPC::STWCX)) + .addReg(PPC::R0).addReg(ptrA).addReg(ptrB); The second instruction defines R0 and the 3rd reads R0 which is enough to tell the register allocator what to do. I do ha...

lm, gee and lme

2003 Mar 03

...nce is partially influenced by group membership. My understanding here is that ignoring nonindependence (i.e., using lm) actually results in SE estimates that are too large, while modeling the nonindependence reduces SE and increases power. Here is an example: # lme model > mod.lme<-lme(GWB.ADD4~HOR,random=~1|GRP,data=TBH) > VarCorr(mod.lme) GRP = pdLogChol(1) Variance StdDev (Intercept) 0.3160445 0.5621783 Residual 0.7449425 0.8631005 > 0.3160445/(0.3160445+0.7449425) [1] 0.2978778 #Note the large ICC (high nonindependence) > summary(mod.lme)$tTable...

Data structure improvement for the SLP vectorizer

2017 Mar 15

There was some discussion of this on the llvm-commits list, but I wanted to raise the topic for discussion here. The background of the -commits discussion was that r296863 added the ability to sort memory access when the SLP vectorizer reached a load (the SLP vectorizer starts at a store or some other sink, and tries to go up the tree vectorizing as it goes along - if the input is in a different

[LLVMdev] Strong vs. default phi elimination and single-reg classes

2012 Jun 08

...to CFG: BB#0 BB#1 %vreg12<def> = PHI %vreg13, <BB#1>, %vreg11, <BB#0>;CTRRC8:%vreg12,%vreg13,%vreg11 %vreg5<def> = LDtoc <ga:@a>, %X2; G8RC:%vreg5 %vreg6<def> = LWZ 0, %vreg5; mem:Volatile LD4[@a](tbaa=!"int") GPRC:%vreg6 G8RC:%vreg5 %vreg7<def> = ADD4 %vreg6<kill>, %vreg3; GPRC:%vreg7,%vreg6,%vreg3 STW %vreg7<kill>, 0, %vreg5<kill>; mem:Volatile ST4[@a](tbaa=!"int") GPRC:%vreg7 G8RC:%vreg5 %vreg13<def> = COPY %vreg12<kill>; CTRRC8:%vreg13,%vreg12 %vreg13<def> = BDNZ8 %vreg13, <BB#1>; CTRRC8:%vr...

[LLVMdev] Too many load/store in Machine code represtation

2011 Jun 14

...n stack slot. The following output was produced with the "llc -march=ppc32" command and dumped from the MachineFunction: %reg16384<def> = LWZ 0, <fi#6>; mem:LD4[%b] GPRC:%reg16384 %reg16385<def> = LWZ 0, <fi#5>; mem:LD4[%c] GPRC:%reg16385 %reg16386<def> = ADD4 %reg16384<kill>, %reg16385<kill>; GPRC:%reg16386,16384,16385 STW %reg16386<kill>, 0, <fi#7>; mem:ST4[%a] GPRC:%reg16386 I am interested in this method because it seems smart and different from the compiler textbook. To understand how it generated the LWZ and ST...

[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC

2008 Jul 02

...arx dest, ptr + // add r0, dest, incr + // st[wd]cx. r0, ptr + // bne- loopMBB + // fallthrough --> exitMBB + BB = loopMBB; + BuildMI(BB, TII->get(is64bit ? PPC::LDARX : PPC::LWARX), dest) + .addReg(ptrA).addReg(ptrB); + BuildMI(BB, TII->get(is64bit ? PPC::ADD4 : PPC::ADD8), PPC::R0) + .addReg(incr).addReg(dest); + BuildMI(BB, TII->get(is64bit ? PPC::STDCX : PPC::STWCX)) + .addReg(PPC::R0).addReg(ptrA).addReg(ptrB); + BuildMI(BB, TII->get(PPC::BCC)) + .addImm(PPC::PRED_NE).addReg(PPC::CR0).addMBB(loopMBB); + BB->addSuc...

[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC

2008 Jul 08

...]" is not needed. The pseudo instruction will be > >> expanded like this later: > >> > >> + BuildMI(BB, TII->get(is64bit ? PPC::LDARX : PPC::LWARX), dest) > >> + .addReg(ptrA).addReg(ptrB); > >> + BuildMI(BB, TII->get(is64bit ? PPC::ADD4 : PPC::ADD8), PPC::R0) > >> + .addReg(incr).addReg(dest); > >> + BuildMI(BB, TII->get(is64bit ? PPC::STDCX : PPC::STWCX)) > >> + .addReg(PPC::R0).addReg(ptrA).addReg(ptrB); > >> > >> The second instruction defines R0 and the 3rd reads R0...

[LLVMdev] problem trying to write an LLVM register-allocation pass

2012 Nov 11

Hi Susan, It looks like the bitcode you have attached is corrupted. You should make sure to attach it as a binary file. Alternatively you can attach the LLVM assembly as text. You can generate an assembly file from bitcode with: llvm-dis -o <asm file> <bitcode> Regards, Lang. On Fri, Nov 9, 2012 at 11:15 AM, Susan Horwitz <horwitz at cs.wisc.edu> wrote: > Thanks Lang,

[LLVMdev] problem trying to write an LLVM register-allocation pass

2012 Nov 11

....end: ; preds = %while.cond %call2 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([7 x i8]* @.str, i32 0, i32 0), i32 %x.0) %add = add nsw i32 %x.0, 5 %mul = mul nsw i32 %x.0, 2 %sub = sub nsw i32 %mul, 1 %mul3 = mul nsw i32 %add, %sub %add4 = add nsw i32 %x.0, %mul3 %div5 = sdiv i32 %add4, %x.0 %add6 = add nsw i32 5, %add %sub7 = sub nsw i32 %div5, %add6 %add8 = add nsw i32 %add4, %sub7 %add9 = add nsw i32 %add8, %x.0 %add10 = add nsw i32 %add9, %add %add11 = add nsw i32 %add10, %sub %call12 = call i32 (i8*, ...)* @pri...

[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC

2008 Jul 10

...arx dest, ptr + // add r0, dest, incr + // st[wd]cx. r0, ptr + // bne- loopMBB + // fallthrough --> exitMBB + BB = loopMBB; + BuildMI(BB, TII->get(is64bit ? PPC::LDARX : PPC::LWARX), dest) + .addReg(ptrA).addReg(ptrB); + BuildMI(BB, TII->get(is64bit ? PPC::ADD4 : PPC::ADD8), TmpReg) + .addReg(incr).addReg(dest); + BuildMI(BB, TII->get(is64bit ? PPC::STDCX : PPC::STWCX)) + .addReg(TmpReg).addReg(ptrA).addReg(ptrB); + BuildMI(BB, TII->get(PPC::BCC)) + .addImm(PPC::PRED_NE).addReg(PPC::CR0).addMBB(loopMBB); + BB->addSucce...

[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC

2008 Jul 08

PPCTargetLowering::EmitInstrWithCustomInserter has a reference to the current MachineFunction for other purposes. Can you use MachineFunction::getRegInfo instead? Dan On Jul 8, 2008, at 1:56 PM, Gary Benson wrote: > Would it be acceptable to change MachineInstr::getRegInfo from private > to public so I can use it from > PPCTargetLowering::EmitInstrWithCustomInserter? > >

[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC

2008 Jul 11

...arx dest, ptr + // add r0, dest, incr + // st[wd]cx. r0, ptr + // bne- loopMBB + // fallthrough --> exitMBB + BB = loopMBB; + BuildMI(BB, TII->get(is64bit ? PPC::LDARX : PPC::LWARX), dest) + .addReg(ptrA).addReg(ptrB); + BuildMI(BB, TII->get(is64bit ? PPC::ADD4 : PPC::ADD8), TmpReg) + .addReg(incr).addReg(dest); + BuildMI(BB, TII->get(is64bit ? PPC::STDCX : PPC::STWCX)) + .addReg(TmpReg).addReg(ptrA).addReg(ptrB); + BuildMI(BB, TII->get(PPC::BCC)) + .addImm(PPC::PRED_NE).addReg(PPC::CR0).addMBB(loopMBB); + BB->addSucce...

[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC

2008 Jul 11

Hi Gary, This does not patch cleanly for me (PPCISelLowering.cpp). Can you prepare an updated patch? Thanks, Evan On Jul 10, 2008, at 11:45 AM, Gary Benson wrote: > Cool, that worked. New patch attached... > > Cheers, > Gary > > Evan Cheng wrote: >> Just cast both values to const TargetRegisterClass*. >> >> Evan >> >> On Jul 10, 2008, at 7:36

[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC

2008 Jul 10

Just cast both values to const TargetRegisterClass*. Evan On Jul 10, 2008, at 7:36 AM, Gary Benson wrote: > Evan Cheng wrote: >> How about? >> >> const TargetRegisterClass *RC = is64Bit ? &PPC::GPRCRegClass : >> &PPC::G8RCRegClass; >> unsigned TmpReg = RegInfo.createVirtualRegister(RC); > > I tried something like that yesterday: > > const

[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC

2008 Jul 10

Evan Cheng wrote: > How about? > > const TargetRegisterClass *RC = is64Bit ? &PPC::GPRCRegClass : > &PPC::G8RCRegClass; > unsigned TmpReg = RegInfo.createVirtualRegister(RC); I tried something like that yesterday: const TargetRegisterClass *RC = is64bit ? &PPC::GPRCRegClass : &PPC::G8RCRegClass; but I kept getting this error no matter how I arranged it:
