Displaying 20 results from an estimated 26 matches for "add4".
2009 Apr 16
2
[LLVMdev] Help me improve two-address code
...asic function:
int
foo (int a, int b, int c, int d)
{
return a + b - c + d;
}
clang-cc -O2 yields:
define i32 @foo(i32 %a, i32 %b, i32 %c, i32 %d) nounwind readnone {
entry:
%add = add i32 %b, %a ; <i32> [#uses=1]
%sub = sub i32 %add, %c ; <i32> [#uses=1]
%add4 = add i32 %sub, %d ; <i32> [#uses=1]
ret i32 %add4
}
which lowers to this assembler code (note: args arrive in r1..r12, and
results are returned in r1..r3.):
foo:
add r2,r1 ### add r1,r2 is better
sub r2,r3
mov r1,r2 ### unnecessary!!
add r1,r4
jmp...
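A minimal sketch of the two-address problem in the post above, written in Python rather than the poster's target assembly. It assumes the first operand of each instruction is both a source and the destination (so "add r2,r1" means r2 = r2 + r1), which is inferred from the listing and not stated in it. The `run` helper and the tuple encoding are hypothetical, introduced only for illustration.

```python
# Toy interpreter for a two-address machine. Assumption: the first operand
# is both a source and the destination, e.g. add r2,r1 means r2 += r1.
def run(program, a, b, c, d):
    # Arguments arrive in r1..r4; the result is read back from r1.
    regs = {"r1": a, "r2": b, "r3": c, "r4": d}
    for op, dst, src in program:
        if op == "add":
            regs[dst] += regs[src]
        elif op == "sub":
            regs[dst] -= regs[src]
        elif op == "mov":
            regs[dst] = regs[src]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs["r1"]


# Sequence from the post: the sum is built in r2, so a copy into r1 is needed.
as_generated = [
    ("add", "r2", "r1"),
    ("sub", "r2", "r3"),
    ("mov", "r1", "r2"),
    ("add", "r1", "r4"),
]

# Commuting the first add builds the sum in r1 directly, so no mov is needed.
commuted = [
    ("add", "r1", "r2"),
    ("sub", "r1", "r3"),
    ("add", "r1", "r4"),
]

if __name__ == "__main__":
    print(run(as_generated, 1, 2, 3, 4), run(commuted, 1, 2, 3, 4))
```

Both sequences compute a + b - c + d. The second is one instruction shorter because the addition is commutative and its result can be placed in the return register from the start.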
2008 Jul 08
3
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
...do<
>
> The "let Uses = [R0]" is not needed. The pseudo instruction will be
> expanded like this later:
>
> + BuildMI(BB, TII->get(is64bit ? PPC::LDARX : PPC::LWARX), dest)
> + .addReg(ptrA).addReg(ptrB);
> + BuildMI(BB, TII->get(is64bit ? PPC::ADD4 : PPC::ADD8), PPC::R0)
> + .addReg(incr).addReg(dest);
> + BuildMI(BB, TII->get(is64bit ? PPC::STDCX : PPC::STWCX))
> + .addReg(PPC::R0).addReg(ptrA).addReg(ptrB);
>
> The second instruction defines R0 and the 3rd reads R0 which is
> enough to tell the register al...
2008 Jul 08
0
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
...The "let Uses = [R0]" is not needed. The pseudo instruction will be
>> expanded like this later:
>>
>> + BuildMI(BB, TII->get(is64bit ? PPC::LDARX : PPC::LWARX), dest)
>> + .addReg(ptrA).addReg(ptrB);
>> + BuildMI(BB, TII->get(is64bit ? PPC::ADD4 : PPC::ADD8), PPC::R0)
>> + .addReg(incr).addReg(dest);
>> + BuildMI(BB, TII->get(is64bit ? PPC::STDCX : PPC::STWCX))
>> + .addReg(PPC::R0).addReg(ptrA).addReg(ptrB);
>>
>> The second instruction defines R0 and the 3rd reads R0 which is
>> enough...
2017 Mar 15
2
Data structure improvement for the SLP vectorizer
...double %load20, %load10
%mul2 = fmul fast double %load21, %load11
%mul3 = fmul fast double %load22, %load10
%mul4 = fmul fast double %load23, %load11
%add1 = fadd fast double %load30, %mul1
%add2 = fadd fast double %load31, %mul2
%add3 = fadd fast double %load32, %mul3
%add4 = fadd fast double %load33, %mul4
%out0 = getelementptr inbounds double, double* %out, i32 0
%out1 = getelementptr inbounds double, double* %out, i32 1
%out2 = getelementptr inbounds double, double* %out, i32 2
%out3 = getelementptr inbounds double, double* %out, i32 3
store d...
2009 Apr 16
0
[LLVMdev] Help me improve two-address code
...d)
> {
> return a + b - c + d;
> }
>
> clang-cc -O2 yields:
>
> define i32 @foo(i32 %a, i32 %b, i32 %c, i32 %d) nounwind readnone {
> entry:
> %add = add i32 %b, %a ; <i32> [#uses=1]
> %sub = sub i32 %add, %c ; <i32> [#uses=1]
> %add4 = add i32 %sub, %d ; <i32> [#uses=1]
> ret i32 %add4
> }
>
> which lowers to this assembler code (note: args arrive in r1..r12, and
> results are returned in r1..r3.):
>
> foo:
> add r2,r1 ### add r1,r2 is better
> sub r2,r3
> mov r1,r2...
2008 Jul 04
0
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
...+ def ATOMIC_LOAD_ADD_I32 : Pseudo<
The "let Uses = [R0]" is not needed. The pseudo instruction will be
expanded like this later:
+ BuildMI(BB, TII->get(is64bit ? PPC::LDARX : PPC::LWARX), dest)
+ .addReg(ptrA).addReg(ptrB);
+ BuildMI(BB, TII->get(is64bit ? PPC::ADD4 : PPC::ADD8), PPC::R0)
+ .addReg(incr).addReg(dest);
+ BuildMI(BB, TII->get(is64bit ? PPC::STDCX : PPC::STWCX))
+ .addReg(PPC::R0).addReg(ptrA).addReg(ptrB);
The second instruction defines R0 and the 3rd reads R0 which is enough
to tell the register allocator what to do.
I do ha...
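The point about R0 being defined by the second instruction and read by the third can be illustrated with a small def/use scan. This is only a sketch in Python, not how LLVM's register allocator is implemented. The operand roles are an assumption read off the BuildMI calls (the register passed to BuildMI is treated as the def, the addReg operands as uses), and `live_ranges` is a hypothetical helper.

```python
# Each instruction is (name, defs, uses). Operand roles are an assumption:
# the register passed to BuildMI is the def, the addReg operands are uses.
sequence = [
    ("load-reserve", ["dest"], ["ptrA", "ptrB"]),
    ("add", ["R0"], ["incr", "dest"]),
    ("store-conditional", [], ["R0", "ptrA", "ptrB"]),
]


def live_ranges(instrs):
    """Map each register defined in instrs to (def index, last use index)."""
    ranges = {}
    for i, (_name, defs, uses) in enumerate(instrs):
        for reg in uses:
            if reg in ranges:  # extend the range of an earlier definition
                ranges[reg] = (ranges[reg][0], i)
        for reg in defs:
            ranges[reg] = (i, i)  # a (re)definition starts a new range
    return ranges


if __name__ == "__main__":
    for reg, (start, end) in live_ranges(sequence).items():
        print(f"{reg}: defined at {start}, last used at {end}")
```

Registers that are never defined inside the sequence (ptrA, ptrB, incr) are treated as live-in and ignored. The scan recovers R0's range from the operands alone, which is the sense in which a separate "Uses" annotation adds no information here.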
2003 Mar 03
0
lm, gee and lme
...nce is partially influenced by group membership. My
understanding here is that ignoring nonindependence (i.e., using lm)
actually results in SE estimates that are too large, while modeling the
nonindependence reduces SE and increases power.
Here is an example:
# lme model
> mod.lme<-lme(GWB.ADD4~HOR,random=~1|GRP,data=TBH)
> VarCorr(mod.lme)
GRP = pdLogChol(1)
Variance StdDev
(Intercept) 0.3160445 0.5621783
Residual 0.7449425 0.8631005
> 0.3160445/(0.3160445+0.7449425)
[1] 0.2978778 #Note the large ICC (high nonindependence)
> summary(mod.lme)$tTable...
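The ICC arithmetic in the R session above can be reproduced as a short check. This sketch is in Python rather than R; the function name `icc` is hypothetical, and the two variances are copied from the VarCorr output shown in the post.

```python
def icc(group_variance, residual_variance):
    """Intraclass correlation: share of total variance due to the grouping factor."""
    return group_variance / (group_variance + residual_variance)


# Variance components copied from the VarCorr(mod.lme) output in the post.
intercept_variance = 0.3160445
residual_variance = 0.7449425

if __name__ == "__main__":
    print(round(icc(intercept_variance, residual_variance), 7))
```

The result agrees with the 0.2978778 printed in the session: roughly 30% of the variance in the outcome lies between groups.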
2017 Mar 15
2
Data structure improvement for the SLP vectorizer
There was some discussion of this on the llvm-commits list, but I
wanted to raise the topic for discussion here. The background of the
-commits discussion was that r296863 added the ability to sort memory
access when the SLP vectorizer reached a load (the SLP vectorizer
starts at a store or some other sink, and tries to go up the tree
vectorizing as it goes along - if the input is in a different
2012 Jun 08
2
[LLVMdev] Strong vs. default phi elimination and single-reg classes
...to CFG: BB#0 BB#1
%vreg12<def> = PHI %vreg13, <BB#1>, %vreg11,
<BB#0>;CTRRC8:%vreg12,%vreg13,%vreg11
%vreg5<def> = LDtoc <ga:@a>, %X2; G8RC:%vreg5
%vreg6<def> = LWZ 0, %vreg5; mem:Volatile LD4[@a](tbaa=!"int")
GPRC:%vreg6 G8RC:%vreg5
%vreg7<def> = ADD4 %vreg6<kill>, %vreg3; GPRC:%vreg7,%vreg6,%vreg3
STW %vreg7<kill>, 0, %vreg5<kill>; mem:Volatile ST4[@a](tbaa=!"int")
GPRC:%vreg7 G8RC:%vreg5
%vreg13<def> = COPY %vreg12<kill>; CTRRC8:%vreg13,%vreg12
%vreg13<def> = BDNZ8 %vreg13, <BB#1>; CTRRC8:%vr...
2011 Jun 14
0
[LLVMdev] Too many load/store in Machine code represtation
...n stack
slot. The following output was produced with the "llc -march=ppc32" command and dumped
from the MachineFunction.
%reg16384<def> = LWZ 0, <fi#6>; mem:LD4[%b] GPRC:%reg16384
%reg16385<def> = LWZ 0, <fi#5>; mem:LD4[%c] GPRC:%reg16385
%reg16386<def> = ADD4 %reg16384<kill>, %reg16385<kill>;
GPRC:%reg16386,16384,16385
STW %reg16386<kill>, 0, <fi#7>; mem:ST4[%a] GPRC:%reg16386
I am interested in this method because it seems smart and differs from
what compiler textbooks describe. For understanding how it generated the LWZ and ST...
2008 Jul 02
2
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
...arx dest, ptr
+ // add r0, dest, incr
+ // st[wd]cx. r0, ptr
+ // bne- loopMBB
+ // fallthrough --> exitMBB
+ BB = loopMBB;
+ BuildMI(BB, TII->get(is64bit ? PPC::LDARX : PPC::LWARX), dest)
+ .addReg(ptrA).addReg(ptrB);
+ BuildMI(BB, TII->get(is64bit ? PPC::ADD4 : PPC::ADD8), PPC::R0)
+ .addReg(incr).addReg(dest);
+ BuildMI(BB, TII->get(is64bit ? PPC::STDCX : PPC::STWCX))
+ .addReg(PPC::R0).addReg(ptrA).addReg(ptrB);
+ BuildMI(BB, TII->get(PPC::BCC))
+ .addImm(PPC::PRED_NE).addReg(PPC::CR0).addMBB(loopMBB);
+ BB->addSuc...
2008 Jul 08
2
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
...]" is not needed. The pseudo instruction will be
> >> expanded like this later:
> >>
> >> + BuildMI(BB, TII->get(is64bit ? PPC::LDARX : PPC::LWARX), dest)
> >> + .addReg(ptrA).addReg(ptrB);
> >> + BuildMI(BB, TII->get(is64bit ? PPC::ADD4 : PPC::ADD8), PPC::R0)
> >> + .addReg(incr).addReg(dest);
> >> + BuildMI(BB, TII->get(is64bit ? PPC::STDCX : PPC::STWCX))
> >> + .addReg(PPC::R0).addReg(ptrA).addReg(ptrB);
> >>
> >> The second instruction defines R0 and the 3rd reads R0...
2012 Nov 11
2
[LLVMdev] problem trying to write an LLVM register-allocation pass
Hi Susan,
It looks like the bitcode you have attached is corrupted. You should make
sure to attach it as a binary file. Alternatively you can attach the LLVM
assembly as text. You can generate an assembly file from bitcode with:
llvm-dis -o <asm file> <bitcode>
Regards,
Lang.
On Fri, Nov 9, 2012 at 11:15 AM, Susan Horwitz <horwitz at cs.wisc.edu> wrote:
> Thanks Lang,
2012 Nov 11
0
[LLVMdev] problem trying to write an LLVM register-allocation pass
....end: ; preds = %while.cond
%call2 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([7 x i8]* @.str, i32 0, i32 0), i32 %x.0)
%add = add nsw i32 %x.0, 5
%mul = mul nsw i32 %x.0, 2
%sub = sub nsw i32 %mul, 1
%mul3 = mul nsw i32 %add, %sub
%add4 = add nsw i32 %x.0, %mul3
%div5 = sdiv i32 %add4, %x.0
%add6 = add nsw i32 5, %add
%sub7 = sub nsw i32 %div5, %add6
%add8 = add nsw i32 %add4, %sub7
%add9 = add nsw i32 %add8, %x.0
%add10 = add nsw i32 %add9, %add
%add11 = add nsw i32 %add10, %sub
%call12 = call i32 (i8*, ...)* @pri...
2008 Jul 10
2
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
...arx dest, ptr
+ // add r0, dest, incr
+ // st[wd]cx. r0, ptr
+ // bne- loopMBB
+ // fallthrough --> exitMBB
+ BB = loopMBB;
+ BuildMI(BB, TII->get(is64bit ? PPC::LDARX : PPC::LWARX), dest)
+ .addReg(ptrA).addReg(ptrB);
+ BuildMI(BB, TII->get(is64bit ? PPC::ADD4 : PPC::ADD8), TmpReg)
+ .addReg(incr).addReg(dest);
+ BuildMI(BB, TII->get(is64bit ? PPC::STDCX : PPC::STWCX))
+ .addReg(TmpReg).addReg(ptrA).addReg(ptrB);
+ BuildMI(BB, TII->get(PPC::BCC))
+ .addImm(PPC::PRED_NE).addReg(PPC::CR0).addMBB(loopMBB);
+ BB->addSucce...
2008 Jul 08
0
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
PPCTargetLowering::EmitInstrWithCustomInserter has a reference
to the current MachineFunction for other purposes. Can you use
MachineFunction::getRegInfo instead?
Dan
On Jul 8, 2008, at 1:56 PM, Gary Benson wrote:
> Would it be acceptable to change MachineInstr::getRegInfo from private
> to public so I can use it from
> PPCTargetLowering::EmitInstrWithCustomInserter?
>
>
2008 Jul 11
2
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
...arx dest, ptr
+ // add r0, dest, incr
+ // st[wd]cx. r0, ptr
+ // bne- loopMBB
+ // fallthrough --> exitMBB
+ BB = loopMBB;
+ BuildMI(BB, TII->get(is64bit ? PPC::LDARX : PPC::LWARX), dest)
+ .addReg(ptrA).addReg(ptrB);
+ BuildMI(BB, TII->get(is64bit ? PPC::ADD4 : PPC::ADD8), TmpReg)
+ .addReg(incr).addReg(dest);
+ BuildMI(BB, TII->get(is64bit ? PPC::STDCX : PPC::STWCX))
+ .addReg(TmpReg).addReg(ptrA).addReg(ptrB);
+ BuildMI(BB, TII->get(PPC::BCC))
+ .addImm(PPC::PRED_NE).addReg(PPC::CR0).addMBB(loopMBB);
+ BB->addSucce...
2008 Jul 11
0
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
Hi Gary,
This does not patch cleanly for me (PPCISelLowering.cpp). Can you
prepare an updated patch?
Thanks,
Evan
On Jul 10, 2008, at 11:45 AM, Gary Benson wrote:
> Cool, that worked. New patch attached...
>
> Cheers,
> Gary
>
> Evan Cheng wrote:
>> Just cast both values to const TargetRegisterClass*.
>>
>> Evan
>>
>> On Jul 10, 2008, at 7:36
2008 Jul 10
0
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
Just cast both values to const TargetRegisterClass*.
Evan
On Jul 10, 2008, at 7:36 AM, Gary Benson wrote:
> Evan Cheng wrote:
>> How about?
>>
>> const TargetRegisterClass *RC = is64Bit ? &PPC::GPRCRegClass :
>> &PPC::G8RCRegClass;
>> unsigned TmpReg = RegInfo.createVirtualRegister(RC);
>
> I tried something like that yesterday:
>
> const
2008 Jul 10
2
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
Evan Cheng wrote:
> How about?
>
> const TargetRegisterClass *RC = is64Bit ? &PPC::GPRCRegClass :
> &PPC::G8RCRegClass;
> unsigned TmpReg = RegInfo.createVirtualRegister(RC);
I tried something like that yesterday:
const TargetRegisterClass *RC =
is64bit ? &PPC::GPRCRegClass : &PPC::G8RCRegClass;
but I kept getting this error no matter how I arranged it: