search for: ld2

Displaying 20 results from an estimated 76 matches for "ld2".

2015 May 05
2
[LLVMdev] [AArch64] Should we restrict to the pointer type used in ldN/stN intrinsics?
...The ldN-like intrinsics (including all the ld1xN, ldN, ldNlane, ldNr, stN, stNlane) can use any pointer types. The definition (in IntrinsicsAArch64.td) of such intrinsics uses 'LLVMAnyPointerType', which means we can pass any pointer type to such intrinsics. E.g. I tried the following case ld2.ll: define { <4 x i32>, <4 x i32> } @test(float* %ptr) { %vld2 = call { <4 x i32>, <4 x i32> } @llvm.aarch64.neon.ld2.v4i32.p0f32(float* %ptr) ret { <4 x i32>, <4 x i32> } %vld2 } declare { <4 x i32>, <4 x i32> } @llvm.aarch64.neon.ld2.v4i32....
2010 Nov 08
2
[LLVMdev] [LLVMDev] Register Allocation and copy instructions
...1: # Machine code for function test5: Frame Objects: fi#-2: size=2, align=4, fixed, at location [SP+8] fi#-1: size=2, align=8, fixed, at location [SP+4] Function Live Outs: %AX BB#0: derived from LLVM BB %entry %reg16390<def> = MOVZX32rm16 <fi#-2>, 1, %reg0, 0, %reg0; mem:LD2[FixedStack-2] GR32:%reg16390 %reg16385<def> = COPY %reg16390:sub_16bit<kill>; GR16:%reg16385 GR32:%reg16390 %reg16391<def> = MOVZX32rm16 <fi#-1>, 1, %reg0, 0, %reg0; mem:LD2[FixedStack-1] GR32:%reg16391 %reg16384<def> = COPY %reg16391:sub_16bit...
2011 Jul 27
2
[LLVMdev] Avoiding load narrowing in DAGCombiner
...of a word on my little-endian target, I emit an LD4 from the word-aligned address and an SRL 16 to shift the i16 into the LSbits of the register. DAGCombine visit()s an ISD::SRL node and notices that it is right-shifting the result of an LD4 from %arrayidx4 by 16 bits, and replaces it with an LD2 from %arrayidx+2. Replaces -------- 0x17f7070: i32,ch = load 0x17faa00, 0x17f7f70, 0x17f6a70<LD4[%arrayidx4]> 0x17f94c0: i32 = Constant<16> [ORD=9] [ID=10] 0x17f7470: i32 = srl 0x17f7070, 0x17f94c0 With ---- 0x17fceb0: i32,ch = load 0x17faa00, 0x17fac00, 0x17f6a70<LD2[%arrayidx4+2...
2010 Jun 03
2
[LLVMdev] Unused argument registers can not be reused ?
...egister: R14B dead +[0,3:0) livein register: R13W live through +[0,40:0) livein register: R13B dead +[0,3:0) livein register: R12W live through +[0,40:0) livein register: R12B dead +[0,3:0) 4 %reg1028<def> = MOV16rm %reg0, <ga:@b>; mem:LD2[@b] register: %reg1028 +[6,14:0) 12 %reg1029<def> = MOV16rr %reg1028<kill> register: %reg1029 +[14,30:0) 20 %reg1029<def> = ADD16rm %reg1029, %reg0, <ga:@a>, %SRW<imp-def>; mem:LD2[@a] register: %reg1029 replace range with [14,...
2012 Sep 20
2
[LLVMdev] Scheduling question (memory dependency)
...0B BB#0: derived from LLVM BB %entry Live Ins: %X3 %X4 16B %vreg2<def> = COPY %X4; G8RC_with_sub_32:%vreg2 32B %vreg1<def> = COPY %X3; G8RC:%vreg1 48B STH8 %vreg1<kill>, 0, <fi#-1>; mem:ST2[FixedStack-1] G8RC:%vreg1 64B %vreg4<def> = LHA 0, <fi#-1>; mem:LD2[%0] GPRC:%vreg4 ... --------------------------------------------------------------- So far, so good. When we get to list scheduling, not quite so good: --------------------------------------------------------------- ********** List Scheduling ********** SU(0): STH8 %X3<kill...
2010 Jun 03
0
[LLVMdev] FW: Unused argument registers can not be reused ?
...d from LLVM BB %entry Live Ins: %R15W %R14W %R13W %R12W %reg1027<def> = MOV16rr %R12W %reg1026<def> = MOV16rr %R13W %reg1025<def> = MOV16rr %R14W %reg1024<def> = MOV16rr %R15W %reg1028<def> = MOV16rm %reg0, <ga:@b>; mem:LD2[@b] %reg1029<def> = ADD16rm %reg1028, %reg0, <ga:@a>, %SRW<imp-def,dead>; mem:LD2[@a] SUB16mr %reg0, <ga:@r>, %reg1029, %SRW<imp-def,dead>; mem:ST2[@r] LD2[@r] RET # End machine code for function test. # Machine code for function test: Functio...
2012 Sep 21
0
[LLVMdev] Scheduling question (memory dependency)
...------------------------------------------------------------ 16B %vreg2<def> = COPY %X4; G8RC_with_sub_32:%vreg2 32B %vreg1<def> = COPY %X3; G8RC:%vreg1 48B STH8 %vreg1<kill>, 0, <fi#-1>; mem:ST2[FixedStack-1] G8RC:%vreg1 64B %vreg0<def> = LHZ 0, <fi#-1>; mem:LD2[%i11] GPRC:%vreg0 ... ------------------------------------------------------------------ Note the %i11 instead of %0 on the LHZ as another difference. The scheduler then generates a dependency between the store and the load, and everything works properly. Does this help tickle an...
2012 Sep 21
2
[LLVMdev] Scheduling question (memory dependency)
...----------------------------------- > 16B %vreg2<def> = COPY %X4; G8RC_with_sub_32:%vreg2 > 32B %vreg1<def> = COPY %X3; G8RC:%vreg1 > 48B STH8 %vreg1<kill>, 0, <fi#-1>; mem:ST2[FixedStack-1] > G8RC:%vreg1 > 64B %vreg0<def> = LHZ 0, <fi#-1>; mem:LD2[%i11] GPRC:%vreg0 > ... > ------------------------------------------------------------------ > > Note the %i11 instead of %0 on the LHZ as another difference. The > scheduler then generates a dependency between the store and the load, > and everything works prope...
2011 Jul 27
0
[LLVMdev] Avoiding load narrowing in DAGCombiner
...le-endian target, I emit an LD4 from the word-aligned address > and an SRL 16 to shift the i16 into the LSbits of the register. > > DAGCombine visit()s an ISD::SRL node and notices that it is > right-shifting the result of an LD4 from %arrayidx4 by 16 bits, and > replaces it with an LD2 from %arrayidx+2. > > Replaces > -------- > 0x17f7070: i32,ch = load 0x17faa00, 0x17f7f70, 0x17f6a70<LD4[%arrayidx4]> > 0x17f94c0: i32 = Constant<16> [ORD=9] [ID=10] > 0x17f7470: i32 = srl 0x17f7070, 0x17f94c0 > > With > ---- > 0x17fceb0: i32,ch = load 0x17...
2011 Jul 27
2
[LLVMdev] Avoiding load narrowing in DAGCombiner
...emit an LD4 from the word-aligned address >> and an SRL 16 to shift the i16 into the LSbits of the register. >> >> DAGCombine visit()s an ISD::SRL node and notices that it is >> right-shifting the result of an LD4 from %arrayidx4 by 16 bits, and >> replaces it with an LD2 from %arrayidx+2. >> >> Replaces >> -------- >> 0x17f7070: i32,ch = load 0x17faa00, 0x17f7f70, 0x17f6a70<LD4[%arrayidx4]> >> 0x17f94c0: i32 = Constant<16> [ORD=9] [ID=10] >> 0x17f7470: i32 = srl 0x17f7070, 0x17f94c0 >> >> With >> --...
2012 Sep 21
2
[LLVMdev] Scheduling question (memory dependency)
...16B %vreg2<def> = COPY %X4; G8RC_with_sub_32:%vreg2 > > > 32B %vreg1<def> = COPY %X3; G8RC:%vreg1 > > > 48B STH8 %vreg1<kill>, 0, <fi#-1>; mem:ST2[FixedStack-1] > > > G8RC:%vreg1 > > > 64B %vreg0<def> = LHZ 0, <fi#-1>; mem:LD2[%i11] GPRC:%vreg0 > > > ... > > > ------------------------------------------------------------------ > > > > > > Note the %i11 instead of %0 on the LHZ as another difference. The > > > scheduler then generates a dependency between the s...
2012 Sep 21
0
[LLVMdev] Scheduling question (memory dependency)
...---------- > > 16B %vreg2<def> = COPY %X4; G8RC_with_sub_32:%vreg2 > > 32B %vreg1<def> = COPY %X3; G8RC:%vreg1 > > 48B STH8 %vreg1<kill>, 0, <fi#-1>; mem:ST2[FixedStack-1] > > G8RC:%vreg1 > > 64B %vreg0<def> = LHZ 0, <fi#-1>; mem:LD2[%i11] GPRC:%vreg0 > > ... > > ------------------------------------------------------------------ > > > > Note the %i11 instead of %0 on the LHZ as another difference. The > > scheduler then generates a dependency between the store and the load, > ...
2010 Nov 08
0
[LLVMdev] [LLVMDev] Register Allocation and copy instructions
...ke this move during register allocation, or how can I tell from (1) that I cannot execute %reg16385<def> = COPY %reg16390. Furthermore, how should I handle this case. > BB#0: derived from LLVM BB %entry > %reg16390<def> = MOVZX32rm16 <fi#-2>, 1, %reg0, 0, %reg0; mem:LD2[FixedStack-2] GR32:%reg16390 > %reg16385<def> = COPY %reg16390:sub_16bit<kill>; GR16:%reg16385 GR32:%reg16390 > %reg16391<def> = MOVZX32rm16 <fi#-1>, 1, %reg0, 0, %reg0; mem:LD2[FixedStack-1] GR32:%reg16391 > %reg16384<def> = COPY %reg1...
2004 May 24
1
discriminant analysis
...eed the predicted values to produce a plot of the analysis, as far as I know. Here is my code: pcor.lda2<-lda(pcor~habarea+hcom+isol+flowcov+herbh+inclin+windprot+shrubcov+baregr, data=pcor.df, CV=T) table2<-table(pcor.df$pcor, pcor.lda2$class) table2 #doesn't work, because CV=True? pcor.ld2<-predict(pcor.lda2, dimen=1)$x plot(pcor.ld2) plot(pcor.lda2, type="density", dimen=1) #kernel density estimates I am happy if I get an answer from somebody! Stefanie von Felten Institut für Umweltwissenschaften Universität Zürich Winterthurerstrasse 190 CH-8057 Zürich e-mail: sfe...
2012 Sep 21
0
[LLVMdev] Scheduling question (memory dependency)
...> = COPY %X4; G8RC_with_sub_32:%vreg2 > > > > 32B %vreg1<def> = COPY %X3; G8RC:%vreg1 > > > > 48B STH8 %vreg1<kill>, 0, <fi#-1>; mem:ST2[FixedStack-1] > > > > G8RC:%vreg1 > > > > 64B %vreg0<def> = LHZ 0, <fi#-1>; mem:LD2[%i11] GPRC:%vreg0 > > > > ... > > > > ------------------------------------------------------------------ > > > > > > > > Note the %i11 instead of %0 on the LHZ as another difference. The > > > > scheduler then generates a...
2015 Mar 03
3
[LLVMdev] ReduceLoadWidth, DAGCombiner and non 8bit loads/extloads question.
...o undo the optimization added by ReduceLoadWidth. The 2nd approach seems more in line with what LLVM infrastructure wants but it seems silly to have to undo an optimization? Essentially, we have some bit packing structures and the code is trying to get the upper bits. The initial dag generates an LD2 with srl (which makes sense, it's what we want). The DAGCombiner then goes in and changes that LD2 with srl to an LD1 zextload, which we don't support. Why is LegalOperations really needed here? What is the purpose and point of this? It seems you could eliminate this and be all the better...
2011 Jul 27
0
[LLVMdev] Avoiding load narrowing in DAGCombiner
...e word-aligned address >>> and an SRL 16 to shift the i16 into the LSbits of the register. >>> >>> DAGCombine visit()s an ISD::SRL node and notices that it is >>> right-shifting the result of an LD4 from %arrayidx4 by 16 bits, and >>> replaces it with an LD2 from %arrayidx+2. >>> >>> Replaces >>> -------- >>> 0x17f7070: i32,ch = load 0x17faa00, 0x17f7f70, 0x17f6a70<LD4[%arrayidx4]> >>> 0x17f94c0: i32 = Constant<16> [ORD=9] [ID=10] >>> 0x17f7470: i32 = srl 0x17f7070, 0x17f94c0 >>...
2006 Apr 04
0
Fisher's discriminant functions
...> data.lda Call: lda(group ~ (v1 + v2 + v3), data = tmp.data) Prior probabilities of groups: 1 2 3 0.3333333 0.3333333 0.3333333 Group means: v1 v2 v3 1 2.0 3.5 5.5 2 10.0 8.5 8.5 3 20.5 14.5 15.0 Coefficients of linear discriminants: LD1 LD2 v1 0.8294354 0.6168736 v2 2.8623498 -1.7696711 v3 -0.7612283 0.8423363 Proportion of trace: LD1 LD2 0.9984 0.0016 In this example, I get 2 functions: LD1 and LD2 as the canonical functions for 3 groups, and what I'd need is 3 functions for my 3 groups (Fisher's discriminant fu...
2012 Nov 08
0
mirt vs. eRm vs. ltm vs. winsteps
...ound(erm[order(erm$loc,decreasing=TRUE),],2) erm<-erm[c(2:4,1)] erm #I get the following order of item parameters: x2,x3,x7,x1,x4,x5,x6 library(ltm) grm<-grm(as.data.frame(pcmdat),constrained=TRUE,IRT.param=TRUE) ltm<-as.data.frame(unlist(coef.grm(grm))) ld1<-ltm[c(1,5,9,13,17,21,25),] ld2<-ltm[c(2,6,10,14,18,22,26),] ld3<-ltm[c(3,7,11,15,19,23,NA),] ltm<-as.data.frame(cbind(ld1,ld2,ld3)) ltm[7,3]<-0.5*rowSums(ltm[7,1:2])#to get the mean of ld1+ld2 when dividing by 3 names(ltm)<-c("thres1","thres2","thres3") rownames(ltm)<-c("x1...
2016 Mar 11
3
masked-load endpoints optimization
...lse. Note that the x86 backend already does this, so either my proposal is ok for x86, or we're already doing an illegal optimization: define <4 x i32> @load_bonus_bytes(i32* %addr1, <4 x i32> %v) { %ld1 = load i32, i32* %addr1 %addr2 = getelementptr i32, i32* %addr1, i64 3 %ld2 = load i32, i32* %addr2 %vec1 = insertelement <4 x i32> undef, i32 %ld1, i32 0 %vec2 = insertelement <4 x i32> %vec1, i32 %ld2, i32 3 ret <4 x i32> %vec2 } $ ./llc -o - loadcombine.ll ... movups (%rdi), %xmm0 retq On Thu, Mar 10, 2016 at 10:22 PM, Nema, Ashut...