thr3ads.net - R help - [R] QA [May 2013]

If this information is useful, please help other people find it:
Share via:

arun

2013-May-26 01:03 UTC

[R] QA

Hi,
I hope this works for you.
fun1<- function(x){
??? big<- x>0.8*max(x) 
??? n<- length(big)
??? startRunOfBigs<- which(c(big[1],!big[-n] & big[-1]))
??? endRunOfBigs<- which(c(big[-n] & !big[-1], big[n]))
??? index<- vapply(seq_along(startRunOfBigs),function(i)
which.max(x[startRunOfBigs[i]:endRunOfBigs[i]])+startRunOfBigs[i]-1L,0L)
???? index<-ifelse(sum(is.na(match(index,c(1,12))))==0 &
x[index]!=max(x[index]), NA,index)
??? data.frame(Index=index[!is.na(index)],Value=x[index[!is.na(index)]])
??? ?}

##mm: data
fun3<- function(mat){
??????? indmat<-combn(seq_len(ncol(mat)),2)
??? lst1<- lapply(seq_len(ncol(indmat)),function(i) {mat[,indmat[,i]]})??? 
???
names(lst1)<-as.character(interaction(as.data.frame(t(indmat)),sep="_",drop=TRUE))???????
??? lst2<- lapply(lst1,function(x) {x1<- apply(x,2,fun1)})
??? 
??? lst3<- lapply(lst2,function(x) expand.grid(lapply(x,function(y) y[,1])))
??? lst4<-lapply(lst3,function(x) unlist(x[which.min(apply(x,1,function(y)
abs(diff(y)))),]) )
??? lst5<- lapply(lst4,function(x){ 
??? ??? ??? ??? if(abs(diff(x))>(nrow(mat)/2)){
??? ??? ??? ??? nrow(mat)-abs(diff(x))
??? ??? ??? ??? ??? }
??? ??? ??????????????? else(abs(diff(x)))
??? ??? ??? ??? ??? })
??? ??? 
??? lst6<- lapply(seq_along(lst5),function(i) {
??? ??? ??? ??? x2<-lst1[[i]]
??? ??? ??? ??? if(lst5[[i]]==0) {
??? ??? ??? ??? ??? indx1<- seq(length(x2[,2]))
??? ??? ??? ??? ??? sum(abs(x2[,1]-x2[indx1,2]))
??? ??? ??? ??? ??? }
??? ??? ??? ??? else{
??? ??? ??? ??? ??? lapply(seq(1+lst5[[i]]),function(j){x3<-x2[,2]
??? ??? ??? ??? ??? ??? ??? indx1<-seq(length(x3)-(j-1))
??? ??? ??? ??? ??? ??? ??? indx2<-c(setdiff(seq_along(x3),indx1),indx1)
??? ??? ??? ??? ??? ??? ??? sum(abs(x2[,1]-x2[indx2,2]))
??? ??? ??? ??? ??? ??? ??? })
??? ??? ??? ??? }
??? ??? ??? ??? })??? 
??? ??? ??? ??? 
??? names(lst6)<- names(lst1)
??? lst7<-lapply(lst6,unlist)
??? lst8<- lapply(lst7,function(x) {
??? ??? ??? Seq1<-seq_along(x)
??? ??? ??? if(length(Seq1)==1) x
??? ??? ??? else if(length(Seq1)==2){
??? ??? ??? ??? ??? ??? sum(abs(x[1]-x[2]))??? 
??? ??? ??? ??? ??? ??? }
??? ??? ??? else{ 
??? ??? ??? ??? ind<-rep(Seq1,each=2)[-1]
??? ??? ??? ??? ind1<-ind[-length(ind)]
??? ??? ??? ???
Reduce(`+`,lapply(split(ind1,(seq_along(ind1)-1)%/%2+1),function(i) {
??? ??? ??? ??? ??? ??? ??? ??? ??? abs(diff(x[i]))
??? ??? ??? ??? ??? ??? ??? ??? ??? }))??? ??? ??? ??? ??? ??? ??? 
??? ??? ??? ??? }

??? ??? ??? ??? ??? }
??? ??? ??? )
??? do.call(rbind,lst8)??? 
??? }

fun3(mm)?? #rownames represent the comparison between the particular columns
#????????? [,1]
#1_2? 2.5966051
#1_3? 1.0267435
#1_4? 3.7387830
#1_5? 1.8489204
#1_6? 6.5233654
#2_3? 4.2951411
#2_4? 1.9040790
#2_5? 2.2874235
#2_6? 5.1526016
#3_4? 0.9726777
#3_5? 2.1359229
#3_6? 5.0221450
#4_5? 0.9124638
#4_6? 8.2604187
#5_6 14.0550864


A.K.





________________________________
From: eliza botto <eliza_botto at hotmail.com>
To: "smartpink111 at yahoo.com" <smartpink111 at yahoo.com> 
Sent: Saturday, May 25, 2013 2:14 PM
Subject: QA




Dear Arun,
[text file is attached]
After your help on preparing loop for identifying peaks, here is my latest
question which is linked with my first question. but this time i will try to
make it more clear.
> dput(xx)structure(c(5L, 12L, NA, 4L, NA, NA, 4L, NA, NA, 5L, 9L, 12L,?
6L, NA, NA, 12L, NA, NA), .Dim = c(3L, 6L), .Dimnames = list(
? ? NULL, c("1", "8", "9", "23",
"87", "89")))> dput(mm)structure(c(0.706461987893674, 0.998391468394261, 0.72402995269242,?
1.70874688194537, 1.93906363083693, 0.89540353128442, 0.328327645695443,?
0.427434603701202, 0.591932250254601, 0.444627635494183, 1.44407704434405,?
1.79150336746345, 0.740380661614246, 1.39756784211974, 1.43602731683199,?
2.40482060634346, 1.61684982192949, 0.549848553223765, 0.245763715425745,?
0.315411788974968, 0.390626431538384, 0.369934560068472, 0.769100067815155,?
1.76366863411459, 0.480885978853889, 1.21441674507622, 2.50566408677391,?
3.27361599826255, 1.18508780425679, 0.465943778037697, 0.29380145690883,?
0.36356245877522, 0.373314458026047, 0.334849362386475, 0.882050057788756,?
0.626807814853613, 0.774295647517675, 0.853105130179133, 0.738085443815565,?
1.26063449947807, 1.57350832698427, 0.790095501697794, 0.510641105191147,?
0.874523657118082, 1.31257333325184, 0.882086374572265, 1.13881207205977,?
1.29163890813439, 0.0849732189580101, 0.070591276171845, 0.0926010253161898,?
0.362209761457517, 1.45769283057202, 3.16165004659667, 2.74903557756267,?
1.94633472878995, 1.19319875840883, 0.533232612926756, 0.225531074123974,?
0.122949089115578, 2.06195904001605, 1.41493262330451, 1.35748791897328,?
1.19490680241894, 0.702488756183322, 0.338258418490199, 0.123398398622741,?
0.138548982660226, 0.16170889185798, 0.414543218677095, 1.84629295875002,?
2.24547399004563), .Dim = c(12L, 6L))


You can see that that there are two matrices. "mm" is the actual
matrix and "xx" is the matrix indentifying the peaks of
"mm".For being a peak a value has to either the maximum value or
atleast 80% of the maximum value. you can see that the maximum value of coulmn 1
is in row number 5 and thats what it showed in matrix "xx" whereas,
the 80% of the maximum value is in row number 12 therefore it considered it the
second peak and row number was shown in "xx". i want to calculate the
distance matrix of "mm" in the following way...
The column are continous or cyclic.
The subtraction should start from the peak and should end when the peaks of two
columns are in the same row. The peaks are to be moved towrds eachother in the
shortest possible way.
For suppose the peak of colum 2 is in 4th row and the peak of column 6 is in
12th row. Now moving these two peak towwards eachother requires moving col 2 in
reverse direction or column 6 in forward direction.

For example

Initial:

Col 2

1 2 3 4(max) 5 6 7 8 9 10 11 12

Col 6

1 2 3 4 5 6 7 8 9 10 11 12(max)

a<-sum(abs(col2-col6))

step1:

Col 2

2 3 4(max) 5 6 7 8 9 10 11 12 1

Col 6

1 2 3 4 5 6 7 8 9 10 11 12(max)

b<-sum(abs(col2-col6))

step2:

Col 2

3 4(max) 5 6 7 8 9 10 11 12 1 2

Col 6

1 2 3 4 5 6 7 8 9 10 11 12(max)

c<-sum(abs(col2-col6))

step3:

Col 2

4(max) 5 6 7 8 9 10 11 12 1 2 3

Col 6

1 2 3 4 5 6 7 8 9 10 11 12(max)

d<-sum(abs(col2-col6))

step4:

Col 2

5 6 7 8 9 10 11 12 1 2 3 4(max)

Col 6

1 2 3 4 5 6 7 8 9 10 11 12(max)

e<-sum(abs(col2-col6))

total difference= abs(a-b)+abs(b-c)+abs(c-d)+abs(d-e)


The dissimilarity is zero if the peaks are already in the same row. like for
column 2 and 3 the distance is zero as peaks are under eachother. For column 1
and 4 the distance is onceagain zero. Although they have different nuber of
peaks but as atleast one of their peaks is under eachother therefore distance is
zero.

For Column 5 and 6 peaks can be moved in either direction as number of steps to
be followed are same.

for column 1 and 2 following is the procedure

Col1 has two maximum values in row 5th and 12th and column two has only one
maximum value at 4 row. As peak in 5th row of column one is closer to the peak
of column 2 therefore we will move towards it and procedure should be


Initial:

Col 1

1 2 3 4 5(max) 6 7 8 9 10 11 12(max)

Col 8

1 2 3 4(max) 5 6 7 8 9 10 11 12

a<-sum(abs(col1-col8))

Step1:?

Col 1

1 2 3 4 5(max) 6 7 8 9 10 11 12(max)

Col 8

12 1 2 3 4(max) 5 6 7 8 9 10 11

b<-sum(abs(col1-col8))

total difference=abs(a-b)

For column 4 and 5

Initial:

Col 4

1 2 3 4 5(max) 6 7 8 9(max) 10 11 12(max)

Col 5

1 2 3 4 5 6(max) 7 8 9 10 11 12

a<-sum(abs(col4-col5))

Step 1

Col 4

1 2 3 4 5(max) 6 7 8 9(max) 10 11 12(max)

Col 5

2 3 4 5 6(max) 7 8 9 10 11 12 1

b<-sum(abs(col4-col5))

Total Difference= abs(a-b)

If there is any point which i couldnt discuss please tell me...


Elisa?????????

arun

2013-May-26 01:35 UTC

head link

[R] QA

I thought you want to compare between the rows of two columns even if their
corresponding values fall in the same row.


fun3<- function(mat){
??????? indmat<-combn(seq_len(ncol(mat)),2)
??? lst1<- lapply(seq_len(ncol(indmat)),function(i) {mat[,indmat[,i]]})?? 
???
names(lst1)<-as.character(interaction(as.data.frame(t(indmat)),sep="_",drop=TRUE))??????
??? lst2<- lapply(lst1,function(x) {x1<- apply(x,2,fun1)})
?? 
??? lst3<- lapply(lst2,function(x) expand.grid(lapply(x,function(y) y[,1])))
??? lst4<-lapply(lst3,function(x) unlist(x[which.min(apply(x,1,function(y)
abs(diff(y)))),]) )
??? lst5<- lapply(lst4,function(x){
??????????????? if(abs(diff(x))>(nrow(mat)/2)){
??????????????? nrow(mat)-abs(diff(x))
??????????????????? }
??????????????????????? else(abs(diff(x)))
??????????????????? })
?????? 
??? lst6<- lapply(seq_along(lst5),function(i) {
??????????????? x2<-lst1[[i]]
??????????????? if(lst5[[i]]==0) {
??????????????????? #indx1<- seq(length(x2[,2]))
??????????????????? #sum(abs(x2[,1]-x2[indx1,2]))
??? ??? ???? 0?? ? ######################## set to zero ??????????????????? 
??? ??? ??? }
??????????????? else{
??????????????????? lapply(seq(1+lst5[[i]]),function(j){x3<-x2[,2]
??????????????????????????? indx1<-seq(length(x3)-(j-1))
??????????????????????????? indx2<-c(setdiff(seq_along(x3),indx1),indx1)
??????????????????????????? sum(abs(x2[,1]-x2[indx2,2]))
??????????????????????????? })
??????????????? }
??????????????? })?? 
?????????????? 
??? names(lst6)<- names(lst1)
??? lst7<-lapply(lst6,unlist)
??? lst8<- lapply(lst7,function(x) {
??????????? Seq1<-seq_along(x)
??????????? if(length(Seq1)==1) x
??????????? else if(length(Seq1)==2){
??????????????????????? sum(abs(x[1]-x[2]))?? 
??????????????????????? }
??????????? else{
??????????????? ind<-rep(Seq1,each=2)[-1]
??????????????? ind1<-ind[-length(ind)]
???????????????
Reduce(`+`,lapply(split(ind1,(seq_along(ind1)-1)%/%2+1),function(i) {
??????????????????????????????????? abs(diff(x[i]))
??????????????????????????????????? }))?????????????????????????? 
??????????????? }

??????????????????? }
??????????? )
??? lst9<-do.call(rbind,lst8)
??? lst9??? ?? 
??? }

?fun3(mm)
#????????? [,1]
#1_2? 2.5966051
#1_3? 1.0267435
#1_4? 0.0000000
#1_5? 1.8489204
#1_6? 0.0000000
#2_3? 0.0000000
#2_4? 1.9040790
#2_5? 2.2874235
#2_6? 5.1526016
#3_4? 0.9726777
#3_5? 2.1359229
#3_6? 5.0221450
#4_5? 0.9124638
#4_6? 0.0000000
#5_6 14.0550864


xx
#????? 1? 8? 9 23 87 89
#[1,]? 5? 4? 4? 5? 6 12
#[2,] 12 NA NA? 9 NA NA
#[3,] NA NA NA 12 NA NA

According to xx, 1&4, 2&3, 4&6 (also 0 because both have 12)
A.K.
________________________________
From: eliza botto <eliza_botto at hotmail.com>
To: "smartpink111 at yahoo.com" <smartpink111 at yahoo.com> 
Sent: Saturday, May 25, 2013 9:17 PM
Subject: RE: QA




thanks arun,
i dont think thANKyou is enough for wat u did. anyway, there is slight
modification that i want to ask to understand the codes more efficiently. what
if i want to consider the distance between the columns having atleast one peak
in the same month equal to zero, instead of "initial value"???
more precisely The distance between column 2 and 3 should be zero instead
of?4.2951411. similarly the distance between column 4 and 6 should be zero
instead of?8.260419.?
Thats just for my own knowledge to understand the loop. i hope you wont mind.
The loop works absolutely well.

Elisa

> Date: Sat, 25 May 2013 18:03:33 -0700
> From: smartpink111 at yahoo.com
> Subject: Re: QA
> To: eliza_botto at hotmail.com
> CC: r-help at r-project.org
> 
> Hi,
> I hope this works for you.
> fun1<- function(x){
> ??? big<- x>0.8*max(x) 
> ??? n<- length(big)
> ??? startRunOfBigs<- which(c(big[1],!big[-n] & big[-1]))
> ??? endRunOfBigs<- which(c(big[-n] & !big[-1], big[n]))
> ??? index<- vapply(seq_along(startRunOfBigs),function(i)
which.max(x[startRunOfBigs[i]:endRunOfBigs[i]])+startRunOfBigs[i]-1L,0L)
> ???? index<-ifelse(sum(is.na(match(index,c(1,12))))==0 &
x[index]!=max(x[index]), NA,index)
> ??? data.frame(Index=index[!is.na(index)],Value=x[index[!is.na(index)]])
> ??? ?}
> 
> ##mm: data
> fun3<- function(mat){
> ??????? indmat<-combn(seq_len(ncol(mat)),2)
> ??? lst1<- lapply(seq_len(ncol(indmat)),function(i)
{mat[,indmat[,i]]})???
> ???
names(lst1)<-as.character(interaction(as.data.frame(t(indmat)),sep="_",drop=TRUE))???????
> ??? lst2<- lapply(lst1,function(x) {x1<- apply(x,2,fun1)})
> ??? 
> ??? lst3<- lapply(lst2,function(x) expand.grid(lapply(x,function(y)
y[,1])))
> ??? lst4<-lapply(lst3,function(x)
unlist(x[which.min(apply(x,1,function(y) abs(diff(y)))),]) )
> ??? lst5<- lapply(lst4,function(x){ 
> ??? ??? ??? ??? if(abs(diff(x))>(nrow(mat)/2)){
> ??? ??? ??? ??? nrow(mat)-abs(diff(x))
> ??? ??? ??? ??? ??? }
> ??? ??? ??????????????? else(abs(diff(x)))
> ??? ??? ??? ??? ??? })
> ??? ??? 
> ??? lst6<- lapply(seq_along(lst5),function(i) {
> ??? ??? ??? ??? x2<-lst1[[i]]
> ??? ??? ??? ??? if(lst5[[i]]==0) {
> ??? ??? ??? ??? ??? indx1<- seq(length(x2[,2]))
> ??? ??? ??? ??? ??? sum(abs(x2[,1]-x2[indx1,2]))
> ??? ??? ??? ??? ??? }
> ??? ??? ??? ??? else{
> ??? ??? ??? ??? ??? lapply(seq(1+lst5[[i]]),function(j){x3<-x2[,2]
> ??? ??? ??? ??? ??? ??? ??? indx1<-seq(length(x3)-(j-1))
> ??? ??? ??? ??? ??? ??? ??? indx2<-c(setdiff(seq_along(x3),indx1),indx1)
> ??? ??? ??? ??? ??? ??? ??? sum(abs(x2[,1]-x2[indx2,2]))
> ??? ??? ??? ??? ??? ??? ??? })
> ??? ??? ??? ??? }
> ??? ??? ??? ??? })??? 
> ??? ??? ??? ??? 
> ??? names(lst6)<- names(lst1)
> ??? lst7<-lapply(lst6,unlist)
> ??? lst8<- lapply(lst7,function(x) {
> ??? ??? ??? Seq1<-seq_along(x)
> ??? ??? ??? if(length(Seq1)==1) x
> ??? ??? ??? else if(length(Seq1)==2){
> ??? ??? ??? ??? ??? ??? sum(abs(x[1]-x[2]))??? 
> ??? ??? ??? ??? ??? ??? }
> ??? ??? ??? else{ 
> ??? ??? ??? ??? ind<-rep(Seq1,each=2)[-1]
> ??? ??? ??? ??? ind1<-ind[-length(ind)]
> ??? ??? ??? ???
Reduce(`+`,lapply(split(ind1,(seq_along(ind1)-1)%/%2+1),function(i) {
> ??? ??? ??? ??? ??? ??? ??? ??? ??? abs(diff(x[i]))
> ??? ??? ??? ??? ??? ??? ??? ??? ??? }))??? ??? ??? ??? ??? ??? ??? 
> ??? ??? ??? ??? }
> 
> ??? ??? ??? ??? ??? }
> ??? ??? ??? )
> ??? do.call(rbind,lst8)??? 
> ??? }
> 
> fun3(mm)?? #rownames represent the comparison between the particular
columns
> #????????? [,1]
> #1_2? 2.5966051
> #1_3? 1.0267435
> #1_4? 3.7387830
> #1_5? 1.8489204
> #1_6? 6.5233654
> #2_3? 4.2951411
> #2_4? 1.9040790
> #2_5? 2.2874235
> #2_6? 5.1526016
> #3_4? 0.9726777
> #3_5? 2.1359229
> #3_6? 5.0221450
> #4_5? 0.9124638
> #4_6? 8.2604187
> #5_6 14.0550864
> 
> 
> A.K.
> 
> 
> 
> 
> 
> ________________________________
> From: eliza botto <eliza_botto at hotmail.com>
> To: "smartpink111 at yahoo.com" <smartpink111 at yahoo.com>
> Sent: Saturday, May 25, 2013 2:14 PM
> Subject: QA
> 
> 
> 
> 
> Dear Arun,
> [text file is attached]
> After your help on preparing loop for identifying peaks, here is my latest
question which is linked with my first question. but this time i will try to
make it more clear.
> 
> > dput(xx)
> structure(c(5L, 12L, NA, 4L, NA, NA, 4L, NA, NA, 5L, 9L, 12L,?
> 6L, NA, NA, 12L, NA, NA), .Dim = c(3L, 6L), .Dimnames = list(
> ? ? NULL, c("1", "8", "9", "23",
"87", "89")))
> > dput(mm)
> structure(c(0.706461987893674, 0.998391468394261, 0.72402995269242,?
> 1.70874688194537, 1.93906363083693, 0.89540353128442, 0.328327645695443,?
> 0.427434603701202, 0.591932250254601, 0.444627635494183, 1.44407704434405,?
> 1.79150336746345, 0.740380661614246, 1.39756784211974, 1.43602731683199,?
> 2.40482060634346, 1.61684982192949, 0.549848553223765, 0.245763715425745,?
> 0.315411788974968, 0.390626431538384, 0.369934560068472,
0.769100067815155,?
> 1.76366863411459, 0.480885978853889, 1.21441674507622, 2.50566408677391,?
> 3.27361599826255, 1.18508780425679, 0.465943778037697, 0.29380145690883,?
> 0.36356245877522, 0.373314458026047, 0.334849362386475, 0.882050057788756,?
> 0.626807814853613, 0.774295647517675, 0.853105130179133,
0.738085443815565,?
> 1.26063449947807, 1.57350832698427, 0.790095501697794, 0.510641105191147,?
> 0.874523657118082, 1.31257333325184, 0.882086374572265, 1.13881207205977,?
> 1.29163890813439, 0.0849732189580101, 0.070591276171845,
0.0926010253161898,?
> 0.362209761457517, 1.45769283057202, 3.16165004659667, 2.74903557756267,?
> 1.94633472878995, 1.19319875840883, 0.533232612926756, 0.225531074123974,?
> 0.122949089115578, 2.06195904001605, 1.41493262330451, 1.35748791897328,?
> 1.19490680241894, 0.702488756183322, 0.338258418490199, 0.123398398622741,?
> 0.138548982660226, 0.16170889185798, 0.414543218677095, 1.84629295875002,?
> 2.24547399004563), .Dim = c(12L, 6L))
> 
> 
> You can see that that there are two matrices. "mm" is the actual
matrix and "xx" is the matrix indentifying the peaks of
"mm".For being a peak a value has to either the maximum value or
atleast 80% of the maximum value. you can see that the maximum value of coulmn 1
is in row number 5 and thats what it showed in matrix "xx" whereas,
the 80% of the maximum value is in row number 12 therefore it considered it the
second peak and row number was shown in "xx". i want to calculate the
distance matrix of "mm" in the following way...
> The column are continous or cyclic.
> The subtraction should start from the peak and should end when the peaks of
two columns are in the same row. The peaks are to be moved towrds eachother in
the shortest possible way.
> For suppose the peak of colum 2 is in 4th row and the peak of column 6 is
in 12th row. Now moving these two peak towwards eachother requires moving col 2
in reverse direction or column 6 in forward direction.
> 
> For example
> 
> Initial:
> 
> Col 2
> 
> 1 2 3 4(max) 5 6 7 8 9 10 11 12
> 
> Col 6
> 
> 1 2 3 4 5 6 7 8 9 10 11 12(max)
> 
> a<-sum(abs(col2-col6))
> 
> step1:
> 
> Col 2
> 
> 2 3 4(max) 5 6 7 8 9 10 11 12 1
> 
> Col 6
> 
> 1 2 3 4 5 6 7 8 9 10 11 12(max)
> 
> b<-sum(abs(col2-col6))
> 
> step2:
> 
> Col 2
> 
> 3 4(max) 5 6 7 8 9 10 11 12 1 2
> 
> Col 6
> 
> 1 2 3 4 5 6 7 8 9 10 11 12(max)
> 
> c<-sum(abs(col2-col6))
> 
> step3:
> 
> Col 2
> 
> 4(max) 5 6 7 8 9 10 11 12 1 2 3
> 
> Col 6
> 
> 1 2 3 4 5 6 7 8 9 10 11 12(max)
> 
> d<-sum(abs(col2-col6))
> 
> step4:
> 
> Col 2
> 
> 5 6 7 8 9 10 11 12 1 2 3 4(max)
> 
> Col 6
> 
> 1 2 3 4 5 6 7 8 9 10 11 12(max)
> 
> e<-sum(abs(col2-col6))
> 
> total difference= abs(a-b)+abs(b-c)+abs(c-d)+abs(d-e)
> 
> 
> The dissimilarity is zero if the peaks are already in the same row. like
for column 2 and 3 the distance is zero as peaks are under eachother. For column
1 and 4 the distance is onceagain zero. Although they have different nuber of
peaks but as atleast one of their peaks is under eachother therefore distance is
zero.
> 
> For Column 5 and 6 peaks can be moved in either direction as number of
steps to be followed are same.
> 
> for column 1 and 2 following is the procedure
> 
> Col1 has two maximum values in row 5th and 12th and column two has only one
maximum value at 4 row. As peak in 5th row of column one is closer to the peak
of column 2 therefore we will move towards it and procedure should be
> 
> 
> Initial:
> 
> Col 1
> 
> 1 2 3 4 5(max) 6 7 8 9 10 11 12(max)
> 
> Col 8
> 
> 1 2 3 4(max) 5 6 7 8 9 10 11 12
> 
> a<-sum(abs(col1-col8))
> 
> Step1:?
> 
> Col 1
> 
> 1 2 3 4 5(max) 6 7 8 9 10 11 12(max)
> 
> Col 8
> 
> 12 1 2 3 4(max) 5 6 7 8 9 10 11
> 
> b<-sum(abs(col1-col8))
> 
> total difference=abs(a-b)
> 
> For column 4 and 5
> 
> Initial:
> 
> Col 4
> 
> 1 2 3 4 5(max) 6 7 8 9(max) 10 11 12(max)
> 
> Col 5
> 
> 1 2 3 4 5 6(max) 7 8 9 10 11 12
> 
> a<-sum(abs(col4-col5))
> 
> Step 1
> 
> Col 4
> 
> 1 2 3 4 5(max) 6 7 8 9(max) 10 11 12(max)
> 
> Col 5
> 
> 2 3 4 5 6(max) 7 8 9 10 11 12 1
> 
> b<-sum(abs(col4-col5))
> 
> Total Difference= abs(a-b)
> 
> If there is any point which i couldnt discuss please tell me...
> 
> 
> Elisa?????????

eliza botto

2013-May-26 01:39 UTC

head link

[R] QA

Thanks Arun,
i could never have done this on my own
the recent reply will make it easier 4 me to understand..
thanks onceagain..
enjoy your weekend
:D
Elisa
> Date: Sat, 25 May 2013 18:35:27 -0700
> From: smartpink111@yahoo.com
> Subject: Re: QA
> To: eliza_botto@hotmail.com
> CC: r-help@r-project.org
> 
> I thought you want to compare between the rows of two columns even if their
corresponding values fall in the same row.
> 
> 
> fun3<- function(mat){
>         indmat<-combn(seq_len(ncol(mat)),2)
>     lst1<- lapply(seq_len(ncol(indmat)),function(i) {mat[,indmat[,i]]})
>    
names(lst1)<-as.character(interaction(as.data.frame(t(indmat)),sep="_",drop=TRUE))
>     lst2<- lapply(lst1,function(x) {x1<- apply(x,2,fun1)})
>    
>     lst3<- lapply(lst2,function(x) expand.grid(lapply(x,function(y)
y[,1])))
>     lst4<-lapply(lst3,function(x)
unlist(x[which.min(apply(x,1,function(y) abs(diff(y)))),]) )
>     lst5<- lapply(lst4,function(x){
>                 if(abs(diff(x))>(nrow(mat)/2)){
>                 nrow(mat)-abs(diff(x))
>                     }
>                         else(abs(diff(x)))
>                     })
>        
>     lst6<- lapply(seq_along(lst5),function(i) {
>                 x2<-lst1[[i]]
>                 if(lst5[[i]]==0) {
>                     #indx1<- seq(length(x2[,2]))
>                     #sum(abs(x2[,1]-x2[indx1,2]))
>              0     ######################## set to zero
>             }
>                 else{
>                     lapply(seq(1+lst5[[i]]),function(j){x3<-x2[,2]
>                             indx1<-seq(length(x3)-(j-1))
>                             indx2<-c(setdiff(seq_along(x3),indx1),indx1)
>                             sum(abs(x2[,1]-x2[indx2,2]))
>                             })
>                 }
>                 })   
>                
>     names(lst6)<- names(lst1)
>     lst7<-lapply(lst6,unlist)
>     lst8<- lapply(lst7,function(x) {
>             Seq1<-seq_along(x)
>             if(length(Seq1)==1) x
>             else if(length(Seq1)==2){
>                         sum(abs(x[1]-x[2]))   
>                         }
>             else{
>                 ind<-rep(Seq1,each=2)[-1]
>                 ind1<-ind[-length(ind)]
>                
Reduce(`+`,lapply(split(ind1,(seq_along(ind1)-1)%/%2+1),function(i) {
>                                     abs(diff(x[i]))
>                                     }))                           
>                 }
> 
>                     }
>             )
>     lst9<-do.call(rbind,lst8)
>     lst9       
>     }
> 
>  fun3(mm)
> #          [,1]
> #1_2  2.5966051
> #1_3  1.0267435
> #1_4  0.0000000
> #1_5  1.8489204
> #1_6  0.0000000
> #2_3  0.0000000
> #2_4  1.9040790
> #2_5  2.2874235
> #2_6  5.1526016
> #3_4  0.9726777
> #3_5  2.1359229
> #3_6  5.0221450
> #4_5  0.9124638
> #4_6  0.0000000
> #5_6 14.0550864
> 
> 
> xx
> #      1  8  9 23 87 89
> #[1,]  5  4  4  5  6 12
> #[2,] 12 NA NA  9 NA NA
> #[3,] NA NA NA 12 NA NA
> 
> According to xx, 1&4, 2&3, 4&6 (also 0 because both have 12)
> A.K.
> ________________________________
> From: eliza botto <eliza_botto@hotmail.com>
> To: "smartpink111@yahoo.com" <smartpink111@yahoo.com> 
> Sent: Saturday, May 25, 2013 9:17 PM
> Subject: RE: QA
> 
> 
> 
> 
> thanks arun,
> i dont think thANKyou is enough for wat u did. anyway, there is slight
modification that i want to ask to understand the codes more efficiently. what
if i want to consider the distance between the columns having atleast one peak
in the same month equal to zero, instead of "initial value"??
> more precisely The distance between column 2 and 3 should be zero instead
of 4.2951411. similarly the distance between column 4 and 6 should be zero
instead of 8.260419.
> Thats just for my own knowledge to understand the loop. i hope you wont
mind.
> The loop works absolutely well.
> 
> Elisa
> 
> 
> > Date: Sat, 25 May 2013 18:03:33 -0700
> > From: smartpink111@yahoo.com
> > Subject: Re: QA
> > To: eliza_botto@hotmail.com
> > CC: r-help@r-project.org
> > 
> > Hi,
> > I hope this works for you.
> > fun1<- function(x){
> >     big<- x>0.8*max(x) 
> >     n<- length(big)
> >     startRunOfBigs<- which(c(big[1],!big[-n] & big[-1]))
> >     endRunOfBigs<- which(c(big[-n] & !big[-1], big[n]))
> >     index<- vapply(seq_along(startRunOfBigs),function(i)
which.max(x[startRunOfBigs[i]:endRunOfBigs[i]])+startRunOfBigs[i]-1L,0L)
> >      index<-ifelse(sum(is.na(match(index,c(1,12))))==0 &
x[index]!=max(x[index]), NA,index)
> >    
data.frame(Index=index[!is.na(index)],Value=x[index[!is.na(index)]])
> >      }
> > 
> > ##mm: data
> > fun3<- function(mat){
> >         indmat<-combn(seq_len(ncol(mat)),2)
> >     lst1<- lapply(seq_len(ncol(indmat)),function(i)
{mat[,indmat[,i]]})
> >    
names(lst1)<-as.character(interaction(as.data.frame(t(indmat)),sep="_",drop=TRUE))
> >     lst2<- lapply(lst1,function(x) {x1<- apply(x,2,fun1)})
> >     
> >     lst3<- lapply(lst2,function(x) expand.grid(lapply(x,function(y)
y[,1])))
> >     lst4<-lapply(lst3,function(x)
unlist(x[which.min(apply(x,1,function(y) abs(diff(y)))),]) )
> >     lst5<- lapply(lst4,function(x){ 
> >                 if(abs(diff(x))>(nrow(mat)/2)){
> >                 nrow(mat)-abs(diff(x))
> >                     }
> >                         else(abs(diff(x)))
> >                     })
> >         
> >     lst6<- lapply(seq_along(lst5),function(i) {
> >                 x2<-lst1[[i]]
> >                 if(lst5[[i]]==0) {
> >                     indx1<- seq(length(x2[,2]))
> >                     sum(abs(x2[,1]-x2[indx1,2]))
> >                     }
> >                 else{
> >                     lapply(seq(1+lst5[[i]]),function(j){x3<-x2[,2]
> >                             indx1<-seq(length(x3)-(j-1))
> >                            
indx2<-c(setdiff(seq_along(x3),indx1),indx1)
> >                             sum(abs(x2[,1]-x2[indx2,2]))
> >                             })
> >                 }
> >                 })    
> >                 
> >     names(lst6)<- names(lst1)
> >     lst7<-lapply(lst6,unlist)
> >     lst8<- lapply(lst7,function(x) {
> >             Seq1<-seq_along(x)
> >             if(length(Seq1)==1) x
> >             else if(length(Seq1)==2){
> >                         sum(abs(x[1]-x[2]))    
> >                         }
> >             else{ 
> >                 ind<-rep(Seq1,each=2)[-1]
> >                 ind1<-ind[-length(ind)]
> >                
Reduce(`+`,lapply(split(ind1,(seq_along(ind1)-1)%/%2+1),function(i) {
> >                                     abs(diff(x[i]))
> >                                     }))                            
> >                 }
> > 
> >                     }
> >             )
> >     do.call(rbind,lst8)    
> >     }
> > 
> > fun3(mm)   #rownames represent the comparison between the particular
columns
> > #          [,1]
> > #1_2  2.5966051
> > #1_3  1.0267435
> > #1_4  3.7387830
> > #1_5  1.8489204
> > #1_6  6.5233654
> > #2_3  4.2951411
> > #2_4  1.9040790
> > #2_5  2.2874235
> > #2_6  5.1526016
> > #3_4  0.9726777
> > #3_5  2.1359229
> > #3_6  5.0221450
> > #4_5  0.9124638
> > #4_6  8.2604187
> > #5_6 14.0550864
> > 
> > 
> > A.K.
> > 
> > 
> > 
> > 
> > 
> > ________________________________
> > From: eliza botto <eliza_botto@hotmail.com>
> > To: "smartpink111@yahoo.com" <smartpink111@yahoo.com> 
> > Sent: Saturday, May 25, 2013 2:14 PM
> > Subject: QA
> > 
> > 
> > 
> > 
> > Dear Arun,
> > [text file is attached]
> > After your help on preparing loop for identifying peaks, here is my
latest question which is linked with my first question. but this time i will try
to make it more clear.
> > 
> > > dput(xx)
> > structure(c(5L, 12L, NA, 4L, NA, NA, 4L, NA, NA, 5L, 9L, 12L, 
> > 6L, NA, NA, 12L, NA, NA), .Dim = c(3L, 6L), .Dimnames = list(
> >     NULL, c("1", "8", "9",
"23", "87", "89")))
> > > dput(mm)
> > structure(c(0.706461987893674, 0.998391468394261, 0.72402995269242, 
> > 1.70874688194537, 1.93906363083693, 0.89540353128442,
0.328327645695443,
> > 0.427434603701202, 0.591932250254601, 0.444627635494183,
1.44407704434405,
> > 1.79150336746345, 0.740380661614246, 1.39756784211974,
1.43602731683199,
> > 2.40482060634346, 1.61684982192949, 0.549848553223765,
0.245763715425745,
> > 0.315411788974968, 0.390626431538384, 0.369934560068472,
0.769100067815155,
> > 1.76366863411459, 0.480885978853889, 1.21441674507622,
2.50566408677391,
> > 3.27361599826255, 1.18508780425679, 0.465943778037697,
0.29380145690883,
> > 0.36356245877522, 0.373314458026047, 0.334849362386475,
0.882050057788756,
> > 0.626807814853613, 0.774295647517675, 0.853105130179133,
0.738085443815565,
> > 1.26063449947807, 1.57350832698427, 0.790095501697794,
0.510641105191147,
> > 0.874523657118082, 1.31257333325184, 0.882086374572265,
1.13881207205977,
> > 1.29163890813439, 0.0849732189580101, 0.070591276171845,
0.0926010253161898,
> > 0.362209761457517, 1.45769283057202, 3.16165004659667,
2.74903557756267,
> > 1.94633472878995, 1.19319875840883, 0.533232612926756,
0.225531074123974,
> > 0.122949089115578, 2.06195904001605, 1.41493262330451,
1.35748791897328,
> > 1.19490680241894, 0.702488756183322, 0.338258418490199,
0.123398398622741,
> > 0.138548982660226, 0.16170889185798, 0.414543218677095,
1.84629295875002,
> > 2.24547399004563), .Dim = c(12L, 6L))
> > 
> > 
> > You can see that that there are two matrices. "mm" is the
actual matrix and "xx" is the matrix indentifying the peaks of
"mm".For being a peak a value has to either the maximum value or
atleast 80% of the maximum value. you can see that the maximum value of coulmn 1
is in row number 5 and thats what it showed in matrix "xx" whereas,
the 80% of the maximum value is in row number 12 therefore it considered it the
second peak and row number was shown in "xx". i want to calculate the
distance matrix of "mm" in the following way...
> > The column are continous or cyclic.
> > The subtraction should start from the peak and should end when the
peaks of two columns are in the same row. The peaks are to be moved towrds
eachother in the shortest possible way.
> > For suppose the peak of colum 2 is in 4th row and the peak of column 6
is in 12th row. Now moving these two peak towwards eachother requires moving col
2 in reverse direction or column 6 in forward direction.
> > 
> > For example
> > 
> > Initial:
> > 
> > Col 2
> > 
> > 1 2 3 4(max) 5 6 7 8 9 10 11 12
> > 
> > Col 6
> > 
> > 1 2 3 4 5 6 7 8 9 10 11 12(max)
> > 
> > a<-sum(abs(col2-col6))
> > 
> > step1:
> > 
> > Col 2
> > 
> > 2 3 4(max) 5 6 7 8 9 10 11 12 1
> > 
> > Col 6
> > 
> > 1 2 3 4 5 6 7 8 9 10 11 12(max)
> > 
> > b<-sum(abs(col2-col6))
> > 
> > step2:
> > 
> > Col 2
> > 
> > 3 4(max) 5 6 7 8 9 10 11 12 1 2
> > 
> > Col 6
> > 
> > 1 2 3 4 5 6 7 8 9 10 11 12(max)
> > 
> > c<-sum(abs(col2-col6))
> > 
> > step3:
> > 
> > Col 2
> > 
> > 4(max) 5 6 7 8 9 10 11 12 1 2 3
> > 
> > Col 6
> > 
> > 1 2 3 4 5 6 7 8 9 10 11 12(max)
> > 
> > d<-sum(abs(col2-col6))
> > 
> > step4:
> > 
> > Col 2
> > 
> > 5 6 7 8 9 10 11 12 1 2 3 4(max)
> > 
> > Col 6
> > 
> > 1 2 3 4 5 6 7 8 9 10 11 12(max)
> > 
> > e<-sum(abs(col2-col6))
> > 
> > total difference= abs(a-b)+abs(b-c)+abs(c-d)+abs(d-e)
> > 
> > 
> > The dissimilarity is zero if the peaks are already in the same row.
like for column 2 and 3 the distance is zero as peaks are under eachother. For
column 1 and 4 the distance is onceagain zero. Although they have different
nuber of peaks but as atleast one of their peaks is under eachother therefore
distance is zero.
> > 
> > For Column 5 and 6 peaks can be moved in either direction as number of
steps to be followed are same.
> > 
> > for column 1 and 2 following is the procedure
> > 
> > Col1 has two maximum values in row 5th and 12th and column two has
only one maximum value at 4 row. As peak in 5th row of column one is closer to
the peak of column 2 therefore we will move towards it and procedure should be
> > 
> > 
> > Initial:
> > 
> > Col 1
> > 
> > 1 2 3 4 5(max) 6 7 8 9 10 11 12(max)
> > 
> > Col 8
> > 
> > 1 2 3 4(max) 5 6 7 8 9 10 11 12
> > 
> > a<-sum(abs(col1-col8))
> > 
> > Step1: 
> > 
> > Col 1
> > 
> > 1 2 3 4 5(max) 6 7 8 9 10 11 12(max)
> > 
> > Col 8
> > 
> > 12 1 2 3 4(max) 5 6 7 8 9 10 11
> > 
> > b<-sum(abs(col1-col8))
> > 
> > total difference=abs(a-b)
> > 
> > For column 4 and 5
> > 
> > Initial:
> > 
> > Col 4
> > 
> > 1 2 3 4 5(max) 6 7 8 9(max) 10 11 12(max)
> > 
> > Col 5
> > 
> > 1 2 3 4 5 6(max) 7 8 9 10 11 12
> > 
> > a<-sum(abs(col4-col5))
> > 
> > Step 1
> > 
> > Col 4
> > 
> > 1 2 3 4 5(max) 6 7 8 9(max) 10 11 12(max)
> > 
> > Col 5
> > 
> > 2 3 4 5 6(max) 7 8 9 10 11 12 1
> > 
> > b<-sum(abs(col4-col5))
> > 
> > Total Difference= abs(a-b)
> > 
> > If there is any point which i couldnt discuss please tell me...
> > 
> > 
> > Elisa          		 	   		  
	[[alternative HTML version deleted]]

R help - May 2013 - QA

[R] QA

[R] QA

[R] QA