# Dear R-experts,

# (Once again) I want to avoid the usage of for-loops but unfortunately I don't know how.
# I know functions like e.g. 'apply' but don't know how to use them for the solution of the following problem:
# I have a data frame 'a' giving the number of columns in data frame 'b' that belong to each group
a <- data.frame(group1=5, group2=4)

b <- data.frame(col1=c(0,0,0), col2=c(0,1,0.5), col3=c(0,0,0),
                col4=c(1/3,0,0.5), col5=c(2/3,0,0),
                col6=c(0,0,0), col7=c(1,1/3,0), col8=c(0,2/3,0),
                col9=c(0,0,0))

# ... thus columns 1-5 in 'b' belong to group 1 and columns 6-9 in 'b' belong to group 2

# then I created a data frame giving all possible row combinations of 'b'
r <- as.data.frame(t(combn(nrow(b), 2)))

# ... so e.g. the second row of 'r' tells me that I have to evaluate an equation with the values of the
# first and third row of table 'b'. The equation has to be calculated for each group separately.
# e.g. within group 2 (columns 6-9 in 'b') I have to calculate e.g. for rows 1 and 3 in 'b'
# (abs(b[row1, col6] - b[row3, col6]) + abs(b[row1, col7] - b[row3, col7]) + ... + abs(b[row1, col9] - b[row3, col9]))/2

# the resulting data frame shall look as follows:
result <- cbind(r, data.frame(group1=c(1,2/3,0.5), group2=c(2/3,0.5,0.5)))

# The original tables are much larger and I don't know how to solve this problem w/o a lot of very slow for-loops.
# Is there any possible solution w/o using 'for'-loops?

# I'd be happy for any suggestions
# Thank you very much in anticipation
# Best regards
# Thomas
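For concreteness, the equation described in the question can be checked by hand for a single pair of rows. The following is only a minimal sketch and not part of the original post; the helper name half_abs_diff is hypothetical.

```r
b <- data.frame(col1 = c(0, 0, 0), col2 = c(0, 1, 0.5), col3 = c(0, 0, 0),
                col4 = c(1/3, 0, 0.5), col5 = c(2/3, 0, 0),
                col6 = c(0, 0, 0), col7 = c(1, 1/3, 0), col8 = c(0, 2/3, 0),
                col9 = c(0, 0, 0))

# Hypothetical helper (not from the post): half the summed absolute
# differences between rows i and j of 'b', restricted to the columns 'cols'.
half_abs_diff <- function(b, i, j, cols) {
  m <- as.matrix(b)  # plain numeric matrix, so row arithmetic is unambiguous
  sum(abs(m[i, cols] - m[j, cols])) / 2
}

half_abs_diff(b, 1, 3, 6:9)  # rows 1 and 3, group 2: 0.5
half_abs_diff(b, 1, 2, 1:5)  # rows 1 and 2, group 1
```

The first call reproduces the group2 entry in the second row of 'result'; the second corresponds to the group1 entry in its first row.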
Thomas,

Here's a little bit of code to get you started. You can use the dist() function with method="manhattan" to calculate the sum of absolute differences between all pairs of rows, then divide by 2 (as you requested), then convert the resulting distance object to a vector. Do this for both groups of columns, and use cbind() to combine the results.

cbind(as.vector(dist(b[, 1:5], method="manhattan")/2),
      as.vector(dist(b[, 6:9], method="manhattan")/2))

Jean

On Mon, Aug 5, 2013 at 1:39 AM, Kulupp <kulupp@online.de> wrote:
> # (Once again) I want to avoid the usage of for-loops but unfortunately I
> # don't know how.
> [...]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
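The dist() approach hardcodes the column ranges 1:5 and 6:9. One possible generalization, assuming (as in the question) that the columns of 'b' are ordered by group and that 'a' lists the group sizes in that same order, derives the ranges from 'a'. The names ends, starts, cols and halved are illustrative only.

```r
a <- data.frame(group1 = 5, group2 = 4)
b <- data.frame(col1 = c(0, 0, 0), col2 = c(0, 1, 0.5), col3 = c(0, 0, 0),
                col4 = c(1/3, 0, 0.5), col5 = c(2/3, 0, 0),
                col6 = c(0, 0, 0), col7 = c(1, 1/3, 0), col8 = c(0, 2/3, 0),
                col9 = c(0, 0, 0))
r <- as.data.frame(t(combn(nrow(b), 2)))

# Column indices per group, derived from the counts in 'a'
# (assumes the columns of 'b' are laid out group after group).
ends   <- cumsum(unlist(a))
starts <- ends - unlist(a) + 1
cols   <- Map(seq, starts, ends)  # named list: group1 = 1:5, group2 = 6:9

# Halved Manhattan distances per group. dist() stores the pairs in the order
# (1,2), (1,3), ..., (2,3), ..., which is the row order of t(combn(n, 2)).
halved <- lapply(cols, function(idx) {
  as.vector(dist(b[, idx, drop = FALSE], method = "manhattan")) / 2
})

res <- cbind(r, as.data.frame(halved))
res
```

Because dist() computes all pairs in compiled code, this should scale much better than looping over the rows of 'r', although the number of pairs still grows quadratically with nrow(b).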
Hi,

You could use this, but it would be slow.

"""
# I know functions like e.g. 'apply' but don't know how to use them for the solution of the following problem:
"""

# column indices of 'b' per group; splitting by group size assumes that
# every group has a different number of columns
lst1 <- split(seq_along(b), rep(unlist(a), unlist(a)))
# for each group and each row pair in 'r': half the summed absolute differences
res <- cbind(r, sapply(lst1, function(i) {
  x1 <- b[, i]
  sapply(split(as.matrix(r), row(r)), function(x) {
    x2 <- x1[unlist(x), ]
    sum(abs(apply(x2, 2, diff)))/2
  })
}))
# rename the size-labelled columns ("4", "5") to the group names in 'a'
colnames(res)[3:4] <- colnames(a)[match(colnames(res), unlist(a), nomatch=0)]
res
#  V1 V2    group2    group1
#1  1  2 0.6666667 1.0000000
#2  1  3 0.5000000 0.6666667
#3  2  3 0.5000000 0.5000000
all.equal(res[, c(1:2, 4, 3)], result, check.attributes=FALSE)
#[1] TRUE

I guess Jean's solution would be faster and more elegant than an apply()-based solution.

A.K.

----- Original Message -----
From: Kulupp <kulupp at online.de>
To: r-help at r-project.org
Sent: Monday, August 5, 2013 2:39 AM
Subject: [R] Avoiding slow for-loops (once again)

[...]
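A further option, sketched here under the same assumptions about 'a' and 'b' as in the question, applies the equation directly through the row indices stored in 'r'. Unlike dist(), it also works when 'r' holds only a subset of the row pairs. The names m, grp, diffs and sums are illustrative, and rowsum() is used here merely as one convenient way to add up columns by group.

```r
a <- data.frame(group1 = 5, group2 = 4)
b <- data.frame(col1 = c(0, 0, 0), col2 = c(0, 1, 0.5), col3 = c(0, 0, 0),
                col4 = c(1/3, 0, 0.5), col5 = c(2/3, 0, 0),
                col6 = c(0, 0, 0), col7 = c(1, 1/3, 0), col8 = c(0, 2/3, 0),
                col9 = c(0, 0, 0))
r <- as.data.frame(t(combn(nrow(b), 2)))

m   <- unname(as.matrix(b))              # numeric matrix without dimnames
grp <- rep(names(a), times = unlist(a))  # group label for every column of 'b'

# Absolute differences for all pairs at once: one row per pair in 'r',
# one column per column of 'b'.
diffs <- abs(m[r$V1, , drop = FALSE] - m[r$V2, , drop = FALSE])

# rowsum() adds up rows by group, so transpose to sum the columns by group;
# reorder = FALSE keeps the groups in the order in which they appear in 'a'.
sums <- t(rowsum(t(diffs), group = grp, reorder = FALSE)) / 2

res <- cbind(r, sums)
res
```

The intermediate matrix diffs has nrow(r) rows and ncol(b) columns, so for very large tables memory use may become the limiting factor and processing 'r' in chunks could be preferable.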