# Dear R-experts,
# (Once again) I want to avoid the usage of for-loops but unfortunately
# I don't know how.
# I know functions like e.g. 'apply' but don't know how to use them for
# the solution of the following problem:
# I have a data frame 'a' giving the number of columns in data frame 'b'
# that belong to one group
a <- data.frame(group1=5, group2=4)
b <- data.frame(col1=c(0,0,0), col2=c(0,1,0.5), col3=c(0,0,0),
col4=c(1/3,0,0.5), col5=c(2/3,0,0),
col6=c(0,0,0), col7=c(1,1/3,0), col8=c(0,2/3,0),
col9=c(0,0,0))
# ... thus columns 1-5 in 'b' belong to group 1 and columns 6-9 in 'b'
# belong to group 2
# then I created a data frame giving all possible row combinations of 'b'
r <- as.data.frame(t(combn(nrow(b), 2)))
# .. so e.g. the second row of 'r' tells me that I have to perform an
# equation with the values of the first and third row of table 'b'.
# The equation has to be calculated for each group separately.
# e.g. within group 2 (columns 6-9 in 'b') I have to calculate e.g. for
# rows 1 and 3 in 'b'
# (abs(b[row1,col6] - b[row3, col6]) + abs(b[row1, col7] - b[row3, col7])
#  + .... + abs(b[row1, col9] - b[row3, col9]))/2
# the resulting data frame shall look as follows:
result <- cbind(r, data.frame(group1=c(1,2/3,0.5), group2=c(2/3,0.5,0.5)))
# The original tables are much larger and I don't know how to solve this
# problem w/o a lot of very slow for-loops.
# Is there any possible solution w/o using 'for'-loops?
# I'd be happy for any suggestions
# Thank you very much in anticipation
# Best regards
# Thomas
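The formula described in the post can be checked by hand for a single row pair. The following is only a minimal sketch: `half_abs_diff` is a hypothetical helper name that does not appear in the thread, and the data frame `b` is copied from the post.

```r
# Example data copied from the post above
b <- data.frame(col1=c(0,0,0), col2=c(0,1,0.5), col3=c(0,0,0),
                col4=c(1/3,0,0.5), col5=c(2/3,0,0),
                col6=c(0,0,0), col7=c(1,1/3,0), col8=c(0,2/3,0),
                col9=c(0,0,0))

# Hypothetical helper (not from the thread): half the sum of absolute
# differences between rows i and j of 'b', restricted to the columns 'cols'
half_abs_diff <- function(i, j, cols) {
  sum(abs(unlist(b[i, cols]) - unlist(b[j, cols]))) / 2
}

half_abs_diff(1, 3, 6:9)  # rows 1 and 3 within group 2: 0.5
half_abs_diff(1, 3, 1:5)  # rows 1 and 3 within group 1: 2/3
```

Both values agree with the second row of the expected `result` data frame given in the post.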
Thomas,

Here's a little bit of code to get you started. You can use the dist() function with method="manhattan" to calculate the sum of absolute differences between all pairs of rows, then divide by 2 (as you requested), then convert the distance matrix to a vector. Do this for both groups of columns, and use cbind() to combine the results.

cbind(as.vector(dist(b[, 1:5], method="manhattan")/2),
      as.vector(dist(b[, 6:9], method="manhattan")/2))

Jean
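Since the original tables are much larger, the hard-coded column ranges 1:5 and 6:9 could perhaps be derived from 'a' instead. The following is a sketch of that idea under the assumption that the columns of 'b' are ordered group by group, as in the post; the names `sizes`, `grp`, `cols` and `res` are illustrative only and do not come from the thread.

```r
# Example data copied from the original post
a <- data.frame(group1=5, group2=4)
b <- data.frame(col1=c(0,0,0), col2=c(0,1,0.5), col3=c(0,0,0),
                col4=c(1/3,0,0.5), col5=c(2/3,0,0),
                col6=c(0,0,0), col7=c(1,1/3,0), col8=c(0,2/3,0),
                col9=c(0,0,0))
r <- as.data.frame(t(combn(nrow(b), 2)))

# Column indices per group, derived from the counts in 'a'
# (assumes the columns of 'b' are ordered group by group)
sizes <- unlist(a)
grp   <- factor(rep(names(sizes), sizes), levels = names(sizes))
cols  <- split(seq_along(b), grp)

# One dist() call per group; dist() stores the pairs (1,2), (1,3), ..., (2,3), ...
# in the same order as the rows produced by combn(nrow(b), 2)
res <- cbind(r, as.data.frame(lapply(cols, function(i)
  as.vector(dist(b[, i, drop = FALSE], method = "manhattan")) / 2)))
res
```

Fixing the factor levels to `names(sizes)` keeps the result columns in the order group1, group2, so no renaming or reordering step should be needed afterwards.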
Hi,
You could use this, but it would be slow.
> # I know functions like e.g. 'apply' but don't know how to use them
> # for the solution of the following problem:
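# Column indices per group; split() sorts by the group sizes ("4", "5"),
# so group2 (columns 6-9) comes before group1 (columns 1-5).
# This relies on the group sizes being distinct.
lst1 <- split(seq_along(b), rep(unlist(a), unlist(a)))
# For each group and each row pair in 'r': half the sum of absolute differences
res <- cbind(r, sapply(lst1, function(i) {
  x1 <- b[, i]
  sapply(split(as.matrix(r), row(r)), function(x) {
    x2 <- x1[unlist(x), ]
    sum(abs(apply(x2, 2, diff))) / 2
  })
}))
# Rename the size-labelled columns ("4", "5") to the group names in 'a'
colnames(res)[3:4] <- colnames(a)[match(colnames(res), unlist(a), nomatch=0)]
res
#  V1 V2    group2    group1
#1  1  2 0.6666667 1.0000000
#2  1  3 0.5000000 0.6666667
#3  2  3 0.5000000 0.5000000
all.equal(res[, c(1:2, 4, 3)], result, check.attributes=FALSE)
#[1] TRUE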
I guess Jean's solution would be faster and more elegant than an apply()-based solution.
A.K.
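As a possible middle ground between the two replies, the row pairs in 'r' can also be used directly as index vectors, so that each group is handled with one matrix subtraction and one rowSums() call instead of nested sapply() calls. This is only a sketch: `pair_dist` and `res2` are hypothetical names that do not come from the thread, and no timings were measured. Note that it builds two intermediate matrices with one row per pair, which may matter for memory on large tables.

```r
# Example data copied from the original post
b <- data.frame(col1=c(0,0,0), col2=c(0,1,0.5), col3=c(0,0,0),
                col4=c(1/3,0,0.5), col5=c(2/3,0,0),
                col6=c(0,0,0), col7=c(1,1/3,0), col8=c(0,2/3,0),
                col9=c(0,0,0))
r <- as.data.frame(t(combn(nrow(b), 2)))

# Hypothetical helper (not from the thread): for every row pair in 'r',
# half the sum of absolute differences over the columns 'cols' of 'b'
pair_dist <- function(cols) {
  m <- as.matrix(b[, cols, drop = FALSE])
  d <- abs(m[r$V1, , drop = FALSE] - m[r$V2, , drop = FALSE])
  unname(rowSums(d)) / 2
}

res2 <- cbind(r, group1 = pair_dist(1:5), group2 = pair_dist(6:9))
res2
```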
----- Original Message -----
From: Kulupp <kulupp at online.de>
To: r-help at r-project.org
Cc:
Sent: Monday, August 5, 2013 2:39 AM
Subject: [R] Avoiding slow for-loops (once again)
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.