Try this:
myDat <- read.table(textConnection("group id
1 101
1 201
1 301
2 401
2 501
2 601
3 701
3 801
3 901"),header=TRUE)
closeAllConnections()
corr_mat <- as.matrix(read.table(textConnection("1 1 .5 0 0 0 0 0 0 0
2 .5 1 0 0 0 0 0 0 0
3 0 0 1.0 0 0 0 0 0 0
4 0 0 0 1 .5 .5 0 0 0
5 0 0 0 .5 1 .5 0 0 0
6 0 0 0 .5 .5 1 0 0 0
7 0 0 0 0 0 0 1 0 0
8 0 0 0 0 0 0 0 1 .5
9 0 0 0 0 0 0 0 .5 1"),header=FALSE))
closeAllConnections()
corr_mat <- corr_mat[,-1]
colnames(corr_mat) <- myDat$id
rownames(corr_mat) <- myDat$id
# split out the groups
groups <- split(as.character(myDat$id), myDat$group)
# process each subgroup
result <- lapply(groups, function(.grp){
    subgroup <- corr_mat[.grp, .grp]
    output <- NULL
    # zero the diag
    diag(subgroup) <- 0
    same <- apply(subgroup, 1, function(x) any(x != 0))
    if (any(same)){  # some match, choose one
        output <- sample(same[same], 1)
    }
    if (any(!same)){  # get all that don't correlate
        output <- c(output, same[!same])
    }
    output
})
# output as matrix
do.call(rbind, lapply(names(result), function(x) cbind(x, names(result[[x]]))))
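
One caveat: the code above treats every id that has any nonzero correlation within its group as part of a single related set and keeps just one of them. That is fine for the example data, but if a group could contain two separate related pairs, a greedy pass may keep more ids. The following is only a sketch under that assumption, not a tested general solution. The names pick_unrelated and set_rel are hypothetical helpers, and the data are rebuilt with data.frame() and diag() instead of textConnection() so the block runs on its own.

```r
# Example data with the same values as in the thread.
myDat <- data.frame(group = rep(1:3, each = 3),
                    id = c(101, 201, 301, 401, 501, 601, 701, 801, 901))
ids <- as.character(myDat$id)
corr_mat <- diag(9)
dimnames(corr_mat) <- list(ids, ids)

# Hypothetical helper: mark two ids as related in both triangles.
set_rel <- function(m, a, b, v = 0.5) {
    m[a, b] <- v
    m[b, a] <- v
    m
}
corr_mat <- set_rel(corr_mat, "101", "201")
corr_mat <- set_rel(corr_mat, "401", "501")
corr_mat <- set_rel(corr_mat, "401", "601")
corr_mat <- set_rel(corr_mat, "501", "601")
corr_mat <- set_rel(corr_mat, "801", "901")

# Hypothetical helper: walk through the ids of one group in order and keep
# an id only if it has zero correlation with every id kept so far.
pick_unrelated <- function(grp_ids, corr) {
    kept <- character(0)
    for (id in grp_ids) {
        if (all(corr[id, kept] == 0)) kept <- c(kept, id)
    }
    kept
}

groups <- split(ids, myDat$group)
kept <- unlist(lapply(groups, pick_unrelated, corr = corr_mat),
               use.names = FALSE)
subset_dat <- myDat[ids %in% kept, ]
print(subset_dat)
```

This keeps the first id it meets in each related set. To make the choice random, the ids of each group could be shuffled with sample() before the loop.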
On Mon, Dec 7, 2009 at 7:38 PM, Juliet Hannah <juliet.hannah@gmail.com> wrote:
> Hi List,
>
> Here is some example data.
>
> myDat <- read.table(textConnection("group id
> 1 101
> 1 201
> 1 301
> 2 401
> 2 501
> 2 601
> 3 701
> 3 801
> 3 901"),header=TRUE)
> closeAllConnections()
>
> corr_mat <- read.table(textConnection("1 1 .5 0 0 0 0 0 0 0
> 2 .5 1 0 0 0 0 0 0 0
> 3 0 0 1.0 0 0 0 0 0 0
> 4 0 0 0 1 .5 .5 0 0 0
> 5 0 0 0 .5 1 .5 0 0 0
> 6 0 0 0 .5 .5 1 0 0 0
> 7 0 0 0 0 0 0 1 0 0
> 8 0 0 0 0 0 0 0 1 .5
> 9 0 0 0 0 0 0 0 .5 1"),header=FALSE)
> closeAllConnections()
>
> corr_mat <- corr_mat[,-1]
> colnames(corr_mat) <- myDat$id
> rownames(corr_mat) <- myDat$id
>
> I need to subset this data such that observations within a group are not
> related, which is indicated by a 0 in corr_mat.
>
> For example, within group 1, 101 and 201 are related, so one of these
> has to be selected, say
> 101. 301 is not related to 101 or 201, so the final set for group 1
> consists of 101 and 301. There will always be at least 2 members in
> each group. I need to carry this task on all groups.
>
> One possible final data set looks like:
>
> group id
> 1 1 101
> 3 1 301
> 4 2 401
> 7 3 701
> 8 3 801
>
> Any suggestions? Thanks!
>
> Juliet
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?