thr3ads.net - R help - [R] data manipulation/subsetting and relation matrix [Dec 2009]

Juliet Hannah

2009-Dec-08 01:38 UTC

[R] data manipulation/subsetting and relation matrix

Hi List,

Here is some example data.

myDat <- read.table(textConnection("group id
1 101
1 201
1 301
2 401
2 501
2 601
3 701
3 801
3 901"),header=TRUE)
closeAllConnections()

corr_mat <-read.table(textConnection("1 1   .5  0   0   0   0   0   0   0
2 .5   1  0   0   0   0   0   0   0
3 0    0  1.0   0   0   0   0   0   0
4 0    0  0   1   .5  .5  0   0   0
5 0    0  0   .5  1    .5  0   0   0
6 0    0  0   .5  .5   1 0    0   0
7 0    0  0   0    0   0  1   0  0
8 0   0   0   0    0   0   0  1  .5
9 0   0   0   0   0    0   0  .5 1"),header=FALSE)
closeAllConnections()

corr_mat <- corr_mat[,-1]
colnames(corr_mat) <- myDat$id
rownames(corr_mat) <- myDat$id

I need to subset this data so that the observations kept within each group are
not related to one another. A 0 in corr_mat means two observations are
unrelated; any nonzero value means they are related.

For example, within group 1, 101 and 201 are related, so only one of them can
be selected, say 101. 301 is not related to 101 or 201, so the final set for
group 1 consists of 101 and 301. There will always be at least 2 members in
each group. I need to carry out this task on all groups.

One possible final data set looks like:

  group  id
1     1 101
3     1 301
4     2 401
7     3 701
8     3 801
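
The constraint described above can be checked mechanically for any candidate subset. The following is only a sketch: `is_valid_subset` and `related_pairs` are hypothetical names, not anything from the post, and the example data is rebuilt inside the block so it runs by itself.

```r
# Example data from the post, rebuilt so the sketch runs on its own.
myDat <- data.frame(group = rep(1:3, each = 3),
                    id = seq(101, 901, by = 100))
corr_mat <- diag(9)
dimnames(corr_mat) <- list(myDat$id, myDat$id)

# Related pairs in the example matrix (filled in symmetrically).
related_pairs <- rbind(c("101", "201"), c("401", "501"), c("401", "601"),
                       c("501", "601"), c("801", "901"))
corr_mat[related_pairs] <- 0.5
corr_mat[related_pairs[, 2:1]] <- 0.5

# Hypothetical helper: TRUE if no two selected ids in the same group are related.
is_valid_subset <- function(sel, rel) {
    by_group <- split(as.character(sel$id), sel$group)
    all(vapply(by_group, function(ids) {
        sub <- rel[ids, ids, drop = FALSE]
        diag(sub) <- 0   # ignore each id's relation to itself
        all(sub == 0)
    }, logical(1)))
}

# The candidate subset listed in the post, and the full data for comparison.
candidate <- myDat[myDat$id %in% c(101, 301, 401, 701, 801), ]
is_valid_subset(candidate, corr_mat)
is_valid_subset(myDat, corr_mat)
```

The first call checks the subset shown above; the second checks the unfiltered data, which still contains related pairs within groups.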

Any suggestions? Thanks!

Juliet

jim holtman

2009-Dec-08 12:02 UTC


Try this:

myDat <- read.table(textConnection("group id
1 101
1 201
1 301
2 401
2 501
2 601
3 701
3 801
3 901"),header=TRUE)
closeAllConnections()
corr_mat <-as.matrix(read.table(textConnection("1 1   .5  0   0   0   0   0   0   0
2 .5   1  0   0   0   0   0   0   0
3 0    0  1.0   0   0   0   0   0   0
4 0    0  0   1   .5  .5  0   0   0
5 0    0  0   .5  1    .5  0   0   0
6 0    0  0   .5  .5   1 0    0   0
7 0    0  0   0    0   0  1   0  0
8 0   0   0   0    0   0   0  1  .5
9 0   0   0   0   0    0   0  .5 1"),header=FALSE))
closeAllConnections()
corr_mat <- corr_mat[,-1]
colnames(corr_mat) <- myDat$id
rownames(corr_mat) <- myDat$id
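# split out the groups (ids as character so they index corr_mat by name)
groups <- split(as.character(myDat$id), myDat$group)
# process each subgroup
result <- lapply(groups, function(.grp){
    subgroup <- corr_mat[.grp, .grp]
    output <- NULL
    # zero the diag so an id is not counted as related to itself
    diag(subgroup) <- 0
    # TRUE for ids related to at least one other member of the group
    same <- apply(subgroup, 1, function(x) any(x != 0))
    if (any(same)){  # some match, choose one at random
        # note: only one id is kept out of all related ids in the group
        output <- sample(same[same], 1)
    }
    if (any(!same)){  # get all that don't correlate with anyone
        output <- c(output, same[!same])
    }
    output  # named logical vector; the names are the selected ids
})
# output as a two-column character matrix (group, id)
do.call(rbind, lapply(names(result), function(x) cbind(x, names(result[[x]]))))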
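
The code above keeps a single id out of all the related ids in a group, which is enough for the example data. If a group could contain several separate clusters of related ids, a greedy pass is one alternative. This is only a sketch and not part of the original reply: `pick_unrelated` and `related_pairs` are hypothetical names, and the example data is rebuilt inside the block so it runs by itself.

```r
# Example data from the thread, rebuilt so the sketch runs on its own.
myDat <- data.frame(group = rep(1:3, each = 3),
                    id = seq(101, 901, by = 100))
corr_mat <- diag(9)
dimnames(corr_mat) <- list(myDat$id, myDat$id)

# Related pairs in the example matrix (filled in symmetrically).
related_pairs <- rbind(c("101", "201"), c("401", "501"), c("401", "601"),
                       c("501", "601"), c("801", "901"))
corr_mat[related_pairs] <- 0.5
corr_mat[related_pairs[, 2:1]] <- 0.5

# Hypothetical helper: walk the ids in order and keep an id only if it is
# unrelated (value 0) to every id kept so far.
pick_unrelated <- function(ids, rel) {
    kept <- character(0)
    for (id in ids) {
        if (all(rel[id, kept] == 0)) kept <- c(kept, id)
    }
    kept
}

# Apply the greedy pass within each group, then subset the original data.
groups <- split(as.character(myDat$id), myDat$group)
keep_ids <- unlist(lapply(groups, pick_unrelated, rel = corr_mat),
                   use.names = FALSE)
result <- myDat[as.character(myDat$id) %in% keep_ids, ]
print(result)
```

The greedy pass depends on the order of the ids and always keeps the first id of each group. It returns a set to which no further id can be added, but not necessarily the largest possible one. Shuffling the ids within each group (for example with `sample(ids)`) before the loop would give a random choice among related ids, as in the code above.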




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
