thr3ads.net - R help - [R] Consolidate column contents of equally "named" columns [Apr 2012]

Daniel Malter

2012-Apr-28 14:46 UTC

[R] Consolidate column contents of equally "named" columns

Hi,

I have a data frame whose first row (not the header) contains the true
column names. The same column name can occur multiple times in the dataset.
Columns with equal names are not adjacent, and for each observation only one
of the equally named columns contains the actual data (see the example
below). I am looking for an easy method to consolidate these columns into one
column for each unique column name. Say,

x1<-c("x",1,NA,NA)
x2<-c("x",NA,2,NA)
x3<-c("x",NA,NA,3)
y1<-c("y",3,NA,NA)
y2<-c("y",NA,1,NA)
y3<-c("y",NA,NA,2)
d<-data.frame(x1,y1,x2,y2,x3,y3)
d

# d looks like:

    x1   y1   x2   y2   x3   y3
1    x    y    x    y    x    y
2    1    3 <NA> <NA> <NA> <NA>
3 <NA> <NA>    2    1 <NA> <NA>
4 <NA> <NA> <NA> <NA>    3    2

From this, I want to create the table or data frame

x y
1 3
2 1
3 2

I would appreciate your help.

Daniel


--
View this message in context:
http://r.789695.n4.nabble.com/Consolidate-column-contents-of-equally-named-columns-tp4594852p4594852.html
Sent from the R help mailing list archive at Nabble.com.

Rui Barradas

2012-Apr-28 16:33 UTC


Hello,

This solution is not very pretty but it works.

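# True column names live in the first row of d
# (assumes d's columns are character, not factor).
nms <- unlist(d[1, ])
nm <- unique(nms)
# For each unique name, interleave the values of its columns row by row;
# na.exclude() then drops every row that still holds an NA.
dd <- na.exclude(sapply(nm, function(jj){
    inx <- nms %in% jj
    do.call(rbind, as.list(d[, inx]))
}))
# Drop the rows that repeat the names, then convert to integer.
dd <- dd[ dd[ , nm[1]] != nm[1], ]
dd <- data.frame(apply(dd, 2, as.integer))
dd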

Hope this helps,

Rui Barradas



David Winsemius

2012-Apr-29 13:05 UTC


On Apr 28, 2012, at 10:46 AM, Daniel Malter wrote:
> Hi,
>
> I have a data frame whose first row (not the header) contains the true
> column names. The same column name can occur multiple times in the dataset.
> Columns with equal names are not adjacent, and for each observation only one
> of the equally named columns contains the actual data (see the example
> below). I am looking for an easy method to consolidate these columns into one
> column for each unique column name. Say,
>
> x1<-c("x",1,NA,NA)
> x2<-c("x",NA,2,NA)
> x3<-c("x",NA,NA,3)
> y1<-c("y",3,NA,NA)
> y2<-c("y",NA,1,NA)
> y3<-c("y",NA,NA,2)
> d<-data.frame(x1,y1,x2,y2,x3,y3)
> d
>
It would avoid problems with manipulating factors if these were
created as (or converted to) character columns. Choose one of:

d=data.frame(x1,y1,x2,y2,x3,y3, stringsAsFactors=FALSE)

d[]<-lapply(d, as.character)
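
To illustrate why factor columns get in the way here, the following minimal sketch shows that as.numeric() on a factor returns the internal level codes rather than the printed values, so a round trip through as.character() is needed. The vector f is made up for illustration and is not part of the thread's data.

```r
# Hypothetical example vector, mimicking one column of d stored as a factor
f <- factor(c("x", "1", NA, NA))

# as.numeric() on a factor gives the integer level codes, not the labels
codes <- as.numeric(f)

# Going through as.character() recovers the labels; the non-numeric
# label "x" becomes NA (the coercion warning is suppressed here)
vals <- suppressWarnings(as.numeric(as.character(f)))
```

With character columns from the start, this detour is unnecessary.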

> # d looks like:
>
>     x1   y1   x2   y2   x3   y3
> 1    x    y    x    y    x    y
> 2    1    3 <NA> <NA> <NA> <NA>
> 3 <NA> <NA>    2    1 <NA> <NA>
> 4 <NA> <NA> <NA> <NA>    3    2
>
> From this, I want to create the table or data frame
>
> x y
> 1 3
> 2 1
> 3 2

na.omit(
    data.frame(
        X = stack(d[-1, grep("x", names(d))]),
        Y = stack(d[-1, grep("y", names(d))]),
        stringsAsFactors = FALSE)[c(1, 3)])

  X.values Y.values
1        1        3
5        2        1
9        3        2


If the data were less regular, you might need to merge on the "ind"
columns that stack() generates, which record the source column of each value.
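
One way the "ind" column could be used is sketched below, as an assumption about what such a variant might look like rather than code from the thread. It stacks all data rows at once, looks up each value's true name via the source column recorded in ind, and splits on that name. The names true_names, long and res are invented for this sketch, and the final data.frame() call still assumes every name ends up with the same number of values.

```r
# Example data from the original post, kept as character columns
d <- data.frame(x1 = c("x", 1, NA, NA), y1 = c("y", 3, NA, NA),
                x2 = c("x", NA, 2, NA), y2 = c("y", NA, 1, NA),
                x3 = c("x", NA, NA, 3), y3 = c("y", NA, NA, 2),
                stringsAsFactors = FALSE)

# First row maps each header (x1, y1, ...) to its true name ("x" or "y")
true_names <- unlist(d[1, ])

# Long format: one row per cell, with "values" and the source column "ind"
long <- na.omit(stack(d[-1, ]))

# Look up the true name of each value through its source column
long$name <- true_names[as.character(long$ind)]

# One column per true name, converted to numeric
res <- data.frame(lapply(split(long$values, long$name), as.numeric))
res
```

Because the lookup goes through ind, the pattern in the header names (x1, x2, ...) does not matter; only the first row of d does.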


-- 
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

Bert Gunter

2012-Apr-29 14:48 UTC


I believe the regularity of the problem allows a (to me, anyway)
simpler procedure.

td <-  t(apply(d,2, na.omit))
data.frame(split(as.numeric(td[,-1]),td[,1]))
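
As a self-contained sketch of the two lines above, the version below rebuilds the example data and adds a guard for the regularity being relied on: every column must hold exactly one name and one non-NA value, so that apply() returns a matrix rather than a list. The stopifnot() check and the name res are additions for illustration, not part of the posted solution.

```r
# Example data from the original post
d <- data.frame(x1 = c("x", 1, NA, NA), y1 = c("y", 3, NA, NA),
                x2 = c("x", NA, 2, NA), y2 = c("y", NA, 1, NA),
                x3 = c("x", NA, NA, 3), y3 = c("y", NA, NA, 2),
                stringsAsFactors = FALSE)

# Drop the NAs column by column; after transposing, td has one row per
# original column: the true name first, then the single data value
td <- t(apply(d, 2, na.omit))

# Guard (added for this sketch): the result must be a two-column matrix
stopifnot(is.matrix(td), ncol(td) == 2)

# Group the values by true name and assemble the result
res <- data.frame(split(as.numeric(td[, -1]), td[, 1]))
res
```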


-- Bert

On Sat, Apr 28, 2012 at 9:33 AM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
> Hello,
>
> This solution is not very pretty but it works.
>
> nms <- unlist(d[1, ])
> nm <- unique(nms)
> dd <- na.exclude(sapply(nm, function(jj){
>     inx <- nms %in% jj
>     do.call(rbind, as.list(d[, inx]))
> }))
> dd <- dd[ dd[ , nm[1]] != nm[1], ]
> dd <- data.frame(apply(dd, 2, as.integer))
> dd
>
> Hope this helps,
>
> Rui Barradas


-- 

Bert Gunter
Genentech Nonclinical Biostatistics

