thr3ads.net - R help - [R] data manipulation [Sep 2005]

Marc Bernard

2005-Sep-08 16:17 UTC

[R] data manipulation

Dear All,
 
I would be grateful if you can help me. My problem is the following:
I have a data set like:
 
ID  time      X1          X2
1    1          x111      x211
1    2          x112      x212
2    1          x121      x221
2    2          x122      x222
2    3          x123      x223
 
where X1 and X2 are 2 covariates and "time" is the time of observation
and ID indicates the cluster.
 
I want to merge the above data by creating two new variables, "X" and
"type", as follows:
 
ID   time    X            type
1     1      x111         X1
1     2      x112         X1
1     1      x211         X2
1     2      x212         X2
2     1      x121         X1
2     2      x122         X1
2     3      x123         X1
2     1      x221         X2
2     2      x222         X2
2     3      x223         X2

 
Where "type" is a factor variable indicating if the observation is
related to X1 or X2...
 
Many thanks in advance,
 
Bernard

		

Martin Lam

2005-Sep-08 17:12 UTC


Hi,

This may not be the best solution, but at least it's
easy to see what I'm doing. Assume that your data set
is called "data":

# remove the 4th column (keeps ID, time, X1)
data1 = data[,-4]

# remove the 3rd column (keeps ID, time, X2)
data2 = data[,-3]

# use cbind to add an extra column with only "X1" elements
data1 = cbind(data1, rep("X1", nrow(data1)))

# use cbind to add an extra column with only "X2" elements
data2 = cbind(data2, rep("X2", nrow(data2)))

# rename the columns of both pieces first, because rbind
# on data frames needs matching column names
colnames(data1) <- c("ID", "time", "X", "type")
colnames(data2) <- c("ID", "time", "X", "type")

# use rbind to add them together as rows
data3 = rbind(data1, data2)

# show output
data3

The only thing I couldn't figure out is how to sort
the rows of the data set. Perhaps someone else could
help us out on this?

Martin
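
For the open sorting question, a minimal sketch using order() might look like this. It assumes the stacked result is called data3 with the columns ID, time, X and type, as in the post above; the rows are the placeholder values from the original question, typed in by hand for illustration.

```r
# Stacked data in the order rbind() would produce:
# all X1 rows first, then all X2 rows.
data3 <- data.frame(
  ID   = c(1, 1, 2, 2, 2, 1, 1, 2, 2, 2),
  time = c(1, 2, 1, 2, 3, 1, 2, 1, 2, 3),
  X    = c("x111", "x112", "x121", "x122", "x123",
           "x211", "x212", "x221", "x222", "x223"),
  type = factor(rep(c("X1", "X2"), each = 5)),
  stringsAsFactors = FALSE
)

# order() returns the row permutation that sorts by ID,
# then by type within ID, then by time within type.
sorted <- data3[order(data3$ID, data3$type, data3$time), ]

# Drop the old row names so they do not show the pre-sort positions.
rownames(sorted) <- NULL
sorted
```

Sorting on ID, then type, then time gives the row order requested in the original question.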



	
		

Sebastian Luque

2005-Sep-08 17:12 UTC



Say your original data is in a data frame df, then this might do what you
want:

R> a <- df[, 1:3]; b <- df[, c(1, 2, 4)]
R> names(a)[3] <- names(b)[3] <- "X"
R> newdf <- rbind(a, b)
R> newdf$type <- factor(rep(names(df)[3:4], each = nrow(df)))

Cheers,

-- 
Sebastian P. Luque

Thomas Lumley

2005-Sep-08 17:30 UTC


This is what reshape() does.

 	-thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
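
A minimal sketch of the reshape() route, for readers who want to try it. The data frame name df and the placeholder values are taken from the question; the argument choices below are one plausible way to call it, not necessarily what the poster had in mind.

```r
# The wide data from the original question.
df <- data.frame(
  ID   = c(1, 1, 2, 2, 2),
  time = c(1, 2, 1, 2, 3),
  X1   = c("x111", "x112", "x121", "x122", "x123"),
  X2   = c("x211", "x212", "x221", "x222", "x223"),
  stringsAsFactors = FALSE
)

long <- reshape(
  df,
  direction = "long",
  varying   = c("X1", "X2"),   # columns to stack
  v.names   = "X",             # name of the stacked column
  timevar   = "type",          # new column saying which source column a row came from
  times     = c("X1", "X2"),   # labels written into that column
  idvar     = c("ID", "time")  # together these identify one original row
)

# Make type a factor and sort into the order shown in the question.
long$type <- factor(long$type)
long <- long[order(long$ID, long$type, long$time), ]
long
```

Note that reshape() calls its label column the "time" variable by default; setting timevar = "type" keeps it from clashing with the existing time column.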

Jean Eid

2005-Sep-08 18:08 UTC


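I am sure all of these work, but if you want the output to be exactly the
way you described, do this:

temp <- read.table("yourfile", as.is = TRUE, header = TRUE)
temp1 <- temp[, 1:3]
temp2 <- temp[, c(1, 2, 4)]
colnames(temp1)[3] <- "X"
colnames(temp2)[3] <- "X"
# ID, time and X are common to both pieces, so all = TRUE keeps every row
temp3 <- merge(temp1, temp2, all = TRUE)
# this works only because the example values start with "x1" or "x2"
temp3$type <- toupper(substr(temp3$X, 1, 2))
# merge() sorts by ID, time, X; reorder to get ID, type, time
temp3 <- temp3[order(temp3$ID, temp3$type, temp3$time), ]


After this you can generate factors and so on.
Note that as.is = TRUE in read.table keeps the variables X1 and X2 as
characters, which substr needs.


P.S. I am sure you can use reshape instead of the second to the fifth
commands above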

?reshape

Jean
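
The "generate factors" step mentioned above could be sketched as follows. It assumes a stacked data frame named temp3 with a character column type, as in the post; the four rows are the ID 1 placeholder values from the question, typed in by hand.

```r
temp3 <- data.frame(
  ID   = c(1, 1, 1, 1),
  time = c(1, 2, 1, 2),
  X    = c("x111", "x112", "x211", "x212"),
  type = c("X1", "X1", "X2", "X2"),
  stringsAsFactors = FALSE
)

# Convert the character labels into a factor with an explicit level order.
temp3$type <- factor(temp3$type, levels = c("X1", "X2"))

# Inspect the result: the levels and how many rows fall in each.
levels(temp3$type)
table(temp3$type)
```

Giving levels explicitly fixes the level order instead of leaving it to the sort order of the labels.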


Dieter Menne

2005-Sep-10 17:27 UTC



Try reshape. And have courage, this is one of the more complex interfaces in R, 
very powerful, but intimidating.

Dieter
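
One way to get used to the reshape() arguments is to run the opposite direction and check that the original layout comes back. The sketch below assumes the stacked data is in a data frame called long with the columns ID, time, X and type; the name long and the sep = "_" choice are illustrative only.

```r
# The stacked layout from the question, typed in by hand.
long <- data.frame(
  ID   = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
  time = c(1, 2, 1, 2, 1, 2, 3, 1, 2, 3),
  X    = c("x111", "x112", "x211", "x212",
           "x121", "x122", "x123", "x221", "x222", "x223"),
  type = rep(c("X1", "X2", "X1", "X2"), times = c(2, 2, 3, 3)),
  stringsAsFactors = FALSE
)

wide <- reshape(
  long,
  direction = "wide",
  idvar     = c("ID", "time"),  # columns that identify one row of the wide data
  timevar   = "type",           # column whose values become column suffixes
  v.names   = "X",              # column holding the values to spread out
  sep       = "_"               # new columns are named X_X1 and X_X2
)
wide
```

The same three roles (identifier columns, label column, value column) appear in both directions, which makes the long call easier to read once the wide one is familiar.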
