Dear All,

I would be grateful if you can help me. My problem is the following:
I have a data set like:

ID  time  X1    X2
1   1     x111  x211
1   2     x112  x212
2   1     x121  x221
2   2     x122  x222
2   3     x123  x223

where X1 and X2 are 2 covariates and "time" is the time of observation and ID indicates the cluster.

I want to merge the above data by creating a new variable "X" and "type" as follows:

ID  time  X     type
1   1     x111  X1
1   2     x112  X1
1   1     x211  X2
1   2     x212  X2
2   1     x121  X1
2   2     x122  X1
2   3     x123  X1
2   1     x221  X2
2   2     x222  X2
2   3     x223  X2

Where "type" is a factor variable indicating if the observation is related to X1 or X2...

Many thanks in advance,

Bernard

Hi,

This may not be the best solution, but at least it's easy to see what I'm doing. Assume that your data set is called "data":

    # drop the 4th column (X2), keeping ID, time and X1
    data1 <- data[, -4]
    # drop the 3rd column (X1), keeping ID, time and X2
    data2 <- data[, -3]
    # use cbind to add an extra column with only "X1" elements
    data1 <- cbind(data1, type = "X1")
    # use cbind to add an extra column with only "X2" elements
    data2 <- cbind(data2, type = "X2")
    # give both pieces the same column names; rbind() on data frames
    # stops with an error if the names differ (X1 vs. X2)
    colnames(data1) <- colnames(data2) <- c("ID", "time", "X", "type")
    # use rbind to add them together as rows
    data3 <- rbind(data1, data2)
    # show output
    data3

The only thing I couldn't figure out is how to sort the rows of the data set; perhaps someone else could help us out on this?

Martin

--- Marc Bernard <bernarduse1 at yahoo.fr> wrote:
> I want to merge the above data by creating a new
> variable "X" and "type" as follows:
> [...]

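On the open question of sorting the rows: one common approach is order(), which returns a row permutation that can be used to index the data frame. The sketch below is not from the thread. The data frame `stacked` is a hypothetical stand-in for the stacked result, built from the placeholder values in the question, and it assumes the columns are named ID, time, X and type.

```r
# Hypothetical stand-in for the stacked data (all X1 rows, then all X2 rows)
stacked <- data.frame(
  ID   = c(1, 1, 2, 2, 2, 1, 1, 2, 2, 2),
  time = c(1, 2, 1, 2, 3, 1, 2, 1, 2, 3),
  X    = c("x111", "x112", "x121", "x122", "x123",
           "x211", "x212", "x221", "x222", "x223"),
  type = rep(c("X1", "X2"), each = 5),
  stringsAsFactors = FALSE
)

# Sort by ID first, then by type within ID, then by time within type
sorted <- stacked[order(stacked$ID, stacked$type, stacked$time), ]

# Drop the old row names so the rows are numbered 1..n again
row.names(sorted) <- NULL
sorted
```

Sorting on ID, then type, then time reproduces the row order requested in the original message: within each ID, the X1 block comes before the X2 block.
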
Marc Bernard <bernarduse1 at yahoo.fr> wrote:

> I want to merge the above data by creating a new variable "X" and "type" as
> follows:
> [...]
> Where "type" is a factor variable indicating if the observation is related to
> X1 or X2...

Say your original data is in dataframe df, then this might do what you want:

R> part1 <- df[, 1:3]; part2 <- df[, c(1, 2, 4)]
R> names(part1)[3] <- names(part2)[3] <- "X"   # rbind() needs matching column names
R> newdf <- rbind(part1, part2)
R> newdf$type <- factor(rep(names(df)[3:4], each = nrow(df)))   # "X1" for the first block, "X2" for the second

Cheers,

-- 
Sebastian P. Luque

This is what reshape() does.

    -thomas

On Thu, 8 Sep 2005, Marc Bernard wrote:
> I want to merge the above data by creating a new variable "X" and "type" as follows:
> [...]

Thomas Lumley
Assoc. Professor, Biostatistics
tlumley at u.washington.edu
University of Washington, Seattle

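For readers who want to try the reshape() route, here is a minimal sketch applied to the example data from the question. The argument choices are assumptions rather than anything spelled out in the thread: `timevar = "type"` is used because the data already contain a column called "time" (the default name reshape() would use), and `idvar = c("ID", "time")` treats each ID/time pair as one record. The final sort and column reordering are only there to match the layout requested in the question.

```r
# Example data from the question (placeholder values)
df <- data.frame(
  ID   = c(1, 1, 2, 2, 2),
  time = c(1, 2, 1, 2, 3),
  X1   = c("x111", "x112", "x121", "x122", "x123"),
  X2   = c("x211", "x212", "x221", "x222", "x223"),
  stringsAsFactors = FALSE
)

# Stack X1 and X2 into a single column X, recording the source in "type"
long <- reshape(
  df,
  direction = "long",
  varying   = list(c("X1", "X2")),  # the columns to stack
  v.names   = "X",                  # name of the stacked column
  timevar   = "type",               # avoid the default "time", which already exists
  times     = c("X1", "X2"),        # values written into "type"
  idvar     = c("ID", "time")       # one record per ID/time pair
)

# Make "type" a factor, then sort and reorder columns as in the question
long$type <- factor(long$type)
long <- long[order(long$ID, long$type, long$time), c("ID", "time", "X", "type")]
row.names(long) <- NULL
long
```
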
I am sure all of this works, but if you want the output to be exactly the way you mentioned, do this:

    temp <- read.table("yourfile", as.is = TRUE, header = TRUE)
    temp1 <- temp[, 1:3]
    temp2 <- temp[, c(1, 2, 4)]
    colnames(temp1)[3] <- "X"
    colnames(temp2)[3] <- "X"
    temp3 <- merge(temp1, temp2, all = TRUE)
    temp3$type <- toupper(substr(temp3$X, 1, 2))
    # merge() sorts by ID, time and X, so re-sort to put the X1 block
    # before the X2 block within each ID
    temp3 <- temp3[order(temp3$ID, temp3$type, temp3$time), ]

after which you can generate factors and such. Note that as.is = TRUE in read.table keeps the variables X1 and X2 as characters; this is needed for substr. The substr step recovers the type only because the example values (x111, x211, ...) begin with the covariate name.

P.S. I am sure you can use reshape instead of the second to the fifth commands above; see ?reshape

Jean

On Thu, 8 Sep 2005, Sebastian Luque wrote:
> Say your original data is in dataframe df, then this might do what you
> want:
> [...]

Marc Bernard <bernarduse1 <at> yahoo.fr> writes:

> I would be grateful if you can help me. My problem is the following:
> I have a data set like:
>
> ID time X1 X2
> 1 1 x111 x211
> 1 2 x112 x212
....
> where X1 and X2 are 2 covariates and "time" is the time of observation and ID indicates the cluster.
>
> I want to merge the above data by creating a new variable "X" and "type" as follows:
>
> ID time X type
> 1 1 x111 X1
....

Try reshape. And have courage, this is one of the more complex interfaces in R, very powerful, but intimidating.

Dieter