Dear All,

I would be grateful if you can help me. My problem is the following:
I have a data set like:

ID time X1   X2
1  1    x111 x211
1  2    x112 x212
2  1    x121 x221
2  2    x122 x222
2  3    x123 x223

where X1 and X2 are 2 covariates, "time" is the time of observation and ID
indicates the cluster.

I want to merge the above data by creating new variables "X" and "type" as
follows:

ID time X    type
1  1    x111 X1
1  2    x112 X1
1  1    x211 X2
1  2    x212 X2
2  1    x121 X1
2  2    x122 X1
2  3    x123 X1
2  1    x221 X2
2  2    x222 X2
2  3    x223 X2

where "type" is a factor variable indicating if the observation is related to
X1 or X2...

Many thanks in advance,

Bernard
Hi,
This may not be the best solution, but at least it's
easy to see what I'm doing. Assume that your data set
is called "data":
# drop the 4th column (keeps ID, time, X1)
data1 = data[,-4]
# drop the 3rd column (keeps ID, time, X2)
data2 = data[,-3]
# use cbind to add an extra column containing only "X1"
data1 = cbind(data1, rep("X1", nrow(data1)))
# use cbind to add an extra column containing only "X2"
data2 = cbind(data2, rep("X2", nrow(data2)))
# give both pieces the same column names first; rbind on
# data frames stops with an error if the names differ
colnames(data1) <- c("ID", "time", "X", "type")
colnames(data2) <- c("ID", "time", "X", "type")
# use rbind to add them together as rows
data3 = rbind(data1, data2)
# show output
data3
The only thing I couldn't figure out is how to sort
the rows of the data set. Perhaps someone else could
help us out on this?
Martin
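On the sorting question: order() returns a row permutation that can be used to index the data frame. The following is only a sketch; the toy data frame repeats the placeholder values from the question, and the names data3 and sorted are just illustrative.

```r
# Toy version of the data from the question; the values are placeholders.
data <- data.frame(ID   = c(1, 1, 2, 2, 2),
                   time = c(1, 2, 1, 2, 3),
                   X1   = c("x111", "x112", "x121", "x122", "x123"),
                   X2   = c("x211", "x212", "x221", "x222", "x223"),
                   stringsAsFactors = FALSE)

# Stack the two covariates, labelling each block with its source column.
data3 <- rbind(
  data.frame(data[, c("ID", "time")], X = data$X1, type = "X1",
             stringsAsFactors = FALSE),
  data.frame(data[, c("ID", "time")], X = data$X2, type = "X2",
             stringsAsFactors = FALSE)
)

# Sort by cluster, then by covariate, then by time of observation.
sorted <- data3[order(data3$ID, data3$type, data3$time), ]
rownames(sorted) <- NULL
sorted
```

Sorting on ID, then type, then time reproduces the row order of the table requested in the original post.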
Marc Bernard <bernarduse1 at yahoo.fr> wrote:

> I want to merge the above data by creating a new variable "X" and "type" as
> follows:

> Where "type" is a factor variable indicating if the observation is related to
> X1 or X2...

Say your original data is in dataframe df, then this might do what you want.
The covariate columns are renamed before binding, because rbind() on data
frames stops with an error when the column names differ:

R> first <- df[, 1:3]
R> second <- df[, c(1, 2, 4)]
R> names(first)[3] <- "X"
R> names(second)[3] <- "X"
R> newdf <- rbind(first, second)
R> newdf$type <- substr(c(df[[3]], df[[4]]), 1, 2)

The last line assumes X1 and X2 are character columns whose values start with
the label, as in your example.

Cheers,

-- 
Sebastian P. Luque
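Taking the label from the first two characters of each value works for the placeholder values in the question, but real measurements would not carry it. A hedged alternative is to tag each block with the name of the column it came from. In this sketch the numeric values are made up, and covariates, blocks and newdf are illustrative names only.

```r
# Hypothetical numeric data in the layout from the question; values are made up.
df <- data.frame(ID   = c(1, 1, 2, 2, 2),
                 time = c(1, 2, 1, 2, 3),
                 X1   = c(0.5, 1.2, 0.7, 0.9, 1.1),
                 X2   = c(10, 12, 9, 11, 13))

covariates <- c("X1", "X2")

# One block of rows per covariate, each tagged with its source column name.
blocks <- lapply(covariates, function(v) {
  data.frame(ID = df$ID, time = df$time, X = df[[v]], type = v,
             stringsAsFactors = FALSE)
})
newdf <- do.call(rbind, blocks)

# Make "type" a factor, as asked for in the question.
newdf$type <- factor(newdf$type, levels = covariates)
newdf
```

Because the label comes from the column name, the same code works for any number of covariates listed in covariates.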
This is what reshape() does.

	-thomas

On Thu, 8 Sep 2005, Marc Bernard wrote:
> I want to merge the above data by creating a new variable "X" and "type"
> as follows:

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
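For reference, a reshape() call for this layout might look like the sketch below. It is one possible set of arguments under stated assumptions, not the only way to call it: the toy data frame repeats the placeholder values from the question, and the names df and long are illustrative.

```r
# Toy version of the data from the question; the values are placeholders.
df <- data.frame(ID   = c(1, 1, 2, 2, 2),
                 time = c(1, 2, 1, 2, 3),
                 X1   = c("x111", "x112", "x121", "x122", "x123"),
                 X2   = c("x211", "x212", "x221", "x222", "x223"),
                 stringsAsFactors = FALSE)

long <- reshape(df,
                direction = "long",
                varying   = c("X1", "X2"),    # columns to stack
                v.names   = "X",              # name of the stacked column
                timevar   = "type",           # new column saying which one
                times     = c("X1", "X2"),    # values to put in "type"
                idvar     = c("ID", "time"))  # together these identify a row

# Put rows and columns in the order shown in the question.
long <- long[order(long$ID, long$type, long$time),
             c("ID", "time", "X", "type")]
rownames(long) <- NULL
long$type <- factor(long$type)
long
```

Note that timevar here has nothing to do with the "time" column of the data: reshape() uses it for the variable that distinguishes the stacked columns, which is "type" in this problem.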
I am sure all of this works, but if you want the output to be exactly the way
you mentioned, do this:
temp<-read.table("yourfile", as.is=TRUE, header=TRUE)
temp1<-temp[, 1:3]
temp2<-temp[, c(1,2,4)]
colnames(temp1)[3]<-"X"
colnames(temp2)[3]<-"X"
temp3<-merge(temp1, temp2, all=TRUE)
temp3$type<-toupper(substr(temp3$X, 1,2))
after which you can generate factors and such.
Note that as.is=TRUE in read.table keeps the values of X1 and X2 as
characters, which substr needs.
P.S. I am sure you can use reshape instead of the second to the fifth
commands above:
?reshape
Jean
Marc Bernard <bernarduse1 <at> yahoo.fr> writes:

> I want to merge the above data by creating a new variable "X" and "type" as
> follows:
>
> ID time X type
> 1 1 x111 X1
....

Try reshape. And have courage: this is one of the more complex interfaces in
R, very powerful, but intimidating.

Dieter