Dear R Helpers,

I have trouble applying reShape and reshape although I read the documentation and several posts, so I would very much appreciate your help on the two points below.

I have a dataframe

df = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c("12", "13", "14", "20", "25", "30"), X2 = c(200, 250, 300, 600, 700, 4))

> df
  Name X1  X2
1    a 12 200
2    a 13 250
3    a 14 300
4    b 20 600
5    b 25 700
6    c 30 900

First I need to add an additional column to this dataframe that will count the number of rows per each Name entry. The resulting df should look like:

df.index = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 20, 25, 30), X2 = c(200, 250, 300, 600, 700, 4), Index=c(1,2,3,1,2,1))

> df.index
  Name X1  X2 Index
1    a 12 200     1
2    a 13 250     2
3    a 14 300     3
4    b 20 600     1
5    b 25 700     2
6    c 30 900     1

How can I do this?

Secondly, I would like to reshape this dataframe in the form:

> df2
   1  2  3
a 12 13 14
b 20 25 NA
c 30 NA NA

Since the df is sorted by Name and X2, I would need that the available X1 values populate the resulting rows in df2 from left to right (i.e. if only one value is available, it is written in the first column and the remaining columns get NAs). If I could generate the Index column, I think I could accomplish this with:

df2 = reShape(df.index$X1, id=df.index$Name, colvar=df.index$Index)
colnames(df2) = c("V1", "V2", "V3")

However, is there a way to get to df2 without using the Index column and still have the NAs written as described above?

Thank you so much for your help on these two issues.

With best regards,
Dana Sevak

Dana Sevak wrote:
> Dear R Helpers,
>
> I have trouble applying reShape and reshape although I read the documentation and several posts, so I would very much appreciate your help on the two points below.
>
> I have a dataframe
>
> df = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c("12", "13", "14", "20", "25", "30"), X2 = c(200, 250, 300, 600, 700, 4))
>
>> df
>   Name X1  X2
> 1    a 12 200
> 2    a 13 250
> 3    a 14 300
> 4    b 20 600
> 5    b 25 700
> 6    c 30 900
>
> First I need to add an additional column to this dataframe that will count the number of rows per each Name entry. The resulting df should look like:
>
> df.index = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 20, 25, 30), X2 = c(200, 250, 300, 600, 700, 4), Index=c(1,2,3,1,2,1))
>
>> df.index
>   Name X1  X2 Index
> 1    a 12 200     1
> 2    a 13 250     2
> 3    a 14 300     3
> 4    b 20 600     1
> 5    b 25 700     2
> 6    c 30 900     1
>
> How can I do this?
>
>
> Secondly, I would like to reshape this dataframe in the form:
>
>> df2
>    1  2  3
> a 12 13 14
> b 20 25 NA
> c 30 NA NA

This does it more or less your way:

# one data frame per Name
ds <- split(df, df$Name)
# add a running row count within each piece
ds <- lapply(ds, function(x){x$Index <- seq_along(x[,1]); x})
# put the pieces back together in the original row order
df2 <- unsplit(ds, df$Name)
# Name x Index table of the X1 values; empty cells become NA
tapply(df2$X1, df2[,c("Name", "Index")], function(x) x)

although there may exist much easier ways ...

Uwe Ligges

> Since the df is sorted by Name and X2, I would need that the available X1 values populate the resulting rows in df2 from left to right (i.e. if only one value is available, it is written in the first column and the remaining columns get NAs). If I could generate the Index column, I think I could accomplish this with:
>
> df2 = reShape(df.index$X1, id=df.index$Name, colvar=df.index$Index)
> colnames(df2) = c("V1", "V2", "V3")
>
> However, is there a way to get to df2 without using the Index column and still have the NAs written as described above?
>
> Thank you so much for your help on these two issues.
>
> With best regards,
> Dana Sevak
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

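On the closing question in the quoted message (getting to df2 without an explicit Index column), one base-R possibility is to split X1 by Name and pad each group with NA up to the length of the longest group. This is only a sketch and was not proposed in the thread: it assumes X1 is stored as numbers rather than strings, and the names `vals` and `width` are made up for illustration.

```r
# Example data from the original post, with X1 stored as numbers (an assumption).
df <- data.frame(Name = c("a", "a", "a", "b", "b", "c"),
                 X1 = c(12, 13, 14, 20, 25, 30),
                 X2 = c(200, 250, 300, 600, 700, 4))

# One vector of X1 values per Name, in the order the rows appear in df.
vals <- split(df$X1, df$Name)

# Length of the longest group; shorter groups are padded up to this width.
width <- max(lengths(vals))

# Assigning a larger length to a vector fills the new positions with NA.
df2 <- do.call(rbind, lapply(vals, function(v) { length(v) <- width; v }))
colnames(df2) <- paste0("V", seq_len(width))
df2
```

The values fill from left to right because split() keeps the row order within each Name, so the result relies on df already being sorted as described in the question. Note that df2 here is a matrix with the Names as row names, not a data frame.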
One way is the following:

df.index <- df
df.index$Index <- ave(seq_along(df$Name), df$Name, FUN = seq_along)
df.index

df2 <- reshape(df.index[c("Name", "Index", "X1")], timevar = "Index",
    idvar = "Name", direction = "wide")
df2

I hope it helps.

Best,
Dimitris

Dana Sevak wrote:
> Dear R Helpers,
>
> I have trouble applying reShape and reshape although I read the documentation and several posts, so I would very much appreciate your help on the two points below.
>
> I have a dataframe
>
> df = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c("12", "13", "14", "20", "25", "30"), X2 = c(200, 250, 300, 600, 700, 4))
>
>> df
>   Name X1  X2
> 1    a 12 200
> 2    a 13 250
> 3    a 14 300
> 4    b 20 600
> 5    b 25 700
> 6    c 30 900
>
> First I need to add an additional column to this dataframe that will count the number of rows per each Name entry. The resulting df should look like:
>
> df.index = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 20, 25, 30), X2 = c(200, 250, 300, 600, 700, 4), Index=c(1,2,3,1,2,1))
>
>> df.index
>   Name X1  X2 Index
> 1    a 12 200     1
> 2    a 13 250     2
> 3    a 14 300     3
> 4    b 20 600     1
> 5    b 25 700     2
> 6    c 30 900     1
>
> How can I do this?
>
>
> Secondly, I would like to reshape this dataframe in the form:
>
>> df2
>    1  2  3
> a 12 13 14
> b 20 25 NA
> c 30 NA NA
>
> Since the df is sorted by Name and X2, I would need that the available X1 values populate the resulting rows in df2 from left to right (i.e. if only one value is available, it is written in the first column and the remaining columns get NAs). If I could generate the Index column, I think I could accomplish this with:
>
> df2 = reShape(df.index$X1, id=df.index$Name, colvar=df.index$Index)
> colnames(df2) = c("V1", "V2", "V3")
>
> However, is there a way to get to df2 without using the Index column and still have the NAs written as described above?
>
> Thank you so much for your help on these two issues.
>
> With best regards,
> Dana Sevak

-- 
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014

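For readers unsure what the `ave()` call is doing: it splits its first argument by the grouping variable, applies `FUN` to each piece, and writes the results back into the original row positions, so `seq_along` gives a running count within each Name. The sketch below uses a deliberately unsorted copy of the data (the `shuffled` data frame is made up for illustration, with numeric X1 assumed) to show that the count does not depend on the rows of one Name being adjacent.

```r
# Hypothetical, unsorted variant of the example data.
shuffled <- data.frame(Name = c("b", "a", "c", "a", "b", "a"),
                       X1 = c(20, 12, 30, 13, 25, 14))

# Running count within each Name, returned in the original row order.
shuffled$Index <- ave(seq_along(shuffled$Name), shuffled$Name, FUN = seq_along)
shuffled
```

Here the three "a" rows get Index 1, 2 and 3 in order of appearance even though other rows sit between them. If the left-to-right order in the reshaped result matters, the data still has to be sorted first, since the count follows row order.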
Try this:

# DF is the data frame from the original post (called df there)
DF$Index <- ave(seq_len(nrow(DF)), DF$Name, FUN = seq_along)
# DF[-3] drops the third column (X2) before reshaping
reshape(DF[-3], direction = "wide", idvar = "Name", timevar = "Index")

Also see the reshape package for another similar facility.

On Wed, May 13, 2009 at 2:02 AM, Dana Sevak <dana.sevak at yahoo.com> wrote:
>
> Dear R Helpers,
>
> I have trouble applying reShape and reshape although I read the documentation and several posts, so I would very much appreciate your help on the two points below.
>
> I have a dataframe
>
> df = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c("12", "13", "14", "20", "25", "30"), X2 = c(200, 250, 300, 600, 700, 4))
>
>> df
>   Name X1  X2
> 1    a 12 200
> 2    a 13 250
> 3    a 14 300
> 4    b 20 600
> 5    b 25 700
> 6    c 30 900
>
> First I need to add an additional column to this dataframe that will count the number of rows per each Name entry. The resulting df should look like:
>
> df.index = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 20, 25, 30), X2 = c(200, 250, 300, 600, 700, 4), Index=c(1,2,3,1,2,1))
>
>> df.index
>   Name X1  X2 Index
> 1    a 12 200     1
> 2    a 13 250     2
> 3    a 14 300     3
> 4    b 20 600     1
> 5    b 25 700     2
> 6    c 30 900     1
>
> How can I do this?
>
>
> Secondly, I would like to reshape this dataframe in the form:
>
>> df2
>    1  2  3
> a 12 13 14
> b 20 25 NA
> c 30 NA NA
>
> Since the df is sorted by Name and X2, I would need that the available X1 values populate the resulting rows in df2 from left to right (i.e. if only one value is available, it is written in the first column and the remaining columns get NAs). If I could generate the Index column, I think I could accomplish this with:
>
> df2 = reShape(df.index$X1, id=df.index$Name, colvar=df.index$Index)
> colnames(df2) = c("V1", "V2", "V3")
>
> However, is there a way to get to df2 without using the Index column and still have the NAs written as described above?
>
> Thank you so much for your help on these two issues.
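`reshape()` in the wide direction names the new columns after the reshaped variable and the time value (X1.1, X1.2, X1.3) and keeps Name as an ordinary column. The following sketch, which is not from the thread, shows one way to get from there to the V1, V2, V3 layout asked for in the question. It assumes numeric X1, selects columns by name rather than by position, and uses `wide` as an illustrative variable name.

```r
# Example data from the original post, with X1 stored as numbers (an assumption).
DF <- data.frame(Name = c("a", "a", "a", "b", "b", "c"),
                 X1 = c(12, 13, 14, 20, 25, 30),
                 X2 = c(200, 250, 300, 600, 700, 4))

# Running count within each Name, then one row per Name and one column per Index.
DF$Index <- ave(seq_len(nrow(DF)), DF$Name, FUN = seq_along)
wide <- reshape(DF[c("Name", "Index", "X1")], direction = "wide",
                idvar = "Name", timevar = "Index")

# Move Name into the row names and rename the value columns to V1, V2, ...
rownames(wide) <- wide$Name
wide$Name <- NULL
names(wide) <- paste0("V", seq_along(wide))
wide
```

Selecting `DF[c("Name", "Index", "X1")]` instead of `DF[-3]` gives the same columns for this data but does not break if the column order changes.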
Hi Dana,

> ---------- Forwarded message ----------
> From: Dana Sevak <dana.sevak at yahoo.com>
> To: r-help at r-project.org
> Date: Tue, 12 May 2009 23:02:00 -0700 (PDT)
> Subject: [R] Help with reshape/reShape and indexing
>
> Dear R Helpers,
>
> I have trouble applying reShape and reshape although I read the documentation and several posts, so I would very much appreciate your help on the two points below.
>

There are usually many ways to accomplish any given task in R, and which one you use is a matter of preference. I've settled on using the reshape package for these kinds of tasks. If you're comfortable with the solutions already suggested there's no need to continue reading. Otherwise here's another approach:

> I have a dataframe
>
> df = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c("12", "13", "14", "20", "25", "30"), X2 = c(200, 250, 300, 600, 700, 4))
>
> > df
>   Name X1  X2
> 1    a 12 200
> 2    a 13 250
> 3    a 14 300
> 4    b 20 600
> 5    b 25 700
> 6    c 30 900
>
> First I need to add an additional column to this dataframe that will count the number of rows per each Name entry. The resulting df should look like:
>
> df.index = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 20, 25, 30), X2 = c(200, 250, 300, 600, 700, 4), Index=c(1,2,3,1,2,1))
>
> > df.index
>   Name X1  X2 Index
> 1    a 12 200     1
> 2    a 13 250     2
> 3    a 14 300     3
> 4    b 20 600     1
> 5    b 25 700     2
> 6    c 30 900     1
>
> How can I do this?
>

Easy enough with the plyr package (loaded with reshape):

df = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c("12", "13", "14", "20", "25", "30"), X2 = c(200, 250, 300, 600, 700, 4))
library(reshape)
# running row count within each Name
df$Index <- ddply(df, "Name", colwise(seq_along))[,1]

> Secondly, I would like to reshape this dataframe in the form:
>
> > df2
>    1  2  3
> a 12 13 14
> b 20 25 NA
> c 30 NA NA
>
> Since the df is sorted by Name and X2, I would need that the available X1 values populate the resulting rows in df2 from left to right (i.e. if only one value is available, it is written in the first column and the remaining columns get NAs).

I don't really understand this. What happened to X2? Anyway, I would do it like this:

df$X2 <- NULL
m.df <- melt(df, measure.vars="X1")
df.final <- cast(m.df, ... ~ Index)
df.final
  Name variable  1    2    3
1    a       X1 12   13   14
2    b       X1 20   25 <NA>
3    c       X1 30 <NA> <NA>

But I don't see why you want to drop X2, so I would actually do

df = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c("12", "13", "14", "20", "25", "30"), X2 = c(200, 250, 300, 600, 700, 4))
df$Index <- ddply(df, "Name", colwise(seq_along))[,1]
# X1 holds strings, so make X2 the same type before melting both together
df$X2 <- as.character(df$X2)
m.df <- melt(df, measure.vars=c("X1","X2"))
df.final <- cast(m.df, ... ~ Index)
df.final
  Name variable   1    2    3
1    a       X1  12   13   14
2    a       X2 200  250  300
3    b       X1  20   25 <NA>
4    b       X2 600  700 <NA>
5    c       X1  30 <NA> <NA>
6    c       X2   4 <NA> <NA>

All the best,
Ista

> If I could generate the Index column, I think I could accomplish this with:
>
> df2 = reShape(df.index$X1, id=df.index$Name, colvar=df.index$Index)
> colnames(df2) = c("V1", "V2", "V3")
>
> However, is there a way to get to df2 without using the Index column and still have the NAs written as described above?
>
> Thank you so much for your help on these two issues.
>
> With best regards,
> Dana Sevak
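A side note on the `<NA>` entries in the printed results: R prints missing values that way in character and factor columns, and as plain `NA` in numeric ones. They show up here because X1 was created from quoted strings. If numeric values are wanted, X1 can be converted before reshaping. The sketch below is not part of the original reply; it uses `as.numeric(as.character(...))`, which behaves the same whether X1 ended up as a factor or as a character vector, and the small `demo` data frame is made up for illustration.

```r
# Data as created in the original post: X1 is given as strings.
df <- data.frame(Name = c("a", "a", "a", "b", "b", "c"),
                 X1 = c("12", "13", "14", "20", "25", "30"),
                 X2 = c(200, 250, 300, 600, 700, 4))

# Convert X1 to numbers; going through as.character() also handles a factor.
df$X1 <- as.numeric(as.character(df$X1))
is.numeric(df$X1)

# Illustration of how missing values print by column type.
demo <- data.frame(num = c(1, NA), chr = c("1", NA))
print(demo)
```

After the conversion, any of the reshaping approaches in this thread should give numeric columns, with missing cells shown as `NA`.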