Dear colleagues,

I'd like to reshape a data frame from long format to wide format, but I do not quite get what I want. Here is an example of the data I have (dat):

  sp <- c("a", "a", "a", "a", "b", "b", "b", "c", "d", "d", "d", "d")
  tr <- c("A", "B", "B", "C", "A", "B", "C", "A", "A", "B", "C", "C")
  code <- c("a1", "a2", "a2", "a3", "a3", "a3", "a4", "a4", "a4", "a5", "a5", "a6")
  dat <- data.frame(id=1:12, sp=sp, tr=tr, val=31:42, code=code)

Below is what I'd like to obtain. That is, I'd like the values of the tr variable in separate columns (as a timevar), each holding its value (val).

  sp code tr.A tr.B tr.C
  a  a1   31   NA   NA
  a  a2   NA   32   NA
  a  a2   NA   33   NA   **
  a  a3   NA   NA   34
  b  a3   35   36   NA
  b  a4   NA   NA   37
  c  a4   38   NA   NA
  d  a4   39   NA   NA
  d  a5   NA   40   41
  d  a6   NA   NA   42

Using reshape:

  reshape(dat[,2:5], direction="wide", timevar="tr", idvar=c("code","sp"))

I'm getting very close. The only difference is in the 3rd row (**): when sp and code are the same I get only one record. Is there a way to get all records? Any ideas?

Thank you very much for any help,

Juli Pausas
--
http://www.ceam.es/pausas
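One way to see why the reshape call in the post returns fewer rows than hoped is to count how many input rows share the same sp, code and tr. The following is only a minimal sketch built on the dat from the post; the names n_per_cell, dups and wide are introduced here for illustration and are not part of the original message.

```r
# Rebuild the example data from the post.
sp   <- c("a", "a", "a", "a", "b", "b", "b", "c", "d", "d", "d", "d")
tr   <- c("A", "B", "B", "C", "A", "B", "C", "A", "A", "B", "C", "C")
code <- c("a1", "a2", "a2", "a3", "a3", "a3", "a4", "a4", "a4", "a5", "a5", "a6")
dat  <- data.frame(id = 1:12, sp = sp, tr = tr, val = 31:42, code = code)

# Count input rows per (sp, code, tr) combination. A count above 1 marks
# a combination that a wide reshape keyed only on (sp, code) cannot keep
# apart, because both rows compete for the same output cell.
n_per_cell <- aggregate(val ~ sp + code + tr, data = dat, FUN = length)
dups <- n_per_cell[n_per_cell$val > 1, ]
print(dups)

# The call from the post: reshape() keeps the first matching row for each
# id/time combination (warnings are silenced here only to keep output short).
wide <- suppressWarnings(
  reshape(dat[, 2:5], direction = "wide", timevar = "tr",
          idvar = c("code", "sp"))
)
print(nrow(wide))
```

With this data the only repeated combination is sp a, code a2, tr B, which is the row marked ** in the desired output.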
  reshape(dat, direction="wide", timevar="tr", idvar=c("id", "code", "sp"))[,2:6]

But I don't understand why you use reshape.

On 10/02/2008, juli pausas <pausas at gmail.com> wrote:
> [...]
> Using reshape:
>
> reshape(dat[,2:5], direction="wide", timevar="tr", idvar=c("code","sp" ))
>
> I'm getting very close. The only difference is in the 3rd row (**),
> that is when sp and code are the same I only get one record. Is there
> a way to get all records? Any idea?
>
> Thank you very much for any help
>
> Juli Pausas
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
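For readers trying the suggestion with id in idvar, a quick check of its shape may help. This is a hedged sketch on the dat from the original post, not a statement about what the poster intended; wide_id is a name made up for this illustration.

```r
# Rebuild the example data from the original post.
sp   <- c("a", "a", "a", "a", "b", "b", "b", "c", "d", "d", "d", "d")
tr   <- c("A", "B", "B", "C", "A", "B", "C", "A", "A", "B", "C", "C")
code <- c("a1", "a2", "a2", "a3", "a3", "a3", "a4", "a4", "a4", "a5", "a5", "a6")
dat  <- data.frame(id = 1:12, sp = sp, tr = tr, val = 31:42, code = code)

# Because id is unique per input row, every input row becomes its own
# output row: no record is dropped, but no two records are merged either.
wide_id <- reshape(dat, direction = "wide", timevar = "tr",
                   idvar = c("id", "code", "sp"))[, 2:6]
print(nrow(wide_id))

# For example, sp b / code a3 stays on two rows (35 under A, 36 under B).
print(wide_id[wide_id$sp == "b" & wide_id$code == "a3", ])
```

So this variant keeps both a/a2 records, but pairs such as b/a3 are not combined into a single row the way the desired table shows.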
This isn't really well defined. Suppose we have two rows that both have a, a2 and a value for B. Now suppose we have another row with a, a2 but with a value for C. Does the third row go with the first one? The second one? A new row? Both the first and the second?

Here is one possibility, but without a good definition of the problem we don't know whether it answers the intended question.

In the code below we assume that all dat rows that have the same sp value and the same code value are adjacent, and that a dat row must start a new output row whenever its tr is equal to or less than the tr of the prior row in factor level order. Thus within an sp/code group we assign each row a 1 until we reach a tr that is equal to or less than the prior row's tr, and then we start assigning 2, and so on. This is the new column seq below. We then use seq as part of idvar in reshape. For the particular example in your post this does give the same answer.

  f <- function(x) cumsum(c(1, diff(x) <= 0))
  # factor() so this also works when tr is a character column
  # (the data.frame default since R 4.0.0)
  dat$seq <- ave(as.numeric(factor(dat$tr)), dat$sp, dat$code, FUN = f)
  reshape(dat[-1], direction="wide", timevar="tr", idvar=c("code","sp","seq"))[-3]

On Feb 10, 2008 4:58 PM, juli pausas <pausas at gmail.com> wrote:
> [...]
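To make the seq rule above concrete, here is a small worked sketch. It assumes the dat from the original post and wraps tr in factor() so the level codes are available even when tr is stored as character; the toy input vectors and the name wide are chosen here for illustration only.

```r
# The helper from the reply: start a new sequence number whenever the
# current value is less than or equal to the previous one.
f <- function(x) cumsum(c(1, diff(x) <= 0))

f(c(2, 2))        # B, B       -> 1 2   (repeat forces a new output row)
f(c(1, 2))        # A, B       -> 1 1   (increasing, same output row)
f(c(1, 3, 2, 3))  # A, C, B, C -> 1 1 2 2
f(3)              # single row -> 1

# Apply it to the example data from the original post.
sp   <- c("a", "a", "a", "a", "b", "b", "b", "c", "d", "d", "d", "d")
tr   <- c("A", "B", "B", "C", "A", "B", "C", "A", "A", "B", "C", "C")
code <- c("a1", "a2", "a2", "a3", "a3", "a3", "a4", "a4", "a4", "a5", "a5", "a6")
dat  <- data.frame(id = 1:12, sp = sp, tr = tr, val = 31:42, code = code)

# seq numbers the output rows within each sp/code group.
dat$seq <- ave(as.numeric(factor(dat$tr)), dat$sp, dat$code, FUN = f)

# Including seq in idvar keeps the two a/a2 records apart; [-3] then
# drops the helper column from the result.
wide <- reshape(dat[-1], direction = "wide", timevar = "tr",
                idvar = c("code", "sp", "seq"))[-3]
print(wide)
```

Note that the value columns come out named val.A, val.B and val.C rather than tr.A, tr.B and tr.C, and that the result depends on the adjacency assumption stated in the reply.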