Dear all,

Let's say I have several data frames as follows:

> set.seed(42)
> dates <- as.Date(c("2010-01-19", "2010-01-20"))
> times <- c("09:30:00", "11:30:00", "13:30:00", "15:30:00")
> shows <- c("Red Dwarf", "Being Human", "Doctor Who")
>
> df1 <- data.frame(Date = dates[1], Time = times[1], Show = shows, Score = 1:3)
> df2 <- data.frame(Date = dates[1], Time = times[2], Show = shows, Score = 1:3)
> df3 <- data.frame(Date = dates[1], Time = times[4], Show = shows, Score = 1:3)
> df4 <- data.frame(Date = dates[2], Time = times[1], Show = shows, Score = 1:3)
> df5 <- data.frame(Date = dates[2], Time = times[2], Show = shows, Score = 1:3)
> df6 <- data.frame(Date = dates[2], Time = times[3], Show = shows, Score = 1:3)
> df7 <- data.frame(Date = dates[2], Time = times[4], Show = shows, Score = 1:3)
> df7
        Date     Time        Show Score
1 2010-01-20 15:30:00   Red Dwarf     1
2 2010-01-20 15:30:00 Being Human     2
3 2010-01-20 15:30:00  Doctor Who     3

I would like to somehow reshape the data into a different format:

> df.list <- list(df1, df2, df3, df4, df5, df6, df7)
> my.df <- Reduce(function(x, y) merge(x, y, all=TRUE), df.list, accumulate=F)
> my.xtab <- xtabs(as.numeric(Score) ~ Date + Show + Time, data = my.df)

This is where my problem occurs. For Time = 13:30:00, there is now data for "2010-01-19", which was not in any of my original data frames above:

> # I do not want the zeros below
> my.xtab[,,"13:30:00"]
            Show
Date         Being Human Doctor Who Red Dwarf
  2010-01-19           0          0         0
  2010-01-20           2          3         1

Perhaps I am missing something in the way I call the xtabs function?

Thank you kindly for your time,
Tony Breyal

OS: Windows XP 64bit
> sessionInfo()
R version 2.10.0 (2009-10-26)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
    LC_MONETARY=English_United States.1252 LC_NUMERIC=C
    LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.10.0
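The behaviour described above can be reproduced on a much smaller example. The sketch below is illustrative only: the data frames `a`, `b` and `d` are made-up names that do not appear in the original post. It suggests that the zeros are not extra rows in the data but empty cells of the full Date x Time grid that xtabs builds.

```r
# Hypothetical reduced data: only the second date has the 13:30:00 slot.
a <- data.frame(Date = "2010-01-19", Time = "09:30:00", Score = 1)
b <- data.frame(Date = "2010-01-20",
                Time = c("09:30:00", "13:30:00"),
                Score = c(2, 3))
d <- rbind(a, b)

# xtabs creates one cell for every Date x Time combination of levels,
# so the unobserved pair (2010-01-19, 13:30:00) is reported as 0.
tab <- xtabs(Score ~ Date + Time, data = d)
print(tab)

# The combination is absent from the data itself.
observed <- any(d$Date == "2010-01-19" & d$Time == "13:30:00")
print(observed)
```

In other words, a 0 in the table can mean either "a score of zero" or "no rows at all", and xtabs alone does not distinguish the two.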
Henrique Dallazuanna
2010-Jan-20 16:37 UTC
[R] Reshaping data with xtabs giving me 'extra' data
Try with tapply:

with(do.call(rbind, df.list), tapply(Score, list(Date, Time, Show), length))

On Wed, Jan 20, 2010 at 10:20 AM, Tony B <tony.breyal at googlemail.com> wrote:
> Perhaps I am missing something in the way i call the xtabs function?

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
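For readers following along, here is a minimal sketch of the tapply suggestion above on a reduced version of the data (three data frames instead of seven, and the name `all.df` is made up for illustration). Note that `length` counts rows per cell; swapping in `sum` to recover the scores themselves is an assumption about what the original poster wants, not part of the reply.

```r
dates <- as.Date(c("2010-01-19", "2010-01-20"))
shows <- c("Red Dwarf", "Being Human", "Doctor Who")

# Reduced data: the 13:30:00 slot exists only on the second date.
df.list <- list(
  data.frame(Date = dates[1], Time = "09:30:00", Show = shows, Score = 1:3),
  data.frame(Date = dates[2], Time = "09:30:00", Show = shows, Score = 1:3),
  data.frame(Date = dates[2], Time = "13:30:00", Show = shows, Score = 1:3)
)
all.df <- do.call(rbind, df.list)

# As in the reply: number of rows per Date x Time x Show cell.
counts <- with(all.df, tapply(Score, list(Date, Time, Show), length))

# Assumption: summing instead of counting gives the scores per cell.
scores <- with(all.df, tapply(Score, list(Date, Time, Show), sum))

# Cells with no rows come back as NA rather than 0.
print(scores["2010-01-19", "13:30:00", ])
print(scores["2010-01-20", "13:30:00", ])
```

Unlike xtabs, tapply leaves unobserved combinations as NA, so they can be told apart from genuine zero scores.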