Gavin Rudge
2014-Oct-22 15:48 UTC
[R] creating individual records from a frequency distribution
I've got a data frame containing a simple frequency distribution of the numbers of people in three age groups by area.

df1<-data.frame(area=c(1,2),group1=c(2,3),group2=c(1,5),group3=c(4,0))
df1

I want to get a data frame with one record per person (in my case 15 of them), with variables indicating the area and age group to which each person belongs. It would look like this:

df2<-data.frame(person_id=seq(1:15),area=c(rep(1,7),rep(2,8)),group_num=c(1,1,2,3,3,3,3,1,1,1,2,2,2,2,2))
df2

This is not the same as melting wide to long data as in reshape2, as I'm melting from aggregated data. I can get vectors of column values using the rep command, but sewing them together and allowing for zeros looks a bit cumbersome. I'm assuming there is a simple command that does this sort of thing.

Any help gratefully received,

GavinR
Sven E. Templer
2014-Oct-22 18:25 UTC
[R] creating individual records from a frequency distribution
With melt and rep you are close. If you combine them it works:

library(reshape)

# your data:
df1 <- data.frame(area=c(1,2),group1=c(2,3),group2=c(1,5),group3=c(4,0))
df2<-data.frame(person_id=seq(1:15),area=c(rep(1,7),rep(2,8)),group_num=c(1,1,2,3,3,3,3,1,1,1,2,2,2,2,2))

# first melt
d <- melt(df1,"area",2:4)

# then repeat each row by 'counts'
d <- d[rep(seq(nrow(d)), times=d$value),]

# then order (if the order of ids is not arbitrary), and add ids
d <- d[order(d$area,d$variable),]
d$value <- seq(nrow(d))

# compare
cbind(df2,"---",d)

Best,
Sven.
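
For comparison, the same expansion can probably be done in base R alone, using nested calls to rep. This is only a sketch, not part of the reply above: the names counts, n and expanded are made up for illustration, and it assumes the count columns are exactly group1, group2 and group3 as in the example data. Cells with a count of zero simply contribute no rows.

```r
# example data from the question
df1 <- data.frame(area = c(1, 2), group1 = c(2, 3), group2 = c(1, 5), group3 = c(4, 0))

# counts as a matrix: one row per area, one column per age group
# (assumes the count columns are named group1..group3)
counts <- as.matrix(df1[, c("group1", "group2", "group3")])

# transpose so that as.vector() runs through the groups within each area
n <- as.vector(t(counts))

# build one area label and one group number per cell, then repeat each
# label by its cell count; a count of zero yields no rows
expanded <- data.frame(
  area      = rep(rep(df1$area, each = ncol(counts)), times = n),
  group_num = rep(rep(seq_len(ncol(counts)), times = nrow(counts)), times = n)
)

# add a running person id as the first column
expanded <- cbind(person_id = seq_len(nrow(expanded)), expanded)
expanded
```

Unlike the melt version, this gives the group as a number (group_num) rather than a factor of column names, and the rows already come out ordered by area and group, so no separate ordering step is needed.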