Dear R helpers:

I am an R novice and have a question about using table() to extract frequencies over many sub-datasets. A small example input data frame and the desired output data frame are provided below. The real data is very large, so I am trying to avoid a for loop.

Can someone enlighten me on how to use sapply or the like to achieve this?

Many thanks in advance!

-Sean

#example input dataframe
id <- c('tom', 'tom', 'tom', 'jack', 'jack', 'jack', 'jack')
var_interest <- c("happy","unhappy", "", "happy", "unhappy", 'soso','happy')
input.df <- data.frame(id=id, var_interest=var_interest)
input.df

#output dataframe I want
id_unique <- c('tom','jack')
happy_freq<-c(1,2)
unhappy_freq<-c(1,1)
soso_freq<-c(0,1)
miss_freq<-c(1,0)
output.df <-data.frame(id_unique=id_unique, happy_freq=happy_freq,
unhappy_freq=unhappy_freq, soso_freq=soso_freq, miss_freq=miss_freq)
output.df
Sean Zhang wrote:
> Dear R helpers:
>
> I am an R novice and have a question about using table() to extract
> frequencies over many sub-datasets.
> A small example input data frame and the desired output data frame are
> provided below. The real data is very large, so I am trying to avoid a
> for loop.
>
> Can someone enlighten me on how to use sapply or the like to achieve this?

I'd simply use

table(input.df)

or, perhaps closer to the result you want:

reshape(as.data.frame(table(input.df)), direction="wide", timevar="var_interest")

Uwe Ligges

> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
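As a rough illustration of the reshape() suggestion, the sketch below rebuilds the example data from the original post and runs the same call with idvar named explicitly. The original call works without it only because reshape() defaults to idvar = "id" and the column happens to have that name. The object names counts.long and counts.wide are made up for this example.

```r
# Example data from the original post
id <- c('tom', 'tom', 'tom', 'jack', 'jack', 'jack', 'jack')
var_interest <- c("happy", "unhappy", "", "happy", "unhappy", 'soso', 'happy')
input.df <- data.frame(id = id, var_interest = var_interest)

# Long format: one row per (id, var_interest) pair, with a Freq column
counts.long <- as.data.frame(table(input.df))

# Wide format: one row per id, one Freq.<value> column per var_interest value
counts.wide <- reshape(counts.long, direction = "wide",
                       idvar = "id", timevar = "var_interest")
counts.wide
```

The count columns come out prefixed with "Freq." (for instance Freq.happy), so they would still need renaming to match the happy_freq style of the requested output.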
This may be what you want:

> table(input.df$id, input.df$var_interest)

         happy soso unhappy
  jack 0     2    1       1
  tom  1     1    0       1

On Wed, Feb 11, 2009 at 1:34 PM, Sean Zhang <seanecon at gmail.com> wrote:
> Dear R helpers:
>
> I am an R novice and have a question about using table() to extract
> frequencies over many sub-datasets.
> A small example input data frame and the desired output data frame are
> provided below. The real data is very large, so I am trying to avoid a
> for loop.
>
> Can someone enlighten me on how to use sapply or the like to achieve this?

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
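To get from that two-way table to the layout of output.df in the original post, something along the following lines might work. It is only a sketch: it assumes the empty string "" stands for a missing answer, and the names answer, tab and wide are invented for the example. The empty strings are recoded before tabulating so that the resulting column has a usable name.

```r
# Example data from the original post
id <- c('tom', 'tom', 'tom', 'jack', 'jack', 'jack', 'jack')
var_interest <- c("happy", "unhappy", "", "happy", "unhappy", 'soso', 'happy')
input.df <- data.frame(id = id, var_interest = var_interest)

# Assumption: "" means a missing answer, so give it an explicit label
answer <- as.character(input.df$var_interest)
answer[answer == ""] <- "miss"

# Two-way table of counts: rows are ids, columns are answer values
tab <- table(input.df$id, answer)

# as.data.frame.matrix() keeps the wide layout (one column per value)
wide <- as.data.frame.matrix(tab)
names(wide) <- paste0(names(wide), "_freq")

# Move the ids from the row names into a regular column
output.df <- cbind(id_unique = rownames(wide), wide)
rownames(output.df) <- NULL

# Put the columns in the order used in the original post
output.df <- output.df[, c("id_unique", "happy_freq", "unhappy_freq",
                           "soso_freq", "miss_freq")]
output.df
```

The rows come out sorted by id (jack before tom) rather than in order of first appearance. No explicit loop is involved, since table() does the counting for all ids in one call.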