Dear R-listers,

I am a newbie with R and I am struggling with something I consider very
basic. I wish to produce a table of summary statistics (to import into a
LaTeX file), but despite looking around and trying various alternatives
(plyr, reporttools, pastecs and Hmisc) I haven't found what I am looking
for. Probably I am doing something wrong, but I can't figure out what.

Let's make up three simple variables:

    var1 <- runif(1000)
    var2 <- runif(1000)
    var3 <- factor(rep(1:2, 500), labels = c("m", "f"))

and let's create a dataset out of them:

    data <- data.frame(var1, var2, var3)

What I'd like to get is a table such as the following one:

    variable   mean   sd   min   max   obs   missing
    var1
    var2
    var3

where for each variable I can read, on one line, the mean, the standard
deviation, the min and max values, the number of observations and the
percentage of missing data.

Can you advise any way to achieve this?

Thanks a lot in advance for your kind help,
f.
On 12-12-20 6:45 AM, Francesco Sarracino wrote:
> Dear R-listers,
>
> I am a newbie with R and I am struggling with something I consider very
> basic. I wish to produce a table (to import in a latex file) of summary
> statistics, but for as much as I've been looking around and trying various
> alternatives (plyr, reporttools, pastecs and Hmisc) I haven't found what I
> am looking for. Probably I am doing something wrong, but I can't figure out
> what.
> Let's make up three simple variables:
>
> var1 <- runif(1000)
> var2 <- runif(1000)
> var3 <- factor(rep(1:2, 500), labels = c("m", "f"))
>
> and let's create a dataset out of them:
> data <- data.frame(var1, var2, var3)
>
> what I'd like to get is a table such as the following one:
>
> variable mean sd min max obs missing
> var1
> var2
> var3
>
> where for each variable, I can read in line the mean, the standard
> deviation, the min and the max values, the number of observations and the
> percentage of missing data.
> Can you advice any way to achieve it?
> Thanks a lot in advance for your kind help,

I'm not sure what you want for var3: it doesn't make sense to calculate
the mean or sd for a factor. But for the other variables, using package
tables, you do

    latex( tabular( Heading("variable")*(var1 + var2) ~
        (mean + sd + min + max + (obs=length) +
         (missing=function(x) sum(is.na(x)))), data=data) )

You might want a breakdown of the summaries by var3; you'd get that this
way:

    latex( tabular( Heading("variable")*(var1 + var2)*var3 ~
        (mean + sd + min + max + (obs=length) +
         (missing=function(x) sum(is.na(x)))), data=data) )

Duncan Murdoch
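For comparison, the same kind of table can be assembled in base R alone.
What follows is only a minimal sketch under stated assumptions, not the
tables-package approach above: the helper name summarise_numeric is
hypothetical, the seed is added purely for reproducibility, and "missing"
is computed as a percentage (as the original question asked) rather than
as a count. Factors such as var3 are skipped, since mean and sd are not
defined for them.

```r
# Example data from the original post
set.seed(1)  # assumption: seed added only to make the sketch reproducible
var1 <- runif(1000)
var2 <- runif(1000)
var3 <- factor(rep(1:2, 500), labels = c("m", "f"))
data <- data.frame(var1, var2, var3)

# Hypothetical helper: one row of summary statistics for a numeric vector.
# na.rm = TRUE keeps the statistics defined when values are missing.
summarise_numeric <- function(x) {
  c(mean    = mean(x, na.rm = TRUE),
    sd      = sd(x, na.rm = TRUE),
    min     = min(x, na.rm = TRUE),
    max     = max(x, na.rm = TRUE),
    obs     = sum(!is.na(x)),          # number of non-missing observations
    missing = 100 * mean(is.na(x)))    # percentage of missing values
}

# Keep only the numeric columns (var3 is a factor and is left out)
numeric_cols <- vapply(data, is.numeric, logical(1))

# vapply returns one column per variable; transpose to one row per variable
summary_table <- t(vapply(data[numeric_cols], summarise_numeric, numeric(6)))

print(round(summary_table, 3))
```

The result is an ordinary numeric matrix with one row per variable, so it
can be passed on to whatever LaTeX export tool is preferred.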
On 12/20/2012 10:45 PM, Francesco Sarracino wrote:
> Dear R-listers,
>
> I am a newbie with R and I am struggling with something I consider very
> basic. I wish to produce a table (to import in a latex file) of summary
> statistics, but for as much as I've been looking around and trying various
> alternatives (plyr, reporttools, pastecs and Hmisc) I haven't found what I
> am looking for. Probably I am doing something wrong, but I can't figure out
> what.
> Let's make up three simple variables:
>
> var1<- runif(1000)
> var2<- runif(1000)
> var3<- factor(rep(1:2, 500), labels = c("m", "f"))
>
> and let's create a dataset out of them:
> data<- data.frame(var1, var2, var3)
>
> what I'd like to get is a table such as the following one:
>
> variable mean sd min max obs missing
> var1
> var2
> var3
>
> where for each variable, I can read in line the mean, the standard
> deviation, the min and the max values, the number of observations and the
> percentage of missing data.

Hi Francesco,

Have a look at the describe function (prettyR), which allows you to select
the summary statistics you want for numeric variables and automatically
produces count statistics for factor, character and logical variables.
There are also other "auto-summarize" functions in packages such as psych
and Hmisc.

Jim
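To illustrate what "count statistics" for a factor such as var3 might look
like, here is a small base R sketch. It is not prettyR's implementation or
output format; the object name factor_summary and the column names are
assumptions chosen for illustration.

```r
# The factor from the original post
var3 <- factor(rep(1:2, 500), labels = c("m", "f"))

# Count each level; useNA = "ifany" adds an NA row only if values are missing
counts  <- table(var3, useNA = "ifany")
percent <- 100 * prop.table(counts)

# Hypothetical layout: one row per level with its count and percentage
factor_summary <- data.frame(
  level   = names(counts),
  count   = as.vector(counts),
  percent = as.vector(percent)
)

print(factor_summary)
```

Counts and percentages are the natural summaries here, since a mean or
standard deviation is not meaningful for a factor.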