Dear list,

New to R, I'm looking for a way of using crosstab to output
low-dimensional (higher than 2) contingency tables (frequencies,
percentages by row, percentages by column, means, quantiles, ...).
I'm looking for something of the following sort:

data frame: singers
categorical variates:
  voice category (soprano, mezzo-soprano, ...)
  voice type (drammatic, spinto, lirico-spinto, lirico, leggero)
  school (german, italian, french, russian, anglo-saxon, other)
  repertory (opera, Lieder, oratorio, operetta)
continuous variate: age

I would like to tabulate the frequencies (relative percentages), say,
in the following way:

  columns: school x repertory
  rows:    voice category x voice type

or to output, in the cells of the above table, the statistics
(mean/median/quantiles) for age.

I've seen that the function

  bwplot(age ~ school | repertory, data = singers, layout = c(4, 2))

would do graphically something similar to what I want, but I would
like the output in tabular form as well.

Thanks,
Isotta
On Tue, 2005-08-30 at 12:28 -0700, Isotta Felli wrote:
> Dear list.
>
> New to R, I'm looking for a way of using crosstab to output
> low-dimensional (higher than 2) contingency tables (frequencies,
> per-cents by rows, % by columns, mean, quantiles....) I'm looking for
> something of the following sort
>
> dataframe: singers,
> categorical variates: voice category (soprano,mezzo-soprano, ...) ,
> voice type( drammatic, spinto, lirico-spinto, lirico, leggero), school
> (german, italian, french, russian, anglo-saxon, other);repertory
> (opera, Lieder, oratorio, operetta)
> continuous variate: age
>
> I would like to tabulate the frequencies (relative percentages) say
> in the following way
> columns: school x repertory
> rows : voice category x voice type
>
> or to output in the cells of the above table, the statistics
> (mean/median/quantiles) for age
>
> I've seen that the function bwplot(age~school | repertory, data =
> singers, layout=c(4,2)) would do graphically something similar to what
> I want, but I desire the output also in tabular form
>
> Thanks
>
> Isotta

I don't know that you will find a single function that will do all of
what you desire, but you can look at the ctab() function, which is in
the 'catspec' package on CRAN by John Hendrickx. This will do multi-way
tables with summary row/col statistics.

There is also the CrossTable() function in the 'gmodels' package on
CRAN, though this will only do one-way and two-way tables, with summary
row/col statistics.

Neither of the above provides typical summary output for continuous
data. They are more for categorical variables.

For simple count multi-way output, you can also look at the ftable()
function, which ships with R in the 'stats' package.

Also look at the summary() function in the base package, which will
provide range, mean and quantile data for continuous variables.

It may just be a matter of formatting the output in the fashion that
you desire after the analysis.

HTH,

Marc Schwartz
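The ftable() and summary() suggestions can be combined roughly as follows. This is only a minimal sketch in base R: the data frame singers, its column names (category, type, school, repertory, age), the shortened level lists and the simulated values are all assumptions made for illustration, not the poster's actual data.

```r
# Simulated stand-in for the poster's data; names and values are assumptions.
set.seed(1)
n <- 200
pick <- function(levels) factor(sample(levels, n, replace = TRUE), levels = levels)
singers <- data.frame(
  category  = pick(c("soprano", "mezzo-soprano", "tenor", "bass")),
  type      = pick(c("drammatic", "spinto", "lirico")),
  school    = pick(c("german", "italian", "french")),
  repertory = pick(c("opera", "Lieder")),
  age       = round(runif(n, min = 20, max = 70))
)

# Four-way counts, flattened: rows = category x type, columns = school x repertory
counts <- ftable(xtabs(~ category + type + school + repertory, data = singers),
                 row.vars = c("category", "type"),
                 col.vars = c("school", "repertory"))
print(counts)

# Row percentages (use margin = 2 for column percentages)
row_pct <- round(100 * prop.table(counts, margin = 1), 1)
print(row_pct)

# Median age in the same row/column layout; empty cells come out as NA
age_median <- with(singers,
                   tapply(age,
                          list(interaction(category, type),
                               interaction(school, repertory)),
                          median))
print(age_median)

# summary() of age within each school (range, quartiles, mean)
age_by_school <- tapply(singers$age, singers$school, summary)
print(age_by_school)
```

Replacing median with mean, or with a quantile such as function(x) quantile(x, 0.9), gives the other statistics in the same layout.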
Hi Isotta,

You can do this with the reshape package (available from CRAN), e.g.

install.packages("reshape")
library(reshape)

data(singer, package="lattice")
singer$type <- c("drammatic", "spinto", "lirico-spinto", "lirico", "leggero")[sample(1:5, 235, replace=TRUE)]
singer$school <- c("german", "italian", "french", "russian", "anglo-saxon", "other")[sample(1:6, 235, replace=TRUE)]
singer$repertory <- c("opera", "Lieder", "oratorio", "operetta")[sample(1:4, 235, replace=TRUE)]

# First deshape the data (this puts it in a form easy to reshape)
singer_d <- deshape(singer, id=c("type", "school", "repertory", "voice.part"), m="height")

# You can do things like
reshape(singer_d, type ~ school, length, subset=variable=="height")
reshape(singer_d, type ~ school, mean, subset=variable=="height")

# or with margins
reshape(singer_d, type + school ~ ., length, subset=variable=="height", margins=c("type","grand_row"))

# or with multiple stats
opera.sum <- function(x) c(min=min(x), mean=mean(x), max=max(x))
reshape(singer_d, type + school ~ ., opera.sum, subset=variable=="height")

# What you'd want for your data, but doesn't work well with this example
# and is going to be a big table regardless!
reshape(singer_d, type + voice.part ~ school + repertory, length)

There's some more info available at http://had.co.nz/reshape but I'm
still working on it.

Hadley
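For comparison, the count table with margins and the min/mean/max summary can also be sketched in base R, which avoids depending on any particular version of the reshape package. This is an illustration only: it reuses the singer data from lattice with randomly assigned type and school labels as in the message above, and the set.seed() call is an addition so that the run is reproducible.

```r
data(singer, package = "lattice")   # 235 singers: height and voice.part

# Randomly assigned labels, purely for illustration (they carry no meaning).
set.seed(42)
singer$type <- sample(c("drammatic", "spinto", "lirico-spinto", "lirico", "leggero"),
                      nrow(singer), replace = TRUE)
singer$school <- sample(c("german", "italian", "french", "russian", "anglo-saxon", "other"),
                        nrow(singer), replace = TRUE)

# Counts of type by school, with a "Sum" row and column added as margins
counts <- addmargins(with(singer, table(type, school)))
print(counts)

# Mean height in each type x school cell
height_mean <- with(singer, tapply(height, list(type, school), mean))
print(round(height_mean, 1))

# Several statistics per type x school combination, in long form;
# the height column of the result is a matrix with one column per statistic
opera.sum <- function(x) c(min = min(x), mean = mean(x), max = max(x))
stats <- aggregate(height ~ type + school, data = singer, FUN = opera.sum)
print(head(stats))
```

The long-form result from aggregate() is easier to post-process than a wide table, at the cost of not looking like a crosstab straight away.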