I have a large n-way contingency table, constructed as a table object, and want to pool (collapse) some categories, summing the frequencies in all collapsed cells. How can I do this?

thx,
-Michael

--
Michael Friendly              friendly at yorku.ca
York University               http://www.math.yorku.ca/SCS/friendly.html
Psychology Department
4700 Keele Street             Tel: (416) 736-5115 x66249
Toronto, Ontario, M3J 1P3     Fax: (416) 736-5814

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Dear Mike,

At 01:53 PM 4/8/2002 -0400, Michael Friendly wrote:
>I have a large n-way contingency table, constructed as a table
>object, and want to pool (collapse) some categories, summing the
>frequencies in all collapsed cells. How can I do this?

Use apply; for example:

> tab <- as.table(array(1:24, c(2,3,4)))
> tab
, , A

   A  B  C
A  1  3  5
B  2  4  6

, , B

   A  B  C
A  7  9 11
B  8 10 12

, , C

   A  B  C
A 13 15 17
B 14 16 18

, , D

   A  B  C
A 19 21 23
B 20 22 24

> apply(tab, c(2,3), sum)  # sums over first coordinate
   A  B  C  D
A  3 15 27 39
B  7 19 31 43
C 11 23 35 47

I hope that this helps,
John

-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox at mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox
-----------------------------------------------------
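The apply() call above sums a whole dimension away. Pooling only some levels within a dimension can also be done with apply(), by summing each slice within groups. The following is a minimal sketch under that reading of the question, not a method given in the thread; the grouping vector `groups` and its labels "AB" and "CD" are made up for illustration.

```r
# Same example table as in the reply above.
tab <- as.table(array(1:24, c(2, 3, 4)))

# Hypothetical pooling of the third dimension: levels A,B -> "AB", C,D -> "CD".
groups <- c("AB", "AB", "CD", "CD")

# For every cell of dimensions 1 and 2, take the vector along dimension 3
# and sum it within each group.
pooled <- apply(tab, c(1, 2), function(v) tapply(v, groups, sum))

# apply() puts the pooled dimension first; move it back to third place.
pooled <- aperm(pooled, c(2, 3, 1))

dim(pooled)   # 2 x 3 x 2
sum(pooled)   # same grand total as sum(tab)
```

Note that this form assumes at least two pooled groups; with a single group apply() drops the extra dimension and the aperm() call would need adjusting.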
Beautiful, I didn't realise you could use "c(2,3)". It's not in "help". I guess it's so obvious now...

John Strumila

> apply(tab, c(2,3), sum)  # sums over first coordinate
Perhaps I should have given an example.

# 5 column matrix: Day, Time, Station, State, Freq
tv.matrix <- read.table("C:/R/mosaics/tv.dat")

> tv.matrix[1:5,]
  V1 V2 V3 V4 V5
1  1  1  1  1  6
2  2  1  1  1 18
3  3  1  1  1  6
4  4  1  1  1  2
5  5  1  1  1 11

tv <- array(tv.matrix[,5], dim=c(5,11,5,3))
dimnames(tv) <- list(c("Monday","Tuesday","Wednesday","Thursday","Friday"),
                     c("8:00","8:15","8:30","8:45","9:00","9:15","9:30",
                       "9:45","10:00","10:15","10:30"),
                     c("A","C","N","F","Other"),
                     c("O","S","P"))
names(dimnames(tv)) <- c("Day", "Time", "Station", "State")

Say I want to collapse the 11 time categories to 3. The real problem is when I have only the 4-way array (or an equivalent table). But here, I thought I could just collapse the 2nd column to 3 categories:

tv.matrix[,2] <- 1 + as.integer((tv.matrix[,2]-1) / 4)
tv2 <- array(tv.matrix[,5], dim=c(5,3,5,3))
dimnames(tv2) <- list(c("Monday","Tuesday","Wednesday","Thursday","Friday"),
                      c("8:00-8:45","9:00-9:45","10:00-10:30"),
                      c("A","C","N","F","Other"),
                      c("O","S","P"))
names(dimnames(tv2)) <- c("Day", "Time", "Station", "State")

But something is wrong, because the margins are not the same:

> margin.table(tv,1)
Day
   Monday   Tuesday Wednesday  Thursday    Friday
    21271     20486     19304     19779     17275

> margin.table(tv2,1)
Day
   Monday   Tuesday Wednesday  Thursday    Friday
      996       912       962       898       771

What am I missing?

-Michael
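One likely explanation for the mismatch: array() never sums anything. Given 825 values and dim=c(5,3,5,3), it simply keeps the first 225 of them, so recoding column 2 has no effect on tv2. A sketch of one way to pool when only the 4-way array is available follows: flatten it to a data frame, merge the Time levels, and re-tabulate with xtabs(). The frequencies here are invented stand-ins, since tv.dat is not part of the thread.

```r
# Stand-in for the real data: same shape and dimnames as tv, made-up counts.
tv <- array(seq_len(5 * 11 * 5 * 3), dim = c(5, 11, 5, 3),
            dimnames = list(
              Day     = c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday"),
              Time    = c("8:00", "8:15", "8:30", "8:45", "9:00", "9:15", "9:30",
                          "9:45", "10:00", "10:15", "10:30"),
              Station = c("A", "C", "N", "F", "Other"),
              State   = c("O", "S", "P")))

# One row per cell: columns Day, Time, Station, State, Freq.
tv.df <- as.data.frame(as.table(tv))

# Assigning duplicate labels merges factor levels: 4 + 4 + 3 time slots.
levels(tv.df$Time) <- rep(c("8:00-8:45", "9:00-9:45", "10:00-10:30"),
                          times = c(4, 4, 3))

# Re-tabulate; cells that now share a Time level are summed.
tv2 <- xtabs(Freq ~ Day + Time + Station + State, data = tv.df)

dim(tv2)              # 5 x 3 x 5 x 3
margin.table(tv2, 1)  # agrees with margin.table(tv, 1)
```

The same xtabs() call should also work directly on the recoded tv.matrix, using V5 as the response and V1 to V4 as the classifying variables.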
You're right as usual. I only looked at the 'arguments' section.

-----Original Message-----
From: ripley at stats.ox.ac.uk [mailto:ripley at stats.ox.ac.uk]
Sent: Friday, 12 April 2002 4:40 AM
To: Strumila, John
Cc: r-help
Subject: RE: [R] pooling categories in a table

On Tue, 9 Apr 2002, Strumila, John wrote:
> beautiful, I didn't realise you could use "c(2,3)". Its not in "help". I
> guess it's so obvious now...

I beg to differ: it is under MARGIN in help(apply) (and in a couple of good books on S).

Also note that come 1.5.0, colSums() will do this sort of thing a mite more efficiently.
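For reference, a short sketch of how colSums() and rowSums() relate to the apply() call from earlier in the thread, using the same example table. The variable names are illustrative only.

```r
tab <- as.table(array(1:24, c(2, 3, 4)))

# Summing over the first dimension, two ways.
via_apply   <- apply(tab, c(2, 3), sum)
via_colSums <- colSums(tab)            # dims = 1 is the default

# rowSums() with dims = 2 keeps the first two dimensions and sums over the rest,
# which matches apply(tab, c(1, 2), sum).
over_third  <- rowSums(tab, dims = 2)

all(as.vector(via_apply) == as.vector(via_colSums))
```

The dims argument controls how many leading dimensions are treated as "rows", so these functions only cover sums over leading or trailing dimensions; apply() remains the general tool for arbitrary margins.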