I have a large n-way contingency table, constructed as a table
object, and want to pool (collapse) some categories, summing the
frequencies in all collapsed cells.  How can I do this?

thx,
-Michael

-- 
Michael Friendly              friendly at yorku.ca
York University               http://www.math.yorku.ca/SCS/friendly.html
Psychology Department
4700 Keele Street             Tel:  (416) 736-5115 x66249
Toronto, Ontario, M3J 1P3     Fax:  (416) 736-5814
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Dear Mike,

At 01:53 PM 4/8/2002 -0400, Michael Friendly wrote:
>I have a large n-way contingency table, constructed as a table
>object, and want to pool (collapse) some categories, summing the
>frequencies in all collapsed cells.  How can I do this?

Use apply; for example:

    > tab <- as.table(array(1:24, c(2,3,4)))
    > tab
    , , A

      A B C
    A 1 3 5
    B 2 4 6

    , , B

      A  B  C
    A 7  9 11
    B 8 10 12

    , , C

       A  B  C
    A 13 15 17
    B 14 16 18

    , , D

       A  B  C
    A 19 21 23
    B 20 22 24

    > apply(tab, c(2,3), sum)  # sums over first coordinate
       A  B  C  D
    A  3 15 27 39
    B  7 19 31 43
    C 11 23 35 47

I hope that this helps,
 John

-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox at mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox
-----------------------------------------------------
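As a side note on the apply() call above: MARGIN names the dimensions
that are kept, and every other dimension is summed out.  The sketch
below uses only base R and the same toy table; it checks that
margin.table() gives the same marginal sums.  The object names
by_apply and by_margin are made up for the illustration.

```r
# Same toy table as in the reply above.
tab <- as.table(array(1:24, c(2, 3, 4)))

# MARGIN = c(2, 3) keeps dimensions 2 and 3 and sums over dimension 1.
by_apply <- apply(tab, c(2, 3), sum)

# margin.table() computes the same marginal sums and returns a table.
by_margin <- margin.table(tab, c(2, 3))

# The two results agree cell by cell.
print(all(by_apply == by_margin))
```

Note that this sums out a whole dimension; it does not by itself merge
a subset of levels within a dimension.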
beautiful, I didn't realise you could use "c(2,3)".  It's not in "help".
I guess it's so obvious now...
John Strumila
    > apply(tab, c(2,3), sum)  # sums over first coordinate
Perhaps I should have given an example.
#  5 column matrix:  Day, Time, Station, State, Freq    
 tv.matrix<-read.table("C:/R/mosaics/tv.dat")
> tv.matrix[1:5,]
  V1 V2 V3 V4 V5
1  1  1  1  1  6
2  2  1  1  1 18
3  3  1  1  1  6
4  4  1  1  1  2
5  5  1  1  1 11

tv <- array(tv.matrix[,5], dim=c(5,11,5,3))
dimnames(tv) <-
  list(c("Monday","Tuesday","Wednesday","Thursday","Friday"),
       c("8:00","8:15","8:30","8:45","9:00","9:15","9:30",
         "9:45","10:00","10:15","10:30"),
       c("A","C","N","F","Other"),
       c("O","S","P"))
names(dimnames(tv)) <- c("Day", "Time", "Station", "State")

Say I want to collapse the 11 time categories to 3. The real problem
is when I have only the 4-way array (or an equivalent table).  But
here, I thought I could just collapse the 2nd column to 3 categories:

tv.matrix[,2] <- 1 + as.integer((tv.matrix[,2]-1) / 4)
tv2 <- array(tv.matrix[,5], dim=c(5,3,5,3))
dimnames(tv2) <-
  list(c("Monday","Tuesday","Wednesday","Thursday","Friday"),
       c("8:00-8:45","9:00-9:45","10:00-10:30"),
       c("A","C","N","F","Other"),
       c("O","S","P"))
names(dimnames(tv2)) <- c("Day", "Time", "Station", "State")

But something is wrong, because the margins are not the same:
> margin.table(tv,1)
Day
   Monday   Tuesday Wednesday  Thursday    Friday 
    21271     20486     19304     19779     17275 
> margin.table(tv2,1)
Day
   Monday   Tuesday Wednesday  Thursday    Friday 
      996       912       962       898       771 
What am I missing?
-Michael
-- 
Michael Friendly              friendly at yorku.ca
York University               http://www.math.yorku.ca/SCS/friendly.html
Psychology Department
4700 Keele Street             Tel:  (416) 736-5115 x66249
Toronto, Ontario, M3J 1P3     Fax:  (416) 736-5814
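
One likely explanation for the mismatch above: recoding column 2 only
relabels the rows, it does not add any frequencies together, and
array() fills tv2 with just the first 5*3*5*3 = 225 of the 825 values
and silently ignores the rest.  The sketch below shows one way to pool
the Time dimension starting from the 4-way array alone.  It is not
taken from the thread: tv.dat is not available, so the counts are
made-up Poisson draws, and the names times, grp, truncated and pooled
are hypothetical.

```r
# Hypothetical stand-in for tv: same shape and labels, made-up counts.
set.seed(1)
times <- c("8:00","8:15","8:30","8:45","9:00","9:15","9:30",
           "9:45","10:00","10:15","10:30")
tv <- array(rpois(5 * 11 * 5 * 3, lambda = 20), dim = c(5, 11, 5, 3),
            dimnames = list(
              Day     = c("Monday","Tuesday","Wednesday","Thursday","Friday"),
              Time    = times,
              Station = c("A","C","N","F","Other"),
              State   = c("O","S","P")))

# array() keeps only the first prod(dim) values when given too many.
truncated <- array(as.vector(tv), dim = c(5, 3, 5, 3))
print(all(truncated == as.vector(tv)[1:225]))

# Assign each of the 11 time slots to a pooled group
# (same 4-4-3 rule as the recode in the message above).
grp <- factor(1 + (seq_along(times) - 1) %/% 4,
              labels = c("8:00-8:45", "9:00-9:45", "10:00-10:30"))

# For every Day x Station x State cell, sum the counts within each group.
pooled <- apply(tv, c(1, 3, 4), function(x) tapply(x, grp, sum))

# apply() puts the pooled dimension first; restore Day, Time, Station, State.
tv2 <- aperm(pooled, c(2, 1, 3, 4))
names(dimnames(tv2))[2] <- "Time"

# The Day margins now agree with those of the full table.
print(all(margin.table(tv2, 1) == margin.table(tv, 1)))
```

If the data are still in long format (one row per cell with a
frequency column), recoding the Time column and then calling xtabs()
with the frequency on the left-hand side of the formula should also
work, since xtabs() sums the counts within each combination of levels.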
you're right as usual.  I only looked at the 'arguments' section.

-----Original Message-----
From: ripley at stats.ox.ac.uk [mailto:ripley at stats.ox.ac.uk]
Sent: Friday, 12 April 2002 4:40 AM
To: Strumila, John
Cc: r-help
Subject: RE: [R] pooling categories in a table

On Tue, 9 Apr 2002, Strumila, John wrote:

> beautiful, I didn't realise you could use "c(2,3)".  It's not in "help".
> I guess it's so obvious now...

I beg to differ: it is under MARGIN in help(apply) (and in a couple of
good books on S).

Also note that come 1.5.0 colSums() will do this sort of thing a mite
more efficiently.
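
A small sketch of the colSums() remark, assuming an R version that has
colSums() (1.5.0 or later) and reusing the toy table from earlier in
the thread.  The names by_apply, by_colsums and over_two are made up
for the illustration.

```r
# Toy table from earlier in the thread.
tab <- as.table(array(1:24, c(2, 3, 4)))

# Sum over the first dimension with apply() ...
by_apply <- apply(tab, c(2, 3), sum)

# ... and with colSums(); the default dims = 1 sums over dimension 1 only.
by_colsums <- colSums(tab)
print(all(by_apply == by_colsums))

# dims = 2 sums over the first two dimensions, leaving only the third.
over_two <- colSums(tab, dims = 2)
print(over_two)
```

colSums() can only sum over leading dimensions, so summing out a
middle dimension still needs apply() or an aperm() first.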