Hi Listers,

Surely I just have a mental block and there is a more elegant way of creating a summary count (other than extracting it from ftable). I'd like to create a new data.frame containing counts of spell by loc, i.e. have three columns showing spell, loc, count. The data.frame is below.

Any help appreciated.
Thanks,
Herry

    spell   loc
101 Parts   1
102 Overall 2
105 Parts   1
106 None    1
111 None    1
116 Parts   1
118 None    1
119 Overall 4
123 Overall 1
125 Parts   1
126 Overall 1
127 Parts   1
128 Overall 1
134 Overall 1
138 Overall 2
139 Overall 1
142 Overall 1
191 Parts   1
192 Parts   2
193 Parts   2
204 Parts   2
205 None    1
207 Parts   2
208 Overall 2
210 Overall 2
211 Parts   1
212 Parts   2
215 Overall 2
218 Overall 2
220 Overall 2
221 Overall 2
222 Parts   2
223 Overall 2
224 Overall 2
225 Overall 2
226 Parts   2
228 Overall 2
232 Parts   2
236 Overall 2
238 Parts   2
302 None    1
304 Parts   3
306 Overall 3
309 Parts   4
310 Parts   3
311 Overall 3
312 Overall 3
314 Parts   3
317 None    3
319 Parts   3
320 Overall 3
321 Overall 3
322 Overall 3

--------------------------------------------
Alexander Herr - Herry
Alexander.Herr at csiro.au wrote:

> Hi Listers,
>
> Surely, I just have a mental block and there is a more elegant way of
> creating a summary count (other than extracting it from ftable). I'd like
> to create a new data.frame containing counts of spell by loc ie have
> three columns showing spell,loc,count. Below the data.frame...
>
> Any help appreciated
> Thanks Herry
>
> spell loc
> 101 Parts 1
> 102 Overall 2...

It's a bit hard to tell exactly what you want from the example. If the assumptions that "spell" is the name of the second column and "loc" is the name of the third are correct:

1) Add a name for the first field.
2) Use a single comma (or other separator) between your data fields.

Then:

  herr.df <- read.table("herr.dat", header=TRUE, sep=",")
  # cross-tabulate spell by loc in long form (Var1 = spell, Var2 = loc, Freq = count)
  boggle <- as.data.frame.table(table(herr.df$spell, herr.df$loc))
  # weight each count by its loc value; go through as.character so the
  # factor's values are used rather than its internal level codes
  Freq2 <- as.numeric(as.character(boggle$Var2)) * as.numeric(boggle$Freq)
  aggregate(Freq2, by=list(Var1=boggle$Var1), FUN=sum)

Jim
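If the goal is simply one row per spell/loc combination with its count, the cross-tabulation step in the reply above may already be most of the way there. The following is only a sketch under that reading of the question: it rebuilds the first eight rows of the posted data by hand instead of reading "herr.dat", and the column name "count" is an illustrative choice, not something from the thread.

```r
# First eight rows of the posted data, typed in by hand for illustration.
herr.df <- data.frame(
  spell = c("Parts", "Overall", "Parts", "None", "None", "Parts", "None", "Overall"),
  loc   = c(1, 2, 1, 1, 1, 1, 1, 4)
)

# Cross-tabulate spell by loc, then flatten to one row per combination.
# Naming the table() arguments carries the names through to the columns.
counts <- as.data.frame(table(spell = herr.df$spell, loc = herr.df$loc),
                        responseName = "count")

# table() also reports combinations that never occur; drop those rows.
counts <- counts[counts$count > 0, ]
counts
```

Note that spell and loc come back as factors in this version, so loc would need converting if it is to be used as a number afterwards.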
The options I know of are:

1. aggregate (in the base package), with FUN = length. But this converts character vectors to factors, which is sometimes annoying and sometimes dangerous.

2. summarize, in the Hmisc package (again, with FUN = length). I find summarize to be a very useful function in general, but it has a lot of overhead if all you want is counts. It is very slow with a large data frame.

3. Some wrapper that calls tabulate directly. I use:

  table.mat <- function(x) {
    # one key string per row, built from all columns
    uid <- do.call("paste", as.list(x))
    # counts per distinct key, in sorted key order
    count <- tabulate(factor(uid))
    # keep the first row of each distinct key, in the same sorted order
    x <- x[order(uid), ]
    i <- !duplicated(sort(uid))
    out <- x[i, ]
    out$Count <- count
    # sort the result by the original columns (everything except Count)
    last <- length(out)
    o <- do.call("order", as.list(out[-last]))
    out <- out[o, ]
    dimnames(out) <- list(1:(dim(out)[1]), names(out))
    out
  }

This is based on my memory of a function that I think Scott Chasalow wrote and often used. My memory is only of what the function did, not of the code, so Scott may have something a bit better. (I am cc'ing Scott.)

> Message: 16
> From: Alexander.Herr at csiro.au
> To: r-help at stat.math.ethz.ch
> Date: Mon, 13 Jan 2003 14:22:23 +1000
> Subject: [R] summarizing dataframe
>
> Hi Listers,
>
> Surely, I just have a mental block and there is a more elegant way of creating a
> summary count (other than extracting it from ftable). I'd like to create a new
> data.frame containing counts of spell by loc ie have three columns showing
> spell,loc,count. Below the data.frame...
>
> Any help appreciated
> Thanks Herry

Jim

James A. Rogers, Ph.D. <rogers at cantatapharm.com>
Statistical Scientist
Cantata Pharmaceuticals
3-G Gill St
Woburn, MA 01801
617.225.9009
Fax 617.225.9010
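For comparison, option 1 above (aggregate with FUN = length) might look roughly like the sketch below. It is only an illustration: the data frame name herr.df, the hand-typed sample of the first eight posted rows, and the helper column "count" are all assumptions, and the column of ones exists only to give aggregate something to measure the length of.

```r
# First eight rows of the posted data, typed in by hand for illustration.
herr.df <- data.frame(
  spell = c("Parts", "Overall", "Parts", "None", "None", "Parts", "None", "Overall"),
  loc   = c(1, 2, 1, 1, 1, 1, 1, 4)
)

# Group a dummy column by spell and loc; the length of each group is its count.
counts <- aggregate(data.frame(count = rep(1, nrow(herr.df))),
                    by = list(spell = herr.df$spell, loc = herr.df$loc),
                    FUN = length)
counts
```

Unlike a full cross-tabulation, this returns rows only for the combinations that actually occur in the data.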