This must be a FAQ, but I can't find it in the archives or with a site search.

I am trying to construct a frequency table. I guess this should be done with
table. Or perhaps factor and split. Or prop.table. cut? findInterval? Argh!

Please correct me if what I am looking for is not called a "frequency table".
Perhaps it's called grouped data.

> zz$x9
 [1] 65 70 85 65 65 65 62 55 82 59 55 66 74 55 65 56 80 73 45 64 75 58 60 56 60
[26] 65 53 63 72 80 90 95 55 70 79 62 57 65 60 47 61 53 80 75 72 87 52 72 80 85
[51] 75 70 84 60 72 70 76 70 79 72 69 80 62 74 54 58 58 69 81 84

I (think) I want it to look like:

40-49  2
50-59 15
60-69 20
70-79 19
80-89 12
90-99  2

Or the other way around with transpose.

classes = c("40-49", "50-59", "60-69", "70-79", "80-89", "90-99")

would do for the row names, but

sum(zz$x9 > 40 & zz$x9 < 50)

is a very laborious way of getting the frequency counts...

I got this far:

> table(cut(zz$x9, brk))

 (40,50]  (50,60]  (60,70]  (70,80]  (80,90] (90,100]
       2       19       21       19        8        1
> brk
[1]  40  50  60  70  80  90 100
>
> t(table(cut(zz$x9, brk)))
     (40,50] (50,60] (60,70] (70,80] (80,90] (90,100]
[1,]       2      19      21      19       8        1

Still feels a million miles off.

Now I could do with a little help please after spending a couple of hours
working this out.
I guess you want something like:

table(cut(zz$x9, c(-Inf, seq(40, 90, by=10), Inf)))

HTH,
Andy

> From: Kai Hendry
> [...]
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
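A small sketch of what the open-ended breaks in the suggestion above do. The vector x is made up for illustration and is not the poster's data; note that the intervals are still closed on the right, since right = TRUE is the default of cut().

```r
# Hypothetical values, two of them outside the 40-90 range
x <- c(35, 45, 55, 95, 105)

# -Inf and Inf add catch-all bins below 40 and above 90
brk <- c(-Inf, seq(40, 90, by = 10), Inf)
tab <- table(cut(x, brk))

# Seven bins: 35 lands in the first one, 95 and 105 in the last one
tab
```

With finite outer breaks such as seq(40, 100, by = 10), the values 35 and 105 would become NA and drop out of the table instead.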
?data.frame

data.frame( table(cut(x, seq(0, 1, by=0.1))) )

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Kai Hendry
> Sent: 17 March 2004 14:55
> To: r-help at stat.math.ethz.ch
> Subject: [R] Frequency table
Kai Hendry wrote:
> I got this far:
>
> > table(cut(zz$x9, brk))

table(cut(zz$x9, brk, right = FALSE))

should do the trick.

Uwe Ligges
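To see why right = FALSE matters, here is a minimal sketch with a made-up vector x whose values sit exactly on the break points (an assumption for illustration; the poster's data is not used):

```r
# Hypothetical values chosen to sit on or next to the breaks
x <- c(40, 49, 50, 59, 60)
brk <- seq(40, 60, by = 10)

# Default right = TRUE: intervals (40,50] and (50,60].
# 50 is counted with the forties and 40 falls outside (NA).
right_closed <- cut(x, brk)
table(right_closed, useNA = "ifany")

# right = FALSE: intervals [40,50) and [50,60).
# 50 is counted with the fifties and 60 falls outside (NA).
left_closed <- cut(x, brk, right = FALSE)
table(left_closed, useNA = "ifany")
```

In the original data the largest value is 95 and the last break is 100, so nothing is lost at the upper end with right = FALSE.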
Kai Hendry <hendry at cs.helsinki.fi> writes:

> I got this far:
>
> > table(cut(zz$x9, brk))
>
>  (40,50]  (50,60]  (60,70]  (70,80]  (80,90] (90,100]
>        2       19       21       19        8        1
> > brk
> [1]  40  50  60  70  80  90 100
> >
> > t(table(cut(zz$x9, brk)))
>      (40,50] (50,60] (60,70] (70,80] (80,90] (90,100]
> [1,]       2      19      21      19       8        1
>
> Still feels a million miles off.

Hmm, interesting complication of the convention that tables are 1D
arrays there...

You got this far:

classes <- c("40-49", "50-59", "60-69", "70-79", "80-89", "90-99")
brk <- seq(40,100,10)

However, your intervals include the wrong end and the labels are ugly,
so try

table(cut(zz$x9,breaks=brk,right=FALSE,labels=classes))

This at least gives you the right counts and labels:

40-49 50-59 60-69 70-79 80-89 90-99
    2    15    20    19    12     2

For a column display, you need to convert to a matrix somehow.
Transposing twice will actually do it, but I think I prefer

matrix(table(cut(zz$x9,breaks=brk,right=FALSE)),dimnames=list(age=classes,""))

which gives this:

age
  40-49  2
  50-59 15
  60-69 20
  70-79 19
  80-89 12
  90-99  2

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
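The two routes to a column display mentioned above can be compared side by side. This is only a sketch on a made-up vector x with four classes, not the poster's data:

```r
# Hypothetical values and matching class labels
x <- c(45, 47, 55, 62, 68, 71)
brk <- seq(40, 80, by = 10)
classes <- c("40-49", "50-59", "60-69", "70-79")
tab <- table(cut(x, breaks = brk, right = FALSE, labels = classes))

# t() turns the 1-D table into a 1 x 4 matrix; a second t() gives 4 x 1
col1 <- t(t(tab))
dim(col1)

# matrix() drops the table class and fills a single column by default;
# the dimnames give the rows a heading and leave the column unnamed
col2 <- matrix(tab, dimnames = list(age = classes, ""))
col2
```

Both objects hold the same counts in one column; the matrix() form simply gives more control over the printed headings.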
On Wed, 2004-03-17 at 08:55, Kai Hendry wrote:
> I (think) I want it to look like:
>
> 40-49 2
> 50-59 15
> 60-69 20
> 70-79 19
> 80-89 12
> 90-99 2
>
> I got this far:
> > table(cut(zz$x9, brk))
>
>  (40,50]  (50,60]  (60,70]  (70,80]  (80,90] (90,100]
>        2       19       21       19        8        1

Try this:

table(cut(zz$x9, brk,
          labels = c("40 - 49", "50 - 59", "60 - 69",
                     "70 - 79", "80 - 89", "90 - 99"),
          right = FALSE))

This should give you something like:

40 - 49 50 - 59 60 - 69 70 - 79 80 - 89 90 - 99
      2      15      20      19      12       2

You can use the labels argument in cut to define the group labels, and
'right = FALSE' makes the intervals closed on the left and open on the
right, so that, for example, 50 is counted in "50 - 59" rather than in
"40 - 49".

HTH,

Marc Schwartz
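Typing the labels by hand gets tedious when the breaks change. One possible way to derive them from brk is sketched below; this is an illustration rather than something proposed in the thread, and x9 is just the vector from the original post copied into a standalone variable.

```r
# Data from the original post (zz$x9), copied into a plain vector
x9 <- c(65, 70, 85, 65, 65, 65, 62, 55, 82, 59, 55, 66, 74, 55, 65, 56, 80,
        73, 45, 64, 75, 58, 60, 56, 60, 65, 53, 63, 72, 80, 90, 95, 55, 70,
        79, 62, 57, 65, 60, 47, 61, 53, 80, 75, 72, 87, 52, 72, 80, 85, 75,
        70, 84, 60, 72, 70, 76, 70, 79, 72, 69, 80, 62, 74, 54, 58, 58, 69,
        81, 84)

brk <- seq(40, 100, by = 10)

# Lower bound is each break except the last; upper bound is the next break - 1
lab <- paste(head(brk, -1), brk[-1] - 1, sep = " - ")

tab <- table(cut(x9, brk, labels = lab, right = FALSE))
tab   # counts: 2 15 20 19 12 2
```

The "- 1" only makes sense for integer-valued data such as ages; for continuous data the default interval labels are less misleading.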
On Wed, Mar 17, 2004 at 04:55:19PM +0200, Kai Hendry wrote:
> I am trying to construct a frequency table. I guess this should be done with
> table. Or perhaps factor and split. Or prop.table. cut? findInterval? Argh!

> I got this far:
> > table(cut(zz$x9, brk))
>
>  (40,50]  (50,60]  (60,70]  (70,80]  (80,90] (90,100]
>        2       19       21       19        8        1
> > brk
> [1]  40  50  60  70  80  90 100
> >
> > t(table(cut(zz$x9, brk)))
>      (40,50] (50,60] (60,70] (70,80] (80,90] (90,100]
> [1,]       2      19      21      19       8        1
>
> Still feels a million miles off.

Why? To me it looks like you figured it all out. You found out how to
use cut() to get the appropriate factor and you used table() to compute
the counts. Nothing wrong with that...

The only difference to what you wanted to get is that your example
looked more like a data frame. Try

as.data.frame(yourtable)

which will give you something like this:

> as.data.frame(tbl)
         b Freq
1   (0,20]   17
2  (20,40]   28
3  (40,60]   19
4  (60,80]   15
5 (80,100]   21

Is that what you wanted?

cu
	Philipp

-- 
Dr. Philipp Pagel                            Tel.  +49-89-3187-3675
Institute for Bioinformatics / MIPS          Fax.  +49-89-3187-3585
GSF - National Research Center for Environment and Health
Ingolstaedter Landstrasse 1
85764 Neuherberg, Germany
http://mips.gsf.de/~pagel
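The column names of the resulting data frame can be controlled as well. A minimal sketch, assuming a made-up vector x and the hypothetical column names age_group and count:

```r
# Hypothetical values; age_group and count are illustrative names
x <- c(45, 47, 55, 62, 68, 71)

# Naming the argument to table() names the first column of the data frame
tab <- table(age_group = cut(x, seq(40, 80, by = 10), right = FALSE))

# responseName replaces the default "Freq" column name
df <- as.data.frame(tab, responseName = "count")
df
```

The first column is a factor holding the interval labels, so it can be used directly for plotting or merging.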
Kai Hendry <hendry at cs.helsinki.fi> writes:

> 40-49 2
> 50-59 15
> 60-69 20
> 70-79 19
> 80-89 12
> 90-99 2

Here's another solution for this 10-year age group thing:

tt <- table(zz$x9 %/% 10)
n <- names(tt)
names(tt) <- paste(n,0,"-",n,9,sep="")
tt
data.frame(count=c(tt))

Beware that empty groups are silently zapped, though.

-- 
Peter Dalgaard
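A short sketch of the caveat about empty groups, using a made-up vector x with nobody in their fifties (not the poster's data):

```r
# Hypothetical values: two in the 40s, none in the 50s, two in the 60s
x <- c(45, 47, 62, 68)
dec <- x %/% 10

# table() only reports values that occur, so decade 5 is missing entirely
zapped <- table(dec)
zapped

# Tabulating a factor with explicit levels keeps the empty group as a zero
kept <- table(factor(dec, levels = min(dec):max(dec)))
kept
```

Whether a zero row is wanted depends on the use; for a bar chart or a printed age distribution it usually is.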
Assuming x contains the data, taking the solutions so far and adding
some minor improvements gives:

groups <- x %/% 10
lev <- min(groups):max(groups)
lab <- paste( lev, "0-", lev, "9", sep = "" )
groups <- factor( groups, levels = lev, labels = lab )
tab <- table( groups, dnn = "Range" )
as.data.frame( tab )

# for graphical output:
bp <- barplot( tab )
text( bp, tab, tab, pos = 3, xpd = TRUE )
Hi,

See if this generic function I made can help you.

data <- c(65, 70, 85, 65, 65, 65, 62, 55, 82, 59, 55, 66, 74, 55, 65, 56, 80,
          73, 45, 64, 75, 58, 60, 56, 60, 65, 53, 63, 72, 80, 90, 95, 55, 70,
          79, 62, 57, 65, 60, 47, 61, 53, 80, 75, 72, 87, 52, 72, 80, 85, 75,
          70, 84, 60, 72, 70, 76, 70, 79, 72, 69, 80, 62, 74, 54, 58, 58, 69,
          81, 84)

#------------ begin options of table---------------
min <- 40    # lower limit of the first class
max <- 100   # upper limit of the last class
h <- 10      # class width
#-------------- end options of table---------------

#-------- begin declaration of variables-----------
Fi <- numeric(); FacA <- numeric(); FacP <- numeric();
FrA <- numeric(); FrP <- numeric()
#-------- end declaration of variables-------------

#----------------- begin function------------------
Createtable <- function() {
  # Fi: class frequencies, intervals closed on the left
  Fi <<- table(cut(data, br = seq(min, max, h), right = FALSE))
  K <- length(names(Fi))
  n <- length(data)
  for(i in 1:K) { FrA[i] = Fi[i] / n }                    # relative frequency
  for(i in 1:K) { FrP[i] = (Fi[i] / n) * 100 }            # relative frequency, %
  for(i in 1:K) { FacA[i] = sum(Fi[1:i]) }                # cumulative frequency
  for(i in 1:K) { FacP[i] = (sum(Fi[1:i]) / n) * 100 }    # cumulative frequency, %
  table <- data.frame(Fi, FrA, FrP, FacA, FacP)
}
#----------------- end function------------------

tab <- Createtable()
print("Complete table:")
print(tab)

José Cláudio Faria
UESC/DCET
Brasil
73-634.2779
joseclaudio.faria at terra.com.br
jc_faria at uol.com.br
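The same columns can also be computed without loops or global variables. The following is only a sketch of one alternative, not the function posted above; freq_table and its arguments lower, upper and width are hypothetical names, and x is a made-up vector.

```r
# Hypothetical helper: builds the same five columns in vectorised form
freq_table <- function(x, lower, upper, width) {
  cls <- cut(x, breaks = seq(lower, upper, by = width), right = FALSE)
  f <- as.vector(table(cls))   # class frequencies as a plain integer vector
  n <- length(x)
  data.frame(
    class = levels(cls),
    Fi    = f,                      # frequency
    FrA   = f / n,                  # relative frequency
    FrP   = 100 * f / n,            # relative frequency, %
    FacA  = cumsum(f),              # cumulative frequency
    FacP  = 100 * cumsum(f) / n     # cumulative frequency, %
  )
}

# Made-up values for illustration
x <- c(45, 47, 55, 62, 68, 71, 75, 79)
tab <- freq_table(x, lower = 40, upper = 80, width = 10)
tab
```

Passing the data and the class limits as arguments avoids masking the base functions min and max with global variables of the same name.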