Can someone help me with a misunderstanding I'm having with hist? I expected, from the example below, that the number of bins would always be 10 and the length of the counts array the same. According to the help section, 'breaks' can be an integer indicating the number of bins. From the example below, the number of bins (length of the counts array) varies. Am I wrong in expecting the same number of bins every time from my hist() call (am I doing something wrong)?

> hist(rnorm(1000),breaks=10)$counts;
[1] 2 10 18 41 85 151 188 195 149 92 48 14 5 2
> hist(rnorm(1000),breaks=10)$counts;
[1] 1 5 19 39 89 155 207 179 162 89 32 19 3 1
> hist(rnorm(1000),breaks=10)$counts;
[1] 2 3 19 46 101 149 196 204 137 79 43 16 4 1
> hist(rnorm(1000),breaks=10,plot=FALSE)$counts;
[1] 2 6 19 41 89 166 188 193 151 87 37 14 6 1
> hist(rnorm(1000),breaks=10,plot=FALSE)$counts;
[1] 2 6 26 48 90 145 188 177 143 95 52 19 9
> hist(rnorm(1000),breaks=10,plot=FALSE)$counts;
[1] 8 14 35 101 148 195 197 158 82 34 20 8
> hist(rnorm(1000),breaks=10,plot=FALSE)$counts;
[1] 2 12 17 57 82 157 196 215 135 80 25 19 3
> hist(rnorm(1000),breaks=10,plot=FALSE)$counts;
[1] 1 3 14 51 86 130 212 194 152 89 45 18 5
> hist(rnorm(1000),breaks=10,plot=FALSE)$counts;
[1] 1 4 18 46 112 146 173 195 155 93 38 11 7 0 1
> hist(rnorm(1000),breaks=10,plot=FALSE)$counts;
[1] 1 2 13 39 97 148 198 189 145 101 40 26 1
> hist(rnorm(1000),breaks=10,plot=FALSE)$counts;
[1] 4 11 39 102 128 191 204 148 111 49 11 1 1
> hist(rnorm(1000),breaks=10,plot=FALSE)$counts;
[1] 1 0 19 136 354 309 160 20 1
> hist(rnorm(1000),breaks=10,plot=FALSE)$counts;
[1] 3 3 19 52 95 148 198 179 136 100 41 18 7 1
> hist(rnorm(1000),breaks=10,plot=FALSE)$counts;
[1] 2 4 18 39 88 155 178 188 170 88 46 18 5 1
> hist(rnorm(1000),breaks=10,plot=FALSE)$counts;
[1] 1 1 17 125 372 335 126 23
> hist(rnorm(1000),breaks=10)$counts;
[1] 3 21 36 88 153 194 196 158 96 38 13 3 1
> hist(rnorm(1000),breaks=10)$counts;
[1] 6 18 37 77 155 213 201 130 105 42 11 4 1
> hist(rnorm(1000),breaks=10)$counts;
[1] 2 2 21 35 80 155 181 199 165 99 37 16 8
> hist(rnorm(1000),breaks=10)$counts;
[1] 1 4 19 49 93 143 201 216 136 79 38 13 7 1
> hist(rnorm(1000),breaks=10)$counts;
[1] 1 3 19 38 100 138 208 182 147 93 48 18 5
> hist(rnorm(1000),breaks=10)$counts;
[1] 1 0 22 122 331 342 156 25 1
> hist(rnorm(1000),breaks=10)$counts;
[1] 2 8 10 48 98 154 170 194 168 91 40 15 2
> hist(rnorm(1000),breaks=10)$counts;
[1] 3 17 127 350 355 120 25 3
> hist(rnorm(1000),breaks=10)$counts;
[1] 3 6 8 43 81 151 188 216 163 85 33 15 7 1

Thanks,
John

On 5/20/08, John Gant <john.gant at gmail.com> wrote:
> Can someone help me with a misunderstanding I'm having with hist? I
> expected, from the example below, that the number of bins would always be 10
> and the length of the counts array the same. According to the help section
> 'breaks' can be a integer indicating the number of bins.

The help actually says:

  breaks: one of:

    * a vector giving the breakpoints between histogram cells,
    * a single number giving the number of cells for the histogram,
    * a character string naming an algorithm to compute the number of
      cells (see 'Details'),
    * a function to compute the number of cells.

  In the last three cases the number is a suggestion only.

The last sentence explains your observation. BTW, this was recently discussed:

https://stat.ethz.ch/pipermail/r-help/2008-May/162492.html

-Deepayan
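
To make the "suggestion only" behaviour concrete, here is a minimal sketch. It assumes (this is my reading of hist.default, not something stated in the help text quoted above) that a single number given as 'breaks' is handed to pretty(), which picks "round" breakpoints covering the range of the data. The seed and the variable names x, h and p are arbitrary choices for illustration.

```r
# Sketch: why breaks = 10 does not guarantee 10 bins.
set.seed(1)                      # arbitrary seed, only for reproducibility
x <- rnorm(1000)

h <- hist(x, breaks = 10, plot = FALSE)

# Assumption: for a single number, hist() computes its breakpoints with
# pretty(), treating the number as a target, not a requirement.
p <- pretty(range(x), n = 10, min.n = 1)

length(h$counts)                 # number of bins actually used
isTRUE(all.equal(h$breaks, p))   # do the breakpoints match pretty()'s?
```

Because pretty() favours breakpoints at round values, the number of bins depends on the range of each random sample, which would account for the varying lengths in the original post.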

On 5/20/2008 2:51 PM, John Gant wrote:
> Can someone help me with a misunderstanding I'm having with hist? I
> expected, from the example below, that the number of bins would always be 10
> and the length of the counts array the same. According to the help section
> 'breaks' can be a integer indicating the number of bins. From the example
> below, the number of bins (length of the counts array) varies. Am I wrong in
> expecting the same number of bins every time from my hist() call (am I doing
> something wrong)?

Your expectation is wrong. When breaks is a single integer, it is a suggestion, not a fixed count. This is described on the help page: "In the last three cases the number is a suggestion only."

If you want to fix the locations of the breaks, specify them explicitly.

Duncan Murdoch
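
As a minimal sketch of specifying the breaks explicitly: passing a vector of 11 breakpoints that spans the data gives exactly 10 bins. The choice of equally spaced breakpoints from min(x) to max(x) is only one possibility, and the names x, brk and h are illustrative.

```r
# Sketch: force exactly 10 bins by supplying the breakpoints yourself.
set.seed(1)                                   # arbitrary seed
x <- rnorm(1000)

# 11 equally spaced breakpoints covering the data -> 10 cells.
brk <- seq(min(x), max(x), length.out = 11)

h <- hist(x, breaks = brk, plot = FALSE)
length(h$counts)   # 10
sum(h$counts)      # 1000, every observation falls in some bin
```

Note that breakpoints derived from each sample's range differ from sample to sample. To compare counts across samples, a fixed vector of breakpoints would be needed instead, and it must span every sample or hist() will stop with an error.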