When I was studying the function cut I found this example:

> x <- rep(0:8, tx0)
> x
 [1] 0 0 0 0 0 0 0 0 0 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 4 4 4 5 5 5 5 5 5 5 5 5 5 6
[39] 6 6 6 6 7 7 7 8 8 8 8 8
> cut(x, b = 8)
 [1] (-0.008,0.994] (-0.008,0.994] (-0.008,0.994] (-0.008,0.994] (-0.008,0.994]
 [6] (-0.008,0.994] (-0.008,0.994] (-0.008,0.994] (-0.008,0.994] (0.994,2]
[11] (0.994,2]      (0.994,2]      (0.994,2]      (2,3]          (2,3]
[16] (2,3]          (2,3]          (2,3]          (2,3]          (3,4]
[21] (3,4]          (3,4]          (3,4]          (3,4]          (4,5]
[26] (4,5]          (4,5]          (4,5]          (4,5]          (4,5]
[31] (4,5]          (4,5]          (4,5]          (4,5]          (4,5]
[36] (4,5]          (4,5]          (5,6]          (5,6]          (5,6]
[41] (5,6]          (5,6]          (6,7.01]       (6,7.01]       (6,7.01]
[46] (7.01,8.01]    (7.01,8.01]    (7.01,8.01]    (7.01,8.01]    (7.01,8.01]
8 Levels: (-0.008,0.994] (0.994,2] (2,3] (3,4] (4,5] (5,6] ... (7.01,8.01]

I understand that each element of the resulting factor is the "interval" that the corresponding element of the original vector (x) belongs to. So, clearly, the first element of x, 0, belongs to (-0.008,0.994] because -0.008 < 0 <= 0.994. However, when I come to the 14th element of x, that is, 2, I don't see how it belongs to (2,3], because 2 < 2 <= 3 is false according to the standard interval notation (http://en.wikipedia.org/wiki/Interval_notation).

Maybe there is something I have not understood about either the cut function or the interval notation. Do you have any comments?

Thanks,

Sergio.
On 11-11-07 7:48 PM, JulioSergio wrote:
> However, when I come to the 14th element of x, that is, 2, I don't see
> that it belongs to (2,3], because 2 < 2 <= 3 is strictly false
> according to the standard interval notation
> (http://en.wikipedia.org/wiki/Interval_notation). Maybe there is something
> I have not understood of either the cut function or the interval notation.
> Do you have any comments?

The actual break is 1.996, but the formatting rounded that to 2 for printing. You can see it more accurately if you set dig.lab to 4 or more.

Duncan Murdoch
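A minimal sketch of the point about dig.lab, under two assumptions that are not part of the original posts: the thread never shows tx0, so the counts vector below is reconstructed from the printed x, and the breaks are built by hand to match the values reported in this thread (other R versions may place the interior breaks differently when given only a number of bins). It only illustrates that dig.lab changes the labels, not the bins.

```r
# Assumption: counts reconstructed from the printed x (tx0 is not shown in the thread).
counts <- c(9, 4, 6, 5, 3, 10, 5, 3, 5)
x <- rep(0:8, counts)

# Assumption: explicit breaks matching the values reported in the thread,
# i.e. nine equally spaced points from -0.008 to 8.008.
breaks <- seq(-0.008, 8.008, length.out = 9)

f3 <- cut(x, breaks = breaks)               # default dig.lab = 3
f4 <- cut(x, breaks = breaks, dig.lab = 4)  # labels printed with more digits

print(levels(f3)[3])  # label of the third bin, rounded for display
print(levels(f4)[3])  # same bin, with the breaks shown more precisely

# The bin assignment itself does not depend on dig.lab.
print(identical(as.integer(f3), as.integer(f4)))
```

The integer codes of the two factors agree, so the value 2 sits in the same bin either way; only the printed label of that bin differs.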
In stepping through 'cut.default', which is what gets called, I get the following breaks:

-0.008  0.994  1.996  2.998  4.000  5.002  6.004  7.006  8.008

When these are printed to three significant digits, you get "(0.994,2]" and "(2,3]", as you see in the factor levels. If instead you use

cut(x, b = 8, dig.lab = 7)

you will get:

 [1] (-0.008,0.994] (-0.008,0.994] (-0.008,0.994] (-0.008,0.994] (-0.008,0.994] (-0.008,0.994]
 [7] (-0.008,0.994] (-0.008,0.994] (-0.008,0.994] (0.994,1.996]  (0.994,1.996]  (0.994,1.996]
[13] (0.994,1.996]  (1.996,2.998]  (1.996,2.998]  (1.996,2.998]  (1.996,2.998]  (1.996,2.998]
[19] (1.996,2.998]  (2.998,4]      (2.998,4]      (2.998,4]      (2.998,4]      (2.998,4]
[25] (4,5.002]      (4,5.002]      (4,5.002]      (4,5.002]      (4,5.002]      (4,5.002]
[31] (4,5.002]      (4,5.002]      (4,5.002]      (4,5.002]      (4,5.002]      (4,5.002]
[37] (4,5.002]      (5.002,6.004]  (5.002,6.004]  (5.002,6.004]  (5.002,6.004]  (5.002,6.004]
[43] (6.004,7.006]  (6.004,7.006]  (6.004,7.006]  (7.006,8.008]  (7.006,8.008]  (7.006,8.008]
[49] (7.006,8.008]  (7.006,8.008]
8 Levels: (-0.008,0.994] (0.994,1.996] (1.996,2.998] (2.998,4] (4,5.002] ... (7.006,8.008]

Aren't floating point numbers fun? Read FAQ 7.31.

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
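The break values listed above can be reproduced by hand. This is a sketch that assumes, based only on the numbers reported in this thread, that the range of x (0 to 8) was widened by 0.1% on each side and then divided into 8 equal bins; the variable names are illustrative, and the internals of cut.default may differ between R versions.

```r
# Assumption: range of x as shown in the thread.
x_range <- c(0, 8)
dx <- diff(x_range)
n_bins <- 8

# Widen the range by dx/1000 on each side, then place n_bins + 1 equally
# spaced break points (this matches the values quoted in the thread).
breaks <- seq(x_range[1] - dx / 1000, x_range[2] + dx / 1000,
              length.out = n_bins + 1)

print(round(breaks, 3))   # the breaks to three decimal places
print(signif(breaks, 3))  # the same breaks at three significant digits

# The value 2 satisfies the half-open interval test against the true breaks.
in_third_bin <- breaks[3] < 2 && 2 <= breaks[4]
print(in_third_bin)
```

At three significant digits the third and fourth breaks display as 2 and 3, which is where the label "(2,3]" comes from, while the membership test uses the unrounded values 1.996 and 2.998.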
Duncan Murdoch <murdoch.duncan <at> gmail.com> writes:
...

Thanks, Duncan!