I am trying to understand cut so I can divide a list of numbers into 10 groups:

0-9.0
10-10.9
20-20.9
30-30.9
40-40.9
50-50.9
60-60.9
70-70.9
80-80.9
90-90.9

As I try to do this, I have been playing with the cut function. Surprisingly, the following four applications of cut give me the exact same groups. This surprises me given that I have varied the parameters include.lowest and right. Can someone help me understand what include.lowest and right do? I have looked at the help page, but I don't seem to understand what I am being told!

Thank you,
John

values <- c((0:99),c(0.9:99.9))
sort(values)
c1<-cut(values,10,include.lowest=FALSE,right=TRUE)
c2<-cut(values,10,include.lowest=FALSE,right=FALSE)
c3<-cut(values,10,include.lowest=TRUE,right=TRUE)
c4<-cut(values,10,include.lowest=TRUE,right=FALSE)
cbind(min=aggregate(values,list(c1),min),max=aggregate(values,list(c1),max))
cbind(min=aggregate(values,list(c2),min),max=aggregate(values,list(c2),max))
cbind(min=aggregate(values,list(c3),min),max=aggregate(values,list(c3),max))
cbind(min=aggregate(values,list(c4),min),max=aggregate(values,list(c4),max))

You can run the code above, or inspect the results I got, which are reproduced below:

> cbind(min=aggregate(values,list(c1),min),max=aggregate(values,list(c1),max))
      min.Group.1 min.x    max.Group.1 max.x
1  (-0.0999,9.91]     0 (-0.0999,9.91]   9.9
2     (9.91,19.9]    10    (9.91,19.9]  19.9
3     (19.9,29.9]    20    (19.9,29.9]  29.9
4     (29.9,39.9]    30    (29.9,39.9]  39.9
5       (39.9,50]    40      (39.9,50]  49.9
6         (50,60]    50        (50,60]  59.9
7         (60,70]    60        (60,70]  69.9
8         (70,80]    70        (70,80]  79.9
9         (80,90]    80        (80,90]  89.9
10       (90,100]    90       (90,100]  99.9

> cbind(min=aggregate(values,list(c2),min),max=aggregate(values,list(c2),max))
      min.Group.1 min.x    max.Group.1 max.x
1  [-0.0999,9.91)     0 [-0.0999,9.91)   9.9
2     [9.91,19.9)    10    [9.91,19.9)  19.9
3     [19.9,29.9)    20    [19.9,29.9)  29.9
4     [29.9,39.9)    30    [29.9,39.9)  39.9
5       [39.9,50)    40      [39.9,50)  49.9
6         [50,60)    50        [50,60)  59.9
7         [60,70)    60        [60,70)  69.9
8         [70,80)    70        [70,80)  79.9
9         [80,90)    80        [80,90)  89.9
10       [90,100)    90       [90,100)  99.9

> cbind(min=aggregate(values,list(c3),min),max=aggregate(values,list(c3),max))
      min.Group.1 min.x    max.Group.1 max.x
1  [-0.0999,9.91]     0 [-0.0999,9.91]   9.9
2     (9.91,19.9]    10    (9.91,19.9]  19.9
3     (19.9,29.9]    20    (19.9,29.9]  29.9
4     (29.9,39.9]    30    (29.9,39.9]  39.9
5       (39.9,50]    40      (39.9,50]  49.9
6         (50,60]    50        (50,60]  59.9
7         (60,70]    60        (60,70]  69.9
8         (70,80]    70        (70,80]  79.9
9         (80,90]    80        (80,90]  89.9
10       (90,100]    90       (90,100]  99.9

> cbind(min=aggregate(values,list(c4),min),max=aggregate(values,list(c4),max))
      min.Group.1 min.x    max.Group.1 max.x
1  [-0.0999,9.91)     0 [-0.0999,9.91)   9.9
2     [9.91,19.9)    10    [9.91,19.9)  19.9
3     [19.9,29.9)    20    [19.9,29.9)  29.9
4     [29.9,39.9)    30    [29.9,39.9)  39.9
5       [39.9,50)    40      [39.9,50)  49.9
6         [50,60)    50        [50,60)  59.9
7         [60,70)    60        [60,70)  69.9
8         [70,80)    70        [70,80)  79.9
9         [80,90)    80        [80,90)  89.9
10       [90,100]    90       [90,100]  99.9

John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Confidentiality Statement:
This email message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.

Have you read FAQ 7.31 recently, John?

Your whole premise is flawed. You should be thinking of ranges [0,10), [10,20), and so on because numbers ending in 0.9 are never going to be exact.
--
Sent from my phone. Please excuse my brevity.

On April 16, 2016 7:38:50 PM PDT, John Sorkin <jsorkin at grecc.umaryland.edu> wrote:
>I am trying to understand cut so I can divide a list of numbers into 10
>group:
[...]
>Can someone help me understand what
>include.lowest and right do? I have looked at the help page, but I
>don't seem to understand what I am being told!
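
The suggestion above (think in half-open ranges rather than in endpoints ending in .9) can be sketched as follows. This is only an illustration, assuming the ten groups wanted are [0,10), [10,20), ..., [90,100). The explicit breaks vector and the name grp are assumptions introduced here and do not appear in the thread.

```r
# Same data as in the original post: 0..99 plus 0.9..99.9
values <- c(0:99, 0.9:99.9)

# Decimal fractions are generally not stored exactly in binary floating
# point (the topic of R FAQ 7.31), so exact comparisons can surprise:
isTRUE(0.1 + 0.2 == 0.3)           # FALSE on IEEE 754 doubles
isTRUE(all.equal(0.1 + 0.2, 0.3))  # TRUE: comparison with a tolerance

# Explicit integer breaks with right = FALSE give intervals that are
# closed on the left and open on the right: [0,10), [10,20), ..., [90,100)
grp <- cut(values, breaks = seq(0, 100, by = 10), right = FALSE)

levels(grp)[1:2]  # "[0,10)"  "[10,20)"
table(grp)        # 20 values in each of the 10 groups
```

Because the breaks are whole numbers, no group boundary depends on how a value such as 9.9 happens to be represented.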

Jeff,

Perhaps I was sloppy with my notation. I want groups >=0 <10, >=10 <20, >=20 <30, ..., >=90 <100.

In any event, my question remains: why did the four different versions of cut give me the same results? I hope someone can explain to me the function of include.lowest and right in the call to cut. As demonstrated in my example, the parameters do not seem to alter the results of using cut.

Thank you,
John

P.S. How do I find FAQ 7.31?

>>> Jeff Newmiller <jdnewmil at dcn.davis.ca.us> 04/16/16 11:07 PM >>>
> Have you read FAQ 7.31 recently, John?
>
> Your whole premise is flawed. You should be thinking of ranges [0,10), [10,20), and so on because numbers ending in 0.9 are never going to be exact.

R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
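
The question about include.lowest and right can be illustrated with values that sit exactly on the breaks, which is the only place where these arguments change the grouping. The vectors x, b and y below are made up for this sketch and are not from the thread. As documented in ?cut, when breaks is a single number the outer limits are moved away by 0.1% of the range so that the extreme values fall inside the intervals. In the posted output none of the values appears to land exactly on an interior break either, which would explain why all four calls grouped the data identically and differed only in their labels.

```r
x <- c(0, 5, 10, 15, 20)  # 0, 10 and 20 lie exactly on the breaks
b <- c(0, 10, 20)

# right = TRUE: intervals (0,10], (10,20]; x = 0 falls outside and becomes NA
r1 <- cut(x, b, right = TRUE,  include.lowest = FALSE)
# include.lowest = TRUE closes the first interval: [0,10], (10,20]
r2 <- cut(x, b, right = TRUE,  include.lowest = TRUE)
# right = FALSE: intervals [0,10), [10,20); x = 20 falls outside and becomes NA
r3 <- cut(x, b, right = FALSE, include.lowest = FALSE)
# with right = FALSE, include.lowest = TRUE closes the last interval: [0,10), [10,20]
r4 <- cut(x, b, right = FALSE, include.lowest = TRUE)

# Side-by-side view: x = 10 moves between groups depending on right,
# and the NAs at x = 0 or x = 20 disappear with include.lowest = TRUE
data.frame(x, r1, r2, r3, r4)

# With a single number of breaks the outer limits are widened slightly,
# so the minimum and maximum are inside even with include.lowest = FALSE
y <- c(0, 3, 7, 10)
anyNA(cut(y, 2, include.lowest = FALSE))  # FALSE
```

For the groups >=0 <10, >=10 <20, and so on, explicit breaks with right = FALSE match the stated intent; include.lowest then matters only for a value equal to the top break.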