michael watson (IAH-C)
2003-Jul-03 13:02 UTC
[R] Generating a vector for breaks in a histogram
Hi

I have two lots of numbers that I would like to plot as histograms using the hist() function. For comparison, I want them to be on the same scale, which I can achieve with the xlim and ylim options.

However, having them on the same scale is meaningless unless they have the same "breaks". According to the documentation, there are 4 ways of defining the breaks, only one of which is "definite"; the others are merely suggestions, which I have found is not good enough.

The only definite way is to give hist() a vector of break points for the histogram. So I need to generate a vector that contains, say, 500 numbers, equally spaced between a min and a max. E.g.:

> myfunc(n=20,min=1,max=20)

would provide a vector of length 20 containing the numbers 1 through 20.

Is there a function in R that can do this?

Thanks
Mick
Dear Mick,

Have a look at ?seq - seq(1,20,length=20) should do it.

HTH
Thomas

---
Thomas Hotz
Research Associate in Medical Statistics
University of Leicester, United Kingdom
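As a minimal sketch of the seq() suggestion above: the variable names lo, hi and n are placeholders chosen for illustration, not part of the original messages. Note that n break points delimit n - 1 bins, so 500 bins would need 501 break points.

```r
# Placeholder endpoints and count, taken from the example in the question.
lo <- 1
hi <- 20
n  <- 20

# n equally spaced values from lo to hi, both endpoints included.
pts <- seq(lo, hi, length.out = n)
print(pts)

# Alternatively, fix the spacing rather than the number of points.
by_ten <- seq(0, 500, by = 10)
print(length(by_ten))  # number of break points; bins = break points - 1
```

Using length.out fixes how many points are returned; using by fixes the bin width and lets the number of points follow from the range.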
michael watson (IAH-C)
2003-Jul-03 13:48 UTC
[R] Generating a vector for breaks in a histogram
Fantastic. You're right, I was looking for seq().

However, my plan for using it with hist() was foiled! I thought that if I did something like:

> b <- seq(0,500,10)
> hist(myvble,breaks=b)

it would bin myvble into the bins 0-10, 10-20, 20-30 etc., and in that way I could ensure that two histograms are on the same scale with the same bins. Instead, I get the following error:

Error in hist.default(Cy5, breaks = s) : some 'x' not counted; maybe 'breaks' do not span range of 'x'

Now this makes sense of course; my bins probably DON'T span the entire range of x. So I am still left with the same problem:

1) two variables
2) I want to draw histograms of both
3) I want them to have the SAME x-y scale on the graph
4) I want them to have the SAME bin range

How do I do it? Any suggestions?

Cheers
Mick
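One possible way to meet the four requirements listed in the follow-up is to build the break vector from the combined range of both variables, so that it is guaranteed to span every value. The sketch below is only an illustration under stated assumptions: x1 and x2 are made-up data standing in for the two variables, and bin_width is an arbitrary choice.

```r
set.seed(1)                              # reproducible made-up data
x1 <- rnorm(200, mean = 250, sd = 80)    # hypothetical first variable
x2 <- rnorm(200, mean = 300, sd = 120)   # hypothetical second variable

# Breaks must cover the combined range of both variables; otherwise
# hist() stops with "some 'x' not counted".
bin_width <- 10
rng <- range(x1, x2)
lo <- floor(rng[1] / bin_width) * bin_width     # round down to a bin edge
hi <- ceiling(rng[2] / bin_width) * bin_width   # round up to a bin edge
b  <- seq(lo, hi, by = bin_width)

# Compute both histograms without plotting, to find a common y limit.
h1 <- hist(x1, breaks = b, plot = FALSE)
h2 <- hist(x2, breaks = b, plot = FALSE)
ymax <- max(h1$counts, h2$counts)

# Draw both with identical bins and identical axes.
op <- par(mfrow = c(2, 1))
plot(h1, xlim = c(lo, hi), ylim = c(0, ymax), main = "x1")
plot(h2, xlim = c(lo, hi), ylim = c(0, ymax), main = "x2")
par(op)
```

Rounding the limits outward to multiples of bin_width keeps the bin edges at round numbers while still enclosing the smallest and largest observations.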
One of my discoveries while learning the art of R is that time has moved on since I did my basic statistics in school (although, to my dismay, the teaching of statistics in school appears not to have noticed the movement).

I have seen a few cases where people who want to draw a pie chart are advised to "find a better way." I've been reading some of the ash work (see the package of the same name and the many papers on the web), and also some interesting work on dot plots as an alternative to histograms. They make me feel that unless the data in both histograms happen to work well with the same set of bins, you may not get the comparative assessment that you think you are getting.

I am beginning to form the opinion that in most cases (if not all) there are better alternatives to histograms.

_________________________________________________
Tom Mulholland
Senior Policy Officer
WA Country Health Service
My gut feeling is that stacked dotplots would have given you the same insight. In general terms, it's about getting the right tool for the right job. My comment was about the order in which to choose the tools rather than about ignoring histograms totally.

If I recall correctly, the article about dot plots was about old-fashioned hand-drawn dot plots, where dots were either stacked above each other or, if more appropriate, placed next to each other as near as possible to where they should be located on the axis. This results in a pattern that looks very similar to the histogram. The argument being made, if I recall correctly, is that if you choose the wrong bins for a histogram you may well end up with the same type of result that you had with the densityplot.

My practical way of looking at this is to see what happens to the overall shape of the histogram when you change the bins. The issue is how quickly and reliably you get to the "truth" using the various techniques. As you've noted, the density plot doesn't seem to deal with some types of data as well as it does with others. So when I am looking at data I use a variety of methods, and histograms come later than rug plots or density plots, although I tend to do both of those together.

I'm just learning and welcome guidance in a field in which I do not claim expertise.

_________________________________________________
Tom Mulholland
Senior Policy Officer
WA Country Health Service
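The practical check described above (watching how the shape changes as the bins change, alongside rug, density and stacked dot plots) might be sketched as follows. This is an illustration only: x is made-up bimodal data, and the bin widths and the rounding to one decimal place for the dot plot are arbitrary assumptions.

```r
set.seed(42)  # made-up data, for illustration only
x <- c(rnorm(100, mean = 0), rnorm(100, mean = 4))

lo <- floor(min(x))
hi <- ceiling(max(x))
widths <- c(0.25, 1, 4)   # arbitrary bin widths to compare

# Bin counts for the same data under each bin width.
counts <- lapply(widths, function(w) {
  n_bins <- ceiling((hi - lo) / w)
  b <- seq(lo, by = w, length.out = n_bins + 1)  # always reaches past max(x)
  hist(x, breaks = b, plot = FALSE)$counts
})
names(counts) <- paste0("width_", widths)
print(counts)

# Views that do not depend on a choice of bins in the same way.
op <- par(mfrow = c(2, 1))
plot(density(x), main = "Density estimate with rug")
rug(x)
stripchart(round(x, 1), method = "stack", pch = 16,
           main = "Stacked dot plot (values rounded to 0.1)")
par(op)
```

With the widest bins the two groups in the made-up sample are likely to merge into far fewer bars, which is the kind of shape change the message suggests looking for.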