R plotting experts:

I have a bivariate dataset composed of 300 (x,y) continuous datapoints. 297 of these points are located within the y range of [0,10], while 2 are located at 20 and one at 55. No coding errors, real outliers.

When plotting these data with a scatterplot, I obviously have a problem. If I plot the full dataset with ylim = c(0,55), then I cannot see the structure in the data in the [0, 10] range. If I truncate the y axis with ylim = c(0,10), then I cannot see the 3 outliers. If I break the y axis from 10 to 20 (using plotrix functions), I still do not see the data optimally because of the white space from y=20 to y=55.

What I would like to do is break the y axis at 2 points, roughly 10-20 and 20-55. Is there a function that can break an axis in 2 places?

Thanks in advance for any suggestions.

Brant
On Thu, 2007-04-12 at 13:41 -0500, Inman, Brant A. M.D. wrote:
> What I would like to do is break the y axis at 2 points, roughly 10-20
> and 20-55. Is there a function that can break an axis in 2 places?

Brant,

I am not a particular fan of broken axes (though others will disagree), much less two breaks.

Presuming that your data might look something like this:

http://www.itl.nist.gov/div898/handbook/eda/section3/scattera.htm

A couple of thoughts:

1. I am not sure whether your data range above actually includes 0; if it does not, you may want to consider a log-scaled axis.

2. I might be tempted to use two plots:

   A. First, a plot of the entire data set, showing the 3 outliers.

   B. Second, a plot of the 297 pairs with axes constrained to the appropriate ranges, to enable better visualization of the data structure.

If number 2 is more appropriate, you could also use par("mfcol") to set up side-by-side plots. See ?par.

HTH,

Marc Schwartz
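A minimal sketch of the two suggestions above, using base graphics only. The data are simulated stand-ins for the dataset described in the question (not the real data), and the cutoff of 10 used to separate the outliers is an assumption taken from the question's wording.

```r
# Simulated stand-in for the data in the question (an assumption, not the
# real data): 297 points with y in (0, 10), two at 20 and one at 55.
set.seed(1)
x <- runif(300)
y <- c(runif(297, 0.1, 10), 20, 20, 55)

# Cutoff of 10 comes from the question; adjust for other data.
is_outlier <- y > 10

# Suggestion 2: two plots side by side via par(mfcol = ...).
op <- par(mfcol = c(1, 2))
plot(x, y, main = "All 300 points")          # outliers visible
plot(x[!is_outlier], y[!is_outlier],
     ylim = c(0, 10), xlab = "x", ylab = "y",
     main = "297 points, y in [0, 10]")      # structure visible
par(op)                                      # restore previous settings

# Suggestion 1: a log-scaled y axis, usable only if every y is positive.
if (all(y > 0)) {
  plot(x, y, log = "y", main = "Log-scaled y axis")
}
```

The two-panel version keeps both axes linear, at the cost of the reader having to relate two plots; the log axis keeps everything in one panel but distorts distances within the [0, 10] range.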
Try something like this (modify to how you like it):

x <- runif(100)
y <- rnorm(100, 5, 2)
y[1:3] <- c(19, 21, 50)

# Three stacked panels; matrix(3:1) puts the first plot in the bottom row.
# heights are given top to bottom.
layout(matrix( 3:1, ncol=1 ), heights=c(2,3,4))

# Bottom panel: the bulk of the data, with the x axis drawn
par(mar=c(5,4,0,2)+0.1)
plot(x,y, ylim=c(0,10), ylab='')

# Middle panel: the outliers near 20, x axis ticks without labels
par(mar=c(0.5,4,0,2)+0.1)
plot(x,y, ylim=c(18,22), xlab='', xaxt='n' )
axis(1, labels=FALSE)

# Top panel: the outlier near 50, carrying the title
par(mar=c(0.5,4,4,2)+0.1)
plot(x,y, ylim=c(49,51),xlab='', main='my title', xaxt='n', ylab='' )
axis(1, labels=FALSE)

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at intermountainmail.org
(801) 408-8111

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Inman, Brant A. M.D.
> Sent: Thursday, April 12, 2007 12:41 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] Putting 2 breaks on Y axis
Inman, Brant A. M.D. wrote:
> What I would like to do is break the y axis at 2 points, roughly 10-20
> and 20-55. Is there a function that can break an axis in 2 places?

Hi Brant,

gap.plot in the plotrix package can do one break, and it is possible to do two, as gap.boxplot does. It wouldn't be too difficult to recode gap.plot to get more than one break. I'll see what I can do today.

Jim
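For illustration only, here is a rough base-graphics sketch of how a two-break axis could be emulated without plotrix. It is not the gap.plot implementation: it uses a swapped-in technique, a piecewise-linear rescaling of y followed by custom tick labels. The helper names seg_offsets and squash, the segment limits, the panel heights and the padding are all hypothetical choices, and the data are simulated.

```r
# Hypothetical helper: bottom position of each kept segment on the plot scale.
seg_offsets <- function(heights, pad) {
  c(0, cumsum(heights + pad))[seq_along(heights)]
}

# Hypothetical helper: map y onto a compressed scale. Each kept range
# segs[i, ] is stretched to heights[i] plot units; values that fall in no
# kept range become NA and are therefore not drawn.
squash <- function(y, segs, heights, pad = 0.4) {
  offs <- seg_offsets(heights, pad)
  out <- rep(NA_real_, length(y))
  for (i in seq_len(nrow(segs))) {
    lo <- segs[i, 1]
    hi <- segs[i, 2]
    inside <- y >= lo & y <= hi
    out[inside] <- offs[i] + heights[i] * (y[inside] - lo) / (hi - lo)
  }
  out
}

# Simulated stand-in for the data in the question (an assumption).
set.seed(1)
x <- runif(300)
y <- c(runif(297, 0, 10), 20, 20, 55)

segs    <- rbind(c(0, 10), c(18, 22), c(53, 57))  # kept y ranges (assumed)
heights <- c(6, 2, 2)                             # plot units per range
pad     <- 0.4                                    # space left at each break

ty <- squash(y, segs, heights, pad)

# Draw on the compressed scale, then label ticks with the original values.
plot(x, ty, yaxt = "n", ylab = "y")
ticks <- c(0, 5, 10, 18, 20, 22, 53, 55, 57)
axis(2, at = squash(ticks, segs, heights, pad), labels = ticks, las = 1)

# Dotted lines mark the two breaks, halfway through each padded gap.
tops   <- seg_offsets(heights, pad) + heights
breaks <- tops[-length(tops)] + pad / 2
abline(h = breaks, lty = 3)
```

Compared with stacking three panels via layout(), this keeps a single plotting region, but the axis is no longer linear across the breaks, so the break marks need to be made obvious to the reader.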