Hi,
In my work, I often investigate relationships between highly skewed data.
Example:
set.seed(111)
require(MASS)
d = data.frame(mvrnorm(1000, mu=c(0,0), Sigma=matrix(c(1,.6,.6,1), nrow=2)))
names(d) = c("x","y")
## Skew Y
d$y = d$y^4
plot(d$x, d$y)
lines(lowess(d$x, d$y), lwd=2, col="blue")
Unfortunately, with such skewed data, it's hard to see the line unless we
"zoom in," by either ignoring the outliers, or "breaking"
the scale of the
y axis, such that the first 2/3 of the graph correspond to a "normal"
scale, and the remaining 1/3 correspond to a "compressed" scale. (I
know
this is generally not recommended, see
http://r.789695.n4.nabble.com/split-a-y-axis-to-show-data-on-different-scales-td805816.html).
I created a function to do this:
scaleBreak = function(x,y,axis=2, breakpos=1,...){
#### figure out which Y values are above the breakpos
y_above = y[y>breakpos]; x_above = x[y>breakpos]
y_below = y[y<=breakpos]; x_below = x[y<=breakpos]
#### pick ranges
range_1 = range(y_below)
range_2 = range(y_above)
#### find limits of y axis (so it spans 2/3rds)
mx = max(y_below); mn = min(y_below)
ylims1 = c(mn, mx + (mx-mn)/2)
#### plot bottom graph
plot(range(x), ylims1, type="n", yaxt="n",...)
points(x_below, y_below, yaxt="n",...)
axis(2, pretty(y_below))
min2 = (breakpos + .45*breakpos - max(y))/.45
#### add second graph
par(new=TRUE)
plot(range(x), c(min2, max(y)), type="n", yaxt="n",
xaxt="n", ylab="",
axes=F)
points(x_above, y_above, yaxt="n",...)
axis(2, pretty(y_above)[-1])
require(plotrix)
axis.break(axis=axis, breakpos)
}
The problem I'm having is that the fitted line is not printed on the scale
of the bottom 2/3 plot:
scaleBreak(d$x, d$y, breakpos=20)
lines(lowess(d$x, d$y), lwd=2, col="blue")
Can anyone think of a solution where the line is on the scale of the bottom
2/3 of the plot, in the right location? (Again, let me preempt the
objections others might raise about how this should not be done. I know it
generally should not be done, but let's pretend I have an excellent reason
:) )
Thanks in advance!
[[alternative HTML version deleted]]
Since points() and lines() plot on the scale of the last plot, you could plot your "first" graph after the "second" one, or if there is a particular reason why you plot them in the order you do, add a third, empty graph with the scale of the first one as the last plot. B. On 2014-05-07, at 2:37 PM, Dustin Fife wrote:> Hi, > > In my work, I often investigate relationships between highly skewed data. > Example: > > set.seed(111) > require(MASS) > d = data.frame(mvrnorm(1000, mu=c(0,0), Sigma=matrix(c(1,.6,.6,1), nrow=2))) > names(d) = c("x","y") > ## Skew Y > d$y = d$y^4 > plot(d$x, d$y) > lines(lowess(d$x, d$y), lwd=2, col="blue") > > Unfortunately, with such skewed data, it's hard to see the line unless we > "zoom in," by either ignoring the outliers, or "breaking" the scale of the > y axis, such that the first 2/3 of the graph correspond to a "normal" > scale, and the remaining 1/3 correspond to a "compressed" scale. (I know > this is generally not recommended, see > http://r.789695.n4.nabble.com/split-a-y-axis-to-show-data-on-different-scales-td805816.html). > I created a function to do this: > > scaleBreak = function(x,y,axis=2, breakpos=1,...){ > > #### figure out which Y values are above the breakpos > y_above = y[y>breakpos]; x_above = x[y>breakpos] > y_below = y[y<=breakpos]; x_below = x[y<=breakpos] > > #### pick ranges > range_1 = range(y_below) > range_2 = range(y_above) > > #### find limits of y axis (so it spans 2/3rds) > mx = max(y_below); mn = min(y_below) > ylims1 = c(mn, mx + (mx-mn)/2) > > #### plot bottom graph > plot(range(x), ylims1, type="n", yaxt="n",...) > points(x_below, y_below, yaxt="n",...) > axis(2, pretty(y_below)) > > min2 = (breakpos + .45*breakpos - max(y))/.45 > > #### add second graph > par(new=TRUE) > plot(range(x), c(min2, max(y)), type="n", yaxt="n", xaxt="n", ylab="", > axes=F) > points(x_above, y_above, yaxt="n",...) > axis(2, pretty(y_above)[-1]) > require(plotrix) > axis.break(axis=axis, breakpos) > > } > > The problem I'm having is that the fitted line is not printed on the scale > of the bottom 2/3 plot: > > scaleBreak(d$x, d$y, breakpos=20) > lines(lowess(d$x, d$y), lwd=2, col="blue") > > Can anyone think of a solution where the line is on the scale of the bottom > 2/3 of the plot, in the right location? (Again, let me preempt the > objections others might raise about how this should not be done. I know it > generally should not be done, but let's pretend I have an excellent reason > :) ) > > Thanks in advance! > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.
Hi Dustin,
Similar to what Boris suggested, try scaling the values and not
resetting the plot limits:
scaleBreak = function(x,y,axis=2,breakpos=1,ymult=0.45,...){
#### figure out which Y values are above the breakpos
y_above<-y[y>breakpos]
x_above<-x[y>breakpos]
y_below<-y[y<=breakpos]
x_below<-x[y<=breakpos]
# scale the y limit according to the compression of high y values
ylim<-c(min(y),(max(y_above)-breakpos)*1.1*ymult+breakpos)
#### plot bottom graph
plot(0,xlim=range(x),ylim=ylim,type="n", yaxt="n",...)
points(x_below, y_below, yaxt="n",...)
axis(2, pretty(y_below))
#### add second graph
newy<-(y_above-breakpos)*ymult+breakpos
points(x_above,newy,...)
ylabels<-pretty(y_above)
cat(ylabels,(ylabels-breakpos)*ymult+breakpos,"\n")
axis(2, at=(ylabels-breakpos)*ymult+breakpos,labels=ylabels)
require(plotrix)
axis.break(axis=axis, breakpos,style="gap")
}
I would also suggest using the gap style axis break.
Jim
On Wed, 7 May 2014 01:37:17 PM Dustin Fife wrote:> Hi,
>
> In my work, I often investigate relationships between highly skewed
data.> Example:
>
> set.seed(111)
> require(MASS)
> d = data.frame(mvrnorm(1000, mu=c(0,0),
Sigma=matrix(c(1,.6,.6,1), nrow=2)))> names(d) = c("x","y")
> ## Skew Y
> d$y = d$y^4
> plot(d$x, d$y)
> lines(lowess(d$x, d$y), lwd=2, col="blue")
>
> Unfortunately, with such skewed data, it's hard to see the line
unless we> "zoom in," by either ignoring the outliers, or
"breaking" the scale of
the> y axis, such that the first 2/3 of the graph correspond to a
"normal"
> scale, and the remaining 1/3 correspond to a "compressed" scale.
(I
know> this is generally not recommended, see
>
http://r.789695.n4.nabble.com/split-a-y-axis-to-show-data-on-different-scale
> s-td805816.html). I created a function to do this:
>
> scaleBreak = function(x,y,axis=2, breakpos=1,...){
>
> #### figure out which Y values are above the breakpos
> y_above = y[y>breakpos]; x_above = x[y>breakpos]
> y_below = y[y<=breakpos]; x_below = x[y<=breakpos]
>
> #### pick ranges
> range_1 = range(y_below)
> range_2 = range(y_above)
>
> #### find limits of y axis (so it spans 2/3rds)
> mx = max(y_below); mn = min(y_below)
> ylims1 = c(mn, mx + (mx-mn)/2)
>
> #### plot bottom graph
> plot(range(x), ylims1, type="n", yaxt="n",...)
> points(x_below, y_below, yaxt="n",...)
> axis(2, pretty(y_below))
>
> min2 = (breakpos + .45*breakpos - max(y))/.45
>
> #### add second graph
> par(new=TRUE)
> plot(range(x), c(min2, max(y)), type="n", yaxt="n",
xaxt="n",
ylab="",> axes=F)
> points(x_above, y_above, yaxt="n",...)
> axis(2, pretty(y_above)[-1])
> require(plotrix)
> axis.break(axis=axis, breakpos)
>
> }
>
> The problem I'm having is that the fitted line is not printed on the
scale> of the bottom 2/3 plot:
>
> scaleBreak(d$x, d$y, breakpos=20)
> lines(lowess(d$x, d$y), lwd=2, col="blue")
>
> Can anyone think of a solution where the line is on the scale of the
bottom> 2/3 of the plot, in the right location? (Again, let me preempt the
> objections others might raise about how this should not be done. I
know it> generally should not be done, but let's pretend I have an excellent
reason>
> :) )
>
> Thanks in advance!
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible
code.